Arm Code MCP

Summary

If you're profiling Linux workloads on Arm64, this server plugs three optimization steps directly into Claude. It parses perf report --stdio output into ranked hot symbols, searches a curated knowledge base of 110 NEON intrinsics with hybrid semantic and exact-name retrieval to suggest the right vmlaq_f32 or vaddq_u8 for your hot loop, and audits dependency manifests (requirements.txt, pyproject.toml, or a Dockerfile) for packages that lack arm64 wheels or need special handling. All three tools run offline in a Docker container over stdio. On the 15-query eval set, NEON retrieval places the expected intrinsic in the top 3 results 93% of the time (hit@3 = 0.933). Useful when you're porting x86 SIMD code or chasing down dependency blockers before deploying to Graviton.

arm-code-mcp

An MCP server that helps AI assistants optimize Linux workloads on Arm64. It parses perf report output, recommends NEON SIMD intrinsics for hot loops, and audits Python dependency manifests for arm64 wheel availability — all offline, all structured, all callable from Claude Code, GitHub Copilot, and Codex.

What's inside

analyze_perf_output — parse perf report --stdio into a ranked list of hot symbols
suggest_neon_intrinsic — semantic + keyword search over 110 curated NEON intrinsics
check_arm64_deps — flag packages in requirements.txt, pyproject.toml, or Dockerfile that lack arm64 wheels or require special handling

Prerequisites

Docker
An MCP-compatible AI assistant (Claude Code, GitHub Copilot, Codex)

Quick start

docker pull jeannjohnson/arm-code-mcp:latest

Add to your MCP client config (e.g. ~/.claude/mcp.json):

{
  "mcpServers": {
    "arm-code-mcp": {
      "command": "docker",
      "args": ["run", "--rm", "-i", "jeannjohnson/arm-code-mcp:latest"]
    }
  }
}

Restart your client. All three tools are now available.
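If the client config already lists other servers, the entry can be merged in rather than pasted by hand. The following is a minimal sketch, assuming the config is a JSON object with a top-level mcpServers key as in the snippet above; the helper name add_arm_code_mcp is hypothetical and not part of this project.

```python
import json


def add_arm_code_mcp(config: dict) -> dict:
    """Return a copy of an MCP client config with the arm-code-mcp entry added."""
    merged = dict(config)
    # Copy the existing server map so other entries are preserved untouched.
    servers = dict(merged.get("mcpServers", {}))
    servers["arm-code-mcp"] = {
        "command": "docker",
        "args": ["run", "--rm", "-i", "jeannjohnson/arm-code-mcp:latest"],
    }
    merged["mcpServers"] = servers
    return merged


# Hypothetical existing config with one unrelated server.
existing = {"mcpServers": {"other-server": {"command": "other"}}}
print(json.dumps(add_arm_code_mcp(existing), indent=2))
```

The function returns a new dict instead of mutating its input, so the original config can be kept as a backup before writing the merged result to disk.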

Tools

`analyze_perf_output`

Parse raw perf report --stdio output and return the top hot symbols, ranked by overhead.

analyze_perf_output(
    perf_report_text: str,          # raw stdout of `perf report --stdio`
    top_n: int = 10,                # max symbols to return
    min_overhead_pct: float = 0.5,  # ignore symbols below this %
) -> dict

Example response:

{
  "summary": {
    "total_samples": 5432100,
    "total_events": null,
    "command": "myapp"
  },
  "hot_symbols": [
    {"overhead_pct": 24.17, "samples": 1245, "command": "myapp",
     "module": "myapp", "symbol": "process_buffer"},
    {"overhead_pct": 12.34, "samples": 636, "command": "myapp",
     "module": "libc-2.31.so", "symbol": "__memcpy_avx_unaligned_erms"}
  ],
  "warnings": []
}
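To illustrate the kind of parsing involved, here is a rough sketch (not the server's actual implementation) that extracts ranked symbols from the default perf report --stdio column layout: Overhead, Command, Shared Object, Symbol. The function name hot_symbols and the sample text are assumptions for illustration. The sketch omits the samples field, which perf only prints when the report is generated with -n.

```python
import re

# Matches data rows such as: "    24.17%  myapp  libc-2.31.so  [.] memcpy"
ROW = re.compile(r"^\s*(\d+(?:\.\d+)?)%\s+(\S+)\s+(\S+)\s+\[.\]\s+(.+?)\s*$")


def hot_symbols(report: str, top_n: int = 10, min_overhead_pct: float = 0.5) -> list[dict]:
    """Return up to top_n symbols at or above min_overhead_pct, hottest first."""
    rows = []
    for line in report.splitlines():
        if line.lstrip().startswith("#"):
            continue  # header and comment lines
        m = ROW.match(line)
        if not m:
            continue  # blank lines, call-graph rows, anything unrecognized
        overhead = float(m.group(1))
        if overhead >= min_overhead_pct:
            rows.append({
                "overhead_pct": overhead,
                "command": m.group(2),
                "module": m.group(3),
                "symbol": m.group(4),
            })
    rows.sort(key=lambda r: r["overhead_pct"], reverse=True)
    return rows[:top_n]


# Hypothetical report excerpt, deliberately out of order.
SAMPLE = """\
# Overhead  Command  Shared Object      Symbol
# ........  .......  .................  ......
#
    12.34%  myapp    libc-2.31.so       [.] __memcpy_avx_unaligned_erms
    24.17%  myapp    myapp              [.] process_buffer
     0.20%  myapp    [kernel.kallsyms]  [k] finish_task_switch
"""

for row in hot_symbols(SAMPLE):
    print(row["overhead_pct"], row["module"], row["symbol"])
```

With the default threshold, the 0.20% kernel row is dropped and process_buffer is listed first.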

`suggest_neon_intrinsic`

Recommend NEON intrinsics for a hot loop using hybrid semantic + exact-name retrieval over a curated knowledge base of 110 intrinsics.

suggest_neon_intrinsic(
    operation_description: str,    # e.g. "32-bit float multiply-accumulate"
    target_arch: str = "armv8-a",  # "armv8-a" | "armv8.2-a" | "armv9-a"
    top_k: int = 5,
) -> dict

Example response:

{
  "matches": [
    {
      "intrinsic": "vmlaq_f32",
      "signature": "float32x4_t vmlaq_f32(float32x4_t a, float32x4_t b, float32x4_t c)",
      "header": "<arm_neon.h>",
      "min_arch": "armv8-a",
      "description": "Multiply-accumulate: a + (b * c), lane-wise, 4x f32.",
      "score": 0.9142
    }
  ],
  "notes": "Filtered to armv8-a. KB contains 110 entries (103 compatible)."
}
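The idea of hybrid retrieval with an architecture filter can be sketched as follows. This is a toy illustration only: it swaps the embedding model for bag-of-words cosine similarity so that it runs with the standard library. The three-entry TOY_KB, its min_arch tags, the +1.0 exact-name boost and the function name suggest are all assumptions, not the project's actual data or scoring.

```python
import math
import re
from collections import Counter

ARCH_ORDER = ["armv8-a", "armv8.2-a", "armv9-a"]  # oldest to newest

# Toy knowledge base; descriptions and arch tags are illustrative.
TOY_KB = [
    {"intrinsic": "vmlaq_f32", "min_arch": "armv8-a",
     "description": "multiply accumulate 32-bit float four lanes"},
    {"intrinsic": "vaddq_u8", "min_arch": "armv8-a",
     "description": "add unsigned 8-bit integer sixteen lanes"},
    {"intrinsic": "vdotq_s32", "min_arch": "armv8.2-a",
     "description": "dot product signed 8-bit integer accumulate into 32-bit"},
]


def tokens(text: str) -> Counter:
    return Counter(re.findall(r"[a-z0-9_]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)  # Counter yields 0 for missing keys
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def suggest(query: str, target_arch: str = "armv8-a", top_k: int = 5) -> list[dict]:
    level = ARCH_ORDER.index(target_arch)
    q = tokens(query)
    scored = []
    for entry in TOY_KB:
        if ARCH_ORDER.index(entry["min_arch"]) > level:
            continue  # intrinsic needs a newer architecture than the target
        score = cosine(q, tokens(entry["description"]))
        if entry["intrinsic"] in q:
            score += 1.0  # exact-name hit outranks any similarity-only match
        scored.append({"intrinsic": entry["intrinsic"], "score": round(score, 4)})
    scored.sort(key=lambda m: m["score"], reverse=True)
    return scored[:top_k]


print(suggest("32-bit float multiply-accumulate"))
print(suggest("signed 8-bit dot product", target_arch="armv8.2-a"))
```

Filtering before scoring keeps incompatible intrinsics out of the candidate list entirely, which mirrors the "103 compatible" note in the example response.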

`check_arm64_deps`

Scan a dependency manifest and flag packages with known arm64 compatibility issues. Fully offline — no network calls, fast, deterministic.

check_arm64_deps(
    file_content: str,                    # raw text of the manifest
    file_type: str = "requirements.txt",  # "requirements.txt" | "pyproject.toml" | "Dockerfile"
) -> dict

Example response:

{
  "checked": ["numpy", "tensorflow", "cupy-cuda12x", "faiss-cpu", "requests"],
  "issues": [
    {"package": "cupy-cuda12x", "severity": "error",
     "message": "GPU-only package with no arm64 wheel. Use cupy with ROCm or a CPU fallback."},
    {"package": "tensorflow", "severity": "warning",
     "message": "Official TensorFlow PyPI wheels are x86-only before 2.10; use tensorflow-aarch64 or build from source."},
    {"package": "faiss-cpu", "severity": "warning",
     "message": "No official arm64 wheel on PyPI; build from source or use the conda-forge package."},
    {"package": "numpy", "severity": "info",
     "message": "arm64 wheels available from PyPI since 1.21.0. Ensure version >= 1.21.0."}
  ],
  "summary": "Checked 5 package(s): 1 error(s), 2 warning(s), 1 info(s)."
}

Severity levels:

Level	Meaning
`error`	No arm64 wheel exists (e.g. GPU-only packages)
`warning`	Usable on arm64, but only with a workaround or an alternative source (e.g. a source build or conda-forge)
`info`	Wheel available; version constraint or system-lib note applies
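An offline check of this kind amounts to a lookup against a static rule table. Below is a minimal sketch for requirements.txt input only. The rule table holds just the four entries from the example response, and the names KNOWN_ISSUES, package_names and check are hypothetical; the server's real rule set and parser are not shown in this README.

```python
import re

# Toy rule table; severities and messages copied from the example response.
KNOWN_ISSUES = {
    "cupy-cuda12x": ("error", "GPU-only package with no arm64 wheel. "
                              "Use cupy with ROCm or a CPU fallback."),
    "tensorflow": ("warning", "Official TensorFlow PyPI wheels are x86-only before 2.10; "
                              "use tensorflow-aarch64 or build from source."),
    "faiss-cpu": ("warning", "No official arm64 wheel on PyPI; "
                             "build from source or use the conda-forge package."),
    "numpy": ("info", "arm64 wheels available from PyPI since 1.21.0. "
                      "Ensure version >= 1.21.0."),
}
SEVERITY_RANK = {"error": 0, "warning": 1, "info": 2}
NAME = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._-]*")


def package_names(requirements_text: str) -> list[str]:
    names = []
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not line or line.startswith("-"):
            continue  # skip blanks and pip options such as "-r other.txt"
        m = NAME.match(line)
        if m:
            # Normalize runs of "-", "_" and "." to "-" and lowercase the name.
            names.append(re.sub(r"[-_.]+", "-", m.group(0)).lower())
    return names


def check(requirements_text: str) -> dict:
    checked = package_names(requirements_text)
    issues = [
        {"package": n, "severity": KNOWN_ISSUES[n][0], "message": KNOWN_ISSUES[n][1]}
        for n in checked if n in KNOWN_ISSUES
    ]
    issues.sort(key=lambda i: SEVERITY_RANK[i["severity"]])  # stable: errors first
    counts = {s: sum(i["severity"] == s for i in issues) for s in SEVERITY_RANK}
    summary = (f"Checked {len(checked)} package(s): {counts['error']} error(s), "
               f"{counts['warning']} warning(s), {counts['info']} info(s).")
    return {"checked": checked, "issues": issues, "summary": summary}


reqs = "numpy>=1.21.0\ntensorflow==2.9.0\ncupy-cuda12x\nfaiss-cpu  # vector search\nrequests\n"
print(check(reqs)["summary"])
```

Because the table is bundled data rather than a live PyPI query, the same input always yields the same result, which is what makes the tool deterministic.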

Configuration

All env vars are optional. The server works with no configuration.

Variable	Default	Description
`ARM_CODE_MCP_LOG_LEVEL`	`INFO`	Log verbosity: `DEBUG`, `INFO`, `WARNING`
`ARM_CODE_MCP_KB_PATH`	bundled JSONL	Override path to `neon_intrinsics.jsonl`
`ARM_CODE_MCP_CACHE_DIR`	`~/.cache/arm-code-mcp`	Embedding cache directory

Pass env vars to the container:

docker run --rm -i \
  -e ARM_CODE_MCP_LOG_LEVEL=DEBUG \
  jeannjohnson/arm-code-mcp:latest
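For reference, optional variables with defaults like these are typically resolved along the following lines. This is a sketch under assumptions, not the server's code: the function name load_settings is hypothetical, and falling back to INFO for an unrecognized log level is an assumed behaviour.

```python
import os
from pathlib import Path

VALID_LEVELS = {"DEBUG", "INFO", "WARNING"}


def load_settings(env=None) -> dict:
    """Resolve the three optional variables, applying the documented defaults."""
    env = os.environ if env is None else env
    level = env.get("ARM_CODE_MCP_LOG_LEVEL", "INFO").upper()
    if level not in VALID_LEVELS:
        level = "INFO"  # assumed fallback for unrecognized values
    kb_path = env.get("ARM_CODE_MCP_KB_PATH")  # None means "use the bundled JSONL"
    cache_dir = env.get("ARM_CODE_MCP_CACHE_DIR", "~/.cache/arm-code-mcp")
    return {
        "log_level": level,
        "kb_path": Path(kb_path) if kb_path else None,
        "cache_dir": Path(os.path.expanduser(cache_dir)),
    }


print(load_settings({"ARM_CODE_MCP_LOG_LEVEL": "debug"}))
```

Passing a plain dict instead of os.environ makes the resolution logic easy to exercise in tests.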

Evaluation

suggest_neon_intrinsic is evaluated against 15 hand-curated (query, expected intrinsic) pairs using the real all-MiniLM-L6-v2 embedding model. Current baseline:

Metric	Score
hit@1	0.667
hit@3	0.933
hit@5	1.000
MRR	0.817

The regression guard exits non-zero if hit@3 drops below 0.70.
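As a worked example of how these metrics relate, the sketch below computes hit@k and MRR from the rank at which each query's expected intrinsic appears. The per-query ranks are not published here, so the rank list is hypothetical: it is one distribution over 15 queries (ten at rank 1, four at rank 2, one at rank 4) that reproduces the table. The guard logic is likewise an assumption about how the 0.70 floor might be applied.

```python
def hit_at_k(ranks: list[int], k: int) -> float:
    """Fraction of queries whose expected intrinsic appears in the top k."""
    return sum(r <= k for r in ranks) / len(ranks)


def mrr(ranks: list[int]) -> float:
    """Mean reciprocal rank of the expected intrinsic."""
    return sum(1.0 / r for r in ranks) / len(ranks)


# Hypothetical ranks for the 15 gold queries (1 = expected intrinsic ranked first).
ranks = [1] * 10 + [2] * 4 + [4]
HIT3_FLOOR = 0.70

for k in (1, 3, 5):
    print(f"hit@{k}\t{hit_at_k(ranks, k):.3f}")
print(f"MRR\t{mrr(ranks):.3f}")

# A regression guard would turn this comparison into the process exit status.
exit_code = 0 if hit_at_k(ranks, 3) >= HIT3_FLOOR else 1
print("exit code:", exit_code)
```

With only 15 queries, each query moves hit@k by about 0.067, so the 0.70 floor tolerates three more top-3 misses beyond the current one (11/15 ≈ 0.733 passes, 10/15 ≈ 0.667 fails).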

Run the eval harness locally:

uv sync
make eval

See eval/README.md for methodology and known limitations.

Development

git clone https://github.com/jean-johnson-zwix/arm-code-mcp
cd arm-code-mcp
uv sync
make test    # 78 tests
make lint    # ruff check + format
make eval    # real model, 15 gold queries

Makefile targets:

Target	Description
`make setup`	`uv sync` + pre-commit install
`make test`	Run the full test suite
`make lint`	ruff check + ruff format --check
`make eval`	Run the NEON retrieval eval harness
`make docker-build`	Build `arm-code-mcp:dev` locally
`make docker-run`	Run the local dev image over stdio

Multi-arch images (linux/amd64 + linux/arm64) are built and pushed automatically by .github/workflows/release.yml on v*.*.* tags.

Knowledge base maintenance

The NEON intrinsics knowledge base lives in src/arm_code_mcp/kb/data/neon_intrinsics.jsonl (110 entries). To add intrinsics or refresh after a model upgrade, see docs/kb-refresh.md.

Roadmap

Tools

parse_flamegraph — extract hot paths from Linux perf flamegraph SVG
suggest_sve2_intrinsic — extend retrieval to SVE2 intrinsics (Neoverse V2, Cortex-X4)

Eval

Multi-query paraphrase expansion for each gold pair
Reranking pass over semantic candidates
Larger gold set (50+ queries) for lower metric variance

Demo

Coming soon.

Contributing

Stars, forks, and issues are welcome. Open a PR or file an issue on GitHub.

Good first issues:

Add more NEON intrinsic entries to kb/data/neon_intrinsics.jsonl
Add gold eval queries for SVE2 intrinsics
Add parse_flamegraph tool for Linux perf flamegraph SVG files

License

Apache 2.0 — same as arm/mcp.
