Gives Claude a semantic understanding of your codebase instead of forcing it to fumble through grep and cat calls. Uses tree-sitter to parse 15 languages into a graph of functions, classes, and dependencies, then lifts each entity into verb-object features like "validate JWT tokens" or "serialize config to disk." The semantic_snapshot tool compresses your whole repo into ~25K tokens so the LLM knows the architecture up front. When code changes, the graph auto-syncs before queries. You can lift features in-session with your agent subscription or run auto_lift to delegate to a cheap external API like Haiku. Search by intent with search_node, trace dependencies with explore_rpg and impact_radius, or ask plan_change to map out refactors. Built in Rust, ships as an npm binary, works with Claude Code, Cursor, or any MCP client.
Give your AI agent a brain for your codebase.
AI coding agents waste most of their tool calls fumbling through your codebase with grep, cat, find, and file reads. rpg-encoder fixes that. It builds a semantic graph of your code with Tree-sitter — not just what calls what, but what every function does — and gives your AI assistant whole-repo understanding via MCP in a single tool call.
claude mcp add rpg -- npx -y -p rpg-encoder rpg-mcp-server
One command. Works with Claude Code, Cursor, opencode, Windsurf, or any MCP-compatible agent. No Rust toolchain, no cloning, no building — npx downloads a pre-built binary for your platform.
Then open any repo and tell your agent:
"Build and lift the RPG for this repo"
Your agent handles everything: indexes entities (seconds), reads each function and adds intent-level features (a few minutes), organizes them into a semantic hierarchy, and commits .rpg/graph.json for your team.
For repos with ~100+ entities, lifting_status will tell your agent to delegate the lifting loop to a sub-agent or a cheaper model — feature extraction is pattern-matching, not novel reasoning. If your runtime has no sub-agent mechanism, run rpg-encoder lift --provider anthropic|openai from the terminal with an API key — the CLI drives an external LLM directly with no agent involvement. After the CLI finishes, call reload_rpg in your session to load the updated graph. The CLI lifts entities with no features; re-lifting stale entities (features present but outdated after code changes) is handled by the in-session MCP flow, not the CLI.
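The split between "no features" and "stale features" can be pictured with a small sketch. The field names (`source_hash`, `lifted_hash`) and the hash-comparison test for staleness are assumptions made for illustration; they are not taken from the `.rpg/graph.json` schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Entity:
    # All field names here are hypothetical, chosen only for this sketch.
    entity_id: str
    source_hash: str                      # hash of the entity's current source
    features: List[str] = field(default_factory=list)
    lifted_hash: Optional[str] = None     # source hash recorded when features were written


def lifting_state(entity: Entity) -> str:
    """Classify an entity as unlifted, stale, or fresh."""
    if not entity.features:
        return "unlifted"   # the case the CLI picks up
    if entity.lifted_hash != entity.source_hash:
        return "stale"      # the case left to the in-session MCP flow
    return "fresh"


def partition(entities: List[Entity]) -> Dict[str, List[str]]:
    """Group entity ids by lifting state."""
    buckets: Dict[str, List[str]] = {"unlifted": [], "stale": [], "fresh": []}
    for entity in entities:
        buckets[lifting_state(entity)].append(entity.entity_id)
    return buckets
```

Under these assumptions, a CLI run would work through the `unlifted` bucket only, and the `stale` bucket would wait for the next in-session lift.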
Once lifted, try:
grep, cat, find

The server instructions tell your agent to reach for RPG tools FIRST for any
question about code structure or behavior. That reflex matters: grep, cat,
and ad-hoc file reads burn tokens and miss semantic relationships RPG already
knows.
| If you'd otherwise reach for... | Use this instead |
|---|---|
| grep -r / rg (by intent) | search_node(query="...") |
| grep -r / rg (by name) | search_node(query="...", mode="snippets") |
| cat / reading a function | fetch_node(entity_id="file:name") |
| chained greps for callers/callees | explore_rpg(entity_id="...", direction="...") |
| recursive grep for "what depends on X" | impact_radius(entity_id="...") |
| wc -l / find / tree | rpg_info |
| reading many files for context | semantic_snapshot |
| manual search → fetch → explore chains | context_pack(query="...") |
| "how do I refactor X safely" | plan_change(goal="...") |
Fall back to grep, cat, or file reads only when the query is about literal text
(string search, comments, TODOs, log messages) — not about structure.
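
For readers curious what one of these calls looks like on the wire, here is a rough sketch of the JSON-RPC request an MCP client sends for a tool call. The `tools/call` method and the `name`/`arguments` shape come from the MCP specification, and `search_node` with its `query` argument comes from the table above. The helper function itself is hypothetical; your agent builds these requests for you.

```python
import json


def tool_call_request(request_id: int, tool: str, arguments: dict) -> dict:
    """Build an MCP `tools/call` request as a plain dict (illustrative helper)."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }


# One intent-level search replaces a chain of grep invocations.
request = tool_call_request(1, "search_node", {"query": "validate JWT tokens"})
print(json.dumps(request, indent=2))
```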
semantic_snapshot compresses the whole graph into ~25K tokens. Your LLM reads it once and knows the repo.
Instead of grepping through files, the LLM calls semantic_snapshot once and receives:
~25K tokens covers ~1000 entities. That's 2-3% of a 1M context window — the LLM starts every session already knowing your repo.
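The arithmetic behind those figures, as a quick check. The 200K window in the last line is an extra data point added for comparison, not a number from this README.

```python
SNAPSHOT_TOKENS = 25_000      # approximate snapshot size quoted above
ENTITIES = 1_000              # approximate entity count quoted above


def window_share(snapshot_tokens: int, context_window: int) -> float:
    """Fraction of the context window taken up by the snapshot."""
    return snapshot_tokens / context_window


tokens_per_entity = SNAPSHOT_TOKENS / ENTITIES
print(f"{tokens_per_entity:.0f} tokens per entity")                        # 25 tokens per entity
print(f"{window_share(SNAPSHOT_TOKENS, 1_000_000):.1%} of a 1M window")    # 2.5% of a 1M window
print(f"{window_share(SNAPSHOT_TOKENS, 200_000):.1%} of a 200K window")    # 12.5% of a 200K window
```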
Whenever your working tree changes — committed, staged, or unstaged — the MCP server automatically re-syncs before responding to the next query. A changeset hash over (path, size, mtime) means repeated saves of the same file trigger one sync, and idle queries trigger none. Reverts are detected too: if a previously-dirty file returns to its HEAD state, the graph is restored.
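A minimal sketch of how a changeset hash over (path, size, mtime) can gate re-syncs. This illustrates the idea only; it is not the server's actual implementation. The use of SHA-256, the `SyncGate` class, and the assumption that the input lists only files that differ from HEAD are choices made for the example.

```python
import hashlib
from typing import Iterable, Tuple

Entry = Tuple[str, int, int]  # (path, size in bytes, mtime in nanoseconds)


def changeset_hash(entries: Iterable[Entry]) -> str:
    """Hash the set of dirty files; order-independent because entries are sorted."""
    digest = hashlib.sha256()
    for path, size, mtime in sorted(entries):
        digest.update(f"{path}\0{size}\0{mtime}\n".encode("utf-8"))
    return digest.hexdigest()


class SyncGate:
    """Decide at query time whether the graph needs a re-sync (hypothetical helper)."""

    def __init__(self) -> None:
        # Assume the graph starts in sync with a clean working tree.
        self._last = changeset_hash([])

    def needs_sync(self, dirty_entries: Iterable[Entry]) -> bool:
        current = changeset_hash(dirty_entries)
        if current == self._last:
            return False          # idle query: nothing changed since the last sync
        self._last = current
        return True               # something changed, including a revert to HEAD
```

Because the check runs only when a query arrives, any number of saves between two queries collapses into one sync. A revert shows up as the dirty set shrinking back to empty, which also changes the hash.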
| Mode | Command | Cost | Who pays |
|---|---|---|---|
| Agent lifting | "Build and lift the RPG" | Subscription tokens | Your Claude Code / Cursor subscription |
| Autonomous lifting | auto_lift(provider="anthropic", api_key_env="ANTHROPIC_API_KEY") | ~$0.02 per 100 entities | External API key (Haiku, GPT-4o-mini, OpenRouter, Gemini) |
auto_lift calls a cheap external LLM directly — your coding subscription never touches the lifting work. Use api_key_env to resolve keys from environment variables so they never appear in tool call transcripts.
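The pattern behind `api_key_env` is to pass the name of an environment variable instead of the secret itself. A minimal sketch of that lookup, with a hypothetical helper name and error handling that may differ from what the server does:

```python
import os


def resolve_api_key(api_key_env: str) -> str:
    """Look up the secret by variable name so only the name travels in the tool call."""
    key = os.environ.get(api_key_env, "").strip()
    if not key:
        raise RuntimeError(f"environment variable {api_key_env} is not set or is empty")
    return key
```

The transcript then records `api_key_env="ANTHROPIC_API_KEY"`, which is safe to share, while the key value stays in the server process environment.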
Seven Rust crates, one MCP server binary, one CLI binary:
| Crate | Role |
|---|---|
| rpg-core | Graph types (RPGraph, Entity, HierarchyNode), storage, LCA algorithm |
| rpg-parser | Tree-sitter entity + dependency extraction (15 languages) |
| rpg-encoder | Encoding pipeline, lifting utilities, incremental evolution |
| rpg-nav | Search, fetch, explore, snapshot, TOON serialization |
| rpg-lift | Autonomous LLM lifting (Anthropic, OpenAI, OpenRouter, Gemini) |
| rpg-cli | CLI binary (rpg-encoder) |
| rpg-mcp | MCP server binary (rpg-mcp-server) with 27 tools |
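
The table mentions an LCA (lowest common ancestor) algorithm in rpg-core without saying how it is used. As a rough illustration of the general technique, here is LCA over slash-separated hierarchy paths. The path format and the example names are assumptions, not the crate's real representation.

```python
from typing import List


def lowest_common_ancestor(paths: List[str], sep: str = "/") -> str:
    """Return the deepest hierarchy node shared by every path ("" if none)."""
    split_paths = [path.split(sep) for path in paths]
    common: List[str] = []
    # zip stops at the shortest path, so we never index past any path's end.
    for segments in zip(*split_paths):
        if len(set(segments)) != 1:
            break
        common.append(segments[0])
    return sep.join(common)


# Hypothetical hierarchy paths for three entities.
print(lowest_common_ancestor([
    "Auth/Tokens/validate",
    "Auth/Tokens/refresh",
    "Auth/Sessions/create",
]))  # Auth
```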
| Tool | Description |
|---|---|
| build_rpg | Index the codebase (run once, instant) |
| update_rpg | Incremental update from git changes |
| reload_rpg | Reload graph from disk after external changes |
| rpg_info | Graph statistics, hierarchy overview, per-area lifting coverage |
| Tool | Description |
|---|---|
| semantic_snapshot | Whole-repo semantic understanding in one call (~25K tokens for 1000 entities) |
| search_node | Search entities by intent or keywords (hybrid embedding + lexical scoring) |
| fetch_node | Get entity metadata, source code, dependencies, and hierarchy context |
| explore_rpg | Traverse dependency graph (upstream, downstream, or both) |
| context_pack | Single-call search + fetch + explore with token budget |
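
"Hybrid embedding + lexical scoring" generally means blending a vector similarity with a keyword match. The sketch below shows one simple way to do that. The 50/50 weighting, the word-overlap measure, and the toy two-dimensional vectors are all assumptions; the actual scoring used by `search_node` may differ.

```python
import math
from typing import Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; 0.0 when either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def lexical_overlap(query: str, text: str) -> float:
    """Fraction of query words that also appear in the text."""
    query_words = set(query.lower().split())
    text_words = set(text.lower().split())
    return len(query_words & text_words) / len(query_words) if query_words else 0.0


def hybrid_score(query: str, query_vec: Sequence[float],
                 feature: str, feature_vec: Sequence[float],
                 alpha: float = 0.5) -> float:
    """Blend semantic and lexical evidence; alpha weights the embedding side."""
    return alpha * cosine(query_vec, feature_vec) + (1 - alpha) * lexical_overlap(query, feature)
```

The lexical term rewards exact identifiers and keywords, while the embedding term catches paraphrases such as "check auth token" matching "validate JWT tokens".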
| Tool | Description |
|---|---|
| impact_radius | BFS reachability analysis: "what depends on X?" |
| plan_change | Change planning: find relevant entities, modification order, blast radius |
| find_paths | K-shortest dependency paths between two entities |
| slice_between | Extract minimal connecting subgraph between entities |
| analyze_health | Code health: coupling, instability, god objects, clone detection |
| detect_cycles | Find circular dependencies and architectural cycles |
| reconstruct_plan | Dependency-safe reconstruction execution plan |
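
BFS reachability, the technique named for `impact_radius`, can be sketched in a few lines. This is a generic illustration over a plain dictionary of reverse dependency edges. The function name, the `max_depth` parameter, and the example entity ids (other than the `file:name` format) are hypothetical.

```python
from collections import deque
from typing import Dict, List, Optional


def reachable_dependents(dependents: Dict[str, List[str]], start: str,
                         max_depth: Optional[int] = None) -> Dict[str, int]:
    """Map every entity that transitively depends on `start` to its BFS distance.

    `dependents[x]` lists the entities that directly depend on x.
    """
    distance = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if max_depth is not None and distance[node] >= max_depth:
            continue  # do not expand past the requested radius
        for dependent in dependents.get(node, ()):
            if dependent not in distance:   # visited check also makes cycles safe
                distance[dependent] = distance[node] + 1
                queue.append(dependent)
    del distance[start]
    return distance


graph = {
    "src/parser.rs:extract_entities": ["src/build.rs:build_graph", "src/cli.rs:run"],
    "src/build.rs:build_graph": ["src/server.rs:handle_build"],
}
print(reachable_dependents(graph, "src/parser.rs:extract_entities"))
```

The distances give a natural ordering for review: depth-1 dependents break first when a signature changes, and deeper ones are affected only indirectly.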
| Tool | Description |
|---|---|
| auto_lift | One-call autonomous lifting via cheap LLM API (Haiku, GPT-4o-mini, OpenRouter, Gemini) |
| lifting_status | Dashboard: coverage, per-area progress, NEXT STEP |
| get_entities_for_lifting | Get entity source code for your agent to analyze |
| submit_lift_results | Submit the agent's semantic features back to the graph |
| finalize_lifting | Aggregate file-level features, rebuild hierarchy metadata |
| get_files_for_synthesis | Get file-level entity features for holistic synthesis |
| submit_file_syntheses | Submit holistic file-level summaries |
| build_semantic_hierarchy | Get domain discovery + hierarchy assignment prompts |
| submit_hierarchy | Apply hierarchy assignments to the graph |
| get_routing_candidates | Get entities needing semantic routing (drifted or newly lifted) |
| submit_routing_decisions | Submit routing decisions (hierarchy path or "keep") |
15 languages via Tree-sitter:
| Language | Entity Extraction | Dependency Resolution |
|---|---|---|
| Python | Functions, classes, methods | imports, calls, inheritance |
| Rust | Functions, structs, traits, impl methods | use, calls, trait impls |
| TypeScript | Functions, classes, methods, interfaces | imports, calls, inheritance |
| JavaScript | Functions, classes, methods | imports, calls, inheritance |
| Go | Functions, structs, methods, interfaces | imports, calls |
| Java | Classes, methods, interfaces | imports, calls, inheritance |
| C / C++ | Functions, classes, methods, structs | includes, calls, inheritance |
| C# | Classes, methods, interfaces | using, calls, inheritance |
| PHP | Functions, classes, methods | use, calls, inheritance |
| Ruby | Classes, methods, modules | require, calls, inheritance |
| Kotlin | Functions, classes, methods | imports, calls, inheritance |
| Swift | Functions, classes, structs, protocols | imports, calls, inheritance |
| Scala | Functions, classes, objects, traits | imports, calls, inheritance |
| Bash | Functions | source, calls |
# Claude Code
claude mcp add rpg -- npx -y -p rpg-encoder rpg-mcp-server
# Cursor — add to ~/.cursor/mcp.json
{
  "mcpServers": {
    "rpg": {
      "command": "npx",
      "args": ["-y", "-p", "rpg-encoder", "rpg-mcp-server"]
    }
  }
}
The server auto-detects the project root from the current working directory — no path argument needed.
npm install -g rpg-encoder
# Build a graph
rpg-encoder build
# Query
rpg-encoder search "parse entities from source code"
rpg-encoder fetch "src/parser.rs:extract_entities"
rpg-encoder explore "src/parser.rs:extract_entities" --direction both --depth 2
rpg-encoder info
# Autonomous lifting via API
rpg-encoder lift --provider anthropic --dry-run # estimate cost
rpg-encoder lift --provider anthropic # lift with Haiku (~$0.02/100 entities)
# Incremental update
rpg-encoder update
# Pre-commit hook (auto-updates graph on commit)
rpg-encoder hook install
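
For a rough budget before running `lift`, the quoted rate of about $0.02 per 100 entities scales linearly. The sketch below is back-of-the-envelope arithmetic only. It is not how `--dry-run` computes its estimate, and real cost will vary with entity size and provider pricing.

```python
COST_PER_100_ENTITIES_USD = 0.02  # approximate rate quoted in this README


def estimate_lift_cost(unlifted_entities: int) -> float:
    """Linear estimate of lifting cost in US dollars."""
    return unlifted_entities / 100 * COST_PER_100_ENTITIES_USD


for count in (100, 1_000, 10_000):
    print(f"{count:>6} entities: ~${estimate_lift_cost(count):.2f}")
```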
git clone https://github.com/userFRM/rpg-encoder.git
cd rpg-encoder && cargo build --release
Then point your MCP config at target/release/rpg-mcp-server.
rpg-encoder is built on the theoretical framework from the RPG-Encoder research paper, with original extensions inspired by tools across the code intelligence landscape:
G = (V_H ∪ V_L, E_dep ∪ E_feature).

This is an independent implementation. All code is original work under the MIT license. Not affiliated with or endorsed by Microsoft.