Gives Claude surgical access to codebases across 10 languages without burning context on full file reads. Exposes 15 tools including tree-sitter AST extraction, hybrid search combining FTS5 and vector embeddings, call graph traversal, and O(1) symbol retrieval via byte-offset seeking. Index local folders or remote GitHub/GitLab repos, then query specific functions, trace who calls what, diff symbol changes for PR review, or search semantically across Python, TypeScript, Rust, Go, Java, C, C++, C#, Ruby, and JavaScript. Stores everything in per-repo SQLite databases with incremental SHA-256 based re-indexing. Reach for this when you need an AI agent to navigate large codebases efficiently or want to map dependencies before refactoring.
Intelligent code indexing MCP server. 15 tools, 10 languages, tree-sitter AST extraction, hybrid search (FTS5 + vector), call graphs, remote repo indexing, incremental indexing.
Save up to 99% of tokens on symbol lookups: fetch the exact function source via a byte-offset seek instead of reading entire files.
pip install codemunch-pro
Add to your MCP client config:
{
  "mcpServers": {
    "codemunch-pro": {
      "command": "codemunch-pro"
    }
  }
}
To serve over streamable HTTP instead:
codemunch-pro --transport streamable-http --port 5002
| Tool | Description |
|---|---|
| index_folder | Index a local directory (incremental, SHA-256 based) |
| index_repo | Index a GitHub/GitLab repo (tarball download, no git needed) |
| list_repos | List all indexed repositories with stats |
| invalidate_cache | Force re-index of a repository |
| file_tree | Get directory tree with file counts |
| file_outline | List symbols in a single file |
| repo_outline | List all symbols in a repo (summary) |
| get_symbol | Get full source of one symbol (O(1) byte seek) |
| get_symbols | Batch get multiple symbols |
| search_symbols | Hybrid search (FTS5 + vector, fused with RRF) |
| search_text | Full-text search in file contents |
| get_callees | What does this function call? |
| get_callers | Who calls this function? |
| diff_symbols | What changed since the last index? (PR review) |
| dependency_map | What does this file depend on? What depends on it? |
Python, JavaScript, TypeScript, Go, Rust, Java, C, C++, C#, Ruby
All via tree-sitter-language-pack — zero compilation, pre-built binaries.
Every symbol stores its byte offset and length. get_symbol seeks directly to the function source, with no need to read the entire file. Fetching a 200-byte function from a 40 KB file reads about 0.5% of the file, a roughly 99.5% token saving.
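The byte-offset lookup described above can be illustrated with a minimal sketch. The `read_symbol` helper and the demo file are hypothetical and are not codemunch-pro's actual code; the sketch only assumes that an index has recorded an (offset, length) pair for each symbol at parse time.

```python
import os
import tempfile


def read_symbol(path: str, byte_offset: int, byte_length: int) -> str:
    """Return one symbol's source by seeking to its stored byte range."""
    # Binary mode, so the offset counts bytes rather than decoded characters.
    with open(path, "rb") as f:
        f.seek(byte_offset)
        return f.read(byte_length).decode("utf-8")


# Demo file with two functions. The (offset, length) pair stands in for
# what an index would have stored when the file was parsed.
source = b"def first():\n    return 1\n\ndef second():\n    return 2\n"
target = b"def second():\n    return 2\n"
offset = source.index(target)

with tempfile.NamedTemporaryFile(delete=False, suffix=".py") as tmp:
    tmp.write(source)

symbol_source = read_symbol(tmp.name, offset, len(target))
os.unlink(tmp.name)
print(symbol_source)
```

Only `byte_length` bytes are read regardless of file size, which is where the savings come from.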
Files are hashed (SHA-256). Only changed files are re-parsed. Re-indexing a 10K file repo after changing one file takes milliseconds.
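As a rough illustration of hash-based change detection, the sketch below compares current file contents against previously stored SHA-256 digests and returns only the paths that need re-parsing. The function names and the in-memory dictionaries are assumptions made for the example; the real server keeps its hashes in the per-repo SQLite database.

```python
import hashlib


def sha256_of(data: bytes) -> str:
    """Hex digest used as the file's content fingerprint."""
    return hashlib.sha256(data).hexdigest()


def files_to_reparse(current: dict[str, bytes], stored: dict[str, str]) -> list[str]:
    """Return paths that are new or whose content hash has changed."""
    return sorted(
        path
        for path, data in current.items()
        if stored.get(path) != sha256_of(data)
    )


# First indexing pass: record a hash for every file.
v1 = {"a.py": b"x = 1\n", "b.py": b"y = 2\n"}
stored_hashes = {path: sha256_of(data) for path, data in v1.items()}

# Later: b.py was edited and c.py was added; a.py is untouched.
v2 = {"a.py": b"x = 1\n", "b.py": b"y = 3\n", "c.py": b"z = 0\n"}
changed = files_to_reparse(v2, stored_hashes)
print(changed)  # ['b.py', 'c.py']
```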
Combines BM25 keyword matching with semantic vector similarity using Reciprocal Rank Fusion. Search "authentication middleware" and find auth_middleware, verify_token, and login_handler.
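For readers unfamiliar with Reciprocal Rank Fusion, here is a small self-contained sketch. It assumes the commonly used constant k = 60 and two hypothetical result lists; the constant and weighting that codemunch-pro actually uses are not stated here.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked lists: each item scores sum(1 / (k + rank))."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, item in enumerate(ranking, start=1):
            scores[item] = scores.get(item, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda item: (-scores[item], item))


# Hypothetical results for the query "authentication middleware".
keyword_hits = ["auth_middleware", "login_handler", "parse_config"]  # BM25 order
vector_hits = ["verify_token", "auth_middleware", "login_handler"]   # embedding order

fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
print(fused)
```

Symbols that rank well in both lists rise to the top, while a symbol found by only one method (such as `verify_token` here) still appears in the fused results.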
Traces function calls through the AST. get_callees("main") shows what main calls. get_callers("authenticate") shows who calls authenticate. Supports depth traversal.
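Depth-limited traversal over caller-to-callee edges can be sketched as a breadth-first search. The edge list and the `callees_within` helper below are hypothetical; they only mirror the idea of a call-edge table and do not reproduce the tool's real signature or output format.

```python
from collections import deque


def callees_within(edges: list[tuple[str, str]], start: str, max_depth: int) -> dict[str, int]:
    """Breadth-first walk of caller -> callee edges, returning {function: depth}."""
    graph: dict[str, list[str]] = {}
    for caller, callee in edges:
        graph.setdefault(caller, []).append(callee)

    depths: dict[str, int] = {}
    seen = {start}                      # guards against cycles and recursion
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_depth:
            continue                    # do not expand past the requested depth
        for callee in graph.get(node, []):
            if callee not in seen:
                seen.add(callee)
                depths[callee] = depth + 1
                queue.append((callee, depth + 1))
    return depths


# Hypothetical call edges, including a cycle back to authenticate.
edges = [
    ("main", "load_config"),
    ("main", "authenticate"),
    ("authenticate", "verify_token"),
    ("verify_token", "decode_jwt"),
    ("decode_jwt", "authenticate"),
]

direct = callees_within(edges, "main", max_depth=1)
two_levels = callees_within(edges, "main", max_depth=2)
print(direct, two_levels)
```

Finding callers is the same walk over the reversed edges, that is, with each `(caller, callee)` pair swapped.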
Index any public GitHub or GitLab repo by URL — no git binary needed. Downloads the tarball via API, extracts, and indexes. Cached locally with SHA-based freshness checks. Supports private repos with auth tokens and sparse paths.
Search raw file contents — string literals, TODO comments, config values, error messages. Not just symbol names.
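A minimal sketch of raw-content search with SQLite FTS5 follows. It assumes a Python build whose bundled SQLite includes the FTS5 extension, and the column names (`path`, `content`) and sample rows are made up for illustration; only the table name `file_content_fts` comes from the storage layout of this project.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Assumed two-column layout; the real schema may differ.
conn.execute("CREATE VIRTUAL TABLE file_content_fts USING fts5(path, content)")
conn.executemany(
    "INSERT INTO file_content_fts (path, content) VALUES (?, ?)",
    [
        ("app/auth.py", "# TODO: rotate signing keys\nraise ValueError('token expired')"),
        ("app/config.py", "TIMEOUT_SECONDS = 30"),
    ],
)

# Both terms must appear in a row (implicit AND); best matches come first.
rows = conn.execute(
    "SELECT path FROM file_content_fts WHERE file_content_fts MATCH ? ORDER BY rank",
    ("token expired",),
).fetchall()
matches = [row[0] for row in rows]
print(matches)
conn.close()
```

The query finds an error message inside a string literal, which a symbol-name search alone would not surface.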
~/.codemunch-pro/
├── myproject_a1b2c3d4e5f6.db # Per-repo SQLite database
├── otherproject_7890abcdef.db
└── ...
Each DB contains:
├── files # Indexed files with SHA-256 hashes
├── symbols # Functions, classes, methods, types
├── symbols_fts # FTS5 full-text search index
├── symbols_vec # sqlite-vec 384-dim vector index
├── call_edges # Call graph (caller → callee)
└── file_content_fts # Raw file content search
License: MIT