This is a code indexing server that gives Claude surgical access to large codebases without burning context tokens. It uses tree-sitter to parse 10 languages into ASTs, then stores symbols in SQLite with FTS5 and vector embeddings for hybrid search. You get 15 tools including get_symbol for O(1) byte-offset retrieval (grab a 200-byte function from a 40KB file), search_symbols with semantic and keyword matching, call graph traversal with get_callers and get_callees, and index_repo to pull down any GitHub or GitLab repository without needing git installed. Incremental indexing via SHA-256 hashing means re-indexing after one file change takes milliseconds. Reach for this when you're building agents that need to navigate unfamiliar codebases or doing refactoring work where you need to trace dependencies before moving code around.
Intelligent code indexing MCP server. 15 tools, 10 languages, tree-sitter AST extraction, hybrid search (FTS5 + vector), call graphs, remote repo indexing, incremental indexing.
Save up to 99% of tokens: fetch the exact source of a function with a byte-offset seek instead of reading the entire file.
Formerly codemunch-pro. Same code, better name.
pip install tokennuke
Add to your MCP client config:
{
  "mcpServers": {
    "tokennuke": {
      "command": "tokennuke"
    }
  }
}
To run the server over the streamable HTTP transport:
tokennuke --transport streamable-http --port 5002
| Tool | Description |
|---|---|
| `index_folder` | Index a local directory (incremental, SHA-256 based) |
| `index_repo` | Index a GitHub/GitLab repo (tarball download, no git needed) |
| `list_repos` | List all indexed repositories with stats |
| `invalidate_cache` | Force re-index a repository |
| `file_tree` | Get directory tree with file counts |
| `file_outline` | List symbols in a single file |
| `repo_outline` | List all symbols in repo (summary) |
| `get_symbol` | Get full source of one symbol (O(1) byte seek) |
| `get_symbols` | Batch get multiple symbols |
| `search_symbols` | Hybrid search (FTS5 + vector RRF) |
| `search_text` | Full-text search in file contents |
| `get_callees` | What does this function call? |
| `get_callers` | Who calls this function? |
| `diff_symbols` | What changed since last index? (PR review) |
| `dependency_map` | What does this file depend on? What depends on it? |
Python, JavaScript, TypeScript, Go, Rust, Java, C, C++, C#, Ruby
All via tree-sitter-language-pack — zero compilation, pre-built binaries.
Every symbol stores its byte offset and length. get_symbol seeks directly to the function source — no reading entire files. A 200-byte function from a 40KB file = 99.5% token savings.
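The byte-offset idea can be sketched in a few lines of Python. This is only an illustration, assuming the index has recorded a path, byte offset, and byte length for each symbol; `read_symbol` is a hypothetical helper, not TokenNuke's actual API.

```python
import os
import tempfile


def read_symbol(path: str, byte_offset: int, byte_length: int) -> str:
    """Return only the bytes of one symbol, without reading the rest of the file."""
    with open(path, "rb") as f:  # binary mode so offsets count bytes, not characters
        f.seek(byte_offset)
        return f.read(byte_length).decode("utf-8")


# Build a throwaway file: padding, one small function, more padding.
padding = "# filler line\n" * 1000
func_src = "def add(a, b):\n    return a + b\n"
source = padding + func_src + padding

with tempfile.NamedTemporaryFile("wb", suffix=".py", delete=False) as tmp:
    tmp.write(source.encode("utf-8"))
    demo_path = tmp.name

# These two numbers stand in for what an indexer would have stored.
offset = len(padding.encode("utf-8"))
length = len(func_src.encode("utf-8"))

snippet = read_symbol(demo_path, offset, length)
file_size = os.path.getsize(demo_path)
savings = 1 - length / file_size  # fraction of the file that was never read
os.remove(demo_path)
```

Offsets are measured in bytes rather than characters, which is why the file is opened in binary mode; a character offset would drift on any file containing multi-byte UTF-8 text.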
Files are hashed (SHA-256). Only changed files are re-parsed. Re-indexing a 10K file repo after changing one file takes milliseconds.
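A minimal sketch of hash-based change detection, assuming the previous hashes are kept in a simple mapping from path to digest. The function names and the dict-based store are hypothetical; the real server keeps its hashes in SQLite.

```python
import hashlib
import os
import tempfile


def sha256_of(path: str) -> str:
    """Hash a file in chunks so large files are not loaded into memory at once."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def changed_files(paths, stored_hashes):
    """Return the paths whose hash differs from the stored one, updating the store."""
    changed = []
    for p in paths:
        digest = sha256_of(p)
        if stored_hashes.get(p) != digest:
            changed.append(p)  # only these files would need re-parsing
            stored_hashes[p] = digest
    return changed


with tempfile.TemporaryDirectory() as d:
    paths = []
    for name in ("a.py", "b.py", "c.py"):
        p = os.path.join(d, name)
        with open(p, "w") as f:
            f.write(f"# {name}\n")
        paths.append(p)

    hashes = {}
    first = changed_files(paths, hashes)   # first index: every file is new
    second = changed_files(paths, hashes)  # nothing touched: nothing to re-parse
    with open(paths[1], "a") as f:
        f.write("x = 1\n")
    third = changed_files(paths, hashes)   # only the edited file is reported
    third_names = [os.path.basename(p) for p in third]
```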
Combines BM25 keyword matching with semantic vector similarity using Reciprocal Rank Fusion. Search "authentication middleware" and find auth_middleware, verify_token, and login_handler.
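Reciprocal Rank Fusion itself is small enough to sketch. The example below assumes two already-ranked result lists (one keyword, one vector) and uses the commonly cited constant `k = 60`; the constant TokenNuke actually uses, and the sample rankings, are assumptions made for illustration.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists: each item scores sum(1 / (k + rank)) over the lists."""
    scores = {}
    for ranking in rankings:
        for rank, item in enumerate(ranking, start=1):
            scores[item] = scores.get(item, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by name for a stable order.
    return sorted(scores, key=lambda item: (-scores[item], item))


# Hypothetical results for the query "authentication middleware".
keyword_hits = ["auth_middleware", "login_handler", "parse_config"]
vector_hits = ["verify_token", "auth_middleware", "login_handler"]

fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
# fused == ["auth_middleware", "login_handler", "verify_token", "parse_config"]
```

Because RRF works on ranks rather than raw scores, BM25 scores and cosine similarities never need to be normalized against each other. Symbols that appear in both lists rise to the top, while a symbol found by only one method (here `verify_token`, a semantic match with no keyword overlap) still makes the list.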
Traces function calls through the AST. get_callees("main") shows what main calls. get_callers("authenticate") shows who calls authenticate. Supports depth traversal.
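Depth-limited traversal over stored call edges can be sketched as a breadth-first search. The edge list and helper names below are hypothetical; they only illustrate how caller and callee lookups can share one traversal over (caller, callee) pairs.

```python
from collections import deque


def build_index(edges):
    """Turn (caller, callee) pairs into forward and reverse adjacency maps."""
    callees, callers = {}, {}
    for caller, callee in edges:
        callees.setdefault(caller, set()).add(callee)
        callers.setdefault(callee, set()).add(caller)
    return callees, callers


def traverse(adjacency, start, depth):
    """Breadth-first walk up to `depth` hops; returns {symbol: distance}."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if seen[node] == depth:
            continue  # do not expand past the requested depth
        for nxt in adjacency.get(node, ()):
            if nxt not in seen:  # also guards against cycles and recursion
                seen[nxt] = seen[node] + 1
                queue.append(nxt)
    del seen[start]
    return seen


edges = [
    ("main", "load_config"),
    ("main", "authenticate"),
    ("authenticate", "verify_token"),
    ("login_handler", "authenticate"),
]
callees, callers = build_index(edges)

main_depth1 = traverse(callees, "main", 1)           # direct callees only
main_depth2 = traverse(callees, "main", 2)           # also reaches verify_token
auth_callers = traverse(callers, "authenticate", 1)  # main and login_handler
```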
Index any public GitHub or GitLab repo by URL — no git binary needed. Downloads the tarball via API, extracts, and indexes. Cached locally with SHA-based freshness checks. Supports private repos with auth tokens and sparse paths.
Search raw file contents — string literals, TODO comments, config values, error messages. Not just symbol names.
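Raw-content search of this kind maps naturally onto an SQLite FTS5 table. The sketch below assumes a two-column layout (`path`, `content`) and a Python build whose SQLite includes FTS5; the column names, sample files, and `search_text` helper are illustrative and not taken from TokenNuke's schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed layout: one row per file, with the path and the raw text.
conn.execute("CREATE VIRTUAL TABLE file_content_fts USING fts5(path, content)")
conn.executemany(
    "INSERT INTO file_content_fts (path, content) VALUES (?, ?)",
    [
        ("app/db.py", "# TODO: retry on failure\nraise RuntimeError('connection timeout')"),
        ("app/config.py", "MAX_RETRIES = 3\nTIMEOUT_SECONDS = 30"),
        ("app/auth.py", "def login():\n    pass  # TODO remove stub"),
    ],
)


def search_text(conn, query, limit=10):
    """Return paths of files whose content matches an FTS5 query, best match first."""
    rows = conn.execute(
        "SELECT path FROM file_content_fts "
        "WHERE file_content_fts MATCH ? ORDER BY rank LIMIT ?",
        (query, limit),
    )
    return [row[0] for row in rows]


todo_hits = search_text(conn, "TODO")                    # comments, not symbols
timeout_hits = search_text(conn, '"connection timeout"')  # quoted phrase query
```

Wrapping a query in double quotes makes FTS5 treat it as a phrase, so the second search matches only the error message and not the unrelated `TIMEOUT_SECONDS` setting.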
~/.tokennuke/
├── myproject_a1b2c3d4e5f6.db # Per-repo SQLite database
├── otherproject_7890abcdef.db
└── ...
Each DB contains:
├── files # Indexed files with SHA-256 hashes
├── symbols # Functions, classes, methods, types
├── symbols_fts # FTS5 full-text search index
├── symbols_vec # sqlite-vec 384-dim vector index
├── call_edges # Call graph (caller → callee)
└── file_content_fts # Raw file content search
MIT