Most agents still burn thousands of tokens dumping entire documentation files into context to find one config block. This server indexes docs once by section hierarchy, then lets you search and retrieve at heading granularity with byte-precise extraction. It parses Markdown, reStructuredText, AsciiDoc, Jupyter notebooks, HTML, YAML, JSON, and OpenAPI specs into stable section IDs that survive re-indexing. Tools cover discovery, structure browsing, semantic search, and single or batch retrieval. Sections include summaries, tags, content hashes, and parent/child wiring. Implements jMRI-Full with local-first storage under `~/.doc-index/`. Built for workflows where you need agents to navigate technical docs structurally instead of grep-and-pray or brute-force file reading.
Most AI agents still explore documentation the expensive way:
open file → skim hundreds of irrelevant paragraphs → open another file → repeat
That burns tokens, floods context windows with noise, and forces models to reason through a lot of text they never needed in the first place.
jDocMunch-MCP lets AI agents navigate documentation by section instead of reading files by brute force.
It indexes a documentation set once, then retrieves exactly the section the agent actually needs, with byte-precise extraction from the original file.
| Task | Traditional approach | With jDocMunch |
|---|---|---|
| Find a configuration section | ~12,000 tokens | ~400 tokens |
| Browse documentation structure | ~40,000 tokens | ~800 tokens |
| Explore a full doc set | ~100,000 tokens | ~2,000 tokens |
Index once. Query cheaply forever.
Precision context beats brute-force context.
Commercial licenses
jDocMunch-MCP is free for non-commercial use.
Commercial use requires a paid license.
jDocMunch-only licenses
- Builder — $29 — 1 developer
- Studio — $99 — up to 5 developers
- Platform — $499 — org-wide internal deployment
Want the full jMunch suite (code + docs + data)?
1.x compatibility commitment
Every 1.x license entitles you to every future 1.x release. We will never ship a 1.x version that:
- removes or renames an MCP tool (deprecated tool names keep their aliases),
- drops a `Section` field from the response shape,
- forces a reindex without auto-migrating your existing index on first load,
- changes the JSON wire format of any tool response in a way that breaks an existing consumer,
- or makes a previously-default behavior raise.
Anything that would require breaking these promises is reserved for a future major version (2.x). The full machine-checked contract is enforced via `tests/test_server.py` (tool-name and required-field invariants) and the replay-fixture gate that runs on every release.
Stop dumping documentation files into context windows. Start navigating docs structurally.
jDocMunch indexes documentation once by heading hierarchy and section structure, then gives MCP-compatible agents precise access to the explanations they actually need instead of forcing them to brute-read files.
It is built for workflows where token efficiency, context hygiene, and agent reliability matter.
Large context windows do not fix bad retrieval.
Agents waste money and reasoning bandwidth when they:
jDocMunch fixes that by changing the unit of access from file to section.
Instead of handing an agent an entire document, it can retrieve exactly:
That makes documentation exploration cheaper, faster, and more stable.
Search and retrieve documentation by section, not just file path or keyword match.
Full content is pulled on demand from exact byte offsets into the original file.
Sections retain durable identities across re-indexing when path, heading text, and heading level remain unchanged.
Indexes and raw docs are stored locally. No hosted dependency required.
Works with Claude Desktop, Claude Code, Google Antigravity, and other MCP-compatible clients.
Every section stores:
This allows agents to discover documentation structurally, then request only the specific section they need.
Traditional doc retrieval methods all break in different ways:
jDocMunch preserves the structure the human author intended:
Agents do not need bigger context windows.
They need better navigation.
jDocMunch implements jMRI-Full — the open specification for structured retrieval MCP servers. jMRI-Full covers the full stack: discover, search, retrieve, and metadata operations with batch retrieval, hash-based drift detection, byte-offset addressing, and a complete _meta envelope on every call.
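Hash-based drift detection can be pictured with a few lines of Python. This is a minimal sketch rather than jDocMunch's actual implementation: the choice of SHA-256 and the helper names `content_hash` and `has_drifted` are assumptions made for illustration.

```python
import hashlib


def content_hash(section_bytes: bytes) -> str:
    """Hex digest of a section's raw bytes (SHA-256 is an assumption)."""
    return hashlib.sha256(section_bytes).hexdigest()


def has_drifted(stored_hash: str, current_bytes: bytes) -> bool:
    """True when the bytes on disk no longer match the hash saved at index time."""
    return content_hash(current_bytes) != stored_hash


# Hash the section when it is indexed, then compare again before serving it.
indexed = b"## Configuration\nSet `timeout` to 30.\n"
stored = content_hash(indexed)
edited = indexed.replace(b"30", b"60")

print(has_drifted(stored, indexed))
print(has_drifted(stored, edited))
```

A mismatch tells the caller that the stored byte offsets may no longer point at the same text, so the file should be re-indexed before its sections are trusted.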
| Stage | What happens |
|---|---|
| Discovery | GitHub API or local directory walk |
| Security filtering | Traversal protection, secret exclusion, binary detection |
| Parsing | Format-aware section splitting: heading-based (Markdown/MDX/HTML/RST/AsciiDoc), structure-based (OpenAPI tags, JSON keys, XML elements), or cell-based (Jupyter) |
| Hierarchy wiring | Parent/child relationships established |
| Summarization | Heading text → AI batch summaries → title fallback |
| Storage | JSON index + raw files stored locally under `~/.doc-index/` |
| Retrieval | O(1) byte-offset seeking via stable section IDs |
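To make the parsing and retrieval stages concrete, here is a rough sketch of heading-based splitting with byte offsets, followed by a seek-and-read of one section. It is an illustration under stated assumptions, not the project's parser: the `Section` dataclass, `index_sections`, and `read_section` are hypothetical names, only ATX Markdown headings are handled, and each section is taken to end where the next heading of any level begins.

```python
import io
import re
from dataclasses import dataclass
from typing import BinaryIO

# ATX headings only; headings inside fenced code blocks are not special-cased.
HEADING = re.compile(rb"^(#{1,6})[ \t]+(.+?)[ \t]*$", re.MULTILINE)


@dataclass
class Section:
    title: str
    level: int
    start: int  # byte offset of the heading line
    end: int    # byte offset just past the section body


def index_sections(raw: bytes) -> list[Section]:
    """Record where each heading's section starts and ends, in bytes."""
    matches = list(HEADING.finditer(raw))
    sections = []
    for i, m in enumerate(matches):
        # A section runs until the next heading, or to the end of the file.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(raw)
        title = m.group(2).decode("utf-8")
        sections.append(Section(title, len(m.group(1)), m.start(), end))
    return sections


def read_section(fh: BinaryIO, section: Section) -> str:
    """Seek straight to the section and read only its bytes."""
    fh.seek(section.start)
    return fh.read(section.end - section.start).decode("utf-8")


# The arrow is multi-byte in UTF-8, so byte offsets differ from character offsets.
doc = "# Install\n\nRun pip → done.\n## Prerequisites\nA working Python install.\n".encode("utf-8")
sections = index_sections(doc)
print(read_section(io.BytesIO(doc), sections[1]))
```

Working in bytes rather than characters is what lets the read step be a single `seek` plus `read`, regardless of how large the rest of the file is.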
{repo}::{doc_path}::{ancestor-chain/slug}#{level}
The slug is prefixed with the ancestor heading chain, making IDs both readable and stable. A new heading inserted in one branch of a document never renumbers IDs in another branch.
Examples:
- `owner/repo::docs/install.md::installation#1`
- `owner/repo::docs/install.md::installation/prerequisites#3`
- `owner/repo::README.md::usage/configuration/advanced-configuration#4`
- `local/myproject::guide.md::configuration#2`

IDs remain stable across re-indexing when the file path, heading text, heading level, and parent heading chain do not change.
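The ID layout is regular enough to take apart with plain string operations. The sketch below is a hypothetical helper (`parse_section_id` and `SectionId` are not part of jDocMunch) and assumes that neither the repo nor the document path contains `::`.

```python
from typing import NamedTuple


class SectionId(NamedTuple):
    repo: str
    doc_path: str
    slug_chain: list[str]  # ancestor slugs followed by the section's own slug
    level: int


def parse_section_id(section_id: str) -> SectionId:
    """Split {repo}::{doc_path}::{ancestor-chain/slug}#{level} into its parts."""
    repo, doc_path, tail = section_id.split("::", 2)
    slug_path, _, level = tail.rpartition("#")
    return SectionId(repo, doc_path, slug_path.split("/"), int(level))


parsed = parse_section_id(
    "owner/repo::README.md::usage/configuration/advanced-configuration#4"
)
print(parsed.slug_chain[-1], parsed.level)
```

Because the ancestor chain is part of the slug, two headings with the same text under different parents parse to different `slug_chain` values.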
pip install jdocmunch-mcp
Verify:
jdocmunch-mcp --help
PATH note: MCP clients often run with a restricted environment where `jdocmunch-mcp` may not be found even if it works in your shell. Using `uvx` is the recommended approach because it resolves the package on demand without relying on your system PATH. If you prefer `pip install`, use the absolute path to the executable instead:

- Linux: `/home/<username>/.local/bin/jdocmunch-mcp`
- macOS: `/Users/<username>/.local/bin/jdocmunch-mcp`
- Windows: `C:\\Users\\<username>\\AppData\\Roaming\\Python\\Python3xx\\Scripts\\jdocmunch-mcp.exe`

Config file location:
| OS | Path |
|---|---|
| macOS | ~/Library/Application Support/Claude/claude_desktop_config.json |
| Linux | ~/.config/claude/claude_desktop_config.json |
| Windows | %APPDATA%\Claude\claude_desktop_config.json |
{
  "mcpServers": {
    "jdocmunch": {
      "command": "uvx",
      "args": ["jdocmunch-mcp"]
    }
  }
}
{
  "mcpServers": {
    "jdocmunch": {
      "command": "uvx",
      "args": ["jdocmunch-mcp"],
      "env": {
        "GITHUB_TOKEN": "ghp_...",
        "ANTHROPIC_API_KEY": "sk-ant-..."
      }
    }
  }
}
For Anthropic or Gemini, the base uvx jdocmunch-mcp command is enough once the
corresponding API key is present. For OpenAI-compatible providers such as OpenAI,
MiniMax, or GLM-5, include the optional dependency in the launcher command:
{
  "mcpServers": {
    "jdocmunch": {
      "command": "uvx",
      "args": ["--with", "openai", "jdocmunch-mcp"],
      "env": {
        "MINIMAX_API_KEY": "mx-...",
        "JDOCMUNCH_SUMMARIZER_PROVIDER": "minimax"
      }
    }
  }
}
After saving the config, restart Claude Desktop / Claude Code.
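If you edit the config by hand, the main risk is overwriting servers that are already registered. The following sketch shows one way to merge the entry in safely. `add_jdocmunch` is a hypothetical helper, not something the package ships, and reading or writing the actual file is left to you.

```python
import copy
import json


def add_jdocmunch(config: dict) -> dict:
    """Return a copy of a client config with the jdocmunch entry merged in."""
    merged = copy.deepcopy(config)
    # Keep whatever servers are already registered; only add or replace ours.
    servers = merged.setdefault("mcpServers", {})
    servers["jdocmunch"] = {"command": "uvx", "args": ["jdocmunch-mcp"]}
    return merged


# "other-server" stands in for any entry already present in your config.
existing = {"mcpServers": {"other-server": {"command": "other-command", "args": []}}}
updated = add_jdocmunch(existing)
print(json.dumps(updated, indent=2))
```

The input dictionary is left untouched, so you can diff `existing` against `updated` before writing anything back to disk.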
jDocMunch ships enforcement hooks that keep your agent honest:
- `Read` a large doc file, suggesting `search_sections` + `get_section`
- `Edit`/`Write` so the index never goes stale

Install everything in one command:
jdocmunch-mcp init
This detects your MCP clients, patches their config, installs a Doc Exploration Policy into CLAUDE.md, sets up enforcement hooks, and indexes your current directory. Use --dry-run to preview, --demo for a benefit summary, or --yes for non-interactive mode.
For hooks only:
jdocmunch-mcp init --hooks
If you also use jCodeMunch, run both:
jcodemunch-mcp init
jdocmunch-mcp init
| Subcommand | Purpose |
|---|---|
| `serve` (default) | Run the MCP server (stdio) |
| `init` | One-command onboarding: detect clients, write config, install policy, hooks, index |
| `claude-md` | Print or install the Doc Exploration Policy (`--install global\|project`) |
| `index-local --path <dir>` | Index a local folder (CLI, no MCP session needed) |
| `index-file <path>` | Re-index a single file within an existing index |
| `hook-pretooluse` | PreToolUse hook handler (reads JSON from stdin) |
| `hook-posttooluse` | PostToolUse hook handler (reads JSON from stdin) |
| `hook-precompact` | PreCompact hook handler (reads JSON from stdin) |
Open the ⋯ menu → MCP Servers → Manage MCP Servers, then add the entry to `mcp_config.json`:
{
  "mcpServers": {
    "jdocmunch": {
      "command": "uvx",
      "args": ["jdocmunch-mcp"]
    }
  }
}
Option A — CLI (one command):
openclaw mcp set jdocmunch '{"command":"uvx","args":["jdocmunch-mcp"]}'
Option B — Edit config directly:
Add the entry to ~/.openclaw/openclaw.json under mcpServers:
{
  "mcpServers": {
    "jdocmunch": {
      "command": "uvx",
      "args": ["jdocmunch-mcp"],
      "transport": "stdio"
    }
  }
}
With optional AI summaries:
{
  "mcpServers": {
    "jdocmunch": {
      "command": "uvx",
      "args": ["jdocmunch-mcp"],
      "transport": "stdio",
      "env": {
        "ANTHROPIC_API_KEY": "${ANTHROPIC_API_KEY}"
      }
    }
  }
}
Restart the gateway and verify:
openclaw gateway restart
openclaw mcp list
Per-agent routing (optional):
{
  "agents": {
    "researcher": {
      "mcpServers": ["jdocmunch", "brave-search", "fetch"]
    }
  }
}
Without explicit instructions, your agent will ignore jDocMunch even though it's connected. Create a system prompt file (e.g. ~/.openclaw/agents/researcher.md) with:
## Documentation Policy
Always use jDocMunch-MCP tools for documentation exploration.
- Before reading a doc file: use search_sections or get_toc
- To retrieve specific content: use get_section with the section ID
- To index local docs: use index_local with the docs folder path
- Never open documentation files directly — navigate by section.
Point your agent at it in ~/.openclaw/openclaw.json:
{
  "agents": {
    "named": {
      "researcher": {
        "systemPromptFile": "~/.openclaw/agents/researcher.md"
      }
    }
  }
}
index_local: { "path": "/path/to/docs" }
index_repo: { "url": "owner/repo" }
get_toc: { "repo": "owner/repo" }
get_toc_tree: { "repo": "owner/repo" }
get_document_outline: { "repo": "owner/repo", "doc_path": "docs/config.md" }
search_sections: { "repo": "owner/repo", "query": "authentication" }
get_section: { "repo": "owner/repo", "section_id": "owner/repo::docs/config.md::authentication#1" }
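MCP clients normally build the wire messages for you, but it can help to see what one of the calls above looks like as a JSON-RPC request. The sketch below follows the MCP `tools/call` request shape. The `tool_call` helper and its request-ID counter are hypothetical conveniences, not part of jDocMunch.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC requests need a unique id per call


def tool_call(name: str, arguments: dict) -> str:
    """Serialize one MCP tools/call request for the given tool and arguments."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    })


# Search first, then fetch only the section that matched.
search = tool_call("search_sections", {"repo": "owner/repo", "query": "authentication"})
fetch = tool_call(
    "get_section",
    {"repo": "owner/repo", "section_id": "owner/repo::docs/config.md::authentication#1"},
)
print(search)
print(fetch)
```

The two-step pattern is the point: the search response carries summaries only, and full content is requested for a single section ID.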
| Tool | Purpose |
|---|---|
| `index_local` | Index a local documentation folder |
| `index_repo` | Index a GitHub repository’s docs |
| `list_repos` | List indexed documentation sets |
| `get_toc` | Flat section list in document order |
| `get_toc_tree` | Nested section tree per document |
| `get_document_outline` | Section hierarchy for one document |
| `search_sections` | Weighted search returning summaries only |
| `get_section` | Full content of one section |
| `get_sections` | Batch content retrieval |
| `get_section_context` | Section + ancestor headings + child summaries |
| `delete_index` | Remove a doc index |
| `get_broken_links` | Detect internal links/anchors that no longer resolve |
| `get_doc_coverage` | Which jcodemunch symbols have matching doc sections |
Search and retrieval tools include a _meta envelope with timing, token savings, and cost avoided.
Example:
"_meta": {
  "latency_ms": 12,
  "sections_returned": 5,
  "tokens_saved": 1840,
  "total_tokens_saved": 94320,
  "cost_avoided": { "claude_opus": 0.0276, "gpt5_latest": 0.0184 },
  "total_cost_avoided": { "claude_opus": 1.4148, "gpt5_latest": 0.9432 }
}
total_tokens_saved and total_cost_avoided accumulate across tool calls and persist to ~/.doc-index/_savings.json.
Every jDocMunch tool response includes a _meta block with tokens_saved (this call) and total_tokens_saved (lifetime). To check your cumulative savings, ask your agent to call any jDocMunch tool (e.g. get_toc or search_sections) and look at the _meta envelope. Lifetime stats persist in ~/.doc-index/_savings.json across sessions.
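A short script can pull the savings figures out of a response. This sketch uses only the `_meta` fields shown in the example above. The helper names are hypothetical, and the per-million-token rate it prints is inferred from the sample numbers rather than taken from any documented pricing constant.

```python
def summarize_meta(response: dict) -> str:
    """One-line summary of per-call and lifetime token savings."""
    meta = response.get("_meta", {})
    saved = meta.get("tokens_saved", 0)
    total = meta.get("total_tokens_saved", 0)
    return f"saved {saved} tokens this call, {total} lifetime"


def implied_rate_per_million(meta: dict, model: str) -> float:
    """Cost avoided per million tokens, derived from one call's figures."""
    return meta["cost_avoided"][model] / meta["tokens_saved"] * 1_000_000


# The sample envelope from the example above.
response = {
    "_meta": {
        "latency_ms": 12,
        "sections_returned": 5,
        "tokens_saved": 1840,
        "total_tokens_saved": 94320,
        "cost_avoided": {"claude_opus": 0.0276, "gpt5_latest": 0.0184},
        "total_cost_avoided": {"claude_opus": 1.4148, "gpt5_latest": 0.9432},
    }
}

print(summarize_meta(response))
print(round(implied_rate_per_million(response["_meta"], "claude_opus"), 2))
```

In the sample data the per-call and lifetime figures work out to the same rate for each model, which suggests that cost avoided is simply tokens saved multiplied by a flat per-token price.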
| Format | Extensions | Notes |
|---|---|---|
| Markdown | .md, .markdown | ATX (# Heading) and setext headings |
| MDX | .mdx | JSX tags, frontmatter, import/export stripped before parsing |
| Plain text | .txt | Paragraph-block section splitting |
| reStructuredText | .rst | Adornment-based heading detection |
| AsciiDoc | .adoc | = and == heading hierarchy |
| Jupyter Notebook | .ipynb | Markdown cells used as sections; code cells attached as content |
| HTML | .html | <h1>–<h6> headings; boilerplate stripped |
| OpenAPI / Swagger | .yaml, .yml, .json, .jsonc | OpenAPI 3.x and Swagger 2.x; operations grouped by tag as sections |
| JSON / JSONC | .json, .jsonc | Top-level keys as sections; JSONC comments stripped before parsing |
| XML / SVG / XHTML | .xml, .svg, .xhtml | Element hierarchy used for section structure |
See ARCHITECTURE.md for parser details.
Built-in protections include:
- secret exclusion (`.env`, `*.pem`, and similar)
- path validation through `_safe_content_path()`

See SECURITY.md for details.

"auto", on whenever a provider is configured, but the core workflow remains structure-first)

| Variable | Purpose | Required |
|---|---|---|
| `GITHUB_TOKEN` | GitHub API auth | No |
| `ANTHROPIC_API_KEY` | Section summaries via Claude Haiku | No |
| `GOOGLE_API_KEY` | Section summaries via Gemini Flash; also Gemini embeddings | No |
| `OPENAI_API_KEY` | OpenAI embeddings (`text-embedding-3-small`) | No |
| `JDOCMUNCH_EMBEDDING_PROVIDER` | Force provider: `gemini`, `openai`, `openai-compatible`, `sentence-transformers`, `none` | No |
| `JDOCMUNCH_OPENAI_COMPAT_URL` | Endpoint URL for openai-compatible embeddings | No |
| `JDOCMUNCH_OPENAI_COMPAT_MODEL` | Model for openai-compatible embeddings | No |
| `JDOCMUNCH_OPENAI_COMPAT_API_KEY` | Dedicated optional API key for openai-compatible embeddings | No |
| `JDOCMUNCH_OPENAI_COMPAT_BATCH_SIZE` | Batch size for openai-compatible embeddings (default: 32) | No |
| `JDOCMUNCH_ST_MODEL` | sentence-transformers model (default: `all-MiniLM-L6-v2`) | No |
| `DOC_INDEX_PATH` | Custom cache path | No |
| `JDOCMUNCH_SHARE_SAVINGS` | Set to `0` to disable anonymous community token savings reporting | No |
Each tool call can contribute an anonymous delta to a live global counter at j.gravelle.us. Only two values are sent:
No content, file paths, repo names, or identifying material are sent.
The anonymous install ID is generated once and stored in ~/.doc-index/_savings.json.
To disable reporting, set:
JDOCMUNCH_SHARE_SAVINGS=0
PRs welcome! All contributors must sign the Contributor License Agreement before their PR can be merged — CLA Assistant will prompt you automatically. See CONTRIBUTING.md for details.
This repository is free for non-commercial use under the terms below. Commercial use requires a paid commercial license.
jDocMunch plugs into any MCP-compatible agent or IDE. Tested configurations:
| Platform | Config |
|---|---|
| Claude Code / Claude Desktop | jdocmunch-mcp init (auto-detects and patches config) |
| Cursor / Windsurf | jdocmunch-mcp init or manual mcp.json |
| Hermes Agent | Add to ~/.hermes/config.yaml — see skill |
| Any MCP client | stdio: jdocmunch-mcp |
# ~/.hermes/config.yaml
mcp_servers:
  jdocmunch:
    command: "uvx"
    args: ["jdocmunch-mcp"]
Copyright (c) 2026 J. Gravelle
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to use, copy, modify, merge, publish, and distribute the Software for personal, educational, research, hobby, or other non-commercial purposes, subject to the following conditions:
Commercial use of the Software requires a separate paid commercial license from the author.
“Commercial use” includes, but is not limited to:
For commercial licensing inquiries: j@gravelle.us https://j.gravelle.us
Until a commercial license is obtained, commercial use is not permitted.
THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, AND NONINFRINGEMENT.
IN NO EVENT SHALL THE AUTHOR OR COPYRIGHT HOLDER BE LIABLE FOR ANY CLAIM, DAMAGES, OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT, OR OTHERWISE, ARISING FROM, OUT OF, OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.