Routes document parsing through three engines with automatic fallback: markitdown (MIT licensed, handles PDF/DOCX/XLSX/HTML), Docling (for complex tables and scanned PDFs), and LlamaParse (cloud, bring your own key). The parse tool accepts file paths or URLs and returns markdown plus the backend chain it attempted. Includes parse_to_vault for saving converted docs to your Obsidian inbox with frontmatter metadata, and interpret for piping parsed content directly into Claude with prompt cache support. The benchmark tool runs all available backends side by side so you can see latency and quality differences. Also ships a chunk_text tool with document type detection that splits markdown into retrieval chunks while respecting structure (keeps paper abstracts whole, never splits numbered manual sections). Built on FastMCP with stdio transport.
Public tool metadata for what this MCP can expose to an agent.
parse_searchFind brands, organic AI prompts, citation sources, and market niches for marketer research. Use this first when the user names a brand, category, source, or AI visibility question.3 paramsFind brands, organic AI prompts, citation sources, and market niches for marketer research. Use this first when the user names a brand, category, source, or AI visibility question.
limitnumberquerystringtypesarrayparse_get_brandFetch a concise public marketing brief for one brand, including Parse score, strengths, weak spots, top prompts, citation sources, related brands, and next research questions.1 paramsFetch a concise public marketing brief for one brand, including Parse score, strengths, weak spots, top prompts, citation sources, related brands, and next research questions.
slug_or_idstringparse_get_promptFetch one public organic prompt by slug when the user wants to inspect the exact AI-search question behind a result.1 paramsFetch one public organic prompt by slug when the user wants to inspect the exact AI-search question behind a result.
slugstringparse_get_statsExplain the public Parse index scale and freshness: tracked brands, organic prompts, and citation observations.Explain the public Parse index scale and freshness: tracked brands, organic prompts, and citation observations.
No parameter schema in public metadata yet.
searchCompatibility alias for parse_search. Use for clients that expect a generic search tool.2 paramsCompatibility alias for parse_search. Use for clients that expect a generic search tool.
limitnumberquerystringfetchCompatibility alias that resolves fetch IDs like brand:stripe or prompt:best-crm into JSON-text results with human-readable text.1 paramsCompatibility alias that resolves fetch IDs like brand:stripe or prompt:best-crm into JSON-text results with human-readable text.
idstringOne MCP, many parsers. Default markitdown (free, fast, MIT). Escalate to Docling (table-heavy, scanned PDFs) or LlamaParse (cloud, BYOK) when markitdown's quality isn't enough. Plus an interpret tool that pipes parsed markdown into Claude for "summarize / extract X" so you stop juggling parsers and anthropic skills.
Open Claude Code, paste:
/plugin marketplace add adelaidasofia/parse-mcp
/plugin install parse-mcp@parse-mcp
Manual install (pre-plugin-marketplace). See SETUP.md for full details.
pip3 install --break-system-packages -r requirements.txt
pip3 install --break-system-packages 'markitdown[pdf,docx,pptx,xlsx]'
Then register the server in your client's .mcp.json:
{
"mcpServers": {
"parse": {
"command": "python3",
"args": ["/absolute/path/to/parse-mcp/server.py"]
}
}
}
| Tool | What it does |
|---|---|
parse(source, backend?, hints?) | File path or http(s) URL to markdown. Router picks backend, falls back on empty/error. Returns markdown plus a chain of every backend attempted. |
parse_url(url, backend?) | Shortcut for HTTP(S) inputs. Same return shape as parse. |
parse_to_vault(source, vault_folder?, backend?, overwrite?) | Parse + write the result as a markdown note in the vault. Default folder: <VAULT_ROOT>/📥 Inbox/Converted/. Frontmatter records source, format, backend, latency, bytes_in. Replaces the standalone markitdown_to_vault.py shell script. |
interpret(source, instruction, backend?, model?, max_tokens?) | Parse first, then ask Claude over the parsed markdown. Cache hits reuse parsed text for free input tokens. |
list_backends() | Which backends are installed + which are missing. Diagnostic. |
benchmark(source) | Run every available backend on the same input. Compare latency + output side by side. |
chunk_text(text, doc_type?, target_tokens?, max_tokens?, min_tokens?) | Chunk parsed markdown into retrieval-ready pieces using a doc-type-aware chunker. doc_type="auto" (default) runs structural detection and picks one of paper / book / manual / qa / resume / table / default. Each chunker honors document shape (e.g., paper keeps the abstract whole; manual never merges across numbered sections; qa pairs each question with its answer). Returns chunks + the resolved doc_type. See chunkers/ package. |
detect_doc_type(text) | Diagnostic. Run structural heuristics over markdown and return the doc_type that chunk_text would pick. |
pip install docling). Best for complex tables (97.9% on benchmark) + scanned PDFs. Downloads model weights on first run.pip install llama-cloud-services + LLAMA_CLOUD_API_KEY). Cloud, cleanest output on visually-complex PDFs.parse(source) with no backend arg: router picks based on file format, falls back if backend errors or returns empty.parse(source, backend="docling"): force a specific backend, no fallback. Diagnostic mode.FastMCP v3.2.3+, stdio transport, Python 3.13+. Registered in [VAULT_ROOT]/.mcp.json. No daemons, no listeners, no model weights downloaded by default.
See SETUP.md for install + per-backend opt-in.
Same author, same architecture pattern (FastMCP, draft+confirm on writes, vault auto-export where applicable):
This plugin sends a single anonymous install signal to myceliumai.co the first time it loads in a Claude Code session on a given machine.
What is sent:
slack-mcp)0.1.0)What is NOT sent:
Why: Helps the maintainer know which plugins people actually install, so attention goes to the ones that get used.
Opt out: Set the environment variable MYCELIUM_NO_PING=1 before launching Claude Code. The hook will skip the network call entirely. Already-pinged installs leave a sentinel at ~/.mycelium/onboarded-<plugin> — delete it if you want to reset state.
MIT. See LICENSE.
Full install or team version at diazroa.com.
LLAMAPARSE_API_KEYsecretOptional fallback backend; only required for llamaparse routing