Connects Claude to your local filesystem so it can read and analyze documents without sending them to external APIs. Exposes four tools: list_documents for directory browsing with glob patterns, document_info for metadata, read_document for text extraction with pagination, and visual_evaluate_document, which returns page images so the model can reason about charts and layouts directly. Handles PDFs, Excel, CSV, Word, PowerPoint, and images; Word and PowerPoint support is installed through optional extras. Visual analysis is the standout feature. Point it at a configurable documents root and you can ask Claude to summarize spreadsheets, extract data from forms, or explain what's in a slide deck without a manual export step.
An MCP (Model Context Protocol) server that lets AI assistants read and visually analyze local documents — PDFs, Excel spreadsheets, CSV files, Word documents, PowerPoint presentations, and images.
No API keys required. The host AI (GitHub Copilot, Claude, etc.) does all the reasoning directly.
| Format | Extensions | Read | Visual |
|---|---|---|---|
| PDF | .pdf | ✅ | ✅ |
| Excel | .xlsx, .xls | ✅ | ✅ |
| CSV / TSV | .csv, .tsv | ✅ | — |
| JSON | .json | ✅ | — |
| Word | .docx | ✅ | ✅ |
| PowerPoint | .pptx | ✅ | ✅ |
| Plain text | .txt, .md | ✅ | — |
| Images | .png, .jpg, .jpeg, .gif, .bmp, .tiff, .webp | — | ✅ |
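
The table above can be read as a lookup from file extension to the kinds of analysis available. The sketch below transcribes it into a dictionary; the names `CAPABILITIES` and `supported_modes` are illustrative and are not part of the server's API.

```python
from pathlib import Path

# Capability table transcribed from the format list above.
# Each entry maps a file extension to (supports_read, supports_visual).
CAPABILITIES = {
    ".pdf": (True, True),
    ".xlsx": (True, True),
    ".xls": (True, True),
    ".csv": (True, False),
    ".tsv": (True, False),
    ".json": (True, False),
    ".docx": (True, True),
    ".pptx": (True, True),
    ".txt": (True, False),
    ".md": (True, False),
}
# Image formats are visual-only.
for _ext in (".png", ".jpg", ".jpeg", ".gif", ".bmp", ".tiff", ".webp"):
    CAPABILITIES[_ext] = (False, True)


def supported_modes(filename: str) -> list[str]:
    """Return the analysis modes ("read", "visual") available for a file name."""
    suffix = Path(filename).suffix.lower()
    can_read, can_visual = CAPABILITIES.get(suffix, (False, False))
    modes = []
    if can_read:
        modes.append("read")
    if can_visual:
        modes.append("visual")
    return modes


print(supported_modes("report.PDF"))  # ['read', 'visual']
print(supported_modes("data.csv"))    # ['read']
print(supported_modes("scan.png"))    # ['visual']
```

A check like this can help decide up front whether to ask for text extraction or for page images.
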
| Tool | Description |
|---|---|
| list_documents | List files under a directory, filtered by glob pattern |
| document_info | Get metadata (size, modified date, sheets) for a file |
| read_document | Extract text content from a document with pagination |
| visual_evaluate_document | Return page images inline so the AI can analyze charts, tables, and diagrams |
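
To make "filtered by glob pattern" and "with pagination" concrete, here is a minimal sketch of both ideas using only the standard library. It is not the server's implementation: the function names, the `page` and `page_size` parameters, and the shape of the returned dictionary are all assumptions made for illustration.

```python
from pathlib import Path


def list_matching(root: Path, pattern: str = "**/*") -> list[str]:
    """Return sorted, root-relative paths of files under root that match a glob."""
    return sorted(
        p.relative_to(root).as_posix()
        for p in root.glob(pattern)
        if p.is_file()
    )


def paginate(text: str, page: int, page_size: int = 2000) -> dict:
    """Return one fixed-size slice of text plus what a caller needs to continue."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    total_pages = max(1, -(-len(text) // page_size))  # ceiling division
    start = (page - 1) * page_size
    return {
        "page": page,
        "total_pages": total_pages,
        "content": text[start:start + page_size],
        "has_more": page < total_pages,
    }


# Hypothetical usage: walk a ten-character "document" four characters at a time.
chunk = paginate("abcdefghij", page=2, page_size=4)
print(chunk["content"], chunk["has_more"])  # efgh True
```

Pagination of this kind lets a client pull a long document in bounded pieces instead of loading the whole text into one response.
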
Search for docalyze in the MCP server gallery (Extensions sidebar → MCP tab) and click Install.
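Alternatively, install from PyPI with pip:

    pip install docalyze-mcp-server

Or run it through the npm wrapper:

    npx docalyze-mcp-server

The npx route requires uv or pipx to be installed, because the npm wrapper calls uvx to run the Python package automatically.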
Add to your VS Code mcp.json (or settings.json):

    {
      "servers": {
        "docalyze": {
          "type": "stdio",
          "command": "python",
          "args": ["-m", "docalyze_mcp_server"],
          "env": {
            "PYTHONIOENCODING": "utf-8"
          }
        }
      }
    }

Or, if you installed via pip and want to use the entry point:

    {
      "servers": {
        "docalyze": {
          "type": "stdio",
          "command": "docalyze-mcp-server"
        }
      }
    }

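
If you manage editor settings with a script, the same entry can be generated rather than typed by hand. This is a small sketch under that assumption; `build_server_entry` is a hypothetical helper, and it only builds the JSON shown above without merging it into an existing mcp.json.

```python
import json


def build_server_entry(command: str = "docalyze-mcp-server", args=None, env=None) -> dict:
    """Build the "servers" mapping for a single stdio server entry."""
    entry = {"type": "stdio", "command": command}
    if args:
        entry["args"] = list(args)
    if env:
        entry["env"] = dict(env)
    return {"servers": {"docalyze": entry}}


# Module-based launch, mirroring the first configuration shown above.
config = build_server_entry(
    command="python",
    args=["-m", "docalyze_mcp_server"],
    env={"PYTHONIOENCODING": "utf-8"},
)
print(json.dumps(config, indent=2))
```

Calling `build_server_entry()` with no arguments yields the shorter entry-point form.
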
The base install handles PDF, Excel, CSV, JSON, and plain text. For additional formats, install the matching extra (the quotes keep shells such as zsh from interpreting the brackets):

    # Word documents
    pip install "docalyze-mcp-server[docx]"
    # PowerPoint
    pip install "docalyze-mcp-server[pptx]"
    # OCR (requires Tesseract installed on your system)
    pip install "docalyze-mcp-server[ocr]"
    # Everything
    pip install "docalyze-mcp-server[all]"

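
To see which optional features are usable in the current environment, you can probe for the underlying libraries. The sketch below is a guess at how that might look: the import names it checks (`docx`, `pptx`, `pytesseract`) are assumptions about what each extra installs, not something the package documentation confirms.

```python
from importlib.util import find_spec

# ASSUMPTION: these import names are guesses at what each extra pulls in
# (python-docx -> "docx", python-pptx -> "pptx", pytesseract -> "pytesseract").
# Check the package metadata for the real dependency list.
ASSUMED_EXTRA_MODULES = {
    "docx": "docx",
    "pptx": "pptx",
    "ocr": "pytesseract",
}


def available_extras(modules=ASSUMED_EXTRA_MODULES) -> dict:
    """Map each extra name to True if its assumed module can be imported."""
    return {extra: find_spec(module) is not None for extra, module in modules.items()}


for extra, ok in available_extras().items():
    print(f"{extra}: {'installed' if ok else 'missing'}")
```

Note that the OCR extra also depends on the Tesseract binary, which a Python import check cannot detect.
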
The server reads documents from a configurable root directory. Set the DOCUMENTS_ROOT environment variable to change it:

    {
      "servers": {
        "docalyze": {
          "type": "stdio",
          "command": "docalyze-mcp-server",
          "env": {
            "DOCUMENTS_ROOT": "/path/to/your/documents"
          }
        }
      }
    }

If not set, it defaults to the directory containing the server script.
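
The fallback rule can be sketched in a few lines. The code below is an illustration, not the server's actual logic: `documents_root` and `resolve_inside` are hypothetical names, and the containment check is an extra safeguard that this README does not say the server performs.

```python
import os
from pathlib import Path


def documents_root(default_dir: Path, environ=os.environ) -> Path:
    """Return DOCUMENTS_ROOT if set and non-empty, otherwise the fallback directory."""
    configured = environ.get("DOCUMENTS_ROOT")
    return Path(configured).resolve() if configured else default_dir.resolve()


def resolve_inside(root: Path, relative: str) -> Path:
    """Resolve a requested path and reject anything that escapes the root.

    Assumes root is already resolved, as documents_root() returns it.
    Path.is_relative_to requires Python 3.9 or newer.
    """
    candidate = (root / relative).resolve()
    if not candidate.is_relative_to(root):
        raise ValueError(f"{relative!r} is outside the documents root")
    return candidate
```

In a server script, the fallback would typically be passed as `Path(__file__).parent`, which matches the default described above.
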
License: MIT