Here's a straightforward document reader that exposes a single read_document tool to Claude, handling DOCX, PDF, Excel, and plain text files through one unified interface. It's built on python-docx, pypdf, and openpyxl, so you get reliable extraction from all four formats without switching between different tools. The tool takes a filename parameter and returns the text content, letting Claude read your local documents during conversations. Reach for this when you need Claude to analyze reports, extract data from spreadsheets, or reference documentation that lives on your filesystem. Install it via pip or uvx, point it at your files, and you're done.
MCP (Model Context Protocol) Document Reader - An MCP tool for reading documents in multiple formats, so that AI agents can work with the text of your local files.
graph TB
A[AI Assistant / User] -->|Call read_document| B[MCP Document Reader]
B -->|Detect file type| C{File Type?}
C -->|.docx| D[DOCX Reader]
C -->|.pdf| E[PDF Reader]
C -->|.xlsx/.xls| F[Excel Reader]
C -->|.txt| G[Text Reader]
D -->|Extract text| H[Return Content]
E -->|Extract text| H
F -->|Extract text| H
G -->|Extract text| H
H -->|Text content| A
style A fill:#e1f5ff
style B fill:#fff4e1
style C fill:#f0f0f0
style D fill:#e8f5e9
style E fill:#e8f5e9
style F fill:#e8f5e9
style G fill:#e8f5e9
style H fill:#fff9c4
| Format | Extensions | MIME Type | Features |
|---|---|---|---|
| Excel | .xlsx, .xls | application/vnd.openxmlformats-officedocument.spreadsheetml.sheet | Sheet and cell data extraction |
| DOCX | .docx | application/vnd.openxmlformats-officedocument.wordprocessingml.document | Text and structure extraction |
| PDF | .pdf | application/pdf | Text extraction |
| Text | .txt | text/plain | Plain text reading |
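The extension-based routing in the table above can be sketched with the standard library alone. This is a minimal illustration, not the package's actual lookup: the `EXTENSION_TO_FORMAT` mapping and the `detect_format` helper are hypothetical names introduced here.

```python
from pathlib import Path
from typing import Optional

# Hypothetical mapping that mirrors the supported-formats table.
# The package's real detection logic may be organized differently.
EXTENSION_TO_FORMAT = {
    ".docx": "DOCX",
    ".pdf": "PDF",
    ".xlsx": "Excel",
    ".xls": "Excel",
    ".txt": "Text",
}


def detect_format(filename: str) -> Optional[str]:
    """Return the format label for a filename, or None if it is unsupported."""
    # Lower-case the suffix so "REPORT.PDF" and "report.pdf" are treated alike.
    suffix = Path(filename).suffix.lower()
    return EXTENSION_TO_FORMAT.get(suffix)
```

Returning `None` for unknown extensions leaves it to the caller to decide whether an unsupported file is an error or something to skip.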
pip install mcp-documents-reader
git clone https://github.com/xt765/mcp_documents_reader.git
cd mcp_documents_reader
pip install -e .
This server provides the following tool:
read_document: Read any supported document type with a unified interface.
Arguments:
filename (string, required): Document file path, supports absolute or relative paths.
Add the following to your MCP configuration file:
Option 1: Using PyPI (Recommended)
{
"mcpServers": {
"mcp-document-reader": {
"command": "uvx",
"args": [
"mcp-documents-reader"
]
}
}
}
Option 2: Using GitHub repository
{
"mcpServers": {
"mcp-document-reader": {
"command": "uvx",
"args": [
"--from",
"git+https://github.com/xt765/mcp_documents_reader",
"mcp_documents_reader"
]
}
}
}
Option 3: Using Gitee repository (Faster access in China)
{
"mcpServers": {
"mcp-document-reader": {
"command": "uvx",
"args": [
"--from",
"git+https://gitee.com/xt765/mcp_documents_reader",
"mcp_documents_reader"
]
}
}
}
After configuration, AI assistants can directly call the following tool:
# Read a DOCX file
read_document(filename="example.docx")
# Read a PDF file
read_document(filename="example.pdf")
# Read an Excel file
read_document(filename="example.xlsx")
# Read a text file
read_document(filename="example.txt")
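On the wire, an MCP client typically sends a tool invocation like the ones above as a JSON-RPC 2.0 request with the method `tools/call`. The sketch below builds such a payload for illustration; `build_tool_call` is a hypothetical helper, and in practice the MCP client library constructs and sends this message for you.

```python
import json


def build_tool_call(filename: str, request_id: int = 1) -> str:
    """Serialize a JSON-RPC 2.0 'tools/call' request for read_document."""
    request = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            # Tool name and argument name are taken from this README.
            "name": "read_document",
            "arguments": {"filename": filename},
        },
    }
    return json.dumps(request)


print(build_tool_call("example.docx"))
```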
from mcp_documents_reader import DocumentReaderFactory
# Using factory (recommended)
reader = DocumentReaderFactory.get_reader("document.pdf")
content = reader.read("/path/to/document.pdf")
# Check if format is supported
if DocumentReaderFactory.is_supported("file.xlsx"):
    reader = DocumentReaderFactory.get_reader("file.xlsx")
    content = reader.read("/path/to/file.xlsx")
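When several files need reading, the check-then-read pattern above can be wrapped in a small helper. This is a sketch under assumptions: `read_supported` is a hypothetical function, and it only relies on the `is_supported`, `get_reader`, and `read` calls shown above. The factory is passed in as an argument so the helper can be exercised without the package installed.

```python
from typing import Any, Dict, Iterable


def read_supported(factory: Any, paths: Iterable[str]) -> Dict[str, str]:
    """Read every path the factory supports and return {path: content}.

    `factory` is expected to offer is_supported(path) and get_reader(path),
    as DocumentReaderFactory does in the usage example; unsupported paths
    are skipped rather than raising.
    """
    contents: Dict[str, str] = {}
    for path in paths:
        if not factory.is_supported(path):
            continue  # Skip formats the factory does not recognize.
        reader = factory.get_reader(path)
        contents[path] = reader.read(path)
    return contents
```

With the package installed, a call might look like `read_supported(DocumentReaderFactory, ["report.pdf", "data.xlsx"])`. Errors raised by individual readers are not caught here, so a caller that wants to continue past a corrupt file would need to add its own handling.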
Read any supported document type.
Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| filename | string | ✅ | Document file path, supports absolute or relative paths |
mcp >= 1.26.0 - MCP protocol implementation
python-docx >= 1.2.0 - DOCX file reading
pypdf >= 6.8.0 - PDF file reading (replaces PyPDF2)
openpyxl >= 3.1.5 - Excel file reading
pytest >= 8.0.0 - Testing framework
pytest-asyncio >= 0.24.0 - Async testing support
pytest-cov >= 6.0.0 - Coverage reporting
basedpyright >= 0.28.0 - Type checking
ruff >= 0.8.0 - Linting and formatting
MIT License
Issues and Pull Requests are welcome!