PDF Reader

sylphxai/pdf-reader-mcp
Summary

The PDF Reader MCP server lets AI agents extract text, images, and metadata from PDF documents. It processes PDFs in parallel, orders content by Y-coordinate so the output follows the document layout, accepts flexible path formats, and isolates errors per page so that one bad page does not fail the whole document. Parallel processing is reported to be 5-10x faster than sequential processing.

📄 @sylphx/pdf-reader-mcp

The PDF intelligence layer for AI agents that need source evidence, not just extracted text.

V3 smart tool surface · Agent Document Twin · Evidence-first extraction · Visual crops · OCR adapters · Tables, charts, formulas, figures · Trust & accessibility reports · Benchmark-gated releases

PDFs are not plain text files. They are layout, pixels, tables, hidden text, permissions, annotations, scanned pages, and ambiguous reading order.

PDF Reader MCP turns that mess into an Agent Document Twin: a linked, source-backed representation of the PDF that agents can inspect, search, verify, crop, OCR, enrich, cite, and read with confidence.

If your agent has ever hallucinated from a PDF, lost a table, trusted hidden text, missed a scanned page, or needed to cite the exact region that proves an answer, this is the MCP server for that workflow.

Why Agents Use It

| Need | What PDF Reader MCP gives you |
| --- | --- |
| Read the document | Markdown, JSON, HTML, page text, metadata, chunks, and semantic AST. |
| Prove the answer | Page numbers, bounding boxes, evidence IDs, region crops, and source renders. |
| Handle scanned PDFs | Rendered pages routed through configured OCR providers with word boxes and provenance. |
| Recover tables | Selectable-text and OCR-derived tables with cells, geometry, confidence, warnings, and continuation hints. |
| See what text extraction misses | Visual page evidence, focused crops, and configured visual-region provider adapters. |
| Protect the agent | Trust reports for hidden text, prompt-injection-like content, visual spoofing, unsafe links, and redaction. |
| Route accessibility work | Tagged-PDF coverage, tag-visible coverage, headings, images, forms, links, permissions, and page grades. |
| Ship with proof | CI, package smoke, deterministic quality benchmarks, provider artifacts, and release gates. |

Quick Start

Claude Code

claude mcp add pdf-reader -- npx @sylphx/pdf-reader-mcp

Claude Desktop

Add this to claude_desktop_config.json:

{
  "mcpServers": {
    "pdf-reader": {
      "command": "npx",
      "args": ["@sylphx/pdf-reader-mcp"]
    }
  }
}

Any MCP Client

npx @sylphx/pdf-reader-mcp

Node.js >=22.13 is required. The default package works without downloading OCR models, vision models, Ollama, LM Studio, llama.cpp, or cloud credentials.

Need Cursor, VS Code, Windsurf, Cline, Warp, HTTP transport, Docker, or filesystem sandboxing? See the installation guide.

One Smart Tool First

The default V3 agent path is one tool call:

{
  "sources": [{ "path": "/absolute/path/to/report.pdf" }]
}

With no manual include_* flags, read_pdf profiles each PDF, chooses the extraction route, and returns the Agent Document Twin in one response. Digital text PDFs get Markdown, chunks, tables, layout routing, and source evidence. Mixed or scanned PDFs are routed toward configured OCR and visual providers when those providers are ready. Metadata, page geometry, warnings, provider readiness, and the selected read_pdf arguments are included so the agent can see what happened.

Agents can still force auto: false and use explicit include_* options for a precise manual extraction. Use auto_detail: "fast", "balanced", or "full" when the agent wants to control output depth without learning dozens of switches.
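
The two routes described above can be sketched as a small argument builder. This is a minimal illustration, not part of the package: the field names sources, path, auto, and auto_detail come from this README, while buildReadPdfArgs, its options object, and the choice to ignore autoDetail on the manual route are assumptions made for the example.

```typescript
// Hypothetical helper: builds read_pdf arguments for the auto or manual route.
// Only the wire field names (sources, path, auto, auto_detail, include_*) are
// taken from the README; everything else here is illustrative.
type AutoDetail = "fast" | "balanced" | "full";
type IncludeFlag = `include_${string}`;

function buildReadPdfArgs(
  path: string,
  options: { autoDetail?: AutoDetail; manualIncludes?: IncludeFlag[] } = {},
): Record<string, unknown> {
  const args: Record<string, unknown> = { sources: [{ path }] };

  if (options.manualIncludes && options.manualIncludes.length > 0) {
    // Manual route: turn auto off and set each requested include_* flag.
    args["auto"] = false;
    for (const flag of options.manualIncludes) {
      args[flag] = true;
    }
  } else if (options.autoDetail) {
    // Auto route with an explicit output depth.
    args["auto_detail"] = options.autoDetail;
  }
  // With no options, only sources is sent and the server picks the route.
  return args;
}
```

Calling buildReadPdfArgs("/absolute/path/to/report.pdf") yields the sources-only request shown earlier; passing manualIncludes such as ["include_markdown", "include_tables"] yields a manual request with auto set to false.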

MCP Tool Surface

| Tool | Use it when the agent needs to... |
| --- | --- |
| read_pdf | Use first. With only sources, it auto-inspects and reads the PDF in one call; with explicit include_* options, it runs precise manual extraction. |
| search_pdf | Search selectable text and optional OCR text with snippets, offsets, boxes, and provenance. |
| pdf_evidence | One focused evidence tool for inspect, render_page, extract_regions, ocr_pages, and analyze_regions operations. |

Full request and response details live in the API reference.

Agent Document Twin

The Agent Document Twin is the main reason to use this project instead of a plain text extractor. It keeps the document readable by agents while preserving the evidence needed to verify the answer.

| Layer | Output |
| --- | --- |
| Lossless PDF layer | Text runs, lines, words, characters, fonts, transforms, page geometry, metadata coverage, outlines, forms, attachments, annotations, permissions, and structure signals where available. |
| Visual layer | Page renders, region crops, crop provenance, visual candidates, OCR source renders, and provider-normalized visual evidence. |
| Semantic layer | Page, section, paragraph, list, caption, header, footer, table, image, chart, formula, figure, and diagram nodes where available. |
| Evidence layer | Stable IDs, page ranges, bounding boxes, crop IDs, confidence, warnings, and extraction method provenance. |
| Agent layer | Markdown, JSON, HTML, citation chunks, routing plans, trust report, accessibility report, and document map indexes. |

Example: Read With Evidence

{
  "sources": [{ "path": "/absolute/path/to/report.pdf" }],
  "include_markdown": true,
  "include_chunks": true,
  "include_tables": true,
  "include_text_layer": true,
  "include_document_map": true,
  "include_document_ast": true,
  "include_trust_report": true,
  "include_accessibility_report": true
}

Example: Search, Then Verify The Source Region

{
  "sources": [{ "path": "/absolute/path/to/report.pdf" }],
  "query": "revenue recognition",
  "max_matches_per_source": 10
}

Use the returned page and bounding box with pdf_evidence operation render_page or extract_regions when the agent needs visual proof before citing or summarizing.
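
When turning a search hit into a crop request, it can help to pad the bounding box slightly so the crop shows surrounding context, then clamp it to the page. The sketch below is client-side geometry only. The Box shape (x, y, width, height), the PageSize type, and the padAndClamp name are assumptions for illustration; the actual coordinate format returned by search_pdf is defined in the API reference.

```typescript
// Hypothetical client-side helper: expand a match box by a margin and keep it
// inside the page. The Box shape is an assumption, not the server's schema.
interface Box {
  x: number;
  y: number;
  width: number;
  height: number;
}

interface PageSize {
  width: number;
  height: number;
}

function padAndClamp(box: Box, page: PageSize, padding: number): Box {
  // Grow the box outward on every side, then clip each edge to the page.
  const left = Math.max(0, box.x - padding);
  const top = Math.max(0, box.y - padding);
  const right = Math.min(page.width, box.x + box.width + padding);
  const bottom = Math.min(page.height, box.y + box.height + padding);

  return {
    x: left,
    y: top,
    // A box that lies entirely off the page collapses to zero size.
    width: Math.max(0, right - left),
    height: Math.max(0, bottom - top),
  };
}
```

For example, a 100 by 30 box at (10, 20) on a 612 by 792 page with 12 units of padding becomes a 122 by 54 box at (0, 8), because the left edge is clipped at the page boundary.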

Provider-Enabled Intelligence

The default package stays TypeScript-first and local-first. Heavy engines are optional, deployment-controlled adapters.

| Capability | Default behavior | Enable with |
| --- | --- | --- |
| Selectable-text PDFs | Works out of the box | No extra dependency |
| Rendering and crops | Works out of the box | No extra dependency |
| Trust and accessibility reports | Works out of the box | No extra dependency |
| OCR for scanned pages | Provider-ready | MCP_PDF_OCR_* |
| Visual table/chart/formula/figure/image enrichment | Provider-ready | MCP_PDF_REGION_ANALYSIS_* |

Supported visual provider paths include local commands, local HTTP servers, Ollama, OpenAI-compatible endpoints, LM Studio, and llama.cpp. Request payloads cannot choose arbitrary executables or arbitrary provider URLs; providers are configured by the deployment environment.

# Example shape only. Point these at your own local OCR command.
export MCP_PDF_OCR_COMMAND="tesseract"
export MCP_PDF_OCR_ARGS_JSON='["{input}", "stdout", "tsv"]'

See the guide and API reference for provider configuration details.
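
To make the {input} placeholder in the example above concrete, here is one way such a JSON argument template could be expanded into an argument vector. This is a sketch under stated assumptions, not the server's implementation: expandOcrArgs is a hypothetical name, and the server may validate or substitute placeholders differently.

```typescript
// Hypothetical sketch: expand a JSON argument template like the value of
// MCP_PDF_OCR_ARGS_JSON into an argv array. Not the server's actual code.
function expandOcrArgs(argsJson: string, inputPath: string): string[] {
  const parsed: unknown = JSON.parse(argsJson);

  // Accept only a JSON array of strings; anything else is a config error.
  if (!Array.isArray(parsed) || !parsed.every((arg) => typeof arg === "string")) {
    throw new Error("OCR args must be a JSON array of strings");
  }

  // Replace every {input} occurrence with the rendered page path.
  return (parsed as string[]).map((arg) => arg.split("{input}").join(inputPath));
}
```

With the template from the example and an input of /tmp/page-1.png, the result is ["/tmp/page-1.png", "stdout", "tsv"]. Keeping the arguments as an array rather than one shell string means the path is passed as a single argument and is never interpreted by a shell.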

Release Proof

Strong README claims should be backed by shipped evidence. This repo publishes machine-readable artifacts and gates releases on them.

| Artifact | Current proof |
| --- | --- |
| pdf_sota_release_gate.json | passed, 39/39 release-gate checks passing |
| pdf_quality_benchmark.json | score 1, 69/69 deterministic quality checks passing |
| pdf_provider_benchmark.json | strict provider evidence enabled, 4/4 final-bar provider profiles certified |
| pdf_corpus_benchmark.json | corpus-style PDF intelligence assertions with capability summaries |
| pdf_provider_manifest_crop_benchmark.json | deterministic crop-substrate proof for provider-manifest regions |
| pdf_provider_manifest_benchmark.json | deterministic scoring proof for table, formula, chart, figure, and image regions |

Run the same proof locally:

bun run benchmark:release-artifacts
bun run benchmark:release-gate
bun run package:smoke

See performance and release evidence for the full benchmark contract.

Output Formats

read_pdf can return the same PDF in several agent-friendly forms:

  • Plain text and page text
  • Markdown for RAG and summarization
  • HTML for rendering or downstream transformation
  • Structured elements with page and geometry provenance
  • Document AST for semantic navigation
  • Citation chunks with page, element, table, and bbox references
  • Tables with rows, cells, geometry, warnings, and confidence
  • Trust and accessibility reports
  • Agent Document Twin indexes linking text, visual, OCR, table, trust, and accessibility evidence

Security Model

PDFs can contain hostile or misleading content. The server treats extraction as an evidence workflow, not as a trusted text dump.

  • Local-first by default.
  • URL loading is guarded by host, private-IP, size, and HTTP policy controls.
  • OCR and visual providers are configured by environment, not by request body.
  • Trust reports surface hidden text, near-invisible geometry, off-page text, overlapping text, unsafe links, redaction signals, and prompt-injection-like content.
  • Rendering, crops, OCR, and visual enrichment preserve provenance so agents can route weak evidence to verification instead of silently trusting it.
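
As an illustration of what a private-IP guard on URL loading involves, the sketch below classifies an IPv4 literal as public, non-public, or invalid. It is a generic example, not the server's policy code: classifyIpv4 and the Ipv4Class type are hypothetical names, and a real guard would likely also need to cover IPv6, resolved DNS addresses, and redirects.

```typescript
// Generic sketch of an IPv4 range check; not taken from the server's source.
type Ipv4Class = "public" | "non_public" | "invalid";

function classifyIpv4(address: string): Ipv4Class {
  const parts = address.split(".");
  // Require exactly four decimal octets, each in the range 0..255.
  if (parts.length !== 4 || !parts.every((part) => /^\d{1,3}$/.test(part))) {
    return "invalid";
  }
  if (parts.some((part) => Number(part) > 255)) {
    return "invalid";
  }

  const a = Number(parts[0]);
  const b = Number(parts[1]);

  const nonPublic =
    a === 0 || // "this network"
    a === 10 || // RFC 1918 private range 10.0.0.0/8
    a === 127 || // loopback
    (a === 169 && b === 254) || // link-local
    (a === 172 && b >= 16 && b <= 31) || // RFC 1918 172.16.0.0/12
    (a === 192 && b === 168); // RFC 1918 192.168.0.0/16

  return nonPublic ? "non_public" : "public";
}
```

A caller would typically refuse to fetch anything that is not classified as public, treating invalid input the same as a non-public address so that malformed hosts fail closed.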

Documentation

| Topic | Link |
| --- | --- |
| Getting started | docs/guide/getting-started.md |
| Installation and clients | docs/guide/installation.md |
| API reference | docs/api/README.md |
| Capability overview | docs/comparison/index.md |
| Architecture and design | docs/design/index.md |
| Performance and release proof | docs/performance/index.md |

Development

git clone https://github.com/SylphxAI/pdf-reader-mcp.git
cd pdf-reader-mcp
bun install
bun run build
bun test

Useful checks:

bun run check
bun run typecheck
bun run docs:build
bun run package:smoke
bun run benchmark:release-gate

Support

  • Issues
  • Discussions
  • npm package

If you want local-first, evidence-backed PDF intelligence to keep improving for AI agents, star the repo. It helps the project reach more builders who need PDFs to be verifiable, not just readable.

License

MIT © SylphxAI

Categories: Documents & Knowledge
Updated: Jan 31, 2026
