A Rust-backed PDF toolkit that gives Claude native access to PDF generation, parsing, and manipulation without C dependencies or subprocess calls. Exposes 12 tools including extract_text, manipulate_pdf (split, merge, rotate), annotate_pdf, manage_forms, and secure_pdf for encryption and permissions. Also surfaces resources like available fonts and page sizes, plus prompts for guided workflows. The Python library underneath handles everything from creating PDFs with text and graphics to reading metadata and converting formats. Reach for this when you need AI to process documents end to end, whether that's extracting structured data, filling forms, splitting multi-page contracts, or generating new PDFs from scratch. Zero Java or Poppler required.
Rust-powered PDF library for Python. Generate, parse, split, merge, and manipulate PDFs with native performance. Ships with a built-in MCP server so AI agents can work with PDFs out of the box.
No C dependencies. No Java. No subprocess calls.
pip install oxidize-pdf # Core library
pip install "oxidize-pdf[mcp]" # + MCP server for AI agents
Platforms: Linux (x86_64, aarch64), macOS (x86_64, Apple Silicon), Windows (x86_64)
Requires: Python 3.10+
| | oxidize-pdf | Pure-Python libs | C/Java wrappers |
|---|---|---|---|
| Performance | Native (compiled Rust) | Interpreted | Native but heavy |
| Dependencies | Zero | Varies | Poppler, Java, Ghostscript |
| Memory safety | Rust ownership model | GC-dependent | Manual / GC |
| Type stubs | Full (mypy/pyright) | Partial | Rare |
| AI-ready (MCP) | Built-in | No | No |
Give your AI agent full PDF capabilities in one line:
oxidize-mcp
The built-in Model Context Protocol server exposes 12 tools, 6 resources, and 5 prompts — compatible with Claude, GPT, and any MCP client.
Add to your claude_desktop_config.json:
{
  "mcpServers": {
    "oxidize-pdf": {
      "command": "oxidize-mcp",
      "env": {
        "OXIDIZE_WORKSPACE": "/path/to/your/pdfs"
      }
    }
  }
}
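If you would rather script this step than edit the file by hand, the entry can be merged into an existing config with the standard library. This is a minimal sketch: the helper name `add_mcp_server` is hypothetical and not part of oxidize-pdf, and the config path is supplied by the caller because its location depends on your platform.

```python
import json
from pathlib import Path


def add_mcp_server(config_path, name, command, env=None):
    """Merge one MCP server entry into a Claude Desktop style config file.

    Hypothetical helper, not part of oxidize-pdf. Existing entries are kept.
    """
    path = Path(config_path)
    # Start from the existing config if there is one, else from an empty one.
    config = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    servers = config.setdefault("mcpServers", {})
    entry = {"command": command}
    if env:
        entry["env"] = dict(env)
    servers[name] = entry
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

Calling `add_mcp_server(path, "oxidize-pdf", "oxidize-mcp", {"OXIDIZE_WORKSPACE": "/path/to/your/pdfs"})` would produce the same entry as the JSON above while leaving any other configured servers in place.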
| Tool | What it does |
|---|---|
| `read_pdf` | Read metadata: page count, version, encryption status, title, author |
| `extract_text` | Extract text from all pages or a specific page |
| `convert_pdf` | Convert to markdown, chunks, or RAG-optimized format |
| `create_pdf` | Create a new PDF with optional metadata |
| `save_pdf` | Save a session to disk, with optional encryption |
| `add_content` | Add pages, text, and graphics to a session |
| `annotate_pdf` | Add text annotations and highlights |
| `manipulate_pdf` | Split, merge, rotate, extract pages, reverse, overlay |
| `manage_forms` | Create, fill, read, and validate form fields |
| `secure_pdf` | Encrypt, check permissions, verify signatures |
| `extract_entities` | Extract structured entities from pages |
| `analyze_pdf` | Validate structure, detect corruption, check PDF/A compliance |
The server also exposes resources (session data, capabilities, version info) and prompts (guided workflows for summarization, data extraction, form filling, and more).
To run the server from a shell, set OXIDIZE_WORKSPACE to your PDF directory:
OXIDIZE_WORKSPACE=/path/to/pdfs oxidize-mcp
Or start programmatically:
from oxidize_pdf.mcp.server import run
run()
from oxidize_pdf import Document, Page, Font, Color
doc = Document()
doc.set_title("My Document")
doc.set_author("Jane Doe")
page = Page.a4()
page.set_font(Font.HELVETICA, 24.0)
page.set_text_color(Color.black())
page.text_at(72.0, 750.0, "Hello from oxidize-pdf!")
page.set_font(Font.TIMES_ROMAN, 12.0)
page.text_at(72.0, 700.0, "Generated with Python + Rust.")
doc.add_page(page)
doc.save("output.pdf")
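The coordinates in the example above look like PDF points (1/72 inch) measured from the bottom-left corner of the page, which is the PDF default. That reading is an assumption, not something stated here about the library. Under that assumption, a small helper with hypothetical names can convert millimetres and top-down offsets into values suitable for `text_at`:

```python
POINTS_PER_INCH = 72.0
MM_PER_INCH = 25.4
A4_HEIGHT_PT = 842.0  # A4 is 210 x 297 mm, about 595 x 842 points (rounded)


def mm_to_pt(mm):
    """Convert millimetres to PDF points (1 point = 1/72 inch)."""
    return mm * POINTS_PER_INCH / MM_PER_INCH


def from_top(offset_pt, page_height_pt=A4_HEIGHT_PT):
    """Turn a distance from the top edge into a bottom-left-origin y value."""
    return page_height_pt - offset_pt


x = mm_to_pt(25.4)  # one inch from the left edge
y = from_top(92.0)  # 92 points below the top edge of an A4 page
print(x, y)
```

The resulting pair is close to the `(72.0, 750.0)` position used for the heading above, which is why that text appears near the top-left of the page.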
from oxidize_pdf import PdfReader
reader = PdfReader.open("document.pdf")
print(f"Pages: {reader.page_count}, Version: {reader.version}")
for i, text in enumerate(reader.extract_text()):
    print(f"--- Page {i + 1} ---")
    print(text)
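The loop above treats `extract_text()` as yielding one string per page. Assuming that shape, downstream processing needs nothing from the library itself. The sketch below uses a hypothetical helper, and the sample list stands in for real extracted text; it finds the one-based page numbers that mention a term:

```python
def pages_containing(page_texts, term):
    """Return one-based page numbers whose text contains `term` (case-insensitive)."""
    needle = term.casefold()
    return [
        number
        for number, text in enumerate(page_texts, start=1)
        if needle in text.casefold()
    ]


# Stand-in for list(reader.extract_text()); real text would come from a PDF.
sample_pages = ["Invoice 2024-001", "Terms and conditions", "INVOICE total: 42.00"]
print(pages_containing(sample_pages, "invoice"))  # [1, 3]
```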
from oxidize_pdf import split_pdf, merge_pdfs, rotate_pdf, extract_pages
split_pdf("input.pdf", "output_dir/") # Split into individual pages
merge_pdfs(["part1.pdf", "part2.pdf"], "merged.pdf") # Merge multiple PDFs
rotate_pdf("input.pdf", "rotated.pdf", 90) # Rotate all pages
extract_pages("input.pdf", "subset.pdf", [0, 2, 4]) # Extract specific pages
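The `extract_pages` example passes `[0, 2, 4]`, which suggests zero-based page indices; treat that as an assumption. People usually think in one-based ranges such as `1,3,5-7`, so a small parser can bridge the two. The helper name is hypothetical and it uses only the standard library:

```python
def parse_page_ranges(spec, page_count):
    """Convert a one-based spec like "1,3,5-7" into sorted zero-based indices."""
    indices = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        first, sep, last = part.partition("-")
        start = int(first)
        end = int(last) if sep else start
        if start < 1 or end > page_count or start > end:
            raise ValueError(f"invalid page range: {part!r}")
        indices.update(range(start - 1, end))
    return sorted(indices)


print(parse_page_ranges("1,3,5-7", page_count=10))  # [0, 2, 4, 5, 6]
```

The returned list could then be passed as the third argument in a call shaped like the `extract_pages` line above.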
from oxidize_pdf import Document, Page, Color
doc = Document()
page = Page.a4()
page.set_fill_color(Color.hex("#3498db"))
page.draw_rect(72.0, 700.0, 200.0, 100.0)
page.fill()
page.set_stroke_color(Color.red())
page.set_line_width(2.0)
page.draw_circle(300.0, 500.0, 50.0)
page.stroke()
doc.add_page(page)
doc.save("graphics.pdf")
from oxidize_pdf import Color, Point, Rectangle, Margins, Font
# Colors
Color.rgb(1.0, 0.0, 0.0) # RGB
Color.hex("#ff6600") # Hex
Color.cmyk(0.0, 1.0, 1.0, 0.0) # CMYK
# Geometry
Point(72.0, 720.0)
Rectangle.from_xywh(72.0, 72.0, 468.0, 648.0)
Margins.uniform(72.0)
# Fonts: all 14 standard PDF fonts are available
Font.HELVETICA    # bold variant: Font.HELVETICA_BOLD
Font.TIMES_ROMAN  # bold variant: Font.TIMES_BOLD
Font.COURIER      # bold variant: Font.COURIER_BOLD
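`Color.rgb` appears to take components in the 0.0 to 1.0 range, while `Color.hex` takes a CSS-style string. How the library parses hex strings is not shown here. As an illustration of how the two forms relate, a standalone conversion might look like the following; `hex_to_rgb` is a hypothetical helper, not the library's code:

```python
def hex_to_rgb(value):
    """Convert "#rrggbb" to an (r, g, b) tuple of floats in the 0.0-1.0 range."""
    digits = value.lstrip("#")
    if len(digits) != 6:
        raise ValueError(f"expected 6 hex digits, got {value!r}")
    # Each pair of hex digits is one 0-255 channel; scale it to 0.0-1.0.
    return tuple(int(digits[i:i + 2], 16) / 255 for i in (0, 2, 4))


r, g, b = hex_to_rgb("#ff6600")
print(round(r, 3), round(g, 3), round(b, 3))
```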
from oxidize_pdf import PdfReader, PdfError, PdfIoError, PdfParseError
try:
    reader = PdfReader.open("missing.pdf")
except PdfIoError as e:
    print(f"I/O error: {e}")
except PdfParseError as e:
    print(f"Parse error: {e}")
except PdfError as e:
    print(f"PDF error: {e}")
Exception hierarchy: PdfError is the base class; PdfIoError, PdfParseError, PdfEncryptionError, and PdfPermissionError derive from it.
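Handler order matters with a hierarchy like this: Python runs the first matching `except` clause, so the base class has to come last. The sketch below uses stand-in classes that mirror some of the names above; they are not imported from oxidize-pdf, so it runs without the library.

```python
# Stand-in classes mirroring the documented hierarchy; not the real ones.
class PdfError(Exception):
    pass


class PdfIoError(PdfError):
    pass


class PdfParseError(PdfError):
    pass


def classify(exc):
    """Return which handler catches `exc` when subclasses are listed first."""
    try:
        raise exc
    except PdfIoError:
        return "io"
    except PdfParseError:
        return "parse"
    except PdfError:
        return "generic"


print(classify(PdfIoError("missing.pdf")))  # io
print(classify(PdfError("other")))          # generic
```

If `except PdfError` were listed first, it would catch every case and the more specific handlers would never run.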
oxidize-pdf includes an MCP server that exposes PDF capabilities to AI assistants like Claude. Install with the mcp extra:
pip install "oxidize-pdf[mcp]"
Add this to your claude_desktop_config.json:
{
  "mcpServers": {
    "oxidize-pdf": {
      "command": "uvx",
      "args": ["--from", "oxidize-pdf[mcp]", "oxidize-mcp"]
    }
  }
}
To register the server with Claude Code instead, run:
claude mcp add oxidize-pdf -- uvx --from "oxidize-pdf[mcp]" oxidize-mcp
| Tool | Description |
|---|---|
| `read_pdf` | Open a PDF and get metadata (pages, version, encryption) |
| `extract_text` | Extract text content from PDF pages |
| `convert_pdf` | Convert between PDF versions |
| `analyze_pdf` | Analyze structure, fonts, images, and compliance |
| `extract_entities` | Extract images and digital signatures |
| `manipulate_pdf` | Split, merge, rotate, extract, and reorder pages |
| `annotate_pdf` | Add text annotations, highlights, and stamps |
| `manage_forms` | Create, fill, and read PDF form fields |
| `secure_pdf` | Encrypt, decrypt, and set document permissions |
| `create_pdf` | Create a new PDF document with pages |
| `add_pdf_content` | Add text, shapes, and images to pages |
| `save_pdf` | Save the document to file or bytes |
Resources:

- `oxidize://fonts`: Available built-in PDF fonts
- `oxidize://page-sizes`: Standard page sizes with dimensions
- `oxidize://capabilities`: Server capabilities and tool listing
- `oxidize://version`: Version information
- `oxidize://workspace`: PDF files in the workspace directory
- `oxidize://session/{id}`: Session data by ID

Known limitation: `Document.encrypt()` configures encryption parameters, but the underlying Rust library does not yet serialize the encryption dictionary to the PDF output. Reading encrypted PDFs works correctly.

License: MIT. See LICENSE for details.