Converts PDFs to Markdown and vice versa using PyMuPDF for extraction and ReportLab for generation. Exposes pdf_to_markdown and markdown_to_pdf tools through MCP, with options for image extraction, page sizing, margins, and font configuration. The engine automatically detects CJK characters and switches between ReportLab, xhtml2pdf, and fpdf2 based on content complexity. Handles Unicode properly across platforms by finding system fonts like Arial Unicode, PingFang SC, or Noto Sans CJK. Reach for this when you need bidirectional PDF conversion with Claude, especially if you're working with Chinese, Japanese, or Korean documents where font handling matters.
A high-quality, cross-platform PDF ↔ Markdown converter implemented as an MCP (Model Context Protocol) server. Supports bidirectional conversion with full Unicode/CJK character support.
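As a rough illustration of what detecting CJK content can look like, the sketch below checks characters against a few standard Unicode blocks. The helper name `contains_cjk` and the chosen ranges are assumptions made for this example; the converter's actual detection logic may differ.

```python
# Hypothetical helper, not part of the package's documented API.
# The ranges are standard Unicode blocks; the list is not exhaustive.
CJK_RANGES = (
    (0x4E00, 0x9FFF),  # CJK Unified Ideographs
    (0x3400, 0x4DBF),  # CJK Unified Ideographs Extension A
    (0x3040, 0x309F),  # Hiragana
    (0x30A0, 0x30FF),  # Katakana
    (0xAC00, 0xD7AF),  # Hangul Syllables
)


def contains_cjk(text: str) -> bool:
    """Return True if any character falls inside one of the ranges above."""
    return any(lo <= ord(ch) <= hi for ch in text for lo, hi in CJK_RANGES)
```

A caller could use a check like this to decide whether a CJK-capable font needs to be registered before rendering.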
This server is available in the Model Context Protocol Registry. Install it using your MCP client.
mcp-name: io.github.huoshuiai42/huoshui-pdf-converter
pip install huoshui-pdf-converter
Or using uv (recommended):
uv pip install huoshui-pdf-converter
Add to your Claude Desktop configuration:
macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Windows: %APPDATA%\Claude\claude_desktop_config.json
Linux: ~/.config/Claude/claude_desktop_config.json
{
  "mcpServers": {
    "huoshui-pdf-converter": {
      "command": "uvx",
      "args": ["huoshui-pdf-converter"],
      "env": {}
    }
  }
}
Or if you prefer to use a specific Python environment:
{
  "mcpServers": {
    "huoshui-pdf-converter": {
      "command": "python",
      "args": ["-m", "huoshui_pdf_converter.server"],
      "env": {}
    }
  }
}
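If you would rather script the configuration change than edit the file by hand, something along these lines could work. This is a minimal sketch: `claude_config_path` and `add_server` are hypothetical helpers written for this example, and the paths are simply the three locations listed above.

```python
from pathlib import PurePosixPath, PureWindowsPath

CONFIG_NAME = "claude_desktop_config.json"


def claude_config_path(system: str, home: str, appdata: str = "") -> str:
    """Map a platform.system() value to the config locations listed above."""
    if system == "Darwin":
        return str(PurePosixPath(home) / "Library/Application Support/Claude" / CONFIG_NAME)
    if system == "Windows":
        return str(PureWindowsPath(appdata) / "Claude" / CONFIG_NAME)
    return str(PurePosixPath(home) / ".config/Claude" / CONFIG_NAME)


def add_server(config: dict) -> dict:
    """Return a copy of the config with the uvx-based server entry merged in."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers["huoshui-pdf-converter"] = {
        "command": "uvx",
        "args": ["huoshui-pdf-converter"],
        "env": {},
    }
    merged["mcpServers"] = servers
    return merged
```

In practice you would load the existing file with `json.load`, pass the result through `add_server`, and write it back with `json.dump`. Copying the dictionaries first keeps any other configured servers untouched.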
# Convert PDF to Markdown
huoshui-pdf pdf-to-md input.pdf output.md
# Convert Markdown to PDF
huoshui-pdf md-to-pdf input.md output.pdf
# With options
huoshui-pdf md-to-pdf input.md output.pdf --page-size A4 --margin 2cm --font-size 12
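PDF layout is measured in points (72 per inch), so a margin such as `2cm` has to be turned into points at some stage. The sketch below shows one way that arithmetic could be done. The function name `margin_to_points` and the set of accepted units are assumptions for illustration; only `2cm` appears in the examples here, and the converter's own parsing may accept different forms.

```python
import re

# Points per unit: 1 inch is 72 pt and exactly 2.54 cm.
POINTS_PER_UNIT = {"pt": 1.0, "in": 72.0, "cm": 72.0 / 2.54, "mm": 72.0 / 25.4}


def margin_to_points(value: str) -> float:
    """Convert a string such as '2cm' into PDF points (hypothetical helper)."""
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*(pt|in|cm|mm)\s*", value)
    if match is None:
        raise ValueError(f"unrecognised margin: {value!r}")
    number, unit = match.groups()
    return float(number) * POINTS_PER_UNIT[unit]
```

With these factors, a 2 cm margin works out to roughly 56.7 points.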
import asyncio
from huoshui_pdf_converter import PDFToMarkdownConverter, MarkdownToPDFConverter

async def main():
    # PDF to Markdown
    pdf_converter = PDFToMarkdownConverter()
    result = await pdf_converter.convert(
        pdf_path="input.pdf",
        output_path="output.md",
        extract_images=True,
        preserve_formatting=True
    )

    # Markdown to PDF
    md_converter = MarkdownToPDFConverter()
    result = await md_converter.convert(
        markdown_path="input.md",
        output_path="output.pdf",
        page_size="A4",
        margin="2cm",
        font_size=12
    )

asyncio.run(main())
When used as an MCP server, the following tools are available:
pdf_to_markdown: Convert PDF files to Markdown
{
  "pdf_path": "path/to/input.pdf",
  "output_path": "path/to/output.md",
  "extract_images": true,
  "preserve_formatting": true
}
markdown_to_pdf: Convert Markdown files to PDF
{
  "markdown_path": "path/to/input.md",
  "output_path": "path/to/output.pdf",
  "page_size": "A4",
  "margin": "2cm",
  "font_size": 12
}
list_supported_formats: Get supported formats and engines
validate_file: Validate input files before conversion
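MCP clients normally build tool requests for you, but it can help to see what goes over the wire. MCP uses JSON-RPC 2.0, and tools are invoked with the `tools/call` method. The sketch below wraps the `pdf_to_markdown` arguments shown above in such a request; `build_tool_call` is a hypothetical helper written for this example.

```python
import json


def build_tool_call(request_id: int, name: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request as a JSON-RPC 2.0 message."""
    request = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }
    return json.dumps(request)


payload = build_tool_call(
    1,
    "pdf_to_markdown",
    {
        "pdf_path": "path/to/input.pdf",
        "output_path": "path/to/output.md",
        "extract_images": True,
        "preserve_formatting": True,
    },
)
```

The same helper works for `markdown_to_pdf` by swapping the tool name and arguments.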
The converter automatically detects and uses appropriate fonts for different languages:
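One common way to implement this kind of font discovery is to probe a list of well-known font locations and take the first file that exists. The sketch below assumes that approach. The helper name and the candidate paths are illustrative guesses for macOS and Linux, not the converter's actual search list, and real installations vary.

```python
from pathlib import Path
from typing import Iterable, Optional

# Assumed candidate locations (illustrative only).
CANDIDATE_FONTS = (
    "/System/Library/Fonts/PingFang.ttc",
    "/Library/Fonts/Arial Unicode.ttf",
    "/usr/share/fonts/opentype/noto/NotoSansCJK-Regular.ttc",
)


def first_existing_font(candidates: Iterable[str] = CANDIDATE_FONTS) -> Optional[str]:
    """Return the first candidate path that exists as a file, or None."""
    for candidate in candidates:
        if Path(candidate).is_file():
            return candidate
    return None
```

Returning `None` instead of raising lets the caller fall back to a default font when no CJK-capable font is installed.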
PDF → Markdown
Markdown → PDF
# Clone the repository
git clone https://github.com/yourusername/huoshui-pdf-converter.git
cd huoshui-pdf-converter
# Install dependencies
uv pip install -e ".[dev]"
# Run tests
python test_converter.py
huoshui-pdf-converter/
├── huoshui_pdf_converter/
│   ├── __init__.py
│   ├── server.py              # MCP server implementation
│   ├── pdf_converter.py       # PDF to Markdown converter
│   └── markdown_converter.py  # Markdown to PDF converter
├── pyproject.toml
├── README.md
├── LICENSE
└── test_converter.py
Chinese characters not displaying:
Import errors:
pip install huoshui-pdf-converter[all]
MCP connection issues:
Enable debug logging:
import logging
logging.basicConfig(level=logging.DEBUG)
Contributions are welcome! Please:
Create a feature branch (git checkout -b feature/amazing-feature)
Commit your changes (git commit -m 'Add amazing feature')
Push the branch (git push origin feature/amazing-feature)
This project is licensed under the MIT License; see the LICENSE file for details.