Huoshui Pdf Converter

huoshuiai42/huoshui-pdf-converter
STDIO · registry active
Summary

Converts PDFs to Markdown and vice versa using PyMuPDF for extraction and ReportLab for generation. Exposes pdf_to_markdown and markdown_to_pdf tools through MCP, with options for image extraction, page sizing, margins, and font configuration. The engine automatically detects CJK characters and switches between ReportLab, xhtml2pdf, and fpdf2 based on content complexity. Handles Unicode properly across platforms by finding system fonts like Arial Unicode, PingFang SC, or Noto Sans CJK. Reach for this when you need bidirectional PDF conversion with Claude, especially if you're working with Chinese, Japanese, or Korean documents where font handling matters.

活水 PDF 转换器 (Huoshui PDF Converter)

License: MIT · Python 3.10+ · MCP · PyPI

A high-quality, cross-platform PDF ↔ Markdown converter implemented as an MCP (Model Context Protocol) server. Supports bidirectional conversion with full Unicode/CJK character support.

Features

Core Capabilities

  • PDF → Markdown: Extract text and images with layout preservation
  • Markdown → PDF: Generate beautiful PDFs with multiple rendering engines
  • Unicode Support: Full support for Chinese, Japanese, Korean, and other Unicode characters
  • Cross-Platform: Works on Windows, macOS, and Linux
  • MCP Integration: Use with Claude Desktop or any MCP-compatible client

Technical Features

  • Pure Python: No external system dependencies required
  • Automatic Font Detection: Finds and uses system Unicode fonts
  • Smart Engine Selection: Automatically switches engines based on content
  • Comprehensive Error Handling: Graceful degradation and detailed logging
  • Async Architecture: Non-blocking operations for better performance

Installation

From MCP Registry (Recommended)

This server is available in the Model Context Protocol Registry. Install it using your MCP client.

mcp-name: io.github.huoshuiai42/huoshui-pdf-converter

As a Python Package

pip install huoshui-pdf-converter

Or using uv (recommended):

uv pip install huoshui-pdf-converter

As an MCP Server

Add to your Claude Desktop configuration:

  • macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
  • Windows: %APPDATA%\Claude\claude_desktop_config.json
  • Linux: ~/.config/Claude/claude_desktop_config.json

{
  "mcpServers": {
    "huoshui-pdf-converter": {
      "command": "uvx",
      "args": ["huoshui-pdf-converter"],
      "env": {}
    }
  }
}

Or if you prefer to use a specific Python environment:

{
  "mcpServers": {
    "huoshui-pdf-converter": {
      "command": "python",
      "args": ["-m", "huoshui_pdf_converter.server"],
      "env": {}
    }
  }
}
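
If you would rather not edit the file by hand, the first configuration above can be merged in with a few lines of Python. This is a minimal sketch: the helper names claude_config_path and add_server are hypothetical and not part of the package, and the paths are the ones listed above.

```python
import json
import os
import platform
from pathlib import Path
from typing import Optional


def claude_config_path(system: Optional[str] = None) -> Path:
    """Return the Claude Desktop config path listed above for an OS name."""
    system = system or platform.system()
    if system == "Darwin":
        return (Path.home() / "Library" / "Application Support"
                / "Claude" / "claude_desktop_config.json")
    if system == "Windows":
        return Path(os.environ.get("APPDATA", "")) / "Claude" / "claude_desktop_config.json"
    return Path.home() / ".config" / "Claude" / "claude_desktop_config.json"


def add_server(config: dict) -> dict:
    """Return a copy of config with the huoshui-pdf-converter entry added.

    Existing entries under "mcpServers" are preserved.
    """
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers["huoshui-pdf-converter"] = {
        "command": "uvx",
        "args": ["huoshui-pdf-converter"],
        "env": {},
    }
    merged["mcpServers"] = servers
    return merged


# Preview the merged configuration without writing anything to disk.
print(json.dumps(add_server({}), indent=2))
```

The sketch only prints the result. Writing it back to claude_config_path() is left out on purpose so that an existing configuration is never overwritten by accident.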

Usage

Command Line Interface

# Convert PDF to Markdown
huoshui-pdf pdf-to-md input.pdf output.md

# Convert Markdown to PDF
huoshui-pdf md-to-pdf input.md output.pdf

# With options
huoshui-pdf md-to-pdf input.md output.pdf --page-size A4 --margin 2cm --font-size 12

As a Python Library

import asyncio
from huoshui_pdf_converter import PDFToMarkdownConverter, MarkdownToPDFConverter

async def main():
    # PDF to Markdown: extract text and images, keeping formatting where possible
    pdf_converter = PDFToMarkdownConverter()
    pdf_result = await pdf_converter.convert(
        pdf_path="input.pdf",
        output_path="output.md",
        extract_images=True,
        preserve_formatting=True
    )
    print(pdf_result)

    # Markdown to PDF: A4 page, 2 cm margins, 12 pt base font
    md_converter = MarkdownToPDFConverter()
    md_result = await md_converter.convert(
        markdown_path="input.md",
        output_path="output.pdf",
        page_size="A4",
        margin="2cm",
        font_size=12
    )
    print(md_result)

# The converters are async, so run them inside an event loop
asyncio.run(main())

MCP Tools

When used as an MCP server, the following tools are available:

  1. pdf_to_markdown: Convert PDF files to Markdown

    {
      "pdf_path": "path/to/input.pdf",
      "output_path": "path/to/output.md",
      "extract_images": true,
      "preserve_formatting": true
    }
    
  2. markdown_to_pdf: Convert Markdown files to PDF

    {
      "markdown_path": "path/to/input.md",
      "output_path": "path/to/output.pdf",
      "page_size": "A4",
      "margin": "2cm",
      "font_size": 12
    }
    
  3. list_supported_formats: Get supported formats and engines

  4. validate_file: Validate input files before conversion
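
An MCP client normally builds the request for you, but it can help to see what a tool call looks like on the wire. The sketch below wraps the pdf_to_markdown arguments shown above in a JSON-RPC 2.0 tools/call message, the method MCP defines for invoking tools. The helper name tools_call_request is hypothetical, and a real session must complete the MCP initialize handshake before sending this.

```python
import json


def tools_call_request(request_id: int, name: str, arguments: dict) -> str:
    """Serialize a JSON-RPC 2.0 `tools/call` request for an MCP server."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }
    return json.dumps(message)


# Arguments copied from the pdf_to_markdown example above.
request = tools_call_request(
    1,
    "pdf_to_markdown",
    {
        "pdf_path": "path/to/input.pdf",
        "output_path": "path/to/output.md",
        "extract_images": True,
        "preserve_formatting": True,
    },
)
print(request)
```

Over the STDIO transport each message is sent as a single line of JSON, which is why the sketch serializes without indentation.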

Supported Formats

Input Formats

  • PDF: All standard PDF files (PDF 1.0 - 1.7)
  • Markdown: CommonMark and GitHub Flavored Markdown

Output Options

  • Page Sizes: A4, A3, Letter, Legal
  • Margins: Customizable (e.g., "1cm", "0.5in")
  • Font Sizes: Any size in points
  • Images: PNG, JPEG extraction from PDFs
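
To make the page size and margin options concrete, here is a rough sketch of how a margin string such as "2cm" or "0.5in" maps to PDF points (72 per inch) and how much text area remains on each page size. It illustrates the arithmetic only: the function names are hypothetical, and the converter's own parsing may accept other units or behave differently.

```python
import re

# Points per unit; only the two units shown in the examples above are assumed.
POINTS_PER_UNIT = {"in": 72.0, "cm": 72.0 / 2.54}

# Standard page dimensions in points (width, height), portrait orientation.
PAGE_SIZES = {
    "A4": (210 / 25.4 * 72, 297 / 25.4 * 72),
    "A3": (297 / 25.4 * 72, 420 / 25.4 * 72),
    "Letter": (8.5 * 72, 11 * 72),
    "Legal": (8.5 * 72, 14 * 72),
}


def margin_to_points(margin: str) -> float:
    """Convert a margin string like "2cm" or "0.5in" to points."""
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*(in|cm)\s*", margin.lower())
    if match is None:
        raise ValueError(f"Unrecognized margin: {margin!r}")
    value, unit = match.groups()
    return float(value) * POINTS_PER_UNIT[unit]


def text_area(page_size: str, margin: str) -> tuple:
    """Return (width, height) in points left after equal margins on all sides."""
    width, height = PAGE_SIZES[page_size]
    inset = 2 * margin_to_points(margin)
    return (width - inset, height - inset)


print(text_area("Letter", "0.5in"))
```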

Unicode and Font Support

The converter automatically detects and uses appropriate fonts for different languages:

  • macOS: Arial Unicode, PingFang SC, STHeiti
  • Windows: Microsoft YaHei, SimSun, Arial Unicode MS
  • Linux: Noto Sans CJK, Source Han Sans, WenQuanYi
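
The per-platform lists above suggest a simple lookup: try each candidate in order and take the first one that is installed. The sketch below shows that idea. It assumes the listing order is the priority order and that the caller already has a list of installed font family names; how the converter actually enumerates fonts is not documented here, and pick_font is a hypothetical name.

```python
import platform
from typing import Iterable, Optional

# Candidate fonts per platform, in the order listed above.
FONT_CANDIDATES = {
    "Darwin": ["Arial Unicode", "PingFang SC", "STHeiti"],
    "Windows": ["Microsoft YaHei", "SimSun", "Arial Unicode MS"],
    "Linux": ["Noto Sans CJK", "Source Han Sans", "WenQuanYi"],
}


def pick_font(installed: Iterable[str], system: Optional[str] = None) -> Optional[str]:
    """Return the first installed font matching a candidate for this platform.

    Matching is by case-insensitive prefix, so "Noto Sans CJK SC" satisfies
    the "Noto Sans CJK" candidate. Returns None when nothing matches.
    """
    system = system or platform.system()
    names = list(installed)
    for candidate in FONT_CANDIDATES.get(system, []):
        for name in names:
            if name.lower().startswith(candidate.lower()):
                return name
    return None


print(pick_font(["DejaVu Sans", "Noto Sans CJK SC"], system="Linux"))
```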

Architecture

Conversion Engines

PDF → Markdown

  • PyMuPDF (MuPDF): High-quality text and image extraction

Markdown → PDF

  • ReportLab: Best Unicode support, cross-platform compatibility
  • xhtml2pdf: Good HTML/CSS rendering (fallback)
  • fpdf2: Basic PDF generation (last resort)

Engine Selection Logic

  1. If the content contains CJK characters, ReportLab is used
  2. If the content has complex formatting, xhtml2pdf is used
  3. For basic documents, any available engine is used
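
The selection rules above can be sketched as a small function. This is an illustration, not the project's code: the Unicode ranges used to detect CJK text, the definition of "complex formatting" (Markdown table rows or inline HTML tags), and the name select_engine are all assumptions.

```python
import re

# Assumed CJK detection: a few common Unicode blocks, not an exhaustive list.
CJK_PATTERN = re.compile(
    "["
    "\u3040-\u30ff"  # Hiragana and Katakana
    "\u3400-\u4dbf"  # CJK Unified Ideographs Extension A
    "\u4e00-\u9fff"  # CJK Unified Ideographs
    "\uac00-\ud7af"  # Hangul Syllables
    "]"
)

# Assumed meaning of "complex formatting": table rows or inline HTML tags.
COMPLEX_PATTERN = re.compile(r"^\s*\|.*\|\s*$|<[a-zA-Z][^>]*>", re.MULTILINE)

# Preference order taken from the engine list: best, fallback, last resort.
ENGINE_ORDER = ("reportlab", "xhtml2pdf", "fpdf2")


def select_engine(markdown_text: str, available=ENGINE_ORDER) -> str:
    """Pick a rendering engine name following the three rules above."""
    if CJK_PATTERN.search(markdown_text) and "reportlab" in available:
        return "reportlab"
    if COMPLEX_PATTERN.search(markdown_text) and "xhtml2pdf" in available:
        return "xhtml2pdf"
    for engine in ENGINE_ORDER:
        if engine in available:
            return engine
    raise RuntimeError("No PDF engine available")


print(select_engine("# 标题\n\n正文内容"))
```

Checking for CJK first means a document that has both Chinese text and tables still goes to ReportLab, which matches the order of the rules.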

Development

Setup Development Environment

# Clone the repository
git clone https://github.com/yourusername/huoshui-pdf-converter.git
cd huoshui-pdf-converter

# Install dependencies
uv pip install -e ".[dev]"

# Run tests
python test_converter.py

Project Structure

huoshui-pdf-converter/
├── huoshui_pdf_converter/
│   ├── __init__.py
│   ├── server.py           # MCP server implementation
│   ├── pdf_converter.py    # PDF to Markdown converter
│   └── markdown_converter.py # Markdown to PDF converter
├── pyproject.toml
├── README.md
├── LICENSE
└── test_converter.py

Troubleshooting

Common Issues

  1. Chinese characters not displaying:

    • Ensure Arial Unicode or similar fonts are installed
    • The converter will automatically detect and use appropriate fonts
  2. Import errors:

    • Install all dependencies: pip install "huoshui-pdf-converter[all]" (quote the name so shells such as zsh do not expand the brackets)
  3. MCP connection issues:

    • Check Claude Desktop logs
    • Ensure Python is in your PATH

Logging

Enable debug logging:

import logging
logging.basicConfig(level=logging.DEBUG)

Contributing

Contributions are welcome! Please:

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

  • Built with FastMCP for Model Context Protocol support
  • Uses PyMuPDF for PDF parsing
  • Uses ReportLab for PDF generation
  • Inspired by the need for better PDF ↔ Markdown conversion tools

Support

  • Issues: GitHub Issues
  • Discussions: GitHub Discussions
  • Email: your.email@example.com

Categories: Documents & Knowledge
Registry: active
Package: huoshui-pdf-converter
Transport: STDIO
Updated: Sep 11, 2025
