A Go-based web fetcher that returns clean markdown from any URL through a single fetch operation. It strips ads and navigation using Mozilla's Readability algorithm, rate-limits requests per domain, and includes SSRF protection that blocks private IPs and validates redirects. The base build handles static pages, starts in about 5ms, and uses about 10MB of RAM. Rebuild with build tags to add image extraction or headless Chrome rendering for JavaScript-heavy sites. It ships with LRU caching (5-minute default TTL), configurable timeouts, and retries with exponential backoff. Reach for this when you need your AI to read web content without pulling in raw HTML or building scraping infrastructure yourself.
Give your AI assistant the ability to read any webpage.
A lightweight MCP server that fetches URLs and returns clean, readable markdown. Built with Go — starts in 5ms, uses ~10MB RAM, zero runtime dependencies.
You give it a URL → it returns clean markdown.
Good for:
Not good for:
Pick one method:
# Go (recommended)
go install github.com/paimonchan/paimon-mcp-fetch/cmd/paimon-mcp-fetch@latest
# Homebrew (macOS/Linux)
brew tap paimonchan/tap
brew install paimon-mcp-fetch
# Scoop (Windows)
scoop bucket add paimonchan https://github.com/paimonchan/scoop-bucket
scoop install paimon-mcp-fetch
# Winget (Windows)
winget install paimonchan.paimon-mcp-fetch
# Docker
docker run -i --rm ghcr.io/paimonchan/paimon-mcp-fetch:latest
Add this to your MCP client config:
{
  "mcp": {
    "paimon-mcp-fetch": {
      "type": "local",
      "command": ["paimon-mcp-fetch"],
      "enabled": true
    }
  }
}
Your AI can now read any URL you give it.
| | paimon-mcp-fetch | Basic text fetch |
|---|---|---|
| Output | Structured markdown | Plain text |
| Article extraction | Readability algorithm (strips ads, nav, sidebars) | Raw HTML body |
| Images | Optional extraction + processing | None |
| JS rendering | Optional (headless Chrome) | Static only |
| Caching | Built-in LRU cache | None |
| Rate limiting | Per-domain, configurable | None |
| SSRF protection | 7-layer defense | None |
| Startup time | ~5ms | Varies |
| Memory | ~10MB | Varies |
Everything is controlled via environment variables. You probably don't need to change anything — defaults work well for most use cases.
| Variable | Default | What it does |
|---|---|---|
| PAIMON_MCP_FETCH_TIMEOUT_MS | 12000 | Request timeout (ms) |
| PAIMON_MCP_FETCH_MAX_HTML_BYTES | 10485760 | Max page size (10MB) |
| PAIMON_MCP_FETCH_CACHE_TTL_SECS | 300 | Cache lifetime (5 min) |
| PAIMON_MCP_FETCH_RATE_LIMIT_PER_SECOND | 5.0 | Requests/sec per domain |
| PAIMON_MCP_FETCH_RATE_LIMIT_BURST | 10 | Max burst size |
| PAIMON_MCP_FETCH_RETRY_MAX_ATTEMPTS | 3 | Retry on transient errors |
| PAIMON_MCP_FETCH_JS_RENDER_ENABLED | false | Enable headless Chrome |
Extract and process images from webpages:
go build -tags image -o paimon-mcp-fetch ./cmd/paimon-mcp-fetch/
For JavaScript-heavy sites (SPAs, dynamic content):
go build -tags jsrender -o paimon-mcp-fetch ./cmd/paimon-mcp-fetch/
PAIMON_MCP_FETCH_JS_RENDER_ENABLED=true ./paimon-mcp-fetch
Note: Requires Chrome or Chromium installed. Slower (~3-5s/page) but handles sites that static fetch can't.
Works with any MCP-compatible client. Add the configuration below to your client's config file.
Config file: ~/.config/opencode/opencode.json
{
  "mcp": {
    "paimon-mcp-fetch": {
      "type": "local",
      "command": ["paimon-mcp-fetch"],
      "enabled": true
    }
  }
}
Config file: claude_desktop_config.json
{
  "mcpServers": {
    "paimon-mcp-fetch": {
      "command": "paimon-mcp-fetch"
    }
  }
}
Config file: .cursor/mcp.json (project) or ~/.cursor/mcp.json (global)
{
  "mcpServers": {
    "paimon-mcp-fetch": {
      "command": "paimon-mcp-fetch",
      "env": {}
    }
  }
}
Config file: .vscode/mcp.json (workspace) or user settings
{
  "servers": {
    "paimon-mcp-fetch": {
      "type": "stdio",
      "command": "paimon-mcp-fetch"
    }
  }
}
Config file: .cline/mcp.json
{
  "mcpServers": {
    "paimon-mcp-fetch": {
      "command": "paimon-mcp-fetch"
    }
  }
}
Config file: .windsurf/mcp.json
{
  "mcpServers": {
    "paimon-mcp-fetch": {
      "command": "paimon-mcp-fetch"
    }
  }
}
All clients support environment variables. Example with custom timeout and cache:
{
  "mcpServers": {
    "paimon-mcp-fetch": {
      "command": "paimon-mcp-fetch",
      "env": {
        "PAIMON_MCP_FETCH_TIMEOUT_MS": "30000",
        "PAIMON_MCP_FETCH_CACHE_TTL_SECS": "600"
      }
    }
  }
}
Built with security in mind from day one:
Some antivirus software may flag unsigned Go binaries as a false positive. This is a known industry issue. Solutions:
go install: antivirus sees the signed Go compiler
Built with Clean Architecture principles:
MCP Server → UseCase → Domain (entities, ports)
↑
Adapters implement interfaces
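The layering in the diagram can be sketched in a few lines of Go. Every name here (`Page`, `Fetcher`, `Converter`, `FetchPage`, and the two toy adapters) is hypothetical and chosen only to illustrate the shape: the domain declares ports as interfaces, the use case depends only on those ports, and adapters implement them.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Domain: an entity plus the ports (interfaces) the use case needs.
type Page struct {
	URL      string
	Markdown string
}

type Fetcher interface {
	Fetch(url string) (html string, err error)
}

type Converter interface {
	ToMarkdown(html string) (string, error)
}

// Use case: orchestrates the ports and knows nothing about HTTP or parsers.
type FetchPage struct {
	Fetcher   Fetcher
	Converter Converter
}

func (uc FetchPage) Execute(url string) (Page, error) {
	if url == "" {
		return Page{}, errors.New("empty URL")
	}
	html, err := uc.Fetcher.Fetch(url)
	if err != nil {
		return Page{}, fmt.Errorf("fetch: %w", err)
	}
	md, err := uc.Converter.ToMarkdown(html)
	if err != nil {
		return Page{}, fmt.Errorf("convert: %w", err)
	}
	return Page{URL: url, Markdown: md}, nil
}

// Adapters: toy stand-ins for an HTTP client and a Readability converter.
type staticFetcher map[string]string

func (f staticFetcher) Fetch(url string) (string, error) {
	html, ok := f[url]
	if !ok {
		return "", fmt.Errorf("no canned page for %s", url)
	}
	return html, nil
}

type headingConverter struct{}

// ToMarkdown only rewrites <h1> tags; a real adapter would do far more.
func (headingConverter) ToMarkdown(html string) (string, error) {
	return strings.NewReplacer("<h1>", "# ", "</h1>", "").Replace(html), nil
}

func main() {
	uc := FetchPage{
		Fetcher:   staticFetcher{"https://example.com": "<h1>Hello</h1>"},
		Converter: headingConverter{},
	}
	page, err := uc.Execute("https://example.com")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(page.Markdown)
}
```

Because the use case sees only interfaces, a headless Chrome fetcher or a caching decorator could be swapped in without touching `FetchPage`.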
Full details in the project plan.
MIT — do whatever you want with it.
Built with Go. Zero runtime dependencies. Single binary. ~10MB RAM. Starts in 5ms.
| Variable | Default | What it does |
|---|---|---|
| PAIMON_MCP_FETCH_MAX_LENGTH | 50000 | Maximum content length to return, in characters |
| PAIMON_MCP_FETCH_TIMEOUT | 30 | HTTP request timeout in seconds |
| PAIMON_MCP_FETCH_DISABLE_ROBOTS | false | Disable robots.txt checking (use with caution) |
| PAIMON_MCP_FETCH_USER_AGENT | Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36 | Custom User-Agent string for HTTP requests |