Intercept

bighippoman/intercept-mcp
7 · auth · STDIO · registry active
Summary

Exposes fetch and search tools that convert web pages to clean markdown using a 15-strategy fallback pipeline. Handles site-specific formats like tweets, YouTube transcripts, arXiv papers, PDFs, and GitHub READMEs before falling back to Jina Reader, archive services, and stealth fetching. Routes requests through agentsweb.org's shared cache first for sub-50ms hits on previously fetched URLs. No API keys required for basic fetching, though you'll want Brave or SearXNG for reliable search and Cloudflare Browser Run for JavaScript-heavy sites. Includes research-topic and extract-article prompts that chain search with multi-source fetching. Reach for this when you need to give Claude reliable web access without hitting 403s or wrestling with raw HTML.

intercept-mcp

Give your AI the ability to read the web. One command, no API keys required.

Without it, your AI hits a URL and gets a 403, a wall, or a wall of raw HTML. With intercept, it almost always gets the content — clean markdown, ready to use.

Handles tweets, YouTube videos (with transcripts when available), arXiv papers, PDFs, Wikipedia articles, and GitHub repos. If the first strategy fails, it tries up to 14 more before giving up.

Works with any MCP client: Claude Code, Claude Desktop, Codex, Cursor, Windsurf, Cline, and more.

Install

Claude Code

claude mcp add intercept -s user -- npx -y intercept-mcp

Codex

codex mcp add intercept -- npx -y intercept-mcp

Cursor

Settings → MCP → Add Server:

{
  "mcpServers": {
    "intercept": {
      "command": "npx",
      "args": ["-y", "intercept-mcp"]
    }
  }
}

Windsurf

Settings → MCP → Add Server → same JSON config as above.

Claude Desktop

Add to your claude_desktop_config.json:

{
  "mcpServers": {
    "intercept": {
      "command": "npx",
      "args": ["-y", "intercept-mcp"]
    }
  }
}

Other MCP clients

Any client that supports stdio MCP servers can run npx -y intercept-mcp.

No API keys needed for the fetch tool.

How it works

URLs are processed in four stages:

1. Site-specific handlers

Known URL patterns are routed to dedicated handlers before the fallback pipeline:

| Pattern | Handler | What you get |
| --- | --- | --- |
| twitter.com/*/status/*, x.com/*/status/* | Twitter/X | Tweet text, author, media, engagement stats (via third-party APIs) |
| youtube.com/watch?v=*, youtu.be/* | YouTube | Title, channel, duration, views, description, transcript (when captions available) |
| arxiv.org/abs/*, arxiv.org/pdf/* | arXiv | Paper metadata, authors, abstract, categories |
| *.pdf | PDF | Extracted text (text-layer PDFs only) |
| *.wikipedia.org/wiki/* | Wikipedia | Clean article content via Wikimedia REST API |
| github.com/{owner}/{repo} | GitHub | Raw README.md content |
| github.com/{o}/{r}/blob/{ref}/{path} | GitHub | Raw file content, code-fenced by language |
| github.com/{o}/{r}/issues/{n}, /pull/{n} | GitHub | Issue/PR title, state, body, diff stats, comments (via GitHub API) |
| github.com/{o}/{r}/releases/tag/{t}, /releases/latest | GitHub | Release notes (via GitHub API) |

The GitHub API endpoints work unauthenticated (60 requests/hour). Set GITHUB_TOKEN to raise the limit.
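
The routing in the table can be pictured as a small URL matcher. This is a sketch for illustration, not intercept-mcp's source: the `routeHandler` function, the handler names, the `www.` stripping, and the choice to test arXiv before the generic PDF rule are all assumptions. Only the URL patterns come from the table.

```typescript
// Hypothetical handler names; only the URL patterns come from the table above.
type HandlerName = "twitter" | "youtube" | "arxiv" | "wikipedia" | "github" | "pdf" | "fallback";

const GITHUB_PATHS: RegExp[] = [
  /^\/[^/]+\/[^/]+\/?$/,                            // {owner}/{repo}
  /^\/[^/]+\/[^/]+\/blob\/[^/]+\/.+/,               // blob/{ref}/{path}
  /^\/[^/]+\/[^/]+\/(issues|pull)\/\d+/,            // issues/{n}, pull/{n}
  /^\/[^/]+\/[^/]+\/releases\/(tag\/[^/]+|latest)/, // releases/tag/{t}, releases/latest
];

function routeHandler(rawUrl: string): HandlerName {
  const url = new URL(rawUrl);
  // Assumption: a leading "www." is ignored when matching hosts.
  const host = url.hostname.toLowerCase().replace(/^www\./, "");
  const path = url.pathname;

  if ((host === "twitter.com" || host === "x.com") && /^\/[^/]+\/status\/[^/]+/.test(path)) return "twitter";
  if (host === "youtube.com" && path === "/watch" && url.searchParams.has("v")) return "youtube";
  if (host === "youtu.be" && path.length > 1) return "youtube";
  // Checked before the generic PDF rule so arxiv.org/pdf/* goes to the arXiv handler.
  if (host === "arxiv.org" && /^\/(abs|pdf)\/.+/.test(path)) return "arxiv";
  if (host.endsWith(".wikipedia.org") && path.startsWith("/wiki/")) return "wikipedia";
  if (host === "github.com" && GITHUB_PATHS.some((re) => re.test(path))) return "github";
  if (path.toLowerCase().endsWith(".pdf")) return "pdf";
  // Anything else goes to the multi-tier fallback pipeline.
  return "fallback";
}
```

A URL that matches no pattern returns `"fallback"`, which corresponds to entering the fallback pipeline.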

2. Shared cache (agentsweb.org)

Before hitting any fetcher, every request checks agentsweb.org — a global shared markdown cache for AI agents backed by a 9-source parallel fetch pipeline with JS/SPA rendering (React, Vue, Angular via Cloudflare Browser Run). If another agent already fetched this URL, you get the result in under 50ms.

Every successful fetch contributes back automatically. Entries gain trust through a self-healing consensus model: when independent instances fetch the same URL and confirm the same content, confidence increases.

Opt out entirely with INTERCEPT_SHARED_CACHE=false, or use read-only mode (consume but never contribute) with INTERCEPT_CACHE_READ_ONLY=true.

agentsweb.org API

agentsweb.org also exposes standalone endpoints for direct use:

  • /web?q= — search the web
  • /research?q= — search + fetch + cache in one call
  • /fetch?url= — fetch on demand, auto-cached

See agentsweb.org/docs for full API documentation.

3. Fallback pipeline

If no handler matches (or the handler returns nothing), the URL enters the multi-tier pipeline:

| Tier | Fetcher | Strategy |
| --- | --- | --- |
| 0 | agentsweb.org | Global shared markdown cache; instant if another agent already fetched this URL |
| 1 | Cloudflare Browser Run | JS/SPA rendering + markdown extraction; also powers agentsweb.org (optional, needs API token) |
| 1 | Jina Reader | Clean markdown extraction service |
| 2 | Wayback Machine | Archived version from archive.org |
| 2 | Arquivo.pt | Portuguese web archive (broad international coverage) |
| 2 | Common Crawl | Petabyte web archive read from Common Crawl's index + S3; not subject to the origin's rate limits, bot detection, or paywall |
| 2 | Codetabs | CORS proxy |
| 3 | Markdown endpoint | Asks the site for a native markdown version (<path>.md + Accept: text/markdown) |
| 3 | archive.ph | Archived snapshots via timemap API + stealth TLS fetch |
| 3 | Raw fetch | Direct GET with browser headers + Turndown markdown conversion |
| 3 | Stealth fetch | Browser TLS fingerprint impersonation via got-scraping (opt-in, see below) |
| 3 | FlareSolverr | Real-browser challenge solver for Cloudflare/DDoS-Guard (opt-in, needs a FlareSolverr instance) |
| 3 | Web unlocker | Commercial unlocker API: residential rotation + rendering + CAPTCHA (opt-in, BYO key, paid per request) |
| 4 | RSS, CrossRef, Semantic Scholar, HN, Reddit | Metadata / discussion fallbacks |
| 5 | OG Meta | Open Graph tags (guaranteed fallback) |

Tier 2 fetchers run in parallel. When multiple succeed, the highest quality result wins. All other tiers run sequentially.
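
The scheduling rule above (tier 2 in parallel with the best result winning, every other tier sequential) can be sketched as follows. This is a sketch under stated assumptions, not the project's code: the `Fetcher` and `FetchResult` shapes, the numeric `quality` score, and `runPipeline` are hypothetical names, and the fetchers are stubs supplied by the caller.

```typescript
// Hypothetical shapes; the real fetcher interface is not documented here.
interface FetchResult { source: string; markdown: string; quality: number }
interface Fetcher {
  name: string;
  tier: number;
  run: (url: string) => Promise<FetchResult | null>;
}

async function runPipeline(url: string, fetchers: Fetcher[], maxTier: number = 5): Promise<FetchResult | null> {
  const tiers = Array.from(new Set(fetchers.map((f) => f.tier))).sort((a, b) => a - b);
  for (const tier of tiers) {
    if (tier > maxTier) break; // stop early for speed-sensitive callers
    const group = fetchers.filter((f) => f.tier === tier);
    if (tier === 2) {
      // Tier 2: start every fetcher at once, then keep the highest-quality success.
      const settled = await Promise.all(group.map((f) => f.run(url).catch(() => null)));
      const successes = settled.filter((r): r is FetchResult => r !== null);
      if (successes.length > 0) {
        return successes.reduce((best, r) => (r.quality > best.quality ? r : best));
      }
    } else {
      // Other tiers: try fetchers one by one and return the first success.
      for (const f of group) {
        const result = await f.run(url).catch(() => null);
        if (result !== null) return result;
      }
    }
  }
  return null; // every fetcher up to maxTier failed
}
```

A fetcher that throws is treated the same as one that returns nothing, so a single failing service does not abort the chain.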

All fetchers return proper Markdown (headings, links, bold, tables, code blocks) via Turndown — not plain text.

4. Caching

Results are cached in-memory with TTL (60 min for successes, 5 min for failures). Max 250 entries with LRU eviction. Failed URLs are cached to prevent re-attempting known-dead URLs. All three knobs are configurable via INTERCEPT_CACHE_TTL_MS, INTERCEPT_CACHE_FAILURE_TTL_MS, and INTERCEPT_CACHE_SIZE.
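
A TTL cache with LRU eviction like the one described can be built on a `Map`, which preserves insertion order. The class below is a minimal sketch, not the project's implementation: `TtlLruCache` and its injectable clock are assumptions made for illustration, and the defaults simply mirror the documented values.

```typescript
// Documented defaults: 60 min for successes, 5 min for failures, 250 entries.
const SUCCESS_TTL_MS = 60 * 60 * 1000;
const FAILURE_TTL_MS = 5 * 60 * 1000;

class TtlLruCache<V> {
  private readonly entries = new Map<string, { value: V; expiresAt: number }>();
  private readonly maxSize: number;
  private readonly now: () => number;

  // The clock is injectable so expiry can be tested without waiting.
  constructor(maxSize: number = 250, now: () => number = () => Date.now()) {
    this.maxSize = maxSize;
    this.now = now;
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (entry === undefined) return undefined;
    if (entry.expiresAt <= this.now()) {
      this.entries.delete(key); // expired
      return undefined;
    }
    // Re-insert to mark the key as most recently used.
    this.entries.delete(key);
    this.entries.set(key, entry);
    return entry.value;
  }

  set(key: string, value: V, ttlMs: number): void {
    this.entries.delete(key);
    this.entries.set(key, { value, expiresAt: this.now() + ttlMs });
    if (this.entries.size > this.maxSize) {
      // The first key in iteration order is the least recently used.
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
  }
}
```

A caller would store successes with `SUCCESS_TTL_MS` and failures with `FAILURE_TTL_MS`, so a dead URL is retried after five minutes rather than an hour.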

Tools

fetch

Fetch a URL and return its content as clean markdown.

  • url (string, required) — URL to fetch
  • maxTier (number, optional, 1-5) — Stop at this tier for speed-sensitive cases
  • maxLength (number, optional, default 50000) — Maximum characters to return
  • startIndex (number, optional, default 0) — Character offset for paginating long content
  • noCache (boolean, optional) — Skip session and shared caches and fetch live

Long pages are truncated at maxLength with a notice telling the agent which startIndex continues the content. Structured output reports source, quality, contentLength, truncated, nextStartIndex, and cacheAgeSeconds so agents can branch on them programmatically.
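
The `maxLength` / `startIndex` contract amounts to slicing the content and reporting where the next slice begins. Here is a minimal sketch of that idea: the field names come from the structured output listed above, while `slicePage`, `readAll`, and the use of `null` for "no more content" are assumptions.

```typescript
interface PageSlice {
  text: string;
  contentLength: number;
  truncated: boolean;
  nextStartIndex: number | null; // assumption: null when nothing remains
}

function slicePage(content: string, startIndex: number = 0, maxLength: number = 50000): PageSlice {
  const start = Math.min(Math.max(startIndex, 0), content.length);
  const end = Math.min(start + Math.max(1, maxLength), content.length);
  const truncated = end < content.length;
  return {
    text: content.slice(start, end),
    contentLength: content.length,
    truncated,
    nextStartIndex: truncated ? end : null,
  };
}

// How an agent might page through a long document.
function readAll(content: string, maxLength: number): string[] {
  const pages: string[] = [];
  let next: number | null = 0;
  while (next !== null) {
    const page: PageSlice = slicePage(content, next, maxLength);
    pages.push(page.text);
    next = page.nextStartIndex;
  }
  return pages;
}
```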

Direct image URLs (.png, .jpg, .gif, .webp, up to 5 MB) are returned as an MCP image block instead of text, so the agent's own vision model can read charts, diagrams, screenshots, and scanned documents. The structured output reports source: "image", mimeType, and bytes.

fetch_batch

Fetch up to 10 URLs in parallel, each through the same handler/fallback chain.

  • urls (string[], required, 1-10) — URLs to fetch
  • maxTier, noCache — as in fetch
  • maxLength (number, optional, default 20000) — Per-URL character budget

research

Search the web and fetch the top results in one call — replaces a search followed by several fetches.

  • query (string, required) — Search query
  • count (number, optional, 1-5, default 3) — Results to fetch
  • maxLength (number, optional, default 20000) — Per-result character budget
  • site (string, optional) — Restrict to a domain
  • freshness (string, optional) — day, week, month, or year

search

Search the web and return results.

  • query (string, required) — Search query
  • count (number, optional, 1-20, default 5) — Number of results
  • site (string, optional) — Restrict results to a domain
  • freshness (string, optional) — day, week, month, or year
  • page (number, optional, 1-10) — Results page for pagination

Uses Brave Search API if BRAVE_API_KEY is set, then SearXNG if SEARXNG_URL is set, then DuckDuckGo as an unreliable last resort. freshness and page are ignored by the DuckDuckGo fallback.
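
The provider order can be summarized in a few lines. This sketch only restates the precedence described above. `pickSearchProvider` and `effectiveOptions` are hypothetical helpers, not functions exported by intercept-mcp.

```typescript
type SearchProvider = "brave" | "searxng" | "duckduckgo";

interface SearchOptions {
  query: string;
  count?: number;
  site?: string;
  freshness?: "day" | "week" | "month" | "year";
  page?: number;
}

// Brave first, then SearXNG, then DuckDuckGo as the last resort.
function pickSearchProvider(env: Record<string, string | undefined>): SearchProvider {
  if (env.BRAVE_API_KEY) return "brave";
  if (env.SEARXNG_URL) return "searxng";
  return "duckduckgo";
}

// The DuckDuckGo fallback ignores freshness and page, so drop them up front.
function effectiveOptions(provider: SearchProvider, options: SearchOptions): SearchOptions {
  if (provider !== "duckduckgo") return options;
  const supported: SearchOptions = { query: options.query };
  if (options.count !== undefined) supported.count = options.count;
  if (options.site !== undefined) supported.site = options.site;
  return supported;
}
```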

extract

Extract specific values from a page as JSON instead of markdown prose — for when you need particular data, not the whole page. Honors per-domain auth and proxies.

  • url (string, required) — The URL to extract from
  • selectors (object, optional) — Map of field name → CSS selector. Each value is either a selector string (returns the first match's text) or { selector, attr?, all? } — attr extracts an attribute (e.g. href), all: true returns every match as an array.
  • tables (boolean, optional) — Convert every HTML table to an array of row objects (defaults to true when no selectors are given).

Example arguments:

{
  "url": "https://shop.example.com/item",
  "selectors": {
    "title": "h1",
    "price": ".price",
    "images": { "selector": "img.gallery", "attr": "src", "all": true }
  }
}

Returns the extracted fields and/or tables as structured output.
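
For callers working in TypeScript, the argument shape described above can be written down as types. The following is a sketch under assumptions: the type names and `normalizeExtractArgs` are hypothetical, and treating `tables` as false when selectors are present is inferred from the "defaults to true when no selectors are given" wording rather than confirmed.

```typescript
// A selector is either a bare CSS selector or an object with options.
type SelectorSpec = string | { selector: string; attr?: string; all?: boolean };

interface ExtractArgs {
  url: string;
  selectors?: Record<string, SelectorSpec>;
  tables?: boolean;
}

interface NormalizedField {
  field: string;
  selector: string;
  attr: string | null; // null means "return the element's text"
  all: boolean;        // false means "first match only"
}

function normalizeExtractArgs(args: ExtractArgs): { fields: NormalizedField[]; tables: boolean } {
  const selectors = args.selectors ?? {};
  const fields = Object.keys(selectors).map((field): NormalizedField => {
    const spec = selectors[field];
    return typeof spec === "string"
      ? { field, selector: spec, attr: null, all: false }
      : { field, selector: spec.selector, attr: spec.attr ?? null, all: spec.all ?? false };
  });
  // Inferred default: tables are extracted only when no selectors were given.
  return { fields, tables: args.tables ?? (fields.length === 0) };
}
```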

Resources

intercept://session/recent

Markdown list of URLs fetched and cached in this session, most recent first. Re-fetching any of them is instant.

Prompts

research-topic

Search for a topic and fetch the top results for a multi-source summary.

  • topic (string) — The topic to research
  • depth (string, default "3") — Number of top results to fetch

extract-article

Fetch a URL and extract the key points from the content.

  • url (string) — The URL to fetch and summarize

Environment variables

| Variable | Required | Description |
| --- | --- | --- |
| BRAVE_API_KEY | No | Brave Search API key for search |
| SEARXNG_URL | No | Self-hosted SearXNG instance URL (recommended) |
| GITHUB_TOKEN | No | GitHub token raising API rate limits for the issue/PR/release handler |
| INTERCEPT_AUTH | No | JSON map of domain → headers/cookies, to fetch content you're logged in to (see Per-domain authentication) |
| CF_API_TOKEN | No | Cloudflare API token with "Browser Rendering - Edit" permission |
| CF_ACCOUNT_ID | No | Cloudflare account ID (required if CF_API_TOKEN is set) |
| USE_STEALTH_FETCH | No | Set to true to enable stealth fetcher (see warning below) |
| FLARESOLVERR_URL | No | URL of a FlareSolverr instance (e.g. http://localhost:8191) to solve Cloudflare/DDoS-Guard challenges |
| WEB_UNLOCKER_URL | No | GET template (with a {url} placeholder and your API key) for a commercial web-unlocker like ScrapingBee/ScraperAPI/ZenRows; the paid last resort for the hardest sites |
| INTERCEPT_SHARED_CACHE | No | Set to false to disable the agentsweb.org shared cache |
| INTERCEPT_CACHE_READ_ONLY | No | Set to true to consume but never contribute to the shared cache |
| INTERCEPT_CACHE_TTL_MS | No | In-memory cache TTL for successful fetches in ms (default 3600000 = 60 min) |
| INTERCEPT_CACHE_FAILURE_TTL_MS | No | In-memory cache TTL for failed fetches in ms (default 300000 = 5 min) |
| INTERCEPT_CACHE_SIZE | No | Max in-memory cache entries (default 250) |
| HTTPS_PROXY / HTTP_PROXY | No | Standard proxy passthrough: routes all outbound fetches (including stealth) through the proxy. Honors NO_PROXY. |
| INTERCEPT_PROXIES | No | Comma/space-separated list of HTTP(S) proxies to rotate across, with automatic retry through the next proxy on a blocked response. Takes precedence over HTTPS_PROXY. |

Search: Has a DuckDuckGo fallback but it's rate-limited and unreliable. For production use, self-host SearXNG and set SEARXNG_URL (see below), or get a Brave Search API key.

Fetch: Works without any keys. Set CF_API_TOKEN + CF_ACCOUNT_ID to enable Cloudflare Browser Run (formerly Browser Rendering) for JavaScript-heavy pages (SPAs, React sites).

Stealth fetch (USE_STEALTH_FETCH)

Use at your own risk. When enabled, this adds a fetcher that impersonates real browser TLS fingerprints (Chrome/Firefox cipher suites, HTTP/2 settings, header ordering) using got-scraping. This can bypass bot detection and CAPTCHA triggers on sites that would otherwise block automated requests.

This fetcher runs at tier 3 after the regular raw fetch. If the raw fetch gets blocked (CAPTCHA, Cloudflare challenge, 403), the stealth fetcher retries with browser impersonation.

This may violate the terms of service of some websites. The authors of intercept-mcp take no responsibility for how this feature is used. It is disabled by default and must be explicitly opted into.

Challenge solving (FLARESOLVERR_URL)

The stealth fetcher impersonates a browser's TLS fingerprint, but it can't execute a JavaScript challenge — so sites protected by a Cloudflare "Checking your browser" / DDoS-Guard interstitial still block it. FlareSolverr runs a real headless browser that solves the challenge and returns the page HTML.

Run it (Docker):

docker run -d -p 8191:8191 ghcr.io/flaresolverr/flaresolverr:latest

Then set FLARESOLVERR_URL=http://localhost:8191. It runs at tier 3 as a last resort after the raw and stealth fetchers, and only when this variable is set. Solving a challenge can take 30–60s, so it's the slowest fetcher — but it recovers pages nothing else can.

Commercial web unlocker (WEB_UNLOCKER_URL)

For the hardest targets (sites that need residential IP rotation, real-browser rendering, and CAPTCHA handling together), a commercial unlocker is the pragmatic answer. intercept-mcp supports any unlocker that exposes a "GET this URL, return the HTML" endpoint. Set WEB_UNLOCKER_URL to a URL template that contains your API key and a {url} placeholder for the target:

# ScrapingBee
WEB_UNLOCKER_URL='https://app.scrapingbee.com/api/v1/?api_key=KEY&render_js=true&url={url}'
# ScraperAPI
WEB_UNLOCKER_URL='https://api.scraperapi.com/?api_key=KEY&render=true&url={url}'
# ZenRows
WEB_UNLOCKER_URL='https://api.zenrows.com/v1/?apikey=KEY&js_render=true&url={url}'

intercept substitutes the (URL-encoded) target for {url} and converts the returned HTML (or JSON wrapping it) to markdown. It runs at tier 3 as a paid last resort after the free fetchers, only when this variable is set — and your credentials in the template are only ever sent to the unlocker, never to the target. Bright Data's proxy-based Web Unlocker is just an authenticated proxy, so use HTTPS_PROXY / INTERCEPT_PROXIES for that instead. This bills per request.
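
The substitution step is simple enough to show directly. This is a sketch of the idea only: `buildUnlockerUrl` is a hypothetical helper, and rejecting a template without a placeholder is an assumption about sensible behavior, not documented behavior.

```typescript
// Replace {url} in the template with the URL-encoded target.
function buildUnlockerUrl(template: string, target: string): string {
  if (!template.includes("{url}")) {
    // Assumption: a template without the placeholder is treated as a configuration error.
    throw new Error("WEB_UNLOCKER_URL must contain a {url} placeholder");
  }
  return template.split("{url}").join(encodeURIComponent(target));
}
```

Encoding matters because the target usually carries its own query string. Without it, the target's `&` and `=` characters would be read as parameters of the unlocker request.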

Bring-your-own proxy (HTTPS_PROXY)

If raw fetches start getting flagged, the most effective fix is usually a clean outbound IP — not a fancier fingerprint. intercept-mcp honors the standard HTTPS_PROXY / HTTP_PROXY / NO_PROXY env vars, so you can route all outbound traffic through whatever proxy you already have:

HTTPS_PROXY=http://user:pass@proxy.example.com:8080 npx intercept-mcp

This works with any HTTP(S) proxy — a self-hosted Squid, a Tailscale exit node, a $5 VPS running 3proxy, or commercial residential proxies (Bright Data, Oxylabs, etc.). The stealth fetcher and got-scraping calls also pick this up automatically.

Proxy rotation (INTERCEPT_PROXIES)

A single proxy still presents a single IP, which can itself get flagged under load. Set INTERCEPT_PROXIES to a comma- or space-separated list and intercept-mcp round-robins across them, automatically retrying through the next proxy when a request comes back blocked (HTTP 403, 429, 451, 503) or errors:

INTERCEPT_PROXIES="http://user:pass@p1.example.com:8080,http://user:pass@p2.example.com:8080,http://p3.example.com:8080" npx intercept-mcp

Requests spread across the list, and a blocked response is retried through a different egress (up to 3 attempts) before giving up — so a handful of cheap proxies, or a rotating residential endpoint listed multiple times, behave like a pool. INTERCEPT_PROXIES takes precedence over HTTPS_PROXY, applies per request (so the stealth and archive.ph got-scraping calls rotate too), and accepts HTTP(S) proxies. Invalid entries are ignored.
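
The rotation and retry behavior can be sketched as below. The blocked status codes and the three-attempt limit come from the text above. The function names, the `send` callback (a stand-in for a real proxied request that resolves to an HTTP status), and the `startAt` offset are assumptions for illustration.

```typescript
const BLOCKED_STATUSES = [403, 429, 451, 503];

// Split on commas or whitespace and keep only valid http(s) proxy URLs.
function parseProxyList(raw: string): string[] {
  return raw.split(/[\s,]+/).filter((entry) => {
    try {
      const protocol = new URL(entry).protocol;
      return protocol === "http:" || protocol === "https:";
    } catch {
      return false; // invalid entries are ignored
    }
  });
}

interface Attempt { proxy: string; status: number }

// Try up to maxAttempts proxies in round-robin order, moving on when blocked.
async function requestWithRotation(
  proxies: string[],
  send: (proxy: string) => Promise<number>, // hypothetical: performs the request via this proxy
  startAt: number = 0,
  maxAttempts: number = 3,
): Promise<Attempt | null> {
  if (proxies.length === 0) return null;
  for (let i = 0; i < maxAttempts; i++) {
    const proxy = proxies[(startAt + i) % proxies.length];
    const status = await send(proxy).catch(() => -1); // -1 marks a network error
    if (status !== -1 && BLOCKED_STATUSES.indexOf(status) === -1) return { proxy, status };
  }
  return null; // all attempts were blocked or errored
}
```

Advancing `startAt` between requests spreads load across the list, which is what makes a few cheap proxies behave like a pool.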

Per-domain authentication (INTERCEPT_AUTH)

Most of the web is behind a login. INTERCEPT_AUTH lets you attach your own headers or cookies to requests for a specific origin, so the fetch tools can read content you're legitimately signed in to — a paid subscription, a private dashboard, an intranet, an authenticated API.

It's a JSON object mapping a domain to a header map. A domain also matches its subdomains:

INTERCEPT_AUTH='{
  "nytimes.com": { "Cookie": "nyt-s=...; nyt-a=..." },
  "api.acme.com": { "Authorization": "Bearer eyJ..." }
}' npx intercept-mcp

To get a cookie: open the site while logged in, open DevTools → Network, and copy the Cookie request header from any request to that domain.

Security model — read this before using it

  • Credentials only ever go to the configured origin. Headers are keyed on the actual host being contacted. When intercept fetches a page through Jina, a web archive, a CORS proxy, FlareSolverr, or the shared cache, those intermediaries connect to a different host, so your cookie/token is never sent to them — only a direct fetch of the origin carries it.
  • Authenticated responses never touch the shared cache. When a request matches an INTERCEPT_AUTH entry, intercept does not read from or write to the public agentsweb.org cache for that URL — so your private/paid content is never published, and you always get your authenticated view rather than a stranger's anonymous copy. (The in-process session cache still applies.)
  • Treat the value as a secret. It contains live session tokens. Environment variables are visible to the process and its children and may be captured in shell history or process listings — prefer a secrets manager or a non-committed env file, and never commit it. Cookies expire, so you'll periodically need to refresh them.
  • You are responsible for authorized use. Only supply credentials for accounts you own or are permitted to use, and respect each site's terms of service. intercept simply forwards the headers you provide.
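
The host-keyed rule behind the first bullet can be illustrated with a small lookup. It is a sketch, not the project's code: `authHeadersFor` is a hypothetical helper, and preferring the longest matching domain when several entries match is an assumption.

```typescript
type AuthMap = Record<string, Record<string, string>>;

// Return the configured headers for the host actually being contacted, or none.
function authHeadersFor(targetUrl: string, auth: AuthMap): Record<string, string> {
  const host = new URL(targetUrl).hostname.toLowerCase();
  let best: string | null = null;
  for (const domain of Object.keys(auth)) {
    const d = domain.toLowerCase();
    // A domain matches itself and its subdomains, but not look-alike hosts.
    const matches = host === d || host.endsWith("." + d);
    // Assumption: when several entries match, the most specific one wins.
    if (matches && (best === null || d.length > best.length)) best = domain;
  }
  return best === null ? {} : { ...auth[best] };
}
```

Because the lookup uses the host being contacted, a request routed through an intermediary such as `r.jina.ai` gets no headers even when the original page lives on a configured domain.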

Self-hosting SearXNG

For reliable search, self-host SearXNG with Docker. A config is included in the repo:

git clone https://github.com/bighippoman/intercept-mcp.git
cd intercept-mcp/searxng && docker compose up -d

Then set SEARXNG_URL=http://localhost:8888. A self-hosted instance has no rate limits or CAPTCHAs, and it aggregates results from Google, Bing, DuckDuckGo, Wikipedia, and Brave.

Or use any existing SearXNG instance — just set SEARXNG_URL to its URL.

URL normalization

Incoming URLs are automatically cleaned:

  • Strips 60+ tracking params (UTM, click IDs, analytics, A/B testing, etc.)
  • Removes hash fragments
  • Upgrades to HTTPS
  • Cleans AMP artifacts
  • Preserves functional params (ref, format, page, offset, limit)
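
Most of these steps map onto the standard `URL` API. The function below is a simplified sketch, not the project's implementation: `normalizeUrl` is a hypothetical name, the tracking-parameter set is a tiny illustrative subset rather than the real 60+ entry list, and AMP cleanup is left out.

```typescript
// Illustrative subset only; the real list is said to cover 60+ parameters.
const TRACKING_PARAMS = new Set(["fbclid", "gclid", "msclkid", "mc_cid", "mc_eid"]);

function normalizeUrl(raw: string): string {
  const url = new URL(raw);
  if (url.protocol === "http:") url.protocol = "https:"; // upgrade to HTTPS
  url.hash = "";                                         // remove the fragment

  // Collect keys first, then delete, to avoid mutating while iterating.
  const keys: string[] = [];
  url.searchParams.forEach((_value, key) => keys.push(key));
  for (const key of keys) {
    const lower = key.toLowerCase();
    if (lower.startsWith("utm_") || TRACKING_PARAMS.has(lower)) {
      url.searchParams.delete(key);
    }
  }
  // Functional params such as ref, format, page, offset, and limit are left untouched.
  return url.toString();
}
```

For example, `normalizeUrl("http://example.com/post?utm_source=x&page=2&fbclid=abc#comments")` returns `https://example.com/post?page=2`.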

SSRF protection

Agents pass URLs taken from untrusted web content, so the fetch tools refuse anything pointing at local or internal infrastructure: loopback and private IPv4/IPv6 ranges, link-local addresses (including the 169.254.169.254 cloud metadata endpoint), CGNAT, multicast/reserved ranges, and local hostnames (localhost, *.local, *.internal, *.home.arpa). Literal IPs are checked, including alternate notations (decimal, hex) normalized by the URL parser; DNS is not resolved, so public hostnames pointing at private IPs are not caught.
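
A literal-host check along these lines can be sketched as follows. This is a non-exhaustive illustration, not the project's implementation: `isBlockedUrl` and its helpers are hypothetical, and cases such as IPv4-mapped IPv6 addresses are left out. As in the description above, it relies on the URL parser to normalize decimal and hex IPv4 notations and does not resolve DNS.

```typescript
function isPrivateIPv4(host: string): boolean {
  const m = /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/.exec(host);
  if (m === null) return false;
  const a = Number(m[1]);
  const b = Number(m[2]);
  return (
    a === 0 || a === 10 || a === 127 ||  // "this network", private, loopback
    (a === 100 && b >= 64 && b <= 127) || // CGNAT 100.64.0.0/10
    (a === 169 && b === 254) ||           // link-local, incl. 169.254.169.254
    (a === 172 && b >= 16 && b <= 31) ||  // private
    (a === 192 && b === 168) ||           // private
    a >= 224                              // multicast and reserved
  );
}

function isBlockedIPv6(bracketed: string): boolean {
  const addr = bracketed.slice(1, -1).toLowerCase(); // strip [ and ]
  return (
    addr === "::1" || addr === "::" ||   // loopback, unspecified
    /^f[cd][0-9a-f]{2}:/.test(addr) ||   // unique local fc00::/7
    /^fe[89ab][0-9a-f]:/.test(addr) ||   // link-local fe80::/10
    /^ff[0-9a-f]{2}:/.test(addr)         // multicast ff00::/8
  );
}

function isBlockedUrl(raw: string): boolean {
  let url: URL;
  try { url = new URL(raw); } catch { return true; } // unparseable input is refused
  if (url.protocol !== "http:" && url.protocol !== "https:") return true;
  const host = url.hostname.toLowerCase();
  if (host === "localhost") return true;
  for (const suffix of [".localhost", ".local", ".internal", ".home.arpa"]) {
    if (host.endsWith(suffix)) return true;
  }
  if (host.startsWith("[")) return isBlockedIPv6(host);
  return isPrivateIPv4(host); // hostname was already normalized by the URL parser
}
```

A public hostname that resolves to a private address passes this check, which matches the limitation stated above.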

Content quality detection

Each fetcher result is scored for quality. Automatic fail on:

  • CAPTCHA / Cloudflare challenges
  • Login walls
  • HTTP error pages in body
  • Content under 200 characters
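
The automatic-fail rules can be pictured as a simple gate over the fetched markdown. This is a sketch for illustration: `assessQuality` is a hypothetical name and the regular expressions are illustrative guesses, not the project's actual detection rules. Only the four failure categories and the 200-character minimum come from the list above.

```typescript
interface QualityVerdict { ok: boolean; reason: string | null }

// Illustrative patterns only; real detection is likely more thorough.
const FAILURE_PATTERNS: Array<[RegExp, string]> = [
  [/captcha|checking your browser/i, "challenge page"],
  [/(sign|log) in to continue/i, "login wall"],
  [/\b(403 forbidden|404 not found|access denied)\b/i, "HTTP error page"],
];

function assessQuality(markdown: string, minLength: number = 200): QualityVerdict {
  const text = markdown.trim();
  if (text.length < minLength) return { ok: false, reason: "too short" };
  for (const [pattern, reason] of FAILURE_PATTERNS) {
    if (pattern.test(text)) return { ok: false, reason };
  }
  return { ok: true, reason: null };
}
```

A result that fails this gate would count as no result, so the pipeline moves on to the next fetcher.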

Requirements

  • Node.js >= 20
  • No API keys required for basic use

Configuration

BRAVE_API_KEY (secret)

Brave Search API key for the search tool (free tier: 2,000 queries/month). Not needed for the fetch tool.

Categories: Documents & Knowledge
Registry: active
Package: intercept-mcp
Transport: STDIO
Auth: Required
Updated: Mar 4, 2026
