Gives Claude seven MCP tools to fetch web pages as markdown, search the web, take screenshots, crawl sites, extract structured data, monitor changes, and run browser actions. Built on WebPeel's API, which automatically escalates from simple HTTP to headless browsers when needed and includes 55+ domain-specific extractors for sites like Reddit, GitHub, and ArXiv that strip boilerplate before sending content to your agent. Useful when you need web data in agent workflows without managing Puppeteer yourself or burning tokens on raw HTML. Supports batch operations, persistent browser sessions for login flows, and JSON schema extraction. Compatible with Claude Desktop, Cursor, and other MCP clients through stdio transport.
Public tool metadata for what this MCP can expose to an agent.
webpeelYour complete web toolkit — fetch, search, screenshot, extract, monitor, and interact with any website. Handles JS rendering, Cloudflare, CAPTCHAs, and 55+ domain extractors automatically. 65-98% token savings. Describe what you want in plain English. Examples: 'read https://s...1 paramsYour complete web toolkit — fetch, search, screenshot, extract, monitor, and interact with any website. Handles JS rendering, Cloudflare, CAPTCHAs, and 55+ domain extractors automatically. 65-98% token savings. Describe what you want in plain English. Examples: 'read https://s...
taskstringwebpeel_readFetch any URL and return clean, LLM-optimized markdown. 65-98% fewer tokens than raw HTML. Automatically handles: web pages, YouTube transcripts (with timestamps), PDFs, JS-rendered SPAs, Cloudflare-protected sites, and 55+ domain-specific extractors (Amazon, Reddit, GitHub, e...7 paramsFetch any URL and return clean, LLM-optimized markdown. 65-98% fewer tokens than raw HTML. Automatically handles: web pages, YouTube transcripts (with timestamps), PDFs, JS-rendered SPAs, Cloudflare-protected sites, and 55+ domain-specific extractors (Amazon, Reddit, GitHub, e...
urlstringbudgetnumberformatstringmarkdown · text · htmldefault: markdownrenderbooleansummarybooleanquestionstringreadablebooleanwebpeel_seeCapture a screenshot of any web page. Returns the page as an image for visual inspection. Supports mobile, tablet, and desktop viewports. Use mode='design' for AI-powered design analysis and suggestions. Use mode='compare' with compare_url to diff two pages visually. Use full_...5 paramsCapture a screenshot of any web page. Returns the page as an image for visual inspection. Supports mobile, tablet, and desktop viewports. Use mode='design' for AI-powered design analysis and suggestions. Use mode='compare' with compare_url to diff two pages visually. Use full_...
urlstringmodestringscreenshot · design · comparedefault: screenshotviewportvaluefull_pagebooleancompare_urlstringwebpeel_findSearch the web or discover all pages on a site. Pass query= to search the web and get ranked results with titles, URLs, and snippets. Pass url= to map/crawl a domain and discover all its pages. Use depth='deep' for multi-source research that synthesizes answers from multiple p...4 paramsSearch the web or discover all pages on a site. Pass query= to search the web and get ranked results with titles, URLs, and snippets. Pass url= to map/crawl a domain and discover all its pages. Use depth='deep' for multi-source research that synthesizes answers from multiple p...
urlstringdepthstringquick · deepdefault: quicklimitnumberquerystringwebpeel_extractExtract structured JSON data from any URL. No LLM needed for built-in schemas. Pass fields=['price','title','description'] to extract specific named fields. Pass schema={...} with a full JSON schema for custom structured output. Built-in schemas: product, article, recipe, job,...4 paramsExtract structured JSON data from any URL. No LLM needed for built-in schemas. Pass fields=['price','title','description'] to extract specific named fields. Pass schema={...} with a full JSON schema for custom structured output. Built-in schemas: product, article, recipe, job,...
urlstringfieldsarrayformatstringjson · markdowndefault: jsonschemaobjectwebpeel_monitorTrack a web page for content changes over time. Call once to take a snapshot, call again to get a diff. Use selector= to monitor a specific CSS element (e.g. a price, a status badge). Use webhook= for persistent monitoring with automatic notifications when content changes. Use...4 paramsTrack a web page for content changes over time. Call once to take a snapshot, call again to get a diff. Use selector= to monitor a specific CSS element (e.g. a price, a status badge). Use webhook= for persistent monitoring with automatic notifications when content changes. Use...
urlstringwebhookstringintervalstringselectorstringwebpeel_actAutomate interactions with any web page using a real browser. Click buttons, fill forms, select dropdowns, scroll, wait for elements, and press keys. Returns extracted content and optionally a screenshot after all actions complete. Use for: logging into sites, submitting forms...4 paramsAutomate interactions with any web page using a real browser. Click buttons, fill forms, select dropdowns, scroll, wait for elements, and press keys. Returns extracted content and optionally a screenshot after all actions complete. Use for: logging into sites, submitting forms...
urlstringactionsarrayextract_afterbooleanscreenshot_afterbooleanQuick Start · Agent Integrations · Docs · Playground · Get API Key
Every AI agent that touches the web rebuilds the same brittle stack: HTTP fetch → headless browser → anti-bot bypass → HTML cleanup → markdown conversion → token budgeting. Each layer fails differently. Sites change. Cloudflare rotates challenges. Your agent gets empty strings at 2 AM and your pipeline breaks.
WebPeel replaces that entire stack with one function call. It handles engine selection, anti-bot escalation, domain-specific extraction, and token optimization so your agent gets clean, structured data every time — without managing browsers, proxies, or parsing logic.
# Zero-install — just run it
npx webpeel "https://example.com"
# Search the web
npx webpeel search "latest AI agent frameworks"
# Crawl an entire site
npx webpeel crawl docs.example.com --max-pages 50
# Screenshot any page
npx webpeel screenshot "https://stripe.com/pricing" --full-page
# Ask a question about any page
npx webpeel ask "https://arxiv.org/abs/2401.00001" "What is the main contribution?"
Or install globally:
npm install -g webpeel
Use as a library:
import { peel } from 'webpeel';
const result = await peel('https://news.ycombinator.com');
console.log(result.markdown); // Clean markdown, ready for your LLM
console.log(result.metadata); // Title, tokens saved, timing, etc.
Use via API:
curl "https://api.webpeel.dev/v1/fetch?url=https://stripe.com/pricing" \
-H "Authorization: Bearer $WEBPEEL_API_KEY"
{
"url": "https://stripe.com/pricing",
"markdown": "# Stripe Pricing\n\n**Integrated per-transaction fees**...",
"metadata": {
"title": "Pricing & Fees | Stripe",
"tokens": 420,
"tokensOriginal": 8200,
"savingsPct": 94.9
}
}
Get your free API key → · No credit card required · 500 requests/week free
Generic scrapers convert raw HTML to markdown and call it a day. WebPeel has purpose-built extractors for 55+ domains — Reddit, GitHub, YouTube, Amazon, ArXiv, Hacker News, Wikipedia, StackOverflow, Zillow, Polymarket, ESPN, and more. Each extractor understands the site's structure and returns clean, structured data without browser rendering.
Domain extractors strip navigation, ads, sidebars, and boilerplate before content reaches your agent. Less context consumed = lower costs, faster inference, and longer agent chains.
| Site | Raw HTML tokens | WebPeel tokens | Savings |
|---|---|---|---|
| News article | 18,000 | 640 | 96% |
| Reddit thread | 24,000 | 890 | 96% |
| Wikipedia page | 31,000 | 2,100 | 93% |
| GitHub README | 5,200 | 1,800 | 65% |
| E-commerce product | 14,000 | 310 | 98% |
WebPeel doesn't just try one method — it automatically escalates through 6 engines until it gets a good result:
Simple HTTP → Domain API → Browser render → Stealth browser → Cloaked browser → Search cache fallback
No manual --render flags for most sites. WebPeel knows which sites need JavaScript, which need stealth, and which have anti-bot protection — and picks the right engine automatically.
Already using Firecrawl-style workflows? WebPeel supports compatible /v1/scrape, /v2/scrape, /v1/crawl, /v1/search, and /v1/map endpoints, which makes migration dramatically easier than rebuilding your pipeline from scratch.
Give any MCP-compatible AI the ability to browse, search, and extract from the web.
{
"mcpServers": {
"webpeel": {
"command": "npx",
"args": ["-y", "webpeel", "mcp"],
"env": { "WEBPEEL_API_KEY": "wp_your_key_here" }
}
}
}
7 MCP tools exposed: webpeel_read · webpeel_find · webpeel_see · webpeel_extract · webpeel_monitor · webpeel_act · webpeel_crawl
import { WebPeelLoader } from 'webpeel/integrations/langchain';
const loader = new WebPeelLoader({ url: 'https://example.com', render: true });
const docs = await loader.load();
import { WebPeelReader } from 'webpeel/integrations/llamaindex';
const reader = new WebPeelReader();
const docs = await reader.loadData('https://example.com');
pip install webpeel
from webpeel import WebPeel
wp = WebPeel(api_key="wp_...")
result = wp.fetch("https://example.com")
print(result.markdown)
| Capability | CLI | API | Details |
|---|---|---|---|
| Fetch & extract | webpeel "url" | GET /v1/fetch | Clean markdown from any URL |
| Web search | webpeel search "query" | GET /v1/search | DuckDuckGo (free) or Brave (BYOK) |
| Smart search | — | POST /v1/search/smart | AI-powered structured results |
| Crawl sites | webpeel crawl "url" | POST /v1/crawl | Depth/page limits, rate control |
| Screenshots | webpeel screenshot "url" | POST /v1/screenshot | Full-page, multi-viewport, visual diff, filmstrip |
| Structured extraction | --extract-schema | POST /v1/extract | JSON schema → structured data |
| Q&A | webpeel ask "url" "q" | POST /v1/answer | Answer questions about any page |
| Deep research | — | POST /v1/deep-research | Multi-query autonomous research |
| Content monitoring | webpeel monitor "url" | POST /v1/watch | Change detection with webhooks |
| Browser sessions | — | POST /v1/session | Persistent sessions for login flows |
| Browser actions | --action 'click:.btn' | actions field | Click, type, scroll, wait |
| Batch scrape | webpeel batch file | POST /v1/batch/scrape | Parallel multi-URL processing |
| URL discovery | webpeel map "url" | POST /v1/map | Sitemap and link discovery |
| YouTube transcripts | auto-detected | auto-detected | Multiple export formats |
| PDF extraction | auto-detected | auto-detected | Text, tables, structure |
| Research agent | — | POST /v1/agent | Autonomous multi-step research |
RAG pipelines — Fetch docs, articles, or entire sites as clean markdown ready for chunking, embedding, and retrieval.
Price monitoring — Track product pages across major commerce sites with structured extraction and change detection.
Competitive intel — Monitor competitor pages, pricing tables, and job boards. Visual diff screenshots catch layout changes CSS selectors would miss.
Research agents — Give Claude, Codex, Cursor, or your own agent grounded web access through the API or MCP server.
Lead enrichment — Pull company details, public links, and page structure from business sites without writing per-site parsers.
Content aggregation — Crawl and extract from communities, docs sites, and publications with domain-native extractors that understand each site's structure.
Your Agent
↓
WebPeel (npm / API / MCP)
↓
┌─────────────────────────────────┐
│ Engine Ranker │
│ HTTP → Domain API → Browser │
│ → Stealth → Cloaked → Cache │
├─────────────────────────────────┤
│ 55+ Domain Extractors │
│ reddit · github · youtube │
│ amazon · arxiv · zillow · ... │
├─────────────────────────────────┤
│ Content Pipeline │
│ Readability → Turndown → │
│ Token budgeting → Chunking │
└─────────────────────────────────┘
↓
Clean markdown / structured JSON
WebPeel is built for production agent workflows, not just one-off demos.
file:// schemesDELETE /v1/account for full data erasure
Security policy → · SLA (99.9% uptime) →| Approach | What it gives you | Where it breaks down |
|---|---|---|
| Raw HTTP + HTML parsing | Cheap, simple fetches | Falls apart on JS-heavy sites, anti-bot pages, and noisy HTML |
| Pure browser automation | Maximum control | Expensive, slow, fragile, and high-maintenance for large-scale use |
| Search-only APIs | Great discovery | Weak page extraction, limited structured output, limited downstream actions |
| Single-purpose scrapers | Fast on one job | You end up composing 4–6 tools for real agent workflows |
| WebPeel | Fetch + search + crawl + extraction + screenshots + monitoring in one layer | Opinionated toward agent workflows rather than generic scraping |
📖 Documentation · 💰 Pricing · 🎮 Playground · 📝 Blog · 💬 Discussions · 🚀 Releases · 📊 Status · 🔒 Security · 📋 Changelog
Pull requests welcome. Please open an issue first to discuss major changes.
git clone https://github.com/webpeel/webpeel.git
cd webpeel && npm install
npm run build && npm test
WebPeel SDK License — free for personal and commercial use with attribution.
therealtimex/browser-use
jae-jae/fetcher-mcp
merajmehrabi/puppeteer-mcp-server
com.thenextgennexus/playwright-mcp-server
saik0s/mcp-browser-use