Wraps the Octen Extract API so Claude can pull live web pages as clean markdown, with three filters most extract tools skip: category labels (tech, health, finance), page structure flags (article, homepage, login wall, no main content), and query-driven highlights instead of full body dumps. The big win is filtering upstream. When your RAG pipeline fetches 100 URLs, some of them (say 20) are index pages or paywalls, and you spend LLM tokens just to discover they're useless. Octen flags them at fetch time via page_structure so you can skip the embedding step for those pages entirely. Pass a query parameter and you get ranked snippets per page instead of paying to process full content. Supports batches of up to 20 URLs, configurable cache TTL, and optional image or video URL extraction.
MCP server for Octen. Plug it into Claude, Cursor, VS Code, Windsurf, or any MCP client to give your agent live web search and URL extraction.
Core capabilities:
- search / news_search: search the live web with domain, text, and time filters.
- extract: turn one or more URLs into clean, LLM-ready content.

What makes Octen useful for agents is that extract returns more than page text. Each successful result also includes:

- category: what the page is about
- page_structure: what kind of page it is
- highlights: ranked snippets when you pass a query

That lets an agent skip login walls, nav pages, and off-topic URLs before spending tokens on the full body.
You need an OCTEN_API_KEY from octen.ai.
For most MCP clients, the config is:
{
  "mcpServers": {
    "octen": {
      "command": "npx",
      "args": ["-y", "octen-mcp"],
      "env": {
        "OCTEN_API_KEY": "your-key-here"
      }
    }
  }
}
Common config locations:
- ~/Library/Application Support/Claude/claude_desktop_config.json
- ~/.cursor/mcp.json
- .vscode/mcp.json, using "servers" instead of "mcpServers" as the top-level key

For Claude Code:
claude mcp add --scope user octen \
-e OCTEN_API_KEY=your-key-here \
-- npx -y octen-mcp
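
The VS Code note in the config locations list (the same config, but keyed by "servers" rather than "mcpServers") can be sketched as a small conversion. This is only an illustration: `to_vscode_config` is a hypothetical helper, not part of octen-mcp, and it assumes the key rename is the only difference, which is all this README states.

```python
import json
from typing import Any, Dict


def to_vscode_config(config: Dict[str, Any]) -> Dict[str, Any]:
    """Return a shallow copy of an MCP client config with the top-level
    "mcpServers" key renamed to "servers" (hypothetical helper)."""
    converted = {key: value for key, value in config.items() if key != "mcpServers"}
    if "mcpServers" in config:
        converted["servers"] = config["mcpServers"]
    return converted


if __name__ == "__main__":
    generic = {
        "mcpServers": {
            "octen": {
                "command": "npx",
                "args": ["-y", "octen-mcp"],
                "env": {"OCTEN_API_KEY": "your-key-here"},
            }
        }
    }
    # Print the variant that would go into .vscode/mcp.json.
    print(json.dumps(to_vscode_config(generic), indent=2))
```

The server entry itself is passed through unchanged, so any client-specific fields would still need to be added by hand.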
| Tool | What it does | Best for |
|---|---|---|
| search | Search the live web with domain, text, time, and content controls | broad web search |
| news_search | Same engine as search, fixed to news | current events and timely reporting |
| extract | Fetch 1-20 URLs and return clean content, labels, and optional highlights | summarization, RAG, fact lookup |
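
Because extract takes 1-20 URLs per call, a pipeline holding a longer list has to split it first. A minimal sketch of that client-side batching follows; `batch_urls` and `MAX_URLS_PER_EXTRACT` are illustrative names, not part of octen-mcp, and the de-duplication step is a design choice rather than something the API requires.

```python
from typing import Iterator, List

MAX_URLS_PER_EXTRACT = 20  # per-call limit taken from the tool table


def batch_urls(urls: List[str], size: int = MAX_URLS_PER_EXTRACT) -> Iterator[List[str]]:
    """Yield de-duplicated URLs, in first-seen order, in groups of at most `size`."""
    if size < 1:
        raise ValueError("size must be at least 1")
    unique = list(dict.fromkeys(urls))  # drop repeats, keep first-seen order
    for start in range(0, len(unique), size):
        yield unique[start:start + size]


if __name__ == "__main__":
    urls = [f"https://example.com/page/{i}" for i in range(45)]
    print([len(batch) for batch in batch_urls(urls)])  # prints [20, 20, 5]
```

Each yielded batch can then be passed to one extract call.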
Reference docs:
Most extract tools stop at "here is the page body." Octen helps one step earlier:
- page_structure.primary == "No Main Content" tells the agent it hit a login wall, empty shell, or similar non-content page.
- category helps a pipeline ignore pages outside the target vertical before embedding or summarizing.
- query returns highlights when the user wants a specific fact instead of the full page.

For the full decision tree and integration patterns, see docs/best-practices.md.
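
The three checks above could be combined into one triage step, roughly as sketched below. The result shape is an assumption: a dict with `page_structure` holding a `primary` string, a string `category`, and a `highlights` list. Only the field names and the "No Main Content" value come from this README. The `triage` function and its return labels are hypothetical, and this is not the decision tree from docs/best-practices.md.

```python
from typing import Any, Dict, Optional, Set


def triage(result: Dict[str, Any], wanted_categories: Optional[Set[str]] = None) -> str:
    """Return "skip", "use_highlights", or "use_full_content" for one extract result.

    Assumed (hypothetical) shape:
    {"page_structure": {"primary": str}, "category": str, "highlights": list}
    """
    structure = (result.get("page_structure") or {}).get("primary")
    if structure == "No Main Content":  # login wall, empty shell, and similar
        return "skip"
    if wanted_categories is not None and result.get("category") not in wanted_categories:
        return "skip"  # outside the target vertical
    if result.get("highlights"):  # only expected when a query was passed
        return "use_highlights"
    return "use_full_content"


if __name__ == "__main__":
    login_wall = {"page_structure": {"primary": "No Main Content"}, "category": "Finance"}
    print(triage(login_wall, {"Finance"}))  # prints skip
```

Running the check before embedding or summarizing is the point: a skipped result never reaches the model.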
- Fetch octen.ai and summarize the main product features.
- Search for recent MCP news from the last week.
- Fetch these URLs and only summarize the ones whose category is Finance.
- Search site:docs.anthropic.com prompt caching and return only the relevant highlights.

| Variable | Required | Default |
|---|---|---|
| OCTEN_API_KEY | yes | (none) |
| OCTEN_API_URL | no | https://api.octen.ai |
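
The table implies a simple resolution rule: the key must be present, and the URL falls back to its default. A minimal sketch of that rule is below, for scripts that want to fail early before launching the server. `resolve_octen_env` is a hypothetical helper, and this is not the server's own startup code.

```python
import os
from typing import Mapping, Tuple

DEFAULT_API_URL = "https://api.octen.ai"  # default from the variable table


def resolve_octen_env(env: Mapping[str, str] = os.environ) -> Tuple[str, str]:
    """Return (api_key, api_url); raise if the required key is missing or blank."""
    api_key = env.get("OCTEN_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError("OCTEN_API_KEY is required; get one at octen.ai")
    # An unset or blank OCTEN_API_URL falls back to the documented default.
    api_url = env.get("OCTEN_API_URL", "").strip() or DEFAULT_API_URL
    return api_key, api_url


if __name__ == "__main__":
    print(resolve_octen_env({"OCTEN_API_KEY": "your-key-here"}))
```

Passing a plain dict instead of `os.environ` keeps the function easy to test.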
git clone https://github.com/Octen-Team/octen-mcp.git
cd octen-mcp
npm install
npm run build
OCTEN_API_KEY=<key> npm run inspect
MIT © Octen
OCTEN_API_KEY (required, secret): Octen API key. Get one at https://octen.ai (self-serve).
OCTEN_API_URL (default: https://api.octen.ai): Override the Octen API base URL.