Searxng Deepdive

burakaydinofficial/searxng-deepdive
Transport: STDIO. Registry: active.
Summary

Wraps SearXNG's meta-search API with four distinct tools: broad search across all engines, targeted search on specific engines like arxiv or pubmed, category-scoped search, and a lightweight HTML-to-Markdown URL reader with TOC extraction. The server dynamically injects your instance's actual engine and category lists into tool descriptions so the LLM knows what's available without guessing names. Multi-page fanout lets you request several result pages in one call with automatic deduplication. Validation catches common mistakes like passing an engine name to the category parameter. The URL reader handles static HTML, JSON, YAML, and TOML without a headless browser, with extraction modes for reading just headings or specific sections to save tokens on long documentation pages.


searxng-deepdive


An MCP server for SearXNG designed for LLM agents doing real research. Four tools with agent-friendly schemas, multi-page result fanout, lightweight URL→Markdown reading, and tool descriptions generated dynamically from the live engine pool of your SearXNG instance.

Why another mcp-searxng?

Existing packages are minimal — most expose a single search(query) tool with no way for the model to ask for more results, target specific engines, or constrain by category. The richer ones bake static descriptions, so the LLM never learns what's actually enabled on this instance. None of them treat agent-tool-selection ergonomics as a design priority.

searxng-deepdive opens those knobs up:

| Feature | This | npm mcp-searxng (ihor-sokoliuk) | PyPI mcp-searxng (SecretiveShell) |
|---|---|---|---|
| Engine targeting | ✅ via search_on_engines | ❌ | ❌ |
| Category targeting | ✅ via search_by_category | ❌ | ❌ |
| Multi-page fanout in one call | ✅ via pages: N | ❌ (one page per call) | ❌ |
| Pagination | ✅ via pageno | ✅ | ❌ |
| Compact response trim | ✅ via format: "compact" | ❌ | ❌ |
| Dynamic descriptions per instance | ✅ live engine list injected | ❌ static | ❌ static |
| Validation with cross-tool hints | ✅ engine-vs-category, case-insensitive | ❌ | ❌ |
| Zero-result hints | ✅ time_range / unresponsive engines / single-engine | ❌ | ❌ |
| URL reader (HTML→Markdown) | ✅ with TOC scan + section extraction | ✅ basic | ❌ |
| Test suite | ✅ 102 unit + integration | minimal | ❌ |

Quickstart

Install via npx -y from any MCP client:

{
  "mcpServers": {
    "searxng": {
      "command": "npx",
      "args": ["-y", "searxng-deepdive"],
      "env": { "SEARXNG_URL": "http://127.0.0.1:7979/" }
    }
  }
}

SEARXNG_URL should point at your running SearXNG instance. Need one? The companion repo SearXNG-Compose ships a plug-and-play Docker stack tuned for LLM consumption.

Tools

The server registers four tools. The LLM picks among them based on the descriptions below, augmented at startup with the live engine and category list from your instance.
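
The startup augmentation could look roughly like the sketch below. It assumes the instance's /config response exposes an engines array whose entries carry name, categories, and enabled fields; the InstanceConfig type and the buildDescriptionSuffix helper are hypothetical names used for illustration, not this package's actual code.

```typescript
// Assumed subset of a SearXNG /config response; verify the field names
// against your own instance before relying on them.
interface EngineInfo {
  name: string;
  categories: string[];
  enabled: boolean;
}

interface InstanceConfig {
  engines: EngineInfo[];
}

// Group the names of enabled engines by category.
function enginesByCategory(config: InstanceConfig): Map<string, string[]> {
  const byCategory = new Map<string, string[]>();
  for (const engine of config.engines) {
    if (!engine.enabled) continue;
    for (const category of engine.categories) {
      const names = byCategory.get(category) ?? [];
      names.push(engine.name);
      byCategory.set(category, names);
    }
  }
  byCategory.forEach((names) => names.sort());
  return byCategory;
}

// Build the text appended to a tool description (hypothetical helper).
function buildDescriptionSuffix(config: InstanceConfig): string {
  const enabled = config.engines
    .filter((e) => e.enabled)
    .map((e) => e.name)
    .sort();
  const lines = [`Engines enabled on this instance: ${enabled.join(", ")}`];
  const grouped = enginesByCategory(config);
  for (const category of Array.from(grouped.keys()).sort()) {
    const names = grouped.get(category) ?? [];
    lines.push(`Category "${category}": ${names.join(", ")}`);
  }
  return lines.join("\n");
}
```

Disabled engines are filtered out before anything is listed, so the model only ever sees names it can actually use.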

search(query, [...])

Broad web search across the full enabled engine pool. Use when you don't have a specific source preference. Returns merged, deduplicated results across however many engines respond.

search_on_engines(query, engines, [...])

Search using only the specified engines (e.g. ["arxiv", "pubmed", "semantic scholar"]). The tool description registered with the MCP client includes the actual list of engines enabled on your instance, so agents don't have to guess names. Validation rejects invalid names and, when a rejected value looks like a category rather than an engine, adds a "did you mean" hint.

search_by_category(query, categories, [...])

Search within specific categories — runs every engine tagged with each. Description includes the live category list and which engines belong to each. Same validation: invalid category names produce a clear error that points at search_on_engines when the offending value is actually an engine name.
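
A minimal sketch of the engine-side check follows, assuming case-insensitive matching on trimmed names. The validateEngines and normalizeName helpers and the error wording are hypothetical; the package's real validators may differ. The category-side check would mirror it with the roles swapped.

```typescript
// Outcome of validating requested names against the instance's live lists.
type Validation =
  | { ok: true; names: string[] }
  | { ok: false; error: string };

// Case-insensitive comparison key (hypothetical helper).
function normalizeName(name: string): string {
  return name.trim().toLowerCase();
}

// Validate requested engine names. `engines` and `categories` are the lists
// discovered from the instance at startup.
function validateEngines(
  requested: string[],
  engines: string[],
  categories: string[],
): Validation {
  const engineByKey = new Map<string, string>();
  engines.forEach((e) => engineByKey.set(normalizeName(e), e));
  const categoryKeys = new Set(categories.map(normalizeName));

  const resolved: string[] = [];
  for (const raw of requested) {
    const key = normalizeName(raw);
    const match = engineByKey.get(key);
    if (match !== undefined) {
      resolved.push(match); // use the instance's canonical spelling
    } else if (categoryKeys.has(key)) {
      // Cross-tool hint: the value belongs to the other tool's parameter.
      return {
        ok: false,
        error: `"${raw}" is a category, not an engine. Did you mean search_by_category?`,
      };
    } else {
      return {
        ok: false,
        error: `Unknown engine "${raw}". Enabled engines: ${engines.join(", ")}`,
      };
    }
  }
  return { ok: true, names: resolved };
}
```

Returning the error as data rather than throwing lets the tool handler pass the hint straight back to the model as the tool result.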

web_url_read(url, [readHeadings, section, paragraphRange, startChar, maxLength])

Fetch a URL and convert its HTML to clean Markdown. Lightweight HTTP + HTML→Markdown (no headless browser) — handles ~80% of the static-HTML web (Wikipedia, docs sites, blogs, news, GitHub READMEs).

Token-efficient extraction modes (priority order, first set wins):

  • readHeadings: true — return only the heading list as a hierarchical TOC
  • section: "Installation" — return content under matching heading
  • paragraphRange: "3-7" — 1-indexed paragraph slice
  • startChar + maxLength — character window pagination
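
The "first set wins" rule above can be sketched as a small resolver. The parameter names come from the tool signature; the Mode type and resolveMode function are hypothetical, and treating an empty string as "not set" is an assumption.

```typescript
// Optional extraction parameters accepted by web_url_read.
interface ReadOptions {
  readHeadings?: boolean;
  section?: string;
  paragraphRange?: string;
  startChar?: number;
  maxLength?: number;
}

// Hypothetical internal representation of the chosen extraction mode.
type Mode =
  | { kind: "headings" }
  | { kind: "section"; heading: string }
  | { kind: "paragraphs"; range: string }
  | { kind: "window"; startChar: number; maxLength: number | undefined }
  | { kind: "full" };

// Check the options in priority order and return the first one that is set.
function resolveMode(opts: ReadOptions): Mode {
  if (opts.readHeadings === true) {
    return { kind: "headings" };
  }
  if (opts.section !== undefined && opts.section !== "") {
    return { kind: "section", heading: opts.section };
  }
  if (opts.paragraphRange !== undefined && opts.paragraphRange !== "") {
    return { kind: "paragraphs", range: opts.paragraphRange };
  }
  if (opts.startChar !== undefined || opts.maxLength !== undefined) {
    return {
      kind: "window",
      startChar: opts.startChar ?? 0,
      maxLength: opts.maxLength,
    };
  }
  // No extraction mode requested: return the whole converted page.
  return { kind: "full" };
}
```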

Recommended workflow for long pages: TOC scan first (readHeadings), then targeted read (section). Far more token-efficient than fetching the full page up front.
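
To make the two-step workflow concrete, here is a rough sketch of a TOC scan and a section read over already-converted Markdown. It assumes ATX-style headings ("# Title") and a case-insensitive substring match on the heading text; extractHeadings, formatToc, and extractSection are hypothetical helpers, not the package's implementation.

```typescript
interface Heading {
  level: number; // 1-6
  text: string;
  line: number; // 0-indexed line of the heading
}

// Opening or closing marker of a fenced code block.
const FENCE = /^\s*(`{3,}|~{3,})/;

// Collect ATX headings, skipping lines inside fenced code blocks.
// Fence tracking is simplistic: any fence line toggles the state.
function extractHeadings(markdown: string): Heading[] {
  const headings: Heading[] = [];
  let inFence = false;
  markdown.split("\n").forEach((rawLine, index) => {
    if (FENCE.test(rawLine)) {
      inFence = !inFence;
      return;
    }
    if (inFence) return;
    const match = /^(#{1,6})\s+(.+?)\s*$/.exec(rawLine);
    if (match) {
      headings.push({ level: match[1].length, text: match[2], line: index });
    }
  });
  return headings;
}

// Render headings as an indented list, two spaces per nesting level.
function formatToc(headings: Heading[]): string {
  return headings
    .map((h) => `${"  ".repeat(h.level - 1)}- ${h.text}`)
    .join("\n");
}

// Return the first matching heading and its content, up to the next heading
// of the same or a higher level. Returns null when nothing matches.
function extractSection(markdown: string, query: string): string | null {
  const lines = markdown.split("\n");
  const headings = extractHeadings(markdown);
  const needle = query.toLowerCase();
  const startIdx = headings.findIndex((h) =>
    h.text.toLowerCase().includes(needle),
  );
  if (startIdx === -1) return null;
  const start = headings[startIdx];
  const next = headings
    .slice(startIdx + 1)
    .find((h) => h.level <= start.level);
  const endLine = next ? next.line : lines.length;
  return lines.slice(start.line, endLine).join("\n").trim();
}
```

Subsections stay attached to their parent in this sketch, so asking for "Installation" also returns any deeper headings nested under it.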

If readHeadings comes back with no entries (Reddit threads, comment sections, blog posts that use bold paragraphs instead of <h*> tags), the page is structurally flat — fall through to paragraphRange for sequential sampling, or just fetch without an extraction mode.
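
For structurally flat pages, a paragraphRange such as "3-7" might be applied as in the sketch below. It assumes paragraphs are separated by blank lines and that both bounds are inclusive; accepting a single number such as "4" is an extra assumption of this sketch, and both helper names are hypothetical.

```typescript
// Parse "3-7" (or a single number such as "4") into 1-indexed inclusive bounds.
// Returns null for malformed or inverted ranges.
function parseParagraphRange(
  range: string,
): { start: number; end: number } | null {
  const match = /^\s*(\d+)\s*(?:-\s*(\d+)\s*)?$/.exec(range);
  if (!match) return null;
  const start = Number(match[1]);
  const end = match[2] !== undefined ? Number(match[2]) : start;
  if (start < 1 || end < start) return null;
  return { start, end };
}

// Split text into paragraphs on blank lines and return the requested slice.
function sliceParagraphs(text: string, range: string): string[] {
  const bounds = parseParagraphRange(range);
  if (!bounds) {
    throw new Error(`Invalid paragraphRange: "${range}"`);
  }
  const paragraphs = text
    .split(/\n\s*\n/)
    .map((p) => p.trim())
    .filter((p) => p.length > 0);
  // Convert 1-indexed inclusive bounds to slice()'s 0-indexed, end-exclusive form.
  return paragraphs.slice(bounds.start - 1, bounds.end);
}
```

A range that runs past the end of the page simply yields fewer paragraphs, which lets an agent sample sequentially ("1-5", "6-10", ...) until a call comes back empty.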

web_url_read also accepts JSON, YAML, and TOML content-types directly (spec files, package manifests, registry API responses, CI workflow YAML), so research agents can read them as-is instead of being stopped by an HTML-only reader.

For JS-rendered SPAs and bot-protected sites this tool returns minimal/empty content — fall back to a Chromium-backed reader (e.g. Crawl4AI) for those.

Common parameters across all search tools

  • pageno — 1-indexed starting page (default 1)
  • pages — multi-page fanout in one call (1–5, default 1)
  • time_range — day / week / month / year (warning: not all engines support this; some return empty when set)
  • language — BCP-47 code or all
  • safe_search — 0 / 1 / 2
  • format — compact (default) or full
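
As a rough illustration of how these parameters could translate into upstream requests, the sketch below builds one SearXNG /search URL per requested page and merges the per-page results. The query names q, format, pageno, time_range, language, safesearch, engines, and categories follow SearXNG's search API; clamping pages to 1-5 and deduplicating by exact URL are assumptions, and both helper names are hypothetical.

```typescript
// Common tool parameters from the list above, plus the optional targets.
interface SearchParams {
  query: string;
  pageno?: number; // 1-indexed starting page
  pages?: number; // number of consecutive pages to request
  time_range?: "day" | "week" | "month" | "year";
  language?: string;
  safe_search?: 0 | 1 | 2;
  engines?: string[];
  categories?: string[];
}

// Build one /search URL per page, starting at `pageno`.
function buildSearchUrls(baseUrl: string, p: SearchParams): string[] {
  const first = p.pageno ?? 1;
  const count = Math.min(Math.max(p.pages ?? 1, 1), 5); // clamp to 1-5
  const base = baseUrl.replace(/\/+$/, ""); // drop trailing slashes
  const urls: string[] = [];
  for (let page = first; page < first + count; page++) {
    const pairs: [string, string][] = [
      ["q", p.query],
      ["format", "json"], // upstream response format, not the tool's compact/full
      ["pageno", String(page)],
    ];
    if (p.time_range) pairs.push(["time_range", p.time_range]);
    if (p.language) pairs.push(["language", p.language]);
    if (p.safe_search !== undefined) {
      pairs.push(["safesearch", String(p.safe_search)]);
    }
    if (p.engines && p.engines.length > 0) {
      pairs.push(["engines", p.engines.join(",")]);
    }
    if (p.categories && p.categories.length > 0) {
      pairs.push(["categories", p.categories.join(",")]);
    }
    const qs = pairs
      .map(([k, v]) => `${encodeURIComponent(k)}=${encodeURIComponent(v)}`)
      .join("&");
    urls.push(`${base}/search?${qs}`);
  }
  return urls;
}

// Merge per-page result lists, keeping the first occurrence of each URL.
function mergeDeduplicated<T extends { url: string }>(
  pagesOfResults: T[][],
): T[] {
  const seen = new Set<string>();
  const merged: T[] = [];
  for (const results of pagesOfResults) {
    for (const r of results) {
      if (seen.has(r.url)) continue;
      seen.add(r.url);
      merged.push(r);
    }
  }
  return merged;
}
```

Keeping URL construction separate from fetching means the fanout itself reduces to requesting each URL and passing the parsed result arrays to mergeDeduplicated.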

Configuration

| Env var | Default | Meaning |
|---|---|---|
| SEARXNG_URL | http://127.0.0.1:8080 | Base URL of the SearXNG instance |

Development

git clone <this repo>
cd searxng-deepdive          # or wherever you cloned to
npm install
npm run build                # tsc
npm test                     # vitest
SEARXNG_URL=http://127.0.0.1:7979 npm run probe    # exercise the SearXNG client
SEARXNG_URL=http://127.0.0.1:7979 npm run dev      # start the MCP stdio server

Pointing an MCP client at the source during development

Use tsx to run from src/ directly so you don't need to rebuild on every edit:

{
  "mcpServers": {
    "searxng": {
      "command": "npx",
      "args": ["-y", "tsx", "/absolute/path/to/searxng-deepdive/src/index.ts"],
      "env": { "SEARXNG_URL": "http://127.0.0.1:7979/" }
    }
  }
}

MCP clients cache the subprocess. When you edit code, the running server keeps the old behavior until the subprocess is killed and respawned. Quit the host (LM Studio, Claude Desktop, etc.) fully and reopen — closing the chat window alone usually isn't enough. Symptom of not doing this: a fix you just shipped doesn't appear to take effect.

Testing

npm test

Test coverage spans seven files:

  • normalize-name — case-insensitive name handling
  • validators — engine/category validation with cross-reference hints
  • zero-result-hint — every hint trigger and its inverse
  • trim-to-compact — response trimming + hint inclusion
  • descriptions — anti-pattern regex checks for the description copy that misled real models in earlier versions ("ignored by engines", "Default 'auto'", etc.) — failing build if they reappear
  • searxng-client — HTTP client with MockAgent: malformed JSON, HTML 502 pages, 429 rate-limit handling, multi-page fanout dedup, all-pages-fail throws
  • url-reader — extraction modes + HTTP integration

Design notes

  • Why four tools instead of one with optional engine/category params? Cleaner agent decision-making. With distinct tools the LLM sees explicit purposes; with one fat tool it has to remember when to set which optional flags. Trade-off: more entries in the MCP tool list, mostly identical handler code. Net: better agent ergonomics, especially for smaller models.

  • Why format: "compact" as default? SearXNG's full result objects are several times heavier than just url+title+content+engine. For the typical agent workflow (rank candidates, pick a few to fetch in detail), the compact form is what the LLM actually uses. format: "full" is one parameter away when you need scores, dates, authors, or DOI.

  • Why dynamic descriptions? Static descriptions either list every upstream engine (most aren't enabled on a given instance — wastes context) or list none (LLM has no idea what to put in engines). Live introspection of /config at server startup gives the LLM exactly the right hint for this instance.

  • Why convert silent-wrong into informatively-wrong? Real LM Studio testing showed agents repeatedly stuck in retry loops because failed searches looked successful: zero results read as "no matches", and 60 garbage results read as a search that ran. The validation + zero-result-hint pattern surfaces the actual cause every time. The description anti-pattern test suite locks out copy that was empirically shown to mislead models.
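
The compact format and the zero-result hint described in these notes could fit together roughly as sketched below. The response field names (results, unresponsive_engines) are assumed from SearXNG's JSON output; the hint wording, the order in which causes are checked, and every type and function name here are hypothetical.

```typescript
// Assumed subset of a SearXNG JSON response.
interface RawResult {
  url: string;
  title: string;
  content?: string;
  engine: string;
  [extra: string]: unknown; // score, dates, authors, DOI, ...
}

interface RawResponse {
  results: RawResult[];
  unresponsive_engines?: [string, string][]; // [engine, reason]
}

interface CompactResult {
  url: string;
  title: string;
  content: string;
  engine: string;
}

interface CompactResponse {
  results: CompactResult[];
  hint?: string;
}

// The parts of the original request that can explain an empty result set.
interface RequestContext {
  time_range?: string;
  engines?: string[];
}

// Explain an empty result set when a likely cause is known.
function zeroResultHint(
  raw: RawResponse,
  req: RequestContext,
): string | undefined {
  if (raw.results.length > 0) return undefined;
  const unresponsive = raw.unresponsive_engines ?? [];
  if (unresponsive.length > 0) {
    const detail = unresponsive
      .map(([name, reason]) => `${name} (${reason})`)
      .join(", ");
    return `No results; unresponsive engines: ${detail}.`;
  }
  if (req.time_range) {
    return `No results with time_range="${req.time_range}"; some engines return nothing when it is set. Retry without it.`;
  }
  if (req.engines && req.engines.length === 1) {
    return `No results from the single engine "${req.engines[0]}". Try more engines or the broad search tool.`;
  }
  return undefined;
}

// Keep only url, title, content, and engine; attach a hint when there is one.
function trimToCompact(raw: RawResponse, req: RequestContext): CompactResponse {
  const results = raw.results.map((r) => ({
    url: r.url,
    title: r.title,
    content: r.content ?? "",
    engine: r.engine,
  }));
  const hint = zeroResultHint(raw, req);
  return hint === undefined ? { results } : { results, hint };
}
```

The hint travels inside the same response object as the results, so an agent that gets nothing back also gets a reason instead of an ambiguous empty list.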

Security notes

This package is designed to run locally, inside the user's trust boundary, alongside an MCP-speaking LLM client (Claude Desktop, LM Studio, Cursor, etc.). The trust model assumes:

  • the LLM is acting on the user's behalf
  • the user controls what model is connected to the server
  • the MCP transport is stdio, not exposed to remote callers

Within that boundary, three surfaces are worth knowing about:

  • web_url_read will fetch any HTTP(S) URL the model hands it, with up to five redirects. On a host that can route to private networks, the model can therefore reach intranet services, link-local addresses, or cloud-instance metadata endpoints (169.254.169.254, etc.). This is by design for a local research tool but means you should not run this MCP server in topologies where an untrusted party can pick the URLs (e.g. a hosted MCP gateway facing the public internet). Body size is capped at 10 MB and content-type is sniffed before conversion, so a malformed upstream can't trivially OOM the process.

  • The search tools forward the model's query verbatim to SearXNG. SearXNG is the trust boundary for upstream engine traffic; this package does not add additional rate-limiting or query rewriting.

  • Tool output is adversarial input — prompt injection is possible. Search-result snippets and the Markdown returned by web_url_read both contain text the model will read as part of its working context. A page or snippet you don't control can carry instructions ("ignore previous instructions and …"). This isn't a defect in this MCP server — it's inherent to any tool that returns external text — but agent loops that auto-act on tool output without human review are the threat model. Treat tool output as untrusted input, especially for web_url_read against URLs the model picked rather than the user.

This package is provided as-is under MIT with no warranty or liability for damages — see LICENSE. Report suspected vulnerabilities privately via GitHub Security Advisories rather than opening a public issue. See SECURITY.md.

License

MIT — see LICENSE.


Configuration

SEARXNG_URL

Base URL of the SearXNG instance (e.g. http://127.0.0.1:7979/). Defaults to http://127.0.0.1:8080 if unset; set this to the address of your running SearXNG.

Categories: Documents & Knowledge, Search & Web Crawling
Registry: active
Package: searxng-deepdive
Transport: STDIO
Updated: May 3, 2026
