Rag

fieldcure/fieldcure-mcp-rag
STDIO · registry active
Summary

A production-grade RAG implementation that handles the messy parts: multi-format ingestion (DOCX, PDF with OCR, audio via Whisper), Korean-optimized chunking, and hybrid BM25 plus vector search fused with RRF. The indexing pipeline uses AI-powered chunk contextualization to generate bilingual keywords and preserve document context across chunk boundaries. A two-commit model persists expensive OCR and transcription output before any embedding API call, so a rate limit or crash doesn't force you to re-process a 600-page scan. Embedding works with Ollama, OpenAI-compatible endpoints, and Gemini; contextualization works with Ollama, OpenAI-compatible endpoints, and Anthropic. Exposes search tools over MCP stdio and runs headless indexing via CLI. Serves multiple knowledge bases from a single server process, with SQLite WAL allowing searches to read while indexing writes.


FieldCure MCP RAG Server


A Model Context Protocol (MCP) server for indexing and searching local document collections. Supports DOCX, HWPX, PDF (with OCR), Excel, PowerPoint, and audio (Whisper transcription, Windows-only), with hybrid keyword + semantic search optimized for Korean and English.

Built with C# and the official MCP C# SDK.

Commands

fieldcure-mcp-rag
├── serve         --base-path <path>                         # Multi-KB MCP search server (stdio)
├── exec          --path <kb-path> [--force] [--partial ...]  # Headless indexing for a single KB
├── exec-queue    --queue-file <path> [--sweep-all]           # Process deferred indexing queue
├── prune-orphans --base-path <path>                         # Delete orphan KB folders
└── smoke-ocr     --pdf <scanned.pdf>                        # Self-test: OCR a scanned PDF (Windows)
  • serve — read-only MCP server serving all knowledge bases under the base path. Single process handles multiple KBs via kb_id parameter. Can run while exec is indexing (SQLite WAL).
  • exec — scans source folders, chunks documents, contextualizes with AI, embeds, stores in SQLite. --partial re-runs only downstream stages when models change, preserving OCR output.
  • exec-queue — sequential orchestrator consuming a deferred indexing queue. One entry at a time, no GPU contention. --sweep-all processes deferred entries too (used at app shutdown).
  • prune-orphans — deletes orphan KB folders (GUID-named, no config.json). Protected folders (., _ prefix, -backup-) are never touched.
  • smoke-ocr — diagnostic mode. Loads a scanned PDF through the OCR fallback parser, prints recognized text to stdout, and exits 0 on a non-empty result. Surfaces DllNotFoundException / BadImageFormatException distinctly so a missing or arch-mismatched native is immediately visible. Useful for verifying that the OCR native path is wired correctly on a given host (notably win-arm64 dnx installs).
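
The prune-orphans selection rule described above can be sketched in a few lines. This is a minimal Python illustration rather than the tool's C# code: `is_orphan_kb_folder` is a hypothetical helper, and treating `-backup-` as a substring match (rather than a prefix) is an assumption.

```python
import uuid
from pathlib import Path


def is_orphan_kb_folder(folder: Path) -> bool:
    """Return True if `folder` looks like an orphan KB folder (hypothetical rule)."""
    name = folder.name
    # Protected folders are never candidates for deletion.
    if name.startswith((".", "_")) or "-backup-" in name:
        return False
    # Only GUID-named folders are considered.
    try:
        uuid.UUID(name)
    except ValueError:
        return False
    # A KB counts as logically deleted once its config.json is gone.
    return folder.is_dir() and not (folder / "config.json").exists()
```

A folder named `demo` is ignored by this check because its name is not a GUID, which matches the "GUID-named, no config.json" wording above.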

Features

Search

  • Hybrid BM25 + vector search with Reciprocal Rank Fusion (RRF)
  • BM25-only fallback when no embedding provider is configured
  • Korean-optimized chunking (sentence boundary, decimal protection, parenthesis-aware)
  • SIMD-accelerated cosine similarity via System.Numerics.Vector
  • FTS5 trigram index for substring and CJK-friendly keyword matching
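
For readers unfamiliar with Reciprocal Rank Fusion, the sketch below shows the standard formula applied to a BM25 ranking and a vector ranking. It is illustrative Python, not the server's `RrfFusion.cs`; the constant `k = 60` is the commonly used default and is an assumption here, as are the chunk IDs.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse ranked ID lists with Reciprocal Rank Fusion.

    score(d) = sum over lists of 1 / (k + rank), with rank starting at 1.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


bm25_hits = ["c1", "c2", "c3"]    # hypothetical keyword ranking
vector_hits = ["c3", "c1", "c4"]  # hypothetical semantic ranking
fused = rrf_fuse([bm25_hits, vector_hits])
print(fused)
```

Chunks that appear in both lists (`c1`, `c3`) rise above chunks found by only one retriever, which is the main reason RRF is a popular way to combine rankings whose raw scores are not comparable.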

Indexing

  • Incremental indexing with SHA256 change detection
  • AI-powered chunk contextualization with bilingual keyword enrichment (see Chunk Contextualization)
  • 2-commit pipeline preserves expensive upstream work across embedding failures (see How Indexing Works)
  • Math equation extraction from DOCX/HWPX as [math: LaTeX] blocks
  • PDF with OCR fallback (Tesseract eng+kor) for scanned pages
  • Audio transcription (.mp3, .wav, .m4a, .ogg, .flac, .webm) via Whisper.net — Windows-only. Model size (Tiny→Large) is auto-selected from detected GPU/RAM/cores at startup; each transcript chunk records audio.model_size and audio.transcribed_at for future reindex auditing
  • Cross-process indexing lock with stale PID auto-cleanup
  • Orphan cleanup for deleted files
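
Incremental indexing with SHA256 change detection, together with orphan cleanup, can be pictured as a comparison between current file hashes and stored ones. The following is a minimal sketch under assumptions: the function names and the path-to-hash dictionaries are hypothetical and do not reflect the actual storage layout in rag.db.

```python
import hashlib
from pathlib import Path


def file_sha256(path: Path, block_size: int = 1 << 20) -> str:
    """Hash a file in blocks so large scans do not load fully into memory."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for block in iter(lambda: f.read(block_size), b""):
            h.update(block)
    return h.hexdigest()


def classify_changes(current: dict, stored: dict):
    """Compare path -> hash maps; return (to_index, unchanged, orphaned)."""
    to_index = sorted(p for p, h in current.items() if stored.get(p) != h)
    unchanged = sorted(p for p, h in current.items() if stored.get(p) == h)
    # Files that were indexed earlier but no longer exist on disk.
    orphaned = sorted(p for p in stored if p not in current)
    return to_index, unchanged, orphaned
```

Only the `to_index` set needs extraction and embedding work; `unchanged` files are skipped, and `orphaned` entries are candidates for removal from the index.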

Queue Orchestrator

  • All indexing requests flow through start_reindex MCP tool — no direct exec spawn
  • Scope merge rules: full ⊃ contextualization ⊃ embedding (a repeated request upgrades the queued entry's scope instead of adding a duplicate entry)
  • PID-based orchestrator lock with reuse defense (orchestrator.lock)
  • Logical KB deletion (config.json removal) + prune-orphans physical cleanup
  • Deferred indexing for app-shutdown batch processing (--sweep-all)
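
The scope merge rule (full ⊃ contextualization ⊃ embedding) amounts to keeping the widest scope requested for a KB. A minimal sketch, assuming a queue keyed by KB ID; the dictionary layout and function names are hypothetical, only the three scope names come from the list above.

```python
# Wider scopes get a higher rank: full covers contextualization,
# which in turn covers embedding.
SCOPE_RANK = {"embedding": 0, "contextualization": 1, "full": 2}


def merge_scope(existing: str, requested: str) -> str:
    """Return the wider of two scopes."""
    return existing if SCOPE_RANK[existing] >= SCOPE_RANK[requested] else requested


def enqueue(queue: dict, kb_id: str, scope: str) -> dict:
    """Add a request, upgrading an existing entry instead of duplicating it."""
    if kb_id in queue:
        queue[kb_id] = merge_scope(queue[kb_id], scope)
    else:
        queue[kb_id] = scope
    return queue
```

Requesting `embedding`, then `full`, then `contextualization` for the same KB leaves a single queue entry with scope `full`.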

Operations

  • Multi-KB serve: single process serves all knowledge bases under a base path, lazy-loaded per KB
  • SQLite WAL mode allows search during indexing
  • Graceful shutdown via cancel file
  • Per-KB config.json with provider configuration

Integration

  • Ollama native — embedding via /api/embed, contextualization via /api/chat with keep_alive and num_ctx support. Requires Ollama 0.4.0+.
  • OpenAI-compatible — embedding via /v1/embeddings, contextualization via /v1/chat/completions. Works with OpenAI, Azure OpenAI, Groq, LM Studio, Together AI.
  • Gemini native — embedding via /v1beta/models/{model}:embedContent with task_type asymmetric retrieval (RETRIEVAL_DOCUMENT / RETRIEVAL_QUERY) and Matryoshka dimension truncation (768 / 1536 / 3072). gemini-embedding-2, multilingual, 8k token input.
  • Anthropic — contextualization via /v1/messages.
  • API keys via environment variables — OPENAI_API_KEY, ANTHROPIC_API_KEY, etc. Batch indexing commands (exec, exec-queue) are env-var-only. Interactive MCP search can fall back to MCP elicitation when the client supports it.
  • Standard MCP stdio transport (JSON-RPC over stdin/stdout)

Chunk Contextualization

Standard RAG chunking loses context — a sentence about "the protocol" becomes ambiguous when ripped from its surrounding paragraphs. This server addresses that with Unified Chunk Contextualization: a single LLM call per chunk that produces both contextual framing and bilingual (Korean + English) keywords in one pass.

The result is stored alongside the original chunk text:

  • Original text is preserved for accurate retrieval display
  • Contextualized text is what gets embedded and indexed in BM25
  • Bilingual keywords enable cross-lingual search — a Korean query can retrieve English documents and vice versa

This is enabled by setting contextualizer in config.json. It can be disabled (set provider/model to empty) if you prefer raw chunk indexing.

How Indexing Works

The exec command runs a 5-stage pipeline per file:

  1. Extract — text from document (DOCX, PDF OCR, audio transcription, etc.)
  2. Chunk — split into ~1000 char windows
  3. Contextualize — LLM enrichment (optional, see above)
  4. Embed — vector embedding via API
  5. Persist — save to SQLite
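
To make Stage 2 concrete, here is a simplified sentence-boundary chunker with decimal protection. It is a sketch under assumptions, not the server's `TextChunker.cs`: it splits only where sentence punctuation is followed by whitespace (so `3.14` stays intact) and packs sentences into windows of roughly 1000 characters. Parenthesis awareness and other Korean-specific rules are not modeled.

```python
import re

# Split after ., ! or ? only when whitespace follows, so decimals survive.
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def chunk_text(text: str, target_chars: int = 1000) -> list[str]:
    """Pack whole sentences into chunks of at most ~target_chars characters."""
    chunks, current = [], ""
    for sentence in _SENTENCE_END.split(text.strip()):
        if current and len(current) + 1 + len(sentence) > target_chars:
            chunks.append(current)
            current = sentence
        else:
            current = f"{current} {sentence}".strip()
    if current:
        chunks.append(current)
    return chunks
```

A single sentence longer than `target_chars` is emitted as one oversized chunk in this sketch; a real pipeline would need a further pre-split for that case.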

For large files, Stage 1 alone can take 20+ minutes — OCR on a 596-page scanned PDF, or Whisper transcription of a multi-hour audio recording. The first audio file in any KB also pays a one-time ggml model download (cached under {UserProfile}/.fieldcure/whisper-models/). To prevent expensive upstream work from being lost when later stages fail, the pipeline uses a 2-commit model:

Stages 1-3 (Extract → Chunk → Contextualize)
        ↓
[Commit 1] chunks saved as PendingEmbedding
        ↓
Stage 4 (Embed)
   ├─ success → [Commit 2a] promote chunks to Indexed
   └─ failure → chunks remain PendingEmbedding (retry next exec)

Why this matters: A 25-minute OCR result is persisted on disk before any embedding API call. If Stage 4 fails (network error, rate limit, token limit, process crash, even power loss), the chunks survive. The next exec hash-skips the file (no OCR re-run) and the deferred retry pass attempts only Stage 4.
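
The 2-commit idea can be demonstrated with an in-memory SQLite database. The sketch below is illustrative Python with a hypothetical single-table schema; the real tool is C# and its schema is not shown in this README. The status names `PendingEmbedding` and `Indexed` come from the diagram above.

```python
import json
import sqlite3


def index_file(db, file_id, enriched_chunks, embed):
    """Persist chunks first (Commit 1), then try to embed and promote (Commit 2a)."""
    with db:  # Commit 1: enriched text is durable before any embedding call
        db.executemany(
            "INSERT INTO chunks(file_id, text, status) "
            "VALUES (?, ?, 'PendingEmbedding')",
            [(file_id, text) for text in enriched_chunks],
        )
    try:
        vectors = embed(enriched_chunks)  # Stage 4, may raise
    except Exception:
        return False  # chunks stay PendingEmbedding for a later retry
    with db:  # Commit 2a: store vectors and promote the chunks
        rows = db.execute(
            "SELECT id FROM chunks WHERE file_id = ? "
            "AND status = 'PendingEmbedding' ORDER BY id",
            (file_id,),
        ).fetchall()
        for (chunk_id,), vec in zip(rows, vectors):
            db.execute(
                "UPDATE chunks SET embedding = ?, status = 'Indexed' WHERE id = ?",
                (json.dumps(vec), chunk_id),
            )
    return True


db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE chunks (id INTEGER PRIMARY KEY, file_id TEXT, "
    "text TEXT, embedding TEXT, status TEXT)"
)
```

If `embed` raises, the rows written in the first transaction remain in the database, which is the property that makes a later embedding-only retry possible.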

Per-Chunk Failure Isolation (Binary Split)

If a single chunk in a file exceeds the embedding model's token limit (e.g., a math-dense page in a textbook), the binary split algorithm isolates that one chunk:

EmbedBatch([0..1249])         → 400 "input[846] too long"
  ├─ EmbedBatch([0..624])     → OK (promote 625)
  └─ EmbedBatch([625..1249])  → 400
      ├─ EmbedBatch([625..937])  → 400
      │   ... (binary search narrows toward chunk 846)
      │   └─ EmbedBatch([846..846]) → 400 (mark chunk 846 Failed)
      └─ EmbedBatch([938..1249]) → OK (promote 312)

Result: 1249 chunks indexed, only chunk 846 marked Failed. The file's status becomes Degraded — partially searchable instead of completely missing.
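
The binary split above can be reproduced with a short recursive function. This is a minimal Python sketch, not the code in `EmbeddingBatchSplitter.cs`: `EmbeddingError` and `fake_embed_batch` are hypothetical stand-ins for a provider returning a 400 on an oversized input, and the midpoint rule is chosen to match the ranges in the trace.

```python
class EmbeddingError(Exception):
    """Stand-in for a per-input rejection such as a 400 'too long' response."""


def embed_with_split(lo, hi, embed_batch):
    """Embed chunks lo..hi (inclusive); return (promoted, failed) index lists."""
    try:
        embed_batch(lo, hi)
        return list(range(lo, hi + 1)), []
    except EmbeddingError:
        if lo == hi:
            return [], [lo]  # a single chunk that still fails is marked Failed
        mid = (lo + hi) // 2
        ok_left, bad_left = embed_with_split(lo, mid, embed_batch)
        ok_right, bad_right = embed_with_split(mid + 1, hi, embed_batch)
        return ok_left + ok_right, bad_left + bad_right


def fake_embed_batch(lo, hi, too_long=846):
    """Simulated provider: rejects any batch containing chunk 846."""
    if lo <= too_long <= hi:
        raise EmbeddingError(f"input[{too_long}] too long")


promoted, failed = embed_with_split(0, 1249, fake_embed_batch)
print(len(promoted), failed)
```

Only batches that contain the offending chunk are retried, so the extra API calls grow with the logarithm of the batch size rather than with the number of chunks. A production version would presumably split only on per-input errors and let auth or network failures propagate.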

Deferred Retry Pass

Each exec ends with a retry pass over any chunks left in PendingEmbedding state from previous runs:

  • Reads enriched text from DB — no OCR or contextualization re-run
  • Calls the embedding API only — typically seconds, not minutes
  • Up to 3 retries per chunk; on exhaustion, the chunk is marked Failed
  • Auth errors (401/403) flag the provider as unavailable and skip the rest of the pass
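
The retry rules in this list can be summarized in a small loop. The sketch below is hypothetical Python: the chunk dictionaries, the exception classes, and the idea that the retry counter is stored per chunk across passes are all assumptions made for illustration.

```python
class AuthError(Exception):
    """Stand-in for a 401/403 response from the embedding provider."""


class EmbedError(Exception):
    """Stand-in for any other embedding failure."""


MAX_RETRIES = 3


def retry_pass(pending, embed_one):
    """Re-embed PendingEmbedding chunks using their stored enriched text."""
    for chunk in pending:
        try:
            chunk["embedding"] = embed_one(chunk["text"])
            chunk["status"] = "Indexed"
        except AuthError:
            # Provider is unusable: skip the rest of the pass.
            return "provider_unavailable"
        except EmbedError:
            chunk["retries"] += 1
            if chunk["retries"] >= MAX_RETRIES:
                chunk["status"] = "Failed"
    return "completed"
```

Because the pass reads already-enriched text, a chunk never triggers OCR, transcription, or contextualization again; the only cost of a retry is one embedding call.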

File States

| Status | Meaning | Hash-skip behavior |
|---|---|---|
| Ready | Fully indexed | Skip if hash matches |
| Degraded | Some chunks failed (binary-split isolated) | Skip if hash matches |
| PartiallyDeferred | Chunks pending embedding retry | Main loop skips; deferred pass picks up |
| Failed | Extraction or repeated embedding failure | Skip; requires --force to retry |
| NeedsAction | User intervention required | Skip with separate counter |

Schema Versioning

Each KB DB carries a PRAGMA user_version tag. The exec command migrates older schemas automatically as part of InitializeSchema(). The serve command opens DBs read-only and never triggers migration — older-schema KBs continue to serve search queries correctly while their new-feature columns remain unused.
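
A version-tagged migration of this kind can be sketched with Python's built-in sqlite3 module. The migration statements and version numbers below are hypothetical; only the use of `PRAGMA user_version` comes from the description above.

```python
import sqlite3

LATEST_VERSION = 2  # hypothetical target version

# Hypothetical migrations, keyed by the version they produce.
MIGRATIONS = {
    1: "CREATE TABLE IF NOT EXISTS chunks (id INTEGER PRIMARY KEY, text TEXT)",
    2: "ALTER TABLE chunks ADD COLUMN status TEXT",
}


def initialize_schema(db: sqlite3.Connection) -> int:
    """Apply any missing migrations and return the resulting schema version."""
    version = db.execute("PRAGMA user_version").fetchone()[0]
    for target in range(version + 1, LATEST_VERSION + 1):
        with db:
            db.execute(MIGRATIONS[target])
            # PRAGMA does not accept bound parameters; target is an int.
            db.execute(f"PRAGMA user_version = {target}")
    return db.execute("PRAGMA user_version").fetchone()[0]
```

A read-only consumer can open the same file with `sqlite3.connect("file:rag.db?mode=ro", uri=True)` and simply never call `initialize_schema`, which mirrors the split between exec (migrates) and serve (read-only) described above.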

Installation

dotnet tool (recommended)

dotnet tool install -g FieldCure.Mcp.Rag

From source

git clone https://github.com/fieldcure/fieldcure-mcp-rag.git
cd fieldcure-mcp-rag
dotnet build

Requirements

  • .NET 8.0 Runtime or later
  • OCR: Windows x64 only. Tesseract OCR for scanned PDFs loads lazily on first use. On other platforms, PDFs with embedded text work normally; scanned pages without a text layer are silently skipped.
  • An embedding provider (Ollama, OpenAI, etc.) — optional, BM25 search works without it
  • Ollama 0.4.0 or later (if using Ollama for embedding or contextualization)

Quick Start

Index a folder and search it without any embedding setup (BM25 only):

# 1. Install
dotnet tool install -g FieldCure.Mcp.Rag

# 2. Create a minimal config
$kbPath = "$env:LOCALAPPDATA\FieldCure\Mcp.Rag\demo"
New-Item -ItemType Directory -Force -Path $kbPath
@'
{
  "id": "demo",
  "name": "Demo KB",
  "sourcePaths": ["C:\\my-docs"]
}
'@ | Set-Content "$kbPath\config.json"

# 3. Index
fieldcure-mcp-rag exec --path $kbPath

# 4. Start the search server
fieldcure-mcp-rag serve --base-path "$env:LOCALAPPDATA\FieldCure\Mcp.Rag"

For full retrieval quality with semantic search and contextualization, add embedding and contextualizer blocks to config.json — see Usage below.

Usage

1. Create a knowledge base folder

%LOCALAPPDATA%\FieldCure\Mcp.Rag\{kb-id}\config.json
{
  "id": "my-kb-001",
  "name": "Project Docs",
  "created": "2026-04-03T00:00:00Z",
  "sourcePaths": ["C:\\Users\\me\\Documents\\project-docs"],
  "contextualizer": {
    "provider": "anthropic",
    "model": "claude-haiku-4-5-20251001",
    "apiKeyPreset": "Claude"
  },
  "embedding": {
    "provider": "openai",
    "model": "text-embedding-3-small",
    "apiKeyPreset": "OpenAI"
  }
}

API keys are resolved from environment variables: apiKeyPreset: "OpenAI" → OPENAI_API_KEY, "Claude" → ANTHROPIC_API_KEY, "Gemini" (or "Google") → GEMINI_API_KEY.
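
The preset-to-variable mapping can be expressed as a simple lookup. This is a hedged sketch: `resolve_api_key` is a hypothetical helper, and only the four preset names and three environment variables listed above are taken from the README.

```python
import os

# Preset names and variables as listed in the README.
PRESET_ENV_VARS = {
    "OpenAI": "OPENAI_API_KEY",
    "Claude": "ANTHROPIC_API_KEY",
    "Gemini": "GEMINI_API_KEY",
    "Google": "GEMINI_API_KEY",
}


def resolve_api_key(preset, environ=os.environ):
    """Return the key for a preset, or None if the variable is not set."""
    env_var = PRESET_ENV_VARS.get(preset)
    if env_var is None:
        raise KeyError(f"unknown apiKeyPreset: {preset!r}")
    return environ.get(env_var)
```

A `None` result corresponds to the "missing key" case: batch commands would have to fail at that point, while an interactive search could fall back to elicitation.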

Gemini embedding example — asymmetric retrieval with 1536-dim Matryoshka truncation (50% storage of full 3072 with identical MTEB score):

"embedding": {
  "provider": "gemini",
  "model": "gemini-embedding-2",
  "apiKeyPreset": "Gemini",
  "dimension": 1536
}

| Dimension | MTEB | Storage | Use case |
|---|---|---|---|
| 768 | 67.99 | 25% | Storage-constrained |
| 1536 | 68.17 | 50% | Recommended default |
| 3072 | 68.17 | 100% | Maximum quality (pre-normalized) |

In serve mode, search_documents can also prompt via MCP elicitation when the client supports it. In exec and exec-queue, missing keys must be provided via environment variables.
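
The Matryoshka truncation behind the dimension table keeps the leading components of the full vector. Since only the 3072-dimension output is described as pre-normalized, a truncated vector is typically re-normalized before cosine similarity. The sketch below illustrates that step; whether this server normalizes client-side or relies on the API is an assumption, and both function names are hypothetical.

```python
import math


def truncate_and_normalize(vec, dim):
    """Keep the first `dim` components and rescale to unit length."""
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head] if norm else head


def storage_fraction(dim, full_dim=3072):
    """Relative storage cost of a truncated embedding."""
    return dim / full_dim
```

`storage_fraction(1536)` is 0.5 and `storage_fraction(768)` is 0.25, matching the Storage column.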

2. Index documents

fieldcure-mcp-rag exec --path "C:\Users\me\AppData\Local\FieldCure\Mcp.Rag\my-kb-001"

3. Start MCP search server

fieldcure-mcp-rag serve --base-path "C:\Users\me\AppData\Local\FieldCure\Mcp.Rag"

A single serve process handles all knowledge bases under the base path. Tools accept a kb_id parameter to target a specific KB.

Claude Desktop

Add to claude_desktop_config.json:

{
  "mcpServers": {
    "rag": {
      "command": "fieldcure-mcp-rag",
      "args": ["serve", "--base-path", "C:\\Users\\me\\AppData\\Local\\FieldCure\\Mcp.Rag"],
      "env": {
        "OPENAI_API_KEY": "sk-...",
        "ANTHROPIC_API_KEY": "sk-ant-..."
      }
    }
  }
}

config.json Reference

| Field | Description |
|---|---|
| id | Knowledge base identifier |
| name | Display name |
| sourcePaths | List of folders to index (multiple supported) |
| contextualizer.provider | "anthropic", "openai", "ollama", or empty to disable |
| embedding.provider | "openai", "ollama", "gemini", or empty to disable |
| embedding.dimension | Output dimension. 0 = provider default. Gemini supports MRL truncation: 768 / 1536 / 3072. |
| contextualizer.model | Model ID, or empty to disable contextualization |
| contextualizer.apiKeyPreset | Maps to env var: "OpenAI" → OPENAI_API_KEY, "Claude" → ANTHROPIC_API_KEY |
| contextualizer.baseUrl | API base URL override (null = provider default) |
| embedding.* | Same structure as contextualizer |
| embedding.maxChunkChars | Max chars per chunk before pre-split (default: 4000) |
| embedding.batchSize | Max chunks per embedding API call (default: auto from provider table) |
| embedding.keepAlive | Ollama only: VRAM retention duration (default: "5m") |
| embedding.numCtx | Ollama only: context window tokens (default: 8192). Contextualizer only. |
| systemPrompt | Custom system prompt for contextualization (null = built-in default) |

Tools

All tools (except list_knowledge_bases) require a kb_id parameter to specify the target knowledge base.

| Tool | Description |
|---|---|
| list_knowledge_bases | List all available KBs with status (file/chunk counts, indexing status) |
| search_documents | Hybrid BM25 + vector search with RRF. Supports search_mode: auto, bm25, vector |
| get_document_chunk | Retrieve full content of a specific chunk by ID |
| start_reindex | Queue an indexing request. Scope merge, force/deferred flags, orchestrator auto-spawn |
| cancel_reindex | Remove a pending (not-yet-started) queue entry |
| get_index_info | Index metadata, queue state (status/position/deferred/last_error), contextualization health |
| check_changes | Dry-run filesystem scan. Lightweight, no API calls |

Search Modes

| search_mode | Behavior |
|---|---|
| auto | Hybrid when embedding available, else BM25. Recommended |
| bm25 | Keyword-only (FTS5). No embedding call |
| vector | Semantic-only. Errors if no embedding provider |
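
The mode selection above boils down to a small decision function. A minimal sketch, assuming a boolean that says whether an embedding provider is configured; `resolve_search_mode` and the returned strategy names are hypothetical.

```python
def resolve_search_mode(search_mode: str, embedding_available: bool) -> str:
    """Map a requested search_mode to the retrieval strategy actually used."""
    if search_mode == "auto":
        return "hybrid" if embedding_available else "bm25"
    if search_mode == "bm25":
        return "bm25"  # keyword-only, never calls the embedding API
    if search_mode == "vector":
        if not embedding_available:
            raise ValueError("vector search requires an embedding provider")
        return "vector"
    raise ValueError(f"unknown search_mode: {search_mode!r}")
```

With `auto`, a KB indexed without embeddings still answers queries through BM25, which is why it is the recommended setting.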

Supported Formats

Document formats are provided by FieldCure.DocumentParsers:

  • DOCX — Microsoft Word (with math equation extraction)
  • HWPX — Korean standard document (OWPML, with math equation extraction)
  • XLSX — Excel spreadsheets
  • PPTX — PowerPoint presentations
  • PDF — PDF text extraction with ## Page N headers; OCR fallback for scanned pages (Tesseract, eng+kor)
  • TXT, MD — Plain text / Markdown

Project Structure

src/FieldCure.Mcp.Rag/
├── Program.cs                     # CLI entry (exec | exec-queue | serve | prune-orphans)
├── MultiKbContext.cs              # Multi-KB manager (lazy load, Classify, lazy unload)
├── ExecQueueRunner.cs             # Deferred queue orchestrator
├── OrphanCleanupRunner.cs         # prune-orphans CLI
├── Configuration/
│   ├── RagConfig.cs               # config.json model (KeepAlive, NumCtx fields)
│   └── OllamaDefaults.cs          # Shared defaults (KeepAlive="5m", NumCtx=8192)
├── Indexing/
│   ├── IndexingEngine.cs          # 5-stage pipeline (2-commit model)
│   └── EmbeddingBatchSplitter.cs  # Binary-split per-chunk failure isolation
├── Contextualization/
│   ├── IChunkContextualizer.cs
│   ├── OpenAiChunkContextualizer.cs   # /v1/chat/completions
│   ├── OllamaChunkContextualizer.cs   # /api/chat (keep_alive + num_ctx)
│   ├── AnthropicChunkContextualizer.cs
│   └── NullChunkContextualizer.cs
├── Embedding/
│   ├── IEmbeddingProvider.cs
│   ├── OpenAiCompatibleEmbeddingProvider.cs  # /v1/embeddings
│   ├── OllamaEmbeddingProvider.cs            # /api/embed (keep_alive)
│   ├── NullEmbeddingProvider.cs
│   └── EmbeddingBatchSizes.cs
├── Storage/
│   └── SqliteVectorStore.cs       # SQLite + FTS5 + SIMD cosine similarity
├── Search/
│   ├── HybridSearcher.cs          # BM25 + Vector → RRF
│   └── RrfFusion.cs
├── Chunking/
│   ├── TextChunker.cs
│   └── ChunkLimits.cs
└── Tools/
    ├── ListKnowledgeBasesTool.cs
    ├── SearchDocumentsTool.cs
    ├── GetDocumentChunkTool.cs
    ├── StartReindexTool.cs        # Queue entry point + orchestrator spawn
    ├── CancelReindexTool.cs       # Remove pending queue entry
    ├── GetIndexInfoTool.cs        # Includes queue state
    └── CheckChangesTool.cs

Data Storage

Knowledge base data is stored at %LOCALAPPDATA%\FieldCure\Mcp.Rag\{kb-id}\:

  • config.json — knowledge base configuration
  • rag.db — SQLite database (chunks, embeddings, FTS5 index, file hashes, indexing lock)

Queue and lock files at %LOCALAPPDATA%\FieldCure\Mcp.Rag\:

  • .deferred-queue.json — pending indexing requests
  • orchestrator.lock — PID lock for the queue orchestrator

Development

# Build
dotnet build

# Test
dotnet test

# Pack as dotnet tool
dotnet pack src/FieldCure.Mcp.Rag -c Release

See Also

Part of the AssistStudio ecosystem.

License

MIT


Configuration

OPENAI_API_KEY
ANTHROPIC_API_KEY
GEMINI_API_KEY
VOYAGE_API_KEY
GROQ_API_KEY

Categories: AI & LLM Tools, Search & Web Crawling
Registry: active
Package: FieldCure.Mcp.Rag
Transport: STDIO
Updated: May 25, 2026
