Turns your codebase into a persistent, queryable knowledge base that lives across AI sessions. Index once with the CLI, then search through SQLite FTS5 for keyword matching, Qdrant with Ollama embeddings for semantic queries, or hybrid mode that fuses both with reciprocal rank fusion (RRF). Ships with six search modes including regex and symbol lookup, plus a code agent that bundles session context and applies changes with automatic test hooks. Everything stays local unless you opt into Qdrant Cloud or a remote Ollama server. Works with Claude, Cursor, Windsurf, and Antigravity. Supports streaming results, fuzzy matching for typos, and optional reranking. If you're tired of re-explaining your architecture every conversation or manually grepping files to paste into context, this gives the model direct read access to your indexed repo with line-range precision.
Local MCP server — index your repo once, search it in every AI session
Keyword (SQLite FTS5) · Semantic (Qdrant + Ollama embeddings) · Hybrid — your code stays on disk
MCP server (vibe-hnindex) latest: v0.12.0 · hnindex-cli v0.12.0 — Docs · Changelog · GitHub Releases
vibe-hnindex is a Model Context Protocol server. After you index a folder once, assistants (Claude, Cursor, Windsurf, Antigravity, …) can search that codebase with paths and line ranges — data is stored locally (SQLite + optional Qdrant). Embeddings use Ollama; vectors use Qdrant (Docker, local, or Qdrant Cloud with QDRANT_API_KEY).
📚 Full docs site: docs.hnindex.cloud — 16 pages covering Getting Started, Configuration, Tools Reference, Guides, and Code Agent.
| Page | What you'll learn |
|---|---|
| Introduction | What vibe-hnindex does, key features, how it works |
| Installation | Node, Ollama, Qdrant setup + MCP config |
| Quick Start | 5-minute walkthrough with CLI + agent skill |
| Configuration | All 25+ env vars with embedding model comparison |
| Search | 6 search modes, regex, fuzzy, streaming, cache |
| Code Agent 🆕 | code_session + code_apply with safety scopes |
| Setup MCP | Per-platform config (Claude, Cursor, Antigravity, VS Code...) |
Also available in-repo: docs/getting-started.md, docs/configuration.md, docs/tools-reference.md.
Optional: the hnindex CLI writes the MCP JSON for you (merge-safe, same npx -y vibe-hnindex block as in the docs):
npm install -g hnindex-cli
# Setup MCP config
hnindex init --mcp antigravity # or: claude, cursor, windsurf, vscode, codex
hnindex init --list # show all targets and paths
# Install AI agent skill (recommended)
hnindex init-skill --target claude # or: antigravity, cursor, windsurf, vscode
hnindex init-skill --list # show all skill targets
# Update
hnindex update # npm update -g hnindex-cli
See docs.hnindex.cloud for full documentation.
1. Node: npm install does not need a C++ compiler. See Troubleshooting → Windows if npm i vibe-hnindex fails.
2. Ollama: ollama pull bge-m3:567m and keep ollama serve running (or set OLLAMA_URL to a remote server).
3. Qdrant: docker run -d --name qdrant -p 6333:6333 qdrant/qdrant (or use Qdrant Cloud). Keyword-only search works without Qdrant.
4. MCP config: add the server block below to your client's MCP file.
{
"mcpServers": {
"vibe-hnindex": {
"command": "npx",
"args": ["-y", "vibe-hnindex"],
"env": {
"OLLAMA_URL": "http://localhost:11434",
"OLLAMA_MODEL": "bge-m3:567m",
"QDRANT_URL": "http://localhost:6333",
"SEARCH_STREAM_ENABLED": "true",
"CODE_AGENT_ENABLED": "true",
"CODE_AGENT_SCOPE": "moderate",
"CHAT_MEMORY_ENABLED": "true"
}
}
}
}
For Qdrant Cloud, add QDRANT_API_KEY and set QDRANT_URL to your HTTPS cluster URL — details in Getting started.
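If you script the config yourself instead of using the CLI, the main thing to get right is adding the vibe-hnindex entry without overwriting other servers already in the file. The sketch below shows one way to do that in Python. `merge_mcp_server` and `VIBE_HNINDEX_ENTRY` are hypothetical names for illustration; this is not how `hnindex init` is implemented, only a minimal example of a non-destructive merge.

```python
import json
from copy import deepcopy

# Server entry copied from the config block above (trimmed to the core env vars).
VIBE_HNINDEX_ENTRY = {
    "command": "npx",
    "args": ["-y", "vibe-hnindex"],
    "env": {
        "OLLAMA_URL": "http://localhost:11434",
        "OLLAMA_MODEL": "bge-m3:567m",
        "QDRANT_URL": "http://localhost:6333",
    },
}


def merge_mcp_server(config: dict, name: str, entry: dict) -> dict:
    """Return a copy of `config` with `entry` stored under mcpServers[name].

    Hypothetical helper: other servers and unrelated top-level keys are kept,
    and the input dict is not modified.
    """
    merged = deepcopy(config)
    servers = merged.setdefault("mcpServers", {})
    servers[name] = deepcopy(entry)
    return merged


if __name__ == "__main__":
    existing = {"mcpServers": {"other-server": {"command": "node", "args": ["server.js"]}}}
    merged = merge_mcp_server(existing, "vibe-hnindex", VIBE_HNINDEX_ENTRY)
    print(json.dumps(merged, indent=2))
```

For Qdrant Cloud you would add `QDRANT_API_KEY` to the `env` dict and change `QDRANT_URL` before merging.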
Semantic/hybrid search already uses Ollama (OLLAMA_URL, OLLAMA_MODEL, e.g. bge-m3:567m) for query vectors and Qdrant for retrieval. After that, the server can reorder the top pool of hits:
- Without RERANK_URL: reorder by Qdrant semantic scores (no extra network service). This is enough for most setups, including when you only run Ollama + Qdrant.
- With RERANK_URL: POST JSON { "query", "documents" } to your URL; the response is { "scores": number[] } (same length as documents). Use a small HTTP service you host that wraps your reranker; Ollama does not expose this contract on :11434 by default.

Ollama vs rerank: pulling a reranker model in Ollama (e.g. qllama/bge-reranker-v2-m3) does not replace RERANK_URL. You still need an adapter service unless you only rely on the built-in Qdrant reorder. See Configuration → Rerank.
| Env | Role |
|---|---|
| SEARCH_RERANK | false disables post-retrieval reorder entirely (default: enabled). |
| SEARCH_RERANK_POOL | Max candidates considered before trim (default 50). |
| RERANK_URL | Full URL of your {query, documents} → {scores} API (optional). |
| RERANK_TIMEOUT_MS | Timeout for that POST (default 15000). |
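To make the `{query, documents} → {scores}` contract concrete, here is a minimal sketch of an adapter service using only the Python standard library. The scorer is a placeholder (token overlap) standing in for a real reranker model, and the names `score_documents`, `handle_rerank`, `RerankHandler`, and port 8787 are assumptions for illustration, not part of vibe-hnindex.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def score_documents(query: str, documents: list[str]) -> list[float]:
    """Placeholder scorer: fraction of query tokens found in each document.

    Swap this for a call to your real reranker model.
    """
    query_tokens = set(query.lower().split())
    if not query_tokens:
        return [0.0 for _ in documents]
    scores = []
    for doc in documents:
        doc_tokens = set(doc.lower().split())
        scores.append(len(query_tokens & doc_tokens) / len(query_tokens))
    return scores


def handle_rerank(body: bytes) -> bytes:
    """Turn a {"query", "documents"} JSON body into a {"scores"} JSON body."""
    payload = json.loads(body)
    scores = score_documents(payload["query"], payload["documents"])
    return json.dumps({"scores": scores}).encode("utf-8")


class RerankHandler(BaseHTTPRequestHandler):
    def do_POST(self) -> None:
        length = int(self.headers.get("Content-Length", 0))
        try:
            response = handle_rerank(self.rfile.read(length))
            status = 200
        except (AttributeError, KeyError, TypeError, ValueError):
            # Malformed JSON or missing/invalid fields.
            response = json.dumps({"error": "expected {query, documents}"}).encode("utf-8")
            status = 400
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(response)))
        self.end_headers()
        self.wfile.write(response)


if __name__ == "__main__":
    # Port 8787 is arbitrary; you would then set RERANK_URL=http://localhost:8787/
    HTTPServer(("127.0.0.1", 8787), RerankHandler).serve_forever()
```

The one property the contract requires is that `scores` has exactly one entry per input document, in the same order.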
To prevent hanging when Ollama or Qdrant are unresponsive, vibe-hnindex applies timeouts on all external calls. You can tune these via environment variables:
| Env | Default | Controls |
|---|---|---|
| OLLAMA_TIMEOUT_MS | 30000 (30s) | Max wait for Ollama /api/embed and /api/tags calls |
| QDRANT_TIMEOUT_MS | 15000 (15s) | Max wait for Qdrant API calls (search, upsert, etc.) |
| SEARCH_TIMEOUT_MS | 60000 (60s) | Overall timeout for the entire search operation |
Set any of these to a higher value if you have a slow machine or large dataset. Set to 0 to disable the timeout for that layer (not recommended).
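The "0 disables the timeout" rule can be sketched as a small resolver that reads a variable, falls back to the documented default, and returns no timeout at all for 0. This is an illustration of the semantics described above, not the server's code. `resolve_timeout_seconds` is a hypothetical name, and the fallback for unparsable or negative values is an assumption.

```python
import os
from typing import Mapping, Optional

# Defaults taken from the table above, in milliseconds.
DEFAULT_TIMEOUTS_MS = {
    "OLLAMA_TIMEOUT_MS": 30000,
    "QDRANT_TIMEOUT_MS": 15000,
    "SEARCH_TIMEOUT_MS": 60000,
}


def resolve_timeout_seconds(name: str, env: Mapping[str, str] = os.environ) -> Optional[float]:
    """Return the timeout in seconds, or None when that layer's timeout is disabled (0)."""
    default_ms = DEFAULT_TIMEOUTS_MS[name]
    raw = env.get(name, "").strip()
    try:
        value_ms = int(raw) if raw else default_ms
    except ValueError:
        # Assumption: unparsable values fall back to the documented default.
        value_ms = default_ms
    if value_ms < 0:
        # Assumption: negative values are treated like unset.
        value_ms = default_ms
    return None if value_ms == 0 else value_ms / 1000
```

A caller would pass the result straight to an HTTP client's `timeout` argument, where `None` conventionally means "wait indefinitely".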
Use the same mcpServers block as above, but save it in Antigravity’s MCP file:
| Item | Location |
|---|---|
| File | mcp_config.json under .gemini/antigravity/ in your user folder |
| Windows | C:\Users\<your-username>\.gemini\antigravity\mcp_config.json |
| macOS / Linux | ~/.gemini/antigravity/mcp_config.json |
| UI | ⋮ menu → MCP → Manage MCP Servers → View raw config |
Step-by-step: Integrations → Google Antigravity.
| Feature | Description |
|---|---|
| Search | 6 modes: keyword (FTS5+BM25), semantic (Qdrant vectors), hybrid (RRF fusion), regex, symbol, auto |
| Code Agent | code_session — 1 call replaces 5-15 searches. code_apply — safe code changes with auto test/lint/typecheck |
| Chat Memory 🆕 | Auto-track tool calls, semantic search via Qdrant, persistent AI context across sessions |
| Streaming | Parallel keyword+semantic search (~1.5-2× faster), 4-phase progress notifications |
| Fuzzy Search | Levenshtein distance auto-corrects typos ("fucntion" → "function") |
| Smart Context | Task-aware context: impact analysis, test file detection, similar code patterns |
| Storage | SQLite on disk + Qdrant for vectors; 100% local, no cloud required |
| Indexing | Incremental (SHA-1 hash), parallel workers (~3-4× faster), watch mode (auto re-index on save), 40+ languages, .hnindexignore |
| Resilience | Keyword search works without Qdrant or Ollama; graceful degradation |
| Benchmark | Built-in benchmark_search tool — compare streaming vs non-streaming, all search modes |
| Multiple Embedding Models | bge-m3 (default), nomic-embed-text, qwen3-embedding, mxbai-embed-large, and more |
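Incremental indexing by content hash, as listed in the Indexing row, boils down to comparing each file's current SHA-1 with the one stored at the last index run and re-processing only the mismatches. The sketch below illustrates that idea. `file_sha1`, `files_to_reindex`, and the `known_hashes` dict are hypothetical; the real index keeps its state in SQLite and its schema is not shown here.

```python
import hashlib
from pathlib import Path


def file_sha1(path: Path) -> str:
    """Hex SHA-1 of the file's bytes, read in chunks to keep memory flat."""
    digest = hashlib.sha1()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def files_to_reindex(paths: list[Path], known_hashes: dict[str, str]) -> list[Path]:
    """Return files that are new or whose content hash differs from the stored one."""
    changed = []
    for path in paths:
        if known_hashes.get(str(path)) != file_sha1(path):
            changed.append(path)
    return changed
```

Unchanged files are skipped entirely, which is what makes a second index run over a mostly stable repo cheap.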
graph TB
subgraph Input["📂 Input"]
A["💻 Your Codebase<br/>.ts .py .go .rs ..."]
end
subgraph Server["⚙️ vibe-hnindex MCP Server"]
B["🔍 Search Router<br/>keyword | semantic | hybrid"]
C["🔀 RRF Fusion"]
end
subgraph Storage["💾 Storage"]
D[("SQLite<br/>FTS5 + Keyword")]
E[("Qdrant<br/>Vector Embeddings")]
end
subgraph Memory["🧠 Chat Memory (v0.12)"]
F[("SQLite<br/>Chat Context")]
G[("Qdrant<br/>Chat Vectors")]
end
subgraph Infra["🏗️ Infrastructure"]
H["Ollama<br/>Embeddings"]
I["Qdrant<br/>localhost:6333"]
end
subgraph Output["🤖 AI Clients"]
J["Claude · Cursor · Windsurf<br/>Antigravity · VS Code"]
end
A -->|"index_codebase"| Storage
A -->|scan| H
B -->|"keyword"| D
B -->|"semantic"| E
B -->|"hybrid"| C
C --> D
C --> E
B -.->|"auto-track"| F
F --> H
H --> G
D --> J
E --> J
H -.-> I
style F fill:#6366f1,color:#fff
style G fill:#6366f1,color:#fff
style B fill:#f59e0b,color:#000
style J fill:#22c55e,color:#fff
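The hybrid path in the diagram merges the keyword and semantic result lists with reciprocal rank fusion. A minimal sketch of standard RRF follows: each hit contributes 1 / (k + rank) per list it appears in, and the sums decide the final order. The constant k = 60 is the value commonly used in the literature and is an assumption here, as are the function name and the chunk IDs; the server's actual parameters may differ.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Fuse several ranked lists of IDs into one list sorted by RRF score.

    Ranks are 1-based. Ties are broken by ID so the output is deterministic.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    keyword_hits = ["a.ts:10-20", "b.ts:1-9", "c.ts:5-8"]    # e.g. FTS5/BM25 order
    semantic_hits = ["b.ts:1-9", "d.ts:3-7", "a.ts:10-20"]   # e.g. vector similarity order
    for doc_id, score in reciprocal_rank_fusion([keyword_hits, semantic_hits]):
        print(f"{score:.5f}  {doc_id}")
```

Because RRF only uses ranks, it needs no score normalization between BM25 and cosine similarity, which is why it is a common choice for hybrid search.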
MIT — see LICENSE.
Issues and PRs: github.com/AndyAnh174/vibe-hnindex.
Ho Viet Anh (AndyAnh174) · hovietanh147@gmail.com · GitHub
| Env | Role |
|---|---|
| OLLAMA_URL | Ollama embedding server URL |
| OLLAMA_MODEL | Ollama embedding model name (default: bge-m3:567m) |
| QDRANT_URL | Qdrant vector database URL |
| STORAGE_PATH | SQLite database storage path |