A Windows-only MCP server that hooks Claude directly into your local Markdown vault without uploading anything. It exposes two tools: omniclip.status to verify the index is live, and omniclip.search for semantic retrieval across your notes, PDFs, and 1,290 other formats via Tika. Built on top of the standalone OmniClip RAG app, which uses BAAI/bge-m3 embeddings and hot-reloads new files automatically. The MCP layer is read-only by design, so Claude can pull context but never write back. If you keep a large personal knowledge base in Obsidian or Logseq and want Claude to search it during conversations, this bridges the gap without Docker or cloud sync.
A silent gravity field between your private notes and the universe of AI.
(Supports 1290 formats since V0.3.3, and ships an MCP Registry / MCPB line since V0.4.1)
[!TIP] TL;DR: MCP Quickstart
OmniClip RAG now ships a read-only local-first MCP server for searching private Markdown, PDF, and Tika-backed knowledge bases on Windows. Download OmniClipRAG-MCP-v0.4.8-win64.zip for manual stdio setup or omniclip-rag-mcp-win-x64-v0.4.8.mcpb for the official MCP Registry / MCPB path. Point your MCP client at OmniClipRAG-MCP.exe, then ask the AI to call omniclip.status first and omniclip.search for the actual retrieval flow. Full details: MCP_SETUP.md.
What is it? It is a local Markdown semantic search tool, a local RAG knowledge base, and now a read-only MCP retrieval server.
How to use it? Open the application, enter the path to your Markdown notes, and click "Build Knowledge Base" to set up your local RAG vault. Once it is built, you can search your notes semantically. The retrieved content can be copied and sent to any AI for in-depth discussion, or used for your own deep reading.
What are the benefits? No need to upload any of your data, and no vendor lock-in. It requires no complex configuration or setup. Moreover, it features hot-reloading—newly written notes automatically enter the RAG vault! New notes can also be an organized collection of your historical conversations with AIs, which in turn implicitly provides a permanent memory for them.
[!NOTE] Introduction: Handing Over Our "Cyber-Underwear" in the AI Era!
OmniClip RAG uniquely achieves the impossible: You can have it all!
- We Demand: Our Markdown notes remain completely ours.
- We Also Demand: That any AI can participate deeply, but only within the scope we permit and supervise. The note vault and the AI must be deeply decoupled yet highly interactive.
- And We Demand: An out-of-the-box experience without any tedious setup, featuring a robust hot-reload capability so new notes automatically enter the RAG semantic pool! It can even compile your historical AI conversations, granting your LLMs a permanent, rolling memory.
In the AI era, the more we rely on large models, the more personal privacy we surrender. Most knowledge base RAG tools on the market are either agonizingly complex to configure (involving server-like Docker or Python environments), demand a steep learning curve that costs too much time, forcibly tether you to a bloated chat interface, or require you to upload your notes completely. They all attempt to lock your data into their products, making it impossible for you to ever leave them.
To ensure my notes and thoughts genuinely remain mine, I spent considerable time thinking through and comparing numerous possibilities before finalizing and hand-crafting this pure local semantic retrieval tool—OmniClip RAG. I pushed its core functionalities to the absolute limit, ensuring that it both runs smoothly on most computers and maintains professional-grade capabilities. It functions as a local knowledge firewall, allowing you to selectively let AI deeply read your "second brain" without worrying about your data being hijacked by any cloud or local software.
OmniClip RAG is a radically decoupled "privacy firewall" and "manual-transfer local RAG search engine" meticulously crafted for the Markdown note ecosystem (natively compatible with Logseq, Obsidian, Typora, MarkText, Zettlr, and any plain text application).
It exclusively performs one highly refined task: it runs semantic search over tens of thousands of pages locally using vector embeddings (e.g., BAAI/bge-m3) and structural indexing, meticulously packs the most high-value contextual snippets, and lets you manually clip and paste them into any external top-tier AI (such as ChatGPT, Claude, Kimi, etc.) for profound interactions. In short: As long as your materials are in Markdown format, this engine acts as the ultimate "second brain permanent memory extractor."
OmniClip is intentionally not trying to win with flashy UI tricks. The real work went into making local knowledge retrieval dependable, explainable, and maintainable without forcing users into cloud upload or environment chaos.
[Screenshots: ⚙️ Configuration and Indexing UI | 🌙 Dark Mode Aesthetics]
OmniClip integrates smoothly into your workflow.
It ships as a single portable ("green") EXE. No scripting or dev environment is needed. Just download, double-click, and run.
OmniClip RAG MCP Server lets MCP-capable AI clients search your local knowledge base through the same read-only retrieval core that powers the desktop app.
From v0.4.8, the MCP line is packaged in two parallel distribution forms:
- OmniClipRAG-MCP-v0.4.8-win64.zip for manual file-based setup
- omniclip-rag-mcp-win-x64-v0.4.8.mcpb for the official MCP Registry and MCPB-aware clients

[!CAUTION] What You Need First
Use OmniClipRAG-MCP.exe only as the headless read-only bridge for AI clients. It does not build knowledge bases. You MUST build or install your knowledge base from the desktop app first. If your index has not been built yet, the MCP side will return an explicit index_not_ready style error instead of silently pretending everything is fine.
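The "status first, then search" flow can be sketched at the wire level. The sketch below builds MCP `tools/call` requests as JSON-RPC 2.0 messages and decides what to do from the status text. The tool names come from this README; the helper names, the `query` argument of `omniclip.search`, and the idea of matching marker strings in the status text are assumptions for illustration, not the server's documented schema.

```python
import json
from itertools import count
from typing import Optional

_request_ids = count(1)  # JSON-RPC requests need unique ids


def tool_call(name: str, arguments: Optional[dict] = None) -> str:
    """Serialize one MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments or {}},
    })


def next_step(status_text: str) -> str:
    """Pick the next action from the raw text returned by omniclip.status.

    Assumption: the markers `index_not_ready` and `lexical_only` appear
    verbatim somewhere in the status text; the real payload may differ.
    """
    if "index_not_ready" in status_text:
        return "build"                 # build the index in the desktop app first
    if "lexical_only" in status_text:
        return "search_keyword_only"   # degraded mode: no semantic vectors
    return "search"


print(tool_call("omniclip.status"))
# `query` is a hypothetical argument name, used here only for illustration.
print(tool_call("omniclip.search", {"query": "project roadmap"}))
```

In practice an MCP client library handles this framing for you; the point is only that a client should gate `omniclip.search` on the result of `omniclip.status`.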
Since v0.4.1, OmniClip RAG keeps a first-class MCP Registry / MCPB line, so clients that support Registry discovery or MCPB installation can use that path first.
- io.github.msjsc001/omniclip-rag-mcp
- omniclip-rag-mcp-win-x64-v0.4.8.mcpb

For the full Registry/MCPB explanation and client-specific setup notes, see MCP_SETUP.md.
If you downloaded the ZIP package manually, or your client does not support the official MCPB format yet, use the traditional absolute-path stdio setup.
In Jan.ai, create a new MCP server with the following values:
- Server Name: OmniClip RAG
- Transport Type: STDIO
- Command: the full path to OmniClipRAG-MCP.exe (e.g. D:\Apps\OmniClip RAG\dist\OmniClipRAG-MCP-v0.4.8\OmniClipRAG-MCP.exe)
- Arguments: leave empty
- Environment Variables: leave empty by default

Register the MCP server in OpenClaw's config file (%USERPROFILE%\.openclaw\openclaw.json):
{
  "mcpServers": {
    "omniclip-rag": {
      "transport": "stdio",
      "command": "D:\\Apps\\OmniClip RAG\\dist\\OmniClipRAG-MCP-v0.4.8\\OmniClipRAG-MCP.exe",
      "args": []
    }
  }
}
Then restart OpenClaw or its gateway process so it reloads the config.
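Hand-editing Windows paths in JSON is error-prone because every backslash must be doubled. A minimal sketch for generating the entry above follows. The function name `openclaw_entry` is hypothetical, and the sketch only prints the JSON rather than writing to your config file.

```python
import json
from pathlib import PureWindowsPath


def openclaw_entry(exe_path: str) -> dict:
    """Build the mcpServers entry shown above for a given executable path."""
    exe = PureWindowsPath(exe_path)
    if not exe.is_absolute():
        # stdio setup needs the full path, not a relative one
        raise ValueError("use the full absolute path to OmniClipRAG-MCP.exe")
    return {
        "mcpServers": {
            "omniclip-rag": {
                "transport": "stdio",
                "command": str(exe),
                "args": [],
            }
        }
    }


config = openclaw_entry(
    r"D:\Apps\OmniClip RAG\dist\OmniClipRAG-MCP-v0.4.8\OmniClipRAG-MCP.exe"
)
# json.dumps escapes each backslash as \\ so the output is valid JSON.
print(json.dumps(config, indent=2))
```

If you already have other servers registered, merge the `omniclip-rag` key into your existing `mcpServers` object instead of replacing the file.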
V1 intentionally keeps the MCP surface very small and stable:
- omniclip.status: checks whether your local search environment is ready and tells the AI whether it is running in hybrid mode or a degraded lexical_only mode.
- omniclip.search: searches your local knowledge base and returns explicit source labels such as Markdown · xxx.md or PDF · xxx.pdf · Page N.

Once the MCP server is connected, you can simply speak to the AI in natural language. These prompts work well:

- "Use OmniClip to search my local knowledge base for 'project roadmap' and summarize the most useful points."
- "First call omniclip.status, then tell me whether my local knowledge base is ready."
- "Search only PDF results in OmniClip for 'attention mechanism'."
- "Find notes related to 'my thinking model' in OmniClip and show me the most relevant 5 snippets with sources."

If you want an AI to behave more like it has a built-in RAG habit instead of waiting for you to remind it every time, the following two prompt templates work well.
Use this when the AI can call omniclip.search by itself:
From now on, whenever we discuss a topic, please first search my local knowledge base for information relevant to the current question, and then talk to me based on both the search results and my knowledge base as the boundary of what I know, even if I do not always remember that knowledge clearly myself.
Because my knowledge base may contain hundreds of thousands of chunks, please do not over-expand the search beyond what is needed for the current topic.
If you need more of my background later in the conversation, keep using the same pattern: search my local knowledge base first, then continue the discussion based on the relevant results.
Use this when you are talking to a normal web AI that cannot call MCP directly and must ask you for search terms:
From now on, whenever we discuss a topic, please first decide what information you need from my local knowledge base, then ask me for the exact keywords or phrases you want me to search. I will manually search with my local RAG tool and send the retrieved snippets back to you.
Because this web chat does not have MCP access, please behave as if you do: ask me for the search terms you need, wait for the retrieved snippets, and then continue the discussion based on both those snippets and my knowledge base as the boundary of what I know, even if I do not always remember that knowledge clearly myself.
Because my knowledge base may contain hundreds of thousands of chunks, please do not over-expand the search beyond what is needed for the current topic.
If you need more of my background later in the conversation, keep using the same pattern.
[!IMPORTANT] Core Philosophy: Stop thinking of OmniClip RAG as "just another AI software". Instead, treat it as the ultimate "Local Knowledge Router & Context Dispenser" sitting between you and any state-of-the-art AI. The AI is no longer just chatting with you out of thin air; it is reasoning based on your lifetime of accumulated insights.
From an architecture and knowledge-management perspective, we highly recommend the following high-leverage workflows to unlock emergent abilities:
Use OmniClip as your "Single Source of Truth (SSOT)". Since the frontend is physically decoupled from any specific AI, you can take a single highly-dense Context Pack retrieved locally and feed it in parallel to different engines: Have Claude 3.5 Sonnet write core refactoring code based on the snippet, forward the exact same context to O1 for boundary security reviews, and use another model for a localized report. You are leveraging your stable local index to "arbitrage" the strengths of different cloud LLMs, mitigating the blind spots of any single model.
Fully exploit the support for PDF and Tika (1290 formats). Treat your Markdown vault as your "Cognitive Mainchain" (where your judgments and thoughts live), and treat PDFs/DOCXs/EMLs as your "Raw Evidence Silo".
Dump unstructured industry reports and reference books into local folders without manually organizing them. When researching, intentionally isolate your queries: search Markdown for "what I thought", and search PDF/Tika for "what the raw evidence says". Over the MCP protocol, you can simply ask an AI: "Search my recent PDF reports on attention mechanisms and cross-reference them with my markdown reflections." Your PC instantly becomes an offline, private intelligence war room.
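Separating "what I thought" from "what the raw evidence says" can also be done on the result side, using the source labels that omniclip.search attaches to each hit. The sketch below parses labels of the form Markdown · xxx.md and PDF · xxx.pdf · Page N. It assumes the separator is exactly " · " and that page suffixes read "Page N"; the `Source` type and function names are hypothetical.

```python
from typing import List, NamedTuple, Optional, Tuple


class Source(NamedTuple):
    kind: str            # e.g. "Markdown" or "PDF"
    name: str            # file name as shown in the label
    page: Optional[int]  # page number if the label carries one


def parse_label(label: str) -> Source:
    """Split a source label on the middle dot and pull out an optional page."""
    parts = [p.strip() for p in label.split("·")]
    kind, rest = parts[0], parts[1:]
    page = None
    if rest and rest[-1].startswith("Page "):
        tail = rest[-1][len("Page "):]
        if tail.isdigit():
            page = int(tail)
            rest = rest[:-1]
    return Source(kind, " · ".join(rest), page)


def split_by_kind(labels: List[str]) -> Tuple[List[Source], List[Source]]:
    """Return (my own notes, raw evidence) based on the label kind."""
    sources = [parse_label(label) for label in labels]
    notes = [s for s in sources if s.kind == "Markdown"]
    evidence = [s for s in sources if s.kind != "Markdown"]
    return notes, evidence


notes, evidence = split_by_kind([
    "Markdown · attention-notes.md",
    "PDF · survey.pdf · Page 12",
])
print(notes)
print(evidence)
```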
For long-running, complex projects containing requirements, deprecated drafts, and meeting notes, never manually dig through folders when you are stuck. Instead, perform fuzzy searches using sentences or conflicting pairs (e.g., Why did we make this architectural decision back then? or Privacy vs Convenience).
By abusing the "fuzzy semantic association" of the vector engine, OmniClip might suddenly connect a psychology note you wrote two years ago with your current architecture hurdle, triggering true "Serendipity". It prevents the friction of "re-thinking what you have already thought through".
Whenever an AI helps you solve a profound issue or draft an ingenious architecture, immediately summarize it into a clean Markdown file and drop it into your Vault. OmniClip's millisecond hot-reload mechanism instantly pulls it into the LanceDB and FTS5 retrieval pool. Over time, you physically mount a continuously-growing, strictly supervised "past-life memory" onto any AI you use. Your vault stays entirely local. You are only handing the AI the "Minimum Viable Context"—never uploading the gold mine itself. This guarantees absolute data sovereignty and privacy.
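To make the hot-reload idea concrete, here is a deliberately simple stand-in: a polling scan that reports Markdown files that are new or modified since the last pass. This is not OmniClip's mechanism (a file-watcher library such as watchdog is the more likely fit for real-time events); polling by modification time is swapped in here because it needs only the standard library. The function name and the `seen` bookkeeping are assumptions.

```python
from pathlib import Path
from typing import Dict, List


def changed_markdown(vault: Path, seen: Dict[str, float]) -> List[Path]:
    """Return .md files that are new or modified since the previous call.

    `seen` maps file path -> last observed modification time and is
    updated in place, so the caller keeps it between scans.
    """
    changed = []
    for path in sorted(vault.rglob("*.md")):
        mtime = path.stat().st_mtime
        if seen.get(str(path)) != mtime:
            seen[str(path)] = mtime
            changed.append(path)
    return changed


if __name__ == "__main__":
    import tempfile

    with tempfile.TemporaryDirectory() as tmp:
        vault = Path(tmp)
        seen: Dict[str, float] = {}
        (vault / "ai-session-summary.md").write_text("# Summary\n", encoding="utf-8")
        print(changed_markdown(vault, seen))  # the new note is reported once
        print(changed_markdown(vault, seen))  # nothing changed since last scan
```

Each path returned by the scan would then be re-parsed and re-indexed; that indexing step is outside this sketch.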
🌟 The Golden Rule: Before writing a massive document, making a complex decision, or starting a deep chat with an AI — Search first, chat later. OmniClip's true power isn't doing the thinking for you; it's handing back exactly what you've already thought, read, and accumulated, right when you and your AI need it most.
When pasting your retrieved context packs to an AI, you may want the AI to utilize the knowledge effectively without just "summarizing" or "parroting" your notes. We highly recommend including the following guidelines in your System Prompt or initial message to the AI:
- In our conversation, I may sometimes include RAG semantic retrieval snippets (not the full text) related to the topic as background information for our discussion.
- This information comes from my local RAG retrieval software, which searches all relevant snippets within my local note vault. This allows me to establish a deep, critical connection between you and my knowledge base, without the time and effort of uploading the entire vault, thus maximizing the privacy of my notes while enabling in-depth interaction.
- The sole purpose of providing these snippets is to synchronize you with my knowledge boundaries and make our conversation deeper and more meaningful.
- Some of the snippets I provide may be irrelevant; please ignore this noise on your own.
- Please directly treat these snippets as known premises and converse with me based on them. Absolutely do not summarize, simply agree with, parrot back, or distill this background information.
- When you find it necessary, or when your reply is inspired by a specific snippet:
- Please naturally mention the relevant note title and paragraph so I can accurately locate it locally (this also helps me with subsequent additions, deletions, or modifications to my local notes).
- During your reasoning and our conversation, you can ask me to provide supplemental information at any time if needed.
- If you need specific support, please explicitly tell me the exact words or phrases to search for. I will use those to retrieve the key snippets and return them to you.
- If you find that key content is truncated when reviewing a snippet, you can directly ask me to provide the complete note page.
flowchart LR
A["Markdown / Logseq Vault"] --> B["Parser"]
B --> C["SQLite + FTS5"]
B --> D["LanceDB + Embeddings"]
C --> E["Hybrid Retrieval"]
D --> E
E --> F["Context Pack"]
F --> G["Any AI / MCP Client"]
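The "Hybrid Retrieval" node in the flowchart merges a lexical ranking (FTS5) with a semantic ranking (vector search). This README does not say how the two are combined, so the sketch below uses reciprocal rank fusion (RRF) purely as one common, simple option. The function name, the constant `k=60`, and the chunk ids are all assumptions.

```python
from collections import defaultdict
from typing import Dict, List


def rrf_merge(lexical: List[str], semantic: List[str],
              k: int = 60, top_n: int = 5) -> List[str]:
    """Fuse two ranked lists of chunk ids with reciprocal rank fusion.

    Each list contributes 1 / (k + rank) per chunk; chunks found by both
    retrievers accumulate a higher score. Ties break by chunk id.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in (lexical, semantic):
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda c: (-scores[c], c))[:top_n]


# Hybrid mode: both retrievers return results.
print(rrf_merge(["a", "b", "c"], ["b", "d", "a"]))  # ['b', 'a', 'd', 'c']
# Degraded lexical_only mode: the semantic list is empty.
print(rrf_merge(["a", "b"], []))                    # ['a', 'b']
```

With an empty semantic list the fusion reduces to the lexical order, which mirrors the degraded lexical_only mode reported by omniclip.status.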
Everything you own stays strictly within designated bounds.
By default, generated data lives in %APPDATA%\OmniClip RAG. If permissions or system limits prevent that, it falls back gracefully to %LOCALAPPDATA%\OmniClip RAG.
It deliberately avoids creating temp logs or intrusive directories inside system install locations, and it never litters your note vault.
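The data-root fallback described above can be sketched as a writability probe over the two locations in order. This is an illustrative reconstruction under stated assumptions, not the app's actual code: the function name, the probe file name, and the choice to test writability by creating a small file are all hypothetical.

```python
import os
from pathlib import Path
from typing import Mapping, Optional

APP_DIR = "OmniClip RAG"


def pick_data_root(env: Optional[Mapping[str, str]] = None) -> Path:
    """Return the first writable data root: %APPDATA% first, then %LOCALAPPDATA%."""
    env = os.environ if env is None else env
    for var in ("APPDATA", "LOCALAPPDATA"):
        base = env.get(var)
        if not base:
            continue  # variable not set on this machine
        root = Path(base) / APP_DIR
        try:
            root.mkdir(parents=True, exist_ok=True)
            probe = root / ".write_probe"  # hypothetical probe file
            probe.write_text("ok", encoding="utf-8")
            probe.unlink()
            return root
        except OSError:
            continue  # not writable here: try the next candidate
    raise RuntimeError("no writable data root found")


if __name__ == "__main__":
    import tempfile

    # Demo with a throwaway directory so nothing touches the real profile.
    with tempfile.TemporaryDirectory() as tmp:
        print(pick_data_root({"LOCALAPPDATA": tmp}))
```

Passing the environment as a parameter keeps the function easy to test without changing real environment variables.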
External heavy runtime payloads (e.g., native Torch environments) stay outside the packaged EXE and are now designed to converge into a shared AppData sidecar root after user-authorized installation (see RUNTIME_SETUP.md). Packaged GUI builds now carry their own bundled Python runtime for Runtime installation, so end users no longer need to preinstall Python just to download or repair OmniClip's local Runtime sidecar. Runtime manifests now lock exact wheel files instead of loose dependency ranges, and even CUDA entries on non-NVIDIA machines remain manually repairable for installation-chain testing without being misreported as GPU-ready. Lean releases remain clean, while healthy legacy runtimes can still be reused across packaged version folders.
OmniClip is completely open source on GitHub. Whether you are interested in the code, hold high standards for personal data sovereignty, or simply have a note vault too vast to browse by hand, you can inspect and control every part of it at any time.
Currently, all source code and distribution packages have passed unit tests and smoke tests.
Start the Desktop GUI:
.\scripts\run_gui.ps1
Build the Packaged Windows EXE:
.\scripts\build_exe.ps1
Run the headless MCP self-check from source:
python launcher_mcp.py --mcp-selfcheck
For Automation and Terminal Devs, the native CLI is still on active duty:
.\scripts\run.ps1 status
.\scripts\run.ps1 query "your question"
v0.4.8 is the startup responsiveness and extension-source removal release. Packaged GUI startup now shows the first window before heavyweight Runtime-management and extension-summary refresh work finishes. The 拓展格式 (Extension Formats) > PDF / Tika source rows now have a true remove action that clears matching extension index/state data and then removes the source directory from extension settings, without ever touching the user's original files.
- v0.4.8 version line.

v0.4.6 is the UI cognition and navigation polish release: the Start page now speaks in the clearer Primary / Included scope mental model for Markdown source directories, the configuration tabs follow the real product workflow order, and the desktop shell gains a unified theme-aware hover-help layer that can be globally turned on or off.

- READY / query_ready / vector_ready state truth.
- pypdf metadata was missing from the frozen environment.

v0.4.3 is the release that turns the recent hotfix line into a cleaner public release: semantic retrieval now tells the truth, download flows finally follow the active environment end to end, and the MCP/packaging line is again fully reproducible from the repository itself.

- vector_backend is disabled or semantic vectors have not been rebuilt yet, the desktop app and MCP payloads now say so explicitly instead of pretending full hybrid search is active.
- ModelScope -> HF mirror -> Hugging Face official, exposes live terminal output and heartbeat logs, and keeps download failures visible instead of looking frozen.
- OmniClipRAG-MCP.spec is now back in-tree, so build.py can rebuild the GUI ZIP, MCP ZIP, and .mcpb bundle from one version source of truth.

v0.4.2 is the release that turns OmniClip's recent internal convergence work into a publishable product line: data-root truth is now unified, GUI recovery can repair broken environments without silent fallback, and the desktop shell is cleaner to operate day to day.
v0.4.1 turned the new MCP line from "a working second shell" into "a Registry-ready delivery line" so OmniClip could be published through the official MCP Registry instead of living only as a raw manual ZIP.
- server.json metadata file for MCP Registry publishing instead of targeting the deprecated modelcontextprotocol/servers README list.
- omniclip-rag-mcp-win-x64-v0.4.1.mcpb becomes the Registry/MCPB-aware distribution asset, while the old ZIP remains for manual users.
- MCP_SETUP.md now explains ZIP vs .mcpb explicitly.
- 0.4.1 is reserved as the first clean MCP Registry version so hash, metadata, and release assets can be verified before later automation.

v0.4.0 introduced the first dedicated read-only MCP shell on top of the existing retrieval core, turning OmniClip from a desktop-only app into a standard MCP-capable local search engine.
(See Releases page for historical version update notes from V0.1.0 to the present).
OmniClip stands on a serious amount of open-source work. The core projects that are directly integrated, explicitly relied on, or used in build/test flows today include:
Python, Qt / PySide6 / Shiboken6, SQLite, LanceDB, Apache Arrow / PyArrow, PyTorch, sentence-transformers, Transformers / Hugging Face Hub, BAAI/bge-m3, BAAI/bge-reranker-v2-m3, PyPDF, Apache Tika, Eclipse Temurin / Adoptium, watchdog, PyInstaller, pytest, ONNX Runtime, MCP Python SDK / Model Context Protocol.

Thanks to these projects and their maintainers for the long-term engineering work that makes a tool like this possible.
Formal repository-level third-party license and distribution notes now live in THIRD_PARTY_NOTICES.md. The README section above is a human-readable summary, not the legal source of truth.
This project is released under the MIT License.
[!WARNING] OmniClip RAG / 方寸引 is provided on an "as is" and "as available" basis, without warranties of any kind, whether express or implied, including but not limited to merchantability, fitness for a particular purpose, non-infringement, uninterrupted operation, or error-free behavior.
You are solely responsible for:
- verifying all retrieval results, exported context packs, and AI-generated outputs before relying on them
- maintaining backups of your notes, databases, models, and exported materials
- reviewing the legality, sensitivity, and sharing scope of any data you index or paste into third-party AI tools
- complying with the licenses, terms, and usage restrictions of third-party models, libraries, datasets, and services used with this project
OmniClip RAG may return incomplete, outdated, misleading, or incorrect results. Any downstream AI may also hallucinate, misinterpret, overgeneralize, or fabricate conclusions even when the retrieved context is accurate. This project is not a substitute for professional judgment, internal review, or independent verification.
Do not use OmniClip RAG or any exported context pack as the sole basis for medical, legal, financial, compliance, safety-critical, security-critical, employment, academic misconduct, or other high-stakes decisions.
The maintainers and contributors are not liable for any direct, indirect, incidental, consequential, special, exemplary, or punitive damages, or for any data loss, downtime, model misuse, privacy incident, operational interruption, or decision made based on the use or misuse of this project, to the maximum extent permitted by applicable law.
All third-party product names, model names, platforms, and trademarks mentioned in this repository remain the property of their respective owners. Their appearance here does not imply affiliation, endorsement, certification, or partnership.