Gives Claude and other MCP-compatible tools a persistent knowledge graph instead of starting every conversation from scratch. Nine tools let you capture thoughts with typed relationships (refines, cites, refuted_by), search via FTS5 or embeddings, and traverse the graph to pull in connected context. Ships with a Forage skill that runs on your own Claude Code subscription to backfill embeddings, deduplicate memories, and detect contradictions over time. Single SQLite file, no Docker or Python runtime. Auto-configures for Claude Desktop, Cursor, Windsurf, Gemini CLI, and others. The memory layer is extracted from Shelby, the Mac AI coworker, so you can run it standalone with whatever editor or CLI you're already using.
The memory backbone for Shelby — and a standalone knowledge-graph memory server for any MCP-compatible AI tool.
Mem0-grade intelligence. Engram-grade simplicity.
Every AI memory server is a bag of embeddings. ShelbyMCP connects your thoughts.
ShelbyMCP is the open-source memory backbone of Shelby — your AI coworker on Mac — and a zero-dependency MCP memory server you can run standalone with any MCP-compatible AI tool. It gives Claude Code, Cursor, Codex, Windsurf, Gemini, Antigravity, and others persistent memory that understands how your thoughts are related — not just what they contain.
Ship it with the Forage skill, a scheduled task you run in your own Claude Code (or Codex / Gemini CLI) session to continuously enrich, consolidate, and connect your memories. Forage uses your subscription — the same way you'd use Claude Code yourself — so ShelbyMCP itself stays zero-cost and zero-cloud. No Docker. No Python. No cloud accounts. Just a binary and a database file.
Where this fits in Shelby: Shelby is an AI coworker built on three layers — companion (the experience), harness (the runtime that carries context, enforces governance, holds history), and memory (this server). If you want the full coworker experience, install Shelby for Mac. If you only want the memory backbone for your existing AI tools, ShelbyMCP standalone is what you want.
| Problem | ShelbyMCP's answer |
|---|---|
| Every conversation starts from zero | Persistent memory across sessions |
| Memories are a flat pile of text | Knowledge graph with typed relationships (refines, cites, refuted_by, tags) |
| Search results blow up your context window | Pre-computed summaries — search returns one-liners, fetch full content on demand |
| No memory maintenance | Forage skill auto-consolidates, deduplicates, and connects |
| Vector search requires heavy infra | Forage skill backfills embeddings via the AI tool you already run yourself (Claude Code, Codex, Gemini CLI) — Shelby never touches your subscription auth |
| Requires Docker/Python/Cloud | npx shelbymcp, single SQLite file |
# npx (no install needed)
npx shelbymcp
# Or install globally
npm install -g shelbymcp
# Or build from source
git clone https://github.com/Studio-Moser/shelbymcp.git
cd shelbymcp && npm install && npm run build
The CLI auto-configures everything — MCP server registration, Memory Protocol, and optional Forage skill:
shelbymcp setup claude-code --forage # Claude Code CLI
shelbymcp setup claude-desktop --forage # Claude Desktop app
shelbymcp setup cursor --forage # Cursor IDE
shelbymcp setup codex --forage # OpenAI Codex
shelbymcp setup windsurf --forage # Windsurf (Codeium)
shelbymcp setup gemini --forage # Gemini CLI
shelbymcp setup antigravity --forage # Antigravity (Google)
Drop --forage if you just want the MCP server without the scheduled enrichment skill.
That's it. The CLI registers the MCP server, adds the Memory Protocol to the right place, and installs the Forage skill. See docs/AGENT-SETUP.md for manual config and platform-specific details.
Most agents get the Memory Protocol added automatically during setup. For agents that require manual steps, the CLI will tell you exactly what to do. Here's where it goes:
shelbymcp protocol >> ~/.claude/CLAUDE.md # Claude Code (auto)
shelbymcp protocol >> ~/.codex/AGENTS.md # Codex (auto)
shelbymcp protocol >> ~/.codeium/windsurf/memories/global_rules.md # Windsurf (auto)
shelbymcp protocol >> ~/.gemini/GEMINI.md # Gemini CLI / Antigravity (auto)
For Cursor, paste into Settings > Rules > User Rules. For Claude Desktop, paste into Settings > General > "What personal preferences should Claude consider in responses?". The Memory Protocol tells your agent when to save and search — without it, the tools are available but won't be used proactively.
Your database is empty after install. There are three ways to make it useful immediately:
# Option A: Run the onboarding interview (recommended)
# Paste this into a conversation — it asks a few questions and saves 15-30 memories
shelbymcp onboard
# Option B: Import from another AI tool
# Paste this prompt into ChatGPT/Claude/Gemini, copy the response back
shelbymcp migrate
# Option C: Just start working — memories accumulate naturally over time
The onboard skill runs a conversational interview covering who you are, what you're building, your team, preferences, and anti-patterns. Takes about 5 minutes. The migrate prompt tells your other AI tools to export everything they know about you in a structured format that ShelbyMCP can import.
Ask your agent: "What memory tools do you have available?"
It should list 9 tools. Then test: "Remember that I prefer dark mode in all my apps." — and in a new session: "What do you know about my preferences?"
You (in Claude Code): "We decided to use CloudKit for sync instead of Firebase"
Claude Code → capture_thought tool → ShelbyMCP:
1. Stores thought in SQLite
2. Agent provides metadata (type: decision, topics: [sync, cloud])
3. Agent suggests relationships to existing thoughts
4. FTS5 indexes content for keyword search
Later:
You: "What did we decide about our sync strategy?"
Claude Code → search_thoughts tool → ShelbyMCP:
1. FTS5 keyword search for "sync strategy"
2. Returns thought + all connected thoughts via knowledge graph
3. Agent has full context of decisions, alternatives considered, and related work
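The two exchanges above travel as MCP `tools/call` requests over JSON-RPC. Here is a minimal sketch of what those messages might look like. The envelope follows the MCP spec, but the argument names (`content`, `summary`, `type`, `topics`, `query`, `limit`) are illustrative assumptions, not the server's documented schema; check the tool definitions your agent lists for the real field names.

```typescript
// JSON-RPC 2.0 envelope that MCP uses for tool invocations.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

let nextId = 1;

// Build a tools/call request for a named tool.
function toolCall(name: string, args: Record<string, unknown>): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id: nextId++,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// Hypothetical argument names: the agent supplies the metadata itself.
const capture = toolCall("capture_thought", {
  content: "We decided to use CloudKit for sync instead of Firebase",
  summary: "Chose CloudKit over Firebase for sync",
  type: "decision",
  topics: ["sync", "cloud"],
});

// Later, in another session (again with assumed argument names).
const search = toolCall("search_thoughts", { query: "sync strategy", limit: 20 });

// Over the stdio transport, each message is one line of JSON.
const wire = [capture, search].map((m) => JSON.stringify(m)).join("\n");
```

In practice your agent builds these requests for you; the sketch is only meant to show that the agent, not the server, is the one attaching type and topic metadata.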
Forage runs on Claude Code's scheduler:
1. Backfills embeddings for thoughts that don't have them
2. Re-classifies poorly tagged thoughts
3. Finds duplicate thoughts and merges them
4. Detects contradictions ("we said PostgreSQL last month but SQLite this week")
5. Discovers connections between thoughts across projects
6. Sweeps for stale action items that fell through the cracks
7. Generates a weekly digest of your thinking
9 focused tools, kept close to the 5-8 tools per server that research suggests is the sweet spot for agent accuracy.
| Tool | Description |
|---|---|
| `capture_thought` | Store a thought with summary, metadata, topics, and relationships. Accepts an array for bulk capture. |
| `search_thoughts` | Full-text search with knowledge graph expansion. Auto-detects FTS5 vs. vector mode. Returns summaries, not full content. |
| `list_thoughts` | Browse/filter by type, topic, person, project, date range |
| `get_thought` | Retrieve a specific thought by ID (full content) |
| `update_thought` | Update content or metadata. Accepts ids array for bulk updates. |
| `delete_thought` | Remove a thought |
| `manage_edges` | Create or remove typed relationships between thoughts (link, unlink) |
| `explore_graph` | Traverse the knowledge graph from a starting thought. Depth 1 = direct connections, 2+ = full traversal. |
| `thought_stats` | Aggregate statistics about your memory |
What makes ShelbyMCP different from every other memory server is the knowledge graph. Thoughts aren't isolated — they're connected.
| Edge Type | Meaning | Example |
|---|---|---|
| `refines` | A thought that adds detail to another | "Use CloudKit" → "Configure change tokens for sync" |
| `cites` | A thought that references another as evidence | "Decision doc" → "Performance benchmark results" |
| `refuted_by` | A thought that contradicts another | "Use Firebase" ← "Switch to CloudKit" |
| `tags` | A thought that categorizes another | "Architecture" → "CloudKit sync design" |
| `related` | General association | "Auth system" ↔ "User migration plan" |
| `follows` | Sequential relationship | "Phase 1 plan" → "Phase 2 plan" |
Agents create edges at capture time ("this decision relates to thought X") and the Forage skill discovers additional connections over time.
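To make the depth semantics of `explore_graph` concrete, here is a small in-memory sketch of a breadth-first traversal over typed edges. It is an illustration only: the real server stores edges in SQLite, and the `Edge` shape, the thought IDs, and the choice to follow edges in both directions are assumptions made for this example.

```typescript
type EdgeType = "refines" | "cites" | "refuted_by" | "tags" | "related" | "follows";

// Hypothetical edge record; field names are assumptions for this sketch.
interface Edge {
  from: string;
  to: string;
  type: EdgeType;
}

// Breadth-first traversal up to maxDepth hops from the starting thought.
// Returns a map of thought ID to the hop count at which it was first reached.
function exploreGraph(edges: Edge[], start: string, maxDepth: number): Record<string, number> {
  const depthOf: Record<string, number> = {};
  depthOf[start] = 0;
  let frontier: string[] = [start];
  for (let depth = 1; depth <= maxDepth && frontier.length > 0; depth++) {
    const next: string[] = [];
    for (const id of frontier) {
      for (const e of edges) {
        // Treat edges as undirected for discovery (an assumption).
        const neighbor = e.from === id ? e.to : e.to === id ? e.from : undefined;
        if (neighbor !== undefined && !Object.prototype.hasOwnProperty.call(depthOf, neighbor)) {
          depthOf[neighbor] = depth;
          next.push(neighbor);
        }
      }
    }
    frontier = next;
  }
  return depthOf;
}

const edges: Edge[] = [
  { from: "use-cloudkit", to: "configure-change-tokens", type: "refines" },
  { from: "use-firebase", to: "use-cloudkit", type: "refuted_by" },
  { from: "configure-change-tokens", to: "sync-benchmarks", type: "cites" },
];

const direct = exploreGraph(edges, "use-cloudkit", 1); // direct connections only
const wider = exploreGraph(edges, "use-cloudkit", 2);  // also reaches sync-benchmarks
```

Depth 1 returns only the thoughts one edge away; depth 2 also pulls in the benchmark that the refinement cites, which is how a single search hit can bring its surrounding context along.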
ShelbyMCP ships with shelby-forage, a scheduled skill you run in your own AI tool of choice (Claude Code, Codex, Gemini CLI) to continuously improve your memory. The skill executes inside your session, on your subscription — ShelbyMCP itself never authenticates the call. The server stays zero-dependency, your subscription terms apply normally, and the intelligence comes from tools you're already paying for.
| Task | Frequency | What it does |
|---|---|---|
| Summary backfill | Daily | Generate one-liners for thoughts missing summaries |
| Auto-classify | Daily | Improve type/topics/people on poorly tagged thoughts |
| Consolidation | Daily | Find and merge duplicate thoughts |
| Contradiction detection | Daily | Flag conflicting memories (tagged needs-attention) |
| Connection discovery | Daily | Create edges between related thoughts |
| Stale sweep | Weekly (Mon) | Flag forgotten action items (tagged needs-attention) |
| Digest | Weekly (Mon) | Summary of the week's thinking by project/topic |
| Forage log | Every run | Audit trail for continuity between runs |
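The schedule in the table can be read as a simple rule: daily tasks run on every pass, weekly tasks only on Mondays, and the log is written every run. The sketch below encodes that rule. It is illustrative only, since Forage is a prompt executed by your AI tool rather than code in the server, and the `tasksDueOn` helper and the use of UTC to decide the weekday are assumptions.

```typescript
type Frequency = "daily" | "weekly-monday" | "every-run";

interface ForageTask {
  name: string;
  frequency: Frequency;
}

// Task names and frequencies copied from the table above.
const FORAGE_TASKS: ForageTask[] = [
  { name: "Summary backfill", frequency: "daily" },
  { name: "Auto-classify", frequency: "daily" },
  { name: "Consolidation", frequency: "daily" },
  { name: "Contradiction detection", frequency: "daily" },
  { name: "Connection discovery", frequency: "daily" },
  { name: "Stale sweep", frequency: "weekly-monday" },
  { name: "Digest", frequency: "weekly-monday" },
  { name: "Forage log", frequency: "every-run" },
];

// Hypothetical helper: which tasks a run on the given date would include.
// Uses the UTC weekday (1 = Monday); a real scheduler would likely use local time.
function tasksDueOn(date: Date): string[] {
  const isMonday = date.getUTCDay() === 1;
  return FORAGE_TASKS
    .filter((t) => t.frequency !== "weekly-monday" || isMonday)
    .map((t) => t.name);
}

const mondayRun = tasksDueOn(new Date("2024-01-01T12:00:00Z"));  // a Monday
const tuesdayRun = tasksDueOn(new Date("2024-01-02T12:00:00Z")); // a Tuesday
```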
See docs/AGENT-SETUP.md for full setup instructions, platform compatibility table, and gotchas for each agent.
ShelbyMCP works fine without it — you get persistent storage, FTS5 search, and knowledge graph. The Forage skill adds the intelligence layer that makes memories smarter over time. Think of it as optional but recommended.
An empty memory database is a cold-start problem: your AI tools can't personalize until they know something about you. ShelbyMCP ships with two prompts to solve this:
A one-time conversational interview that seeds 15-30 foundational memories. Paste the prompt into a conversation with your primary AI tool:
shelbymcp onboard
It covers who you are, what you're building, your team, your preferences, and anti-patterns.
Takes about 5 minutes. The skill adapts its questions based on your answers — if you're a solo founder, it won't ask about team structure. Memories are saved after each round so you can see them accumulate in real time.
Already have context stored in ChatGPT, Claude, Gemini, or another AI? Export it:
shelbymcp migrate
This prints a prompt you paste into your other AI tool. That tool dumps everything it knows about you in a structured format. Copy the response back into your ShelbyMCP-connected agent — the onboard skill will parse and import it, or you can just paste it into any conversation and ask the agent to import it.
Works with any AI that has memory or conversation history about you. Run it once per tool you're migrating from.
shelbymcp Start the MCP server (stdio)
shelbymcp --transport http Start as HTTP server (Streamable HTTP)
shelbymcp setup <agent> Set up ShelbyMCP for an agent
shelbymcp setup <agent> --forage ...and install the Forage skill
shelbymcp uninstall <agent> Remove ShelbyMCP from an agent
shelbymcp protocol Print the Memory Protocol
shelbymcp forage Print the Forage skill prompt
shelbymcp onboard Print the onboarding interview prompt
shelbymcp migrate Print the migration prompt for other AI tools
shelbymcp help Show help
shelbymcp --version Print version
Supported agents: claude-code, claude-desktop, cursor, codex, windsurf, gemini, antigravity
Server flags:
| Flag | Default | Description |
|---|---|---|
| `--db <path>` | ~/.shelbymcp/memory.db | Custom database path |
| `--verbose` | off | Debug logging |
| `--transport <stdio\|http>` | stdio | Transport mode |
| `--port <number>` | 3100 | HTTP port (only with --transport http) |
| `--host <address>` | 127.0.0.1 | HTTP bind address (only with --transport http) |
Remote server example (Streamable HTTP):
# Start ShelbyMCP as an HTTP server
shelbymcp --transport http --port 3100
# Connect from Claude Code
claude mcp add --transport http shelby http://localhost:3100/mcp
See docs/AGENT-SETUP.md for manual config, platform-specific details, and setup for other MCP-compatible clients.
ShelbyMCP can run as a remote HTTP server, letting you share one memory database across multiple machines. Every setting can be supplied through environment variables:
| Variable | Default | Description |
|---|---|---|
| `SHELBY_TRANSPORT` | stdio | Set to http for remote server mode |
| `PORT` | 3100 | HTTP port (most cloud platforms inject this) |
| `HOST` | 0.0.0.0 (http) / 127.0.0.1 (stdio) | Bind address |
| `SHELBY_DB_PATH` | ~/.shelbymcp/memory.db | SQLite database path (use a persistent volume) |
| `SHELBY_API_KEY` | (none) | Bearer token for auth. Set this for any internet-facing deployment. |
CLI flags (--transport, --port, --host, --db) override env vars when both are set.
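The precedence rule (flag, then environment variable, then default) can be sketched as a small resolver. This is not the server's actual code: `resolveConfig`, `CliFlags`, and `ServerConfig` are hypothetical names, and the host default follows the environment variable table (0.0.0.0 for http, 127.0.0.1 for stdio), which is an assumption about how the two tables fit together.

```typescript
type Transport = "stdio" | "http";

// Hypothetical shapes for this sketch.
interface CliFlags {
  transport?: Transport;
  port?: number;
  host?: string;
  db?: string;
}

interface ServerConfig {
  transport: Transport;
  port: number;
  host: string;
  dbPath: string;
  apiKey: string | undefined;
}

type Env = Record<string, string | undefined>;

// CLI flags win over environment variables, which win over defaults.
function resolveConfig(flags: CliFlags, env: Env): ServerConfig {
  const transport: Transport =
    flags.transport ?? (env.SHELBY_TRANSPORT === "http" ? "http" : "stdio");
  // Assumed default: bind to all interfaces only in http mode.
  const defaultHost = transport === "http" ? "0.0.0.0" : "127.0.0.1";
  return {
    transport,
    port: flags.port ?? (env.PORT !== undefined ? Number(env.PORT) : 3100),
    host: flags.host ?? env.HOST ?? defaultHost,
    // "~" is left unexpanded here; a real implementation would resolve it.
    dbPath: flags.db ?? env.SHELBY_DB_PATH ?? "~/.shelbymcp/memory.db",
    apiKey: env.SHELBY_API_KEY,
  };
}

// --port 4000 overrides PORT=8080 from the environment.
const cloud = resolveConfig({ port: 4000 }, { SHELBY_TRANSPORT: "http", PORT: "8080" });
const local = resolveConfig({}, {});
```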
docker build -t shelbymcp .
docker run -d \
-e SHELBY_TRANSPORT=http \
-e SHELBY_API_KEY=your-secret-key \
-v shelby-data:/data \
-e SHELBY_DB_PATH=/data/memory.db \
-p 3100:3100 \
shelbymcp
claude mcp add --transport http shelby-cloud https://your-server.example.com/mcp \
--header "Authorization: Bearer your-secret-key"
GET /health returns 200 {"status": "ok"} (unauthenticated). Configure this as your platform's health check endpoint.
When SHELBY_API_KEY is set, all requests to /mcp require an Authorization: Bearer <key> header. Without the key, requests return 401. If SHELBY_API_KEY is not set, auth is disabled (suitable for local-only use).
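The auth rules above reduce to a small decision function. The following is a sketch under stated assumptions, not the server's implementation: `checkAuth` and `AuthResult` are hypothetical names, and a production server would probably want a constant-time comparison rather than plain string equality.

```typescript
type AuthResult = { ok: true } | { ok: false; status: 401 };

// Decide whether a request may proceed, given its path, its Authorization
// header (if any), and the configured SHELBY_API_KEY (if any).
function checkAuth(
  path: string,
  authorizationHeader: string | undefined,
  apiKey: string | undefined
): AuthResult {
  // The health check is always unauthenticated.
  if (path === "/health") return { ok: true };
  // No key configured: auth is disabled (local-only use).
  if (apiKey === undefined || apiKey === "") return { ok: true };
  // Key configured: the header must be exactly "Bearer <key>".
  if (authorizationHeader === "Bearer " + apiKey) return { ok: true };
  return { ok: false, status: 401 };
}

const withKey = checkAuth("/mcp", "Bearer your-secret-key", "your-secret-key");
const missingHeader = checkAuth("/mcp", undefined, "your-secret-key");
const healthProbe = checkAuth("/health", undefined, "your-secret-key");
const localOnly = checkAuth("/mcp", undefined, undefined);
```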
ShelbyMCP is a single binary that communicates via MCP and stores everything in a single SQLite file. Supports both stdio (local, default) and Streamable HTTP (remote/multi-client) transports.
AI Tool (Claude Code, Cursor, etc.)
│
│ MCP (stdio or Streamable HTTP)
│
▼
┌──────────────────────┐
│ ShelbyMCP │
│ │
│ ┌──────────────────┐ │
│ │ MCP Protocol │ │ ← JSON-RPC request/response
│ │ (stdio or HTTP) │ │
│ └────────┬─────────┘ │
│ │ │
│ ┌────────▼─────────┐ │
│ │ Tool Router │ │ ← Routes to capture/search/link/etc.
│ └────────┬─────────┘ │
│ │ │
│ ┌────────▼─────────┐ │
│ │ SQLite DB │ │
│ │ ┌─────────────┐ │ │
│ │ │ thoughts │ │ │ ← Content, metadata, embeddings
│ │ │ thought_fts │ │ │ ← FTS5 full-text index
│ │ │ edges │ │ │ ← Knowledge graph relationships
│ │ └─────────────┘ │ │
│ └──────────────────┘ │
└──────────────────────┘
│
▼
~/.shelbymcp/memory.db (single file)
See docs/ARCHITECTURE.md for the full technical design.
| | ShelbyMCP | Engram | Mem0 | Basic Memory | Cipher |
|---|---|---|---|---|---|
| Dependencies | Zero | Zero | Docker+Qdrant+Neo4j | Python+pip | Node.js |
| Storage | SQLite | SQLite | Qdrant+Neo4j+SQLite | Markdown+SQLite | Files |
| Knowledge graph | Native (typed edges) | No | Neo4j (separate service) | Derived from Markdown | No |
| Full-text search | FTS5 | FTS5 | Limited | FTS5 | No |
| Vector search | Via Forage skill | No | Built-in | Optional | No |
| Memory maintenance | Forage skill (daily) | No | Built-in | No | No |
| Contradiction detection | Forage skill | No | No | No | No |
| Zero-install | npx shelbymcp | go install | No | pip install | npm install |
| Single file DB | Yes | Yes | No | No (Markdown files) | No |
ShelbyMCP is the open-source memory server. Shelby for Mac (coming soon) is the native macOS app that adds the companion and harness layers on top of it.
ShelbyMCP and Shelby for Mac use the same SQLite database. Start with the MCP server, upgrade to the Mac app when you want more.
These are non-negotiable. They exist because MCP servers directly impact token costs for every user on every message.
Tool descriptions MUST be static. Tool definitions become part of the agent's system prompt and are sent on every message. Dynamic data (counts, timestamps, user-specific info) in descriptions breaks prompt caching — costing 10x more tokens. Put dynamic data in tool responses, not descriptions. See Architecture: Token Efficiency Patterns.
Search returns summaries, not full content. A search hitting 20 thoughts at 2,000 words each would put 40,000 words (at least 40K tokens) into the context window, most of it wasted. Search results return the agent-provided summary field (one line). The agent calls get_thought for full content when it needs it.
All list/search tools have a limit parameter. Default 20, max 100. Responses include total_count and has_more. No unbounded queries.
The server runs zero inference. Agents provide metadata (type, topics, summary, relationships) at capture time. The Forage skill handles enrichment. The server is pure storage + retrieval.
All logging to stderr. console.error only. stdout is the MCP JSON-RPC channel. A single console.log breaks everything.
Keep the tool count focused. ShelbyMCP has 9 tools; research (Block, Phil Schmid, Docker) suggests 5-8 per server is optimal, so consolidate related operations into single tools with action parameters rather than adding new ones.
Errors are instructions. Return isError: true with actionable messages. "Not found" is useless. "No thought with ID abc123. Try search_thoughts to find it by content." helps the agent self-correct.
Every tool gets annotations. MCP spec annotations (readOnlyHint, destructiveHint, idempotentHint, openWorldHint) on every registered tool. See Architecture: Tool Annotations.
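Two of the rules above, bounded results and errors as instructions, are easy to show in code. This is a minimal sketch, not the server's implementation: `paginate`, `notFound`, and the `offset` parameter are hypothetical, while the `limit` bounds, `total_count`, `has_more`, and `isError` come from the rules themselves.

```typescript
interface ThoughtSummary {
  id: string;
  summary: string; // one-line, agent-provided
}

interface Page {
  results: ThoughtSummary[];
  total_count: number;
  has_more: boolean;
}

const DEFAULT_LIMIT = 20;
const MAX_LIMIT = 100;

// Clamp the requested limit to [1, 100] and report whether more results remain.
// The offset parameter is an assumption; the rules only mention limit.
function paginate(all: ThoughtSummary[], limit?: number, offset = 0): Page {
  const n = Math.min(Math.max(limit ?? DEFAULT_LIMIT, 1), MAX_LIMIT);
  const results = all.slice(offset, offset + n);
  return {
    results,
    total_count: all.length,
    has_more: offset + results.length < all.length,
  };
}

// MCP tool result carrying an actionable error message.
interface ToolResult {
  content: { type: "text"; text: string }[];
  isError?: boolean;
}

function notFound(id: string): ToolResult {
  return {
    isError: true,
    content: [
      {
        type: "text",
        text: `No thought with ID ${id}. Try search_thoughts to find it by content.`,
      },
    ],
  };
}
```

An unbounded request such as `paginate(all, 500)` still returns at most 100 summaries, and the agent can use `has_more` to decide whether another call is worth the tokens.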
# Install dependencies
npm install
# Build
npm run build
# Run tests
npm test
# Run in development
npm run dev -- --db ./test.db --verbose
See docs/DEVELOPMENT.md for the full development guide.
We welcome contributions! See CONTRIBUTING.md for the workflow.
MIT - see LICENSE for details.
Built by Studio Moser. Your AI deserves a memory that connects the dots.