A production-grade persistent memory layer that solves the context problem for long-running AI sessions. Stores facts, preferences, decisions, and skills in SQLite with FTS5 full-text search plus sqlite-vec semantic retrieval, fused via reciprocal rank fusion and reranked by a local Qwen3 cross-encoder. The capture action accepts six typed context categories and deduplicates via embeddings. Includes a temporal knowledge graph with entity resolution, LLM-driven compression of old memories, and encrypted cross-machine sync to Google Drive or S3. Ships with skill definitions that teach Claude when to commit memories and how to recall context proactively. Reach for this when you need the AI to remember conversations across sessions without manually managing context windows or paying for third-party memory APIs.
mcp-name: io.github.n24q02m/mnemo-mcp
Persistent AI memory with hybrid search and embedded sync. Open, free, unlimited.
| Project | Tagline | Tag |
|---|---|---|
| better-code-review-graph | Knowledge graph for token-efficient code reviews -- semantic search and call-... | MCP |
| better-email-mcp | IMAP/SMTP email for AI agents -- read, send, organize folders, and manage att... | MCP |
| better-godot-mcp | Composite MCP server for Godot Engine -- 17 composite tools for AI-assisted g... | MCP |
| better-notion-mcp | Markdown-first Notion for AI agents -- pages, databases, blocks, and comments... | MCP |
| better-telegram-mcp | Telegram for AI agents -- messages, chats, media, and contacts across both bo... | MCP |
| claude-plugins | Claude Code plugin marketplace for the n24q02m MCP servers -- install web sea... | Marketplace |
| imagine-mcp | Image and video understanding + generation for AI agents -- across Gemini, Op... | MCP |
| jules-task-archiver | Chrome Extension for bulk operations on Jules tasks via batchexecute API -- a... | Tooling |
| mcp-core | Shared foundation for building MCP servers -- Streamable HTTP transport, OAut... | MCP |
| mnemo-mcp | Persistent AI memory with hybrid search and embedded sync. Open, free, unlimi... | MCP |
| qwen3-embed | Lightweight Qwen3 text embedding and reranking via ONNX Runtime and GGUF | Library |
| skret | Secrets without the server. | CLI |
| tacet | TACET: a self-distilling neuro-symbolic cascade that amortises LLM cost in kn... | Tooling |
| web-core | Shared web infrastructure package for search, scraping, HTTP security, and st... | Library |
| wet-mcp | Open-source MCP server for AI agents: web search, content extraction, and lib... | MCP |
| Phase | Version | Status | Highlights |
|---|---|---|---|
| Phase 1 | v1.x | Shipped | Typed memory(action="capture") (6 context_types + dedup) -- RRF (k=60) hybrid fusion + cross-encoder rerank + temporal decay -- importance x recency archive policy + restore -- Alembic migrations -- multi-provider LLM dispatch -- plugin trinity (recall-context + memory-commit skills, SessionStart + opt-in PostToolUse hooks) |
| Phase 2 | v1.x+1 | Shipped | LLM-driven compression of older memories + Passport sync (encrypted import/export bundle for cross-machine bootstrap) -- AES-256-GCM + Argon2id, S3 / R2 / B2 / MinIO + GDrive backends, delta-sync with LWW per row |
| Phase 3 | v2.0.0 | Shipped (BREAKING) | Temporal knowledge graph -- bitemporal valid_from / valid_to columns -- entity resolution via embedding KNN -- entity_search / entity_graph / history actions -- KG-aware passport bundle sections -- KG_AUTO_ENABLED opt-in auto-extract on capture |
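
The Phase 1 row names RRF (k=60) fusion followed by temporal decay. The sketch below shows one way those two steps could look. It is not the project's actual code: the function names, the sample memory IDs, and the 30-day half-life are assumptions made for illustration; only `k=60` and the exponential half-life shape come from this README.

```python
from collections import defaultdict


def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Reciprocal rank fusion: each list contributes 1 / (k + rank) per ID (rank starts at 1)."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, memory_id in enumerate(ranking, start=1):
            scores[memory_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID so the output is deterministic.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


def apply_decay(score: float, age_days: float, half_life_days: float = 30.0) -> float:
    """Exponential half-life decay: the score halves every half_life_days (assumed value)."""
    return score * 0.5 ** (age_days / half_life_days)


fts_hits = ["m1", "m2", "m3"]  # hypothetical FTS5 ranking, best match first
vec_hits = ["m3", "m1", "m4"]  # hypothetical sqlite-vec ranking, best match first
fused = rrf_fuse([fts_hits, vec_hits])
fused_ids = [memory_id for memory_id, _ in fused]
```

IDs that appear in both lists (`m1`, `m3`) accumulate two contributions and rise above IDs found by only one retriever, which is the point of fusing keyword and vector results by rank rather than by raw score.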
- Cross-encoder rerank (RERANK_MODELS, order = litellm fallback; empty -> local qwen3-reranker) with temporal decay and importance boost
- Typed memory(action="capture") with 6 context_types (conversation/fact/preference/skill/task/decision), embedding-based dedup, and a configurable LLM chain (LLM_MODELS, order = litellm fallback)
- Archive policy: recency_factor * (1 - importance) > 1.0; restore action available
- Plugin trinity: /recall-context + /memory-commit skills and SessionStart + opt-in PostToolUse hooks (see docs/ARCHITECTURE.md)
- passport-bootstrap skill.
- Temporal knowledge graph: bitemporal columns (valid_from / valid_to / superseded_by) on every memory + entity-resolution dedup (embedding KNN at default 0.85 cosine threshold) + audit trail (memory_audit table with prev/new state hashes) + new actions (entity_search / entity_graph / history) + opt-in KG_AUTO_ENABLED auto-extract on capture. BREAKING for clients that called memory.get expecting historical-inclusive results: pass as_of for time-travel; the default now filters to current state (valid_to IS NULL).

| Feature | mnemo-mcp | Mem0 | Letta | OpenMemory |
|---|---|---|---|---|
| Hybrid retrieval (FTS + vec) | yes (FTS5 + sqlite-vec + RRF) | yes | partial | yes |
| Cross-encoder rerank chain | yes (qwen3 local + Jina + Cohere) | partial (Cohere only) | no | no |
| Temporal decay scoring | yes (exp half-life) | no | no | no |
| Importance boost in rank | yes (LLM 0.0-1.0) | no | no | no |
| Soft-archive + restore policy | yes (importance x recency) | no | no | no |
| Self-hostable (single SQLite file) | yes (zero ext deps) | partial (cloud-first) | yes (Postgres) | yes (Postgres + Qdrant) |
| Multi-provider LLM dispatch | yes (LLM_MODELS chain, any litellm provider) | partial | yes | partial |
| Plugin trinity (skills + hooks) | yes (recall-context + memory-commit) | n/a | n/a | n/a |
| Multi-machine sync | yes (GDrive bundled OAuth) | yes (cloud) | n/a | n/a |
| E2E-encrypted passport sync | yes (AES-256-GCM + Argon2id, S3 + GDrive) | no | no | no |
| LLM compression on capture | yes (multi-provider, ~3x at >=0.90 retention) | no | no | no |
| Backend-pluggable sync architecture | yes (S3 / R2 / B2 / MinIO + GDrive) | no | no | no |
| Bitemporal valid_from / valid_to queries | yes (as_of time-travel) | no | partial (events only) | no |
| Entity resolution via embedding KNN | yes (cosine threshold tunable) | no | no | no |
| Audit trail with state hashes | yes (memory_audit table) | no | no | no |
2026-05-02 -- Architecture stabilization update
Past months saw significant churn around credential handling and the daemon-bridge auto-spawn pattern. This caused multi-process races, browser tab spam, and inconsistent setup UX across plugins. The architecture is now stable: 2 clean modes (stdio + HTTP), no daemon-bridge layer, no auto-spawn from stdio.
Apologies for the instability period. If you encountered issues with prior versions, please update to the latest release and follow the current setup docs -- most prior workarounds are no longer needed.
Related plugins from the same author:
- wet-mcp -- Web search + content extraction
- imagine-mcp -- Image/video understanding + generation
- better-notion-mcp -- Notion API
- better-email-mcp -- Email management
- better-telegram-mcp -- Telegram
- better-godot-mcp -- Godot Engine
- better-code-review-graph -- Code review knowledge graph
All plugins share the same architecture -- learn the setup pattern once and it transfers to the others.
Full docs: mcp.n24q02m.com/servers/mnemo-mcp/setup/
Install with AI agent -- paste this to your AI coding agent:
Install MCP server mnemo-mcp following the steps at https://raw.githubusercontent.com/n24q02m/claude-plugins/main/plugins/mnemo-mcp/setup-with-agent.md
15 MCP tools, 17 memory actions. The memory surface is exposed both as 11 specialized single-purpose tools and a legacy memory dispatcher (same actions), plus config, help, and config__open_relay:
| Tool | Actions | Description |
|---|---|---|
| add_memory, search_memory, list_memories, update_memory, delete_memory, export_memories, import_memories, memory_stats, restore_memory, archived_memories, consolidate_memories | (one action each) | Specialized single-purpose memory tools -- the recommended surface |
| memory (legacy dispatcher) | add, capture, search, list, update, delete, export, import, stats, restore, archived, archive_now, consolidate, compress, entity_search, entity_graph, history | Core CRUD + typed capture (6 context_types) + hybrid search (RRF + rerank + temporal decay) + import/export + soft-archive + restore + on-demand archive sweep + LLM consolidation + LLM compression + temporal KG (entity search / graph / history) |
| config | status, sync, set, warmup, setup_sync, setup_status, setup_start, setup_skip, setup_reset, setup_complete, setup_relay, sync_now, export_passport, import_passport | Server status, trigger sync, update settings, pre-download embedding model, authenticate sync provider, manage HTTP setup form lifecycle, passport export/import |
| help | topic="memory" or topic="config" | Full documentation for any tool |
| config__open_relay | (HTTP relay mode) | Open the zero-config relay setup form (registered via mcp-core) |
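
For readers wiring up a client by hand, the sketch below builds the JSON-RPC `tools/call` request an MCP client would send to the legacy `memory` dispatcher for a typed capture. The tool name, the `capture` action, and the six context types come from this README; the `content` argument name and the helper function are assumptions, so check `help` with `topic="memory"` for the real parameter list.

```python
import json

# The six context types listed for memory(action="capture").
CONTEXT_TYPES = {"conversation", "fact", "preference", "skill", "task", "decision"}


def build_capture_call(content: str, context_type: str, request_id: int = 1) -> str:
    """Serialize an MCP tools/call request for a typed capture (hypothetical helper)."""
    if context_type not in CONTEXT_TYPES:
        raise ValueError(f"unknown context_type: {context_type!r}")
    request = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "memory",
            "arguments": {
                "action": "capture",
                "content": content,  # assumed argument name, not confirmed by the README
                "context_type": context_type,
            },
        },
    }
    return json.dumps(request)


payload = json.loads(build_capture_call("Use uv for dependency installs", "decision"))
```

Validating `context_type` on the client side gives a clear error before the request leaves the process, instead of relying on the server to reject an unknown category.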
Plugin trinity (Claude Code marketplace install):
| Component | Trigger | Purpose |
|---|---|---|
| mnemo:recall-context skill | session start, before significant decisions, "what do I know about X?" | Pulls cwd / topic-relevant memories with context_type filtering |
| mnemo:memory-commit skill | "remember this" / "save this" / "ghi nho" / "luu lai" | Typed manual capture with context_type decision tree |
| mnemo:knowledge-audit skill | periodic / "audit memory" | Find duplicates, contradictions, stale entries; consolidate |
| mnemo:session-handoff skill | end of session | Capture decisions / preferences / corrections / conventions / open questions |
| SessionStart hook | every session init | Non-blocking nudge to invoke recall-context |
| PostToolUse hook (opt-in) | CAPTURE_AUTO_ENABLED=true | Hint memory-commit after Write/Edit of CLAUDE.md / AGENTS.md / ARCHITECTURE.md / docs/*.md |
| URI | Description |
|---|---|
| mnemo://stats | Database statistics and server status |
| Prompt | Parameters | Description |
|---|---|---|
| save_summary | summary | Generate prompt to save a conversation summary as memory |
| recall_context | topic | Generate prompt to recall relevant memories about a topic |
~/.mnemo-mcp/tokens/ with 600 permissions

git clone https://github.com/n24q02m/mnemo-mcp.git
cd mnemo-mcp
uv sync
uv run mnemo-mcp
This plugin implements TC-Local (machine-bound, single trust principal). The mode/storage/encryption breakdown below is the full classification.
| Mode | Storage | Encryption | Who can read your data? |
|---|---|---|---|
| stdio (default) | ~/.mnemo-mcp/config.json | AES-GCM, machine-bound key | Only your OS user (file perm 0600) |
| HTTP self-host | Same as stdio | Same | Only you (admin = user) |
| HTTP multi-user remote (PUBLIC_URL) | Per-JWT-sub credential store | AES-GCM | Only the authenticated user (per-sub isolation) |
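
The "file perm 0600" guarantee in the stdio row can be illustrated with a short sketch. It covers only the permission half of the story: the AES-GCM encryption and machine-bound key are not shown, and the helper name and sample payload are hypothetical. It assumes a POSIX filesystem.

```python
import json
import os
import stat
import tempfile
from pathlib import Path


def write_private_json(path: Path, data: dict) -> None:
    """Write JSON so that only the owning OS user can read or write the file."""
    path.parent.mkdir(parents=True, exist_ok=True)
    # Creating the file with mode 0o600 avoids a window where it is world-readable.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(data, handle)
    # Re-apply the mode in case the file already existed with looser permissions.
    os.chmod(path, 0o600)


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "config.json"  # stand-in for ~/.mnemo-mcp/config.json
    write_private_json(target, {"example": True})
    mode = stat.S_IMODE(target.stat().st_mode)
```

Passing the mode to `os.open` rather than calling `chmod` after a plain `open` matters because the file never exists on disk with default, group- or world-readable permissions.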
MIT -- See LICENSE.
API_KEYS (secret): API keys for cloud embedding (format: ENV_VAR:key). Without this, the built-in local Qwen3 model is used.
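
A minimal sketch of how the `ENV_VAR:key` format could be parsed. The README only documents the single-entry format; the comma separator for multiple entries, the function name, and the placeholder variable names are assumptions.

```python
def parse_api_keys(raw: str) -> dict[str, str]:
    """Parse 'ENV_VAR:key' entries into a dict (comma-separated entries are an assumption)."""
    pairs: dict[str, str] = {}
    for entry in raw.split(","):
        entry = entry.strip()
        if not entry:
            continue  # tolerate empty input and trailing commas
        # Split on the first colon only, so key values containing ':' stay intact.
        env_var, sep, key = entry.partition(":")
        if not sep or not env_var.strip() or not key.strip():
            raise ValueError(f"expected ENV_VAR:key, got {entry!r}")
        pairs[env_var.strip()] = key.strip()
    return pairs


# EXAMPLE_API_KEY and OTHER_API_KEY are placeholder names, not real provider variables.
single = parse_api_keys("EXAMPLE_API_KEY:abc123")
several = parse_api_keys("EXAMPLE_API_KEY:abc123, OTHER_API_KEY:x:y")
```

An empty string parses to an empty dict, which matches the documented fallback: with no cloud keys configured, the local Qwen3 model handles embedding.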