Cortex Memory Engine


Summary

Gives Claude a persistent memory layer that runs entirely on your machine. You get 29 MCP tools for ingesting conversations, searching across four memory tiers (working, episodic, semantic, procedural), and retrieving context within a token budget. The people graph resolves identities across channels, Bayesian beliefs self-correct with evidence, and temporal queries distinguish "recently" from "first time." Everything stays local in SQLite, with optional encrypted sync to your own cloud storage. The project's benchmarks report 568µs search versus roughly 300ms for cloud alternatives, and an overall LoCoMo score 6.8 percentage points above Mem0 (73.7% versus 66.9%), at no cost. Reach for this when you want Claude to remember across sessions without sending your data to a third-party API.

Tools

Public tool metadata for what this MCP can expose to an agent.

30 tools

listAllEntities (16 params)

List and filter catalog entities with support for pagination, search, and various filters including groups, types, owners, and git repositories. If the client is trying to fetch data for teams, use 'type': 'team' in these APIs.

Parameters (* = required)

page (integer): Page number to return, 0-indexed. Default: 0.
query (string): Filter based on a [search query](https://docs.cortex.io/settings/search). This will search across entity properties. If provided, results will be sorted by relevance.
types (array): Filter the response to specific types of entities. By default, this includes services, resources, and domains. Corresponds to the `x-cortex-type` field in the entity descriptor.
groups (array): Filter based on groups, which correspond to the `x-cortex-groups` field in the Catalog Descriptor. Accepts a comma-delimited list of groups.
owners (array): Filter based on owner group names, which correspond to the `x-cortex-owners` field in the Catalog Descriptor. Accepts a comma-delimited list of owner group names.
context (string): Explain why you're invoking this tool now and how its output will be used. Then state how this call supports your *overall objective* and fits into your broader plan across all tool calls (e.g., why this tool vs. others, and what step it unblocks). Never share any personal details or sensitive information.
pageSize (integer): Number of results to return per page, between 1 and 1000. Default: 250.
includeLinks (boolean): Whether to include links for each entity in the response. Default: false.
includeOwners (boolean): Whether to include ownership information for each entity in the response. Default: false.
hierarchyDepth (string): Depth of the parent / children hierarchy nodes. Can be 'full' or a valid integer. Default: full.
gitRepositories (array): Supports only GitHub repositories in the `org/repo` format.
includeArchived (boolean): Whether to include archived entities in the response. Default: false.
includeMetadata (boolean): Whether to include custom data for each entity in the response. Default: false.
includeNestedFields (array): List of sub fields to include for different types.
includeSlackChannels (boolean): Whether to include Slack channels for each entity in the response.
includeHierarchyFields (array): List of sub fields to include for hierarchies. Only supports 'groups'.

listEntityDescriptors (5 params)

Cortex Catalog API - Access and manage your service catalog, teams, domains, and resources

Parameters (* = required)

page (integer): Page number to return, 0-indexed.
yaml (boolean): When true, returns the YAML representation of the descriptors.
types (array): Filter the response to specific types of entities. By default, this includes services, resources, and domains. Corresponds to the `x-cortex-type` field in the entity descriptor.
context (string)
pageSize (integer): Number of entities to return per page.

listDependenciesForEntity (6 params)

List all dependencies for an entity including both incoming (who depends on this service) and outgoing (what this service depends on) relationships. Essential for understanding service interactions, planning changes, and assessing blast radius.

Parameters (* = required)

page (integer): Page number to return, 0-indexed. Default: 0.
context (string)
pageSize (integer): Number of results to return per page, between 1 and 1000. Default: 250.
callerTag (string)
includeIncoming (boolean). Default: false.
includeOutgoing (boolean). Default: true.

getDependency (5 params)

Get specific dependency details between two entities including method, path, and metadata. Useful for understanding the nature of the relationship, API contracts, and communication patterns between services.

Parameters (* = required)

path (string)
method (string)
context (string)
calleeTag (string)
callerTag (string)

getEntityDetails (5 params)

Retrieve comprehensive details about a specific entity including its metadata, ownership, hierarchies, and relationships. This is the primary method for getting complete information about services, teams, or domains.

Parameters (* = required)

context (string)
tagOrId (string): Entity identifier - can be a tag or CID.
includeOwners (boolean): Include ownership information, default is true.
hierarchyDepth (string): Depth of the parent / children hierarchy nodes. Can be 'full' or a valid integer. Default: full.
includeHierarchyFields (array): List of sub fields to include for hierarchies. Only supports 'groups'.

getCustomDataForEntity (4 params)

List all custom data key-value pairs associated with an entity. Retrieve metadata, configuration settings, and custom attributes stored for services, resources, or domains. Supports pagination for entities with large amounts of custom data.

Parameters (* = required)

page (integer): Page number to return, 0-indexed. Default: 0.
context (string)
tagOrId (string): Entity identifier - can be a tag or CID.
pageSize (integer): Number of results to return per page, between 1 and 1000. Default: 250.

getCustomDataForEntityByKey (3 params)

Retrieve a specific custom data value by key for an entity. Efficiently access individual metadata attributes, configuration values, or custom properties without fetching all custom data.

Parameters (* = required)

key (string)
context (string)
tagOrId (string): Entity identifier - can be a tag or CID.

listCustomEventsForEntity (8 params)

List custom events for an entity with optional filtering by type and time range. Supports pagination and filtering by event type, start time, and end time to retrieve historical event data.

Parameters (* = required)

page (integer): Page number to return, 0-indexed. Default: 0.
type (string)
context (string)
endTime (string): If provided, events with less than or equal to timestamp will be returned (a date-time without a time-zone in the ISO-8601 calendar system).
tagOrId (string): Entity identifier - can be a tag or CID.
pageSize (integer): Number of results to return per page, between 1 and 1000. Default: 250.
startTime (string): If provided, events with greater than or equal to timestamp will be returned (a date-time without a time-zone in the ISO-8601 calendar system).
timestamp (string): Use 'startTime' instead.

getCustomEventForEntityByUuid (3 params)

Retrieve a specific custom event by its UUID. Returns event details including title, description, timestamp, type, and any custom data associated with the event.

Parameters (* = required)

uuid (string)
context (string)
tagOrId (string): Entity identifier - can be a tag or CID.

getDeploysForEntity (4 params)

List all deployments for a specific catalog entity. Returns deployment history including timestamps, environments, SHAs, and deployment types in paginated format.

Parameters (* = required)

page (integer): Page number to return, 0-indexed. Default: 0.
context (string)
tagOrId (string): Entity identifier - can be a tag or CID.
pageSize (integer): Number of results to return per page, between 1 and 1000. Default: 250.

getCurrentOncallForEntity (2 params)

Retrieve the current on-call personnel for an entity in real-time. Shows who is actively responsible for incident response, including primary and secondary on-call, contact information, and rotation schedules.

Parameters (* = required)

context (string)
tagOrId (string): Entity identifier - can be a tag or CID.

getEntityDescriptor (3 params)

Cortex Catalog API - Access and manage your service catalog, teams, domains, and resources

Parameters (* = required)

yaml (boolean): When true, returns the YAML representation of the descriptor.
context (string)
tagOrId (string): Entity identifier - can be a tag or CID.

listEntityDestinationsForRelationshipType (5 params)

List all destinations for a certain relationship type & entity. Use the listRelationshipTypes tool to find the relevant relationshipTypeTag.

Parameters (* = required)

depth (string): Maximum depth to traverse in the relationship hierarchy. Defaults to 1 (i.e., direct relationships only).
context (string)
tagOrId (string): Entity identifier - can be a tag or CID.
includeArchived (boolean): If true will include relationships that traverse archived entities. Default: false.
relationshipTypeTag (string)

listEntitySourcesForRelationshipType (5 params)

List all sources for a certain relationship type & entity. Use the listRelationshipTypes tool to find the relevant relationshipTypeTag.

Parameters (* = required)

depth (string): Maximum depth to traverse in the relationship hierarchy. Defaults to 1 (i.e., direct relationships only).
context (string)
tagOrId (string): Entity identifier - can be a tag or CID.
includeArchived (boolean): If true will include relationships that traverse archived entities. Default: false.
relationshipTypeTag (string)

getCustomMetricData (7 params)

Retrieve custom metric data points for an entity. Returns paginated time-series data for a specific custom metric, with optional filtering by date range to analyze trends and patterns.

Parameters (* = required)

page (integer): Page number to return, 0-indexed. Default: 0.
context (string)
endDate (string): End date for the filter (inclusive).
tagOrId (string): Entity identifier - can be a tag or CID.
pageSize (integer): Number of results to return per page, between 1 and 1000. Default: 250.
startDate (string): Start date for the filter (inclusive). Default: 6 months.
customMetricKey (string): Key for the custom metric filter.

queryPointInTimeMetrics (14 params)

Execute point-in-time queries for one or more engineering metrics. Returns current metric values for specified time periods, with support for batch queries and optional period-over-period comparisons. Time range (startTime/endTime) cannot exceed 6 months (180 days). PREREQUISI...

Parameters (* = required)

limit (integer): Maximum number of results to return.
context (string)
endTime (string): End time for the query period.
filters (array): Filters to apply to the data.
groupBy (array): Fields to group results by.
metrics (array): List of metrics to query with their aggregation functions.
orderBy (array): Sort order for results.
nextPage (string): Pagination token for next page of results.
startTime (string): Start time for the query period.
comparison (value)
nestedGroupBy (array): Fields to group nested results by.
nestedMetrics (array): Optional nested metrics for advanced queries.
timeAttribute (string): Time attribute to use for queries.
nestedTimeAttribute (string): Time attribute for nested queries.

listMetricDefinitions (3 params)

List all available engineering metric definitions. USAGE - Call this endpoint BEFORE querying metrics (queryPointInTimeMetrics): 1. Once at start: Call with view='basic' to discover all available metrics - cache this response 2. Once per metric: Call with view='full' and key=M...

Parameters (* = required)

key (array)
view (string). Default: basic.
context (string)

listInitiatives (5 params)

List all initiatives in the organization with optional filters for draft and expired initiatives. View active improvement programs, strategic projects, and their current status to understand organizational priorities and track progress.

Parameters (* = required)

page (integer): Page number to return, 0-indexed. Default: 0.
context (string)
pageSize (integer): Number of results to return per page, between 1 and 1000. Default: 250.
includeDrafts (boolean): Whether or not to include draft Initiatives in the response. Default: false.
includeExpired (boolean): Whether or not to include expired Initiatives in the response. Default: false.

getInitiative (2 params)

Retrieve detailed information about a specific initiative including its goals, timeline, affected entities, scorecard targets, and current progress. Essential for understanding initiative scope and tracking achievement of objectives.

Parameters (* = required)

cid (string)
context (string)

getMyWorkspace (7 params)

TOOL for retrieving current user's owned resources and work items across the Cortex workspace. FLEXIBLE REQUEST STRUCTURE: The request accepts an object with optional fields for each resource type: - myEntitiesRequest: Fetch entities (services, resources, domains) owned by the...

Parameters (* = required)

context (string)
myTeamsRequest (object): Request for teams the user belongs to.
myOpenPRsRequest (object): Request for user's open pull requests across all Git repositories.
myEntitiesRequest (object): Request for all entities (services, resources, domains) owned by the user.
myWorkItemsRequest (object): Request for work items (Jira, Linear, Azure DevOps issues) assigned to the user.
myScorecardsRequest (object): Request for scorecards associated with the user's entities.
myRequestedReviewsRequest (object): Request for pull requests where the user is requested as a reviewer.

listRelationshipTypes (3 params)

List all available relationship types with pagination. View relationship type configurations to understand what kinds of relationships can be created between entities like services, resources, domains, and teams.

Parameters (* = required)

page (integer): Page number to return, 0-indexed. Default: 0.
context (string)
pageSize (integer): Number of results to return per page, between 1 and 1000. Default: 250.

getRelationshipTypeDetails (2 params)

Get complete details of a specific relationship type including its configuration, rules, source/destination filters, and inheritance settings. Essential for understanding how entities can be connected and what validation rules apply.

Parameters (* = required)

context (string)
relationshipTypeTag (string)

listEntityRelationships (4 params)

List all entity relationships/full graph for a specific relationship type across the entire organization. Returns paginated results showing all source-destination pairs, useful for understanding the complete relationship graph and finding all connections of a particular type.

Parameters (* = required)

page (integer): Page number to return, 0-indexed. Default: 0.
context (string)
pageSize (integer): Number of results to return per page, between 1 and 1000. Default: 250.
relationshipTypeTag (string)

listScorecards (7 params)

List all scorecards in the organization with optional filtering. View scorecard configurations to understand quality standards, compliance requirements, and maturity models. Supports filtering by groups, entities, and teams to find relevant scorecards.

Parameters (* = required)

page (integer): Page number to return, 0-indexed. Default: 0.
teams (array): Filter based on team (either tags or CIDs). Accepts a comma-delimited list of team tag or CIDs, please use only one type of identifier.
groups (array): Filter based on groups, which correspond to the `x-cortex-groups` field in the Catalog Descriptor. Accepts a comma-delimited list of groups.
context (string)
entities (array): Filter based on entity (either tags or CIDs). Accepts a comma-delimited list of entity tag or CIDs, please use only one type of identifier.
pageSize (integer): Number of results to return per page, between 1 and 1000. Default: 250.
showDrafts (boolean)

getScorecard (2 params)

Get complete details of a scorecard including its configuration, rules, levels, weights, exemption settings, and evaluation criteria. Essential for understanding how services are evaluated and what standards they must meet.

Parameters (* = required)

tag (string): Unique tag for the Scorecard.
context (string)

getScorecardNextStepsForEntity (3 params)

Get actionable next steps for an entity to improve its scorecard performance. Shows which rules need to be satisfied to reach the next maturity level, helping teams prioritize improvements and track progress toward compliance goals.

Parameters (* = required)

tag (string): Unique tag for the Scorecard.
context (string)
entityTag (string): The entity tag (`x-cortex-tag`) that identifies the entity.

listScorecardScores (5 params)

Retrieve scores for all entities evaluated by a specific scorecard. Returns paginated results showing how each service, resource, or domain performs against the scorecard's rules, including individual rule scores and overall scorecard scores.

Parameters (* = required)

tag (string): Unique tag for the Scorecard.
page (integer): Page number to return, 0-indexed. Default: 0.
context (string)
pageSize (integer): Number of results to return per page, between 1 and 1000. Default: 250.
entityTag (string): Entity tag (`x-cortex-tag`).

getTeamDetails (2 params)

Retrieve detailed information about a specific team by its tag or ID. Returns complete team data including members, Slack channels, metadata, and whether it's backed by an identity provider group.

Parameters (* = required)

context (string)
tagOrId (string): Entity identifier - can be a tag or CID.

query_docs (2 params)

Query the Cortex knowledge base for answers. Args: query (the question to ask Cortex docs). Returns: response from Cortex, including answer and metadata.

Parameters (* = required)

query (string)
context (string)

get_more_tools (1 param)

Check for additional tools whenever your task might benefit from specialized capabilities - even if existing tools could work as a fallback.

Parameters (* = required)

context (value)

Cortex

🧠 Try Cortex in your browser — zero install, 124KB WASM, runs entirely client-side.

If Cortex helps your AI remember, give it a ⭐ — it takes 1 second and helps others discover the project.


Memory for AI agents that never leaves your device.

Private. Free. Local. — a memory engine for personal AI agents.

Your AI's memory lives on your device — your data never leaves, never costs, never spies. Pure Rust. 3.8MB binary. No third-party servers in the data path, zero telemetry, zero cost. Syncs through your own cloud storage. (On-device semantic search downloads a ~30MB model once on first use, then runs fully offline — or go 100% offline with CORTEX_NO_EMBEDDINGS=1. See Security & Privacy.)

Cortex remembering across sessions — a real, local cortex-mcp-server recording

What you get

🔒 Private by default — memories live in a local SQLite file, never leave your device, zero telemetry (CI-enforced).
🧠 Real memory, not a text file — 4 tiers, multi-signal retrieval, self-correcting Bayesian beliefs, a cross-channel people graph.
⚡ Sub-millisecond — 156µs ingest, 568µs search. ~528× faster than cloud memory APIs, with no network round-trip.
🔌 Drop-in for any agent — one MCP server gives Claude Code / Claude Desktop (or any MCP client) persistent cross-session memory.
☁️ Yours across devices — optional end-to-end-encrypted sync through your own iCloud / Drive / Dropbox. No server of ours, ever.

See it remember across sessions — ~30 seconds:

brew install gambletan/tap/cortex-mcp-server          # or: cargo build --release -p cortex-mcp-server
claude mcp add cortex-memory -- cortex-mcp-server ~/.cortex/memory.db

Tell Claude "remember I deploy on Fly.io and always run tests before pushing." Open a brand-new session and ask "how do I deploy this project?" — it answers from memory, 100% on your machine.


LLMs start blank every session — they forget your name, your preferences, yesterday's conversation, last week's decision. The usual fixes are flat text files (no ranking, no decay), keyword grep, or cloud APIs that add 200–500ms, charge you, and ship your personal data to someone else's server. Cortex gives your AI structured, self-evolving long-term memory that persists across sessions and channels — all local, all yours. Your memories are not a cloud provider's training data, a startup's monetization asset, or a surveillance target.

Cortex vs Mem0 vs OpenAI Memory

Feature	Cortex	Mem0	OpenAI Memory
Privacy	100% local, zero cloud	Cloud API (your data on their servers)	OpenAI servers
Latency	156µs ingest, 568µs search	~200-500ms	~300-800ms
Cost	Free, forever	$99+/mo (Pro)	ChatGPT Plus ($20/mo)
Memory tiers	4 (Working/Episodic/Semantic/Procedural)	1 (flat)	1 (flat)
Bayesian beliefs	Self-correcting with evidence	No	No
People graph	Cross-channel identity resolution	Paid tier only	No
Conversation compression	Automatic session summarization	No	No
Relationship inference	Pattern-based (EN + CN)	No	No
Temporal retrieval	Intent-aware ("recently" / "first time")	No	No
Contradiction detection	Automatic with confidence scores	No	No
Consolidation	Episodic → Semantic auto-promotion	No	No
Context injection	Token-budgeted LLM-ready output	Manual	Automatic but opaque
Import/Export	Full JSON backup & restore	API only	No export
Self-hosted	Native binary, Docker, MCP	Cloud only	Cloud only
Binary size	3.8 MB	npm package	N/A
Dependencies	0 runtime services (single binary)	Node.js + cloud	N/A
Open source	MIT	Partial	No
Encryption	AES-256-GCM encrypted sync (opt-in)	No	No
Key rotation	Versioned envelopes, forward secrecy	No	No
Privacy levels	Private (default, never syncs) / Shared / Public — per-memory opt-in, demote retracts from other devices	No	No
Tool authorization	Deny-by-default capability policy on the MCP surface	No	No
Zero telemetry	No analytics, no phone-home, verifiable	Unknown	No
Chinese NLP	Native (inference, retrieval, relationships)	No	Limited
Namespace isolation	Per-user/context memory separation	No	No
Plugin system	Compile-time hooks for ingest/retrieve/consolidation	No	No
MCP tools	30 tools for Claude/LLM integration	3rd party	N/A

Performance Benchmarks

Operation	Cortex	Mem0 (cloud)	File-based
Ingest	156µs	~200ms	~1ms
Search (top-10)	568µs	~300ms	~10ms
Context generation	621µs	~500ms	manual
Belief update	66µs	N/A	N/A
People graph	51µs	paid tier	N/A
Structured facts	45µs	N/A	N/A
1K memories search	1.6ms	~500ms	~50ms

Search is roughly 528x faster than Mem0 cloud (568µs versus ~300ms), with features that neither Mem0 nor OpenAI Memory offers.

Note: Benchmarks include proactive inference (auto-extracting facts, preferences, relationships) on every ingest. Raw ingest without inference is ~15µs. Numbers from cargo bench on M-series Mac.

LoCoMo Benchmark (ACL 2024)

Academic-grade long-term conversation memory evaluation — 10 conversations, 1540 QA pairs across 4 categories.

System	Single-hop	Multi-hop	Open-domain	Temporal	Overall
Backboard	89.4%	75.0%	91.2%	91.9%	90.0%
MemMachine v0.2	—	—	—	—	84.9%
Cortex	72.5%	59.5%	88.8%	74.1%	73.7%
Mem0-Graph	65.7%	47.2%	75.7%	58.1%	68.4%
Mem0	67.1%	51.2%	72.9%	55.5%	66.9%
OpenAI Memory	—	—	—	—	52.9%

Key findings:

Open-domain: 88.8%, ahead of Mem0 (72.9%) by 15.9 percentage points
Temporal: 74.1%, ahead of Mem0 (55.5%) by 18.6 points
Single-hop: 72.5%, ahead of Mem0 (67.1%) by 5.4 points
Multi-hop: 59.5%, ahead of Mem0 (51.2%) by 8.3 points
Overall: 73.7%, ahead of Mem0 (66.9%) by 6.8 points and OpenAI Memory (52.9%) by 20.8 points

Cortex outperforms Mem0 on all 4 categories — while running 100% locally, end-to-end encrypted, at $0 cost.

Setup: Claude Sonnet 4 (QA + judge), nomic-embed-text (embeddings via Ollama), top-30 retrieval. Reproducible with that setup: python3 bench/locomo_bench.py (needs ANTHROPIC_API_KEY + a local Ollama with nomic-embed-text). Numbers measured on the v1.7 engine; the v2.2 retrieval beam fix (paraphrase recall 40%→90% at 5K, see docs/scale-test-2026-06-13.md) has not yet been re-run on LoCoMo, so these are reported as the last verified figures, not a v2.2 claim.

Architecture

Cortex implements a 4-tier memory model inspired by human cognition:

                    +---------------------+
                    |   Working Memory    |  Current session context
                    +---------------------+
                              |
                    +---------------------+
                    |   Episodic Memory   |  Raw experiences: conversations, events, observations
                    +---------------------+
                              |  consolidation (decay, promotion, pattern extraction)
                    +---------------------+
                    |   Semantic Memory   |  Distilled facts, preferences, relationships
                    +---------------------+
                              |
                    +---------------------+
                    | Procedural Memory   |  Learned routines, user-specific workflows
                    +---------------------+

Working holds the current session scratch pad. Episodic stores raw experiences with timestamps and source metadata. The Consolidation Engine periodically promotes recurring patterns into Semantic facts and decays stale episodes. Procedural captures learned workflows and routines.

Key Components

People Graph

Cross-channel identity resolution. The same person messaging you on Telegram, emailing you, and showing up in calendar events gets unified into a single identity node. Interactions, relationship strength, and communication patterns are tracked per-person.
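One common way to model this kind of merging is a union-find structure over (channel, handle) pairs. The sketch below is an illustration of the idea only: the class name, the method names, and the calendar handle are made up, and Cortex's actual resolution logic (how it decides two handles match) is not shown here.

```python
class IdentityGraph:
    """Union-find over (channel, handle) pairs; each set is one person."""

    def __init__(self) -> None:
        self._parent = {}

    def _find(self, node):
        # Register unseen nodes as their own root.
        self._parent.setdefault(node, node)
        while self._parent[node] != node:
            # Path halving keeps later lookups short.
            self._parent[node] = self._parent[self._parent[node]]
            node = self._parent[node]
        return node

    def link(self, a, b) -> None:
        """Record evidence that two handles belong to the same person."""
        root_a, root_b = self._find(a), self._find(b)
        if root_a != root_b:
            self._parent[root_b] = root_a

    def same_person(self, a, b) -> bool:
        return self._find(a) == self._find(b)


graph = IdentityGraph()
graph.link(("telegram", "sarah_telegram"), ("email", "sarah@stripe.com"))
graph.link(("email", "sarah@stripe.com"), ("calendar", "sarah-weekly-sync"))
print(graph.same_person(("telegram", "sarah_telegram"), ("calendar", "sarah-weekly-sync")))
```

Because merges are transitive, linking Telegram to email and email to calendar is enough to unify all three handles under one node.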

Bayesian Belief System

Self-correcting understanding of the world. Beliefs are formed from evidence, updated with each new observation, and can be contradicted. Confidence scores reflect actual certainty rather than recency bias.

cortex.observe_belief("user_prefers_morning_meetings", true, 0.8)?;
cortex.observe_belief("user_prefers_morning_meetings", false, 0.6)?;
// Confidence adjusts automatically via Bayesian update
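To make the update concrete, here is a minimal sketch of one possible rule: a Beta posterior where each observation adds a fractional pseudo-count equal to its weight. This is an assumption for illustration; the README does not state which update rule Cortex uses, and the `Belief` class below is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Belief:
    """Beta(alpha, beta) posterior over 'this statement is true'."""

    alpha: float = 1.0  # uniform prior: one pseudo-count for
    beta: float = 1.0   # and one against

    def observe(self, supports: bool, weight: float) -> None:
        """Add `weight` pseudo-observations for or against the belief."""
        if not 0.0 <= weight <= 1.0:
            raise ValueError("weight must be in [0, 1]")
        if supports:
            self.alpha += weight
        else:
            self.beta += weight

    @property
    def confidence(self) -> float:
        """Posterior mean probability that the belief holds."""
        return self.alpha / (self.alpha + self.beta)


belief = Belief()
belief.observe(True, 0.8)    # mirrors the first observe_belief call above
belief.observe(False, 0.6)   # contradicting evidence pulls confidence back
print(round(belief.confidence, 3))
```

Under this rule a contradicting observation never resets the belief; it only shifts the balance of accumulated evidence.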

Consolidation Engine

Episodic-to-semantic promotion, decay of stale memories, and pattern extraction. Runs as a background cycle that keeps the memory store lean and queryable. Returns a report of what was promoted, decayed, and merged.
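A consolidation cycle of this shape can be sketched as follows. The thresholds, the decay factor, and the `Episode` / `Report` types are assumptions chosen for the example, not Cortex's actual parameters; merging is left out to keep the sketch short.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Episode:
    topic: str
    salience: float


@dataclass
class Report:
    promoted: list = field(default_factory=list)
    decayed: int = 0


def consolidate(episodes, decay=0.9, promote_at=3, drop_below=0.1):
    """One cycle: promote recurring topics, decay all episodes, drop faded ones."""
    report = Report()
    counts = Counter(e.topic for e in episodes)
    # A topic seen often enough becomes a candidate semantic fact.
    report.promoted = sorted(t for t, n in counts.items() if n >= promote_at)
    survivors = []
    for episode in episodes:
        episode.salience *= decay
        if episode.salience < drop_below:
            report.decayed += 1
        else:
            survivors.append(episode)
    return survivors, report


episodes = [Episode("sarah", 0.8) for _ in range(3)] + [Episode("weather", 0.1)]
remaining, report = consolidate(episodes)
print(report.promoted, report.decayed, len(remaining))
```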

Multi-signal Retrieval

Queries combine five signals for relevance ranking:

Similarity -- cosine similarity between the query embedding and each memory's embedding
Temporal -- recency weighting with configurable decay
Salience -- importance scoring from access patterns and explicit hints
Social -- boost for memories involving specific people
Channel -- filter or boost by source channel
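The five signals can be combined as a weighted sum. The sketch below assumes a simple linear blend with an exponential recency half-life; the weights, the half-life, and the dictionary layout of a memory are all illustrative assumptions, not Cortex's tuned values.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def score(memory, query_vec, now, person=None, channel=None,
          weights=(0.5, 0.2, 0.15, 0.1, 0.05), half_life_days=30.0):
    """Blend similarity, recency, salience, social, and channel signals."""
    w_sim, w_time, w_sal, w_soc, w_chan = weights
    age_days = (now - memory["timestamp"]) / 86400.0
    recency = 0.5 ** (age_days / half_life_days)  # halves every half_life_days
    social = 1.0 if person and person in memory["people"] else 0.0
    chan = 1.0 if channel and memory["channel"] == channel else 0.0
    return (w_sim * cosine(memory["embedding"], query_vec)
            + w_time * recency
            + w_sal * memory["salience"]
            + w_soc * social
            + w_chan * chan)


now = 1_700_000_000.0
fresh = {"embedding": [1.0, 0.0], "timestamp": now, "salience": 0.5,
         "people": ["sarah"], "channel": "email"}
stale = {"embedding": [0.0, 1.0], "timestamp": now - 30 * 86400, "salience": 0.5,
         "people": [], "channel": "telegram"}
ranked = sorted([stale, fresh], key=lambda m: score(m, [1.0, 0.0], now, person="sarah"),
                reverse=True)
print(ranked[0]["channel"])
```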

Context Injection Protocol

Generates LLM-ready context strings from memory state. Pass a token budget, optional channel/person filters, and get back a structured text block your LLM can consume directly.
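Token budgeting can be approximated with a greedy packer: take memories in descending relevance and keep each one that still fits. The four-characters-per-token estimate and both function names are assumptions for the sketch; a real implementation would use the target model's tokenizer.

```python
def estimate_tokens(text: str) -> int:
    """Crude estimate: roughly four characters per token."""
    return max(1, len(text) // 4)


def build_context(memories, max_tokens):
    """Greedily pack the highest-scored memories into the token budget.

    `memories` is a list of (score, text) pairs. Returns (context, tokens_used).
    """
    lines, used = [], 0
    for _score, text in sorted(memories, key=lambda m: m[0], reverse=True):
        line = f"- {text}"
        cost = estimate_tokens(line)
        if used + cost > max_tokens:
            continue  # too large for the remaining budget; try smaller items
        lines.append(line)
        used += cost
    return "\n".join(lines), used


context, used = build_context(
    [(0.9, "Sarah works at Anthropic"), (0.5, "x" * 400), (0.2, "Prefers dark mode")],
    max_tokens=12,
)
print(context)
print(used)
```

Skipping an oversized item instead of stopping lets smaller, lower-ranked memories still use the leftover budget.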

Storage

SQLite for persistence, in-memory vector index for fast similarity search. Single-file database, no external services required. Designed for edge deployment -- runs on a laptop, a Raspberry Pi, or a server.

Cloud Sync

Sync memories across devices through your own cloud storage — no third-party server involved.

Device A (Mac)              Your Cloud Storage              Device B (iPhone)
┌──────────┐         ┌──────────────────────┐         ┌──────────┐
│ SQLite DB │ ──W──>  │ iCloud / GDrive /    │  <──R── │ SQLite DB│
│ (local)   │         │ OneDrive / Dropbox   │         │ (local)  │
│           │ <──R──  │                      │  ──W──> │          │
└──────────┘         └──────────────────────┘         └──────────┘

Changelog-based: Each device writes append-only operation logs to its own subfolder
No conflicts: Devices never write to the same file. Merge uses Last-Writer-Wins with Hybrid Logical Clocks
Encrypted: AES-256-GCM encryption (opt-in). Even if your cloud account is compromised, memories stay private
Tamper-evident: the sync manifest and every operation carry an HMAC; tampered or plaintext-injected oplog lines are rejected, and a manifest without integrity protection refuses to load (no key-rollback path)
Key rotation & forward secrecy: rotate to a new key version (ENC2 envelopes) without re-encrypting history; old versions stay readable, new writes are unreadable to a leaked old key
Privacy-aware, per-memory opt-in: Private memories (the default) never leave your device. Mark a memory shared to sync it; demote it back to private and a retraction deletes it from your other devices (local copy kept)
Survives restarts: sync settings persist in the database (passphrase never touches disk — macOS login Keychain or CORTEX_SYNC_PASSPHRASE); the server resumes sync and starts background pull (30s poll + fs watcher) automatically

Supported providers: iCloud Drive, Google Drive, OneDrive, Dropbox (auto-detected).
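The merge rule described in the list above (per-device logs, Last-Writer-Wins ordered by Hybrid Logical Clocks) can be sketched in a few lines. This is a simplified illustration: the `HLC` fields, the op tuple layout, and the send-only `tick` are assumptions, and a full HLC also advances on receiving remote timestamps.

```python
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class HLC:
    """Hybrid logical clock: wall-clock ms, logical counter, device ID tiebreak."""

    wall_ms: int
    counter: int
    device: str


def tick(last: HLC, now_ms: int) -> HLC:
    """Next local timestamp; never moves backwards even if the wall clock does."""
    if now_ms > last.wall_ms:
        return HLC(now_ms, 0, last.device)
    return HLC(last.wall_ms, last.counter + 1, last.device)


def merge(oplogs):
    """Last-Writer-Wins: for each key keep the value with the greatest HLC."""
    state = {}
    for log in oplogs:  # one append-only log per device
        for stamp, key, value in log:
            if key not in state or stamp > state[key][0]:
                state[key] = (stamp, value)
    return {key: value for key, (_stamp, value) in state.items()}


mac_log = [(HLC(1000, 0, "mac"), "sarah.employer", "Stripe")]
phone_log = [(HLC(1000, 1, "iphone"), "sarah.employer", "Anthropic")]
print(merge([mac_log, phone_log]))
```

Since the winner depends only on the timestamps, every device reaches the same state regardless of the order in which it reads the logs.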

use cortex_core::sync::SyncConfig;
use cortex_core::types::PrivacyLevel;

// Enable sync with encryption (settings persist; passphrase goes to the OS keychain)
let config = SyncConfig::new(sync_dir, device_id, device_name)
    .with_encryption("my-strong-passphrase");
cortex.enable_sync(config)?;

// Opt a memory into sync — everything is Private unless you say otherwise
cortex.set_memory_privacy(mem_id, PrivacyLevel::Shared { scope: "all".into() })?;

// Pull changes from other devices (also happens automatically in the background)
let applied = cortex.sync_pull()?;
println!("Applied {} remote changes", applied);

Security & Privacy

Feature	Detail
Encryption	AES-256-GCM with Argon2id key derivation (per-line random nonce)
Key rotation	Versioned `ENC2` envelopes with per-version passphrase-derived keys — forward secrecy against AES-key exfiltration, no full re-encryption needed
Integrity	HMAC on the sync manifest and on every sync operation; plaintext lines in an encrypted oplog are rejected outright (injection defense)
Privacy levels	Private (default, never syncs), Shared, Public — set at ingest (`privacy` arg / `--privacy`) or later (`memory_set_privacy`); demoting to Private retracts the memory from other devices
Capability policy	Deny-by-default tool authorization on the MCP surface: a `capabilities.json` grants tool groups (`read`/`write`/`sync`/`plugins`) or exact tools; ungranted tools are invisible and uncallable; malformed policy fails closed
Query budget	Every retrieval is bounded (candidate cap + wall-clock cap) — query cost never scales with total store size; DoS guard and timing-side-channel bound in one
Secret handling	Sync passphrase is never written to disk by Cortex — macOS login Keychain or env var only; missing passphrase fails safe (sync off, never plaintext)
Memory zeroization	Sensitive data cleared from RAM on drop (`zeroize` crate)
Zero telemetry	No analytics, no phone-home, no user data ever leaves the device — enforced in CI (`scripts/check-no-network-egress.sh`): the build fails if any network/telemetry crate enters `cortex-core`'s default tree, and the check also proves the `--no-default-features` binary is completely zero-network.
Embedding model fetch (one-time)	The default `cortex-mcp-server` enables on-device semantic search, which downloads a ~30 MB model (all-MiniLM-L6-v2) from the Hugging Face CDN on first ingest, then runs fully offline and sends none of your data. For a 100%-offline setup: run with `CORTEX_NO_EMBEDDINGS=1` (keyword/FTS recall, zero network) or build `--no-default-features`. A one-time stderr notice is printed before any download — nothing is ever fetched silently.
No accounts	No API key, no registration, no cloud dependency

See SECURITY.md for the full threat model.
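The integrity row in the table can be illustrated with a per-line HMAC check using only the Python standard library. The line format (JSON payload, tab, hex tag) and the function names are assumptions for this sketch; Cortex's real oplog format is defined in its Rust sync module and SECURITY.md.

```python
import hashlib
import hmac
import json


def sign(key: bytes, op: dict) -> str:
    """Serialize an op canonically and append its HMAC-SHA256 tag."""
    payload = json.dumps(op, sort_keys=True, separators=(",", ":"))
    tag = hmac.new(key, payload.encode("utf-8"), hashlib.sha256).hexdigest()
    return f"{payload}\t{tag}"


def verify(key: bytes, line: str) -> dict:
    """Return the op if the tag matches; raise ValueError otherwise."""
    payload, sep, tag = line.rpartition("\t")
    if not sep:
        raise ValueError("missing integrity tag: line rejected")
    expected = hmac.new(key, payload.encode("utf-8"), hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    if not hmac.compare_digest(expected.encode("utf-8"), tag.encode("utf-8")):
        raise ValueError("HMAC mismatch: line rejected")
    return json.loads(payload)


key = b"example-key-derived-elsewhere"
line = sign(key, {"op": "ingest", "id": 1, "text": "hello"})
print(verify(key, line))
```

A line with no tag at all fails the same way as a tampered one, which is the behavior the table describes for plaintext injected into an authenticated log.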

Prerequisites

Install the Rust toolchain (provides cargo):

curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

After installation, either restart your terminal or run:

source "$HOME/.cargo/env"

Verify:

cargo --version

Real-World Example: A Personal AI That Actually Remembers

Imagine your AI assistant across a week of real conversations:

# Day 1 — You chat on Telegram
You: "Sarah works at Stripe. She's interested in our API."

  Cortex auto-extracts:
  ├── episodic memory stored (156µs)
  ├── fact: Sarah → works_at → Stripe (confidence: 0.70)
  └── person resolved: sarah_telegram

# Day 2 — Sarah emails you
From: sarah@stripe.com
"Here's the technical spec we discussed."

  Cortex:
  ├── person resolved: sarah@stripe.com → merged with sarah_telegram
  │   (same person, different channel — automatic identity resolution)
  └── fact: Sarah → sent → technical spec

# Day 3 — You ask your AI
You: "What's the status with Stripe?"

  Cortex retrieves (568µs):
  ├── Sarah works at Stripe (semantic fact)
  ├── Sarah is interested in our API (episodic, Day 1)
  ├── She sent technical spec (episodic, Day 2)
  └── Cross-channel context: Telegram + Email unified under one person

  Your AI responds with full context — no "sorry, I don't remember" 🎯

# Day 5 — New information arrives
You: "Sarah now works at Anthropic."

  Cortex:
  ├── contradiction detected: Sarah works_at Stripe vs Sarah works_at Anthropic
  ├── old fact superseded + decayed: Stripe (salience ×0.3, kept as history)
  ├── new fact stored: Sarah → works_at → Anthropic
  └── current employer now ranks first; self-correcting, no manual cleanup

  (Third-party relations are extracted from natural-language verbs —
   "works at / works for / joined / now works at", "runs on", "hosted in",
   "manages", "part of", … — between two proper-noun entities.)

# Day 7 — Consolidation runs
  Cortex auto-consolidation:
  ├── 3 episodic memories about Sarah → promoted to semantic summary
  ├── stale memories from other topics → decayed
  └── pattern detected: you have recurring Monday meetings

All of this happens locally in <1ms per operation. No cloud. No API calls. No one else sees your data.
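The Day 5 step (a new fact superseding an old one while keeping it as history) can be sketched like this. The `Fact` type, the set of single-valued predicates, and the helper names are hypothetical; only the ×0.3 salience penalty is taken from the walkthrough above.

```python
from dataclasses import dataclass

# Predicates assumed to hold one current value per subject (hypothetical list).
FUNCTIONAL = {"works_at"}


@dataclass
class Fact:
    subject: str
    predicate: str
    obj: str
    salience: float = 1.0
    superseded: bool = False


def add_fact(store: list, new: Fact, penalty: float = 0.3) -> list:
    """Insert `new`; return the older facts it contradicts (kept as history)."""
    conflicts = [
        f for f in store
        if new.predicate in FUNCTIONAL
        and not f.superseded
        and f.subject == new.subject
        and f.predicate == new.predicate
        and f.obj != new.obj
    ]
    for old in conflicts:
        old.superseded = True
        old.salience *= penalty  # demoted, not deleted
    store.append(new)
    return conflicts


def current(store: list, subject: str, predicate: str):
    """Highest-salience value for (subject, predicate), or None."""
    matches = [f for f in store if f.subject == subject and f.predicate == predicate]
    return max(matches, key=lambda f: f.salience).obj if matches else None


facts = []
add_fact(facts, Fact("Sarah", "works_at", "Stripe"))
add_fact(facts, Fact("Sarah", "works_at", "Anthropic"))
print(current(facts, "Sarah", "works_at"))
```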

Install

Homebrew (macOS / Linux)

brew tap gambletan/tap
brew install cortex-mcp-server

From source

cargo build --release -p cortex-mcp-server
cp target/release/cortex-mcp-server ~/.local/bin/

Official packages (avoid look-alikes)

Cortex is published under the cortex-ai-memory name. Several similarly-named packages on npm/PyPI are not affiliated with this project — use exactly these:

Ecosystem	Official package	Use for
Binary / MCP server	GitHub Releases, or `brew install gambletan/tap/cortex-mcp-server`	the memory engine (primary)
PyPI	`cortex-ai-memory`	Python bindings
npm	`@cortex-ai-memory/cortex-memory` (scoped)	OpenClaw memory plugin

⚠️ Not us: npm cortex-mcp, npm cortex-ai-memory (unscoped), PyPI cortex-memory. The source of truth is always this repo — github.com/gambletan/cortex. When in doubt, the binary from Releases is the canonical install.

Quick Start

use cortex_core::Cortex;

// Open (or create) a memory database
let cortex = Cortex::open("memory.db")?;

// Ingest a memory from a Telegram conversation
let embedding = your_embedding_fn("Met with Alice about the Q3 roadmap");
cortex.ingest(
    "Met with Alice about the Q3 roadmap",
    "telegram",               // source channel
    Some("alice_123"),         // user ID (triggers identity resolution)
    Some(0.8),                 // salience hint
    Some(embedding),           // vector embedding
)?;

// Add a semantic fact directly
cortex.add_fact(
    "Alice", "works_at", "Acme Corp",
    0.95, "telegram", None,
)?;

// Store a preference
cortex.add_preference("timezone", "America/Los_Angeles", 0.9)?;

// Retrieve relevant memories
let query_embedding = your_embedding_fn("What do I know about Alice?");
let results = cortex.retrieve(
    "What do I know about Alice?",
    5,                         // top-k
    None,                      // any channel
    None,                      // any person
    Some(query_embedding),     // vector for similarity search
)?;

// Generate LLM-ready context (token-budgeted)
let context = cortex.get_context(
    2000,                      // max tokens
    Some("telegram"),          // channel filter
    None,                      // no person filter
)?;
// Pass `context` as system/user message prefix to your LLM

// Run consolidation (call periodically)
let report = cortex.run_consolidation()?;
println!("Promoted: {}, Decayed: {}", report.promoted, report.decayed);

Python Bindings

Coming soon via PyO3. The cortex-python crate will expose the full API as a native Python module:

from cortex import Cortex

cx = Cortex.open("memory.db")
cx.ingest("Had lunch with Bob at the Thai place", channel="imessage", user_id="bob")
results = cx.retrieve("Where does Bob like to eat?", limit=5)

Integration with unified-channel-hub

Cortex is designed as the memory layer for unified-channel-hub. Messages flow in from any channel adapter, Cortex ingests and indexes them, and the context injection protocol feeds relevant memory back to your LLM before each response.

Telegram ─┐                          ┌─ Context
Discord  ─┤  unified-channel-hub  →  │  Cortex  →  LLM
Email    ─┤  (ingest)                 │  (retrieve + inject)
Calendar ─┘                          └─ Response

Integration with LangGraph

Add persistent memory to any LangGraph agent via langchain-mcp-adapters — no custom code needed.

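import os

from langchain_mcp_adapters.client import MultiServerMCPClient
from langgraph.prebuilt import create_react_agent
from langchain_openai import ChatOpenAI

model = ChatOpenAI(model="gpt-4o")

# Context-manager form used by langchain-mcp-adapters before 0.1.0; later
# releases drop it in favor of `tools = await client.get_tools()`.
# Run this inside an async function (or a notebook cell).
async with MultiServerMCPClient({
    "cortex": {
        "command": "cortex-mcp-server",
        # "~" is not shell-expanded in subprocess arguments, so expand it here
        "args": [os.path.expanduser("~/.cortex/memory.db")],
        "transport": "stdio",
    }
}) as client:
    agent = create_react_agent(model, client.get_tools())
    # Agent now has all 30 Cortex memory tools
    result = await agent.ainvoke({
        "messages": [{"role": "user", "content": "What do you remember about Alice?"}]
    })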

Your LangGraph agent gets instant access to memory_search, memory_ingest, fact_add, belief_observe, person_resolve, and 25 more tools — all running locally.

Integration with DeerFlow (ByteDance)

Cortex works as a persistent memory layer for DeerFlow — ByteDance's open-source multi-agent orchestration platform. Zero code changes needed.

# Add to DeerFlow config.yaml
mcp_servers:
  cortex-memory:
    command: cortex-mcp-server
    args:
      - ~/.cortex/deerflow.db

All DeerFlow agents (Telegram, Slack, Feishu) get instant access to 30 memory tools — cross-session memory, fact storage, people graph, and belief tracking across all channels.

CLI

Cortex doubles as a standalone CLI tool — no MCP client required.

$ cortex-mcp-server --help
Cortex memory engine — MCP server & CLI tools

Usage: cortex-mcp-server [DB_PATH] [COMMAND]

Commands:
  ingest  Store a new memory
  search  Search memories
  stats   Show memory statistics
  sync    Show cloud sync status and detected providers
  export  Export all data as JSON
  import  Import data from JSON file
  info    Show version, DB path, and capabilities
  help    Print this message or the help of the given subcommand(s)

Arguments:
  [DB_PATH]  Path to the Cortex database file (default: ~/.cortex/memory.db)

Options:
  -h, --help     Print help
  -V, --version  Print version

Examples:

# Store a memory
cortex-mcp-server ~/.cortex/memory.db ingest "Met with Alice about Q3 roadmap"
cortex-mcp-server ~/.cortex/memory.db ingest -c telegram "Sarah now works at Anthropic"

# Search
cortex-mcp-server ~/.cortex/memory.db search "Alice"
cortex-mcp-server ~/.cortex/memory.db search -l 10 "Q3 roadmap"

# Stats
cortex-mcp-server ~/.cortex/memory.db stats

# Cloud sync
cortex-mcp-server ~/.cortex/memory.db sync                        # status
cortex-mcp-server ~/.cortex/memory.db sync enable                  # auto-detect provider
cortex-mcp-server ~/.cortex/memory.db sync enable -p icloud        # specific provider
cortex-mcp-server ~/.cortex/memory.db sync pull                    # pull remote changes

# Export / Import (backup & restore)
cortex-mcp-server ~/.cortex/memory.db export -o backup.json
cortex-mcp-server ~/.cortex/new.db import backup.json

# Version & capabilities
cortex-mcp-server ~/.cortex/memory.db info

No subcommand = MCP stdio mode (for Claude Code / Claude Desktop integration).

MCP Server (Claude Code / Claude Desktop)

Cortex ships as an MCP server — works with any MCP-compatible client.

Setup

1. Build & install the binary:

mkdir -p ~/.local/bin ~/.cortex
cargo build --release -p cortex-mcp-server
cp target/release/cortex-mcp-server ~/.local/bin/

2. Register as MCP server:

Claude Code (CLI):

# Global (all projects)
claude mcp add cortex --scope user -- ~/.local/bin/cortex-mcp-server ~/.cortex/memory.db

# Or per-project
claude mcp add cortex -- ~/.local/bin/cortex-mcp-server ~/.cortex/memory.db

Claude Desktop — add to ~/Library/Application Support/Claude/claude_desktop_config.json:

{
  "mcpServers": {
    "cortex": {
      "command": "/Users/you/.local/bin/cortex-mcp-server",
      "args": ["/Users/you/.cortex/memory.db"]
    }
  }
}

3. Allow tools in "don't ask" mode:

Add to ~/.claude/settings.json → permissions.allow:

"mcp__cortex__*"

Note: MCP tool permissions do not support parentheses format (e.g. mcp__cortex__memory_ingest(*)). Use the wildcard mcp__cortex__* instead.

4. Make it automatic — add to your CLAUDE.md (project or global ~/.claude/CLAUDE.md):

# Memory (Cortex)
You have persistent memory via Cortex MCP tools. Use them automatically:
- Start of conversation: call `memory_context` to load what you know about the user
- When the user shares a preference, fact, or personal info: call `memory_ingest` to store it
- When you learn a structured fact: call `fact_add` (e.g. "User works_at Google")
- When you detect a preference: call `preference_set` (e.g. editor=neovim)
- When evidence supports or contradicts a belief: call `belief_observe`
- When talking to someone new: call `person_resolve` to track identity
- Periodically: call `memory_consolidate` to clean up stale memories

5. Auto-inject memory on session start (Claude Code hooks — fully automatic):

Create ~/.claude/hooks/cortex-memory-inject.sh:

#!/bin/bash
CORTEX_BIN="${CORTEX_BIN:-$HOME/.local/bin/cortex-mcp-server}"
CORTEX_DB="${CORTEX_DB:-$HOME/.cortex/memory.db}"
[ -x "$CORTEX_BIN" ] || exit 0

printf '%s\n%s\n%s\n' \
  '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2024-11-05","capabilities":{},"clientInfo":{"name":"hook","version":"1.0"}}}' \
  '{"jsonrpc":"2.0","method":"notifications/initialized"}' \
  '{"jsonrpc":"2.0","id":2,"method":"tools/call","params":{"name":"memory_context","arguments":{"max_tokens":1500}}}' \
  | "$CORTEX_BIN" "$CORTEX_DB" 2>/dev/null \
  | grep '"id":2' \
  | python3 -c "import sys,json; r=json.load(sys.stdin); print(r['result']['content'][0]['text'])" 2>/dev/null

Add to ~/.claude/settings.json:

{
  "hooks": {
    "SessionStart": [
      {
        "matcher": "",
        "hooks": [
          {
            "type": "command",
            "command": "~/.claude/hooks/cortex-memory-inject.sh"
          }
        ]
      }
    ]
  }
}

Now every new Claude Code session automatically loads your memory context — zero manual effort. Claude learns as you work and remembers across sessions.
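The hook's `grep '"id":2'` filter depends on the server serializing the id without a space after the colon. If that ever changes, a small parser that reads each JSON-RPC line is more forgiving. The function below is an illustrative alternative to the grep-plus-one-liner pipeline, not something shipped with Cortex; it assumes the response shape the hook already relies on (`result.content[0].text`).

```python
import json
from typing import Optional


def extract_context(stdout: str, request_id: int = 2) -> Optional[str]:
    """Return the text of the tools/call result with the given JSON-RPC id."""
    for line in stdout.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            message = json.loads(line)
        except json.JSONDecodeError:
            continue  # ignore anything that is not a JSON message
        if not isinstance(message, dict) or message.get("id") != request_id:
            continue
        content = message.get("result", {}).get("content", [])
        if content and content[0].get("type", "text") == "text":
            return content[0].get("text")
    return None


sample = (
    '{"jsonrpc":"2.0","id":1,"result":{}}\n'
    '{"jsonrpc": "2.0", "id": 2, "result": {"content": '
    '[{"type": "text", "text": "User prefers dark mode"}]}}\n'
)
print(extract_context(sample))
```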

Cross-Device Memory Sync

Your Claude's memory follows you across all your devices — MacBook, iMac, work laptop — through your own cloud storage.

Enable sync (one command):

You: "Enable cross-device memory sync"

Claude calls sync_enable → auto-detects iCloud Drive →
  generates device ID + AES-256-GCM encryption key → done.

Output:
  Provider:   iCloud Drive
  Encryption: AES-256-GCM
  Passphrase: a1b2c3...  ← save this for your other devices

On your second device — one script does everything (build/install, wait for iCloud, join, restore):

git clone https://github.com/gambletan/cortex && cortex/scripts/setup-device-sync.sh
# Prompts for your passphrase (hidden input; or set CORTEX_SYNC_PASSPHRASE)
# → full restore on join, passphrase saved to that device's login Keychain

Or conversationally:

You: "Enable sync with passphrase a1b2c3..."

Claude calls sync_enable(passphrase: "a1b2c3...") →
  connects to the same iCloud sync folder → pulls all memories.

Now both devices share the same memory — and keep sharing it
automatically (background sync: 30s poll + filesystem watcher).

What syncs and what doesn't:

Private memories (default) never leave your device. Opt in per memory: memory_ingest with privacy: "shared", cortex-mcp-server ingest --privacy shared, or memory_set_privacy on an existing memory
Demote a shared memory back to private and it is retracted (deleted) from your other devices — the local copy stays
All sync data is AES-256-GCM encrypted with HMAC integrity — even if your cloud account is compromised, memories stay private and tampering is detected
Sync survives restarts: settings persist, the passphrase lives in the OS keychain, the server resumes automatically
No server, no API, no account — just your own cloud folder

CLI alternative:

# Device A
cortex-mcp-server sync enable
# Save the passphrase from the output

# Device B
cortex-mcp-server sync enable --passphrase "your-passphrase-from-device-A"

# Manual pull (background sync also pulls automatically)
cortex-mcp-server sync pull

Multi-Project Isolation

Working across multiple projects? Use separate databases for physical memory isolation — no cross-project leakage, zero code changes needed.

~/.cortex/
├── global.db          # User preferences, people graph, cross-project knowledge
├── my-app.db          # Project A memories
└── my-api.db          # Project B memories

Global config (~/.claude/settings.json) — user-level knowledge:

{
  "mcpServers": {
    "cortex-global": {
      "command": "~/.local/bin/cortex-mcp-server",
      "args": ["~/.cortex/global.db"]
    }
  },
  "permissions": { "allow": ["mcp__cortex-global__*", "mcp__cortex-project__*"] }
}

Per-project config (~/.claude/projects/<path>/settings.json) — project-specific:

{
  "mcpServers": {
    "cortex-project": {
      "command": "~/.local/bin/cortex-mcp-server",
      "args": ["~/.cortex/my-app.db"]
    }
  }
}

Then add these memory isolation rules to your project's CLAUDE.md:

## Memory Isolation

Two Cortex MCP servers: `cortex-project` (project DB) and `cortex-global` (global DB).

### Write Policy
- Save to `cortex-project` if the memory is about this repo's architecture, code,
  modules, tests, workflows, configs, bugs, decisions, or terminology.
- Save to `cortex-global` only for long-term user preferences, communication style,
  cross-project habits, or personal background useful across repos.
- **Default: if uncertain, save to `cortex-project`.**

### Read Policy
1. Query `cortex-project` first.
2. Query `cortex-global` second, only for user-level preferences.
3. Prefer project memory when they conflict.

### Anti-Leak Rules
- Never auto-copy from `cortex-project` into `cortex-global`.
- Never store repo-specific paths, module names, or account names in `cortex-global`.
- Never treat project implementation details as user-global preferences.

### Update Rule
- Cortex is append-only. To update: search old entry → delete → ingest new.

This gives you two independent Cortex instances per project — complete isolation with shared user knowledge.

30 Tools

Tool access is governed by an optional deny-by-default capability policy: drop a capabilities.json next to your database ({"version":1,"grants":["read","write"]}) and only granted tool groups (read / write / sync / plugins / all) or exact tool names are listed and callable. No policy file = everything enabled (legacy).

Tool	Purpose
`memory_ingest`	Store a memory (text, channel, person context, optional `privacy`)
`memory_set_privacy`	Change a memory's privacy level — promote to `shared` to sync it, demote to `private` to retract it from other devices
`memory_search`	Semantic search across all memory tiers
`memory_context`	Generate LLM-ready context summary (token-budgeted)
`memory_consolidate`	Run decay + promotion + sweep cycle
`memory_infer`	Preview inference without storing
`memory_compress`	Compress old conversation sessions
`memory_stats`	Get memory statistics (counts per tier, index size)
`memory_decay`	Run temporal decay on episodic memories
`belief_observe`	Update a Bayesian belief with evidence
`belief_list`	Query beliefs above confidence threshold
`fact_add`	Store structured knowledge (subject-predicate-object)
`fact_query`	Query facts by entity (SQL-indexed)
`preference_set`	Store user preference with confidence
`preference_query`	Query preferences by key pattern
`person_resolve`	Cross-channel identity resolution
`person_list`	List all known people
`contradiction_check`	Check for fact contradictions
`relationship_extract`	Extract relationships from text
`sync_status`	Cloud sync status (provider, devices, pending ops)
`sync_providers`	Detect available cloud storage providers
`sync_enable`	Enable cross-device cloud sync with optional encryption
`sync_pull`	Pull and apply remote changes from other devices
`memory_archive`	Archive a memory to cold storage
`memory_restore`	Restore an archived memory back to an active tier
`memory_delete`	Permanently delete a memory by ID
`memory_ingest_batch`	Ingest multiple memories in a single transaction
`tag_list_taxonomy`	List all tags in use across memories with counts
`namespace_list`	List all namespaces with memory counts
`person_merge`	Merge two person identities into one
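The capability policy described before the table can be pictured as a filter over the tool list. In the sketch below the tool-to-group mapping covers only a handful of tools and is a guess made for illustration; the real grouping is defined by Cortex. It models the case where a `capabilities.json` exists; with no policy file, the README says every tool stays enabled.

```python
import json

# Hypothetical tool-to-group mapping for a few tools from the table above.
TOOL_GROUPS = {
    "memory_search": "read",
    "memory_context": "read",
    "memory_ingest": "write",
    "fact_add": "write",
    "sync_pull": "sync",
}


def allowed_tools(policy_text: str) -> set:
    """Return the callable tools; a malformed policy fails closed (no tools)."""
    try:
        policy = json.loads(policy_text)
        grants = policy["grants"]
        if policy["version"] != 1 or not isinstance(grants, list):
            return set()
    except (json.JSONDecodeError, KeyError, TypeError):
        return set()
    if "all" in grants:
        return set(TOOL_GROUPS)
    # A grant matches either a whole group or one exact tool name.
    return {tool for tool, group in TOOL_GROUPS.items()
            if group in grants or tool in grants}


print(sorted(allowed_tools('{"version":1,"grants":["read","write"]}')))
print(sorted(allowed_tools('{"version":1,"grants":["sync_pull"]}')))
print(sorted(allowed_tools('{not valid json')))
```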

OpenClaw Plugin

Give your OpenClaw agent persistent memory with auto-recall and auto-capture.

Install:

# 1. Install Cortex binary
curl -fsSL https://raw.githubusercontent.com/gambletan/cortex/main/install.sh | bash

# 2. Install the OpenClaw plugin
openclaw plugin add @cortex-ai-memory/cortex-memory

Configure (optional — works with defaults):

{
  "plugins": {
    "@cortex-ai-memory/cortex-memory": {
      "autoCapture": true,
      "autoRecall": true,
      "topK": 10
    }
  }
}

What it does:

autoCapture: Automatically stores conversation context after each turn
autoRecall: Injects relevant memories before each turn (your agent "remembers")
7 tools: memory_search, memory_store, fact_add, belief_observe, person_resolve, and more

See openclaw-plugin/README.md for full configuration options.

Project Structure

cortex/
├── cortex-core/          # Rust core library (all memory logic)
│   ├── src/
│   │   ├── lib.rs              # Cortex entry point
│   │   ├── types.rs            # MemObject, MemoryTier, etc.
│   │   ├── inference.rs        # Proactive inference (EN + CN)
│   │   ├── episode.rs          # Episodic memory store
│   │   ├── semantic.rs         # Semantic facts + preferences
│   │   ├── working.rs          # Working memory (session scratch pad)
│   │   ├── procedural.rs       # Learned routines
│   │   ├── people.rs           # People graph + identity resolution
│   │   ├── belief.rs           # Bayesian belief system
│   │   ├── consolidation.rs    # Episodic→semantic promotion + decay
│   │   ├── retrieval.rs        # Multi-signal retrieval engine
│   │   ├── context.rs          # LLM context generation
│   │   ├── sync/               # Cloud sync (oplog, HLC, merge, encryption)
│   │   └── storage/            # SQLite + in-memory vector index
│   └── benches/                # Performance benchmarks
├── cortex-http/          # HTTP REST API (axum, local-only)
├── cortex-mcp-server/    # MCP server binary (3.8MB)
├── cortex-python/        # Python bindings (PyO3, WIP)
├── openclaw-plugin/      # OpenClaw memory plugin
├── Dockerfile            # Self-hosted Docker image
└── Cargo.toml            # Workspace root

HTTP API

Cortex ships a lightweight HTTP server for integration with any language or framework. Binds to 127.0.0.1 by default — your data never leaves your machine.

# Build & run
cargo build --release -p cortex-http
./target/release/cortex-http --port 3315 --db ~/.cortex/memory.db

# Or via Docker (pre-built from GHCR)
docker run -v ~/.cortex:/data -p 3315:3315 ghcr.io/gambletan/cortex/cortex-http:latest

# Or build locally
docker build -t cortex .
docker run -v ~/.cortex:/data -p 3315:3315 cortex

Endpoints

Method	Path	Description
GET	`/health`	Health check
POST	`/v1/memories`	Ingest a memory
POST	`/v1/memories/search`	Semantic search
GET	`/v1/memories/context`	Generate LLM context
POST	`/v1/memories/consolidate`	Run consolidation cycle
POST	`/v1/memories/infer`	Preview inference (no store)
POST	`/v1/facts`	Add a semantic fact
POST	`/v1/facts/contradictions`	Check for contradictions
POST	`/v1/preferences`	Set a preference
GET	`/v1/beliefs`	List beliefs
POST	`/v1/beliefs/observe`	Update belief with evidence
POST	`/v1/people`	Resolve person identity
POST	`/v1/memories/compress`	Compress old conversation sessions
POST	`/v1/relationships/extract`	Extract relationships from text
GET	`/v1/export`	Export all data (JSON backup)
POST	`/v1/import`	Import data from backup

Examples

# Store a memory
curl -X POST http://localhost:3315/v1/memories \
  -H 'Content-Type: application/json' \
  -d '{"text": "I prefer dark mode", "channel": "cli"}'

# Search
curl -X POST http://localhost:3315/v1/memories/search \
  -H 'Content-Type: application/json' \
  -d '{"query": "preferences", "limit": 5}'

# Export all data (backup to iCloud, NAS, etc.)
curl http://localhost:3315/v1/export > ~/iCloud/cortex-backup.json

# Import from backup
curl -X POST http://localhost:3315/v1/import \
  -H 'Content-Type: application/json' \
  -d @"$HOME/iCloud/cortex-backup.json"
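The same calls can be made from Python with only the standard library. This sketch assumes the server is running on the default port shown above and that responses are JSON; the helper names are illustrative, and the response shape is not documented here, so the code just decodes whatever JSON comes back.

```python
import json
import urllib.request

BASE_URL = "http://localhost:3315"


def build_request(path: str, payload: dict) -> urllib.request.Request:
    """Build a JSON POST request for a Cortex HTTP endpoint."""
    return urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_json(path: str, payload: dict, timeout: float = 5.0):
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(build_request(path, payload), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the server up, `post_json("/v1/memories", {"text": "I prefer dark mode", "channel": "cli"})` mirrors the first curl example and `post_json("/v1/memories/search", {"query": "preferences", "limit": 5})` mirrors the second.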

Roadmap

v0.2 ✅ — Local embedding integration (all-MiniLM-L6-v2/ONNX), batch queries, importance-aware decay + auto-consolidation
v0.3 ✅ — Proactive inference (auto-extract facts), temporal awareness, contradiction detection, Chinese NLP
v0.4 ✅ — HTTP REST API (axum), import/export (JSON backup), Docker packaging
v0.5 ✅ — Conversation compression, relationship inference (EN + CN), temporal retrieval enhancement, 112 tests
v1.0 ✅ — Feature comparison table, benchmark update, 18-feature Cortex vs Mem0 vs OpenAI
v1.1 ✅ — HNSW vector index (50K search: 12ms → 91µs), Python SDK (pip install cortex-ai-memory)
v1.2 ✅ — Negation detection (EN + CN), multi-hop retrieval, 117 tests
v1.3 ✅ — Context quality optimization, query expansion, bidirectional relationships, 126 tests
v1.4 ✅ — Incremental HNSW, SQL-indexed entity queries, LLM summarizer hook, 18 MCP tools, configurable decay, LLM-assisted inference, 131 tests
v1.5 ✅ — Docker image (GHCR auto-publish), batch ingest, dedup, namespace isolation, plugin system, event bus, archival, 351 tests
v1.6 ✅ — Int8 quantization (75% storage reduction), materialized column indexes, FTS5 triggers, LRU caches (MemObject + entity-facts), rayon parallel decay, Arc embedding, generation-based cache invalidation, 25 MCP tools, batch inference, enhanced Chinese NLP
v1.7 ✅ — Cloud sync (changelog-based, HLC ordering, LWW merge), AES-256-GCM encryption (Argon2id KDF), privacy enforcement (Private/Shared/Public), zeroize (memory wiping), SECURITY.md, 27 MCP tools, 400+ tests
v2.0 ✅ — Background sync (filesystem watcher + polling), Web Dashboard, Homebrew tap, integration docs (CrewAI/AutoGen/LangGraph/DeerFlow), /v1/memories/recent API, 12 rounds Codex review fixes, 489 tests
v2.1 ✅ — WASM build (124KB, runs entirely in the browser, GitHub Pages demo)
v2.2 ✅ — Security hardening series (self-evolution iterations 11–17): manifest + per-operation HMAC, plaintext-injection rejection, timing-attack hardening, key rotation with forward secrecy (ENC2), bounded query budget, deny-by-default MCP capability policy, per-memory privacy opt-in with cross-device retraction, persistent sync (Keychain) + auto background sync, frecency ranking, one-shot device setup script, 30 MCP tools, 500+ tests
v2.3 — Mobile targets (iOS/Android), multi-modal memory

If you find Cortex useful, please consider giving it a star ⭐ — it helps others discover the project and motivates continued development!

License

MIT

Cortex

🧠 Try Cortex in your browser — zero install, 124KB WASM, runs entirely client-side.

If Cortex helps your AI remember, give it a ⭐ — it takes 1 second and helps others discover the project.

Memory for AI agents that never leaves your device.

Private. Free. Local. — a memory engine for personal AI agents.

[Demo recording: Cortex remembering across sessions, captured from a real local cortex-mcp-server]

What you get

🔒 Private by default — memories live in a local SQLite file, never leave your device, zero telemetry (CI-enforced).
🧠 Real memory, not a text file — 4 tiers, multi-signal retrieval, self-correcting Bayesian beliefs, a cross-channel people graph.
⚡ Sub-millisecond — 156µs ingest, 568µs search. ~528× faster than cloud memory APIs, with no network round-trip.
🔌 Drop-in for any agent — one MCP server gives Claude Code / Claude Desktop (or any MCP client) persistent cross-session memory.
☁️ Yours across devices — optional end-to-end-encrypted sync through your own iCloud / Drive / Dropbox. No server of ours, ever.

See it remember across sessions — ~30 seconds:

brew install gambletan/tap/cortex-mcp-server          # or: cargo build --release -p cortex-mcp-server
claude mcp add cortex-memory -- cortex-mcp-server ~/.cortex/memory.db

⭐ If that's useful, give it a star — it helps others find a memory engine that respects their privacy.

Cortex vs Mem0 vs OpenAI Memory

Feature	Cortex	Mem0	OpenAI Memory
Privacy	100% local, zero cloud	Cloud API (your data on their servers)	OpenAI servers
Latency	156µs ingest, 568µs search	~200-500ms	~300-800ms
Cost	Free, forever	$99+/mo (Pro)	ChatGPT Plus ($20/mo)
Memory tiers	4 (Working/Episodic/Semantic/Procedural)	1 (flat)	1 (flat)
Bayesian beliefs	Self-correcting with evidence	No	No
People graph	Cross-channel identity resolution	Paid tier only	No
Conversation compression	Automatic session summarization	No	No
Relationship inference	Pattern-based (EN + CN)	No	No
Temporal retrieval	Intent-aware ("recently" / "first time")	No	No
Contradiction detection	Automatic with confidence scores	No	No
Consolidation	Episodic → Semantic auto-promotion	No	No
Context injection	Token-budgeted LLM-ready output	Manual	Automatic but opaque
Import/Export	Full JSON backup & restore	API only	No export
Self-hosted	Native binary, Docker, MCP	Cloud only	Cloud only
Binary size	3.8 MB	npm package	N/A
Dependencies	0 runtime services (single binary)	Node.js + cloud	N/A
Open source	MIT	Partial	No
Encryption	AES-256-GCM encrypted sync (opt-in)	No	No
Key rotation	Versioned envelopes, forward secrecy	No	No
Privacy levels	Private (default, never syncs) / Shared / Public — per-memory opt-in, demote retracts from other devices	No	No
Tool authorization	Deny-by-default capability policy on the MCP surface	No	No
Zero telemetry	No analytics, no phone-home, verifiable	Unknown	No
Chinese NLP	Native (inference, retrieval, relationships)	No	Limited
Namespace isolation	Per-user/context memory separation	No	No
Plugin system	Compile-time hooks for ingest/retrieve/consolidation	No	No
MCP tools	30 tools for Claude/LLM integration	3rd party	N/A

Performance Benchmarks

Operation	Cortex	Mem0 (cloud)	File-based
Ingest	156µs	~200ms	~1ms
Search (top-10)	568µs	~300ms	~10ms
Context generation	621µs	~500ms	manual
Belief update	66µs	N/A	N/A
People graph	51µs	paid tier	N/A
Structured facts	45µs	N/A	N/A
1K memories search	1.6ms	~500ms	~50ms

About 528x faster than Mem0 cloud on top-10 search (~300ms vs 568µs), with features neither Mem0 nor OpenAI Memory offers.

Note: Benchmarks include proactive inference (auto-extracting facts, preferences, relationships) on every ingest. Raw ingest without inference is ~15µs. Numbers from cargo bench on M-series Mac.

LoCoMo Benchmark (ACL 2024)

Academic-grade long-term conversation memory evaluation — 10 conversations, 1540 QA pairs across 4 categories.

System	Single-hop	Multi-hop	Open-domain	Temporal	Overall
Backboard	89.4%	75.0%	91.2%	91.9%	90.0%
MemMachine v0.2	—	—	—	—	84.9%
Cortex	72.5%	59.5%	88.8%	74.1%	73.7%
Mem0-Graph	65.7%	47.2%	75.7%	58.1%	68.4%
Mem0	67.1%	51.2%	72.9%	55.5%	66.9%
OpenAI Memory	—	—	—	—	52.9%

Key findings:

Open-domain: 88.8%, ahead of Mem0 (72.9%) by 15.9 percentage points
Temporal: 74.1%, ahead of Mem0 (55.5%) by 18.6 points
Single-hop: 72.5%, ahead of Mem0 (67.1%) by 5.4 points
Multi-hop: 59.5%, ahead of Mem0 (51.2%) by 8.3 points
Overall: 73.7%, ahead of Mem0 (66.9%) by 6.8 points and of OpenAI Memory (52.9%) by 20.8 points

Cortex outperforms Mem0 on all 4 categories — while running 100% locally, end-to-end encrypted, at $0 cost.

Setup: Claude Sonnet 4 (QA + judge), nomic-embed-text (embeddings via Ollama), top-30 retrieval. Reproducible with that setup: python3 bench/locomo_bench.py (needs ANTHROPIC_API_KEY + a local Ollama with nomic-embed-text). Numbers measured on the v1.7 engine; the v2.2 retrieval beam fix (paraphrase recall 40%→90% at 5K, see docs/scale-test-2026-06-13.md) has not yet been re-run on LoCoMo, so these are reported as the last verified figures, not a v2.2 claim.

Architecture

Cortex implements a 4-tier memory model inspired by human cognition:

                    +---------------------+
                    |   Working Memory    |  Current session context
                    +---------------------+
                              |
                    +---------------------+
                    |   Episodic Memory   |  Raw experiences: conversations, events, observations
                    +---------------------+
                              |  consolidation (decay, promotion, pattern extraction)
                    +---------------------+
                    |   Semantic Memory   |  Distilled facts, preferences, relationships
                    +---------------------+
                              |
                    +---------------------+
                    | Procedural Memory   |  Learned routines, user-specific workflows
                    +---------------------+
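
The consolidation arrow in the diagram can be read as a small rule applied to each memory. The sketch below is illustrative only: the `Memory` struct, the half-life decay, and the promotion thresholds (3 accesses, salience 0.2, keep threshold 0.05) are assumptions made for this example, not Cortex's actual types or tuning.

```rust
/// The four tiers from the diagram above.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Tier {
    Working,
    Episodic,
    Semantic,
    Procedural,
}

/// Hypothetical memory record; field names are illustrative.
#[derive(Debug, Clone)]
pub struct Memory {
    pub tier: Tier,
    pub salience: f64,             // 0.0..=1.0
    pub access_count: u32,         // how often retrieval returned this memory
    pub days_since_last_pass: f64, // time elapsed since the previous pass
}

/// One consolidation pass over a single memory:
/// 1. decay salience exponentially (it halves every `half_life_days`),
/// 2. promote frequently used episodic memories to the semantic tier.
/// Returns false when salience has fallen below the keep threshold.
pub fn consolidate(mem: &mut Memory, half_life_days: f64) -> bool {
    let decay = 0.5_f64.powf(mem.days_since_last_pass / half_life_days);
    mem.salience *= decay;

    if mem.tier == Tier::Episodic && mem.access_count >= 3 && mem.salience >= 0.2 {
        mem.tier = Tier::Semantic;
    }
    mem.salience >= 0.05
}
```

A caller would run `consolidate` over every stored memory and archive or drop the ones for which it returns false.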

Key Components

People Graph

Bayesian Belief System

cortex.observe_belief("user_prefers_morning_meetings", true, 0.8)?;
cortex.observe_belief("user_prefers_morning_meetings", false, 0.6)?;
// Confidence adjusts automatically via Bayesian update
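
To make "Bayesian update" concrete, here is a minimal sketch of one way such an update can work. It assumes the third argument of `observe_belief` is the reliability of the observation and that a new belief starts from a neutral prior of 0.5; both are assumptions for illustration, and `bayes_update` is a hypothetical helper, not part of the Cortex API.

```rust
/// Posterior probability that a belief holds after one observation.
///
/// `prior`    : current confidence in the belief, strictly between 0 and 1
/// `supports` : whether the observation agrees with the belief
/// `strength` : reliability of the observation, strictly between 0 and 1,
///              read here as P(observation is correct)
pub fn bayes_update(prior: f64, supports: bool, strength: f64) -> f64 {
    // Likelihood of seeing this observation if the belief is true / false.
    let (like_true, like_false) = if supports {
        (strength, 1.0 - strength)
    } else {
        (1.0 - strength, strength)
    };
    // Bayes' rule: P(H | E) = P(E | H) P(H) / P(E)
    let numerator = prior * like_true;
    numerator / (numerator + (1.0 - prior) * like_false)
}
```

Under this rule, the two observations above would move a neutral 0.5 prior to 0.8 and then to about 0.73: the contradicting evidence lowers confidence without erasing the earlier support.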

Consolidation Engine

Multi-signal Retrieval

Queries combine five signals for relevance ranking:

Similarity -- vector cosine distance against query embedding
Temporal -- recency weighting with configurable decay
Salience -- importance scoring from access patterns and explicit hints
Social -- boost for memories involving specific people
Channel -- filter or boost by source channel
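
The five signals above can be combined in many ways. The sketch below uses a plain weighted sum as an illustration; the `Signals` and `Weights` structs, the half-life recency curve, and the weight values in the usage note are all assumptions, since the actual ranking function is not documented in this section.

```rust
/// Cosine similarity between two embeddings of equal length.
pub fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 { 0.0 } else { dot / (na * nb) }
}

/// Recency weight in (0, 1]: halves every `half_life_days`.
pub fn recency(age_days: f32, half_life_days: f32) -> f32 {
    0.5_f32.powf(age_days / half_life_days)
}

/// Per-candidate inputs for the five signals (illustrative names).
pub struct Signals {
    pub similarity: f32,     // cosine against the query embedding
    pub recency: f32,        // output of `recency`
    pub salience: f32,       // stored importance, 0..=1
    pub person_match: bool,  // memory involves the requested person
    pub channel_match: bool, // memory came from the requested channel
}

/// Hypothetical weights, one per signal.
pub struct Weights {
    pub similarity: f32,
    pub temporal: f32,
    pub salience: f32,
    pub social: f32,
    pub channel: f32,
}

/// Weighted sum of the five signals; higher means more relevant.
pub fn score(s: &Signals, w: &Weights) -> f32 {
    let social = if s.person_match { w.social } else { 0.0 };
    let channel = if s.channel_match { w.channel } else { 0.0 };
    w.similarity * s.similarity + w.temporal * s.recency + w.salience * s.salience + social + channel
}
```

With weights such as 0.5 / 0.2 / 0.2 / 0.05 / 0.05, similarity dominates while the social and channel signals act as small boosts, which matches the "filter or boost" wording above.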

Context Injection Protocol

Generates LLM-ready context strings from memory state. Pass a token budget, optional channel/person filters, and get back a structured text block your LLM can consume directly.
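
As a rough illustration of what "token-budgeted" means, the sketch below greedily packs ranked snippets into a text block until the budget is spent. The four-characters-per-token estimate and both function names are assumptions for this example; a real implementation would likely use a proper tokenizer and a richer output format.

```rust
/// Rough token estimate: about four characters per token for English text.
/// This is a heuristic, not a real tokenizer.
pub fn estimate_tokens(text: &str) -> usize {
    (text.chars().count() + 3) / 4
}

/// Greedily pack ranked snippets (best first) into a context block
/// without exceeding `max_tokens`. Snippets that do not fit are skipped,
/// so a shorter, lower-ranked snippet can still use the remaining budget.
pub fn build_context(ranked: &[&str], max_tokens: usize) -> String {
    let mut out = String::new();
    let mut used = 0;
    for snippet in ranked {
        let line = format!("- {}\n", snippet);
        let cost = estimate_tokens(&line);
        if used + cost > max_tokens {
            continue;
        }
        out.push_str(&line);
        used += cost;
    }
    out
}
```

The returned string can be prepended to a system or user message as-is.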

Storage

Cloud Sync

Sync memories across devices through your own cloud storage — no third-party server involved.

Device A (Mac)              Your Cloud Storage              Device B (iPhone)
┌──────────┐         ┌──────────────────────┐         ┌──────────┐
│ SQLite DB │ ──W──>  │ iCloud / GDrive /    │  <──R── │ SQLite DB│
│ (local)   │         │ OneDrive / Dropbox   │         │ (local)  │
│           │ <──R──  │                      │  ──W──> │          │
└──────────┘         └──────────────────────┘         └──────────┘

Changelog-based: Each device writes append-only operation logs to its own subfolder
No conflicts: Devices never write to the same file. Merge uses Last-Writer-Wins with Hybrid Logical Clocks
Encrypted: AES-256-GCM encryption (opt-in). Even if your cloud account is compromised, memories stay private
Tamper-evident: the sync manifest and every operation carry an HMAC; tampered or plaintext-injected oplog lines are rejected, and a manifest without integrity protection refuses to load (no key-rollback path)
Key rotation & forward secrecy: rotate to a new key version (ENC2 envelopes) without re-encrypting history; old versions stay readable, new writes are unreadable to a leaked old key
Privacy-aware, per-memory opt-in: Private memories (the default) never leave your device. Mark a memory shared to sync it; demote it back to private and a retraction deletes it from your other devices (local copy kept)
Survives restarts: sync settings persist in the database (passphrase never touches disk — macOS login Keychain or CORTEX_SYNC_PASSPHRASE); the server resumes sync and starts background pull (30s poll + fs watcher) automatically

Supported providers: iCloud Drive, Google Drive, OneDrive, Dropbox (auto-detected).

use cortex_core::sync::SyncConfig;
use cortex_core::types::PrivacyLevel;

// Enable sync with encryption (settings persist; passphrase goes to the OS keychain)
let config = SyncConfig::new(sync_dir, device_id, device_name)
    .with_encryption("my-strong-passphrase");
cortex.enable_sync(config)?;

// Opt a memory into sync — everything is Private unless you say otherwise
cortex.set_memory_privacy(mem_id, PrivacyLevel::Shared { scope: "all".into() })?;

// Pull changes from other devices (also happens automatically in the background)
let applied = cortex.sync_pull()?;
println!("Applied {} remote changes", applied);
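
The "Last-Writer-Wins with Hybrid Logical Clocks" merge mentioned above can be sketched in a few lines. This is the textbook HLC algorithm with a device id as tie-breaker, written for illustration; the `Hlc`, `Clock`, and `lww_merge` names are hypothetical and the real oplog format may differ.

```rust
/// Hybrid Logical Clock timestamp. Deriving `Ord` compares fields in
/// declaration order: wall time, then logical counter, then device id.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub struct Hlc {
    pub wall_ms: u64,
    pub counter: u32,
    pub device: u32,
}

pub struct Clock {
    last: Hlc,
}

impl Clock {
    pub fn new(device: u32) -> Self {
        Clock { last: Hlc { wall_ms: 0, counter: 0, device } }
    }

    /// Timestamp for a local write. `now_ms` is the physical clock reading.
    pub fn tick(&mut self, now_ms: u64) -> Hlc {
        if now_ms > self.last.wall_ms {
            self.last.wall_ms = now_ms;
            self.last.counter = 0;
        } else {
            // Physical clock stalled or went backwards: bump the counter.
            self.last.counter += 1;
        }
        self.last
    }

    /// Fold in a timestamp seen on a remote operation so that later
    /// local writes are ordered after it.
    pub fn observe(&mut self, remote: Hlc, now_ms: u64) {
        let wall = now_ms.max(self.last.wall_ms).max(remote.wall_ms);
        let counter = if wall == self.last.wall_ms && wall == remote.wall_ms {
            self.last.counter.max(remote.counter) + 1
        } else if wall == self.last.wall_ms {
            self.last.counter + 1
        } else if wall == remote.wall_ms {
            remote.counter + 1
        } else {
            0
        };
        self.last = Hlc { wall_ms: wall, counter, device: self.last.device };
    }
}

/// Last-Writer-Wins: keep whichever version carries the larger timestamp.
pub fn lww_merge<T>(local: (Hlc, T), remote: (Hlc, T)) -> (Hlc, T) {
    if remote.0 > local.0 { remote } else { local }
}
```

Because every device applies the same total order, two devices that have seen the same operations converge on the same value even when their physical clocks disagree.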

Security & Privacy

Feature	Detail
Encryption	AES-256-GCM with Argon2id key derivation (per-line random nonce)
Key rotation	Versioned `ENC2` envelopes with per-version passphrase-derived keys — forward secrecy against AES-key exfiltration, no full re-encryption needed
Integrity	HMAC on the sync manifest and on every sync operation; plaintext lines in an encrypted oplog are rejected outright (injection defense)
Privacy levels	Private (default, never syncs), Shared, Public — set at ingest (`privacy` arg / `--privacy`) or later (`memory_set_privacy`); demoting to Private retracts the memory from other devices
Capability policy	Deny-by-default tool authorization on the MCP surface: a `capabilities.json` grants tool groups (`read`/`write`/`sync`/`plugins`) or exact tools; ungranted tools are invisible and uncallable; malformed policy fails closed
Query budget	Every retrieval is bounded (candidate cap + wall-clock cap) — query cost never scales with total store size; DoS guard and timing-side-channel bound in one
Secret handling	Sync passphrase is never written to disk by Cortex — macOS login Keychain or env var only; missing passphrase fails safe (sync off, never plaintext)
Memory zeroization	Sensitive data cleared from RAM on drop (`zeroize` crate)
Zero telemetry	No analytics, no phone-home, no user data ever leaves the device — enforced in CI (`scripts/check-no-network-egress.sh`): the build fails if any network/telemetry crate enters `cortex-core`'s default tree, and the check also proves the `--no-default-features` binary is completely zero-network.
Embedding model fetch (one-time)	The default `cortex-mcp-server` enables on-device semantic search, which downloads a ~30 MB model (all-MiniLM-L6-v2) from the Hugging Face CDN on first ingest, then runs fully offline and sends none of your data. For a 100%-offline setup: run with `CORTEX_NO_EMBEDDINGS=1` (keyword/FTS recall, zero network) or build `--no-default-features`. A one-time stderr notice is printed before any download — nothing is ever fetched silently.
No accounts	No API key, no registration, no cloud dependency

See SECURITY.md for the full threat model.
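
The "Query budget" row describes two caps on every retrieval. A minimal sketch of that idea follows; `QueryBudget` and `bounded_scan` are hypothetical names, and the real engine presumably applies the caps inside its index rather than over a plain iterator.

```rust
use std::time::{Duration, Instant};

/// Caps applied to every retrieval, independent of store size.
pub struct QueryBudget {
    pub max_candidates: usize,
    pub max_wall_time: Duration,
}

/// Score candidates one by one and stop as soon as either cap is hit.
/// Returns the scored items and whether the scan was cut short.
pub fn bounded_scan<T, I, F>(
    candidates: I,
    budget: &QueryBudget,
    mut score: F,
) -> (Vec<(f32, T)>, bool)
where
    I: IntoIterator<Item = T>,
    F: FnMut(&T) -> f32,
{
    let deadline = Instant::now() + budget.max_wall_time;
    let mut scored = Vec::new();
    for item in candidates {
        if scored.len() >= budget.max_candidates || Instant::now() >= deadline {
            return (scored, true); // budget exhausted with work remaining
        }
        let s = score(&item);
        scored.push((s, item));
    }
    (scored, false)
}
```

Since the loop exits after a fixed amount of work, the cost of a query is bounded by the budget rather than by how many memories are stored.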

Prerequisites

Install the Rust toolchain (provides cargo):

curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

After installation, either restart your terminal or run:

source "$HOME/.cargo/env"

Verify:

cargo --version

Real-World Example: A Personal AI That Actually Remembers

Imagine your AI assistant across a week of real conversations:

# Day 1 — You chat on Telegram
You: "Sarah works at Stripe. She's interested in our API."

  Cortex auto-extracts:
  ├── episodic memory stored (156µs)
  ├── fact: Sarah → works_at → Stripe (confidence: 0.70)
  └── person resolved: sarah_telegram

# Day 2 — Sarah emails you
From: sarah@stripe.com
"Here's the technical spec we discussed."

  Cortex:
  ├── person resolved: sarah@stripe.com → merged with sarah_telegram
  │   (same person, different channel — automatic identity resolution)
  └── fact: Sarah → sent → technical spec

# Day 3 — You ask your AI
You: "What's the status with Stripe?"

  Cortex retrieves (568µs):
  ├── Sarah works at Stripe (semantic fact)
  ├── Sarah is interested in our API (episodic, Day 1)
  ├── She sent technical spec (episodic, Day 2)
  └── Cross-channel context: Telegram + Email unified under one person

  Your AI responds with full context — no "sorry, I don't remember" 🎯

# Day 5 — New information arrives
You: "Sarah now works at Anthropic."

  Cortex:
  ├── contradiction detected: Sarah works_at Stripe vs Sarah works_at Anthropic
  ├── old fact superseded + decayed: Stripe (salience ×0.3, kept as history)
  ├── new fact stored: Sarah → works_at → Anthropic
  └── current employer now ranks first; self-correcting, no manual cleanup

  (Third-party relations are extracted from natural-language verbs —
   "works at / works for / joined / now works at", "runs on", "hosted in",
   "manages", "part of", … — between two proper-noun entities.)

# Day 7 — Consolidation runs
  Cortex auto-consolidation:
  ├── 3 episodic memories about Sarah → promoted to semantic summary
  ├── stale memories from other topics → decayed
  └── pattern detected: you have recurring Monday meetings

All of this happens locally in <1ms per operation. No cloud. No API calls. No one else sees your data.
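
The Day 5 step (supersede the old employer, keep it as history, reduce its salience) can be sketched as follows. The `Fact` struct and this `add_fact` function are hypothetical stand-ins written for illustration, and the sketch assumes every predicate is single-valued, which a real engine would need to decide per predicate.

```rust
/// Hypothetical fact record; field names are illustrative.
#[derive(Debug, Clone, PartialEq)]
pub struct Fact {
    pub subject: String,
    pub predicate: String,
    pub object: String,
    pub salience: f64,
    pub superseded: bool,
}

/// Insert `new_fact`. Any active fact with the same subject and predicate
/// but a different object is treated as contradicted: it stays in the store
/// as history, is marked superseded, and has its salience multiplied by
/// `decay` (0.3 in the walkthrough above).
/// Returns how many older facts were superseded.
pub fn add_fact(store: &mut Vec<Fact>, new_fact: Fact, decay: f64) -> usize {
    let mut superseded = 0;
    for old in store.iter_mut() {
        let same_slot =
            old.subject == new_fact.subject && old.predicate == new_fact.predicate;
        if same_slot && !old.superseded && old.object != new_fact.object {
            old.superseded = true;
            old.salience *= decay;
            superseded += 1;
        }
    }
    store.push(new_fact);
    superseded
}
```

Ranking by salience then puts the current employer first while the superseded fact remains available for questions about the past.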

Install

Homebrew (macOS / Linux)

brew tap gambletan/tap
brew install cortex-mcp-server

From source

cargo build --release -p cortex-mcp-server
mkdir -p ~/.local/bin
cp target/release/cortex-mcp-server ~/.local/bin/

Official packages (avoid look-alikes)

Cortex is published under the cortex-ai-memory name. Several similarly-named packages on npm/PyPI are not affiliated with this project — use exactly these:

Ecosystem	Official package	Use for
Binary / MCP server	GitHub Releases, or `brew install gambletan/tap/cortex-mcp-server`	the memory engine (primary)
PyPI	`cortex-ai-memory`	Python bindings
npm	`@cortex-ai-memory/cortex-memory` (scoped)	OpenClaw memory plugin

⚠️ Not us: npm cortex-mcp, npm cortex-ai-memory (unscoped), PyPI cortex-memory. The source of truth is always this repo — github.com/gambletan/cortex. When in doubt, the binary from Releases is the canonical install.

Quick Start

use cortex_core::Cortex;

// Open (or create) a memory database
let cortex = Cortex::open("memory.db")?;

// Ingest a memory from a Telegram conversation
let embedding = your_embedding_fn("Met with Alice about the Q3 roadmap");
cortex.ingest(
    "Met with Alice about the Q3 roadmap",
    "telegram",               // source channel
    Some("alice_123"),         // user ID (triggers identity resolution)
    Some(0.8),                 // salience hint
    Some(embedding),           // vector embedding
)?;

// Add a semantic fact directly
cortex.add_fact(
    "Alice", "works_at", "Acme Corp",
    0.95, "telegram", None,
)?;

// Store a preference
cortex.add_preference("timezone", "America/Los_Angeles", 0.9)?;

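// Retrieve relevant memories (embed the query with the same placeholder function)
let query_embedding = your_embedding_fn("What do I know about Alice?");
let results = cortex.retrieve(
    "What do I know about Alice?",
    5,                         // top-k
    None,                      // any channel
    None,                      // any person
    Some(query_embedding),     // vector for similarity search
)?;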

// Generate LLM-ready context (token-budgeted)
let context = cortex.get_context(
    2000,                      // max tokens
    Some("telegram"),          // channel filter
    None,                      // no person filter
)?;
// Pass `context` as system/user message prefix to your LLM

// Run consolidation (call periodically)
let report = cortex.run_consolidation()?;
println!("Promoted: {}, Decayed: {}", report.promoted, report.decayed);

Python Bindings

Coming soon via PyO3. The cortex-python crate will expose the full API as a native Python module:

from cortex import Cortex

cx = Cortex.open("memory.db")
cx.ingest("Had lunch with Bob at the Thai place", channel="imessage", user_id="bob")
results = cx.retrieve("Where does Bob like to eat?", limit=5)

Integration with unified-channel-hub

Telegram ─┐                          ┌─ Context
Discord  ─┤  unified-channel-hub  →  │  Cortex  →  LLM
Email    ─┤  (ingest)                 │  (retrieve + inject)
Calendar ─┘                          └─ Response

Integration with LangGraph

Add persistent memory to any LangGraph agent via langchain-mcp-adapters — no custom code needed.

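import asyncio
import os

from langchain_mcp_adapters.client import MultiServerMCPClient
from langgraph.prebuilt import create_react_agent
from langchain_openai import ChatOpenAI


async def main():
    model = ChatOpenAI(model="gpt-4o")

    # langchain-mcp-adapters 0.1.0+: the client is not a context manager
    # and get_tools() is a coroutine
    client = MultiServerMCPClient({
        "cortex": {
            "command": "cortex-mcp-server",
            # No shell is involved, so expand "~" explicitly
            "args": [os.path.expanduser("~/.cortex/memory.db")],
            "transport": "stdio",
        }
    })
    agent = create_react_agent(model, await client.get_tools())
    # Agent now has all 30 Cortex memory tools
    result = await agent.ainvoke({
        "messages": [{"role": "user", "content": "What do you remember about Alice?"}]
    })
    print(result["messages"][-1].content)


asyncio.run(main())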

Your LangGraph agent gets instant access to memory_search, memory_ingest, fact_add, belief_observe, person_resolve, and 25 more tools — all running locally.

Integration with DeerFlow (ByteDance)

Cortex works as a persistent memory layer for DeerFlow — ByteDance's open-source multi-agent orchestration platform. Zero code changes needed.

# Add to DeerFlow config.yaml
mcp_servers:
  cortex-memory:
    command: cortex-mcp-server
    args:
      - ~/.cortex/deerflow.db

All DeerFlow agents (Telegram, Slack, Feishu) get instant access to 30 memory tools — cross-session memory, fact storage, people graph, and belief tracking across all channels.

CLI

Cortex doubles as a standalone CLI tool — no MCP client required.

$ cortex-mcp-server --help
Cortex memory engine — MCP server & CLI tools

Usage: cortex-mcp-server [DB_PATH] [COMMAND]

Commands:
  ingest  Store a new memory
  search  Search memories
  stats   Show memory statistics
  sync    Show cloud sync status and detected providers
  export  Export all data as JSON
  import  Import data from JSON file
  info    Show version, DB path, and capabilities
  help    Print this message or the help of the given subcommand(s)

Arguments:
  [DB_PATH]  Path to the Cortex database file (default: ~/.cortex/memory.db)

Options:
  -h, --help     Print help
  -V, --version  Print version

Examples:

# Store a memory
cortex-mcp-server ~/.cortex/memory.db ingest "Met with Alice about Q3 roadmap"
cortex-mcp-server ~/.cortex/memory.db ingest -c telegram "Sarah now works at Anthropic"

# Search
cortex-mcp-server ~/.cortex/memory.db search "Alice"
cortex-mcp-server ~/.cortex/memory.db search -l 10 "Q3 roadmap"

# Stats
cortex-mcp-server ~/.cortex/memory.db stats

# Cloud sync
cortex-mcp-server ~/.cortex/memory.db sync                        # status
cortex-mcp-server ~/.cortex/memory.db sync enable                  # auto-detect provider
cortex-mcp-server ~/.cortex/memory.db sync enable -p icloud        # specific provider
cortex-mcp-server ~/.cortex/memory.db sync pull                    # pull remote changes

# Export / Import (backup & restore)
cortex-mcp-server ~/.cortex/memory.db export -o backup.json
cortex-mcp-server ~/.cortex/new.db import backup.json

# Version & capabilities
cortex-mcp-server ~/.cortex/memory.db info

No subcommand = MCP stdio mode (for Claude Code / Claude Desktop integration).

MCP Server (Claude Code / Claude Desktop)

Cortex ships as an MCP server — works with any MCP-compatible client.

Setup

1. Build & install the binary:

mkdir -p ~/.local/bin ~/.cortex
cargo build --release -p cortex-mcp-server
cp target/release/cortex-mcp-server ~/.local/bin/

2. Register as MCP server:

Claude Code (CLI):

# Global (all projects)
claude mcp add cortex --scope user -- ~/.local/bin/cortex-mcp-server ~/.cortex/memory.db

# Or per-project
claude mcp add cortex -- ~/.local/bin/cortex-mcp-server ~/.cortex/memory.db

Claude Desktop — add to ~/Library/Application Support/Claude/claude_desktop_config.json:

{
  "mcpServers": {
    "cortex": {
      "command": "/Users/you/.local/bin/cortex-mcp-server",
      "args": ["/Users/you/.cortex/memory.db"]
    }
  }
}

3. Allow tools in "don't ask" mode:

Add to ~/.claude/settings.json → permissions.allow:

"mcp__cortex__*"

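Note: MCP tool permissions do not support the parenthesized form (e.g. mcp__cortex__memory_ingest(*)); use the wildcard mcp__cortex__* instead. The middle segment is the name the server was registered under (cortex in step 2), so adjust it if you registered the server under a different name.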

4. Make it automatic — add to your CLAUDE.md (project or global ~/.claude/CLAUDE.md):

# Memory (Cortex)
You have persistent memory via Cortex MCP tools. Use them automatically:
- Start of conversation: call `memory_context` to load what you know about the user
- When the user shares a preference, fact, or personal info: call `memory_ingest` to store it
- When you learn a structured fact: call `fact_add` (e.g. "User works_at Google")
- When you detect a preference: call `preference_set` (e.g. editor=neovim)
- When evidence supports or contradicts a belief: call `belief_observe`
- When talking to someone new: call `person_resolve` to track identity
- Periodically: call `memory_consolidate` to clean up stale memories

5. Auto-inject memory on session start (Claude Code hooks — fully automatic):

Create ~/.claude/hooks/cortex-memory-inject.sh:

#!/bin/bash
CORTEX_BIN="${CORTEX_BIN:-$HOME/.local/bin/cortex-mcp-server}"
CORTEX_DB="${CORTEX_DB:-$HOME/.cortex/memory.db}"
[ -x "$CORTEX_BIN" ] || exit 0

printf '%s\n%s\n%s\n' \
  '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2024-11-05","capabilities":{},"clientInfo":{"name":"hook","version":"1.0"}}}' \
  '{"jsonrpc":"2.0","method":"notifications/initialized"}' \
  '{"jsonrpc":"2.0","id":2,"method":"tools/call","params":{"name":"memory_context","arguments":{"max_tokens":1500}}}' \
  | "$CORTEX_BIN" "$CORTEX_DB" 2>/dev/null \
  | grep '"id":2' \
  | python3 -c "import sys,json; r=json.load(sys.stdin); print(r['result']['content'][0]['text'])" 2>/dev/null

Make the script executable (chmod +x ~/.claude/hooks/cortex-memory-inject.sh), then add to ~/.claude/settings.json:

{
  "hooks": {
    "SessionStart": [
      {
        "matcher": "",
        "hooks": [
          {
            "type": "command",
            "command": "~/.claude/hooks/cortex-memory-inject.sh"
          }
        ]
      }
    ]
  }
}

Now every new Claude Code session automatically loads your memory context — zero manual effort. Claude learns as you work and remembers across sessions.

Cross-Device Memory Sync

Your Claude's memory follows you across all your devices — MacBook, iMac, work laptop — through your own cloud storage.

Enable sync (one command):

You: "Enable cross-device memory sync"

Claude calls sync_enable → auto-detects iCloud Drive →
  generates device ID + AES-256-GCM encryption key → done.

Output:
  Provider:   iCloud Drive
  Encryption: AES-256-GCM
  Passphrase: a1b2c3...  ← save this for your other devices

On your second device — one script does everything (build/install, wait for iCloud, join, restore):

git clone https://github.com/gambletan/cortex && cortex/scripts/setup-device-sync.sh
# Prompts for your passphrase (hidden input; or set CORTEX_SYNC_PASSPHRASE)
# → full restore on join, passphrase saved to that device's login Keychain

Or conversationally:

You: "Enable sync with passphrase a1b2c3..."

Claude calls sync_enable(passphrase: "a1b2c3...") →
  connects to the same iCloud sync folder → pulls all memories.

Now both devices share the same memory — and keep sharing it
automatically (background sync: 30s poll + filesystem watcher).

What syncs and what doesn't:

Private memories (default) never leave your device. Opt in per memory: memory_ingest with privacy: "shared", cortex-mcp-server ingest --privacy shared, or memory_set_privacy on an existing memory
Demote a shared memory back to private and it is retracted (deleted) from your other devices — the local copy stays
All sync data is AES-256-GCM encrypted with HMAC integrity — even if your cloud account is compromised, memories stay private and tampering is detected
Sync survives restarts: settings persist, the passphrase lives in the OS keychain, the server resumes automatically
No server, no API, no account — just your own cloud folder
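
The opt-in and retraction rules above amount to a small decision table. The sketch below encodes it; `PrivacyLevel` mirrors the shape used in the sync example (`Shared { scope }`), while `SyncOp` and `op_for_change` are hypothetical names introduced for illustration.

```rust
/// The three privacy levels described above.
#[derive(Debug, Clone, PartialEq)]
pub enum PrivacyLevel {
    Private,
    Shared { scope: String },
    Public,
}

/// What a device writes to its oplog (hypothetical op names).
#[derive(Debug, PartialEq)]
pub enum SyncOp {
    Upsert,  // publish or update the memory on other devices
    Retract, // ask other devices to delete their copy
}

/// Only non-private memories are eligible for sync.
fn syncs(level: &PrivacyLevel) -> bool {
    !matches!(level, PrivacyLevel::Private)
}

/// Decide which op, if any, a privacy change produces.
/// `old` is `None` for a freshly ingested memory.
pub fn op_for_change(old: Option<&PrivacyLevel>, new: &PrivacyLevel) -> Option<SyncOp> {
    let was_synced = old.map(syncs).unwrap_or(false);
    match (was_synced, syncs(new)) {
        (_, true) => Some(SyncOp::Upsert),
        (true, false) => Some(SyncOp::Retract),
        (false, false) => None, // private stays local, nothing is written
    }
}
```

The local copy is untouched in every branch; only what is written for other devices changes.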

CLI alternative:

# Device A
cortex-mcp-server sync enable
# Save the passphrase from the output

# Device B
cortex-mcp-server sync enable --passphrase "your-passphrase-from-device-A"

# Manual pull (background sync also pulls automatically)
cortex-mcp-server sync pull

Multi-Project Isolation

Working across multiple projects? Use separate databases for physical memory isolation — no cross-project leakage, zero code changes needed.

~/.cortex/
├── global.db          # User preferences, people graph, cross-project knowledge
├── my-app.db          # Project A memories
└── my-api.db          # Project B memories

Global config (~/.claude/settings.json) — user-level knowledge:

{
  "mcpServers": {
    "cortex-global": {
      "command": "/Users/you/.local/bin/cortex-mcp-server",
      "args": ["/Users/you/.cortex/global.db"]
    }
  },
  "permissions": { "allow": ["mcp__cortex-global__*", "mcp__cortex-project__*"] }
}

Per-project config (~/.claude/projects/<path>/settings.json) — project-specific:

{
  "mcpServers": {
    "cortex-project": {
      "command": "/Users/you/.local/bin/cortex-mcp-server",
      "args": ["/Users/you/.cortex/my-app.db"]
    }
  }
}

Then add these memory isolation rules to your project's CLAUDE.md:

## Memory Isolation

Two Cortex MCP servers: `cortex-project` (project DB) and `cortex-global` (global DB).

### Write Policy
- Save to `cortex-project` if the memory is about this repo's architecture, code,
  modules, tests, workflows, configs, bugs, decisions, or terminology.
- Save to `cortex-global` only for long-term user preferences, communication style,
  cross-project habits, or personal background useful across repos.
- **Default: if uncertain, save to `cortex-project`.**

### Read Policy
1. Query `cortex-project` first.
2. Query `cortex-global` second, only for user-level preferences.
3. Prefer project memory when they conflict.

### Anti-Leak Rules
- Never auto-copy from `cortex-project` into `cortex-global`.
- Never store repo-specific paths, module names, or account names in `cortex-global`.
- Never treat project implementation details as user-global preferences.

### Update Rule
- Cortex is append-only. To update: search old entry → delete → ingest new.

This gives you two independent Cortex instances per project — complete isolation with shared user knowledge.

30 Tools

Tool access is governed by an optional deny-by-default capability policy: drop a capabilities.json next to your database ({"version":1,"grants":["read","write"]}) and only granted tool groups (read / write / sync / plugins / all) or exact tool names are listed and callable. No policy file = everything enabled (legacy).
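
The three policy states described above (no file, malformed file, explicit grants) can be sketched as a single check. This is an illustration under assumptions: the `Policy` enum and `is_allowed` are hypothetical, JSON parsing is left out, and the tool-to-group mapping in `group_of` is a guess that omits the `plugins` group.

```rust
use std::collections::HashSet;

/// Result of looking for `capabilities.json` next to the database.
pub enum Policy {
    Missing,                 // no file: legacy mode, everything enabled
    Malformed,               // unreadable or invalid: fail closed
    Grants(HashSet<String>), // group names and/or exact tool names
}

/// Hypothetical tool-to-group mapping; the real grouping may differ.
fn group_of(tool: &str) -> &'static str {
    match tool {
        t if t.starts_with("sync_") => "sync",
        "memory_search" | "memory_context" | "memory_stats" | "belief_list"
        | "fact_query" | "preference_query" | "person_list" => "read",
        _ => "write",
    }
}

/// Deny by default: with a policy loaded, a tool is callable only if
/// `all`, its group, or its exact name has been granted.
pub fn is_allowed(policy: &Policy, tool: &str) -> bool {
    match policy {
        Policy::Missing => true,
        Policy::Malformed => false,
        Policy::Grants(g) => {
            g.contains("all") || g.contains(group_of(tool)) || g.contains(tool)
        }
    }
}
```

The same check would be applied both when listing tools and when dispatching a call, so ungranted tools are invisible as well as uncallable.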

Tool	Purpose
`memory_ingest`	Store a memory (text, channel, person context, optional `privacy`)
`memory_set_privacy`	Change a memory's privacy level — promote to `shared` to sync it, demote to `private` to retract it from other devices
`memory_search`	Semantic search across all memory tiers
`memory_context`	Generate LLM-ready context summary (token-budgeted)
`memory_consolidate`	Run decay + promotion + sweep cycle
`memory_infer`	Preview inference without storing
`memory_compress`	Compress old conversation sessions
`memory_stats`	Get memory statistics (counts per tier, index size)
`memory_decay`	Run temporal decay on episodic memories
`belief_observe`	Update a Bayesian belief with evidence
`belief_list`	Query beliefs above confidence threshold
`fact_add`	Store structured knowledge (subject-predicate-object)
`fact_query`	Query facts by entity (SQL-indexed)
`preference_set`	Store user preference with confidence
`preference_query`	Query preferences by key pattern
`person_resolve`	Cross-channel identity resolution
`person_list`	List all known people
`contradiction_check`	Check for fact contradictions
`relationship_extract`	Extract relationships from text
`sync_status`	Cloud sync status (provider, devices, pending ops)
`sync_providers`	Detect available cloud storage providers
`sync_enable`	Enable cross-device cloud sync with optional encryption
`sync_pull`	Pull and apply remote changes from other devices
`memory_archive`	Archive a memory to cold storage
`memory_restore`	Restore an archived memory back to an active tier
`memory_delete`	Permanently delete a memory by ID
`memory_ingest_batch`	Ingest multiple memories in a single transaction
`tag_list_taxonomy`	List all tags in use across memories with counts
`namespace_list`	List all namespaces with memory counts
`person_merge`	Merge two person identities into one

OpenClaw Plugin

Give your OpenClaw agent persistent memory with auto-recall and auto-capture.

Install:

# 1. Install Cortex binary
curl -fsSL https://raw.githubusercontent.com/gambletan/cortex/main/install.sh | bash

# 2. Install the OpenClaw plugin
openclaw plugin add @cortex-ai-memory/cortex-memory

Configure (optional — works with defaults):

{
  "plugins": {
    "@cortex-ai-memory/cortex-memory": {
      "autoCapture": true,
      "autoRecall": true,
      "topK": 10
    }
  }
}

What it does:

autoCapture: Automatically stores conversation context after each turn
autoRecall: Injects relevant memories before each turn (your agent "remembers")
7 tools: memory_search, memory_store, fact_add, belief_observe, person_resolve, and more

See openclaw-plugin/README.md for full configuration options.

Project Structure

cortex/
├── cortex-core/          # Rust core library (all memory logic)
│   ├── src/
│   │   ├── lib.rs              # Cortex entry point
│   │   ├── types.rs            # MemObject, MemoryTier, etc.
│   │   ├── inference.rs        # Proactive inference (EN + CN)
│   │   ├── episode.rs          # Episodic memory store
│   │   ├── semantic.rs         # Semantic facts + preferences
│   │   ├── working.rs          # Working memory (session scratch pad)
│   │   ├── procedural.rs       # Learned routines
│   │   ├── people.rs           # People graph + identity resolution
│   │   ├── belief.rs           # Bayesian belief system
│   │   ├── consolidation.rs    # Episodic→semantic promotion + decay
│   │   ├── retrieval.rs        # Multi-signal retrieval engine
│   │   ├── context.rs          # LLM context generation
│   │   ├── sync/               # Cloud sync (oplog, HLC, merge, encryption)
│   │   └── storage/            # SQLite + in-memory vector index
│   └── benches/                # Performance benchmarks
├── cortex-http/          # HTTP REST API (axum, local-only)
├── cortex-mcp-server/    # MCP server binary (3.8MB)
├── cortex-python/        # Python bindings (PyO3, WIP)
├── openclaw-plugin/      # OpenClaw memory plugin
├── Dockerfile            # Self-hosted Docker image
└── Cargo.toml            # Workspace root

HTTP API

Cortex ships a lightweight HTTP server for integration with any language or framework. Binds to 127.0.0.1 by default — your data never leaves your machine.

# Build & run
cargo build --release -p cortex-http
./target/release/cortex-http --port 3315 --db ~/.cortex/memory.db

# Or via Docker (pre-built from GHCR)
docker run -v ~/.cortex:/data -p 3315:3315 ghcr.io/gambletan/cortex/cortex-http:latest

# Or build locally
docker build -t cortex .
docker run -v ~/.cortex:/data -p 3315:3315 cortex

Endpoints

Method	Path	Description
GET	`/health`	Health check
POST	`/v1/memories`	Ingest a memory
POST	`/v1/memories/search`	Semantic search
GET	`/v1/memories/context`	Generate LLM context
POST	`/v1/memories/consolidate`	Run consolidation cycle
POST	`/v1/memories/infer`	Preview inference (no store)
POST	`/v1/facts`	Add a semantic fact
POST	`/v1/facts/contradictions`	Check for contradictions
POST	`/v1/preferences`	Set a preference
GET	`/v1/beliefs`	List beliefs
POST	`/v1/beliefs/observe`	Update belief with evidence
POST	`/v1/people`	Resolve person identity
POST	`/v1/memories/compress`	Compress old conversation sessions
POST	`/v1/relationships/extract`	Extract relationships from text
GET	`/v1/export`	Export all data (JSON backup)
POST	`/v1/import`	Import data from backup

Examples

# Store a memory
curl -X POST http://localhost:3315/v1/memories \
  -H 'Content-Type: application/json' \
  -d '{"text": "I prefer dark mode", "channel": "cli"}'

# Search
curl -X POST http://localhost:3315/v1/memories/search \
  -H 'Content-Type: application/json' \
  -d '{"query": "preferences", "limit": 5}'

# Export all data (backup to iCloud, NAS, etc.)
curl http://localhost:3315/v1/export > ~/iCloud/cortex-backup.json

# Import from backup (use $HOME here: the shell does not expand ~ after @)
curl -X POST http://localhost:3315/v1/import \
  -H 'Content-Type: application/json' \
  -d @"$HOME/iCloud/cortex-backup.json"
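
The store and search calls above can also be made from Python using only the standard library. This is a sketch rather than an official client: the helper names (`build_json_request`, `post_json`, `store_memory`, `search_memories`) are hypothetical, the request fields are the ones shown in the curl examples, and the response is assumed to be JSON whose shape is not documented in this section.

```python
import json
import urllib.request

BASE_URL = "http://localhost:3315"  # same port as the curl examples


def build_json_request(path, payload, base_url=BASE_URL):
    """Build (but do not send) a POST request carrying a JSON body."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}{path}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_json(path, payload, base_url=BASE_URL, timeout=5.0):
    """Send the request and decode the reply.

    Assumption: the server replies with a JSON body. Its fields are not
    documented here, so the parsed value is returned unchanged.
    """
    req = build_json_request(path, payload, base_url)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def store_memory(text, channel="cli"):
    # Mirrors: POST /v1/memories with {"text": ..., "channel": ...}
    return post_json("/v1/memories", {"text": text, "channel": channel})


def search_memories(query, limit=5):
    # Mirrors: POST /v1/memories/search with {"query": ..., "limit": ...}
    return post_json("/v1/memories/search", {"query": query, "limit": limit})
```

With the server running, `store_memory("I prefer dark mode")` followed by `search_memories("preferences")` reproduces the first two curl examples. Keeping request construction in `build_json_request` separate from sending makes the payload easy to inspect without a live server.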

Roadmap

v0.2 ✅ — Local embedding integration (all-MiniLM-L6-v2/ONNX), batch queries, importance-aware decay + auto-consolidation
v0.3 ✅ — Proactive inference (auto-extract facts), temporal awareness, contradiction detection, Chinese NLP
v0.4 ✅ — HTTP REST API (axum), import/export (JSON backup), Docker packaging
v0.5 ✅ — Conversation compression, relationship inference (EN + CN), temporal retrieval enhancement, 112 tests
v1.0 ✅ — Feature comparison table, benchmark update, 18-feature Cortex vs Mem0 vs OpenAI
v1.1 ✅ — HNSW vector index (50K search: 12ms → 91µs), Python SDK (pip install cortex-ai-memory)
v1.2 ✅ — Negation detection (EN + CN), multi-hop retrieval, 117 tests
v1.3 ✅ — Context quality optimization, query expansion, bidirectional relationships, 126 tests
v1.4 ✅ — Incremental HNSW, SQL-indexed entity queries, LLM summarizer hook, 18 MCP tools, configurable decay, LLM-assisted inference, 131 tests
v1.5 ✅ — Docker image (GHCR auto-publish), batch ingest, dedup, namespace isolation, plugin system, event bus, archival, 351 tests
v1.6 ✅ — Int8 quantization (75% storage reduction), materialized column indexes, FTS5 triggers, LRU caches (MemObject + entity-facts), rayon parallel decay, Arc embedding, generation-based cache invalidation, 25 MCP tools, batch inference, enhanced Chinese NLP
v1.7 ✅ — Cloud sync (changelog-based, HLC ordering, LWW merge), AES-256-GCM encryption (Argon2id KDF), privacy enforcement (Private/Shared/Public), zeroize (memory wiping), SECURITY.md, 27 MCP tools, 400+ tests
v2.0 ✅ — Background sync (filesystem watcher + polling), Web Dashboard, Homebrew tap, integration docs (CrewAI/AutoGen/LangGraph/DeerFlow), /v1/memories/recent API, 12 rounds Codex review fixes, 489 tests
v2.1 ✅ — WASM build (124KB, runs entirely in the browser, GitHub Pages demo)
v2.2 ✅ — Security hardening series (self-evolution iterations 11–17): manifest + per-operation HMAC, plaintext-injection rejection, timing-attack hardening, key rotation with forward secrecy (ENC2), bounded query budget, deny-by-default MCP capability policy, per-memory privacy opt-in with cross-device retraction, persistent sync (Keychain) + auto background sync, frecency ranking, one-shot device setup script, 30 MCP tools, 500+ tests
v2.3 — Mobile targets (iOS/Android), multi-modal memory

If you find Cortex useful, please consider giving it a star ⭐ — it helps others discover the project and motivates continued development!

License

MIT