Suma Memory

Auth: required | Transports: STDIO, HTTP | Registry: active

Summary

A persistent knowledge graph for AI coding sessions that addresses the cold-start problem. It exposes six MCP tools: suma_ping checks the connection and API key, suma_ingest writes architectural decisions and bug fixes to a weighted graph, suma_search retrieves context by natural language query, suma_talk combines search and learning in one call, suma_correct supersedes wrong information without deleting it, and suma_clean removes noise nodes from search results. It runs as a hosted service on Cloud Run, so there is no local server to set up. The K-WIL gravity algorithm ranks facts by recency, density, semantic similarity, and emotional weight instead of returning flat chunks. Reach for this when you're tired of re-explaining your auth flow or database schema every time you open a new Claude chat, or when multiple agents need to share context across sessions without explicit handoffs.

SUMA Memory MCP

Stop re-explaining your project to Claude every time you start a new chat.

Your repos now have permanent memory. SUMA gives any MCP-compatible AI client (Claude Code, Cursor, Devin) a persistent knowledge graph that remembers architectural decisions, bug root causes, and project rules — across sessions, across machines, across your entire team.

Install (30 seconds)

Get an API key at sumapro.quadframe.work — free tier available.

Add to your .mcp.json:

{
  "mcpServers": {
    "suma-memory": {
      "url": "https://sumapro.quadframe.work/mcp",
      "headers": {
        "Authorization": "Bearer sk_live_your_key_here"
      }
    }
  }
}

That's it. No local server. No Docker. No npm install. SUMA runs on Cloud Run — stateless, auto-scaled, always available.
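
To keep the key out of version control, the entry above can be generated from an environment variable instead of being typed by hand. This is a minimal sketch, assuming the key is exported as SUMA_API_KEY; `build_mcp_config` is a hypothetical helper written for illustration, not something SUMA ships.

```python
import json
import os

SUMA_URL = "https://sumapro.quadframe.work/mcp"  # endpoint from the config above


def build_mcp_config(api_key: str) -> dict:
    """Return the .mcp.json structure shown above for the given key (hypothetical helper)."""
    if not api_key:
        raise ValueError("API key is empty")
    return {
        "mcpServers": {
            "suma-memory": {
                "url": SUMA_URL,
                "headers": {"Authorization": f"Bearer {api_key}"},
            }
        }
    }


if __name__ == "__main__":
    # Assumption: the key is exported as SUMA_API_KEY in the shell.
    key = os.environ.get("SUMA_API_KEY", "sk_live_your_key_here")
    print(json.dumps(build_mcp_config(key), indent=2))
```

The script only prints the JSON. If your .mcp.json already lists other servers, merge the "suma-memory" entry into it by hand rather than overwriting the file.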

Onboard Your Project (2 minutes)

After installing, run this once per repo to seed your permanent context:

suma_ingest(text="Project: [name]. Framework: [Next.js / Flask / etc].
Auth lives in: [path/to/auth.py]. Database: [PostgreSQL / SQLite / etc].
Rules never to break: [e.g. never store plaintext keys, all routes require org_id filter].
Deployment target: [Cloud Run / Vercel / etc].")

From this point forward, every new session inherits this context. You never explain it again.

How It Works

SUMA stores knowledge in a weighted graph. Every node has a gravity score across four dimensions:

Recency — newer facts surface first
Density — well-connected facts outrank isolated ones
Semantic similarity — vector distance to your query
Emotional weight — high-signal facts are reinforced over time

When you call suma_search, the K-WIL gravity algorithm traverses the graph and returns the highest-relevance context — not a flat list of chunks, not a raw embedding match, but the facts that actually matter for what you're doing right now.
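
The K-WIL formula itself is not published in this README, so the following is only an illustration of how four signals like these could be folded into one ranking score. The equal weighting, the 30-day half-life, the degree cap, and every field name are assumptions made for the sketch.

```python
from dataclasses import dataclass


@dataclass
class Node:
    text: str
    age_days: float     # time since the fact was written
    degree: int         # number of edges to other nodes
    similarity: float   # similarity to the query, assumed scaled to [0, 1]
    weight: float       # reinforcement ("emotional") weight, assumed in [0, 1]


def gravity(node: Node, half_life_days: float = 30.0, max_degree: int = 10) -> float:
    """Illustrative score: equal-weight mean of four signals, each in [0, 1]."""
    recency = 0.5 ** (node.age_days / half_life_days)        # exponential decay with age
    density = min(node.degree, max_degree) / max_degree      # capped connectivity
    return (recency + density + node.similarity + node.weight) / 4.0


def top_k(nodes: list[Node], k: int = 3) -> list[Node]:
    """Return the k highest-scoring nodes, best first."""
    return sorted(nodes, key=gravity, reverse=True)[:k]
```

A real ranker would also walk the graph's edges rather than scoring nodes independently; the sketch only shows why a recent, well-connected fact can outrank a slightly closer embedding match.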

Core Tools

Tool	What it does
`suma_ping`	Health check — verify connection and API key
`suma_ingest`	Add knowledge to the graph (architecture decisions, bug fixes, rules)
`suma_search`	Retrieve relevant context by natural language query
`suma_talk`	Search + learn in one call — retrieves context and updates graph
`suma_correct`	Fix wrong information — supersedes original, queues replacement
`suma_clean`	Hide noise nodes that pollute search results (nodes are superseded, not deleted)
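
MCP clients invoke tools with JSON-RPC 2.0 `tools/call` requests. The sketch below builds the payload a client would send for two of the tools above. The `text` and `query` argument names are taken from this README's own examples; the arguments of the other tools are not documented here, so they are not guessed. `tool_call` is a hypothetical helper.

```python
import itertools
import json

_ids = itertools.count(1)  # JSON-RPC request ids


def tool_call(name: str, **arguments) -> str:
    """Serialize an MCP tools/call request (hypothetical helper)."""
    request = {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }
    return json.dumps(request)


print(tool_call("suma_ingest", text="We chose REST over GraphQL."))
print(tool_call("suma_search", query="why did we switch to REST?"))
```

In practice your MCP client (Claude Code, Cursor, and so on) builds and sends these requests for you; the sketch is only meant to show what travels over the wire.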

Three Use Cases

1. Persistent Architecture Memory

# After finalizing a decision:
suma_ingest(text="We chose REST over GraphQL. Root cause: GraphQL N+1 queries
            caused 3x latency on /search. Architect ruling Apr 10 2026.")

# Next session, cold start — full context in one call:
suma_search(query="why did we switch to REST?")
# → Returns ruling with full context. No re-explaining.

2. Bug Root Cause Archive

# After fixing a hard bug:
suma_ingest(text="Cloud Run WebSocket bug: asyncio.run() in daemon thread killed
            by Cloud Run recycling. Fix: use asyncio.get_event_loop() instead.
            Never use asyncio.run() in long-lived Cloud Run services.")

# Six months later, same error:
suma_search(query="asyncio cloud run daemon thread crash")
# → Root cause retrieved instantly. Hours saved.

3. Multi-Agent Knowledge Fusion

Architect, developer, and QA agents each write to SUMA using their own sessions. Their knowledge merges into one shared org graph. When QA asks "what did the architect decide about auth?", it retrieves the architect's ruling — zero explicit handoff required.

Enterprise Safety

Anti-flood protection: Each source machine is rate-limited to 5 ingests per 60 seconds. Runaway agent loops are broken gracefully — the 6th request returns {"status": "throttled"} without crashing or corrupting the graph.
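
The limit described above (5 ingests per 60 seconds per source) matches a sliding-window rate limiter. This is a minimal sketch of that technique, not SUMA's actual implementation. The class name, the `"ok"` status, and the injectable clock are assumptions; only the `{"status": "throttled"}` response comes from the text above.

```python
import time
from collections import defaultdict, deque


class IngestLimiter:
    """Sliding-window limiter: at most `limit` ingests per `window` seconds per source."""

    def __init__(self, limit: int = 5, window: float = 60.0, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock
        self._hits = defaultdict(deque)  # source id -> timestamps of accepted ingests

    def check(self, source: str) -> dict:
        now = self.clock()
        hits = self._hits[source]
        # Drop timestamps that have left the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return {"status": "throttled"}  # rejected, nothing is recorded
        hits.append(now)
        return {"status": "ok"}  # assumed success shape, not from the README
```

Rejected requests are not recorded, so a runaway loop cannot extend its own lockout; it recovers as soon as the oldest accepted ingest ages out of the window.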

Multi-tenant isolation: Every node is scoped to org_id at the database layer. Two organizations on the same Cloud Run instance cannot access each other's data — enforced by SQL, not application logic.

Immutable audit trail: suma_correct and suma_clean never delete data. Nodes are superseded and invisible to the API while preserved in storage for compliance.
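
The two storage properties above, org-scoped reads and supersede-instead-of-delete, can be illustrated with an in-memory SQLite table. The schema, column names, and helper functions are all assumptions for the sketch. A production system might enforce the org filter with database row-level security rather than a WHERE clause, and the README says replacements are queued, whereas this version writes them immediately.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE nodes (
           id INTEGER PRIMARY KEY,
           org_id TEXT NOT NULL,
           text TEXT NOT NULL,
           superseded_by INTEGER REFERENCES nodes(id)  -- NULL means the node is live
       )"""
)


def ingest(org_id: str, text: str) -> int:
    """Insert a live node and return its id."""
    cur = conn.execute("INSERT INTO nodes (org_id, text) VALUES (?, ?)", (org_id, text))
    return cur.lastrowid


def correct(org_id: str, old_id: int, new_text: str) -> int:
    """Write a replacement and mark the old node superseded; nothing is deleted."""
    new_id = ingest(org_id, new_text)
    conn.execute(
        "UPDATE nodes SET superseded_by = ? WHERE id = ? AND org_id = ?",
        (new_id, old_id, org_id),
    )
    return new_id


def visible(org_id: str) -> list[str]:
    """Live nodes for one org; superseded rows and other orgs are filtered out."""
    rows = conn.execute(
        "SELECT text FROM nodes WHERE org_id = ? AND superseded_by IS NULL ORDER BY id",
        (org_id,),
    )
    return [r[0] for r in rows]
```

After a correction, the old row is still in the table for audit purposes but no longer appears in `visible()`, and a query for one org never returns another org's rows.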

Key Metrics (Live Production — April 2026)

Metric	Value
Compression ratio	94.7% — 801 nodes replace 15.2M tokens
Cost saved per org	$14.47 across 538 queries
K-WIL fidelity	96.3% — 26/27 facts recoverable from 5-node graph
Automated tests	118 (102 Playwright E2E + 16 pytest)

Pricing

Plan	Queries/month	Price
Starter	20,000	Free
Developer	100,000	$4.99/mo
Team	500,000	$29/mo
Enterprise	Unlimited	Contact

Get your key: sumapro.quadframe.work

© 2025–2026 Suman Addanke / A2 Vibe Creators LLC
US Patent applications pending — 6 filed (2025–2026). Unauthorized commercial use prohibited.

Configuration

SUMA_API_KEY (required, secret)

Your SUMA Pro API key — get one free at sumapro.quadframe.work
