Logic Server


Summary

A logic engine that sends only changed facts between conversation turns instead of replaying full history. Extracts structured predicates from natural language, runs backward-chaining inference to find goal-reachable facts, and returns token-optimized deltas. Exposes `/context` for initial reduction, `/memory/context` for goal-driven queries, and `/context/diff` for incremental updates. Built on Hexastore indexing with truth maintenance, so retracting a fact auto-retracts downstream inferences. Works as an MCP server, REST API, or SDK. Useful when you're burning tokens on repeated tool outputs or state replays and need deterministic, reproducible compression without vector search or summarization drift.

NocturnusAI

The context engineering engine for AI agents: send only what changed.

Before / After

# ❌ Without NocturnusAI — replay everything, every turn
messages = system_prompt + full_history + tool_outputs   # ~1,259 tokens/turn
response = llm(messages)                                 # $13,600/mo at scale

# ✅ With NocturnusAI — send only what changed
ctx = nocturnus.process_turns(raw_turns)                 # extract → infer → delta
messages = system_prompt + ctx.briefing_delta             # ~221 tokens/turn
response = llm(messages)                                 # $2,400/mo. Same accuracy.

The Numbers

Measured on live APIs over a 15-turn product support conversation, using real `usage.input_tokens` counts. Run it yourself.

	Naive replay	RAG-optimized	NocturnusAI
Tokens per turn	~1,259	~800	~221
Cost per month (1K req/hr, Opus 4, $15/1M)	$13,600	$12,000	$2,400
Latency	high	medium	low
Truth-preserving	no	no	yes

Claude Opus 4: 5.7× reduction. Gemini 2.0 Flash: 10.0×. Full calculations.
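The naive-replay and NocturnusAI cost figures follow from the table's stated inputs. The sketch below reproduces that arithmetic, assuming a 30-day month (the table does not state the month length) and counting input tokens only. The function and constant names are illustrative.

```python
# Reproduce the table's cost figures from its stated inputs.
# Assumption: a 30-day month, input tokens only.
REQUESTS_PER_HOUR = 1_000
HOURS_PER_MONTH = 24 * 30
PRICE_PER_MILLION_TOKENS = 15.0  # USD, the input price quoted in the table


def monthly_cost(tokens_per_turn: int) -> float:
    """Monthly input-token spend in USD for a given per-turn token count."""
    requests = REQUESTS_PER_HOUR * HOURS_PER_MONTH
    return requests * tokens_per_turn * PRICE_PER_MILLION_TOKENS / 1_000_000


naive = monthly_cost(1_259)    # the table rounds this to $13,600
nocturnus = monthly_cost(221)  # the table rounds this to $2,400
reduction = 1_259 / 221        # the table reports 5.7x for Claude Opus 4

print(f"naive: ${naive:,.0f}  nocturnus: ${nocturnus:,.0f}  reduction: {reduction:.1f}x")
```

Swap in your own request rate and per-turn token counts to estimate the effect on your workload.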

Install

pip install nocturnusai          # Python
npm install nocturnusai-sdk      # TypeScript
docker run -p 9300:9300 ghcr.io/auctalis/nocturnusai:latest  # Docker

Or use the setup wizard:

curl -fsSL https://raw.githubusercontent.com/Auctalis/nocturnusai/main/install.sh | bash

Why Developers Star This Repo

Reproducible token reduction — benchmark in the repo, methodology published, run it against your own workload
Deterministic inference — same query, same result, every time. No embedding drift, no cosine similarity lottery
Truth maintenance — retract a fact, all derived conclusions auto-retract. No stale context, no hallucination on operational state
Plugs into existing stacks — LangChain, LlamaIndex, CrewAI, AutoGen, MCP, Vercel AI SDK, OpenAI Agents SDK, Mastra
Benchmarkable against naive replay — numbers derived, not invented. Every claim traces to a notebook cell
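To make the truth-maintenance bullet concrete, here is a toy justification store in Python. It is a minimal sketch of the general technique, not NocturnusAI's implementation: the `TinyTMS` class, its methods, and the `sla_breach` fact are all hypothetical, and each conclusion carries a single justification.

```python
from typing import Dict, FrozenSet, Iterable, Set


class TinyTMS:
    """Toy justification store: a derived fact holds only while its premises hold."""

    def __init__(self) -> None:
        self.base: Set[str] = set()
        self.justifications: Dict[str, FrozenSet[str]] = {}

    def assert_fact(self, fact: str) -> None:
        self.base.add(fact)

    def derive(self, conclusion: str, premises: Iterable[str]) -> None:
        # Record why the conclusion is believed (one justification per conclusion).
        self.justifications[conclusion] = frozenset(premises)

    def retract(self, fact: str) -> None:
        self.base.discard(fact)

    def holds(self, fact: str, _path: FrozenSet[str] = frozenset()) -> bool:
        if fact in self.base:
            return True
        premises = self.justifications.get(fact)
        if premises is None or fact in _path:
            return False  # unknown fact, or a circular justification
        return all(self.holds(p, _path | {fact}) for p in premises)


tms = TinyTMS()
tms.assert_fact("customer_tier(acme_corp, enterprise)")
tms.assert_fact("sla_breach(acme_corp)")  # hypothetical fact for illustration
tms.derive(
    "eligible_for_sla(acme_corp)",
    ["customer_tier(acme_corp, enterprise)", "sla_breach(acme_corp)"],
)

before = tms.holds("eligible_for_sla(acme_corp)")
tms.retract("sla_breach(acme_corp)")
after = tms.holds("eligible_for_sla(acme_corp)")
print(before, after)
```

Because belief is recomputed from the surviving premises, retracting one base fact is enough to withdraw everything that depended on it.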

Framework Quickstarts

Framework	Integration	Link
LangChain / LangGraph	Drop-in `NocturnusContextProvider`, LangSmith trace pass-through	Docs
CrewAI	Task-scoped context per agent role	Docs
AutoGen	Context server callable by any agent	Docs
MCP	Spec-compliant server for Claude Desktop, Cursor, Continue	Config
OpenAI Agents SDK	Context middleware, no tool modifications	Docs
Vercel AI SDK	Edge-compatible adapter for Next.js, Nuxt, SvelteKit	Docs
Python SDK	`pip install nocturnusai`	Docs
TypeScript SDK	`npm install nocturnusai-sdk`	Docs

How It Works

Three steps. Every turn.

Extract — raw conversation turns → structured facts via LLM extraction
Infer — backward-chaining logical inference finds only the facts reachable from the agent's current goal
Return the delta — a briefingDelta containing only what changed since the last turn
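The Infer step can be illustrated with a small backward chainer over ground facts. This is a sketch of the general technique under simplifying assumptions (no variables, rules keyed by their exact head), not the engine's actual code. The facts, the rule, and the `prove` function are hypothetical.

```python
from typing import Dict, FrozenSet, List, Set

# Hypothetical ground facts and rules, written as plain strings for illustration.
FACTS: Set[str] = {
    "customer_tier(acme_corp, enterprise)",
    "sla_breach(acme_corp)",
    "renewal_due(acme_corp, next_month)",
}
RULES: Dict[str, List[List[str]]] = {
    # head: list of alternative bodies, each body a list of subgoals
    "eligible_for_sla(acme_corp)": [
        ["customer_tier(acme_corp, enterprise)", "sla_breach(acme_corp)"],
    ],
}


def prove(goal: str, used: Set[str], path: FrozenSet[str] = frozenset()) -> bool:
    """Work backward from the goal and collect the base facts the proof relies on."""
    if goal in FACTS:
        used.add(goal)
        return True
    if goal in path:
        return False  # cycle guard: this goal is already being proven
    for body in RULES.get(goal, []):
        support: Set[str] = set()
        if all(prove(sub, support, path | {goal}) for sub in body):
            used |= support  # keep only the facts from the successful body
            return True
    return False


relevant: Set[str] = set()
proved = prove("eligible_for_sla(acme_corp)", relevant)
print(proved, sorted(relevant))
```

Only the two facts that support the goal end up in `relevant`; the renewal fact is true but unreachable from the goal, so it is left out of the context.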

This is not vector search. It is not summarization. It is deterministic inference on a logic engine — Hexastore indexing, backward chaining, and truth maintenance.
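For readers unfamiliar with Hexastore indexing: the idea is to store each subject/predicate/object triple under all six orderings, so any pattern with bound positions can be answered by walking a prefix of one index. The sketch below is a toy in-memory version of that idea. The `TinyHexastore` class and the sample triples are hypothetical and say nothing about how NocturnusAI lays out its storage.

```python
from collections import defaultdict
from itertools import permutations
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]


class TinyHexastore:
    # The six orderings: spo, sop, pso, pos, osp, ops.
    ORDERS = ["".join(p) for p in permutations("spo")]

    def __init__(self) -> None:
        # One nested index per ordering: first -> second -> set of thirds.
        self.indexes = {
            order: defaultdict(lambda: defaultdict(set)) for order in self.ORDERS
        }

    def add(self, s: str, p: str, o: str) -> None:
        triple = {"s": s, "p": p, "o": o}
        for order in self.ORDERS:
            a, b, c = (triple[k] for k in order)
            self.indexes[order][a][b].add(c)

    def match(self, s: Optional[str] = None, p: Optional[str] = None,
              o: Optional[str] = None) -> List[Triple]:
        bound = {"s": s, "p": p, "o": o}
        known = [k for k in "spo" if bound[k] is not None]
        free = [k for k in "spo" if bound[k] is None]
        order = "".join(known + free)  # bound positions form the index prefix
        index = self.indexes[order]
        results: List[Triple] = []
        for a in ([bound[order[0]]] if order[0] in known else list(index)):
            if a not in index:
                continue
            level2 = index[a]
            for b in ([bound[order[1]]] if order[1] in known else list(level2)):
                if b not in level2:
                    continue
                level3 = level2[b]
                for c in ([bound[order[2]]] if order[2] in known else list(level3)):
                    if c in level3:
                        row = dict(zip(order, (a, b, c)))
                        results.append((row["s"], row["p"], row["o"]))
        return sorted(results)


store = TinyHexastore()
store.add("acme_corp", "customer_tier", "enterprise")
store.add("acme_corp", "arr", "2M")
store.add("globex", "customer_tier", "startup")

by_predicate = store.match(p="customer_tier")
by_subject_object = store.match(s="acme_corp", o="enterprise")
print(by_predicate, by_subject_object)
```

The cost is six copies of every triple; the benefit is that no query pattern requires a full scan.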

The Working Loop

LLM required for natural-language turns. The examples below send raw text turns through an LLM to extract structured facts. If you start the server without an LLM provider, natural-language turns return zero facts. See Quick Start for setup options, or use predicate syntax (e.g., `customer_tier(acme_corp, enterprise)`), which works without an LLM.

1. First reduction: `POST /context`

curl -X POST http://localhost:9300/context \
  -H 'Content-Type: application/json' \
  -H 'X-Tenant-ID: default' \
  -d '{
    "turns": [
      "user: Customer says they are enterprise and blocked on SLA credits.",
      "tool: CRM says account is Acme Corp with a 2M ARR contract.",
      "agent: Last week support promised to review SLA eligibility.",
      "tool: Billing note says renewal is due next month."
    ],
    "maxFacts": 12
  }'

2. Goal-driven pass: `POST /memory/context`

curl -X POST http://localhost:9300/memory/context \
  -H 'Content-Type: application/json' \
  -H 'X-Tenant-ID: default' \
  -d '{
    "goals": [{"predicate":"eligible_for_sla","args":["acme_corp"]}],
    "maxFacts": 12,
    "sessionId": "ticket-42"
  }'
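The `goals` array takes objects with `predicate` and `args` fields. If your goals start out as predicate-syntax strings, a small helper can convert them. The sketch below is a hypothetical convenience, not part of the SDK. It handles flat arguments only, with no nested terms or quoted commas.

```python
import json
import re

# Matches name(arg1, arg2, ...) with a flat argument list.
_PREDICATE_RE = re.compile(r"^\s*([A-Za-z_][A-Za-z0-9_]*)\s*\((.*)\)\s*$")


def predicate_to_goal(text: str) -> dict:
    """Turn 'name(a, b)' into {'predicate': 'name', 'args': ['a', 'b']}."""
    match = _PREDICATE_RE.match(text)
    if match is None:
        raise ValueError(f"not in predicate syntax: {text!r}")
    name, raw_args = match.groups()
    args = [arg.strip() for arg in raw_args.split(",") if arg.strip()]
    return {"predicate": name, "args": args}


# Same body as the curl example above, built from a predicate string.
payload = {
    "goals": [predicate_to_goal("eligible_for_sla(acme_corp)")],
    "maxFacts": 12,
    "sessionId": "ticket-42",
}
print(json.dumps(payload))
```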

3. Later turns: `POST /context/diff`

curl -X POST http://localhost:9300/context/diff \
  -H 'Content-Type: application/json' \
  -H 'X-Tenant-ID: default' \
  -d '{"sessionId": "ticket-42", "maxFacts": 12}'

Returns only added and removed entries between snapshots.
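Conceptually, a diff between two snapshots is a pair of set differences. The sketch below shows that idea on plain strings. The `added` and `removed` key names mirror the sentence above, but the actual response shape is an assumption here, and `diff_snapshots` is a hypothetical helper, so check the API reference before relying on it.

```python
from typing import Dict, List, Set


def diff_snapshots(previous: Set[str], current: Set[str]) -> Dict[str, List[str]]:
    """Return the facts that appeared and disappeared between two snapshots."""
    return {
        "added": sorted(current - previous),
        "removed": sorted(previous - current),
    }


turn_1 = {
    "customer_tier(acme_corp, enterprise)",
    "renewal_due(acme_corp, next_month)",
}
turn_2 = {
    "customer_tier(acme_corp, enterprise)",
    "eligible_for_sla(acme_corp)",
}

delta = diff_snapshots(turn_1, turn_2)
print(delta)
```

The unchanged tier fact appears in neither list, which is why the per-turn payload stays small.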

4. End of thread: `POST /context/session/clear`

curl -X POST http://localhost:9300/context/session/clear \
  -H 'Content-Type: application/json' \
  -H 'X-Tenant-ID: default' \
  -d '{"sessionId":"ticket-42"}'
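The four calls above can also be driven from Python with only the standard library. This sketch builds the same requests as the curl examples without sending them. The `build_request` and `send` helpers are hypothetical, and `send` assumes the server replies with a JSON body.

```python
import json
import urllib.request

BASE_URL = "http://localhost:9300"
HEADERS = {"Content-Type": "application/json", "X-Tenant-ID": "default"}


def build_request(path: str, body: dict) -> urllib.request.Request:
    """Build a POST request matching the curl examples (nothing is sent yet)."""
    return urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(body).encode("utf-8"),
        headers=HEADERS,
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send a request to a running server; assumes a JSON response body."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


session = "ticket-42"
loop = [
    build_request("/context", {
        "turns": ["user: Customer says they are enterprise and blocked on SLA credits."],
        "maxFacts": 12,
    }),
    build_request("/memory/context", {
        "goals": [{"predicate": "eligible_for_sla", "args": ["acme_corp"]}],
        "maxFacts": 12,
        "sessionId": session,
    }),
    build_request("/context/diff", {"sessionId": session, "maxFacts": 12}),
    build_request("/context/session/clear", {"sessionId": session}),
]

for request in loop:
    print(request.get_method(), request.full_url)
```

With a server running on port 9300, call `send(request)` on each entry in order to execute the loop.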

Choose Your Surface

Python SDK

from nocturnusai import SyncNocturnusAIClient

with SyncNocturnusAIClient("http://localhost:9300") as client:
    # First pass: extract facts from the raw turns and get the initial briefing.
    ctx = client.process_turns(
        turns=[
            "user: Customer says they are enterprise and blocked on SLA credits.",
            "tool: CRM says account is Acme Corp with a 2M ARR contract.",
        ],
        scope="ticket-42",
        session_id="ticket-42",
    )
    print(ctx.briefing_delta)

    # Later turns: fetch only what changed since the previous snapshot.
    diff = client.diff_context(session_id="ticket-42", max_facts=12)
    print(diff)

    # End of thread: drop the session state.
    client.clear_context_session("ticket-42")

TypeScript SDK

import { NocturnusAIClient } from 'nocturnusai-sdk';

const client = new NocturnusAIClient({
  baseUrl: 'http://localhost:9300',
  tenantId: 'default',
});

const ctx = await client.processTurns({
  turns: [
    'user: Customer says they are enterprise and blocked on SLA credits.',
    'tool: CRM says account is Acme Corp with a 2M ARR contract.',
  ],
  scope: 'ticket-42',
  sessionId: 'ticket-42',
});

const diff = await client.diffContext({ sessionId: 'ticket-42', maxFacts: 12 });
await client.clearContextSession('ticket-42');
console.log(ctx.briefingDelta);

MCP

{
  "mcpServers": {
    "nocturnus": {
      "url": "http://localhost:9300/mcp/sse",
      "transport": "sse"
    }
  }
}

Use the context tool each turn for a salience-ranked working set. Pair MCP with the HTTP context endpoints when you need goal-driven assembly and diffs.

What Lives Behind The Workflow

When you do need backend mechanics, NocturnusAI provides them:

Deterministic fact and rule storage
Backward-chaining inference with proof chains
Truth maintenance and contradiction handling
Temporal facts with `ttl`, `validFrom`, and `validUntil`
Multi-tenancy via `X-Database` and `X-Tenant-ID`
MCP, REST, Python SDK, TypeScript SDK, and CLI surfaces over the same engine
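As an illustration of the temporal-facts item above, the sketch below models a fact with a validity window and a time-to-live. The semantics are assumptions made for the example: `valid_from` is inclusive, `valid_until` is exclusive, and `ttl` is measured from the moment the fact was asserted. The `TemporalFact` class is hypothetical, so consult the API reference for the engine's actual rules.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass(frozen=True)
class TemporalFact:
    statement: str
    asserted_at: datetime
    ttl: Optional[timedelta] = None         # assumed: counted from asserted_at
    valid_from: Optional[datetime] = None   # assumed: inclusive lower bound
    valid_until: Optional[datetime] = None  # assumed: exclusive upper bound

    def is_valid_at(self, moment: datetime) -> bool:
        """True if the fact should be visible to inference at the given moment."""
        if self.valid_from is not None and moment < self.valid_from:
            return False
        if self.valid_until is not None and moment >= self.valid_until:
            return False
        if self.ttl is not None and moment >= self.asserted_at + self.ttl:
            return False
        return True


now = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)
promise = TemporalFact(
    statement="review_promised(acme_corp)",
    asserted_at=now,
    ttl=timedelta(hours=1),
)

fresh = promise.is_valid_at(now + timedelta(minutes=30))
expired = promise.is_valid_at(now + timedelta(hours=2))
print(fresh, expired)
```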

Quick Start

Docker (fastest)

docker run -d --name nocturnusai -p 9300:9300 \
  --restart unless-stopped \
  -v nocturnusai-data:/data \
  ghcr.io/auctalis/nocturnusai:latest

curl http://localhost:9300/health   # Verify it's running

Docker with Ollama (enables natural-language extraction)

docker run -d --name nocturnusai -p 9300:9300 \
  --add-host=host.docker.internal:host-gateway \
  -e LLM_PROVIDER=ollama \
  -e LLM_MODEL=granite3.3:8b \
  -e LLM_BASE_URL=http://host.docker.internal:11434/v1 \
  -e EXTRACTION_ENABLED=true \
  ghcr.io/auctalis/nocturnusai:latest

From this repo

make up-ollama && make smoke

CLI

nocturnusai                                # Interactive REPL
nocturnusai -e "context 10"               # Salience-ranked working set
nocturnusai -e "compress"                 # POST /memory/compress
nocturnusai -e "cleanup 0.05"             # POST /memory/cleanup

Documentation

Full docs: nocturnus.ai


Start Here	The turn-reduction workflow
Context Workflow	Raw turns → optimize → diff → clear
API Reference	REST endpoints and response shapes
SDKs	Python and TypeScript client methods
Integrations	LangChain, CrewAI, AutoGen, MCP, and more
Benchmark	Measured token reduction on live APIs
Calculations	Every number, derived
How It Works	The extraction → inference → delta pipeline

Docker Compose (advanced)

git clone https://github.com/Auctalis/nocturnusai.git && cd nocturnusai

make up                                        # Server using .env.example defaults
make up-ollama                                 # + Ollama (reuses host or starts bundled)
make up-monitoring                             # + Prometheus + Grafana
make smoke                                     # Verify health + context endpoint

Build from Source

Requires JDK 17+.

./gradlew :nocturnusai-server:run              # HTTP server on :9300
./gradlew :nocturnusai-cli:run                 # Interactive REPL (JVM)
./gradlew :nocturnusai-cli:nativeCompile       # Build native binary
./gradlew test                                 # Full test suite

Contributing

See CONTRIBUTING.md. Issues labelled "good first issue" are good entry points.

Security

Report vulnerabilities privately via GitHub Security Advisories. See SECURITY.md.

License

Business Source License 1.1 (SPDX: BUSL-1.1). Free for internal use — including internal production — inside your own organization. Offering NocturnusAI or substantial functionality as a product/hosted service to third parties requires a commercial license (licensing@nocturnus.ai). Converts to Apache 2.0 on 2030-02-19. See LICENSE and DISCLAIMER.md.

LEGAL & SAFETY NOTICE

NocturnusAI is a deterministic reasoning engine, but its output is only as reliable as the facts provided to it.

No Warranty of Truth. "Verified" refers to logical consistency of inference, not accuracy of real-world claims.

Not for Autonomous High-Stakes Decisions. Do not use this engine for unsupervised medical, financial, legal, or physical-safety decisions without an independent human verification step.

Logic Layer Only. NocturnusAI provides information and inference; it does not execute actions.

No Liability. See DISCLAIMER.md and LICENSE.

Configuration

NOCTURNUSAI_URL

Base URL of the NocturnusAI server

NOCTURNUSAI_API_KEY (secret)

API key for authenticated access

NOCTURNUSAI_DATABASE

Database name to use

NOCTURNUSAI_TENANT

Tenant ID for multi-tenant isolation
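A client wrapper might read these variables as follows. This is a sketch under assumptions: the defaults (`http://localhost:9300`, tenant `default`) are taken from the examples in this README rather than from documented defaults, and mapping the database and tenant values onto the `X-Database` and `X-Tenant-ID` headers is inferred from the multi-tenancy note. How the API key is transmitted is not stated here, so the sketch only loads it. The `NocturnusConfig`, `load_config`, and `tenancy_headers` names are hypothetical.

```python
import os
from dataclasses import dataclass
from typing import Dict, Mapping, Optional


@dataclass(frozen=True)
class NocturnusConfig:
    url: str
    api_key: Optional[str]
    database: Optional[str]
    tenant: str


def load_config(env: Mapping[str, str] = os.environ) -> NocturnusConfig:
    """Read the four configuration variables, falling back to assumed defaults."""
    return NocturnusConfig(
        url=env.get("NOCTURNUSAI_URL", "http://localhost:9300"),
        api_key=env.get("NOCTURNUSAI_API_KEY"),
        database=env.get("NOCTURNUSAI_DATABASE"),
        tenant=env.get("NOCTURNUSAI_TENANT", "default"),
    )


def tenancy_headers(config: NocturnusConfig) -> Dict[str, str]:
    """Build the multi-tenancy headers (assumed mapping from config to headers)."""
    headers = {"X-Tenant-ID": config.tenant}
    if config.database:
        headers["X-Database"] = config.database
    return headers


config = load_config({"NOCTURNUSAI_TENANT": "acme", "NOCTURNUSAI_DATABASE": "support"})
print(config.url, tenancy_headers(config))
```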
