Mini Data Engine Runtime Copilot

kroq86/data-engineering-runtime-lab
Transport: STDIO, registry: active
Summary

A runtime diagnostics layer for local data engineering labs that wraps PostgreSQL-like and Databricks-like toy implementations with MCP tooling. Exposes 47 operations including explain_run for traced execution timelines, project_run_regression for full test suites with explainability, health_check and benchmark_calls for SLO tracking, and memory_search for incident retrieval. Built for teams running onboarding workshops or teaching storage internals who want runtime operations accessible through structured tooling instead of shell scripts. The explain layer treats both successful runs and controlled failure scenarios as first-class traced events, so you can inspect idempotency conflicts or concurrency storms the same way you'd review a prod incident.


Interactive Data Systems Lab

Runnable Python and Rust data-system internals with an MCP-native runtime copilot layer.

This repository combines two things:

  • a local-first engineering lab for PostgreSQL-like and Databricks-like internals,
  • a Runtime Copilot MCP surface for diagnostics, explainability, regression checks, and operational memory.

Core capabilities

  • Storage and planner internals: heap tables, B-tree indexing, selectivity, and plan choice.
  • Persistence and replay: WAL/checkpoint style flows and deterministic state transitions.
  • Workflow and write-path modeling: event-first architecture, idempotency, retry semantics.
  • Explainable runtime operations: traced runs, failure summaries, regression verdicts, baseline compare.
  • MCP access: machine-usable operational interface instead of ad hoc shell scripts.
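
To make the "selectivity and plan choice" capability concrete, here is a minimal sketch of how a toy planner might pick between a sequential scan and an index scan. It is not taken from mini_pg_like.py: the function names and the 10% threshold are assumptions chosen for illustration only.

```python
# Hypothetical cutoff: use the index only when at most 10% of rows match.
INDEX_SELECTIVITY_THRESHOLD = 0.10


def estimate_selectivity(values, target):
    """Fraction of rows whose column equals target (exact count on toy data)."""
    if not values:
        return 0.0
    return sum(1 for v in values if v == target) / len(values)


def choose_plan(values, target, has_index):
    """Pick 'Index Scan' for selective predicates when an index exists."""
    selectivity = estimate_selectivity(values, target)
    if has_index and selectivity <= INDEX_SELECTIVITY_THRESHOLD:
        return "Index Scan"
    return "Seq Scan"


# 4242 matches 1 row out of 100; 1 matches the other 99.
customer_ids = [4242] + [1] * 99

print(choose_plan(customer_ids, 4242, has_index=True))   # Index Scan
print(choose_plan(customer_ids, 1, has_index=True))      # Seq Scan
print(choose_plan(customer_ids, 4242, has_index=False))  # Seq Scan
```

A real planner would estimate selectivity from statistics rather than counting every row, and would compare costs rather than apply a fixed cutoff. The sketch only shows the shape of the decision.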

Who this is for

  • Data engineers learning warehouse and query-engine internals.
  • Platform and infrastructure engineers teaching storage and execution fundamentals.
  • Teams building onboarding labs, workshops, and demo environments.

Quick start

Requirements:

  • Python 3.10+ (tested with Python 3.14)
  • A Rust toolchain with cargo, needed for the cargo run commands in this README

Setup:

python3 -m venv .venv
source .venv/bin/activate
python -m pip install "psycopg[binary]"

Run end-to-end flow:

cargo run --bin e2e_flow

Run core demos:

.venv/bin/python mini_pg_like.py
.venv/bin/python mini_databricks_clone.py
cargo run --bin mini_pg_like
cargo run --bin mini_databricks_clone

What this repository contains

  • mini_pg_like.py: PostgreSQL-like toy engine with heap table, B-tree index, and planner output.
  • mini_databricks_clone.py: Databricks-like toy platform with versioning, partitions, DAGs, and events.
  • src/bin/mini_pg_like.rs: Rust PostgreSQL-like demo.
  • src/bin/mini_databricks_clone.rs: Rust Databricks-like demo.
  • src/lib.rs, src/common.rs, src/pg.rs: shared Rust core modules.
  • mcp_engine_server.py: MCP runtime adapter for diagnostics and regression workflows.

Why this exists

Most internals content stops at diagrams. This project stays runnable and inspectable:

  • compare Python and Rust implementations of the same system ideas,
  • trace write-path behavior with concrete events and state transitions,
  • run explainable regression checks through MCP,
  • turn runtime operations into a discoverable control surface for AI clients.

MCP Adapter Layer

This repo also includes a minimal MCP server that wraps the lab operations:

  • mcp_engine_server.py
  • Cursor config: .cursor/mcp.json

Current MCP tool list for this release (47 tools total):

Engine state and runtime:

  • init_engine
  • insert_row
  • upsert_row
  • create_index
  • explain_customer
  • reindex_project
  • run_e2e_flow

Explainability and demos:

  • explain_run
  • demo_explain_run
  • demo_explain_run_failure
  • demo_explain_semantic_failure
  • demo_explain_idempotency_conflict
  • demo_explain_concurrency_failure_storm
  • explain_regression_suite

Trace and retrieval:

  • record_tool_trace
  • similar_incidents
  • refresh_trace_path
  • refresh_docs_path
  • memory_upsert
  • memory_search

SLO and ROI:

  • health_check
  • benchmark_calls
  • scenario_load_test
  • capture_roi_baseline
  • report_drift_bug
  • decision_gate

Schema evaluation (verdict + report via MCP, no external Postgres):

  • schema_load_tool
  • schema_explain_tool
  • schema_evaluate_tool
  • schema_evaluate_full_tool

Project contract and regression:

  • project_manifest
  • project_capabilities
  • project_tool_catalog
  • project_get_defaults
  • project_run_regression
  • project_capture_baseline
  • project_compare_baseline

Generic project state:

  • project_list_entities
  • project_get_entity
  • project_upsert_entity
  • project_delete_entity
  • project_append_event
  • project_ingest_trace
  • project_explain_run
  • project_export_state

Generic heuristics:

  • project_list_heuristics
  • project_run_heuristic

For machine-readable discovery, prefer:

  • project_tool_catalog
  • project_get_defaults
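
For clients that talk to the server without an MCP SDK, the discovery calls above can be expressed as plain JSON-RPC messages. The sketch below only builds the request payloads. The tools/call method and its name/arguments parameters come from the MCP specification, but the helper name is hypothetical, and passing empty arguments to these two tools is an assumption.

```python
import json
from itertools import count

_request_ids = count(1)


def tool_call_request(tool_name, arguments=None):
    """Build a JSON-RPC 2.0 request for the MCP tools/call method."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments or {}},
    }


# Assumption: both discovery tools can be called without arguments.
catalog_request = tool_call_request("project_tool_catalog")
defaults_request = tool_call_request("project_get_defaults")

# On the stdio transport each message is sent as a single line of JSON.
print(json.dumps(catalog_request))
print(json.dumps(defaults_request))
```

A real session would first perform the MCP initialize handshake before sending any tools/call request. An SDK client handles that step for you.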

Current heuristic profiles available through project_run_heuristic:

  • pain_structure
  • naive_bias
  • price_distribution
  • liquidity_signals
  • price_liquidity_matrix
  • cross_category
  • sale_format
  • speed_signals
  • trust_signals

If Cursor MCP auto-discovery is enabled, restart Cursor and connect mini-data-engine. Default MCP runtime data paths are under tests/artifacts/mcp/*.

Cursor approval setup (reduce repeated prompts)

If Cursor keeps asking for MCP or command approval on every call, apply this once:

  1. Enable workspace trust in Cursor user settings:
"security.workspace.trust.enabled": true
  2. In Cursor, open Settings -> Agents -> Auto-Run and set:

    • Auto-run mode: Run in Sandbox
    • MCP Allowlist: add mini-data-engine tools you use often
    • Command Allowlist: add frequently used safe commands
  3. Keep this repo opened as the same trusted workspace and reload the window once.

Notes:

  • MCP server approval and per-tool allowlist behavior are enforced by Cursor security settings.
  • In some Cursor versions, allowlist behavior can be best-effort and still prompt in edge cases.

Fastest way to see the new explainability use case in action through MCP:

demo_explain_run

That single tool call creates a traced run, records step-level events under one run_id, and returns an explanation with:

  • ordered timeline
  • tool path
  • total elapsed time
  • failure summary if anything breaks

You can then replay the same explanation directly with:

explain_run(run_id="...")
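
As a rough illustration of what such an explanation contains, the sketch below groups step-level events by run_id and derives the four fields listed above. The StepEvent fields and the shape of the returned dictionary are assumptions for illustration. The actual explain_run output format is not documented here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StepEvent:
    run_id: str
    tool: str
    started_at: float            # seconds since the run began
    elapsed: float               # seconds spent in this step
    error: Optional[str] = None  # set only when the step failed


def explain(events, run_id):
    """Summarize all step events recorded under one run_id."""
    steps = sorted(
        (e for e in events if e.run_id == run_id),
        key=lambda e: e.started_at,
    )
    failures = [e for e in steps if e.error is not None]
    return {
        "run_id": run_id,
        "timeline": [(e.started_at, e.tool, "failed" if e.error else "ok") for e in steps],
        "tool_path": [e.tool for e in steps],
        "total_elapsed": sum(e.elapsed for e in steps),
        "failure_summary": f"{failures[0].tool}: {failures[0].error}" if failures else None,
    }


# Events arrive unordered and may belong to different runs.
events = [
    StepEvent("run-1", "create_index", 0.75, 0.25, error="index path missing"),
    StepEvent("run-2", "health_check", 0.0, 0.125),
    StepEvent("run-1", "init_engine", 0.0, 0.5),
    StepEvent("run-1", "insert_row", 0.5, 0.25),
]
report = explain(events, "run-1")
print(report["tool_path"])
print(report["failure_summary"])
```

Because the explanation is derived purely from recorded events, replaying it for the same run_id gives the same answer, which is what makes a later explain_run call on a stored run meaningful.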

Explain Regression Suite

Use explain_regression_suite when you want regression checks to run through MCP and come back as explainable run summaries instead of isolated test output.

The suite drives the current validation surface through the MCP layer, attaches run_id traces, and returns explain output for each check so regressions can be inspected with the same mechanism used for runtime incidents.

It currently runs:

  • Python unit tests via python -m unittest discover
  • Rust tests via cargo test, including the current engine_cli integration tests
  • health_check
  • benchmark_calls
  • scenario_load_test
  • explainability control demos

The explainability demos intentionally include both positive and negative controls:

  • demo_explain_run as expected_success
  • demo_explain_run_failure as expected_failure
  • demo_explain_semantic_failure as expected_failure
  • demo_explain_idempotency_conflict as expected_failure
  • demo_explain_concurrency_failure_storm as expected_failure

That means the suite is not only checking that the happy path stays green. It also checks that the explain layer still classifies and summarizes known failure classes correctly.
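
The positive and negative control idea can be sketched as follows: a control passes when its observed outcome matches its expectation, so an expected_failure demo that suddenly succeeds counts as a regression. The expectation labels are taken from the list above. The function names and the verdict format are hypothetical.

```python
EXPECTATIONS = {
    "demo_explain_run": "expected_success",
    "demo_explain_run_failure": "expected_failure",
    "demo_explain_semantic_failure": "expected_failure",
    "demo_explain_idempotency_conflict": "expected_failure",
    "demo_explain_concurrency_failure_storm": "expected_failure",
}


def check_passed(name, run_failed):
    """A control passes when the observed outcome matches its expectation."""
    expected_failure = EXPECTATIONS[name] == "expected_failure"
    return run_failed == expected_failure


def suite_verdict(observed):
    """observed maps control name -> True if the traced run failed."""
    regressions = [name for name, failed in observed.items()
                   if not check_passed(name, failed)]
    return {"verdict": "fail" if regressions else "pass", "regressions": regressions}


# Every control behaves as expected: only the happy path succeeds.
healthy = {name: label == "expected_failure" for name, label in EXPECTATIONS.items()}
print(suite_verdict(healthy))

# The idempotency conflict is no longer detected, so the demo "succeeds".
broken = dict(healthy, demo_explain_idempotency_conflict=False)
print(suite_verdict(broken))
```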

The current regression surface covers:

  • happy-path traced execution
  • runtime/path failures
  • semantic data validation failures
  • idempotency conflict failures
  • concurrency and failure-storm control scenarios
  • sampled benchmark and scenario SLO regressions

Fastest MCP call for the full regression bundle:

explain_regression_suite

Use it as the top-level MCP regression entrypoint when you want one answer that includes:

  • which checks passed
  • which failures were expected controls
  • explain summaries for each traced run
  • early signals that a latency or behavior regression appeared

The MCP layer is an access interface, not the core product idea. The core of the repository is the runnable lab itself.

Product notes:

  • PRODUCT_NOTE_RUNTIME_EXPLAINABILITY.md: short note describing the runtime explainability use case, the required signals, and the explain_run MVP.
  • PRODUCT_NOTE_RUNTIME_COPILOT.md: product framing for Runtime Copilot as an MCP-native operational brain.
  • EXPLAIN_REGRESSION_SUITE_FEASIBILITY.md: short article describing what this repository validated about explain-first regression suites and where the current denominator still stays narrow.

Use in Codex:

  • skill package: codex/skills/runtime-copilot/SKILL.md
  • automation examples: codex/automations
  • guide: docs/use-in-codex.md

Run persistent engine CLI (productization path):

# Initialize storage
cargo run --bin engine_cli -- init ./tests/artifacts/engine/data orders

# Insert and upsert (WAL append)
cargo run --bin engine_cli -- insert ./tests/artifacts/engine/data orders 1 4242 50
cargo run --bin engine_cli -- upsert ./tests/artifacts/engine/data orders 1 4242 55

# Build index and explain
cargo run --bin engine_cli -- index ./tests/artifacts/engine/data orders
cargo run --bin engine_cli -- explain ./tests/artifacts/engine/data orders 4242

# Write snapshot and truncate WAL
cargo run --bin engine_cli -- checkpoint ./tests/artifacts/engine/data orders

# Transaction simulation: begin/commit/rollback semantics,
# per-table write lock, snapshot read, and conflict detection
cargo run --bin engine_cli -- tx-demo ./tests/artifacts/engine/data orders

# Crash/restart recovery for transaction journals
cargo run --bin engine_cli -- tx-recovery-list ./tests/artifacts/engine/data orders
cargo run --bin engine_cli -- tx-recovery-commit ./tests/artifacts/engine/data orders <tx_id>
cargo run --bin engine_cli -- tx-recovery-rollback ./tests/artifacts/engine/data orders <tx_id>
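
The insert, upsert, and checkpoint commands above follow the usual snapshot-plus-WAL pattern. The Python sketch below shows that pattern in miniature and is not a port of engine_cli: the file names, the JSON record format, and the reading of the CLI arguments as row id, customer id, and amount are all assumptions.

```python
import json
import tempfile
from pathlib import Path


class ToyStore:
    """Snapshot file plus append-only WAL (file names are hypothetical)."""

    def __init__(self, data_dir):
        self.snapshot_path = Path(data_dir) / "snapshot.json"
        self.wal_path = Path(data_dir) / "wal.jsonl"

    def upsert(self, key, value):
        # Every change is appended to the WAL as one JSON line.
        with self.wal_path.open("a") as wal:
            wal.write(json.dumps({"key": key, "value": value}) + "\n")

    def load(self):
        # Recovery: start from the last snapshot, then replay the WAL in order.
        rows = {}
        if self.snapshot_path.exists():
            rows = json.loads(self.snapshot_path.read_text())
        if self.wal_path.exists():
            for line in self.wal_path.read_text().splitlines():
                record = json.loads(line)
                rows[record["key"]] = record["value"]
        return rows

    def checkpoint(self):
        # Persist the full state as a snapshot, then truncate the WAL.
        rows = self.load()
        self.snapshot_path.write_text(json.dumps(rows))
        self.wal_path.write_text("")


with tempfile.TemporaryDirectory() as data_dir:
    store = ToyStore(data_dir)
    store.upsert("1", {"customer_id": 4242, "amount": 50})
    store.upsert("1", {"customer_id": 4242, "amount": 55})
    before = store.load()
    store.checkpoint()
    wal_after_checkpoint = store.wal_path.read_text()
    after = ToyStore(data_dir).load()  # a fresh instance simulates a restart

print(before == after, repr(wal_after_checkpoint))
```

The sketch skips fsync and atomic snapshot replacement, both of which a durable engine needs. It only demonstrates that state rebuilt after a checkpoint matches state rebuilt from the WAL alone.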

What You Should See

  • In mini_pg_like.py: selective predicate switches to Index Scan; non-selective stays Seq Scan.
  • In mini_databricks_clone.py: layer-by-layer demo output, workflow DAG order/metrics, and canonical events count from the single write path.
  • In mini_pg_like (Rust): same planner behavior with shared core modules.
  • In mini_databricks_clone (Rust): same layered demo using shared Rust library code.
  • In engine_cli (Rust): persistent snapshot + WAL replay flow with simple operational commands.
  • In engine_cli tx-demo: explicit transaction scopes, snapshot reads, per-table write lock, and concurrent upsert conflict detection.
  • In engine_cli tx-recovery-*: staged transaction operations survive process restarts via per-transaction journal files and can be committed or rolled back explicitly.
  • In e2e_flow: one command runs write path, checkpoint, bronze->silver transform, planner explain, and DuckDB SQL validation on persisted data.
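
The idempotency behavior behind the single write path can be illustrated with a small event log keyed by an idempotency key. This is a sketch under assumptions, not the repository's implementation: the class names and the rule that a retry with a different payload is a conflict are chosen for illustration.

```python
class IdempotencyConflict(Exception):
    """Raised when a key is reused with a different payload."""


class EventLog:
    def __init__(self):
        self.events = []   # canonical, append-only event list
        self._seen = {}    # idempotency key -> payload already applied

    def append(self, key, payload):
        """Return True if the event was appended, False if it was a safe retry."""
        if key in self._seen:
            if self._seen[key] != payload:
                raise IdempotencyConflict(f"key {key!r} reused with a different payload")
            return False  # identical retry: no duplicate event is written
        self._seen[key] = payload
        self.events.append(payload)
        return True


log = EventLog()
first = log.append("order-1", {"customer_id": 4242, "amount": 50})
retry = log.append("order-1", {"customer_id": 4242, "amount": 50})
try:
    log.append("order-1", {"customer_id": 4242, "amount": 55})
    conflict_detected = False
except IdempotencyConflict:
    conflict_detected = True

print(first, retry, conflict_detected, len(log.events))  # True False True 1
```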

Technical Design Backbone

TECHNICAL_DESIGN_GENERIC.md captures the architectural discipline behind the code:

  • cross-layer reasoning (Idea -> API -> Runtime -> Storage -> Perf),
  • deterministic state transitions,
  • event-first design,
  • adapter contracts,
  • DAG-driven orchestration,
  • measurable go/no-go criteria.

It is not a separate product claim. It is the review and implementation spine used across the lab.

Docker package

Use the local build (recommended for development)

Build the image from this repo so MCP uses your local code (including schema tools) instead of the GitHub image:

./scripts/docker-build-local.sh

This builds mini-data-engine:local. To drive another project (e.g. threads) with this MCP server, point the Cursor MCP config at the local image and mount that project as the workspace:

  • Copy .cursor/mcp.docker.local.json into your project’s .cursor/mcp.json (or merge the mcpServers entry into your Cursor user config).
  • Open the project you want to drive (e.g. threads). ${workspaceFolder} will be that project; the container gets WORKSPACE_ROOT=/workspace and your project mounted at /workspace, so e.g. schema_path="schema.sql" resolves to that project’s file.

Example local config (uses mini-data-engine:local and mounts current workspace as /workspace):

{
  "mcpServers": {
    "mini-data-engine": {
      "command": "docker",
      "args": [
        "run", "--rm", "-i",
        "-e", "WORKSPACE_ROOT=/workspace",
        "-v", "${workspaceFolder}:/workspace",
        "-v", "${workspaceFolder}/tests/artifacts:/app/tests/artifacts",
        "mini-data-engine:local"
      ]
    }
  }
}
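
With this config, a relative argument such as schema_path="schema.sql" has to be resolved against WORKSPACE_ROOT inside the container. The sketch below shows one way that resolution could work, including a guard against paths that escape the mounted workspace. The function name and the guard are assumptions and are not confirmed behavior of mcp_engine_server.py.

```python
import os
from pathlib import Path


def resolve_in_workspace(relative_path, env=None):
    """Resolve a tool argument such as schema_path against WORKSPACE_ROOT."""
    env = os.environ if env is None else env
    root = Path(env.get("WORKSPACE_ROOT", ".")).resolve()
    candidate = (root / relative_path).resolve()
    # Refuse paths that escape the mounted workspace (e.g. "../secrets").
    if not candidate.is_relative_to(root):
        raise ValueError(f"{relative_path!r} is outside {root}")
    return candidate


# Mirrors the "-e WORKSPACE_ROOT=/workspace" flag in the config above.
container_env = {"WORKSPACE_ROOT": "/workspace"}
print(resolve_in_workspace("schema.sql", container_env))

try:
    resolve_in_workspace("../outside.sql", container_env)
except ValueError as exc:
    print("rejected:", exc)
```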

Use the published image (GHCR)

Image is published to GHCR:

  • ghcr.io/kroq86/data-engineering-runtime-lab:latest

Pull:

docker pull ghcr.io/kroq86/data-engineering-runtime-lab:latest

Use in Cursor MCP config (example):

{
  "mcpServers": {
    "mini-data-engine": {
      "command": "docker",
      "args": [
        "run",
        "--rm",
        "-i",
        "-v",
        "${workspaceFolder}/tests/artifacts:/app/tests/artifacts",
        "ghcr.io/kroq86/data-engineering-runtime-lab:latest"
      ]
    }
  }
}

Loom stack

MCP surface for loom-ops and ops runbooks. Ecosystem: ECOSYSTEM.md

pip install ops-runtime-mcp
ops-runtime-mcp   # stdio MCP; see docs/use-in-codex.md
Categories: AI & LLM Tools, Data & Analytics
Registry: active
Package: ghcr.io/kroq86/data-engineering-runtime-lab:0.1.12
Transport: STDIO
Updated: Mar 16, 2026
