Openclaw Health Mcp

STDIOregistry active

Summary

Gives you a single tool call to check whether your AI agent deployment is actually healthy right now. Surfaces gateway status, CPU/RAM/swap usage, recent errors from journalctl, skill registry integrity, upgrade outcomes, cron state, and disk usage with a HEALTHY/DEGRADED/CRITICAL classification per component plus ranked critical findings. Built for the OpenClaw ecosystem but the linux-proc backend works on any host via psutil. Pairs with silentwatch-mcp for deeper cron monitoring. If you're running production agents on a VPS and want infrastructure observability inside the same Claude conversation that's running your agent, this slots in where Datadog feels like overkill.

CodeRabbit

AI writes the code. CodeRabbit catches the slop.

Try For Free →

Keep your Mac awake

Keep your Mac awake while Claude Code and 40+ AI agents run. Sleeps when they're idle.

One time payment $9 →

Context.dev

Integrate web data into your AI product. One API to scrape website & brand data.

Get API Key Now →

Make your agent a DeFi expert

Agent, run crypto. Access onchain data & trade routes via 1inch.

Install now →

Make money from your Skills

On Capafy, your Skill runs online 24/7 as an agent product, and you get paid every time someone uses it.

Start earning →

AppSignal

Monitor with ease. Code with confidence.

Start Free Trial →

CodeRabbit

AI writes the code. CodeRabbit catches the slop.

Try For Free →

Keep your Mac awake

Keep your Mac awake while Claude Code and 40+ AI agents run. Sleeps when they're idle.

One time payment $9 →

Context.dev

Integrate web data into your AI product. One API to scrape website & brand data.

Get API Key Now →

Make your agent a DeFi expert

Agent, run crypto. Access onchain data & trade routes via 1inch.

Install now →

Make money from your Skills

On Capafy, your Skill runs online 24/7 as an agent product, and you get paid every time someone uses it.

Start earning →

AppSignal

Monitor with ease. Code with confidence.

Start Free Trial →

openclaw-health-mcp

MCP server for AI agent deployment health — gateway status, CPU/RAM/swap, recent errors from journalctl/dmesg, skill-registry integrity, upgrade outcomes, cron + disk usage in a single tool call. Each component gets a HEALTHY/DEGRADED/CRITICAL classification, with overall rollup + ranked critical findings. Linux-proc backend works on any Linux/macOS/Windows host; OpenClaw operators get native ~/.openclaw/ parsing as a built-in reference implementation. Keywords: AI agent health, production AI monitoring, deployment readiness, MCP infrastructure observability.

What it does

Anyone running production AI agents needs a single tool that answers "is this deployment healthy right now?" without SSH'ing in to run six separate commands. The HN front-page thread Ask HN: How are you monitoring AI agents in production? (March 2026) made the gap explicit — the most-upvoted comments described:

"observability and governance cannot live inside the agent framework. They have to live in an independent execution layer" — the framework-level monitoring leaks audit trail when teams use multiple frameworks
"agent makes 10,000 correct $0.02 decisions that collectively don't make sense" — per-call rate limits miss systemic patterns
The gap that "actually hurts during post-mortems" — knowing whether a model drifted, context window failed, or tool misbehaved

Existing options (LangSmith, Langfuse, AgentShield, OTEL/LGTM) sit at the framework or proxy layer. openclaw-health-mcp sits one level closer to the agent runtime — read-only, local, MCP-native — surfacing infrastructure-layer health (gateway, CPU/RAM, recent errors, skill-registry, upgrade outcome, cron, disk) to the same Claude conversation that's running the agent. Works on any Linux/macOS/Windows host out of the box via the linux-proc backend; OpenClaw operators get an additional native backend that parses ~/.openclaw/ paths.

> claude: is my OpenClaw deployment healthy?
[MCP tool: health_overview]
overall_health: critical
component_summary:
  gateway: degraded         (bound to 0.0.0.0, 1 crash in 24h)
  resources: degraded       (memory at 78%, swap at 12%)
  skill_registry: critical  (skill 'clawhub-trending-bot-v2' flagged suspicious)
  upgrade: degraded         (last upgrade rolled back)
  cron: degraded            (1 overdue job)
  disk: degraded            (root at 82%, log dir +187 MB/24h)

critical_findings:
  [CRITICAL] Skill 'clawhub-trending-bot-v2' flagged — possible exfiltration. Disable.
  [DEGRADED] Last upgrade 2026.4.23→2026.4.26 rolled back: websocket_stalls, cpu_spike.
  [DEGRADED] Root disk at 82% — set up log rotation before reaching 95%.
  [DEGRADED] 1 cron job(s) overdue. Install silentwatch-mcp for silent-failure detection.

Why `openclaw-health-mcp`

Three things that existing tools (Datadog, Prometheus, raw top/free/df) don't do for OpenClaw specifically:

OpenClaw-aware probes. Detects 0.0.0.0-binding (the default-publicly-exposed misconfig per the 135k exposed-instances stat), parses ClawHub skill-registry diffs, recognizes named upgrade-regression patterns (websocket_stalls, cpu_spike post-2026.4.26), distinguishes intentional restarts from crashes.
MCP-native, no integration layer. Claude Desktop, Cline, Continue, OpenClaw agents — any MCP-aware client queries directly. No Grafana plugin, no API wrapper, no JSON to parse manually.
Composable with the rest of the production-AI MCP stack. Pairs with silentwatch-mcp (cron silent-failure detection — cron_health here is intentionally basic and defers to silentwatch when present). Skill-registry vetting in this server is light heuristics; deep static analysis goes in openclaw-skill-vetter-mcp (planned).

Built for the SMB self-hoster running OpenClaw on a $40 VPS where Datadog is overkill — but the OpenClaw-specific patterns are valuable on enterprise infra too.

Tool surface

The server registers these MCP tools (full spec in SPEC.md):

Tool	Returns
`health_overview`	Full snapshot — every component + overall HealthLevel + ranked critical findings
`gateway_status`	Gateway alive/dead, uptime, restarts, crashes, bind address
`cpu_memory_health`	CPU/memory/swap snapshot + 24h OOM count + load averages
`recent_errors(window_hours, min_severity)`	Recent error/warning entries, filterable by lookback + severity
`skill_registry_check`	Skill counts, recent additions/modifications, light heuristic flags
`last_upgrade_status`	From-version, to-version, outcome, regression markers, available upgrade
`cron_health`	Basic cron summary (defers to silentwatch-mcp when richer detection wanted)
`disk_usage`	Root disk + log directory size + 24h growth + largest log files

Resources:

health://overview — full snapshot (same as health_overview tool)
health://gateway — gateway-only
health://resources — CPU/memory-only

Prompts:

diagnose-degraded-health — diagnostic walk-through, ranked corrective actions
summarize-health-trend — daily operational digest

Quickstart

Install

pip install openclaw-health-mcp

Quick verify (~30 seconds, no config)

After install, run the bundled demo to see all 7 health checks fire against the mock backend:

openclaw-health-mcp-demo

You'll see a one-page health overview with gateway / CPU+memory / errors / skills / upgrade / cron / disk sections — typically a CRITICAL verdict driven by the mock backend's ClawHavoc-pattern skill exfiltration flag + post-rollback degradation. No external I/O, no API keys — safe to run anywhere. Useful first-30-seconds check before wiring up Claude Desktop or pointing at a real ~/.openclaw/ directory.

Configure for Claude Desktop

Add to ~/Library/Application Support/Claude/claude_desktop_config.json (macOS) or %APPDATA%\Claude\claude_desktop_config.json (Windows):

{
  "mcpServers": {
    "openclaw-health": {
      "command": "python",
      "args": ["-m", "openclaw_health_mcp"],
      "env": {
        "OPENCLAW_HEALTH_BACKEND": "mock"
      }
    }
  }
}

Restart Claude Desktop. Test:

Show me a full health snapshot of my OpenClaw deployment.

The mock backend returns deliberately mixed data (gateway DEGRADED, skill registry CRITICAL, etc.) so the response demonstrates the full schema.

Backends

Backend	Status	Description
`mock`	✅ v1.0	Sample data for protocol-wiring verification (default)
`linux-proc`	✅ v1.0	psutil-based system metrics (CPU/memory/swap/load/disk) cross-platform; Linux-specific OOM-event detection via `journalctl`/`dmesg`; recent-error log parsing via journalctl. Returns UNKNOWN for OpenClaw-specific components (gateway, skill_registry, upgrade, cron) — those need the `openclaw` backend
`openclaw`	⏳ v1.1	Parses OpenClaw config + log directory + ClawHub manifest + upgrade journal

Select via OPENCLAW_HEALTH_BACKEND env var. Multi-backend support (federating linux-proc system metrics + openclaw application-specific) is planned for v1.2.

Roadmap

Version	Scope	Status
v0.1	Protocol wiring, mock backend, 8 tools / 3 resources / 2 prompts, 40 tests	✅
v1.0	`linux-proc` backend (psutil + journalctl/dmesg OOM detection + log parsing); GitHub Actions CI matrix; PyPI Trusted Publishing; MCP Registry submission; 59 tests	✅
v1.1	`openclaw` backend — parses OpenClaw config, log dir, ClawHub manifest, upgrade journal	⏳
v1.2	Backend federation (`linux-proc + openclaw`); expanded log sources	⏳
v1.x	`cowork` backend, custom backend SDK, webhook emitter for alerts	⏳

Need this adapted to your stack?

openclaw-health-mcp ships with a mock backend at v0.1 (Linux + OpenClaw backends in v0.2). If your AI agent runtime is different — Claude Code, Cowork, custom Python services, agent harnesses on AWS / GCP — and you want the same single-pane health visibility for it, that's a Custom MCP Build engagement.

Tier	Scope	Investment	Timeline
Simple	Single backend adapter for an existing runtime with documented logging/metrics	$8,000–$10,000	1–2 weeks
Standard	Custom backend + custom severity rules + integration with your existing alerting	$15,000–$20,000	2–4 weeks
Complex	Multi-backend federation + RBAC + audit-log integration + on-call workflow	$25,000–$35,000	4–8 weeks

To engage:

Email hello@temhan.dev with subject Custom MCP Build inquiry
Include: a 1-paragraph description of your stack + which tier you're considering
Reply within 2 business days with a 30-min discovery call slot

This server is part of a production-AI infrastructure MCP suite — companion to silentwatch-mcp (cron silent-failure detection) and the upcoming AI Production Discipline Framework Notion template (the methodology these tools operationalize).

Production AI audits

If you're running production AI and want an outside practitioner to score readiness, find the failure patterns already present, and write the corrective-action plan — that's what this MCP is built into supporting:

Tier	Scope	Investment	Timeline
Audit Lite	One system, top-5 findings, written report	$1,500	1 week
Audit Standard	Full audit, all 14 patterns, 5 Cs findings, 90-day follow-up	$3,000	2–3 weeks
Audit + Workshop	Standard audit + 2-day team workshop + first monthly audit included	$7,500	3–4 weeks

Same email channel: hello@temhan.dev with subject AI audit inquiry.

Contributing

PRs welcome. Backends are intentionally pluggable — see src/openclaw_health_mcp/backends/ for the contract.

To add a new backend:

Subclass HealthBackend in backends/<your_backend>.py
Implement the 7 abstract probe methods (one per component)
Register in backends/__init__.py
Add tests in tests/test_backend_<your_backend>.py

Bug reports + feature requests: open a GitHub issue.

License

MIT — see LICENSE.

Production-AI MCP Suite (Gumroad bundle) — this server plus 6 others in one curated 7-pack bundle with a decision tree, day-one drill, and Custom MCP Build CTA. $29.
silentwatch-mcp — cron silent-failure detection. Install alongside this server for richer cron_health data.
openclaw-cost-tracker-mcp — token-cost telemetry + 429 prediction (v1.1+)
openclaw-skill-vetter-mcp — ClawHub skill security vetting
openclaw-upgrade-orchestrator-mcp — read-only upgrade advisor + provider-side regression detection (v1.2+)
openclaw-output-vetter-mcp — agent claim verification (inline grounding-check + swallowed-exception scanner + multi-turn transcript review)
AI Production Discipline Framework — Notion template, $19 — the methodology these MCP tools implement.
SPEC.md — full server design.
Model Context Protocol — protocol overview.

Built by Temur Khan — production AI engineer. Contact: hello@temhan.dev

Featured

CodeRabbit

AI writes the code. CodeRabbit catches the slop.

Try For Free →

Keep your Mac awake

Keep your Mac awake while Claude Code and 40+ AI agents run. Sleeps when they're idle.

One time payment $9 →

Context.dev

Integrate web data into your AI product. One API to scrape website & brand data.

Get API Key Now →

Make your agent a DeFi expert

Agent, run crypto. Access onchain data & trade routes via 1inch.

Install now →

Make money from your Skills

On Capafy, your Skill runs online 24/7 as an agent product, and you get paid every time someone uses it.

Start earning →

AppSignal

Monitor with ease. Code with confidence.

Start Free Trial →

Registryactive

Packageopenclaw-health-mcp

TransportSTDIO

UpdatedMay 6, 2026

View on GitHub

openclaw-health-mcp

MCP server for AI agent deployment health — gateway status, CPU/RAM/swap, recent errors from journalctl/dmesg, skill-registry integrity, upgrade outcomes, cron + disk usage in a single tool call. Each component gets a HEALTHY/DEGRADED/CRITICAL classification, with overall rollup + ranked critical findings. Linux-proc backend works on any Linux/macOS/Windows host; OpenClaw operators get native ~/.openclaw/ parsing as a built-in reference implementation. Keywords: AI agent health, production AI monitoring, deployment readiness, MCP infrastructure observability.

What it does

"observability and governance cannot live inside the agent framework. They have to live in an independent execution layer" — the framework-level monitoring leaks audit trail when teams use multiple frameworks
"agent makes 10,000 correct $0.02 decisions that collectively don't make sense" — per-call rate limits miss systemic patterns
The gap that "actually hurts during post-mortems" — knowing whether a model drifted, context window failed, or tool misbehaved

> claude: is my OpenClaw deployment healthy?
[MCP tool: health_overview]
overall_health: critical
component_summary:
  gateway: degraded         (bound to 0.0.0.0, 1 crash in 24h)
  resources: degraded       (memory at 78%, swap at 12%)
  skill_registry: critical  (skill 'clawhub-trending-bot-v2' flagged suspicious)
  upgrade: degraded         (last upgrade rolled back)
  cron: degraded            (1 overdue job)
  disk: degraded            (root at 82%, log dir +187 MB/24h)

critical_findings:
  [CRITICAL] Skill 'clawhub-trending-bot-v2' flagged — possible exfiltration. Disable.
  [DEGRADED] Last upgrade 2026.4.23→2026.4.26 rolled back: websocket_stalls, cpu_spike.
  [DEGRADED] Root disk at 82% — set up log rotation before reaching 95%.
  [DEGRADED] 1 cron job(s) overdue. Install silentwatch-mcp for silent-failure detection.

Why `openclaw-health-mcp`

Three things that existing tools (Datadog, Prometheus, raw top/free/df) don't do for OpenClaw specifically:

OpenClaw-aware probes. Detects 0.0.0.0-binding (the default-publicly-exposed misconfig per the 135k exposed-instances stat), parses ClawHub skill-registry diffs, recognizes named upgrade-regression patterns (websocket_stalls, cpu_spike post-2026.4.26), distinguishes intentional restarts from crashes.
MCP-native, no integration layer. Claude Desktop, Cline, Continue, OpenClaw agents — any MCP-aware client queries directly. No Grafana plugin, no API wrapper, no JSON to parse manually.
Composable with the rest of the production-AI MCP stack. Pairs with silentwatch-mcp (cron silent-failure detection — cron_health here is intentionally basic and defers to silentwatch when present). Skill-registry vetting in this server is light heuristics; deep static analysis goes in openclaw-skill-vetter-mcp (planned).

Built for the SMB self-hoster running OpenClaw on a $40 VPS where Datadog is overkill — but the OpenClaw-specific patterns are valuable on enterprise infra too.

Tool surface

The server registers these MCP tools (full spec in SPEC.md):

Tool	Returns
`health_overview`	Full snapshot — every component + overall HealthLevel + ranked critical findings
`gateway_status`	Gateway alive/dead, uptime, restarts, crashes, bind address
`cpu_memory_health`	CPU/memory/swap snapshot + 24h OOM count + load averages
`recent_errors(window_hours, min_severity)`	Recent error/warning entries, filterable by lookback + severity
`skill_registry_check`	Skill counts, recent additions/modifications, light heuristic flags
`last_upgrade_status`	From-version, to-version, outcome, regression markers, available upgrade
`cron_health`	Basic cron summary (defers to silentwatch-mcp when richer detection wanted)
`disk_usage`	Root disk + log directory size + 24h growth + largest log files

Resources:

health://overview — full snapshot (same as health_overview tool)
health://gateway — gateway-only
health://resources — CPU/memory-only

Prompts:

diagnose-degraded-health — diagnostic walk-through, ranked corrective actions
summarize-health-trend — daily operational digest

Quickstart

Install

pip install openclaw-health-mcp

Quick verify (~30 seconds, no config)

After install, run the bundled demo to see all 7 health checks fire against the mock backend:

openclaw-health-mcp-demo

Configure for Claude Desktop

Add to ~/Library/Application Support/Claude/claude_desktop_config.json (macOS) or %APPDATA%\Claude\claude_desktop_config.json (Windows):

{
  "mcpServers": {
    "openclaw-health": {
      "command": "python",
      "args": ["-m", "openclaw_health_mcp"],
      "env": {
        "OPENCLAW_HEALTH_BACKEND": "mock"
      }
    }
  }
}

Restart Claude Desktop. Test:

Show me a full health snapshot of my OpenClaw deployment.

The mock backend returns deliberately mixed data (gateway DEGRADED, skill registry CRITICAL, etc.) so the response demonstrates the full schema.

Backends

Backend	Status	Description
`mock`	✅ v1.0	Sample data for protocol-wiring verification (default)
`linux-proc`	✅ v1.0	psutil-based system metrics (CPU/memory/swap/load/disk) cross-platform; Linux-specific OOM-event detection via `journalctl`/`dmesg`; recent-error log parsing via journalctl. Returns UNKNOWN for OpenClaw-specific components (gateway, skill_registry, upgrade, cron) — those need the `openclaw` backend
`openclaw`	⏳ v1.1	Parses OpenClaw config + log directory + ClawHub manifest + upgrade journal

Select via OPENCLAW_HEALTH_BACKEND env var. Multi-backend support (federating linux-proc system metrics + openclaw application-specific) is planned for v1.2.

Roadmap

Version	Scope	Status
v0.1	Protocol wiring, mock backend, 8 tools / 3 resources / 2 prompts, 40 tests	✅
v1.0	`linux-proc` backend (psutil + journalctl/dmesg OOM detection + log parsing); GitHub Actions CI matrix; PyPI Trusted Publishing; MCP Registry submission; 59 tests	✅
v1.1	`openclaw` backend — parses OpenClaw config, log dir, ClawHub manifest, upgrade journal	⏳
v1.2	Backend federation (`linux-proc + openclaw`); expanded log sources	⏳
v1.x	`cowork` backend, custom backend SDK, webhook emitter for alerts	⏳

Need this adapted to your stack?

Tier	Scope	Investment	Timeline
Simple	Single backend adapter for an existing runtime with documented logging/metrics	$8,000–$10,000	1–2 weeks
Standard	Custom backend + custom severity rules + integration with your existing alerting	$15,000–$20,000	2–4 weeks
Complex	Multi-backend federation + RBAC + audit-log integration + on-call workflow	$25,000–$35,000	4–8 weeks

To engage:

Email hello@temhan.dev with subject Custom MCP Build inquiry
Include: a 1-paragraph description of your stack + which tier you're considering
Reply within 2 business days with a 30-min discovery call slot

Production AI audits

Tier	Scope	Investment	Timeline
Audit Lite	One system, top-5 findings, written report	$1,500	1 week
Audit Standard	Full audit, all 14 patterns, 5 Cs findings, 90-day follow-up	$3,000	2–3 weeks
Audit + Workshop	Standard audit + 2-day team workshop + first monthly audit included	$7,500	3–4 weeks

Same email channel: hello@temhan.dev with subject AI audit inquiry.

Contributing

PRs welcome. Backends are intentionally pluggable — see src/openclaw_health_mcp/backends/ for the contract.

To add a new backend:

Subclass HealthBackend in backends/<your_backend>.py
Implement the 7 abstract probe methods (one per component)
Register in backends/__init__.py
Add tests in tests/test_backend_<your_backend>.py

Bug reports + feature requests: open a GitHub issue.

License

MIT — see LICENSE.

Production-AI MCP Suite (Gumroad bundle) — this server plus 6 others in one curated 7-pack bundle with a decision tree, day-one drill, and Custom MCP Build CTA. $29.
silentwatch-mcp — cron silent-failure detection. Install alongside this server for richer cron_health data.
openclaw-cost-tracker-mcp — token-cost telemetry + 429 prediction (v1.1+)
openclaw-skill-vetter-mcp — ClawHub skill security vetting
openclaw-upgrade-orchestrator-mcp — read-only upgrade advisor + provider-side regression detection (v1.2+)
openclaw-output-vetter-mcp — agent claim verification (inline grounding-check + swallowed-exception scanner + multi-turn transcript review)
AI Production Discipline Framework — Notion template, $19 — the methodology these MCP tools implement.
SPEC.md — full server design.
Model Context Protocol — protocol overview.

Built by Temur Khan — production AI engineer. Contact: hello@temhan.dev

Openclaw Health Mcp

openclaw-health-mcp

What it does

Why `openclaw-health-mcp`

Tool surface

Quickstart

Install

Quick verify (~30 seconds, no config)

Configure for Claude Desktop

Backends

Roadmap

Need this adapted to your stack?

Production AI audits

Contributing

License

Related

Openclaw Health Mcp

openclaw-health-mcp

What it does

Why `openclaw-health-mcp`

Tool surface

Quickstart

Install

Quick verify (~30 seconds, no config)

Configure for Claude Desktop

Backends

Roadmap

Need this adapted to your stack?

Production AI audits

Contributing

License

Related

Openclaw Health Mcp

openclaw-health-mcp

What it does

Why openclaw-health-mcp

Tool surface

Quickstart

Install

Quick verify (~30 seconds, no config)

Configure for Claude Desktop

Backends

Roadmap

Need this adapted to your stack?

Production AI audits

Contributing

License

Related

Openclaw Health Mcp

openclaw-health-mcp

What it does

Why openclaw-health-mcp

Tool surface

Quickstart

Install

Quick verify (~30 seconds, no config)

Configure for Claude Desktop

Backends

Roadmap

Need this adapted to your stack?

Production AI audits

Contributing

License

Related

Why `openclaw-health-mcp`

Why `openclaw-health-mcp`