Built for teams running local or airgapped models who need browser automation, desktop control, and code workflows without sending credentials to cloud APIs. Ships 52 capability packs as typed JSON tools that 7B–30B models can call reliably: browser sessions, git operations, vision tasks, and multi-step pipelines all wrapped behind single MCP calls. The pitch is cost and control: a six-step code edit loop runs for $0.07 on gpt-oss-120b versus $0.30+ on Sonnet, and everything stays in your VPC. Includes routing memory, intent decomposition, and a management UI. If you're optimizing for small model success rates or can't let an LLM see your SaaS tokens, this is the stack.
Today's helmdeck install ran a full 6-step code-edit loop (clone, read, patch, test, commit, push) on
gpt-oss-120b for $0.07. The same loop on Cursor or Claude Code direct via Sonnet would have cost $0.30+. Same outcome, ~5× cheaper, and the "expensive" stack isn't even the most expensive option.
| Workflow | Frontier-model approach | Helmdeck (gpt-oss-120b) |
|---|---|---|
| Browser scrape + GitHub comment | $0.25 (Anthropic Computer Use) | $0.005 |
| Code edit loop (6 steps) | $0.35 (Cursor / Aider) | $0.07 |
| Multi-step browser test | $0.20 (Browser-use NL) | $0.03 |
| PDF → structured Markdown | $1.00 (naive Sonnet vision) | $0.003 |
Most browser agents require GPT-4o or Claude Sonnet to work reliably. Helmdeck is built for the other 99% of deployments — local 7B models, air-gapped environments, and teams that can't send credentials to a cloud API. It wraps every browser, desktop, git, and code action into a single typed JSON call that even a small model can fill in correctly. The numbers above are the consequence: when packs absorb the work the LLM would otherwise burn tokens rediscovering, cheap or local models do agentic work that frontier-model APIs charge 10× more for.
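To make the ratios behind the table explicit, here is a small sketch that recomputes them. The dollar figures are copied from the table; the function and variable names are only illustrative.

```python
# Per-workflow costs in USD as (frontier approach, Helmdeck on gpt-oss-120b),
# copied from the comparison table above.
COSTS = {
    "Browser scrape + GitHub comment": (0.25, 0.005),
    "Code edit loop (6 steps)": (0.35, 0.07),
    "Multi-step browser test": (0.20, 0.03),
    "PDF → structured Markdown": (1.00, 0.003),
}


def savings_multiple(frontier_usd: float, helmdeck_usd: float) -> float:
    """Return how many times cheaper the Helmdeck run is than the frontier run."""
    return frontier_usd / helmdeck_usd


multiples = {name: savings_multiple(f, h) for name, (f, h) in COSTS.items()}

for name, multiple in multiples.items():
    print(f"{name}: ~{multiple:.0f}x cheaper")
```

The spread is wide: the code edit loop is the smallest gap, while the document-parsing row is the largest because the pack replaces per-page vision calls entirely.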
Helmdeck is a self-hosted, containerized platform for AI agents, exposed as Capability Packs (schema-validated, one-shot JSON tools) and native MCP. The defining metric is ≥90% pack success on 7B–30B-class open-weight models, something no frontier-targeting competitor is optimizing for.
📊 Full per-task comparison with reproduction recipe at https://helmdeck.dev/explanation/why-helmdeck. These are one maintainer's findings; we welcome community reproductions.
Smart models thrive on bash and a README. Weak models stall on open-ended interfaces. Helmdeck closes that gap by hiding browser sessions, desktop actions, credentials, and multi-step workflows behind single typed REST / MCP calls.
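As a rough illustration of why a single typed call is easier for a small model, the sketch below validates a model-produced argument string against a tiny contract and returns a short, correctable error. The field names (`url`, `full_page`) and the schema layout are hypothetical; the real contracts are the ones documented in docs/PACKS.md.

```python
import json

# Hypothetical input contract for a screenshot pack; the real contracts
# live in docs/PACKS.md and may use different field names.
SCREENSHOT_SCHEMA = {
    "required": {"url": str},
    "optional": {"full_page": bool},
}


def validate_call(raw: str, schema: dict) -> dict:
    """Parse a model-produced JSON string and check it against the schema.

    Raises ValueError with a short, model-readable message on any mismatch.
    """
    try:
        args = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"not valid JSON: {exc.msg}") from exc
    if not isinstance(args, dict):
        raise ValueError("arguments must be a JSON object")

    allowed = {**schema["required"], **schema["optional"]}
    for name in schema["required"]:
        if name not in args:
            raise ValueError(f"missing required field: {name}")
    for name, value in args.items():
        if name not in allowed:
            raise ValueError(f"unknown field: {name}")
        if not isinstance(value, allowed[name]):
            raise ValueError(f"field {name} must be {allowed[name].__name__}")
    return args
```

The model only has to fill in one small object; everything else (session setup, waiting, cleanup) stays on the platform side of the call.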
Three audiences specifically:
v0.22.0 shipped — agents that work on free models, with memory. The release closes ADRs 047–050:
- `helmdeck.route` meta-pack recommends the best pack/pipeline for an intent (with structured gap warnings when nothing fits), backed by per-caller learned defaults surfaced through the `helmdeck://routing-guide` and `helmdeck://my-defaults` MCP resources and a Routing Memory management UI.
- `helmdeck.memory_store` persists durable user facts (read back via `helmdeck://my-memory`), an optional embedding sidecar powers OpenClaw's `memory_search`, and a QMD corpus bridge exposes helmdeck memory to OpenClaw.
- `helmdeck.plan` turns a multi-action prompt into an ordered, pipeline-aware step plan plus a `rewritten_prompt`.
- `internal/llmcontext` compacts catalog-heavy prompts to fit small-model context budgets (tiered per-model budgets, cascading select + lexical rank, optional two-pass filter), surfaced through `helmdeck://context-budgets` and `helmdeck://my-plans`.

57 capability packs ship in the control-plane binary (47 without an AI gateway configured), alongside 21 built-in pipelines, a community pack marketplace (`helmdeck pack install <name>`), and operator-supplied `cmd.*` subprocess packs. Earlier headline features remain: end-to-end content chaining (`image.generate` auto-feeds podcast/slides/blog covers), the `helmdeck://image-models` MCP resource, image-mode install (`./scripts/install.sh --image-mode`), and the Pack Test Runner UI.

Helmdeck is published to the official MCP Registry as `io.github.tosin2013/helmdeck` for one-line install in registry-aware clients. Phases 1–6.5 are complete; the current milestone is v1.0 (Kubernetes & GA, Phase 7), with backlog materialised as GitHub issues tagged `good first issue` and `help wanted`.
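The "lexical rank" step mentioned for `internal/llmcontext` can be pictured with a short sketch: score each catalog entry by word overlap with the intent, then keep entries until a budget is used up. This is not the project's implementation. The scoring rule, the word-count budget, and the sample descriptions are all assumptions made for illustration.

```python
import re


def tokens(text: str) -> set[str]:
    """Lowercase word tokens; a crude stand-in for a real tokenizer."""
    return set(re.findall(r"[a-z0-9_]+", text.lower()))


def rank_and_trim(intent: str, catalog: dict[str, str], budget: int) -> list[str]:
    """Rank packs by word overlap with the intent, keep those fitting the budget.

    `budget` is counted in whitespace-separated words of the descriptions,
    an assumption standing in for per-model token budgets.
    """
    wanted = tokens(intent)
    scored = sorted(
        catalog.items(),
        key=lambda item: (-len(wanted & tokens(item[0] + " " + item[1])), item[0]),
    )
    kept, used = [], 0
    for name, description in scored:
        cost = len(description.split())
        if used + cost > budget:
            continue  # Skip entries that would overflow; smaller ones may still fit.
        kept.append(name)
        used += cost
    return kept


# Illustrative catalog; descriptions are paraphrased, not the shipped ones.
CATALOG = {
    "web.scrape": "markdown scrape of a web page",
    "doc.parse": "layout-aware parse of PDF tables",
    "git.commit": "stage and commit changes in a clone",
}

print(rank_and_trim("scrape this web page to markdown", CATALOG, budget=8))
```

A tighter budget drops low-scoring packs first, which is the property a small-context model needs from the catalog it is shown.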
- `docs/adrs/`: every architectural decision with PRD back-references
- `docs/TASKS.md`: ~85 tasks across 8 phases with critical path
- `docs/MILESTONES.md`: drop-in issue checklists with current ship state
- `docs/PACKS.md`: every shipped pack's input/output contract

git clone https://github.com/tosin2013/helmdeck
cd helmdeck
./scripts/install.sh
That's it. The script runs preflight checks (docker, node ≥20, go ≥1.26, make, openssl, curl) with platform-aware install hints, generates fresh secrets into deploy/compose/.env.local (chmod 600), builds the Management UI bundle, the Go binaries, and the browser sidecar image, brings the Compose stack up, and prints the URL plus a freshly generated admin password.
✓ helmdeck is up
URL: http://localhost:3000
Username: admin
Password: <generated; printed once — save it now>
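For readers who want to see what "generates fresh secrets into an env file (chmod 600)" amounts to, here is a minimal sketch of that one step. It is not the install script itself. The variable names are the ones used in the manual-run commands in this README; the password format and the exact file contents written by the real script are assumptions.

```python
import os
import secrets
import tempfile
from pathlib import Path


def write_env_file(path: Path) -> dict:
    """Generate fresh secrets and write them to `path` with owner-only access."""
    values = {
        # token_hex(32) yields 64 hex characters, like `openssl rand -hex 32`.
        "HELMDECK_JWT_SECRET": secrets.token_hex(32),
        "HELMDECK_VAULT_KEY": secrets.token_hex(32),
        # Password format is an assumption; the real script may differ.
        "HELMDECK_ADMIN_PASSWORD": secrets.token_urlsafe(18),
    }
    body = "".join(f"{key}={value}\n" for key, value in values.items())
    # Create the file with mode 0600 up front so it is never world-readable.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as handle:
        handle.write(body)
    os.chmod(path, 0o600)  # Also tighten a file that already existed.
    return values


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = Path(tmp) / ".env.local"
        generated = write_env_file(target)
        print(sorted(generated), oct(target.stat().st_mode & 0o777))
```

Creating the file with restrictive permissions before writing avoids a window in which the secrets are readable by other local users.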
Useful flags:
- `./scripts/install.sh --reset`: tear down, regenerate secrets, reinstall (new admin password)
- `./scripts/install.sh --no-build`: skip build steps, just bring the stack up
- `./scripts/install.sh --help`: full flag reference

Or via make: `make install`.
A running stack is just the platform — the value is packs called by an agent. Wire one of the supported MCP clients to your fresh install:
| Client | Status | Setup guide |
|---|---|---|
| OpenClaw | ✅ validated end-to-end | docs/integrations/openclaw.md |
| Claude Code | 🟡 documented | docs/integrations/claude-code.md |
| Claude Desktop | 🟡 documented | docs/integrations/claude-desktop.md |
| Gemini CLI | 🟡 documented | docs/integrations/gemini-cli.md |
| Hermes Agent | 🟡 documented | docs/integrations/hermes-agent.md |
Once a client is connected, work through the
pack-demo-playbook.md — 20+
copy-pasteable prompts that exercise every pack. The
per-pack reference covers each
pack's contract, error codes, and chained workflows.
If you'd rather drive each step yourself instead of running the install script:
# 1. Build the Management UI bundle (needs Node 20+)
make web-deps && make web-build
# 2. Build the control-plane binary with the UI embedded
make build
# 3. Run the control plane with admin credentials
HELMDECK_JWT_SECRET=$(openssl rand -hex 32) \
HELMDECK_VAULT_KEY=$(openssl rand -hex 32) \
HELMDECK_ADMIN_PASSWORD=changeme \
./bin/control-plane
Or use the Compose stack directly (control plane + Garage object store + bundled init):
cp deploy/compose/.env.example deploy/compose/.env.local
# …edit deploy/compose/.env.local and fill in real secrets…
docker compose -f deploy/compose/compose.yaml --env-file deploy/compose/.env.local up -d
The login endpoint accepts a static admin password set via the
HELMDECK_ADMIN_PASSWORD env var on the control plane process.
Suitable for the dev / single-node Compose tier; OIDC SSO for
production deployments lands in a later phase.
| Setting | Default | Override |
|---|---|---|
| Username | admin | HELMDECK_ADMIN_USERNAME env var |
| Password | (none — UI login disabled) | HELMDECK_ADMIN_PASSWORD env var (required) |
| Session length | 12 hours | Hardcoded in internal/api/auth_login.go |
To change the password: stop the control plane, set
HELMDECK_ADMIN_PASSWORD to the new value, and restart. There is
no in-UI "change password" flow today — the password is managed
out-of-band by whichever orchestrator runs the control plane
(Compose, systemd, Kubernetes Secret, etc.).
If HELMDECK_ADMIN_PASSWORD is unset, the login endpoint
returns 503 login_disabled. The control plane still runs and the
API still works — operators can mint a JWT directly via the CLI:
./bin/control-plane -mint-token=alice -mint-token-scopes=admin -mint-token-ttl=12h
The minted token can be pasted into any tool that speaks
Authorization: Bearer <token>.
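As a sketch of how a script might use such a token, the snippet below builds (without sending) a login request and a bearer-authenticated request using only the standard library. The login path is the one used in the `curl` example in this README; the `/api/v1/packs` path in the usage line is a hypothetical placeholder, and the shape of the login response is not assumed here.

```python
import json
import os
import urllib.request

# HELMDECK_URL is the base URL of the control plane; the default matches
# the local Compose install described above.
BASE_URL = os.environ.get("HELMDECK_URL", "http://localhost:3000")


def login_request(username: str, password: str) -> urllib.request.Request:
    """Build (but do not send) the login POST."""
    body = json.dumps({"username": username, "password": password}).encode()
    return urllib.request.Request(
        f"{BASE_URL}/api/v1/auth/login",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def authed_request(path: str, token: str) -> urllib.request.Request:
    """Build a GET carrying the token as an Authorization: Bearer header."""
    return urllib.request.Request(
        f"{BASE_URL}{path}",
        headers={"Authorization": f"Bearer {token}"},
    )


# "/api/v1/packs" is a hypothetical path used only to show the header.
request = authed_request("/api/v1/packs", os.environ.get("HELMDECK_TOKEN", "example-token"))
print(request.get_header("Authorization").split(" ")[0])
```

Passing either request object to `urllib.request.urlopen` would send it; that step is left out so the sketch runs without a live control plane.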
Production note: the static-password path uses constant-time comparison so it's safe against timing attacks, but it's still a shared secret that has to be rotated by hand. For production deployments with multiple operators, OIDC SSO via your existing identity provider is the right answer — see the Phase 6 follow-up roadmap.
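The constant-time comparison mentioned in the production note is a general technique. The sketch below shows it with Python's standard library; it illustrates the idea only and is not the control plane's own code.

```python
import hmac


def password_matches(supplied: str, expected: str) -> bool:
    """Compare two secrets without leaking where they first differ.

    A plain `==` can return as soon as one character differs, so response
    time hints at how much of the guess was right. compare_digest is
    designed to take time independent of where the inputs differ.
    """
    return hmac.compare_digest(supplied.encode("utf-8"), expected.encode("utf-8"))
```

This closes the timing side channel but does nothing about rotation or sharing, which is why the note still points multi-operator deployments to OIDC.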
- helmdeck-mcp bridge binary (ADRs 025, 030)

57 packs ship in the box (47 without an AI gateway configured). Each one hides a multi-step workflow
behind a single typed JSON-Schema call so weak open-weight models
can drive it as reliably as frontier models. The full input/output
contract for every pack lives in docs/PACKS.md.
The highlights:
| Pack | What it hides |
|---|---|
| Browser & web | |
| `browser.screenshot_url` | Session lifecycle, navigation, render wait, cleanup |
| `browser.interact` | Deterministic multi-step CDP (navigate, click, type, scroll, screenshot, assert_text); no LLM needed |
| `web.scrape` / `web.scrape_spa` | Firecrawl-backed markdown scrape OR schema-driven SPA extraction |
| `web.test` | Natural-language browser tests via Playwright MCP + LLM loop |
| `research.deep` | Multi-source Firecrawl search + per-source scrape + LLM synthesis with inline citations |
| `content.ground` | Parses a markdown file for claims, finds authoritative sources, inserts real `[link](url)` citations in place |
| Document & vision | |
| `slides.render` | Marp + Chromium + format flags |
| `slides.narrate` | Narrated MP4 video (ElevenLabs TTS per slide) + YouTube engagement metadata + sidecar SRT captions + structured validation field (default-on post-step) |
| `av.validate` | Structured AV-artifact validation (faststart, codec pin, packet contiguity, RMS sweep, LUFS, duration parity, SRT format); default-on as a post-step on `slides.narrate`/`podcast.generate`, standalone for ad-hoc checks |
| `doc.parse` | Docling layout-aware parse: PDF tables, multi-format, OCR fallback |
| `doc.ocr` | Tesseract fallback for simple images |
| `desktop.run_app_and_screenshot` | Xvfb + xdotool + scrot + window focus |
| `vision.click_anywhere` | Native computer-use routing (Anthropic/OpenAI/Gemini schemas) with JSON-prompt fallback for Ollama/Deepseek |
| `vision.extract_visible_text` / `vision.fill_form_by_label` | Screenshot → vision model → action loop |
| Code edit loop | |
| `repo.fetch` / `repo.push` | SSH key selection from vault, known_hosts, key shred-on-exit; envelope returns tree/readme/entrypoints/signals so agents orient on the first turn |
| `repo.map` | Aider-style structural symbol map under a token budget |
| `fs.read` / `fs.write` / `fs.patch` / `fs.list` / `fs.delete` | Path-safe file ops inside a clone |
| `cmd.run` | Run an arbitrary command in a clone path |
| `git.commit` / `git.diff` / `git.log` | Stage + commit + review changes attributed to helmdeck-agent |
| GitHub | |
| `github.create_issue` / `github.list_issues` / `github.list_prs` / `github.post_comment` / `github.create_release` / `github.search` | Vault-stored PAT, never visible to the agent |
| Language sidecars | |
| `python.run` | CPython 3 + pytest + ruff + mypy in a Python sidecar image |
| `node.run` | Node 20 LTS + npm + pnpm + yarn + tsc in a Node sidecar image |
| HTTP & credentials | |
| `http.fetch` | Placeholder-token egress: `${vault:NAME}` substitution in URL/headers/body |
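
"Path-safe file ops inside a clone" (the fs.* row above) usually means refusing any path that resolves outside the clone directory. The sketch below shows one common way to check that; it is an illustration under that assumption, not the project's implementation, and `resolve_inside` is a hypothetical helper name.

```python
from pathlib import Path


def resolve_inside(clone_root: Path, relative: str) -> Path:
    """Resolve `relative` under `clone_root`, rejecting anything that escapes it.

    Resolving first collapses `..` segments and follows symlinks, so the
    containment check runs against the real target path.
    """
    root = clone_root.resolve()
    candidate = (root / relative).resolve()
    if not candidate.is_relative_to(root):  # Requires Python 3.9+.
        raise ValueError(f"path escapes clone: {relative}")
    return candidate
```

Both `../outside.txt` and an absolute path such as `/etc/passwd` are rejected, because joining an absolute path onto the root discards the root entirely.
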
See ADRs 014–036 for per-pack contracts and
docs/SIDECAR-LANGUAGES.md for the
runbook on adding new language sidecars (Rust, Go, Ruby, etc.).
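The `${vault:NAME}` substitution listed for http.fetch can be sketched as a server-side template pass: the agent writes the placeholder, and the platform swaps in the secret just before the request leaves. The regular expression, the allowed name characters, and the in-memory `vault` dict below are assumptions for illustration only.

```python
import re

# Assumed placeholder grammar: ${vault:NAME} with NAME made of letters,
# digits, and underscores. The real grammar may be stricter or looser.
PLACEHOLDER = re.compile(r"\$\{vault:([A-Za-z0-9_]+)\}")


def substitute(template: str, vault: dict[str, str]) -> str:
    """Replace every ${vault:NAME} with the named secret; fail on unknown names."""

    def lookup(match: re.Match) -> str:
        name = match.group(1)
        if name not in vault:
            raise KeyError(f"no vault entry named {name}")
        return vault[name]

    return PLACEHOLDER.sub(lookup, template)
```

Because the model only ever sees and emits the placeholder text, the secret value never enters its context or its output.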
The contribution guide in CONTRIBUTING.md
walks through writing your own pack — the most useful contributions
right now are SaaS API wrappers (Slack, Linear, Stripe, Notion, etc.).
Licensed under the Apache License, Version 2.0. See
NOTICE for attribution to bundled and depended-upon
projects, and CONTRIBUTING.md for the
contribution guide and the SPDX header convention.
By submitting a pull request you agree to license your contribution under the same terms (Apache 2.0 Section 5 covers the contribution grant — there's no separate CLA).
- `HELMDECK_URL` (required): Base URL of your helmdeck control plane (e.g. http://localhost:3000 for a local Compose install, or your reverse-proxied https URL)
- `HELMDECK_TOKEN` (required, secret): JWT bearer token minted from your helmdeck control plane. Generate with: `curl -X POST $HELMDECK_URL/api/v1/auth/login -d '{"username":"admin","password":"..."}'`