Wraps the BugProof CLI for capturing, replaying, and analyzing executable bug artifacts inside Claude and other MCP hosts. Exposes 10 tools including capture (wrap failing commands), replay (run .bug files and verify failures), inspect (read artifact metadata), and diff (compare two captures). Also surfaces Resources for reading raw .bug contents and Prompts for workflows like "replay and analyze root cause." Useful when you're debugging flaky tests or environment-specific failures and want the agent to capture the exact failure state, replay it deterministically, or compare artifacts from CI versus local runs. Communicates over stdio and shells out to the local bugproof CLI with JSON output.
Executable bugs, not bug reports.
Capture a failing command into a portable .bug artifact that anyone can replay on their machine — same code, same env, same failure. Cryptographically signable. Cross-platform. Zero containers required.
https://github.com/user-attachments/assets/2315cfee-3ccf-40d7-830e-3a3d23731ab8
"Works on my machine" is not a bug report.
Filing a backend or CLI bug today usually looks like this:
Then the maintainer spends hours reconstructing the failure: matching versions, replicating the env, finding the right command, guessing at config. Most of that time is wasted.
BugProof captures the bug — not the description of it. One command produces a single .bug file containing the source snapshot, the exact command, the environment schema, the failure fingerprint, and replay metadata. Another developer runs bugproof replay bug.bug and reproduces the failure deterministically.
Think of it as Git for bugs: a portable, content-addressable, verifiable artifact that turns "can you reproduce?" into a one-liner.
- bugproof capture -- <command> and ship the result.
- bugproof keygen / --sign / verify.
- --self-heal auto-installs missing npm/pip deps in the sandbox and retries.
npm install -g bugproof
Requirements: Node.js 18+ and Git. Optional language toolchains (Python, Java, Go, Rust, …) are only needed if your captured command uses them.
Run a one-off health check after install:
bugproof doctor
Add a single step to any GitHub Actions workflow to auto-capture flaky/failing commands as .bug artifacts.
- name: Capture flaky test
  uses: sidinsearch/BugProof/.github/actions/bugproof-action@main
  with:
    command: 'npm test -- --run flaky-suite'
    name: flaky-test-failure
    timeout: 300000
How it works: The action installs bugproof from npmjs.org (npm install -g bugproof) → wraps your command with bugproof capture → on failure, the .bug artifact is uploaded to the Actions run. Developers download and repro locally with bugproof replay.
Use cases:
bugproof capture -- node app.js bundles the crash state, env, and source
All inputs:
| Input | Required | Default | Description |
|---|---|---|---|
| command | ✅ | — | Command to capture (e.g. npm test) |
| name | — | bug_<timestamp> | Artifact name |
| timeout | — | 300000 | Command timeout in ms |
| skip-secrets | — | false | Skip env secret scanning |
| upload-artifact | — | true | Upload .bug file as Actions artifact |
| node-version | — | 24 | Node.js version for bugproof |
The action lives at .github/actions/bugproof-action/action.yml in this repo. Reference it via uses: sidinsearch/BugProof/.github/actions/bugproof-action@main. BugProof is always installed from npmjs.org — no GitHub Packages token needed.
BugProof ships a built-in MCP (Model Context Protocol) server that exposes 10 tools plus Resources and Prompts for AI agents. Listed on the Official MCP Registry as io.github.sidinsearch/bugproof.
Add to ~/.claude/settings.json:
{
  "mcpServers": {
    "bugproof": {
      "command": "npx",
      "args": ["-y", "bugproof", "mcp"]
    }
  }
}
Add to Cursor MCP config (Settings → Features → MCP):
{
  "mcpServers": {
    "bugproof": {
      "command": "npx",
      "args": ["-y", "bugproof", "mcp"]
    }
  }
}
Add to ~/.continue/config.json:
{
  "experimental": {
    "mcpServers": {
      "bugproof": {
        "command": "npx",
        "args": ["-y", "bugproof", "mcp"]
      }
    }
  }
}
If you already have bugproof globally (npm install -g bugproof), omit npx -y:
{
  "mcpServers": {
    "bugproof": {
      "command": "bugproof",
      "args": ["mcp"]
    }
  }
}
No separate MCP install needed. npx -y bugproof mcp auto-downloads from npmjs.org and starts the server over stdio.
| Tool | Description | When an AI agent would use it |
|---|---|---|
| capture | Run a command, capture as .bug artifact | "Capture this failing test and tell me what changed" |
| replay | Replay a .bug file, return verdict | "Replay the artifact from CI and confirm it still fails" |
| inspect | Show artifact metadata | "What's in this .bug file without running it?" |
| diff | Compare two artifacts | "Compare the CI capture with my local capture: what's different?" |
| doctor | Check sandbox capabilities | "Does this machine support full sandbox isolation?" |
| share | Share artifact via GitHub Gist | "Share this bug with my team" |
| pull | Download artifact from Gist | "Pull the bug artifact from this URL" |
| watch | Auto-capture on command failure | "Watch npm test and capture if it fails" |
| list | List .bug artifacts in directory | "Show me all bug artifacts in this project" |
| clean | Remove .bug artifacts | "Clean up old bug artifacts from this directory" |
AI agents can read .bug artifact contents directly via resource URIs:
bugproof://artifact/{path}: Read the raw .bug artifact (base64-encoded ZIP)
Pre-built workflows for common AI agent tasks:
| Prompt | Description |
|---|---|
| capture-failure | Guide to capture a failing command as a .bug artifact |
| replay-and-analyze | Replay an artifact and analyze the root cause |
| compare-bugs | Compare two artifacts to find differences |
User: Capture the failing test and tell me what went wrong
Agent: [calls bugproof capture -- npm test -- --run flaky-suite]
[calls bugproof inspect on the result]
"The test failed with a timeout. Fingerprint matches a known
Redis-unreachable pattern. Here's the captured stderr..."
The MCP server communicates over stdio (JSON-RPC 2.0). It shells out to the local bugproof CLI with --json output and returns structured results with both human-readable summaries and raw data. If bugproof isn't installed, npx -y fetches it from npmjs.org — no global install required.
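For readers unfamiliar with the wire format, the sketch below shows roughly what a tool invocation could look like on that stdio channel. It builds an MCP `tools/call` request in JSON-RPC 2.0 form and frames it as a single line of JSON. The helper names (`buildToolCall`, `frameForStdio`) and the `artifact` argument key are assumptions made for illustration; the real input schema of each BugProof tool is whatever the server advertises through `tools/list`.

```typescript
// Shape of a JSON-RPC 2.0 request for MCP's "tools/call" method.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Build a tools/call request for one of the BugProof tools.
// The argument keys are hypothetical; consult "tools/list" for the real schema.
function buildToolCall(
  id: number,
  tool: string,
  args: Record<string, unknown>,
): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: tool, arguments: args },
  };
}

// On the stdio transport each message is one line of JSON ending in a newline.
function frameForStdio(request: ToolCallRequest): string {
  return JSON.stringify(request) + "\n";
}

// "artifact" is an assumed argument name, used only for this example.
const replayCall = buildToolCall(1, "replay", { artifact: "bug.bug" });
console.log(frameForStdio(replayCall).trim());
```

An MCP host such as Claude or Cursor produces these messages for you; the sketch is only meant to make the transport described above concrete.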
BugProof ships a distributable AI Agent Skill that teaches AI coding agents (Claude Code, Cursor, OpenCode, OpenClaw, Gemini CLI, GitHub Copilot, and others) how to use BugProof effectively. The skill follows the open Agent Skills standard — build once, use across any compatible agent.
When loaded, the skill gives AI agents complete knowledge of:
Agents auto-load the skill when users mention:
Or invoke directly: /bugproof
# Clone the BugProof repository
git clone https://github.com/sidinsearch/BugProof.git
# Copy the skill to your agent's skills directory
# Claude Code:
cp -r BugProof/skills/bugproof ~/.claude/skills/
# OpenCode:
cp -r BugProof/skills/bugproof ~/.config/opencode/skills/
# OpenClaw:
cp -r BugProof/skills/bugproof ~/.agents/skills/
# Cursor:
cp -r BugProof/skills/bugproof ~/.cursor/skills/
# Download just the skill
curl -L https://github.com/sidinsearch/BugProof/archive/refs/heads/main.tar.gz | tar xz
cp -r BugProof-main/skills/bugproof ~/.claude/skills/
skills/bugproof/
├── SKILL.md                  # Main instructions (required)
├── reference/
│   └── commands.md           # Full command reference
└── examples/
    └── usage-examples.md     # Real-world usage examples
| Agent | Skill Directory |
|---|---|
| Claude Code | ~/.claude/skills/bugproof/ |
| OpenCode | ~/.config/opencode/skills/bugproof/ |
| OpenClaw | ~/.agents/skills/bugproof/ |
| Cursor | ~/.cursor/skills/bugproof/ |
| Gemini CLI | ~/.gemini/skills/bugproof/ |
| GitHub Copilot | .github/skills/bugproof/ |
The skill follows the open Agent Skills specification — any agent that supports the standard can use it.
# 1. Reproduce a failure
$ npm test
FAIL Tests failed because Redis was unreachable.
# 2. Capture it
$ bugproof capture -- npm test
✔ Artifact captured!
Path ./bug_1778049738215.bug
Files 42 files (28.4 KB)
Fingerprint sha256:c8b3...
# 3. Share the file (Slack, email, gist, attachment...)
# 4. Anyone replays it on their machine
$ bugproof replay bug_1778049738215.bug
✔ REPRODUCTION CONFIRMED
Exit code exit 1 (match)
Verdict Reproduction confirmed (exact fingerprint match)
Optional flow:
$ bugproof inspect bug.bug # peek at the contents
$ bugproof diff old.bug new.bug # what changed between two captures
$ bugproof share bug.bug # publish as a GitHub Gist
BugProof ships 14 commands, plus the built-in help. Every command supports --help and --json for machine-readable output.
| Command | Purpose |
|---|---|
| bugproof capture | Run a command, record everything, produce a .bug artifact |
| bugproof replay | Re-execute an artifact, compare against expected fingerprint |
| bugproof watch | Transparently wrap a command and capture only if it fails |
| bugproof inspect | Show artifact contents (manifest, command, fingerprint, files) |
| bugproof diff | Side-by-side comparison of two artifacts |
| bugproof verify | Validate the Ed25519 signature on a .bug (standalone) |
| bugproof keygen | Generate an Ed25519 keypair for signing artifacts |
| bugproof share | Publish an artifact as a GitHub Gist |
| bugproof mcp | Start the MCP server for AI-agent integration |
| bugproof init | Scaffold a .bugproofrc config file |
| bugproof prune | Garbage-collect orphan sandbox temp directories |
| bugproof clean | Remove all .bug artifacts from the current directory |
| bugproof pull | Download a shared .bug artifact from a GitHub Gist |
| bugproof doctor | Verify OS support for sandbox isolation features |
| bugproof help | Help for any command |
bugproof capture [command...]
Run a command end-to-end and bundle the failure as <name>.bug.
bugproof capture -- npm test
bugproof capture -n auth-crash -d "Login fails on expired session" -- node server.js
bugproof capture -o ./bugs/ -- npm test
bugproof capture --include-untracked -- python app.py
bugproof capture -x "*.log" -x "node_modules/**" -- go test ./...
bugproof capture --timeout 600000 -- java -cp . Main
bugproof capture --include-compiled -- mvn test # force include .class/.jar files
bugproof capture --sign --signer "alice@example.com" -- ./run.sh
bugproof capture --json -- node script.js
| Flag | Description |
|---|---|
| -n, --name <name> | Artifact name (becomes <name>.bug) |
| -d, --description <desc> | Human-readable description embedded in the manifest |
| -o, --output <dir> | Output directory (default: current directory; respects .bugproofrc outputDir) |
| -x, --exclude <pattern> | Exclude files by glob (repeatable) |
| --include-untracked | Bundle untracked files too (git ls-files -o) |
| --include-compiled | Force include compiled artifacts (.class, .jar, .pyc, etc.); auto-detected by default |
| --timeout <ms> | Kill the command after N ms (default 300000) |
| --skip-secrets | Don't scan env for secrets (skip the confirm prompt) |
| --sign [key] | Sign with the default key, or a named key / path to a .key file |
| --signer <id> | Embed a signer identity (email, gist URL, etc.) |
| --json | Structured JSON output |
Default behavior: Without -n, artifacts are named bug_<timestamp>.bug. With .bugproofrc nameTemplate configured, the template is used instead. Without -o, artifacts are saved in the current directory.
bugproof replay <artifact>
Re-execute the captured artifact and compare results.
bugproof replay bug.bug
bugproof replay bug.bug --sandbox isolated
bugproof replay bug.bug --self-heal
bugproof replay bug.bug --verify-signature
bugproof replay bug.bug --source-dir .
bugproof replay bug.bug --json
| Flag | Description |
|---|---|
| --sandbox <level> | workspace (default), isolated, or full |
| --self-heal | Auto-install missing npm/pip deps and retry (up to 3 rounds) |
| --verify-signature | Require a valid Ed25519 signature; exit 2 if missing or invalid |
| --source-dir <dir> | Override source directory for git operations (use current dir's repo instead of captured path) |
| --json | Structured JSON output |
Replay isolation: Replay always runs in an isolated temp directory. Files come from one of three sources: (1) a git worktree/clone at the captured commit, (2) the current directory's git repo (if the original path is inaccessible), or (3) the artifact's bundled files/ snapshot. The current directory is never read for source files.
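To make the --self-heal behavior ("install missing deps and retry, up to 3 rounds") easier to picture, here is a minimal sketch of such a loop. It is not BugProof's implementation: `RunResult`, `findMissingDeps`, and `replayWithSelfHeal` are hypothetical names, the runner and installer are injected as plain functions, and detection relies only on two well-known error messages (Node's "Cannot find module" and Python's "No module named").

```typescript
// Result of one replay attempt (hypothetical shape for this sketch).
interface RunResult {
  exitCode: number;
  stderr: string;
}

// Pull module names out of common "missing dependency" messages in stderr.
function findMissingDeps(stderr: string): string[] {
  const patterns = [/Cannot find module '([^']+)'/g, /No module named '([^']+)'/g];
  const found: string[] = [];
  for (const pattern of patterns) {
    let match: RegExpExecArray | null;
    while ((match = pattern.exec(stderr)) !== null) {
      if (found.indexOf(match[1]) === -1) found.push(match[1]);
    }
  }
  return found;
}

// Run once, then install whatever looks missing and re-run,
// for at most `maxRounds` install rounds.
function replayWithSelfHeal(
  run: () => RunResult,
  install: (deps: string[]) => void,
  maxRounds = 3,
): { result: RunResult; rounds: number } {
  let result = run();
  let rounds = 0;
  while (rounds < maxRounds) {
    const missing = findMissingDeps(result.stderr);
    if (missing.length === 0) break; // nothing left to heal
    install(missing);
    rounds += 1;
    result = run();
  }
  return { result, rounds };
}
```

The loop stops as soon as stderr no longer mentions a missing module, so the final result reflects the captured failure itself and not a dependency gap on the replaying machine.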
bugproof keygen / verify: Cryptographic Provenance
Sign artifacts with Ed25519 (RFC 8032). Built on Node's native crypto, with no external deps.
# One-time: create your signing key
bugproof keygen
# → writes default.pub / default.key to ~/.bugproof/keys/
# Capture with a signature
bugproof capture --sign --signer "alice@example.com" -- npm test
# Verify a received artifact
bugproof verify bug.bug
✔ SIGNATURE VALID
Algorithm ed25519
Fingerprint 179721ef7e63f6b3
Signed at 2026-05-10T22:09:30Z
Signer alice@example.com
# Enforce signatures at replay time
bugproof replay --verify-signature bug.bug
The signature covers a canonical hash of the manifest, the failure fingerprint, and the SHA-256 of every file in the bundle. Tampering with source, output, exit code, or metadata invalidates the signature.
Note: identity/PKI is intentionally out of scope. Trust is established by comparing the embedded public-key fingerprint against one you know (gist pinning, team wiki, key server, etc.).
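The following sketch illustrates the general idea of signing a canonical payload with Ed25519 using Node's built-in `crypto` module. It should be read as an approximation under stated assumptions: the `Bundle` shape and the `canonicalPayload` function are hypothetical, and the actual byte layout that BugProof signs is not documented here. What it does show is why changing any bundled file invalidates the signature.

```typescript
import { createHash, generateKeyPairSync, sign, verify } from "node:crypto";

// Hypothetical bundle shape; the real manifest and file layout may differ.
interface Bundle {
  manifest: Record<string, unknown>;
  fingerprint: string;
  files: Record<string, string>; // path -> file contents
}

const sha256Hex = (data: string): string =>
  createHash("sha256").update(data).digest("hex");

// Build a stable byte string: manifest entries and per-file SHA-256 hashes
// are sorted by key so that object key order cannot change the payload.
function canonicalPayload(bundle: Bundle): Buffer {
  const manifestEntries = Object.keys(bundle.manifest)
    .sort()
    .map((key) => [key, bundle.manifest[key]]);
  const fileHashes = Object.keys(bundle.files)
    .sort()
    .map((path) => [path, sha256Hex(bundle.files[path])]);
  const payload = {
    manifest: manifestEntries,
    fingerprint: bundle.fingerprint,
    files: fileHashes,
  };
  return Buffer.from(JSON.stringify(payload), "utf8");
}

const { publicKey, privateKey } = generateKeyPairSync("ed25519");

const bundle: Bundle = {
  manifest: { command: "npm test", exitCode: 1 },
  fingerprint: "sha256:example",
  files: { "app.js": "process.exit(1)" },
};

// Ed25519 takes no separate digest algorithm, hence the `null` argument.
const signature = sign(null, canonicalPayload(bundle), privateKey);
console.log(verify(null, canonicalPayload(bundle), publicKey, signature)); // true

// Changing a bundled file changes its hash, so the old signature no longer verifies.
const tampered: Bundle = { ...bundle, files: { "app.js": "process.exit(0)" } };
console.log(verify(null, canonicalPayload(tampered), publicKey, signature)); // false
```

Sorting keys before serializing is one simple way to get a canonical form for flat objects; nested manifests would need a recursive variant.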
bugproof watch [command...]
Transparent wrapper. Runs the command normally and captures only if it fails. Drop-in replacement for any command you'd otherwise hand-run.
bugproof watch -- npm test
bugproof watch -o ./bugs -- python app.py
bugproof watch --always -- node script.js # capture even on success
bugproof inspect <artifact> / diff <a> <b>
bugproof inspect bug.bug # manifest, fingerprint, file list, env schema
bugproof diff captured-v1.bug captured-v2.bug # what changed between two captures
bugproof share <artifact>
Publish an artifact as a GitHub Gist. Respects HTTPS_PROXY / HTTP_PROXY for corporate networks.
bugproof share bug.bug
bugproof share --public bug.bug
Requires GITHUB_TOKEN (or BUGPROOF_GITHUB_TOKEN) with gist scope.
bugproof init / prune / doctor
bugproof init # scaffold .bugproofrc in the current directory
bugproof prune # garbage-collect orphan BugBox temp directories
bugproof doctor # check OS support for sandbox isolation (namespaces, Job Objects, Seatbelt)
Configuration (.bugproofrc)
Generated by bugproof init. All fields are optional.
{
  "exclude": ["node_modules/**", "dist/**", "*.bug"],
  "outputDir": ".",
  "timeout": 300000,
  "skipSecrets": false,
  "includeUntracked": false
}
BugProof keeps artifacts small even on heavy codebases:
| Strategy | When | What ships | Typical size |
|---|---|---|---|
| git-full | Clean git repo | Commit ref only | ~2 KB |
| git-patch | Dirty git repo | Commit ref + diff patch | ~5 KB |
| git-files | Force mode / untracked | All tracked + untracked files | varies |
| full-copy | No git repo | Full codebase (excl. node_modules, etc.) | ~10–100 MB |
Git is strongly encouraged but not required.
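The table above can be read as a small decision procedure. The sketch below encodes one plausible reading of it; the `RepoState` fields and the `chooseSourceStrategy` name are hypothetical, and the precedence between "untracked files requested" and "dirty repo" is an assumption, since the table does not spell it out.

```typescript
type SourceStrategy = "git-full" | "git-patch" | "git-files" | "full-copy";

// Hypothetical inputs for this sketch.
interface RepoState {
  isGitRepo: boolean;
  hasUncommittedChanges: boolean;
  includeUntracked: boolean; // e.g. --include-untracked was passed
}

function chooseSourceStrategy(state: RepoState): SourceStrategy {
  if (!state.isGitRepo) return "full-copy"; // no git: ship the codebase
  if (state.includeUntracked) return "git-files"; // untracked files must be bundled
  if (state.hasUncommittedChanges) return "git-patch"; // commit ref + diff patch
  return "git-full"; // clean repo: a commit ref is enough
}

console.log(
  chooseSourceStrategy({ isGitRepo: true, hasUncommittedChanges: false, includeUntracked: false }),
); // git-full
```

The ordering reflects the size column: the cheaper a strategy is, the more conditions the repository has to satisfy before it applies.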
BugProof auto-detects compiled languages and bundles build artifacts automatically:
| Language | Source Files | Compiled Artifacts | Auto-Included? |
|---|---|---|---|
| Java | .java | .class, .jar, .war, .ear | ✅ Yes (from target/, build/, out/) |
| Python | .py | .pyc, .pyo | ✅ Yes (from __pycache__/) |
| Go | .go | bin/, compiled binaries | ✅ Yes (from bin/, dist/) |
| Rust | .rs | target/ binaries | ✅ Yes (from target/) |
| .NET/C# | .cs, .csproj | .dll, .exe | ✅ Yes (from bin/, obj/) |
| WebAssembly | — | .wasm | ✅ Yes (from any build dir) |
| Node native | — | .node | ✅ Yes (from build/, dist/) |
| C/C++ | .c, .cpp | .o, .obj, .exe | ❌ No (platform-specific, source-only) |
How it works: When BugProof detects a compiled language project (via pom.xml, go.mod, Cargo.toml, etc.) and finds compiled artifacts in standard build directories, it automatically includes them. No flag needed.
Use --include-compiled to force inclusion even when auto-detection misses something, or for edge cases.
BugProof runs replayed commands in a layered sandbox — Docker-like isolation built on native OS primitives.
| Layer | Linux | Windows | macOS |
|---|---|---|---|
| Process | PID namespace (unshare --pid) | Job Objects | sandbox-exec |
| Network | Network namespace (unshare --net) | netsh advfirewall rules | (deny network*) |
| Filesystem | fuse-overlayfs (RO source + writable overlay) | Isolated temp directory | Restricted write paths |
| Resource limits | cgroups v2 (memory, CPU, PIDs) | Job Object limits | — |
| Env sanitization | Strip LD_PRELOAD, NODE_OPTIONS, … | Same | Same |
| Temp | Private /tmp | Private %TEMP% | Private /tmp |
Three sandbox levels are exposed via --sandbox:
- workspace (default): minimal isolation, fast. Good for trusted artifacts.
- isolated: namespace + temp isolation. Recommended for untrusted artifacts.
- full: all layers including network deny + resource limits.
Caveat: On Windows, isolated and full are best-effort hardening, not VM-grade containment. For artifacts from fully untrusted sources, replay inside a dedicated VM.
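The env sanitization layer in the table above is the simplest one to illustrate. The sketch below drops the variables this README names (LD_PRELOAD, NODE_OPTIONS, and the DYLD_ family) before a child process would be spawned. `sanitizeEnv` and the two block lists are hypothetical names, and the real list is longer than what is shown here.

```typescript
// Variables named in this README; BugProof's actual list includes more.
const BLOCKED_EXACT = ["LD_PRELOAD", "NODE_OPTIONS"];
const BLOCKED_PREFIXES = ["DYLD_"];

// Return a copy of the environment without runtime-hijack variables.
function sanitizeEnv(env: Record<string, string | undefined>): Record<string, string> {
  const clean: Record<string, string> = {};
  for (const key of Object.keys(env)) {
    const value = env[key];
    if (value === undefined) continue; // skip unset entries
    if (BLOCKED_EXACT.indexOf(key) !== -1) continue;
    if (BLOCKED_PREFIXES.some((prefix) => key.indexOf(prefix) === 0)) continue;
    clean[key] = value;
  }
  return clean;
}

console.log(
  Object.keys(sanitizeEnv({ PATH: "/usr/bin", LD_PRELOAD: "/tmp/evil.so", DYLD_INSERT_LIBRARIES: "x" })),
); // [ 'PATH' ]
```

Building a fresh object instead of deleting keys in place keeps the caller's environment untouched.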
Capture-time runtime versions are recorded and diffed on replay:
Environment Mismatches
• node version mismatch: captured 18.0.0, current 22.1.0
✘ python 3.11.0 was available at capture but is not installed now.
Tracked: Node.js, Python, Ruby, Go, Rust, Java, npm, pip, OS platform, architecture.
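A comparison like the one shown above can be sketched in a few lines. This is an illustration only: `ToolVersions` and `diffEnvironments` are hypothetical names, and the message wording is copied from the sample output, not from BugProof's source.

```typescript
// Tool name -> version string, or undefined when the tool is not installed.
type ToolVersions = Record<string, string | undefined>;

// Compare capture-time versions against the current machine.
function diffEnvironments(captured: ToolVersions, current: ToolVersions): string[] {
  const messages: string[] = [];
  for (const tool of Object.keys(captured).sort()) {
    const was = captured[tool];
    const now = current[tool];
    if (was === undefined) continue; // tool was absent at capture time
    if (now === undefined) {
      messages.push(`${tool} ${was} was available at capture but is not installed now.`);
    } else if (now !== was) {
      messages.push(`${tool} version mismatch: captured ${was}, current ${now}`);
    }
  }
  return messages;
}

const mismatches = diffEnvironments(
  { node: "18.0.0", python: "3.11.0" },
  { node: "22.1.0" },
);
console.log(mismatches.join("\n"));
```

With these inputs the function reports one version mismatch for node and one missing runtime for python, matching the two kinds of lines in the sample output.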
| Capture ↘ / Replay ↗ | Windows | Linux | macOS |
|---|---|---|---|
| Windows | ✅ | ✅ | ✅ |
| Linux | ✅ | ✅ | ✅ |
| macOS | ✅ | ✅ | ✅ |
The translation layer normalizes commands (python3 ↔ python, gradlew ↔ gradlew.bat, make ↔ mingw32-make, shell paths). Architecture mismatches (x64 ↔ arm64) trigger explicit warnings with Rosetta/translation advice.
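As a rough picture of what such a translation layer does, the sketch below swaps the program name using the three command pairs listed above. The names `TargetPlatform`, `COMMAND_PAIRS`, and `translateCommand` are hypothetical, and which side of each pair belongs to which platform is an assumption; the real layer also handles shell paths, which this sketch does not attempt.

```typescript
type TargetPlatform = "win32" | "linux" | "darwin";

// Pairs named in the README, written as [POSIX form, Windows form].
const COMMAND_PAIRS: Array<[string, string]> = [
  ["python3", "python"],
  ["gradlew", "gradlew.bat"],
  ["make", "mingw32-make"],
];

// Translate only the program name; arguments are passed through unchanged.
function translateCommand(argv: string[], target: TargetPlatform): string[] {
  if (argv.length === 0) return argv;
  const [program, ...rest] = argv;
  for (const [posix, windows] of COMMAND_PAIRS) {
    if (target === "win32" && program === posix) return [windows, ...rest];
    if (target !== "win32" && program === windows) return [posix, ...rest];
  }
  return argv; // no known translation
}

console.log(translateCommand(["python3", "app.py"], "win32")); // [ 'python', 'app.py' ]
console.log(translateCommand(["gradlew.bat", "test"], "linux")); // [ 'gradlew', 'test' ]
```

Working on an argument array instead of a shell string matches the "argument arrays, never shell strings" rule this README states for sandbox commands.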
| Area | Mechanism |
|---|---|
| Secrets — known patterns | Env vars matching *_TOKEN, *_KEY, *_SECRET, AWS/GCP/Stripe shapes are redacted at capture |
| Secrets — unknown values | Shannon entropy analysis flags high-entropy values (≥4.5 bits/char) even with innocuous key names |
| stdout/stderr scrubbing | Active regex stream-scrubber strips emails, IPs, credit cards, GitHub tokens, Stripe keys from captured output |
| Path traversal | Every file copy is validated to stay within artifact and project boundaries |
| Script injection | Sandbox commands are spawned with argument arrays — never via shell strings |
| Provenance | Ed25519 signatures cover manifest + fingerprint + per-file SHA-256s. Verification is local, no network calls |
| Sandbox env sanitization | LD_PRELOAD, NODE_OPTIONS, DYLD_*, and similar runtime-hijack vectors are stripped before replay |
| Cryptography | Node native crypto only — no external crypto deps. No telemetry. No phone-home |
Use --skip-secrets only when you've audited the environment yourself.
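The first two rows of the table above combine a name-based check with an entropy check. The sketch below shows how the two can work together, using the 4.5 bits/char threshold from the table. The function names and the exact name pattern are assumptions for illustration; the cloud-provider value shapes mentioned in the table are not modeled.

```typescript
// Shannon entropy of a string in bits per character.
function shannonEntropy(value: string): number {
  if (value.length === 0) return 0;
  const counts: Record<string, number> = {};
  for (let i = 0; i < value.length; i += 1) {
    const ch = value.charAt(i);
    counts[ch] = (counts[ch] || 0) + 1;
  }
  let bits = 0;
  for (const ch of Object.keys(counts)) {
    const p = counts[ch] / value.length;
    bits -= p * (Math.log(p) / Math.LN2); // log base 2
  }
  return bits;
}

const SECRET_NAME = /(_TOKEN|_KEY|_SECRET)$/; // known-pattern check on the key
const ENTROPY_THRESHOLD = 4.5; // bits per character, from the table above

// Flag a variable if its name looks sensitive or its value looks random.
function looksLikeSecret(key: string, value: string): boolean {
  return SECRET_NAME.test(key) || shannonEntropy(value) >= ENTROPY_THRESHOLD;
}

console.log(looksLikeSecret("GITHUB_TOKEN", "x")); // true (name match)
console.log(looksLikeSecret("PATH", "/usr/bin")); // false
console.log(looksLikeSecret("CACHE_ID", "abcdefghijklmnopqrstuvwxyz012345")); // true (5 bits/char)
```

A value needs at least 23 distinct characters to reach 4.5 bits per character, so short values are only ever caught by the name check.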
Windows: Registered under HKCU\Software\Classes\.bug → BugProof.Artifact with open command node <package>/dist/cli.js replay "%1".
Linux: Registers MIME application/x-bugproof, a bugproof.desktop handler, and a user-level icon entry in ~/.local/share/icons/hicolor/.
macOS: Best-effort via the bundled script. Re-run manually if Finder association doesn't take:
bash scripts/bugproof-file-association-macos.sh
bugproof/
├── src/
│ ├── capture/ # Execution + env snapshot + language detection + packaging
│ ├── replay/ # Restore + sandbox orchestration + verdict + self-heal
│ ├── sandbox/ # OS-specific isolation (filesystem, network, process)
│ ├── share/ # Gist publisher
│ ├── diff/ # Two-artifact diff engine
│ ├── config/ # .bugproofrc loader and validation
│ ├── utils/ # signing, secrets, fingerprint, dependencies, security, …
│ └── cli.ts # Commander entrypoint (14 commands)
├── tests/ # 40 suites / 502 tests (Jest)
├── scripts/ # Postinstall, e2e matrix, file-association helpers
└── .github/workflows/ # CI/CD (tri-platform matrix, signed npm publish)
| Module | Responsibility |
|---|---|
| capture/engine.ts | Execute the user's command, stream output to temp files, record stdout/stderr/exit |
| capture/packager.ts | Bundle into .bug zip; optionally sign with Ed25519 |
| capture/language-support.ts | Detect Node/Python/Java/Go/Rust/.NET/C++/Kotlin + compiled artifact auto-detection |
| capture/env-snapshot.ts | Record runtime versions for environment diff on replay |
| capture/source-strategy.ts | Smart source selection: git-full, git-patch, stacktrace, minimal |
| replay/engine.ts | Reproduce the command in a sandbox |
| replay/self-heal.ts | Detect missing deps, install in sandbox, retry |
| replay/verdict.ts | Compare fingerprint + normalized error patterns |
| replay/hints.ts | Generate actionable debugging hints from captured output |
| sandbox/bugbox.ts | Orchestrate per-OS isolation layers |
| sandbox/cross-platform.ts | Command translation across Windows/Linux/macOS |
| utils/signing.ts | Ed25519 sign / verify / canonical-payload builder |
| utils/secrets.ts | Pattern + entropy-based env scanning, PII stream scrubber |
| utils/fingerprint.ts | Deterministic failure fingerprinting, path normalization |
| utils/dependencies.ts | Detect missing npm/pip/system deps from stderr |
| utils/ui.ts | Terminal UI: colors, spinners, progress bars, summary boxes |
| diff/engine.ts | Side-by-side artifact comparison |
| share/gist.ts | GitHub Gist publisher with proxy support |
| config/loader.ts | Load and validate .bugproofrc configuration |
PRs welcome. See CONTRIBUTING.md for guidelines, dev setup, and the test matrix expectations. Every PR runs the full tri-platform CI; please add tests for new behavior.
This project is licensed under the GNU Affero General Public License v3.0 (AGPL-3.0).
| Use Case | Allowed? |
|---|---|
| Personal & non-commercial use | ✅ Free, no restrictions |
| Forking & modifications | ✅ Must release under AGPL-3.0 with source code |
| Running as a network service (SaaS) | ✅ Must publish your modified source code |
| Commercial / proprietary use | ❌ Requires a separate commercial license |
Made with ❤️ by sidinsearch · Copyright © 2026 sidinsearch · AGPL-3.0 License