Antithesis Debug

146 installs67 stars

Summary

If you're debugging Antithesis test runs, this skill walks Claude through using the multiverse debugger to inspect container filesystems, run shell commands, and extract evidence from failed test scenarios. It supports both the simplified command interface and the advanced JavaScript notebook mode, with detailed references for switching between them, authorizing actions, and navigating debugging sessions. The skill emphasizes starting with simplified mode for most tasks and only escalating to advanced when you need time travel to inspect crashed containers or complex multi-step workflows. Requires agent-browser to drive the web interface, and includes ready-to-use snippets for common inspection patterns.

Install to Claude Code

npx -y skills add antithesishq/antithesis-skills --skill antithesis-debug --agent claude-code

Installs into .claude/skills of the current project.

CodeRabbit

AI writes the code. CodeRabbit catches the slop.

Try For Free →

Keep your Mac awake

Keep your Mac awake while Claude Code and 40+ AI agents run. Sleeps when they're idle.

One time payment $9 →

Context.dev

Integrate web data into your AI product. One API to scrape website & brand data.

Get API Key Now →

Make your agent a DeFi expert

Agent, run crypto. Access onchain data & trade routes via 1inch.

Install now →

Make money from your Skills

On Capafy, your Skill runs online 24/7 as an agent product, and you get paid every time someone uses it.

Start earning →

AppSignal

Monitor with ease. Code with confidence.

Start Free Trial →

CodeRabbit

AI writes the code. CodeRabbit catches the slop.

Try For Free →

Keep your Mac awake

Keep your Mac awake while Claude Code and 40+ AI agents run. Sleeps when they're idle.

One time payment $9 →

Context.dev

Integrate web data into your AI product. One API to scrape website & brand data.

Get API Key Now →

Make your agent a DeFi expert

Agent, run crypto. Access onchain data & trade routes via 1inch.

Install now →

Make money from your Skills

On Capafy, your Skill runs online 24/7 as an agent product, and you get paid every time someone uses it.

Start earning →

AppSignal

Monitor with ease. Code with confidence.

Start Free Trial →

Files

SKILL.mdView on GitHub

Antithesis Multiverse Debugger

Use the agent-browser skill to interact with the Antithesis multiverse debugger.

Every debugging session should use:

a fresh, unique SESSION value such as antithesis-debug-$(date +%s)-$$

Use --session-name antithesis so agent-browser manages shared authentication state automatically, while --session "$SESSION" keeps each debugging run isolated from other concurrent agents. Close the unique live session when debugging is complete.

When to use this skill

Use this when:

the user gives an Antithesis debugging-session URL
the user makes a request to inspect container filesystem, runtime state, or events inside Antithesis
the triage skill has launched an MVD session when investigating the results of a run

For auth and report navigation, use the antithesis-agent-browser skill. It owns the agent-browser session and authentication flow. This skill handles the debugger itself.

Prerequisites

DO NOT PROCEED if agent-browser is not installed. See https://raw.githubusercontent.com/vercel-labs/agent-browser/refs/heads/main/README.md for installation options.
DO NOT PROCEED if agent-browser is older than version v0.23.4. You can upgrade with agent-browser upgrade.

Gathering user input

Before starting, collect the following from the user:

Debugger URL (required) — A debugging-session URL like https://TENANT.antithesis.com/debugging-session/....
What to investigate — Are we checking filesystem contents? Runtime state? Specific artifacts?
Container name (if known) — The name of the container to target. If not provided, the log view or container dropdown will show available containers.

Simplified vs. advanced mode

The debugger has two modes. Prefer simplified mode — it is sufficient for most tasks.

Mode	Best for	How it works
Simplified	Running shell commands, reading files, listing directories, extracting artifacts	Click log lines to set moment/container, type bash commands, press Send
Advanced	Programmatic inspection, branching, event sets, custom JavaScript, multi-step notebook workflows	Monaco editor with JavaScript cells, action authorization, runtime API

Detecting which mode is active

Different tenants may open the debugger in either mode by default. After injecting the runtime (see references/setup-session.md), check which mode is active:

agent-browser --session "$SESSION" eval \
  "window.__antithesisDebug.getMode()"

Returns "simplified" or "advanced". To switch:

agent-browser --session "$SESSION" eval \
  'window.__antithesisDebug.switchMode("simplified")'

When to escalate to advanced mode

Switch to advanced mode only when you need:

A target container that had crashed (or otherwise wasn't running) at the MVD recreate moment. The simplified container dropdown only lists containers running at that point. To inspect a container while it was alive, you must time-travel back via moment.rewind_to(...) and target it by name via bash...run({container: "<name>"}). This is the one container-targeting case where advanced is strictly required.
Branching (moment.branch()) and advancing time (branch.wait, branch.wait_until)
Exploring multiple random histories from the same moment (branch.send_input)
Event set queries (environment.events.up_to(moment), .contains({source}))
Fault injector state inspection or control (environment.fault_injector.pause/unpause)
Background processes and process-exit awaiting (run_in_background, p.exits)
Complex multi-cell notebook workflows with grouped authorization
The full JavaScript notebook API
As an escape hatch when something in simplified mode isn't working

To switch programmatically, call window.__antithesisDebug.switchMode("advanced"). This is what getMode() reports. Note: there is a separate "Simple mode" / "Advanced mode" tab strip in the right pane (.display_area__tabs) that switchMode() does NOT click; the two can disagree. The right-pane tab strip controls the help / pop-out panel content. To read the on-page advanced-mode help (a canonical ~5KB reference for the notebook API), click the "Advanced mode" tab in that strip and read document.querySelector(".display_area__layout").innerText. See references/advanced-debugger.md.

Reference files

Each reference file covers a specific interaction mode or task. Read the relevant file before performing that task.

Simplified mode (default — start here)

Page	When to read
`references/setup-session.md`	Always — read first to set up the browser session
`references/simplified-debugger.md`	Running commands, extracting files, reading logs in simplified mode

Advanced mode (using a notebook)

Page	When to read
`references/setup-session.md`	Always — read first to set up the browser session
`references/advanced-debugger.md`	Read this BEFORE other advanced refs — mental model, mode-switching, authorization, branches, time advancement, fault injector, host commands, common errors
`references/notebook.md`	Reading or writing notebook source, injecting cells
`references/actions.md`	Authorizing shell actions, reading action output
`references/common-inspections.md`	Ready-to-use debug cell snippets for common tasks

Recommended workflows

Simplified: Run a command in a container

Read references/setup-session.md — open the debugger URL
Read references/simplified-debugger.md — click a log line to set the moment and container, enter a bash command, press Send, read the output
Report findings with concrete evidence

Simplified: Extract a file

Read references/setup-session.md — open the debugger URL
Read references/simplified-debugger.md — click a log line, toggle "Extract file", enter the file path, press Send
Read the download link from the output

Advanced: Programmatic investigation

Read references/setup-session.md — open the debugger URL and inject runtime
Switch to advanced mode: window.__antithesisDebug.switchMode("advanced"), then notebook.waitForReady()
Read references/advanced-debugger.md — internalize moments, branches, reactive-vs-effectful, action grouping, and common errors
Read references/notebook.md — mechanics of writing cells
Read references/common-inspections.md — ready-to-use snippets
Read references/actions.md — authorizing and reading action results
Report findings with evidence chain

Download the events log for offline analysis

Read references/setup-session.md — launch / open the debugger URL
Read references/download-log.md — run assets/download-mvd-log.sh to capture and (for JSON) annotate the events log. The annotation step uses this skill's own assets/process-logs.py.
Analyze the local file with jq (event shape matches the triage skill's logs reference)

Runtime injection

The JS runtime is required for both simplified and advanced modes. It provides the window.__antithesisDebug API with three namespaces: simplified, notebook, and actions.

Use the browser-side runtime file:

assets/antithesis-debug.js

Inject it into the current page with:

cat assets/antithesis-debug.js \
  | agent-browser --session "$SESSION" eval --stdin

Injecting the file registers methods on window.__antithesisDebug. Call those methods with agent-browser eval.

Method call examples:

# Simplified mode
agent-browser --session "$SESSION" eval \
  'window.__antithesisDebug.simplified.runCommand("ls -la /")'

# Advanced mode
agent-browser --session "$SESSION" eval \
  "window.__antithesisDebug.notebook.getSource()"

agent-browser eval awaits Promises automatically, so async and sync methods use the same call pattern.

If window.__antithesisDebug is missing, inject assets/antithesis-debug.js and retry the method call.

Do not run method calls in parallel with agent-browser open, navigation, or any other command that can replace the page. Wait until the target page is settled before starting eval calls. Run queries sequentially; mutations will interfere with each other if launched in parallel.

After every open call or any interaction that may navigate or replace the page, first confirm the browser has landed on the debugger page, then inject assets/antithesis-debug.js and call the appropriate waitForReady() before running other methods.

Page loading checks

After navigation, inject the runtime, then call the appropriate waitForReady:

# Simplified mode
agent-browser --session "$SESSION" eval \
  "window.__antithesisDebug.simplified.waitForReady()"

# Advanced mode
agent-browser --session "$SESSION" eval \
  "window.__antithesisDebug.notebook.waitForReady()"

Both poll for up to 60 seconds and return { ok, ready, attempts, waitedMs }. On timeout, the result also includes details.

One-shot probes:

agent-browser --session "$SESSION" eval \
  "window.__antithesisDebug.simplified.loadingFinished()"

agent-browser --session "$SESSION" eval \
  "window.__antithesisDebug.notebook.loadingFinished()"

Diagnostics:

agent-browser --session "$SESSION" eval \
  "window.__antithesisDebug.simplified.loadingStatus()"

agent-browser --session "$SESSION" eval \
  "window.__antithesisDebug.notebook.loadingStatus()"

General guidance

Start in simplified mode. The simplified debugger handles most debugging tasks. Only switch to advanced mode when you specifically need the notebook API.
Defer to antithesis-agent-browser for auth. If the debugger URL requires authentication, use the antithesis-agent-browser skill's references/setup-auth.md for the interactive login flow. Use the same --session-name antithesis so auth state is shared.
Use disposable sessions. Generate a unique SESSION for each debugging run, pair it with the shared --session-name antithesis, and agent-browser --session "$SESSION" close when you finish or abort.
Run commands sequentially. In both modes, wait for each command to complete before sending the next one.
Do not fabricate container names. Use the container dropdown (simplified) or environment.containers.list({moment}) (advanced) to determine valid container names.
Present results clearly. When reporting filesystem contents, include the full path and listing. When reporting artifact searches, include what was found and what was not.

Advanced mode only

Inject the runtime after navigation. After every open call or page reload, inject assets/antithesis-debug.js and call notebook.waitForReady() before the next method call.
Retry missing-runtime errors by reinjecting. If a command fails because window.__antithesisDebug is undefined or missing, inject the runtime and rerun the same method.
Group effects with action() + required_by. Don't authorize cells one at a time when a single click can fire a chain — see references/advanced-debugger.md for the tethered_authorization pattern.
Don't edit a command that has already been authorized. Make a new branch (and new action) for the new attempt. Editing an authorized cell re-runs it without re-authorization (a known bug).
A nonzero exit code terminates the branch. Subsequent commands on the same branch will fail with CAMPAIGN SAW TERMINAL EVENT. Fork a fresh branch from the same or an earlier moment.
Trust cell.text/innerText over actionCompleted. The runtime's completion-detection misses real DONE states; read the cell text directly.

Self-Review

Before declaring this skill complete, review your work against the criteria below. This skill's output is conversational (summaries, evidence, analysis), so the review should happen in your current context. Re-read the guidance in this file, then systematically check each item below against the answers and analysis you produced.

Review criteria:

Every filesystem listing or artifact search was extracted from actual debugger output, not inferred or assumed
The evidence chain is clear: which commands were run, what they returned, and what conclusions follow
The summary distinguishes between what the debugger shows and what you interpret or recommend
The browser session was closed at the end of the debugging run

Featured

CodeRabbit

AI writes the code. CodeRabbit catches the slop.

Try For Free →

Keep your Mac awake

Keep your Mac awake while Claude Code and 40+ AI agents run. Sleeps when they're idle.

One time payment $9 →

Context.dev

Integrate web data into your AI product. One API to scrape website & brand data.

Get API Key Now →

Make your agent a DeFi expert

Agent, run crypto. Access onchain data & trade routes via 1inch.

Install now →

Make money from your Skills

On Capafy, your Skill runs online 24/7 as an agent product, and you get paid every time someone uses it.

Start earning →

AppSignal

Monitor with ease. Code with confidence.

Start Free Trial →

Antithesis Multiverse Debugger

Use the agent-browser skill to interact with the Antithesis multiverse debugger.

Every debugging session should use:

a fresh, unique SESSION value such as antithesis-debug-$(date +%s)-$$

When to use this skill

Use this when:

the user gives an Antithesis debugging-session URL
the user makes a request to inspect container filesystem, runtime state, or events inside Antithesis
the triage skill has launched an MVD session when investigating the results of a run

For auth and report navigation, use the antithesis-agent-browser skill. It owns the agent-browser session and authentication flow. This skill handles the debugger itself.

Prerequisites

DO NOT PROCEED if agent-browser is not installed. See https://raw.githubusercontent.com/vercel-labs/agent-browser/refs/heads/main/README.md for installation options.
DO NOT PROCEED if agent-browser is older than version v0.23.4. You can upgrade with agent-browser upgrade.

Gathering user input

Before starting, collect the following from the user:

Debugger URL (required) — A debugging-session URL like https://TENANT.antithesis.com/debugging-session/....
What to investigate — Are we checking filesystem contents? Runtime state? Specific artifacts?
Container name (if known) — The name of the container to target. If not provided, the log view or container dropdown will show available containers.

Simplified vs. advanced mode

The debugger has two modes. Prefer simplified mode — it is sufficient for most tasks.

Mode	Best for	How it works
Simplified	Running shell commands, reading files, listing directories, extracting artifacts	Click log lines to set moment/container, type bash commands, press Send
Advanced	Programmatic inspection, branching, event sets, custom JavaScript, multi-step notebook workflows	Monaco editor with JavaScript cells, action authorization, runtime API

Detecting which mode is active

Different tenants may open the debugger in either mode by default. After injecting the runtime (see references/setup-session.md), check which mode is active:

agent-browser --session "$SESSION" eval \
  "window.__antithesisDebug.getMode()"

Returns "simplified" or "advanced". To switch:

agent-browser --session "$SESSION" eval \
  'window.__antithesisDebug.switchMode("simplified")'

When to escalate to advanced mode

Switch to advanced mode only when you need:

A target container that had crashed (or otherwise wasn't running) at the MVD recreate moment. The simplified container dropdown only lists containers running at that point. To inspect a container while it was alive, you must time-travel back via moment.rewind_to(...) and target it by name via bash...run({container: "<name>"}). This is the one container-targeting case where advanced is strictly required.
Branching (moment.branch()) and advancing time (branch.wait, branch.wait_until)
Exploring multiple random histories from the same moment (branch.send_input)
Event set queries (environment.events.up_to(moment), .contains({source}))
Fault injector state inspection or control (environment.fault_injector.pause/unpause)
Background processes and process-exit awaiting (run_in_background, p.exits)
Complex multi-cell notebook workflows with grouped authorization
The full JavaScript notebook API
As an escape hatch when something in simplified mode isn't working

Reference files

Each reference file covers a specific interaction mode or task. Read the relevant file before performing that task.

Simplified mode (default — start here)

Page	When to read
`references/setup-session.md`	Always — read first to set up the browser session
`references/simplified-debugger.md`	Running commands, extracting files, reading logs in simplified mode

Advanced mode (using a notebook)

Page	When to read
`references/setup-session.md`	Always — read first to set up the browser session
`references/advanced-debugger.md`	Read this BEFORE other advanced refs — mental model, mode-switching, authorization, branches, time advancement, fault injector, host commands, common errors
`references/notebook.md`	Reading or writing notebook source, injecting cells
`references/actions.md`	Authorizing shell actions, reading action output
`references/common-inspections.md`	Ready-to-use debug cell snippets for common tasks

Recommended workflows

Simplified: Run a command in a container

Read references/setup-session.md — open the debugger URL
Read references/simplified-debugger.md — click a log line to set the moment and container, enter a bash command, press Send, read the output
Report findings with concrete evidence

Simplified: Extract a file

Read references/setup-session.md — open the debugger URL
Read references/simplified-debugger.md — click a log line, toggle "Extract file", enter the file path, press Send
Read the download link from the output

Advanced: Programmatic investigation

Read references/setup-session.md — open the debugger URL and inject runtime
Switch to advanced mode: window.__antithesisDebug.switchMode("advanced"), then notebook.waitForReady()
Read references/advanced-debugger.md — internalize moments, branches, reactive-vs-effectful, action grouping, and common errors
Read references/notebook.md — mechanics of writing cells
Read references/common-inspections.md — ready-to-use snippets
Read references/actions.md — authorizing and reading action results
Report findings with evidence chain

Download the events log for offline analysis

Read references/setup-session.md — launch / open the debugger URL
Read references/download-log.md — run assets/download-mvd-log.sh to capture and (for JSON) annotate the events log. The annotation step uses this skill's own assets/process-logs.py.
Analyze the local file with jq (event shape matches the triage skill's logs reference)

Runtime injection

The JS runtime is required for both simplified and advanced modes. It provides the window.__antithesisDebug API with three namespaces: simplified, notebook, and actions.

Use the browser-side runtime file:

assets/antithesis-debug.js

Inject it into the current page with:

cat assets/antithesis-debug.js \
  | agent-browser --session "$SESSION" eval --stdin

Injecting the file registers methods on window.__antithesisDebug. Call those methods with agent-browser eval.

Method call examples:

# Simplified mode
agent-browser --session "$SESSION" eval \
  'window.__antithesisDebug.simplified.runCommand("ls -la /")'

# Advanced mode
agent-browser --session "$SESSION" eval \
  "window.__antithesisDebug.notebook.getSource()"

agent-browser eval awaits Promises automatically, so async and sync methods use the same call pattern.

If window.__antithesisDebug is missing, inject assets/antithesis-debug.js and retry the method call.

Page loading checks

After navigation, inject the runtime, then call the appropriate waitForReady:

# Simplified mode
agent-browser --session "$SESSION" eval \
  "window.__antithesisDebug.simplified.waitForReady()"

# Advanced mode
agent-browser --session "$SESSION" eval \
  "window.__antithesisDebug.notebook.waitForReady()"

Both poll for up to 60 seconds and return { ok, ready, attempts, waitedMs }. On timeout, the result also includes details.

One-shot probes:

agent-browser --session "$SESSION" eval \
  "window.__antithesisDebug.simplified.loadingFinished()"

agent-browser --session "$SESSION" eval \
  "window.__antithesisDebug.notebook.loadingFinished()"

Diagnostics:

agent-browser --session "$SESSION" eval \
  "window.__antithesisDebug.simplified.loadingStatus()"

agent-browser --session "$SESSION" eval \
  "window.__antithesisDebug.notebook.loadingStatus()"

General guidance

Start in simplified mode. The simplified debugger handles most debugging tasks. Only switch to advanced mode when you specifically need the notebook API.
Defer to antithesis-agent-browser for auth. If the debugger URL requires authentication, use the antithesis-agent-browser skill's references/setup-auth.md for the interactive login flow. Use the same --session-name antithesis so auth state is shared.
Use disposable sessions. Generate a unique SESSION for each debugging run, pair it with the shared --session-name antithesis, and agent-browser --session "$SESSION" close when you finish or abort.
Run commands sequentially. In both modes, wait for each command to complete before sending the next one.
Do not fabricate container names. Use the container dropdown (simplified) or environment.containers.list({moment}) (advanced) to determine valid container names.
Present results clearly. When reporting filesystem contents, include the full path and listing. When reporting artifact searches, include what was found and what was not.

Advanced mode only

Inject the runtime after navigation. After every open call or page reload, inject assets/antithesis-debug.js and call notebook.waitForReady() before the next method call.
Retry missing-runtime errors by reinjecting. If a command fails because window.__antithesisDebug is undefined or missing, inject the runtime and rerun the same method.
Group effects with action() + required_by. Don't authorize cells one at a time when a single click can fire a chain — see references/advanced-debugger.md for the tethered_authorization pattern.
Don't edit a command that has already been authorized. Make a new branch (and new action) for the new attempt. Editing an authorized cell re-runs it without re-authorization (a known bug).
A nonzero exit code terminates the branch. Subsequent commands on the same branch will fail with CAMPAIGN SAW TERMINAL EVENT. Fork a fresh branch from the same or an earlier moment.
Trust cell.text/innerText over actionCompleted. The runtime's completion-detection misses real DONE states; read the cell text directly.

Self-Review

Review criteria:

Every filesystem listing or artifact search was extracted from actual debugger output, not inferred or assumed
The evidence chain is clear: which commands were run, what they returned, and what conclusions follow
The summary distinguishes between what the debugger shows and what you interpret or recommend
The browser session was closed at the end of the debugging run