When your Antithesis test run fails or finishes, this skill walks you through the triage workflow using snouty and jq. It pulls run status, investigates failed assertions by downloading logs at specific virtual timestamps, compares passing and failing examples, and helps diagnose incomplete runs. The workflow is structured around reference files that get loaded on demand, which keeps the context manageable. One thing to know: it requires snouty 0.6.0 or later and checks your API access upfront before doing real work. The preflight checks are thorough about surfacing auth issues since that's the most common stumbling block. If you're debugging why an assertion fired in a distributed test, this gives you the systematic approach to trace through the logs.
npx -y skills add antithesishq/antithesis-skills --skill antithesis-triage --agent claude-code
Installs into .claude/skills of the current project.
Use this skill to analyze Antithesis test runs.
Reference files: This skill's references/ directory contains detailed guides for specific tasks. Do NOT read them all up front — only read a reference file when you are told to. Each reference file is mentioned by name at the point where it is needed.
Do not proceed if any of the following is true:
snouty is not installed. See https://raw.githubusercontent.com/antithesishq/snouty/refs/heads/main/README.md for installation options.
snouty is not at least version 0.6.0. Use snouty --version to find the version. Use snouty update to update.
jq is not installed. See https://jqlang.org/download/ for installation options.
The triage skill talks to Antithesis through the snouty API (snouty runs ...). Before doing any work, confirm the setup is ready:
snouty doctor --json
This validates API connectivity and reports snouty's resolved configuration.
Proceed only when the top-level ok is true and the api_key check's status is ok. Otherwise relay the failing check's message/notes and stop; if api_key is the failing check, tell the user to set ANTITHESIS_API_KEY.
The settings array in the same output carries snouty's resolved parameters. For example, you can look up the resolved tenant with:
snouty doctor --json | jq -r '.settings[] | select(.name == "tenant") | .value'
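The prerequisite and doctor checks above can be sketched as a few shell functions. This is a minimal sketch, not snouty's documented interface: it assumes that `snouty --version` prints a dotted version number somewhere in its output, that `sort -V` (version sort, as in GNU coreutils) is available, and that the doctor JSON keeps its checks in a `checks` array of objects with `name` and `status` fields. Only the top-level `ok` field and the `settings` array with `name`/`value` entries come from the text above. All function names are hypothetical.

```shell
# Succeed when version $2 is at least the minimum version $1.
# ASSUMPTION: `sort -V` (version sort) is available.
version_at_least() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# Succeed when both tools exist and snouty is new enough.
# ASSUMPTION: `snouty --version` output contains a dotted x.y.z version number.
check_prerequisites() {
  command -v snouty > /dev/null || { echo "snouty is not installed" >&2; return 1; }
  command -v jq > /dev/null || { echo "jq is not installed" >&2; return 1; }
  found="$(snouty --version | grep -oE '[0-9]+\.[0-9]+\.[0-9]+' | head -n 1)"
  version_at_least 0.6.0 "$found" || { echo "snouty $found is older than 0.6.0" >&2; return 1; }
}

# Read `snouty doctor --json` output on stdin; succeed only when the top-level
# `ok` is true and the api_key check reports status "ok".
# ASSUMPTION: checks are objects with `name` and `status` inside a `checks` array.
doctor_gate() {
  jq -e '.ok == true and any(.checks[]?; .name == "api_key" and .status == "ok")' > /dev/null
}

# Read the same output on stdin and print the resolved tenant
# (the filter is the one given in the text above).
resolved_tenant() {
  jq -r '.settings[] | select(.name == "tenant") | .value'
}
```

A caller might run `check_prerequisites && snouty doctor --json | doctor_gate` and stop at the first failure, then relay the failing check to the user as described above.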
Before starting, collect the following from the user:
Tenant Name (required) — You must know the tenant name. Read snouty's resolved tenant from snouty doctor --json (see Preflight); snouty resolves it from the environment or a settings file, so trust that value. Only ask the user if doctor shows the tenant is unset.
What they want to know — Are they interested in all failures in a specific run? Are they investigating a specific failure? Are they getting a general overview? Comparing runs? This determines which workflow to follow.
Your main method to obtain information is to use the snouty runs <OPTION> command with the --json option. The --json option returns line-delimited JSON. The fields in the JSON depend on the option you are using. The same command without --json returns fewer fields in a tabular form better suited for human consumption.
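Because `--json` output is line-delimited, jq reads each line as a separate JSON value and applies the filter to every record in turn, so no surrounding array is needed. A minimal sketch, assuming each record carries `run_id` and `status` fields (names inferred from this document, not checked against snouty's actual output) and using hypothetical function names:

```shell
# Print the run_id of every line-delimited JSON record on stdin
# whose status is "incomplete".
# ASSUMPTION: records have `run_id` and `status` fields.
incomplete_run_ids() {
  jq -r 'select(.status == "incomplete") | .run_id'
}

# Count the records on stdin; -s (slurp) first collects the
# separate lines into a single array.
count_records() {
  jq -s 'length'
}
```

The output of a snouty runs <OPTION> command run with --json can be piped into either function.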
You will need to know the RUN_ID. Read references/run-discovery.md to learn how to obtain the run_id.
Read references/run-discovery.md to get a list of recent runs. Then summarize them in a report.
To look up a specific run (report), read references/run-info.md. Then continue with other workflows as needed.
If the run has a status of "incomplete", refer to the Diagnose incomplete run section below.
Read references/run-info.md to load information on a run.
If links.triage_report is null/absent in the run-info output, no triage report was generated for this run (typical for cancelled runs and some unknown/starting states). Report that the run is not triageable: the properties and logs endpoints will return 404 for these runs.
Read references/properties.md to load properties.
Read references/properties.md: use snouty runs --json properties to extract properties with their examples and learn how to download logs.
Read references/logs.md to learn how to understand logs
For each property to investigate:
a. Pick the first failing example
b. Find the moment. If the counterexample has no moment field (telemetry / meta properties — see references/properties.md), report the counterexample value as the evidence and skip steps c–e.
c. Download the example's log using snouty runs --json logs $RUN_ID $INPUT_HASH $VTIME. Make sure vtime does not get rounded. input_hash and vtime should match exactly what is contained in the example's moment structure.
d. Analyze the downloaded log locally
e. If you aren't certain what caused the issue, consider downloading logs from other counterexamples and examples for the same property. Compare each occurrence and try to see if there are any similarities or differences that might explain the failure cause. Logs from passing examples can be useful to compare against to find differences between success and failure cases.
When searching for additional logs for property failures, first use:
snouty runs --json events ${RUN_ID} ${PROPERTY_NAME}
This returns SOME but not necessarily ALL cases of the property passing or failing in the run. PROPERTY_NAME should match the "name" field in the property data you are investigating. Match the "hit" and "condition" fields against the examples or counterexamples you are trying to find more of. Note that the examples and counterexamples you already know about will likely be in the returned list. Check the moment of each property returned from snouty runs events against the moment in the examples or counterexamples you have already downloaded.
Important: Cross-reference the log against the source code of the system under test (SUT) and the workload if you have access to it.
Deeply investigate the failure to develop an understanding of the timeline of events which led up to and potentially caused it.
Report your findings.
Important: The property status and assertion text alone are not sufficient — the logs provide the actual runtime context needed to understand the failure.
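Steps b, c, and e of the investigation above can be sketched in shell. This is an illustration under assumptions, not the documented output format: it assumes a moment is a JSON object with `input_hash` and `vtime` fields, and that each record from `snouty runs --json events` carries boolean `hit` and `condition` fields plus a `moment` object. All function names are hypothetical. Some jq versions convert JSON numbers to double-precision floats, so the sketch also checks that the extracted vtime appears verbatim in the raw JSON before it is used.

```shell
# Print "<input_hash> <vtime>" from a moment object on stdin.
# ASSUMPTION: the moment has `input_hash` and `vtime` fields.
moment_args() {
  jq -r '"\(.input_hash) \(.vtime)"'
}

# Succeed when the vtime string $2 appears verbatim in the raw moment JSON $1,
# a cheap guard against a value that was rounded during extraction.
vtime_is_verbatim() {
  [ -n "$2" ] || return 1
  case "$1" in
    *"$2"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Download one example log; the command form comes from step c.
download_example_log() {  # $1 = run id, $2 = input_hash, $3 = vtime
  snouty runs --json logs "$1" "$2" "$3"
}

# Print "<input_hash> <vtime>" for events on stdin that match hit=$1 and
# condition=$2, skipping any pair already listed (one per line) in file $3.
# ASSUMPTION: events have boolean `hit`/`condition` and a `moment` object.
new_moments() {
  jq -r --argjson hit "$1" --argjson cond "$2" \
    'select(.hit == $hit and .condition == $cond)
     | "\(.moment.input_hash) \(.moment.vtime)"' |
    grep -vxF -f "$3" || true
}
```

Keeping the already-downloaded moments in a plain text file makes it easy to see which events from the list are genuinely new before fetching more logs.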
If the "status" of a specific run is "incomplete", there may be an error log to examine. The steps to triage an "incomplete" status run:
Run snouty runs --json show ${RUN_ID}.
Look for a failure_moment structure in the returned JSON. If present, use the input_hash and vtime from failure_moment to download a log in accordance with the instructions in references/logs.md.
Run snouty runs --json build-logs ${RUN_ID} and look for errors in the build.
Note on links.triage_report: For incomplete runs, the per-property properties and logs endpoints typically return 404, but links.triage_report and failure_moment may still be populated in show. Report what show actually contains; do not claim the triage_report link is absent unless that field is null. The triage workflow for incomplete runs is failure_moment + build-logs, regardless of whether a report URL exists.
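The incomplete-run steps above can be sketched as two helpers that read the output of `snouty runs --json show` on stdin. This is a sketch under assumptions: it assumes `failure_moment` sits at the top level of that JSON and holds `input_hash` and `vtime` fields, and the function names are hypothetical.

```shell
# Print "<input_hash> <vtime>" when the run JSON on stdin has a failure_moment;
# print nothing when the field is null or absent.
# ASSUMPTION: failure_moment is a top-level object with input_hash and vtime.
failure_moment_args() {
  jq -r '.failure_moment // empty | "\(.input_hash) \(.vtime)"'
}

# Succeed only when links.triage_report is present and not null, so the
# summary reports what `show` actually contains.
has_triage_report() {
  jq -e '.links.triage_report != null' > /dev/null
}
```

An empty result from `failure_moment_args` means there is no error log to download, and the build logs are the remaining evidence.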
When pointing the user to a triage report, use snouty runs show <RUN_ID> --web rather than printing the long links.triage_report URL. --web works only when a triage report exists. See references/run-info.md.
End your triage summary with a short "next steps" section that names follow-up skills or additional triage workflows when they would help. Treat these as suggestions for the user, not actions to take automatically; let the user (or their orchestration) decide when to invoke them. If the user has explicitly asked to chain skills (e.g. "run triage and then debug any failing property"), follow their instructions and proceed with the chain.
Skills worth suggesting based on what triage uncovers:
antithesis-research: map the codebase to surface better testable properties. Useful when failures point to systemic gaps in coverage rather than a discrete bug.
antithesis-workload: extend or refine assertions, test commands, and logs. Should be suggested whenever all properties are passing to incrementally improve the workload, or when properties are not reachable due to the workload being underpowered. This skill can also be used to add more logging or fix properties to assist with root cause analysis.
antithesis-debug: open the multiverse debugger on a specific failure to inspect container state at the failing moment or explore alternate histories. Useful when log analysis alone hasn't explained the failure. Requires agent-browser and may require interactive login in a browser.
antithesis-query-logs: search across all timelines for ordering and causality questions ("did event A always precede failure B?", "do failures still occur without the preceding fault?"). Useful when a failure looks correlated with another event or fault, or when you've only inspected one history out of many. Requires agent-browser and may require interactive login in a browser.
Include the data the next skill will need (run_id, property name, moment, etc.) so the user can invoke it without re-discovering context.
Before declaring this skill complete, review your work against the criteria below. This skill's output is conversational (summaries, tables, analysis), so the review should happen in your current context. Re-read the guidance in this file, then systematically check each item below against the answers and analysis you produced.
Review criteria: