When your HyperPod cluster starts acting up and you need to collect diagnostics for AWS Support, this automates the tedious work of SSM-ing into nodes and gathering logs. It works with both EKS and Slurm orchestrators, auto-detects cluster type, and runs collection in parallel across nodes. You point it at your cluster and an S3 bucket, and it pulls system logs, configurations, and custom command output into a structured report. The workflow is straightforward: gather cluster details, verify your AWS permissions, run the bundled Python script via uv, then download or analyze results. Honest take: this is really just plumbing around AWS APIs and SSM sessions, but that plumbing saves hours when you're troubleshooting production issues at 2am.
npx -y skills add awslabs/agent-plugins --skill hyperpod-issue-report --agent claude-codeInstalls into .claude/skills of the current project.
Select a file.
cursor/plugins
github/awesome-copilot
alirezarezvani/claude-skills
microsoft/win-dev-skills