Polls WandB metrics during training runs to catch NaN values, loss divergence, and gradient explosions before you waste GPU hours. Starts checking every 10 minutes, then backs off to hourly if everything looks healthy. The clever bit is it only calls out to GPT-5.5 via Codex when metrics are ambiguous, like a flattening loss curve that might be normal or might be trouble. Otherwise it just applies hard rules and stays out of your way. Works alongside the watchdog for process-level checks. Set it up with CronCreate after your first few training steps stabilize and let it run unattended. Better than discovering at 3am that your 16-hour run diverged after step 100.
npx -y skills add wanshuiyin/auto-claude-code-research-in-sleep --skill training-check --agent claude-codeInstalls into .claude/skills of the current project.
Select a file.
sickn33/antigravity-awesome-skills