This is a comprehensive profiling assistant that goes beyond running nvidia-smi or cProfile. It intelligently chooses profiling strategies based on your target, whether that's a Python script, a running process, or an entire framework like vLLM. What makes it useful is that it doesn't just run external tools. It will actually write and inject instrumentation code into your target to measure things like CPU-GPU transfer patterns, NCCL collective latency, or memory redundancy across devices. It tracks all its modifications in a changelog table so you can see exactly what it changed and revert everything when you're done. The output is structured around four dimensions: CPU overhead, memory overhead, interconnect traffic, and GPU compute, with actionable recommendations ranked by impact.
npx -y skills add wanshuiyin/auto-claude-code-research-in-sleep --skill system-profile --agent claude-codeInstalls into .claude/skills of the current project.
Select a file.
sickn33/antigravity-awesome-skills