This Claude Code skill helps you set up SimPO (Simple Preference Optimization), a reference-free preference optimization method for training language models. It's an alternative to DPO that drops the reference model entirely, which can save memory and compute. The skill appears to wrap setup instructions from the Hugging Face alignment-handbook, walking you through conda environment creation and PyTorch installation. You'd reach for it if you're doing RLHF-style alignment work and want to experiment with preference optimization approaches. With 277 installs and passing most security audits, it's seeing real use, though the skill description itself is bare-bones compared to the underlying technique.
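To make "reference-free" concrete, here is a minimal sketch of the per-pair SimPO objective as published: the implicit reward is the length-normalized log-probability of a response under the policy alone, and the loss asks the chosen response to beat the rejected one by a target margin. This is not code from the skill or from the alignment-handbook. The function names, the list-of-token-log-probs input format, and the default `beta` and `gamma` values are illustrative assumptions.

```python
import math


def _softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def simpo_loss(chosen_token_logps, rejected_token_logps, beta=2.0, gamma=1.0):
    """Per-pair SimPO loss (hypothetical helper, defaults are placeholders).

    Each argument is a list of per-token log-probabilities that the policy
    assigns to one response. No reference-model log-probs are needed.
    """
    # Length-normalized (average) log-probability acts as the implicit reward.
    avg_chosen = sum(chosen_token_logps) / len(chosen_token_logps)
    avg_rejected = sum(rejected_token_logps) / len(rejected_token_logps)

    # Scaled reward gap minus the target margin gamma.
    margin = beta * (avg_chosen - avg_rejected) - gamma

    # -log(sigmoid(margin)) is the same as softplus(-margin).
    return _softplus(-margin)


if __name__ == "__main__":
    good = simpo_loss([-0.1, -0.2], [-1.0, -1.2])  # policy prefers chosen
    bad = simpo_loss([-1.0, -1.2], [-0.1, -0.2])   # policy prefers rejected
    print(good < bad)
```

DPO computes a similar sigmoid loss, but on log-ratios against a frozen reference model. Here only policy log-probabilities appear, which is where the memory and compute savings come from.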
npx -y skills add davila7/claude-code-templates --skill simpo-training --agent claude-code
Installs into .claude/skills of the current project.
juliusbrussee/caveman
mattpocock/skills
shadcn/improve
obra/superpowers
forrestchang/andrej-karpathy-skills
vercel-labs/skills