This is your toolkit for building evaluation datasets in LangSmith without writing a lot of SDK boilerplate. It wraps the langsmith CLI so you can export traces, shape them into datasets (final response, single step, trajectory, or RAG formats), and upload them for testing. The workflow is straightforward: pull traces from a project, transform the JSON into examples with inputs and outputs, then push the examples back up as a dataset. Handy if you're iterating on agent behavior and need regression tests or comparative evals. The CLI asks for confirmation on destructive ops, which is nice when you're moving fast and don't want to accidentally nuke a dataset.
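To make the transform step concrete, here is a minimal sketch of turning exported traces into final-response style examples. It assumes each exported trace is a JSON object with top-level `inputs` and `outputs` keys; the actual export shape may differ, and `traces_to_examples` and the sample data are hypothetical names made up for illustration, not part of the skill or the CLI.

```python
import json
from typing import Any


def traces_to_examples(traces: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Keep only the inputs/outputs of each trace (assumed field names)."""
    examples = []
    for trace in traces:
        inputs = trace.get("inputs")
        outputs = trace.get("outputs")
        # Skip traces with no inputs or no recorded outputs (e.g. failed runs).
        if not inputs or outputs is None:
            continue
        examples.append({"inputs": inputs, "outputs": outputs})
    return examples


# Hypothetical sample standing in for an exported trace file.
sample_traces = [
    {"name": "agent", "inputs": {"question": "What is 2 + 2?"}, "outputs": {"answer": "4"}},
    {"name": "agent", "inputs": {"question": "Unfinished run"}, "outputs": None},
]

examples = traces_to_examples(sample_traces)
print(json.dumps(examples, indent=2))
```

Dropping traces without outputs keeps half-finished runs out of a regression set; whether that is the right filter depends on what you are evaluating.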
npx -y skills add langchain-ai/langsmith-skills --skill langsmith-dataset --agent claude-code
Installs into .claude/skills of the current project.