This adds Claude's vision capabilities to your code environment so you can process images, PDFs, and screenshots. You get base64 encoding patterns, multi-image comparison, and document extraction out of the box. The skill includes optimization helpers that resize images to around 1568px to cut token usage by 30-50%, which matters when you're processing multiple files. Works well for OCR-like text extraction, chart analysis, and structured data pulls from receipts or forms. One thing to note: it won't identify specific people and can struggle with handwriting, but for technical diagrams, UI screenshots, and printed documents it's solid.
npx -y skills add lobbi-docs/claude --skill vision-multimodal --agent claude-codeInstalls into .claude/skills of the current project.
Select a file.
supercent-io/skills-template
supercent-io/skills-template
huangjia2019/claude-code-engineering
reactjs/react.dev
reactjs/react.dev