This skill produces professional narrated videos using Remotion and Resemble.ai's text-to-speech. You get a full production pipeline from research to animation, with specific templates for educational tutorials, SaaS demos, product launches, and social promos. The docs walk you through adapting scripts for different audiences (grade schoolers vs. adults), syncing voiceover timing with animations, and handling common motion graphics techniques like kinetic typography and cursor animations. It's opinionated about workflow, which helps if you don't want to work out scene planning from scratch. Best for when you need to produce explainer or marketing videos without hiring a voice actor or spending days in After Effects.
npx -y skills add resemble-ai/remotion-resemble-skill --skill remotion-resemble-ai --agent claude-code
Installs into .claude/skills of the current project.
Use this skill when the user wants to create:
Educational Tutorial Videos - Research a topic and create an animated explainer
SaaS Walkthrough Demos - Showcase software features with animated UI
Product Launch Announcements - Marketing videos with motion graphics
Tutorial / Walkthrough Videos - Step-by-step guides with cursor animations
Social Media Promo Videos - Short, punchy kinetic text videos for ads
App/Product Showcase Videos - Kinetic text combined with floating UI mockups
Landing Page Reveal Videos - Design-tool aesthetic with webpage animations
Follow this pipeline for all video types:
1. UNDERSTAND → 2. RESEARCH → 3. SCRIPT → 4. SCENES → 5. AUDIO → 6. ANIMATE
Identify:
For educational content:
For SaaS demos:
Write a complete voiceover script BEFORE generating audio. Structure varies by video type (see below).
Break the script into visual segments:
Use Resemble.ai to generate the voiceover (see setup below).
Build the Remotion composition with animations synchronized to the audio.
Goal: Explain a concept clearly for a specific audience level.
HOOK (5-10s)
"Have you ever wondered how a caterpillar becomes a butterfly?"
INTRODUCTION (10-15s)
Set context and preview what they'll learn.
BODY (main content, broken into 3-5 key points)
- Point 1: Explain with visual metaphor
- Point 2: Build on previous point
- Point 3: Add detail or example
- ...
RECAP (10-15s)
Summarize the key takeaways.
OUTRO (5s)
Call to action or closing thought.
| Audience | Language Style | Visuals |
|---|---|---|
| Grade K-2 | Very simple, playful, 5-word sentences | Bright colors, cute characters, big shapes |
| Grade 3-5 | Simple but informative, analogies | Clear diagrams, step-by-step, moderate pace |
| Grade 6-8 | More vocabulary, cause-effect | Charts, labeled diagrams, faster pace |
| High School+ | Technical terms OK, nuance | Data viz, detailed graphics |
| Adults | Professional, concise | Clean design, infographics |
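One way to make the audience table usable from code is a plain lookup keyed by audience level. This is only a sketch: the `AudienceLevel` type, the `audienceForGrade` helper, and the decision to treat grades 9 and up as "High School+" are assumptions for illustration, not part of the skill.

```typescript
// Audience levels copied from the table; type and helper names are hypothetical.
type AudienceLevel = "Grade K-2" | "Grade 3-5" | "Grade 6-8" | "High School+" | "Adults";

interface AudienceStyle {
  language: string;
  visuals: string;
}

const AUDIENCE_STYLES: Record<AudienceLevel, AudienceStyle> = {
  "Grade K-2": {
    language: "Very simple, playful, 5-word sentences",
    visuals: "Bright colors, cute characters, big shapes",
  },
  "Grade 3-5": {
    language: "Simple but informative, analogies",
    visuals: "Clear diagrams, step-by-step, moderate pace",
  },
  "Grade 6-8": {
    language: "More vocabulary, cause-effect",
    visuals: "Charts, labeled diagrams, faster pace",
  },
  "High School+": {
    language: "Technical terms OK, nuance",
    visuals: "Data viz, detailed graphics",
  },
  Adults: {
    language: "Professional, concise",
    visuals: "Clean design, infographics",
  },
};

// Map a school grade (0 = kindergarten) to a row of the table.
// Assumption: grade 9 and above counts as "High School+".
function audienceForGrade(grade: number): AudienceLevel {
  if (grade <= 2) return "Grade K-2";
  if (grade <= 5) return "Grade 3-5";
  if (grade <= 8) return "Grade 6-8";
  return "High School+";
}
```

A script writer could then read `AUDIENCE_STYLES[audienceForGrade(4)]` to get the language and visual guidance for a fourth-grade audience.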
Script: "First, the butterfly lays tiny eggs on a leaf."
Scene 1:
- Visual: Leaf appears (slide in from bottom, spring animation)
- Visual: Small eggs fade in on leaf
- Timing: Sync "eggs" word with eggs appearing
- Duration: 4 seconds
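Syncing a visual to a spoken word comes down to converting a timestamp into a frame number. The sketch below assumes 30 fps (the source does not state a frame rate) and a hypothetical `WordTiming` shape; the timings are made up for illustration, and real word-level data would only exist if it was explicitly requested.

```typescript
// Assumption: 30 fps. The source does not specify a frame rate.
const FPS = 30;

/** Convert seconds to the nearest whole frame. */
function secondsToFrames(seconds: number, fps: number = FPS): number {
  return Math.round(seconds * fps);
}

// Hypothetical shape for word-level timing data.
interface WordTiming {
  word: string;
  startSeconds: number;
}

/** Frame at which a spoken word starts, or null if the word is not in the data. */
function cueFrameForWord(
  timings: WordTiming[],
  word: string,
  fps: number = FPS
): number | null {
  const target = word.toLowerCase();
  const hit = timings.find((t) => t.word.toLowerCase() === target);
  return hit ? secondsToFrames(hit.startSeconds, fps) : null;
}

// Made-up timings for "First, the butterfly lays tiny eggs on a leaf."
const exampleTimings: WordTiming[] = [
  { word: "First", startSeconds: 0.0 },
  { word: "butterfly", startSeconds: 0.7 },
  { word: "eggs", startSeconds: 2.0 },
  { word: "leaf", startSeconds: 3.1 },
];

const eggsFrame = cueFrameForWord(exampleTimings, "eggs"); // 60 at 30 fps
const sceneLength = secondsToFrames(4); // 120 frames for the 4-second scene
```

With these numbers the eggs would start fading in at frame 60 of a 120-frame scene.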
Goal: Show users how to accomplish a task in software.
HOOK (5-10s)
"Let me show you how to [accomplish goal] in [Product]."
CONTEXT (10s)
Why this feature matters, what problem it solves.
WALKTHROUGH (main content)
- Step 1: Navigate to X
- Step 2: Click on Y
- Step 3: Configure Z
- ...
RESULT (10s)
Show the outcome, the finished state.
OUTRO (5-10s)
Recap benefit, suggest next steps.
Script: "Click the blue 'New Database' button in the sidebar."
Scene:
- Visual: UI mockup of sidebar
- Animation: Cursor moves to button (0.8s ease-out)
- Animation: Button highlights with glow
- Animation: Click ripple effect
- Animation: New panel slides in from right
- Duration: 3 seconds
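The cursor move in this breakdown (0.8s ease-out) can be sketched as a pure function of the current frame. The cubic curve, the 30 fps default, and the names `easeOutCubic` and `cursorPosition` are assumptions for illustration; inside a Remotion component the same effect would more typically come from Remotion's own `interpolate` and `Easing` helpers.

```typescript
interface Point {
  x: number;
  y: number;
}

/** Cubic ease-out: fast at the start, slowing toward the end. Input is clamped to [0, 1]. */
function easeOutCubic(t: number): number {
  const clamped = Math.min(1, Math.max(0, t));
  return 1 - Math.pow(1 - clamped, 3);
}

/**
 * Cursor position at a given frame while moving from `from` to `to`.
 * Before the move starts the cursor sits at `from`; after it ends, at `to`.
 */
function cursorPosition(
  frame: number,
  startFrame: number,
  from: Point,
  to: Point,
  durationSeconds: number = 0.8,
  fps: number = 30 // assumed frame rate
): Point {
  const durationInFrames = Math.max(1, Math.round(durationSeconds * fps));
  const progress = easeOutCubic((frame - startFrame) / durationInFrames);
  return {
    x: from.x + (to.x - from.x) * progress,
    y: from.y + (to.y - from.y) * progress,
  };
}
```

Halfway through the move (frame 12 of 24) the eased progress is already 0.875, which is what gives the pointer its quick start and soft landing on the button.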
Goal: Generate excitement about a new product or feature.
HOOK (5-10s)
Bold statement or question that grabs attention.
PROBLEM (10-15s)
What pain point does this solve?
SOLUTION REVEAL (10-15s)
Introduce the product/feature dramatically.
KEY FEATURES (20-40s)
3-5 punchy feature highlights with visuals.
SOCIAL PROOF (optional, 10s)
Stats, testimonials, or credibility markers.
CALL TO ACTION (5-10s)
What should viewers do next?
Script: "Introducing Smart Search — find anything in milliseconds."
Scene:
- Visual: Dark background with gradient
- Animation: "Introducing" fades in (0.5s)
- Animation: "Smart Search" slams in large, bold (0.3s, with shake)
- Animation: Tagline types out below (typewriter, 1.5s)
- Animation: Search icon pulses
- Duration: 4 seconds
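The typewriter effect on the tagline reduces to choosing how many characters are visible on each frame. A minimal sketch, assuming 30 fps and a linear reveal; `typewriterText` is a hypothetical helper, and in a Remotion component the `frame` argument would come from `useCurrentFrame()`.

```typescript
/**
 * Return the portion of `text` visible at `frame` for a typewriter reveal
 * that starts at `startFrame` and lasts `durationSeconds`.
 */
function typewriterText(
  text: string,
  frame: number,
  startFrame: number,
  durationSeconds: number,
  fps: number = 30 // assumed frame rate
): string {
  const durationInFrames = Math.max(1, Math.round(durationSeconds * fps));
  // Clamp elapsed frames so nothing shows early and the full text holds afterwards.
  const elapsed = Math.min(durationInFrames, Math.max(0, frame - startFrame));
  // Integer arithmetic avoids floating-point drift in the character count.
  const visibleChars = Math.floor((elapsed * text.length) / durationInFrames);
  return text.slice(0, visibleChars);
}

const tagline = "find anything in milliseconds.";
const partial = typewriterText(tagline, 15, 0, 1.5); // "find anyth"
```

At 1.5 seconds and 30 fps the reveal spans 45 frames, so the 30-character tagline gains two characters every three frames.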
Goal: Teach users how to accomplish a task with clear step-by-step visual guidance.
INTRO (3-5s)
"Here's how to [accomplish goal] in [Product]."
STEP 1 (10-20s)
First action with clear visual demonstration.
STEP 2 (10-20s)
Second action, building on previous step.
STEP 3+ (10-20s each)
Continue through workflow...
SUCCESS STATE (5-10s)
Show completion, celebrate the outcome.
RECAP (optional, 5s)
Quick summary of what was accomplished.
Script: "First, enter what your lesson is about."
Scene 1 - Setup (2s):
- Visual: Full app interface, light background
- Visual: Breadcrumb shows "2. Seed Idea" active
- Animation: Screen fades in
Scene 2 - Interaction (4s):
- Visual: Form with "Enter a concept or theme:" label
- Animation: Cursor moves to input field (0.6s ease-out)
- Animation: Field border highlights blue
- Animation: "The Secret Lives of Mushrooms" types in (2s)
- Animation: Cursor moves to "Continue" button
Scene 3 - Transition (2s):
- Animation: Button highlights on hover
- Animation: Click ripple effect
- Animation: Screen slides left, new panel slides in
Scene 4 - Result (3s):
- Visual: Sketchpad view with AI suggestions panel
- Animation: Suggestion cards stagger in (0.15s delay each)
- Animation: "Lesson Flow" section populates
Scene 5 - Success (3s):
- Visual: Green full-screen background
- Animation: White circle scales in (spring)
- Animation: Checkmark draws inside circle
- Animation: "Your lesson is live!" fades in below
- Animation: Subtitle "Ready to inspire wonder and discovery" fades in
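The staggered suggestion cards in Scene 4 (0.15s delay each) can be planned by computing one start frame per item. This is a sketch under assumptions: `staggerStartFrames` is a hypothetical helper, and the delay is rounded to whole frames once so that every gap is identical.

```typescript
/**
 * Start frame for each of `count` items, spaced by a fixed delay.
 * The delay is converted to whole frames once, so all gaps are equal.
 */
function staggerStartFrames(
  count: number,
  sceneStartFrame: number,
  delaySeconds: number,
  fps: number
): number[] {
  const delayFrames = Math.max(1, Math.round(delaySeconds * fps));
  return Array.from({ length: count }, (_, i) => sceneStartFrame + i * delayFrames);
}

// Four cards, 0.15s apart, in a 60 fps composition (frame rate is an assumption).
const cardStarts = staggerStartFrames(4, 0, 0.15, 60); // [0, 9, 18, 27]
```

Each card can then run the same entrance animation, offset by its own start frame.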
| Step Complexity | Duration | Notes |
|---|---|---|
| Simple click | 2-3s | Click and immediate result |
| Form input | 4-6s | Type + submit |
| Multi-field form | 8-12s | Multiple inputs |
| Complex workflow | 15-20s | Several sub-steps |
| Success state | 3-5s | Let it breathe |
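The duration table lends itself to a quick length estimate before any audio is generated. The sketch below sums the minimum and maximum of each step's range; the `StepKind` keys and `estimateDuration` are hypothetical names, and the numbers are copied from the table.

```typescript
type StepKind =
  | "simple-click"
  | "form-input"
  | "multi-field-form"
  | "complex-workflow"
  | "success-state";

// [min, max] seconds per step, taken from the table.
const STEP_DURATIONS: Record<StepKind, [number, number]> = {
  "simple-click": [2, 3],
  "form-input": [4, 6],
  "multi-field-form": [8, 12],
  "complex-workflow": [15, 20],
  "success-state": [3, 5],
};

/** Sum the duration ranges for a list of steps. */
function estimateDuration(steps: StepKind[]): { minSeconds: number; maxSeconds: number } {
  let minSeconds = 0;
  let maxSeconds = 0;
  for (const step of steps) {
    const [lo, hi] = STEP_DURATIONS[step];
    minSeconds += lo;
    maxSeconds += hi;
  }
  return { minSeconds, maxSeconds };
}

// A form input, one click, then the success screen: 9 to 14 seconds.
const estimate = estimateDuration(["form-input", "simple-click", "success-state"]);
```

An estimate like this is only a planning aid; the final scene lengths still have to follow the generated voiceover.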
Goal: Grab attention fast with bold kinetic text for ads and social content.
HOOK (1-3s)
Single powerful word or phrase that stops the scroll.
VALUE PROP (3-5s)
What's in it for them? One punchy sentence.
FEATURE FLASH (5-10s)
2-3 key benefits, rapid fire.
CTA (2-3s)
Clear action: "Download now", "Try free", URL
Script: "We are ready."
Scene sequence (4 seconds total):
- Frame 1: Black bg, "WE" slams in white (0.8s)
- Frame 2: White bg, "A" appears black, centered (0.8s)
- Frame 3: Red bg, "R" types out, becomes "READY" (1.6s)
- Frame 4: Hold final frame (0.8s)
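The four beats above add up to the stated 4 seconds (0.8 + 0.8 + 1.6 + 0.8). A small sketch, assuming 30 fps, that lays the beats end to end as frame ranges; `Beat`, `FrameRange`, and `layoutBeats` are hypothetical names, and the `from` / `durationInFrames` fields are chosen to mirror the props of Remotion's `<Sequence>` component.

```typescript
interface Beat {
  label: string;
  seconds: number;
}

interface FrameRange {
  label: string;
  from: number;
  durationInFrames: number;
}

/** Place beats back to back, converting each length to whole frames. */
function layoutBeats(beats: Beat[], fps: number = 30): FrameRange[] {
  const ranges: FrameRange[] = [];
  let cursor = 0;
  for (const beat of beats) {
    const durationInFrames = Math.round(beat.seconds * fps);
    ranges.push({ label: beat.label, from: cursor, durationInFrames });
    cursor += durationInFrames;
  }
  return ranges;
}

const weAreReady: Beat[] = [
  { label: "WE", seconds: 0.8 },
  { label: "A", seconds: 0.8 },
  { label: "READY", seconds: 1.6 },
  { label: "hold", seconds: 0.8 },
];

const ranges = layoutBeats(weAreReady);
// Starts at frames 0, 24, 48, 96; total length 120 frames (4 seconds at 30 fps).
const totalFrames = ranges.reduce((sum, r) => sum + r.durationInFrames, 0);
```

Rounding each beat to frames before summing keeps the ranges contiguous, with no gaps or overlaps between scenes.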
| Platform | Ideal Length | Max Length |
|---|---|---|
| Instagram Reels | 15-30s | 90s |
| TikTok | 15-60s | 3min |
| Twitter/X | 15-45s | 2min 20s |
| | 30s-1min | 10min |
| YouTube Shorts | 15-60s | 60s |
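A planned runtime can be checked against these guidelines before rendering. The sketch below covers only the four platforms that are named in the table, with values copied from it and converted to seconds; `checkLength` and the verdict labels are hypothetical.

```typescript
interface LengthGuideline {
  idealMin: number; // seconds
  idealMax: number; // seconds
  max: number; // seconds
}

// Values copied from the table, converted to seconds.
const PLATFORM_LIMITS = {
  "Instagram Reels": { idealMin: 15, idealMax: 30, max: 90 },
  TikTok: { idealMin: 15, idealMax: 60, max: 180 },
  "Twitter/X": { idealMin: 15, idealMax: 45, max: 140 },
  "YouTube Shorts": { idealMin: 15, idealMax: 60, max: 60 },
};

type Platform = keyof typeof PLATFORM_LIMITS;
type LengthVerdict = "ideal" | "acceptable" | "too-long";

/** Classify a video length against one platform's guideline. */
function checkLength(platform: Platform, seconds: number): LengthVerdict {
  const guideline: LengthGuideline = PLATFORM_LIMITS[platform];
  if (seconds > guideline.max) return "too-long";
  if (seconds >= guideline.idealMin && seconds <= guideline.idealMax) return "ideal";
  return "acceptable";
}
```

For example, `checkLength("Instagram Reels", 45)` returns `"acceptable"`: under the 90-second cap but outside the 15-30 second ideal range.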
Goal: Demonstrate a product's interface with cinematic flair, combining text and UI.
HOOK (3-5s)
Problem statement or intriguing question.
INTRODUCE PRODUCT (5-10s)
Show the product name/logo, establish brand.
FEATURE TOUR (20-40s)
Walk through 3-5 key features with UI demonstrations.
BENEFIT SUMMARY (5-10s)
Recap what makes it special.
CTA (5s)
URL, app store badge, or next step.
Script: "Earn the best yields on your crypto."
Scene 1 (3s):
- Visual: Soft purple gradient background
- Animation: Text "Earn the best" fades in, "yields" appears in accent color
- Timing: Words sync with voiceover
Scene 2 (4s):
- Visual: Browser URL bar appears (rounded rectangle)
- Animation: "jumper.exchange/" types out character by character
- Animation: Cursor blinks at end
Scene 3 (5s):
- Visual: App UI mockup slides up from bottom
- Animation: Token list (USDC, ETH, LINK) reveals with stagger
- Animation: Cursor moves to highlight token row
Goal: Showcase a website design with dramatic designer-tool aesthetics.
INTRO (3-5s)
Establish the problem or context.
DESIGN REVEAL (10-20s)
Dramatic entrance of the landing page.
FEATURE CALLOUTS (15-30s)
Highlight specific sections of the page.
CTA (5-10s)
"Book a demo", "Get started", with URL.
Script: "Achieve team harmony with TeamFusion."
Scene 1 (2s):
- Visual: Dark background, grid lines appear (dashed, gray)
- Animation: Blue rectangle placeholder at center
- Animation: Cursor enters frame
Scene 2 (3s):
- Visual: Light beam sweeps diagonally
- Animation: Blue rectangle transforms/morphs into webpage mockup
- Animation: Glow effect around the design
Scene 3 (5s):
- Visual: Full landing page visible with bento-style feature cards
- Animation: Cards lift slightly with shadow on hover simulation
- Animation: Text callout points to CTA button
Goal: Explain complex AI or tech products with text and interface demonstrations.
MEET THE PRODUCT (5-10s)
"Meet [Product] — the [category] that [key benefit]."
PROBLEM (10-15s)
What frustration does this solve?
HOW IT WORKS (20-40s)
Show the interface, demonstrate the magic.
INTEGRATIONS (10-15s)
What does it connect with?
CTA (5-10s)
Try it, sign up, learn more.
Script: "Meet the AI assistant that lets you talk to all your apps."
Scene 1 (3s):
- Visual: Dark purple gradient background
- Animation: "Meet" fades in gray, holds
Scene 2 (5s):
- Visual: App mockup slides in from right
- Visual: Voice recording UI with waveform animation
- Animation: Red recording dot pulses
- Animation: Text "that lets you talk to all your apps" appears left
Scene 3 (4s):
- Visual: Dropdown opens showing model options (GPT-4, Claude, etc.)
- Animation: List items stagger in (0.1s delay)
- Animation: Cursor hovers, selection highlights
Choose a visual style based on the product type and tone:
Use these terms when describing animations:
| Term | Effect | Best For |
|---|---|---|
| Fade | Opacity 0→1 or 1→0 | Subtle entrances, transitions |
| Slide | Move from off-screen | UI elements, list items |
| Scale | Grow or shrink | Emphasis, entrances |
| Spring | Bouncy with overshoot | Playful, energetic feel |
| Ease-out | Starts fast, slows at end | Natural movement |
| Ease-in-out | Slow start and end | Smooth, polished |
| Typewriter | Characters appear one by one | Text reveals, URLs, code |
| Stagger | Sequential delay between items | Lists, multiple elements |
| Morph | Shape transforms into another | Transformations, transitions |
| Pulse | Scale up/down rhythmically | Drawing attention |
| Shake | Quick horizontal vibration | Impact, emphasis |
| Wipe | Reveal with moving edge | Scene transitions |
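Two of these terms, Pulse and Shake, map directly onto small functions of the current frame. The sketch below is one possible reading, not the skill's implementation: the sine-based pulse, the linearly decaying shake, and every default value (30 fps, 5% amplitude, 9 frames, 12px, 3 oscillations) are assumptions.

```typescript
/** Pulse: a scale factor that oscillates smoothly around 1. */
function pulseScale(
  frame: number,
  fps: number = 30,
  cyclesPerSecond: number = 1,
  amplitude: number = 0.05
): number {
  const t = frame / fps;
  return 1 + amplitude * Math.sin(2 * Math.PI * cyclesPerSecond * t);
}

/**
 * Shake: a horizontal offset in pixels that oscillates quickly and decays
 * linearly to zero. Returns 0 outside the shake window.
 */
function shakeOffsetX(
  frame: number,
  startFrame: number,
  durationInFrames: number = 9, // 0.3s at an assumed 30 fps
  maxOffsetPx: number = 12,
  oscillations: number = 3
): number {
  const elapsed = frame - startFrame;
  if (elapsed < 0 || elapsed >= durationInFrames) return 0;
  const progress = elapsed / durationInFrames;
  const decay = 1 - progress;
  return maxOffsetPx * decay * Math.sin(2 * Math.PI * oscillations * progress);
}
```

The returned values would be applied as CSS transforms, for instance `scale(${pulseScale(frame)})` on an icon or `translateX(${shakeOffsetX(frame, 0)}px)` on a headline.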
| Term | Effect | Best For |
|---|---|---|
| Word slam | Word appears with scale overshoot + shake | Bold statements, social videos |
| Background flash | Instant background color change | Scene transitions, energy |
| Light sweep | Diagonal light beam across frame | Dramatic reveals |
| Glow/bloom | Soft light halo around element | Focus, premium feel |
| Perspective tilt | 3D rotation (5-10°) on UI | App mockups, depth |
| Cursor trail | Smooth bezier path for pointer | UI demonstrations |
| Click ripple | Circular pulse on click | Button interactions |
| Countdown | Numbers ticking down | Urgency, timers |
| Grid reveal | Guide lines appear before content | Design tool aesthetic |
| Logo assemble | Logo pieces come together | Brand reveals |
For each scene, document:
SCENE [number]: [title]
Script: "[exact voiceover text]"
Duration: [X seconds]
Visuals:
- [element]: [description]
- [element]: [description]
Animations:
- [timing]: [element] [animation type] ([duration], [easing])
- [timing]: [element] [animation type] ([duration], [easing])
Transition to next: [type]
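If scene plans are kept as data rather than free text, the template above can be generated from a typed object. The `SceneSpec` shape and `formatScene` function are hypothetical; the field names simply follow the placeholders in the template.

```typescript
interface AnimationCue {
  timing: string;
  element: string;
  animationType: string;
  duration: string;
  easing: string;
}

interface SceneSpec {
  number: number;
  title: string;
  script: string;
  durationSeconds: number;
  visuals: { element: string; description: string }[];
  animations: AnimationCue[];
  transitionToNext: string;
}

/** Render a scene spec in the same layout as the template. */
function formatScene(scene: SceneSpec): string {
  const lines = [
    `SCENE ${scene.number}: ${scene.title}`,
    `Script: "${scene.script}"`,
    `Duration: ${scene.durationSeconds} seconds`,
    "Visuals:",
    ...scene.visuals.map((v) => `- ${v.element}: ${v.description}`),
    "Animations:",
    ...scene.animations.map(
      (a) => `- ${a.timing}: ${a.element} ${a.animationType} (${a.duration}, ${a.easing})`
    ),
    `Transition to next: ${scene.transitionToNext}`,
  ];
  return lines.join("\n");
}

// Example values are illustrative, loosely based on the butterfly scene.
const eggScene: SceneSpec = {
  number: 1,
  title: "Eggs on a leaf",
  script: "First, the butterfly lays tiny eggs on a leaf.",
  durationSeconds: 4,
  visuals: [{ element: "Leaf", description: "slides in from bottom" }],
  animations: [
    { timing: "2.0s", element: "Eggs", animationType: "fade in", duration: "0.5s", easing: "ease-out" },
  ],
  transitionToNext: "fade",
};
```

Keeping the spec typed makes it harder to forget a field such as the transition, and the same object can later drive the animation timing.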
Before generating voiceovers, verify the user has credentials configured.
Check if .env file exists with:
- RESEMBLE_API_KEY - The Resemble.ai API token
- RESEMBLE_VOICE_UUID - Voice ID (optional, defaults to 7213a9ea)
If RESEMBLE_API_KEY is missing:
Ask: "I need your Resemble.ai API key to generate voiceovers. You can get one from https://app.resemble.ai/account/api. Please paste your API key:"
If RESEMBLE_VOICE_UUID is missing:
Use the default voice UUID 7213a9ea. Only ask if they want a different voice.
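The credential check can be sketched as a function over the contents of the .env file. This is a simplified illustration, not the skill's code: `parseEnv` and `resolveCredentials` are hypothetical helpers, the parser handles only plain `KEY=VALUE` lines, and reading the file from disk is left to the caller.

```typescript
// Default voice taken from the text; everything else here is an assumption.
const DEFAULT_VOICE_UUID = "7213a9ea";

interface ResembleCredentials {
  apiKey: string | null; // null means the user still has to be asked for it
  voiceUuid: string;
}

/** Parse KEY=VALUE lines, skipping blank lines and # comments. */
function parseEnv(contents: string): Record<string, string> {
  const result: Record<string, string> = {};
  for (const rawLine of contents.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "" || line.startsWith("#")) continue;
    const eq = line.indexOf("=");
    if (eq <= 0) continue; // no key before "=", or no "=" at all
    const key = line.slice(0, eq).trim();
    // Strip one pair of surrounding quotes, if present.
    const value = line.slice(eq + 1).trim().replace(/^["']|["']$/g, "");
    result[key] = value;
  }
  return result;
}

/** Apply the rules above: the API key is required, the voice UUID falls back to the default. */
function resolveCredentials(envContents: string): ResembleCredentials {
  const env = parseEnv(envContents);
  return {
    apiKey: env["RESEMBLE_API_KEY"] || null,
    voiceUuid: env["RESEMBLE_VOICE_UUID"] || DEFAULT_VOICE_UUID,
  };
}
```

When `resolveCredentials` returns `apiKey: null`, that is the cue to ask the user for a key; a missing voice UUID needs no prompt because the default is used.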
- Create .env with the provided values
- Make sure .env is in .gitignore
- Credentials are read from .env, never hardcoded
Captions are DISABLED by default. Only generate word-level timestamps and caption components when explicitly requested.
- Convert audio_timestamps into word-level data
Read these for detailed implementation: