Pulls YouTube transcripts using yt-dlp with a sensible fallback chain: tries manual subtitles first, then auto-generated, and only hits Whisper transcription as a last resort after asking permission. The workflow is thorough, it checks dependencies, lists available subtitle tracks before downloading, and includes Python one-liners to deduplicate the VTT output since YouTube's auto-captions repeat lines across overlapping timestamps. Useful when you need text from videos for analysis or documentation. The Whisper path is properly gated behind confirmations since it downloads audio and installs models, which is the right call. Handles the boring stuff like filename sanitization and cleanup automatically.
npx -y skills add michalparkola/tapestry-skills-for-claude-code --skill youtube-transcript --agent claude-codeInstalls into .claude/skills of the current project.
This skill helps download transcripts (subtitles/captions) from YouTube videos using yt-dlp.
Activate this skill when the user:
--write-sub) - highest quality--write-auto-sub) - usually availableIMPORTANT: Always check if yt-dlp is installed first:
which yt-dlp || command -v yt-dlp
Attempt automatic installation based on the system:
macOS (Homebrew):
brew install yt-dlp
Linux (apt/Debian/Ubuntu):
sudo apt update && sudo apt install -y yt-dlp
Alternative (pip - works on all systems):
pip3 install yt-dlp
# or
python3 -m pip install yt-dlp
If installation fails: Inform the user they need to install yt-dlp manually and provide them with installation instructions from https://github.com/yt-dlp/yt-dlp#installation
ALWAYS do this first before attempting to download:
yt-dlp --list-subs "YOUTUBE_URL"
This shows what subtitle types are available without downloading anything. Look for:
Try this first - highest quality, human-created:
yt-dlp --write-sub --skip-download --output "OUTPUT_NAME" "YOUTUBE_URL"
If manual subtitles aren't available:
yt-dlp --write-auto-sub --skip-download --output "OUTPUT_NAME" "YOUTUBE_URL"
Both commands create a .vtt file (WebVTT subtitle format).
ONLY use this if both manual and auto-generated subtitles are unavailable.
# Get audio file size estimate
yt-dlp --print "%(filesize,filesize_approx)s" -f "bestaudio" "YOUTUBE_URL"
# Or get duration to estimate
yt-dlp --print "%(duration)s %(title)s" "YOUTUBE_URL"
IMPORTANT: Display the file size to the user and ask: "No subtitles are available. I can download the audio (approximately X MB) and transcribe it using Whisper. Would you like to proceed?"
Wait for user confirmation before continuing.
command -v whisper
If not installed, ask user: "Whisper is not installed. Install it with pip install openai-whisper (requires ~1-3GB for models)? This is a one-time installation."
Wait for user confirmation before installing.
Install if approved:
pip3 install openai-whisper
yt-dlp -x --audio-format mp3 --output "audio_%(id)s.%(ext)s" "YOUTUBE_URL"
# Auto-detect language (recommended)
whisper audio_VIDEO_ID.mp3 --model base --output_format vtt
# Or specify language if known
whisper audio_VIDEO_ID.mp3 --model base --language en --output_format vtt
Model Options (stick to base for now):
tiny - fastest, least accurate (~1GB)base - good balance (~1GB) ← USE THISsmall - better accuracy (~2GB)medium - very good (~5GB)large - best accuracy (~10GB)After transcription completes, ask user: "Transcription complete! Would you like me to delete the audio file to save space?"
If yes:
rm audio_VIDEO_ID.mp3
yt-dlp --print "%(title)s" "YOUTUBE_URL"
Use this to create meaningful filenames based on the video title. Clean the title for filesystem compatibility:
/ with -$(yt-dlp --print "%(title)s" "URL" | tr '/' '-' | tr ':' '-')YouTube's auto-generated VTT files contain duplicate lines because captions are shown progressively with overlapping timestamps. Always deduplicate when converting to plain text while preserving the original speaking order.
python3 -c "
import sys, re
seen = set()
with open('transcript.en.vtt', 'r') as f:
for line in f:
line = line.strip()
if line and not line.startswith('WEBVTT') and not line.startswith('Kind:') and not line.startswith('Language:') and '-->' not in line:
clean = re.sub('<[^>]*>', '', line)
clean = clean.replace('&', '&').replace('>', '>').replace('<', '<')
if clean and clean not in seen:
print(clean)
seen.add(clean)
" > transcript.txt
# Get video title
VIDEO_TITLE=$(yt-dlp --print "%(title)s" "YOUTUBE_URL" | tr '/' '_' | tr ':' '-' | tr '?' '' | tr '"' '')
# Find the VTT file
VTT_FILE=$(ls *.vtt | head -n 1)
# Convert with deduplication
python3 -c "
import sys, re
seen = set()
with open('$VTT_FILE', 'r') as f:
for line in f:
line = line.strip()
if line and not line.startswith('WEBVTT') and not line.startswith('Kind:') and not line.startswith('Language:') and '-->' not in line:
clean = re.sub('<[^>]*>', '', line)
clean = clean.replace('&', '&').replace('>', '>').replace('<', '<')
if clean and clean not in seen:
print(clean)
seen.add(clean)
" > "${VIDEO_TITLE}.txt"
echo "✓ Saved to: ${VIDEO_TITLE}.txt"
# Clean up VTT file
rm "$VTT_FILE"
echo "✓ Cleaned up temporary VTT file"
.vtt): Includes timestamps and formatting, good for video players.txt): Just the text content, good for reading or analysis{output_name}.{language_code}.vtt (e.g., transcript.en.vtt)--write-sub instead for manual subtitlesVIDEO_URL="https://www.youtube.com/watch?v=dQw4w9WgXcQ"
# Get video title for filename
VIDEO_TITLE=$(yt-dlp --print "%(title)s" "$VIDEO_URL" | tr '/' '_' | tr ':' '-' | tr '?' '' | tr '"' '')
OUTPUT_NAME="transcript_temp"
# ============================================
# STEP 1: Check if yt-dlp is installed
# ============================================
if ! command -v yt-dlp &> /dev/null; then
echo "yt-dlp not found, attempting to install..."
if command -v brew &> /dev/null; then
brew install yt-dlp
elif command -v apt &> /dev/null; then
sudo apt update && sudo apt install -y yt-dlp
else
pip3 install yt-dlp
fi
fi
# ============================================
# STEP 2: List available subtitles
# ============================================
echo "Checking available subtitles..."
yt-dlp --list-subs "$VIDEO_URL"
# ============================================
# STEP 3: Try manual subtitles first
# ============================================
echo "Attempting to download manual subtitles..."
if yt-dlp --write-sub --skip-download --output "$OUTPUT_NAME" "$VIDEO_URL" 2>/dev/null; then
echo "✓ Manual subtitles downloaded successfully!"
ls -lh ${OUTPUT_NAME}.*
else
# ============================================
# STEP 4: Fallback to auto-generated
# ============================================
echo "Manual subtitles not available. Trying auto-generated..."
if yt-dlp --write-auto-sub --skip-download --output "$OUTPUT_NAME" "$VIDEO_URL" 2>/dev/null; then
echo "✓ Auto-generated subtitles downloaded successfully!"
ls -lh ${OUTPUT_NAME}.*
else
# ============================================
# STEP 5: Last resort - Whisper transcription
# ============================================
echo "⚠ No subtitles available for this video."
# Get file size
FILE_SIZE=$(yt-dlp --print "%(filesize_approx)s" -f "bestaudio" "$VIDEO_URL")
DURATION=$(yt-dlp --print "%(duration)s" "$VIDEO_URL")
TITLE=$(yt-dlp --print "%(title)s" "$VIDEO_URL")
echo "Video: $TITLE"
echo "Duration: $((DURATION / 60)) minutes"
echo "Audio size: ~$((FILE_SIZE / 1024 / 1024)) MB"
echo ""
echo "Would you like to download and transcribe with Whisper? (y/n)"
read -r RESPONSE
if [[ "$RESPONSE" =~ ^[Yy]$ ]]; then
# Check for Whisper
if ! command -v whisper &> /dev/null; then
echo "Whisper not installed. Install now? (requires ~1-3GB) (y/n)"
read -r INSTALL_RESPONSE
if [[ "$INSTALL_RESPONSE" =~ ^[Yy]$ ]]; then
pip3 install openai-whisper
else
echo "Cannot proceed without Whisper. Exiting."
exit 1
fi
fi
# Download audio
echo "Downloading audio..."
yt-dlp -x --audio-format mp3 --output "audio_%(id)s.%(ext)s" "$VIDEO_URL"
# Get the actual audio filename
AUDIO_FILE=$(ls audio_*.mp3 | head -n 1)
# Transcribe
echo "Transcribing with Whisper (this may take a few minutes)..."
whisper "$AUDIO_FILE" --model base --output_format vtt
# Cleanup
echo "Transcription complete! Delete audio file? (y/n)"
read -r CLEANUP_RESPONSE
if [[ "$CLEANUP_RESPONSE" =~ ^[Yy]$ ]]; then
rm "$AUDIO_FILE"
echo "Audio file deleted."
fi
ls -lh *.vtt
else
echo "Transcription cancelled."
exit 0
fi
fi
fi
# ============================================
# STEP 6: Convert to readable plain text with deduplication
# ============================================
VTT_FILE=$(ls ${OUTPUT_NAME}*.vtt 2>/dev/null || ls *.vtt | head -n 1)
if [ -f "$VTT_FILE" ]; then
echo "Converting to readable format and removing duplicates..."
python3 -c "
import sys, re
seen = set()
with open('$VTT_FILE', 'r') as f:
for line in f:
line = line.strip()
if line and not line.startswith('WEBVTT') and not line.startswith('Kind:') and not line.startswith('Language:') and '-->' not in line:
clean = re.sub('<[^>]*>', '', line)
clean = clean.replace('&', '&').replace('>', '>').replace('<', '<')
if clean and clean not in seen:
print(clean)
seen.add(clean)
" > "${VIDEO_TITLE}.txt"
echo "✓ Saved to: ${VIDEO_TITLE}.txt"
# Clean up temporary VTT file
rm "$VTT_FILE"
echo "✓ Cleaned up temporary VTT file"
else
echo "⚠ No VTT file found to convert"
fi
echo "✓ Complete!"
Note: This complete workflow handles all scenarios with proper error checking and user prompts at each decision point.
1. yt-dlp not installed
2. No subtitles available
--write-sub and --write-auto-sub3. Invalid or private video
https://www.youtube.com/watch?v=VIDEO_ID4. Whisper installation fails
pip3 install openai-whisper"5. Download interrupted or failed
--no-check-certificate if SSL issues occur6. Multiple subtitle languages
--sub-langs en for English only--list-subs first--list-subs)juliusbrussee/caveman
mattpocock/skills
shadcn/improve
obra/superpowers
forrestchang/andrej-karpathy-skills
vercel-labs/skills