This is a full-featured video production pipeline wrapped as eight MCP tools. It uses Playwright for 60fps frame-by-frame website recording with cinema easing curves, ffmpeg for encoding and editing (speed changes, keyframing, picture-in-picture), and supports color grading with 22 LUT presets, auto-captioning via Whisper, and TTS narration through ElevenLabs or OpenAI. The smart screenshot tool detects UI elements like chat widgets and pricing sections. You can record a product demo, add synchronized voiceover, apply cinematic color, and export to social formats (Instagram Reel, TikTok, YouTube Short) in one workflow. Built by StudioMeyer in Mallorca, runs on stdio or HTTP, requires ffmpeg and Playwright browsers.
Public tool metadata for what this MCP can expose to an agent.
- `get_job_result` (3 params: `job_id` string, `api_key` string, `db_job_id` string). Check job status and result. Poll every 60 seconds; do NOT poll more frequently. Video processing typically takes 3-5 minutes. Progress may stay at 20% during frame analysis for 1-3 minutes; this is completely normal. Do NOT interpret slow progress as failure. Only report fa...
- `get_upload_url` (2 params: `api_key` string, `filename` string). GET A SIGNED UPLOAD URL for uploading a local video to NarrateAI cloud storage. Use this ONLY when running in HTTP/remote mode and the user has a local video file. After getting the URL, upload the file with curl, then pass the returned temp_file_path to any processing tool as...
- `generate_narration_script` (4 params: `api_key` string, `language` string, `video_source` string, `manual_context` string). NARRATION SCRIPT – generates an AI-written timed script for a SILENT video. No audio output. Use when the user wants a timed narration script, text-only narration, or sync data for a silent video. This does NOT extract existing speech (use transcribe_video for that). This does...
- `narrate_video_full` (6 params: `api_key` string, `language` string, `voice_type` string, `video_source` string, `voice_sample` string, `manual_context` string). FULL NARRATED VIDEO – produces a downloadable video with AI voiceover. Use when the user wants: "narrate this video", "add voiceover", "make a narrated video". VOICE OPTIONS (ask the user which they prefer): 1. AI voice: male1 (default, fastest), female1 (default, fastest), fe...
- `abandon_job` (2 params: `job_id` string, `api_key` string). Abandon/cancel a processing job. Call this when the user cancels on the agent side. Stops the backend from continuing audio generation and video assembly. Use after narrate_video_transcript or when continue_to_full_video was started but user cancelled. Returns: JSON with succe...
- `transcribe_video` (3 params: `api_key` string, `video_source` string, `source_language` string). TRANSCRIPTION ONLY – video with existing voice -> speech-to-text -> timed transcript. No translation, no narrated video. Returns original speech as-is. Use when the user wants to transcribe a video that already has spoken audio (podcast, interview, meeting recording, etc.). CR...
- `transcribe_and_translate` (4 params: `api_key` string, `video_source` string, `source_language` string, `target_language` string). TRANSCRIBE & TRANSLATE (new upload) – video with voice -> speech-to-text -> translate -> translated transcript. No TTS, no video output. Returns translated timed transcript only. Use when the user uploads a new video and wants a translated transcript (e.g. Spanish podcast -> E...
- `translate_existing_video` (4 params: `job_id` string, `api_key` string, `source_language` string, `target_language` string). TRANSLATION (existing video) – Translate transcript of a video already in the user's library. Loads transcript from cloud, translates, returns. No upload. Sync – returns immediately. Use when the user wants to translate a video they already narrated/dubbed with NarrateAI (e.g....
- `dub_video_full` (5 params: `api_key` string, `video_source` string, `source_language` string, `target_language` string, `preserve_background_music` boolean). FULL AUTO-DUBBING – transcribe -> translate -> extract speaker voice -> TTS with cloned voice -> dubbed video. No refinement screen. Uses the video's own speaker voice for the dubbed audio. Use when the user wants a complete dubbed video (e.g. Spanish video -> English dubbed)....
- `generate_document` (5 params: `api_key` string, `language` string, `video_source` string, `document_type` string, `manual_context` string). DOCUMENT GENERATION – produces a structured markdown document from a silent video. Use when the user wants: a document, article, guide, tutorial, or written content based on a video. NOT for narrated video or voiceover. The agent MUST ask which document type the user wants bef...
- `generate_tts` (5 params: `text` string, `api_key` string, `language` string, `voice_type` string, `voice_sample` string). TEXT-TO-SPEECH – generate audio from text. Returns a downloadable audio URL. Use when the user wants: "read this aloud", "generate speech", "text to speech", "convert text to audio", "make an audio file from this text". VOICE OPTIONS (ask the user which they prefer): 1. AI voi...
- `narrate_batch` (6 params: `api_key` string, `language` string, `voice_type` string, `contexts_json` string, `manual_context` string, `video_sources_json` string). BATCH NARRATION – narrate multiple videos in parallel. Each gets a full narrated video with voiceover. Use when the user has multiple videos to narrate (e.g. "narrate these 3 videos"). Maximum 5 videos per batch. Each video is processed independently – one failure does not aff...
- `batch_generate_scripts` (5 params: `api_key` string, `language` string, `contexts_json` string, `manual_context` string, `video_sources_json` string). BATCH SCRIPT GENERATION – generate AI narration scripts for multiple silent videos in parallel. Each video gets a timed narration script (text only, no audio). Maximum 5 videos per batch. One failure does not affect others. CRITICAL – Context handling: Before calling, ask the...
- `batch_transcribe` (3 params: `api_key` string, `source_language` string, `video_sources_json` string). BATCH TRANSCRIPTION – transcribe speech from multiple videos in parallel. Each video must have existing spoken audio. Returns timed transcript per video. CRITICAL: source_language is REQUIRED – ask user if not specified. Applies to all videos. Maximum 5 videos per batch. One f...
- `batch_dub` (5 params: `api_key` string, `source_language` string, `target_language` string, `video_sources_json` string, `preserve_background_music` boolean). BATCH DUBBING – dub multiple videos into another language in parallel. Each video gets full auto-dubbing (transcribe -> translate -> voice clone -> dubbed video). CRITICAL: source_language, target_language, preserve_background_music are REQUIRED – ask user. All videos share th...
- `update_transcript` (5 params: `job_id` string, `api_key` string, `target_language` string, `transcript_json` string, `reset_for_reprocessing` boolean). UPDATE TRANSCRIPT – edit the narration script before continuing to full video. Use after generate_narration_script returns a transcript and the user wants to change wording, timing, or content of specific segments. The user describes changes naturally; you apply them and call...
- `list_videos` (3 params: `page` integer, `api_key` string, `per_page` integer). LIST VIDEOS – get the user's video library (previously processed videos). Use when the user wants to see their existing videos, re-translate a previously narrated video, or work with videos they already processed. Returns paginated list with job IDs, filenames, status, and tim...
- `continue_to_full_video` (5 params: `job_id` string, `api_key` string, `db_job_id` string, `voice_type` string, `voice_sample` string). Continue from transcript to full narrated video. Use after generate_narration_script returns a transcript and the user is satisfied with it. VOICE OPTIONS (ask the user which they prefer): 1. AI voice: male1 (default, fastest), female1 (default, fastest), female2, female3, fem...

Part of the StudioMeyer MCP Stack · Built in Mallorca 🌴 · ⭐ if you use it
8 MCP tools for recording, editing, effects, captions, TTS, and smart screenshots.
Built on ffmpeg and Playwright. Works with any MCP client.
We have been building tools and systems for ourselves for the past two years. The fact that this repo is small and has few stars is not because it is new. It is because we only just decided to share what we have built. It is not a fresh experiment, it is a long story with a recent commit.
We love building things and sharing them. We do not love social media tactics, growth hacks, or chasing stars and followers. So this repo is small. The code is real, it gets used, issues get answered. Judge for yourself.
If it helps you, sharing, testing, and feedback help us. If it could be better, an issue is more useful. If you build something with it, tell us at hello@studiomeyer.io. That genuinely makes our day.
From a small studio in Palma de Mallorca.
| Tool | Operations | Description |
|---|---|---|
| video_record | cinema, scroll, multi-device | Record websites at 60fps with frame-by-frame capture |
| video_edit | speed, crop, reverse, keyframe, pip | Edit clips with zoom/pan, PiP, slow-mo |
| video_color | grade, effect, lut, chroma | Color grading, 22 LUT presets, green screen |
| video_audio | extract, music, ducking, mix, voice | Audio extraction, mixing, 9 voice effects |
| video_text | subtitles, caption, overlay, animate | Burn SRT, Whisper auto-caption, 15 text animations |
| video_compose | concat, intro, social, beat-sync, templates | Join clips, social format conversion, beat sync |
| video_speech | generate, voices, narrated | ElevenLabs/OpenAI TTS, full narrated videos |
| video_screenshot | capture, detect | Element-aware screenshots, page feature detection |

...cinematic and showcase for buttery smooth scrolling.

- Playwright browsers (`npx playwright install chromium`)
- `ELEVENLABS_API_KEY` for ElevenLabs TTS
- `OPENAI_API_KEY` for Whisper captions and OpenAI TTS

If ffmpeg lives outside PATH, set `FFMPEG_PATH` and `FFPROBE_PATH` to the absolute binary paths. Both env vars are honoured at startup AND at every runtime spawn site.
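
As a rough illustration of the override behaviour described above, the sketch below resolves a binary name from the environment on every call instead of caching it once. The helper name `resolveBinary` and the `Env` type are hypothetical and not taken from the project; only the `FFMPEG_PATH` and `FFPROBE_PATH` variable names come from this README.

```typescript
// Hypothetical helper, not the project's actual code.
type Env = Record<string, string | undefined>;

/**
 * Return the binary to spawn: the env override when it is set and non-blank,
 * otherwise the bare name, which the OS then looks up on PATH.
 */
function resolveBinary(name: "ffmpeg" | "ffprobe", env: Env): string {
  const key = name === "ffmpeg" ? "FFMPEG_PATH" : "FFPROBE_PATH";
  const override = env[key]?.trim();
  return override ? override : name;
}

// Calling this at each spawn site keeps the override effective everywhere,
// which is one way to read "honoured at every runtime spawn site".
console.log(resolveBinary("ffmpeg", { FFMPEG_PATH: "/opt/ffmpeg/bin/ffmpeg" }));
console.log(resolveBinary("ffprobe", {}));
```

In a Node process the second argument would typically be `process.env`.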

    {
      "mcpServers": {
        "video": {
          "command": "npx",
          "args": ["-y", "mcp-video"]
        }
      }
    }


    npx mcp-video

    git clone https://github.com/studiomeyer-io/mcp-video.git
    cd mcp-video
    npm install
    npx playwright install chromium
    npm run build
    npm start

    # Start as HTTP microservice
    npx mcp-video --http --port=9847
    # Or via environment variables
    MCP_HTTP=1 MCP_PORT=9847 npx mcp-video

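
The two ways of enabling HTTP mode shown above (CLI flags or environment variables) could be reconciled roughly as follows. This is a minimal sketch under assumptions: the function name `resolveTransport`, the rule that a CLI flag wins over the environment, and the accepted truthy values for `MCP_HTTP` are all guesses, not documented behaviour.

```typescript
type Env = Record<string, string | undefined>;

interface TransportConfig {
  mode: "stdio" | "http";
  host: string;
  port: number;
}

/** Hypothetical resolver: CLI flags first, then env vars, then the README defaults. */
function resolveTransport(argv: string[], env: Env): TransportConfig {
  const httpFlag = argv.includes("--http");
  // Assumption: "1" and "true" both enable HTTP.
  const httpEnv = env["MCP_HTTP"] === "1" || env["MCP_HTTP"] === "true";

  const portArg = argv.find((a) => a.startsWith("--port="))?.slice("--port=".length);
  const rawPort = portArg ?? env["MCP_PORT"] ?? "9847";
  const port = Number(rawPort);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`Invalid port: ${rawPort}`);
  }

  return {
    mode: httpFlag || httpEnv ? "http" : "stdio",
    host: env["MCP_HOST"] ?? "127.0.0.1",
    port,
  };
}

console.log(resolveTransport(["--http", "--port=9847"], {}));
```

Defaulting the host to `127.0.0.1` keeps the HTTP endpoint local unless `MCP_HOST` is set explicitly.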
| Environment Variable | Default | Description |
|---|---|---|
| VIDEO_OUTPUT_DIR | ./output | Directory for generated files |
| FFMPEG_PATH | - | Absolute path to ffmpeg binary if not on PATH |
| FFPROBE_PATH | - | Absolute path to ffprobe binary if not on PATH |
| ELEVENLABS_API_KEY | - | ElevenLabs TTS API key |
| OPENAI_API_KEY | - | OpenAI API key (Whisper + TTS) |
| MCP_HTTP | false | Enable HTTP transport |
| MCP_PORT | 9847 | HTTP port |
| MCP_HOST | 127.0.0.1 | HTTP bind address |
| MCP_VIDEO_DEBUG | false | Enable debug logging |
| MCP_VIDEO_ALLOW_INTERNAL | false | Set to 1 to allow URLs that resolve to localhost / private / metadata IPs. Local dev only; leave unset in production (SSRF guard). |
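
To make the SSRF guard behind `MCP_VIDEO_ALLOW_INTERNAL` more concrete, here is a minimal sketch of the kind of check such a guard might perform on a resolved IPv4 address. It is an illustration only: the function names are hypothetical, it covers IPv4 but not IPv6, and it does not show the DNS resolution step a real guard would need before this check.

```typescript
type Env = Record<string, string | undefined>;
type Octets = [number, number, number, number];

/** Parse a dotted-quad IPv4 string, or return null if it is malformed. */
function parseIPv4(ip: string): Octets | null {
  const m = /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/.exec(ip);
  if (!m) return null;
  const octets: Octets = [Number(m[1]), Number(m[2]), Number(m[3]), Number(m[4])];
  return octets.every((o) => o <= 255) ? octets : null;
}

/** True for loopback, private, link-local (cloud metadata) and 0.0.0.0/8 ranges. */
function isInternalIPv4(ip: string): boolean {
  const o = parseIPv4(ip);
  if (!o) return true; // fail closed: unparseable input is treated as internal
  const [a, b] = o;
  return (
    a === 0 ||                            // 0.0.0.0/8
    a === 10 ||                           // 10.0.0.0/8
    a === 127 ||                          // 127.0.0.0/8 loopback
    (a === 169 && b === 254) ||           // 169.254.0.0/16, includes 169.254.169.254
    (a === 172 && b >= 16 && b <= 31) ||  // 172.16.0.0/12
    (a === 192 && b === 168)              // 192.168.0.0/16
  );
}

/** Hypothetical gate: internal targets pass only when the opt-in flag is "1". */
function isAddressAllowed(ip: string, env: Env): boolean {
  if (env["MCP_VIDEO_ALLOW_INTERNAL"] === "1") return true;
  return !isInternalIPv4(ip);
}

console.log(isAddressAllowed("169.254.169.254", {}));
```

Failing closed on malformed input means a parsing surprise blocks a request rather than letting it through.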
| Use Case | Tools Used | Output |
|---|---|---|
| Product demo video | video_record → video_text → video_audio | 60fps website recording + auto-captions + background music |
| Social media clips | video_record → video_compose | Record once → export to Instagram Reel, TikTok, YouTube Short |
| Narrated explainer | video_speech → video_color | AI voiceover + cinematic color grade |
| Before/after comparison | video_screenshot → video_edit | Smart element screenshots + PiP composition |
| Automated QA | video_record + video_screenshot | Record user flows + screenshot specific elements |
Use video_record with type "cinema" to record https://example.com
with a smooth scroll and hover over the navbar.
Use video_speech with type "narrated" to create a narrated video of
https://example.com with these segments:
1. "Welcome to our homepage" — pause on hero section
2. "Check out our features" — scroll to features
3. "Get started today" — hover over CTA button
Use video_text with type "caption" to add auto-generated captions
to /path/to/video.mp4
Use video_compose with type "social-all" to convert
/path/to/video.mp4 to all social media formats.
Use video_screenshot with type "capture" to screenshot the chat widget
and pricing section on https://example.com
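
For clients that call tools programmatically instead of through natural-language prompts, the prompts above might translate into a sequence of MCP `tools/call` payloads along these lines. The `name` and `arguments` fields follow the MCP tool-call shape and the `type` values are taken from the prompts, but the other argument names (`url`, `input`) are assumptions; check the server's published tool schemas for the real parameters.

```typescript
// Shape of the params object in an MCP tools/call request.
interface ToolCall {
  name: string;
  arguments: Record<string, unknown>;
}

/**
 * Hypothetical plan: record a site, caption the recording, then export social formats.
 * Only the tool names and `type` values come from the README; `url` and `input` are guesses.
 */
function demoToSocialPlan(url: string, recordedPath: string): ToolCall[] {
  return [
    { name: "video_record", arguments: { type: "cinema", url } },
    { name: "video_text", arguments: { type: "caption", input: recordedPath } },
    { name: "video_compose", arguments: { type: "social-all", input: recordedPath } },
  ];
}

for (const call of demoToSocialPlan("https://example.com", "/path/to/video.mp4")) {
  console.log(JSON.stringify(call));
}
```

Each step would be sent only after the previous one returns, since later steps depend on the file the earlier one produced.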

    src/
      server.ts                 Entry point, 8 consolidated MCP tools
      lib/                      Logger, types, dual transport
      handlers/                 Tool handlers (video, editing, post-production, tts, screenshots)
      schemas/                  JSON Schema definitions for legacy tool format
      tools/
        engine/                 Core engines
          capture.ts            Frame-by-frame recording (Playwright → PNG → ffmpeg)
          encoder.ts            ffmpeg encoding pipeline
          scenes.ts             Scene execution (scroll, hover, click, type, wait)
          cursor.ts             Visible cursor simulation
          smart-screenshot.ts   Element-aware screenshot engine
          tts.ts                ElevenLabs + OpenAI TTS with fallback
          narrated-video.ts     Full narration pipeline
          social-format.ts      Social media format conversion
          concat.ts             Video concatenation with transitions
          lut-presets.ts        22 cinema LUT presets
          ...and more

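
The frame-by-frame approach listed for `capture.ts` (Playwright → PNG → ffmpeg) can be sketched as follows: compute one eased scroll offset per frame, capture a PNG at each offset, then hand the numbered sequence to ffmpeg. This is not the project's code. The cubic easing is just one common curve, the function names are hypothetical, and the Playwright capture loop itself is omitted.

```typescript
/** Cubic ease-in-out for t in [0, 1]; the project's own easing curves may differ. */
function easeInOutCubic(t: number): number {
  return t < 0.5 ? 4 * t * t * t : 1 - Math.pow(-2 * t + 2, 3) / 2;
}

/** Scroll offset in pixels for each frame of a scroll lasting `seconds` at `fps`. */
function scrollFrames(from: number, to: number, seconds: number, fps = 60): number[] {
  const count = Math.max(2, Math.round(seconds * fps));
  const frames: number[] = [];
  for (let i = 0; i < count; i++) {
    const t = i / (count - 1); // normalised time, 0 on the first frame, 1 on the last
    frames.push(Math.round(from + (to - from) * easeInOutCubic(t)));
  }
  return frames;
}

/** ffmpeg arguments that encode a numbered PNG sequence into an H.264 MP4. */
function encodeArgs(framePattern: string, output: string, fps = 60): string[] {
  return [
    "-y",
    "-framerate", String(fps), // input rate of the image sequence
    "-i", framePattern,        // e.g. frames/frame_%06d.png
    "-c:v", "libx264",
    "-pix_fmt", "yuv420p",     // widely compatible pixel format
    output,
  ];
}

const offsets = scrollFrames(0, 1200, 2);
console.log(offsets.length, offsets[0], offsets[offsets.length - 1]);
console.log(["ffmpeg", ...encodeArgs("frames/frame_%06d.png", "out.mp4")].join(" "));
```

Because every frame is rendered at a computed position instead of being sampled in real time, the output frame rate does not depend on how long each screenshot takes.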

    npm run dev         # Start with tsx (hot reload)
    npm run typecheck   # Type check
    npm test            # Run tests
    npm run check       # Verify ffmpeg/ffprobe installed

StudioMeyer is an AI and design studio based in Palma de Mallorca, working with clients worldwide. We build custom websites and AI infrastructure for small and medium businesses. Production stack on Claude Agent SDK, MCP and n8n, with Sentry, Langfuse and LangGraph for observability and an in-house guard layer.
MIT
Built by StudioMeyer. Part of our open-source toolkit for AI-powered content creation.