Routes your lipsync job across four RunComfy endpoints (Sync Labs v2/Pro, ByteDance OmniHuman, Kling, Creatify) based on what you're actually trying to do: dub an existing video, animate a portrait still, or generate speech from a script. The skill picks Sync Labs for mouth-swap on real footage, OmniHuman for talking-head avatars from a single photo, and Kling text-to-video when you don't have pre-recorded audio. Includes consent guardrails since driving someone's mouth with arbitrary audio is obviously dual-use. The docs are thorough on model trade-offs (Pro vs standard, cost vs quality, audio stem isolation) and all four routes ship with working bash invocations.
npx -y skills add agentspace-so/runcomfy-agent-skills --skill lipsync --agent claude-code

Installs into .claude/skills of the current project.
Drive a face's mouth from an audio track. This skill routes across the lip-sync endpoints in the RunComfy catalog (OmniHuman, Sync Labs sync v2, Kling lipsync, Creatify), picking the right model for the user's actual intent and shipping the documented prompts plus the exact runcomfy run invocation.
runcomfy.com · Sync Labs models · CLI docs
# 1. Install (see runcomfy-cli skill for details)
npm i -g @runcomfy/cli # or: npx -y @runcomfy/cli --version
# 2. Sign in
runcomfy login # or in CI: export RUNCOMFY_TOKEN=<token>
# 3. Lipsync
runcomfy run <vendor>/<model> \
--input '{"video_url": "...", "audio_url": "..."}' \
--output-dir ./out
CLI deep dive: runcomfy-cli skill.
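Before step 3 it can help to confirm that a credential is actually present. The following is a minimal sketch, not part of the CLI: the function name `runcomfy_has_credentials` is hypothetical, and it only assumes the two credential locations this document mentions (the `RUNCOMFY_TOKEN` environment variable and the token file written by `runcomfy login`). It does not verify that the token is valid.

```shell
#!/bin/sh
# Hypothetical preflight helper (not a runcomfy subcommand).
# Succeeds when a credential appears to be available in one of the two
# places this document describes; it does not check the token's validity.
runcomfy_has_credentials() {
  local token_file="${HOME}/.config/runcomfy/token.json"
  if [ -n "${RUNCOMFY_TOKEN:-}" ]; then
    return 0              # CI / container path: token supplied via environment
  fi
  [ -s "$token_file" ]    # interactive path: non-empty file from `runcomfy login`
}

if runcomfy_has_credentials; then
  echo "credentials found"
else
  echo "no credentials: run 'runcomfy login' or export RUNCOMFY_TOKEN" >&2
fi
```

A rejected or expired token would still surface later as exit code 77, so this check only catches the "never signed in" case early.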
Driving a real person's mouth from a separate audio track is dual-use. Refuse user requests that target real public figures without consent, or that aim at defamatory or sexually explicit synthetic media. The skill itself does not gate inputs — the responsibility rests with the operator.
Listed newest first within each subtype. The agent picks one route based on: input shape (portrait still + audio vs source video + audio vs script-only), quality tier, and budget.
Sync Labs sync v2 Pro — sync/sync/lipsync/v2/pro (default for premium)
Sync Labs' premium lip-sync — state-of-the-art mouth motion onto an existing video. Preserves the rest of the frame untouched. Pick for: hero-quality dubs, lipsync on professionally-shot video, foreign-language dubbing where mouth fidelity matters most. Avoid for: cost-sensitive batch jobs — drop to sync v2.
Sync Labs sync v2 — sync/sync/lipsync/v2
Standard Sync Labs tier, same workflow as Pro. Pick for: scaled / batch lipsync jobs, drafts. Avoid for: hero delivery — use v2 Pro.
Kling Lipsync (audio-to-video) — kling/lipsync/audio-to-video
Kling's lip-sync onto a source video, driven by an audio track. Pick for: Kling-pipeline integration; alternative to Sync Labs. Avoid for: top-tier mouth fidelity — Sync Labs Pro is the industry benchmark.
Creatify Lipsync — creatify/lipsync
Creatify's lipsync endpoint. Pick for: Creatify-ecosystem workflows. Avoid for: comparison shopping unless cost / latency favors it.
OmniHuman — bytedance/omnihuman/api (default for avatar-style)
ByteDance's audio-driven full-body avatar. One portrait + one audio → video where the subject speaks / gestures naturally. Listed under RunComfy's /feature/lip-sync as the curated default. Pick for: UGC voiceover, virtual presenter, dubbed product demo from a single portrait. Avoid for: lip-sync onto an existing video (no portrait, want to preserve original motion); use Sync Labs v2 instead.
Wan 2-7 with audio_url — wan-ai/wan-2-7/text-to-video
Open-weights t2v with an audio_url field: the prompt describes the scene, the audio drives the mouth. Pick for: full scene control (not just a portrait) with a specific voiceover MP3 + open-weights pipeline. Avoid for: simplest "portrait talks"; use OmniHuman.
Kling Lipsync (text-to-video) — kling/lipsync/text-to-video
Generates speech audio in-pass from a script and syncs it to the resulting video. Pick for: "write a script → get a video with synced speech", no audio file needed. Avoid for: precise lip-sync to a specific MP3 (audio is regenerated each call, not locked).
HappyHorse 1.0 — happyhorse/happyhorse-1-0/text-to-video (also /image-to-video)
Arena #1 t2v / i2v with in-pass audio generated from prompt. Quote the spoken line inside the prompt with says clearly: "…". Pick for: written script, in-pass audio with strong overall quality, social/UGC clips. Avoid for: locking mouth to a pre-recorded voiceover.
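The default choices in the list above can be sketched as a small lookup. This is an illustrative assumption about how a router might encode them, not the skill's actual implementation: the function name `pick_lipsync_route` and the `video` / `portrait` / `script` and `premium` / `standard` labels are hypothetical. Only the model IDs come from the list, and the alternative routes (Creatify, Wan, HappyHorse, Kling audio-to-video) are left out for brevity.

```shell
#!/bin/sh
# Hypothetical router: map an input shape and quality tier to a model ID.
#   shape: video    (source video + audio)
#          portrait (still image + audio)
#          script   (text only, no audio file)
#   tier:  premium | standard  (only consulted for the "video" shape)
pick_lipsync_route() {
  local shape="$1" tier="${2:-premium}"
  case "$shape" in
    video)
      if [ "$tier" = "standard" ]; then
        echo "sync/sync/lipsync/v2"        # batch jobs and drafts
      else
        echo "sync/sync/lipsync/v2/pro"    # default for premium
      fi
      ;;
    portrait) echo "bytedance/omnihuman/api" ;;       # default for avatar-style
    script)   echo "kling/lipsync/text-to-video" ;;   # speech generated in-pass
    *)
      echo "unknown input shape: $shape" >&2
      return 1
      ;;
  esac
}

pick_lipsync_route video standard   # prints sync/sync/lipsync/v2
pick_lipsync_route portrait         # prints bytedance/omnihuman/api
```

Keeping the mapping in one function makes the budget trade-off explicit: only the source-video shape has a cheaper tier to drop to.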
Model: sync/sync/lipsync/v2/pro (or sync/sync/lipsync/v2)
Catalog: sync v2 Pro · sync v2
runcomfy run sync/sync/lipsync/v2/pro \
--input '{
"video_url": "https://your-cdn.example/source-video.mp4",
"audio_url": "https://your-cdn.example/voiceover.mp3"
}' \
--output-dir ./out
Model: bytedance/omnihuman/api
Catalog: omnihuman
runcomfy run bytedance/omnihuman/api \
--input '{
"image_url": "https://your-cdn.example/portrait.jpg",
"audio_url": "https://your-cdn.example/voiceover.mp3"
}' \
--output-dir ./out
See the ai-avatar-video skill for the full avatar treatment.

Model: kling/lipsync/audio-to-video (existing video + audio) or kling/lipsync/text-to-video (script-only)
Catalog: Kling lipsync a2v · Kling lipsync t2v
runcomfy run kling/lipsync/audio-to-video \
--input '{
"video_url": "https://your-cdn.example/source-video.mp4",
"audio_url": "https://your-cdn.example/voiceover.mp3"
}' \
--output-dir ./out
Schema details on the model page.
community/wan-2-2-animate/video-to-video); see ai-avatar-video.
kling collection, including Kling lipsync variants

| code | meaning |
|---|---|
| 0 | success |
| 64 | bad CLI args |
| 65 | bad input JSON / schema mismatch |
| 69 | upstream 5xx |
| 75 | retryable: timeout / 429 |
| 77 | not signed in or token rejected |
Full reference: docs.runcomfy.com/cli/troubleshooting.
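The exit codes in the table lend themselves to a small retry wrapper. The sketch below is an assumption about how a caller might use them, not documented CLI behavior: `run_with_retry` and `describe_exit_code` are hypothetical names, and the attempt limit and linear backoff are illustrative choices. It retries only on 75, the one code the table marks as retryable, and passes every other code through unchanged.

```shell
#!/bin/sh
# Hypothetical wrapper: run a command, retrying only when it exits with 75
# ("retryable: timeout / 429" in the table). Any other exit code is returned
# as-is. Usage: run_with_retry <max_attempts> <base_delay_seconds> <command...>
run_with_retry() {
  local max_attempts="$1" delay="$2"
  shift 2
  local attempt=1 code=0
  while :; do
    code=0
    "$@" || code=$?
    if [ "$code" -ne 75 ] || [ "$attempt" -ge "$max_attempts" ]; then
      return "$code"
    fi
    sleep $(( delay * attempt ))   # linear backoff; an illustrative choice
    attempt=$(( attempt + 1 ))
  done
}

# Translate a documented exit code into the meaning given in the table.
describe_exit_code() {
  case "$1" in
    0)  echo "success" ;;
    64) echo "bad CLI args" ;;
    65) echo "bad input JSON / schema mismatch" ;;
    69) echo "upstream 5xx" ;;
    75) echo "retryable: timeout / 429" ;;
    77) echo "not signed in or token rejected" ;;
    *)  echo "undocumented exit code $1" ;;
  esac
}

# Example (not executed here; $json holds the --input body):
#   run_with_retry 3 5 runcomfy run sync/sync/lipsync/v2 \
#     --input "$json" --output-dir ./out || describe_exit_code "$?"
```

Codes 64, 65, and 77 point at the caller's arguments, input, or credentials, so retrying them would only repeat the same failure.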
The skill classifies user intent — source video + audio? portrait still + audio? script only? — picks the matching route, and invokes runcomfy run with the JSON body. The CLI POSTs to the Model API, polls request status, fetches the result, and downloads any .runcomfy.net / .runcomfy.com URLs into --output-dir.
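The step from classified intent to JSON body might look like the sketch below. It is a hedged illustration rather than the skill's code: `is_plain_https_url` and `build_lipsync_input` are hypothetical helpers, and the field names (`video_url`, `image_url`, `audio_url`) are taken only from the invocations shown in this document. The script-only body is not documented here, so it is deliberately left unsupported rather than guessed.

```shell
#!/bin/sh
# Hypothetical check: accept only https URLs made of a conservative character
# set, so nothing in the value needs JSON escaping. URLs containing other
# characters (for example '&' or quotes) are refused by this sketch.
is_plain_https_url() {
  case "$1" in
    https://*) ;;
    *) return 1 ;;
  esac
  case "$1" in
    *[!A-Za-z0-9._~:/?=%+-]*) return 1 ;;
  esac
  return 0
}

# Hypothetical builder for the two audio-driven shapes shown in this document.
# Usage: build_lipsync_input <video|portrait> <media_url> <audio_url>
build_lipsync_input() {
  local shape="$1" media_url="$2" audio_url="$3" media_key
  case "$shape" in
    video)    media_key="video_url" ;;   # source video + audio
    portrait) media_key="image_url" ;;   # portrait still + audio
    *)
      echo "unsupported shape: $shape (see the model page for its schema)" >&2
      return 1
      ;;
  esac
  if ! is_plain_https_url "$media_url" || ! is_plain_https_url "$audio_url"; then
    echo "refusing URL with unexpected characters" >&2
    return 1
  fi
  printf '{"%s": "%s", "audio_url": "%s"}\n' "$media_key" "$media_url" "$audio_url"
}

# Print the body only; pass it to `runcomfy run <model> --input "$json"` to invoke.
json="$(build_lipsync_input portrait \
  "https://your-cdn.example/portrait.jpg" \
  "https://your-cdn.example/voiceover.mp3")"
echo "$json"
```

Rejecting anything outside a narrow character set is stricter than necessary, but it avoids hand-rolled JSON escaping in shell.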
npm i -g @runcomfy/cli or npx -y @runcomfy/cli. Agents must not pipe an arbitrary remote install script into a shell on the user's behalf.
runcomfy login writes the API token to ~/.config/runcomfy/token.json with mode 0600. Set the RUNCOMFY_TOKEN env var in CI / containers.
--input. The CLI does not shell-expand prompt content. No shell-injection surface.
model-api.runcomfy.net and *.runcomfy.net / *.runcomfy.com. No telemetry.
Bash(runcomfy *) only.

runcomfy-cli: the underlying CLI
ai-avatar-video: full avatar / talking-head router (OmniHuman + HappyHorse + Wan)
ai-video-generation: general t2v / i2v
face-swap: identity swap on existing video (often paired with lipsync)
video-edit: broader video edits