A clean wrapper around ElevenLabs Music via the RunComfy CLI. Generates studio-quality songs and instrumentals from text prompts, 5 seconds to 5 minutes, with section markers for verse/chorus/bridge structure. The single prompt field carries both style instructions and lyrics, which takes some getting used to but works well once you nail the format. Vocals are multilingual and surprisingly coherent. Force instrumental mode for podcast intros and background beds. Pricing scales linearly with duration, so draft at 30 seconds before committing to a 5-minute render. Best for structured songs with real vocals or polished instrumental tracks. For one-off sound effects or voice cloning, look elsewhere.
npx -y skills add agentspace-so/runcomfy-agent-skills --skill elevenlabs-music-generation --agent claude-code
Installs into .claude/skills of the current project.
Generate full songs and instrumental tracks from a text description — studio-quality 44.1 kHz stereo, 5 seconds to 5 minutes, with section-level structure control. ElevenLabs Music on the RunComfy Model API, called through the runcomfy CLI.
runcomfy.com · ElevenLabs Music model · CLI docs
npx skills add agentspace-so/runcomfy-agent-skills --skill elevenlabs-music-generation -g
# 1. Install (one of — see runcomfy-cli skill for details)
npm i -g @runcomfy/cli # global install
npx -y @runcomfy/cli --version # zero-install
# 2. Sign in
runcomfy login # or in CI: export RUNCOMFY_TOKEN=<token>
# 3. Generate music
runcomfy run elevenlabs/elevenlabs/music-generation \
--input '{"prompt": "..."}' \
--output-dir ./out
CLI deep dive: runcomfy-cli skill.
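Step 2 above offers two sign-in paths: an interactive `runcomfy login` or a `RUNCOMFY_TOKEN` environment variable. A minimal sketch of a pre-flight check that a script could run before generating anything is shown here. The function name is hypothetical, and the token file location is taken from the security notes later in this document; whether the CLI honors other config locations is not covered here.

```shell
# Sketch: succeed when some credential source appears to be available.
# Assumption: `runcomfy login` stores its token at ~/.config/runcomfy/token.json.
have_runcomfy_credentials() {
  # A non-empty RUNCOMFY_TOKEN covers CI and containers.
  if [ -n "${RUNCOMFY_TOKEN:-}" ]; then
    return 0
  fi
  # Otherwise look for a non-empty token file left by an earlier login.
  [ -s "${HOME}/.config/runcomfy/token.json" ]
}
```

A script might call it as `have_runcomfy_credentials || { echo "run 'runcomfy login' first" >&2; exit 77; }`, reusing the CLI's own "not signed in" exit code.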
ElevenLabs Music's strength is structured songs with real vocals — it takes a style brief plus lyrics with section markers and returns a coherent, mixed track. Pick it for:
- force_instrumental: true for background music, podcast intros, game loops

If the user just wants ambient sound or a one-off SFX (thunder, footsteps), that's a sound-effects task, not music. ElevenLabs Music is for songs and tracks.
Model: elevenlabs/elevenlabs/music-generation
| Field | Type | Required | Default | Notes |
|---|---|---|---|---|
| prompt | string | yes | — | Style description and lyrics with section markers. See prompting tips |
| music_length_ms | int | no | 40000 | Output duration in ms. 5000–300000 (5 s – 5 min) |
| force_instrumental | bool | no | false | true = instrumental only, no vocals |
| output_format | string | no | mp3_standard | mp3_standard (default) or WAV; see the model page API tab for the full format list |
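Because `music_length_ms` is expressed in milliseconds and bounded to 5000–300000, a script that accepts seconds from a user can convert and range-check the value before spending a request. This is a minimal sketch; both function names are hypothetical and not part of the CLI.

```shell
# Convert whole seconds to milliseconds.
seconds_to_ms() {
  echo "$(( $1 * 1000 ))"
}

# Succeed only for a plain integer inside the documented 5000-300000 ms range.
valid_music_length_ms() {
  case "$1" in
    ''|0*|*[!0-9]*) return 1 ;;   # empty, leading zero, or non-digit characters
  esac
  [ "$1" -ge 5000 ] && [ "$1" -le 300000 ]
}
```

For example, `valid_music_length_ms "$(seconds_to_ms 90)"` succeeds, while a 4-second or 6-minute request is rejected locally instead of coming back as exit code 65.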
Output: 44.1 kHz stereo audio. The result JSON contains the generated audio URL — the CLI downloads it into --output-dir.
Pricing: ~$0.0083 per second of generated audio (30 s ≈ $0.25, 60 s ≈ $0.50, 5 min ≈ $2.49). Cost scales with music_length_ms, so draft short and finalize long.
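The pricing arithmetic above is easy to script when comparing a draft against a final render. The sketch below assumes the approximate flat rate of $0.0083 per second quoted here; actual billing may round differently, and the function name is hypothetical.

```shell
# Estimate the cost in USD for a given music_length_ms.
# Assumption: flat ~$0.0083 per generated second, as quoted in this document.
estimate_cost_usd() {
  LC_ALL=C awk -v ms="$1" 'BEGIN { printf "%.2f\n", (ms / 1000) * 0.0083 }'
}

# Compare a 35 s draft with a full 5-minute render.
echo "draft: \$$(estimate_cost_usd 35000)  final: \$$(estimate_cost_usd 300000)"
# prints: draft: $0.29  final: $2.49
```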
Full vocal song with structure:
runcomfy run elevenlabs/elevenlabs/music-generation \
--input '{
"prompt": "Upbeat indie-pop anthem, bright electric guitars, driving drums, 120 BPM, female lead vocal. [Intro 8 bars] instrumental build. [Verse] Chalk on the palms, laces double-knotted, morning on the ridge. [Chorus] We rise, we strike, we never fade out. [Bridge] soft breakdown, just piano and voice. [Outro] full band, fade.",
"music_length_ms": 60000
}' \
--output-dir ./out
Instrumental background bed:
runcomfy run elevenlabs/elevenlabs/music-generation \
--input '{
"prompt": "Calm lo-fi hip-hop instrumental for a study playlist. Warm Rhodes piano, soft vinyl crackle, mellow boom-bap drums, 75 BPM. No vocals. Consistent loop-friendly groove throughout.",
"music_length_ms": 90000,
"force_instrumental": true
}' \
--output-dir ./out
Short brand jingle:
runcomfy run elevenlabs/elevenlabs/music-generation \
--input '{
"prompt": "5-second cheerful brand stinger, bright marimba and a single uplifting chord resolve, no vocals.",
"music_length_ms": 5000,
"force_instrumental": true
}' \
--output-dir ./out
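The examples above write the `--input` JSON by hand inside single quotes, which works until a lyric contains an apostrophe or a double quote and the calling shell's quoting gets in the way. One possible workaround is to keep the prompt in a variable and build the JSON from it. The helpers below are hypothetical and only a sketch: the escaping covers backslashes and double quotes and folds newlines and tabs into spaces, so a dedicated JSON tool is the safer choice for arbitrary text.

```shell
# Escape a string for use inside a JSON string literal (sketch, see caveats above).
json_escape() {
  printf '%s' "$1" | tr '\n\t' '  ' | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Build the --input body: prompt, length in ms, optional instrumental flag (default false).
build_music_input() {
  bmi_prompt=$1
  bmi_length_ms=$2
  bmi_instrumental=${3:-false}
  printf '{"prompt": "%s", "music_length_ms": %s, "force_instrumental": %s}\n' \
    "$(json_escape "$bmi_prompt")" "$bmi_length_ms" "$bmi_instrumental"
}

# Usage (commented out so that sourcing this file does not start a paid render):
# prompt="Acoustic folk ballad, fingerpicked guitar. [Verse] Don't look back, the road's still wide."
# runcomfy run elevenlabs/elevenlabs/music-generation \
#   --input "$(build_music_input "$prompt" 30000)" \
#   --output-dir ./out
```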
ElevenLabs Music reads one prompt field that carries both the style brief and the lyrics. Structure it well:
"Upbeat indie-pop anthem, bright electric guitars, 120 BPM, female lead vocal."[Intro], [Verse], [Chorus], [Bridge], [Outro]. Add approximate durations or bar counts — [Intro 8 bars], [Verse 16 bars]."electric guitar carries the chorus, drums sit back in the verse."force_instrumental: true AND say "no vocals" in the prompt — belt and suspenders.[Verse] (sung in Brazilian Portuguese) ...).music_length_ms: 35000) before paying for a 5-minute render.[Intro]/[Verse]/[Chorus] structure, music_length_ms matched to the video lengthforce_instrumental: true, 10–20 s, "loop-friendly, clean ending"force_instrumental: true, describe "seamless loop", 60–120 s, consistent groovemusic_length_ms: 35000 to lock genre/tempo/structure → final render at full lengthprompt field carries everything (style + lyrics). There is no separate "lyrics" parameter.music_length_ms 5000–300000). For longer pieces, generate sections and stitch externally.force_instrumental is the only vocal toggle — you can't request specific voice identities or clone a singer through this endpoint.| code | meaning |
|---|---|
| 0 | success |
| 64 | bad CLI args |
| 65 | bad input JSON / schema mismatch |
| 69 | upstream 5xx |
| 75 | retryable: timeout / 429 |
| 77 | not signed in or token rejected |
Full reference: docs.runcomfy.com/cli/troubleshooting.
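Since the table marks only exit code 75 as retryable, a wrapper script can retry on that code and treat every other non-zero exit as final. The sketch below is one way to do that; `retry_on_75` and the `RETRY_DELAY_S` variable are hypothetical names, not CLI features, and the fixed delay is a simplification.

```shell
# Run a command, retrying only when it exits with 75 (timeout / 429).
# Usage: retry_on_75 <max_attempts> <command> [args...]
# RETRY_DELAY_S: seconds to wait between attempts (default 5, hypothetical knob).
retry_on_75() {
  r75_max=$1
  shift
  r75_attempt=1
  while :; do
    r75_status=0
    "$@" || r75_status=$?
    # Success and every non-75 failure are final; so is the last allowed attempt.
    if [ "$r75_status" -ne 75 ] || [ "$r75_attempt" -ge "$r75_max" ]; then
      return "$r75_status"
    fi
    echo "attempt $r75_attempt exited 75, retrying" >&2
    sleep "${RETRY_DELAY_S:-5}"
    r75_attempt=$((r75_attempt + 1))
  done
}

# Usage (commented out so that sourcing this file does not start a render):
# retry_on_75 3 runcomfy run elevenlabs/elevenlabs/music-generation \
#   --input '{"prompt": "..."}' --output-dir ./out
```

Codes 64, 65, and 77 point at problems a retry cannot fix (arguments, input JSON, credentials), which is why the wrapper returns them immediately.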
The skill invokes runcomfy run elevenlabs/elevenlabs/music-generation with the JSON body. The CLI POSTs to the RunComfy Model API, polls request status, fetches the result, and downloads the generated audio file into --output-dir. Ctrl-C cancels the remote request before exit.
- Install the CLI with npm i -g @runcomfy/cli or npx -y @runcomfy/cli. Agents must not pipe an arbitrary remote install script into a shell on the user's behalf; if the operator wants the curl-pipe path documented at docs.runcomfy.com/cli/install, they should review the script first.
- runcomfy login writes the API token to ~/.config/runcomfy/token.json with mode 0600. Set the RUNCOMFY_TOKEN env var to bypass the file in CI / containers. Never echo the token into a prompt, log it, or check it in.
- Prompts are passed as JSON via --input. The CLI does not shell-expand prompt content; it transmits the JSON body directly to the Model API over HTTPS. No shell-injection surface from prompt content, even with backticks, quotes, or $(...) patterns.
- Outbound endpoints: model-api.runcomfy.net (request submission) and *.runcomfy.net / *.runcomfy.com (download whitelist for generated audio). No telemetry, no callbacks.
- The skill only runs runcomfy <subcommand>; the npm / npx lines are one-time operator setup, not commands the skill executes per call.

Related skills:
- runcomfy-cli: the underlying CLI, schema discovery, polling modes, scripting
- ai-video-generation: pair a generated track with a generated video
- ai-avatar-video: talking-head video (different audio path: speech, not music)