Wraps Google's flash-tier Gemini image model with documented prompting patterns that actually work: subject-first grammar, quoted text for in-image typography, seed locking for reproducible variants. Calls RunComfy's CLI for batching drafts at 0.5K or pushing finals to 2K. The bundled guide knows when to route to Nano Banana Pro for portraits or Flux/GPT Image for stylization, which matters because the skill triggers on generic asks like "gemini image" or "google image gen." The web grounding option is there for current-event imagery but adds latency. Best for social thumbnails, rapid ideation rounds, and anything where you need predictable framing over maximum detail.
npx -y skills add agentspace-so/runcomfy-agent-skills --skill nano-banana-2 --agent claude-code
Installs into .claude/skills of the current project.
Google Nano Banana 2 — the flash-tier text-to-image model in the Gemini family — hosted on the RunComfy Model API. Optimized for ideation, social-thumbnail batches, and rapid drafts with strong in-image typography.
npx skills add agentspace-so/runcomfy-skills --skill nano-banana-2 -g
Nano Banana 2 is the flash-tier of the Google image-gen line. Pick it when iteration speed and predictable framing matter more than maximum detail.
| You want | Use |
|---|---|
| Rapid drafts, social thumbnails, batch variants | Nano Banana 2 |
| In-image typography with predictable rendering | Nano Banana 2 |
| Web-grounded image (current events / real entities) | Nano Banana 2 + enable_web_search |
| Image edit (preserve subject, swap background) | Nano Banana Edit (sibling skill) |
| Heavy stylization, painterly look | Flux 2 |
| Maximum prompt adherence + multilingual text | GPT Image 2 |
| 2K–4K hero shots, max realism | Seedream 5 |
| Hyperrealistic portrait | Nano Banana Pro |
If the user explicitly said "Nano Banana 2", "nano-banana-2", or "Gemini image", route here regardless. If they said "Nano Banana" without specifying 2 vs Pro, default to Pro for portraits and 2 for everything else.
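The routing rule can be sketched as a small function. This is an illustration only: `route_model` and the returned skill names (in particular `nano-banana-pro`) are hypothetical, and it reads the explicit trigger as "Nano Banana 2" rather than the bare "Nano Banana".

```python
def route_model(request: str, is_portrait: bool = False) -> str:
    """Pick a skill name from free-form user wording (hypothetical helper)."""
    text = request.lower()
    # Explicit mentions of the flash tier always route here.
    if "nano banana 2" in text or "nano-banana-2" in text or "gemini image" in text:
        return "nano-banana-2"
    # Explicit mentions of the Pro tier (skill name assumed for illustration).
    if "nano banana pro" in text or "nano-banana-pro" in text:
        return "nano-banana-pro"
    # Ambiguous "Nano Banana": Pro for portraits, 2 for everything else.
    if "nano banana" in text or "nano-banana" in text:
        return "nano-banana-pro" if is_portrait else "nano-banana-2"
    # Anything else is decided by the table above, not by this sketch.
    return "unrouted"
```

A caller would still need its own way to decide `is_portrait`; the sketch only encodes the tie-break.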
npm i -g @runcomfy/cli
runcomfy login opens a browser device-code flow.
In CI / containers, set RUNCOMFY_TOKEN=<token> instead of running runcomfy login.
Endpoint: google/nano-banana-2/text-to-image
| Field | Type | Required | Default | Notes |
|---|---|---|---|---|
| prompt | string | yes | n/a | Subject-first description. |
| num_images | int | no | 1 | 1–4. Use 4 for ideation rounds. |
| seed | int | no | 0 | Reuse for reproducibility. |
| aspect_ratio | enum | no | auto | auto, 21:9, 16:9, 3:2, 4:3, 5:4, 1:1, 4:5, 3:4, 2:3, 9:16. |
| resolution | enum | no | 1K | 0.5K (drafts), 1K (default), 2K (final), 4K (max). |
| output_format | enum | no | png | png, jpeg, webp. |
| safety_tolerance | int | no | 4 | 1 (strict) – 6 (permissive). |
| limit_generations | bool | no | true | Limit each prompt round to one generation. |
| enable_web_search | bool | no | false | Adds web grounding (extra cost + latency). |
For image edit (preserve subject + apply changes), see the sibling nano-banana-edit skill.
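Checking the input locally can catch a schema mismatch (exit code 65) before a request is sent. The sketch below mirrors the field table above; `build_input` and `CHECKS` are hypothetical names, not part of the RunComfy CLI, and the server-side validation may be stricter or looser than this.

```python
ASPECT_RATIOS = {"auto", "21:9", "16:9", "3:2", "4:3", "5:4",
                 "1:1", "4:5", "3:4", "2:3", "9:16"}
RESOLUTIONS = {"0.5K", "1K", "2K", "4K"}
OUTPUT_FORMATS = {"png", "jpeg", "webp"}


def _is_int(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool)


# One predicate per optional field, taken from the table above.
CHECKS = {
    "num_images": lambda v: _is_int(v) and 1 <= v <= 4,
    "seed": _is_int,
    "aspect_ratio": lambda v: isinstance(v, str) and v in ASPECT_RATIOS,
    "resolution": lambda v: isinstance(v, str) and v in RESOLUTIONS,
    "output_format": lambda v: isinstance(v, str) and v in OUTPUT_FORMATS,
    "safety_tolerance": lambda v: _is_int(v) and 1 <= v <= 6,
    "limit_generations": lambda v: isinstance(v, bool),
    "enable_web_search": lambda v: isinstance(v, bool),
}


def build_input(prompt, **options):
    """Return a dict for --input, or raise ValueError on a bad field."""
    if not isinstance(prompt, str) or not prompt.strip():
        raise ValueError("prompt is required and must be a non-empty string")
    payload = {"prompt": prompt}
    for field, value in options.items():
        check = CHECKS.get(field)
        if check is None:
            raise ValueError(f"unknown field: {field}")
        if not check(value):
            raise ValueError(f"invalid value for {field}: {value!r}")
        payload[field] = value
    return payload
```

Omitted fields are left out of the payload so the documented defaults apply.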
Default draft (1K, square, png):
runcomfy run google/nano-banana-2/text-to-image \
--input '{"prompt": "<user prompt>"}' \
--output-dir <absolute/path>
Vertical 4-up batch for ideation:
runcomfy run google/nano-banana-2/text-to-image \
--input '{
"prompt": "<user prompt>",
"num_images": 4,
"aspect_ratio": "9:16",
"resolution": "0.5K"
}' \
--output-dir <absolute/path>
Final at 2K with seed lock:
runcomfy run google/nano-banana-2/text-to-image \
--input '{
"prompt": "<user prompt>",
"resolution": "2K",
"aspect_ratio": "16:9",
"seed": 42
}' \
--output-dir <absolute/path>
Web-grounded (current event / real entity):
runcomfy run google/nano-banana-2/text-to-image \
--input '{
"prompt": "<prompt referencing a real-world event from this week>",
"enable_web_search": true
}' \
--output-dir <absolute/path>
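When the invocations above are driven from a script, passing the arguments as a list avoids hand-quoting the JSON. A minimal sketch, using only the subcommand and flags shown above (`run`, `--input`, `--output-dir`); `build_command` and `run` are hypothetical helpers, and the absolute-path check is an assumption drawn from the `<absolute/path>` placeholder.

```python
import json
import subprocess
from pathlib import Path

MODEL = "google/nano-banana-2/text-to-image"


def build_command(payload: dict, output_dir: str) -> list:
    """Build the argv list for `runcomfy run` without going through a shell."""
    out = Path(output_dir)
    if not out.is_absolute():
        raise ValueError("output_dir must be an absolute path")
    # json.dumps handles quotes and apostrophes inside the prompt.
    return ["runcomfy", "run", MODEL,
            "--input", json.dumps(payload),
            "--output-dir", str(out)]


def run(payload: dict, output_dir: str) -> int:
    """Invoke the CLI (must be installed and signed in) and return its exit code."""
    return subprocess.run(build_command(payload, output_dir)).returncode
```

Because the argv list is never interpreted by a shell, a prompt such as `the label reads 'AURA'` needs no extra escaping.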
Subject-first declarative grammar. "A cinematic close-up portrait of an American woman standing under neon lights in rainy Tokyo, shallow depth of field, reflective wet streets, ultra-detailed, realistic skin texture" — primary subject, then action, environment, style, camera. Front-load subject; trail with directives.
Exact text quoting for in-image typography. "The label reads 'AURA' in clean bold sans-serif, centered, white on black" — quote the literal characters. Specify placement and font style. Don't say "with the brand name on it" and hope.
Consistent seeds for refinement. Lock seed when iterating a single prompt across small variants — keeps composition stable.
Web-grounding, sparingly. Turn on enable_web_search only when the prompt names current events / real entities. Adds latency + cost; off by default.
Don't conflict styles. "minimalist + ornate + retro + cyberpunk" cancels. Pick 1–2 anchors.
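The patterns above can be sketched as a small prompt composer: subject first, then details and style anchors, then the literal text in quotes. `compose_prompt` and its parameters are hypothetical, and the two-anchor limit is just the "pick 1–2" advice turned into a check.

```python
def compose_prompt(subject, details=(), styles=(), text=None, text_style=""):
    """Assemble a subject-first prompt with optional quoted in-image text."""
    if len(styles) > 2:
        raise ValueError("pick 1-2 style anchors; more tend to cancel out")
    # Subject leads; environment/camera details and style directives trail.
    parts = [subject, *details, *styles]
    if text is not None:
        # Quote the literal characters instead of describing them.
        clause = f'the text reads "{text}"'
        if text_style:
            clause += f" in {text_style}"
        parts.append(clause)
    return ", ".join(p.strip() for p in parts if p.strip())


prompt = compose_prompt(
    "A matte black ceramic mug centered on warm-grey paper",
    details=["rim highlight from upper-left"],
    styles=["minimalist"],
    text="Brewed Quietly",
    text_style="clean bold sans-serif, top-right",
)
```

The sketch does not escape double quotes inside `text`; keep the quoted string free of them.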
| Use case | Why Nano Banana 2 |
|---|---|
| Marketing draft thumbnails (batch of 4) | Fast iteration at 0.5K, then promote winner to 2K |
| Social-platform-native | Wide aspect ratio support including 9:16, 4:5, 21:9 |
| In-image typography for posters / cards | Predictable text rendering when characters are quoted |
| Web-grounded current-event imagery | enable_web_search integrates fresh info |
| Reproducible variant testing | Strong seed + consistent framing |
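The draft-then-promote workflow from the table can be sketched as two payload helpers. `draft_round` and `promote` are hypothetical names. The sketch uses one seed per draft so the winner can be traced back, and it assumes that reusing the seed at a higher resolution keeps the composition close, which the source does not guarantee.

```python
def draft_round(prompt, seeds, aspect_ratio="9:16"):
    """One cheap 0.5K payload per seed, so each draft is reproducible."""
    return [
        {"prompt": prompt, "seed": seed,
         "aspect_ratio": aspect_ratio, "resolution": "0.5K"}
        for seed in seeds
    ]


def promote(draft, resolution="2K"):
    """Copy the winning draft's payload and raise only the resolution."""
    final = dict(draft)
    final["resolution"] = resolution
    return final


drafts = draft_round("A ceramic teacup on a linen runner", seeds=[11, 12, 13, 14])
final = promote(drafts[2])  # suppose the third draft was picked
```

Each payload is then passed to `runcomfy run` through `--input`, as in the invocations earlier in this document.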
Cinematic portrait (page example):
A cinematic close-up portrait of an American woman standing under neon
lights in rainy Tokyo, shallow depth of field, reflective wet streets,
ultra-detailed, realistic skin texture
Brand-asset card with quoted text:
A minimalist 16:9 product card: a matte black ceramic mug centered on a
soft warm-grey paper background, rim highlight from upper-left, the
headline "Brewed Quietly" in clean bold sans-serif top-right, balanced
negative space below, e-commerce ready, clean studio lighting
Vertical platform-native:
A 9:16 vertical hero for a wellness brand: a single ceramic teacup on a
linen runner, soft morning side-light, the words "Slow Down" in
hand-drawn serif large at the top, gentle steam rising, neutral color
palette, uncluttered
Image editing uses the /edit endpoint, not this one.
| code | meaning |
|---|---|
| 0 | success |
| 64 | bad CLI args |
| 65 | bad input JSON / schema mismatch |
| 69 | upstream 5xx |
| 75 | retryable: timeout / 429 |
| 77 | not signed in or token rejected |
Full reference: docs.runcomfy.com/cli/troubleshooting.
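A wrapper script can act on these exit codes. The sketch below retries only code 75, the one the table marks retryable, with exponential backoff; `run_with_retry` is a hypothetical helper, and the attempt count and delays are arbitrary illustration values. Code 69 (upstream 5xx) is treated as final here because the table does not label it retryable.

```python
import time

EXIT_MEANINGS = {
    0: "success",
    64: "bad CLI args",
    65: "bad input JSON / schema mismatch",
    69: "upstream 5xx",
    75: "retryable: timeout / 429",
    77: "not signed in or token rejected",
}
RETRYABLE = {75}


def run_with_retry(invoke, max_attempts=3, base_delay=2.0, sleep=time.sleep):
    """Call invoke() until it returns a non-retryable code or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        code = invoke()
        if code not in RETRYABLE or attempt == max_attempts:
            return code
        # Back off 2s, 4s, 8s, ... before the next attempt.
        sleep(base_delay * 2 ** (attempt - 1))
```

`invoke` is any zero-argument callable returning the CLI's exit code, for example a `subprocess.run(...).returncode` wrapper.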
The skill invokes runcomfy run google/nano-banana-2/text-to-image with a JSON body matching the schema. The CLI POSTs to https://model-api.runcomfy.net/v1/models/google/nano-banana-2/text-to-image, polls the request, fetches the result, and downloads any .runcomfy.net/.runcomfy.com URL into --output-dir. Ctrl-C cancels the remote request before exit.
runcomfy login writes the API token to ~/.config/runcomfy/token.json with mode 0600 (owner-only read/write). Set the RUNCOMFY_TOKEN env var to bypass the file entirely in CI / containers.
The prompt is passed as JSON via --input. The CLI does NOT shell-expand the prompt; it transmits the JSON body directly to the Model API over HTTPS. No shell injection surface from prompt content.
The CLI contacts model-api.runcomfy.net (request submission) and *.runcomfy.net / *.runcomfy.com (download whitelist for generated outputs). No telemetry, no callbacks.
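A pre-flight check in a script can confirm which credential will be used and that the token file is still owner-only. This is a sketch under the statements above (env var takes precedence, file mode 0600); `credential_source` and `is_owner_only` are hypothetical helpers, the check is POSIX-only, and it never reads the token file's contents.

```python
import os
import stat
from pathlib import Path

TOKEN_FILE = Path.home() / ".config" / "runcomfy" / "token.json"


def credential_source(env=None, token_file=TOKEN_FILE):
    """Return "env", "file", or "none" depending on what is available."""
    env = os.environ if env is None else env
    if env.get("RUNCOMFY_TOKEN"):
        return "env"  # the env var bypasses the file entirely
    if Path(token_file).is_file():
        return "file"
    return "none"


def is_owner_only(path):
    """True when group and others have no permission bits (e.g. mode 0600)."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return mode & 0o077 == 0
```

A result of "none" corresponds to the situation the CLI reports as exit code 77.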