This is one half of a two-server toolkit that wraps Google's Gemini media APIs. The NanoBanana server provides four tools for image generation via NanoBanana Pro 2, with both Pro and Flash models available. It offers 4K output by default, aspect ratio control, negative prompting, and Google Search grounding for factual image content. The Pro model runs on Gemini 3 for maximum quality, while Flash uses Gemini 3.1 for speed. You can upload reference images for conditioning and track generation stats across sessions. The server requires a Gemini API key, writes output to a configurable directory, and uses async patterns so heavy renders don't hit timeouts. The repo also includes a seven-layer prompt engineering skill for photorealistic results if you're using Claude Code.
All-in-one MCP toolkit for AI media generation -- VEO 3.1 video + NanoBanana Pro 2 images + prompting skills
What's Included • Quick Start • VEO Server • NanoBanana Server • Skills • Contributing
| CLI | Install check | Version check |
|---|---|---|
| `veo-mcp-server` | `veo-mcp-server --help` | `veo-mcp-server --version` |
| `nanobanana-imagen-mcp` | `nanobanana-imagen-mcp --help` | `nanobanana-imagen-mcp --version` |
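
The checks in the table can also be scripted. The following is a minimal sketch, assuming both CLIs are installed onto `PATH` (for example via `pip install`); the helper name `cli_version` is hypothetical and not part of either package.

```python
import shutil
import subprocess
from typing import Optional

# CLI names and the --version flag are taken from the table above.
CLIS = ["veo-mcp-server", "nanobanana-imagen-mcp"]


def cli_version(name: str, timeout: float = 30.0) -> Optional[str]:
    """Return the output of `<name> --version`, or None if the CLI is unavailable."""
    path = shutil.which(name)
    if path is None:
        return None  # not installed, or not on PATH
    try:
        result = subprocess.run(
            [path, "--version"],
            capture_output=True,
            text=True,
            timeout=timeout,
            check=False,
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    if result.returncode != 0:
        return None
    # Some CLIs print the version on stderr, so fall back to it.
    return (result.stdout or result.stderr).strip()


if __name__ == "__main__":
    for name in CLIS:
        version = cli_version(name)
        print(f"{name}: {version if version is not None else 'not found on PATH'}")
```

If you only run the servers through `uvx`, the commands may not be on `PATH`, and a "not found" result here does not mean the servers are unusable.
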
Gemini Media MCP is a comprehensive toolkit that brings Google's most powerful AI media generation models into any MCP-compatible AI assistant. Generate 4K videos with VEO 3.1, create stunning images with NanoBanana Pro 2, and craft professional prompts with built-in skills -- all from a single repository.
| Server | Description | Tools |
|---|---|---|
| VEO 3.1 | AI video generation (text-to-video, image-to-video, extend, interpolate) | 9 tools |
| NanoBanana | AI image generation with NanoBanana Pro 2 (Pro + Flash models) | 4 tools |

| Skill | Description |
|---|---|
| VEO Prompting | 7-layer prompt engineering for cinematic VEO 3.1 videos |
| NanoBanana Prompting | 7-layer prompt engineering for photorealistic NanoBanana Pro 2 images |
Install skills via Claude Code:

```
/plugin marketplace add u2n4/gemini-media-mcp
```
This repo hosts two independent MCP servers. Install whichever you need — or both. Each server publishes to PyPI separately.
uvx (zero-install, recommended):

Add the appropriate server block to your MCP client config (Claude Desktop, Claude Code, Cursor, VS Code, Windsurf). The blocks are listed per server below (see the VEO Server and NanoBanana Server sections).
From source:

```bash
git clone https://github.com/u2n4/gemini-media-mcp.git
cd gemini-media-mcp

# Create a virtual environment (uv pip install requires one, or pass --system)
uv venv
source .venv/bin/activate        # macOS / Linux
# .venv\Scripts\activate         # Windows PowerShell

# Install one or both sub-packages
uv pip install -e servers/veo
uv pip install -e servers/nanobanana
```
uvx (zero-install):

```json
{
  "mcpServers": {
    "veo": {
      "command": "uvx",
      "args": ["veo-mcp-server"],
      "env": {
        "GEMINI_API_KEY": "your_key",
        "VIDEO_OUTPUT_DIR": "./videos"
      }
    }
  }
}
```

Claude Code:

```bash
claude mcp add veo -s user -e GEMINI_API_KEY=your_key -- uvx veo-mcp-server
```

pip install:

```bash
pip install veo-mcp-server
```
uvx (zero-install):

```json
{
  "mcpServers": {
    "nanobanana": {
      "command": "uvx",
      "args": ["nanobanana-imagen-mcp"],
      "env": {
        "GEMINI_API_KEY": "your_key"
      }
    }
  }
}
```

Claude Code:

```bash
claude mcp add nanobanana -s user -e GEMINI_API_KEY=your_key -- uvx nanobanana-imagen-mcp
```

pip install:

```bash
pip install nanobanana-imagen-mcp
```
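
Because the two servers are independent, a client config can register both at once. The sketch below generates such a config from the two `uvx` blocks shown above. The helper names `server_block` and `build_config` are hypothetical, and it assumes your client accepts several entries under `mcpServers`, which you should confirm in your client's documentation.

```python
import json


def server_block(package, env):
    """Build one uvx-based server entry, mirroring the blocks shown above."""
    return {"command": "uvx", "args": [package], "env": dict(env)}


def build_config(api_key, video_output_dir="./videos"):
    """Combine the VEO and NanoBanana entries into a single client config."""
    return {
        "mcpServers": {
            "veo": server_block(
                "veo-mcp-server",
                {"GEMINI_API_KEY": api_key, "VIDEO_OUTPUT_DIR": video_output_dir},
            ),
            "nanobanana": server_block(
                "nanobanana-imagen-mcp",
                {"GEMINI_API_KEY": api_key},
            ),
        }
    }


if __name__ == "__main__":
    # Replace "your_key" with a real key before pasting the output into a config file.
    print(json.dumps(build_config("your_key"), indent=2))
```
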
AI video generation powered by Google VEO 3.1. Uses an async job pattern where generation starts in the background and returns a job ID for polling -- no timeouts.
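
The async job pattern described above amounts to a polling loop on the client side. The following is a minimal sketch of that loop, not the server's implementation: `check_job` stands in for whatever function your client uses to call the `veo_check_job` tool, and the `{"status": ...}` result shape is an assumption. Only the terminal states (completed, failed) and the 15-20 second interval come from this README.

```python
import time
from typing import Callable, Dict

# Terminal states as named in the veo_check_job description.
TERMINAL_STATES = {"completed", "failed"}


def wait_for_job(
    check_job: Callable[[str], Dict[str, str]],
    job_id: str,
    interval_s: float = 15.0,
    max_wait_s: float = 1800.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, str]:
    """Poll check_job(job_id) until the job completes or fails.

    check_job is a hypothetical callable; it is assumed to return a dict
    with a "status" key. Raises TimeoutError once max_wait_s would be exceeded.
    """
    waited = 0.0
    while True:
        job = check_job(job_id)
        if job.get("status") in TERMINAL_STATES:
            return job
        if waited + interval_s > max_wait_s:
            raise TimeoutError(
                f"job {job_id} still {job.get('status')!r} after {waited:.0f}s"
            )
        sleep(interval_s)
        waited += interval_s


if __name__ == "__main__":
    # Demo with a fake job that finishes on the third check; sleeping is stubbed out.
    statuses = iter(["running", "running", "completed"])
    result = wait_for_job(
        lambda jid: {"job_id": jid, "status": next(statuses)},
        "demo-job",
        sleep=lambda seconds: None,
    )
    print(result)
```

Injecting `sleep` keeps the loop testable without waiting in real time; in normal use the default `time.sleep` applies.
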
| Tool | Description |
|---|---|
| `veo_generate_video` | Generate video from text prompt. Supports 720p/1080p/4K, 16:9 or 9:16, 4/6/8 second duration, negative prompts, reference images, seed control, and batch generation (1-4 videos). |
| `veo_image_to_video` | Animate a reference image with a motion prompt. |
| `veo_interpolate_video` | Create smooth transition between two frames (first frame + last frame). |
| `veo_extend_video` | Extend an existing VEO video by ~7 seconds. 720p only, max 148 seconds total. |
| `veo_check_job` | Check async job status. Call every 15-20 seconds until completed or failed. |
| `veo_list_jobs` | List all generation jobs and their current status. |
| `veo_api_status` | Check API key status -- keys configured, active key, keys remaining. |
| `veo_pricing_info` | Show pricing per second for standard and fast models at all resolutions. |
| `veo_show_output_stats` | Display generation statistics -- video count, total size, file details, job statuses. |

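
The `veo_extend_video` limits translate into simple arithmetic. The sketch below estimates how many extensions fit under the 148-second cap. It treats each extension as exactly 7 seconds, although the table says roughly 7, and the function names are hypothetical, so read the results as estimates rather than guarantees.

```python
EXTENSION_S = 7               # "~7 seconds" per extension, treated as exact here
MAX_TOTAL_S = 148             # maximum total length stated for veo_extend_video
ALLOWED_INITIAL_S = (4, 6, 8) # initial durations supported by veo_generate_video


def max_extensions(initial_s: int) -> int:
    """Number of whole extensions that fit under the cap for a given initial clip."""
    if initial_s not in ALLOWED_INITIAL_S:
        raise ValueError(f"initial duration must be one of {ALLOWED_INITIAL_S}")
    return (MAX_TOTAL_S - initial_s) // EXTENSION_S


def extensions_needed(initial_s: int, target_s: int) -> int:
    """Fewest extensions needed to reach at least target_s seconds."""
    if target_s > MAX_TOTAL_S:
        raise ValueError(f"target exceeds the {MAX_TOTAL_S}s cap")
    if target_s <= initial_s:
        return 0
    needed = -(-(target_s - initial_s) // EXTENSION_S)  # ceiling division
    if needed > max_extensions(initial_s):
        raise ValueError("target is not reachable under the cap from this initial clip")
    return needed


if __name__ == "__main__":
    print(max_extensions(8))         # 20, since 8 + 20 * 7 = 148
    print(extensions_needed(8, 60))  # 8, since 8 + 8 * 7 = 64 >= 60
```

Since extension is 720p only, a clip you plan to extend presumably needs to be generated at 720p in the first place.
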
| Variable | Description | Default |
|---|---|---|
| `GEMINI_API_KEY` | Primary API key (required) | -- |
| `GEMINI_API_KEY_BACKUP` | Backup key for auto-rotation | -- |
| `VIDEO_OUTPUT_DIR` | Output directory for videos | `~/veo-videos` |

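
To illustrate how these variables could interact, here is a hedged sketch of environment resolution and key rotation. It is not the server's code: the README does not say what triggers auto-rotation, so `PermissionError` is used purely as a stand-in for a quota or auth failure, and every function name is hypothetical. Only the variable names and the `~/veo-videos` default come from the table above.

```python
import os
from pathlib import Path
from typing import Callable, List, Mapping, TypeVar

T = TypeVar("T")


def api_keys(env: Mapping[str, str]) -> List[str]:
    """Return the configured keys in priority order: primary first, then backup."""
    primary = env.get("GEMINI_API_KEY", "").strip()
    if not primary:
        raise RuntimeError("GEMINI_API_KEY is required")
    keys = [primary]
    backup = env.get("GEMINI_API_KEY_BACKUP", "").strip()
    if backup and backup != primary:
        keys.append(backup)
    return keys


def output_dir(env: Mapping[str, str]) -> Path:
    """Resolve VIDEO_OUTPUT_DIR, falling back to the documented default."""
    return Path(env.get("VIDEO_OUTPUT_DIR") or "~/veo-videos").expanduser()


def call_with_rotation(keys: List[str], request: Callable[[str], T]) -> T:
    """Try request(key) with each key in turn (assumed rotation behaviour)."""
    last_error = None
    for key in keys:
        try:
            return request(key)
        except PermissionError as exc:  # stand-in for a quota/auth failure
            last_error = exc
    raise RuntimeError("all configured API keys failed") from last_error


if __name__ == "__main__":
    if os.environ.get("GEMINI_API_KEY"):
        print(f"{len(api_keys(os.environ))} key(s) configured")
    print(f"videos would be written to {output_dir(os.environ)}")
```
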
| Tier | Model | Best For |
|---|---|---|
| Standard | veo-3.1-generate-preview | Higher quality output |
| Fast | veo-3.1-fast-generate-preview | Quicker generation |
AI image generation powered by NanoBanana Pro 2. Supports Pro (maximum quality) and Flash (fast) models with default 4K resolution.
| Tool | Description |
|---|---|
| `generate_image` | Generate images using NanoBanana Pro 2 (Pro or Flash). Supports aspect ratio, resolution (up to 4K), negative prompts, thinking level, grounding, and reference images. |
| `upload_file` | Upload reference image for editing or conditioning. |
| `show_output_stats` | Display generation statistics -- image count, total size, file details. |
| `maintenance` | Server maintenance and cleanup -- clear caches, remove temporary files. |

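
An MCP client invokes these tools with a JSON-RPC `tools/call` request. The sketch below builds such a request for `generate_image`. The envelope follows the MCP specification, but the argument names (`prompt`, `model`, `aspect_ratio`, `negative_prompt`) are assumptions made for illustration. The authoritative input schema is whatever the server reports through `tools/list`.

```python
import json
from typing import Any, Dict


def tool_call_request(request_id: int, tool: str, arguments: Dict[str, Any]) -> Dict[str, Any]:
    """Build a JSON-RPC 2.0 `tools/call` request as defined by the MCP spec."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }


# Hypothetical argument names; check the schema from `tools/list` before use.
request = tool_call_request(
    1,
    "generate_image",
    {
        "prompt": "A lighthouse at dusk, long exposure",
        "model": "flash",
        "aspect_ratio": "16:9",
        "negative_prompt": "text, watermark",
    },
)

print(json.dumps(request, indent=2))
```

In practice an MCP-compatible assistant builds this request for you; the sketch only shows what travels over the wire.
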
| Model | Engine | Best For |
|---|---|---|
| Pro | Gemini 3 Pro Image | Maximum quality, complex scenes |
| Flash | Gemini 3.1 Flash Image | Fast generation, simple scenes |
7-layer prompt engineering system for VEO 3.1:
7-layer prompt engineering system for NanoBanana Pro 2:
```
gemini-media-mcp/
├── servers/
│   ├── veo/                          # VEO 3.1 MCP Server (PyPI: veo-mcp-server)
│   │   ├── pyproject.toml
│   │   ├── requirements.txt
│   │   └── src/
│   │       └── veo_mcp_server/
│   │           ├── __init__.py
│   │           ├── __main__.py
│   │           └── server.py
│   └── nanobanana/                   # NanoBanana MCP Server (PyPI: nanobanana-imagen-mcp)
│       ├── pyproject.toml
│       ├── requirements.txt
│       └── nanobanana_mcp_server/    # Package
├── skills/
│   ├── veo-prompting/                # VEO prompting skill
│   │   └── SKILL.md
│   └── nanobanana-prompting/         # NanoBanana prompting skill
│       └── SKILL.md
├── plugins/                          # Claude Code Plugin Marketplace
│   ├── veo-prompting/
│   │   ├── .claude-plugin/
│   │   │   └── plugin.json
│   │   └── skills/
│   │       └── veo-prompting/
│   │           └── SKILL.md
│   └── nanobanana-prompting/
│       ├── .claude-plugin/
│       │   └── plugin.json
│       └── skills/
│           └── nanobanana-prompting/
│               └── SKILL.md
├── .claude-plugin/
│   └── marketplace.json
├── .env.example
├── .gitignore
├── CHANGELOG.md
├── CONTRIBUTING.md
├── LICENSE
├── README.md
├── llms.txt
└── llms-install.md
```
See CONTRIBUTING.md.
MIT -- see LICENSE.
If you find this useful, please star this repository!
Made with ❤️ in the Eastern Province of Saudi Arabia.