Gemini Image

jimothysnicket/gemini-image-mcp
Summary

Connects Claude to Google's Gemini image generation API with two tools: generate_image for text-to-image, multi-turn editing with reference images, and iterative refinement via session IDs, plus process_image for local operations like cropping, background removal, chroma keying, and format conversion. Built on Gemini's current generateContent API rather than the deprecated Imagen service. Includes built-in cost tracking per request and session, configurable rate limits to prevent runaway agent spending, and auto-discovery of available models from your API key. Saves outputs with auto-versioning, maintains a generation manifest in JSONL, and supports aspect ratios from 1:1 to 21:9 at 1K/2K/4K resolutions. Good fit if you need programmatic image generation with cost controls or want to batch process images locally without hitting APIs.

gemini-image-mcp

A simple, focused MCP server for Google Gemini's native image generation — the "Nano Banana" models. Generate, edit, and locally process images from Claude Code, Claude Desktop, or any stdio-based MCP client. Two tools, no bloat.

Built for agents: a single call returns a saved image — or, with one-call background removal, a ready-to-use transparent PNG — without streaming image data through your agent's context. Uses Gemini's generateContent API (not the deprecated Imagen API).

Install

npm install -g @jimothy-snicket/gemini-image-mcp

Or use directly with npx:

npx -y @jimothy-snicket/gemini-image-mcp

Claude Code (one command):

claude mcp add gemini-image -- npx -y @jimothy-snicket/gemini-image-mcp

Requires a GEMINI_API_KEY environment variable — see Setup for details.

Set up a config file (optional):

npx @jimothy-snicket/gemini-image-mcp --init

Creates ~/.gemini-image-mcp.json with commented defaults. For project-specific overrides:

npx @jimothy-snicket/gemini-image-mcp --init --local

Features

generate_image — AI-powered

  • Text-to-image — describe what you want, get an image
  • Image editing — provide reference images and an editing instruction
  • Transparent assets in one call — removeBackground returns a clean transparent PNG: a local AI matte (works on any subject; optional add-on, see below) by default, or built-in green-screen / white-threshold keying. No extra API cost
  • Multi-turn edits — pass a sessionId to refine an image across calls, with prior turns kept as context
  • Multi-image input — up to ~14 reference images on gemini-3.1-flash-image (~11 on gemini-3-pro-image)
  • Cost reporting — every response includes token counts, estimated USD cost, and session totals
  • Rate limiting — configurable per-hour caps on requests and cost to prevent runaway agents
  • Auto model discovery — detects available image models from your API key at startup
  • Seed — reproducible generation with integer seeds
  • Google Search grounding — real-world accuracy on the gemini-3.x image models

process_image — Local (free, no API calls)

  • Crop — pixel-exact, aspect ratio (center), or focal point (attention/entropy)
  • Resize — to width, height, or both (maintains aspect ratio)
  • Background removal — threshold-based (white backgrounds) or chroma key (green screen, any solid colour)
  • Chroma key pipeline — HSV keying with smoothstep feather, spill suppression, and edge anti-aliasing
  • Trim — auto-remove whitespace borders
  • Format conversion — PNG, JPEG, WebP with quality control

Both tools

  • Output organization — meaningful filenames with auto-versioning, subfolders
  • Generation manifest — generations.jsonl logs every generation with prompt, params, cost
  • Full aspect ratio support — 1:1, 16:9, 9:16, 3:2, 2:3, 4:3, 3:4, 21:9
  • Resolution control — 1K, 2K, 4K
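
Because the manifest is JSONL (one JSON object per line), it is easy to post-process. The sketch below parses manifest text and totals estimated cost per model. The field names `prompt`, `model`, and `estimatedCost` are assumptions made for illustration; inspect a real generations.jsonl for the actual keys before relying on them.

```typescript
// Hypothetical shape of one manifest line; the real keys may differ.
interface ManifestEntry {
  prompt: string;
  model: string;
  estimatedCost: string; // assumed to look like "$0.0390", as in tool responses
}

// Parse JSONL: one JSON object per non-empty line.
function parseManifest(jsonl: string): ManifestEntry[] {
  return jsonl
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .map((line) => JSON.parse(line) as ManifestEntry);
}

// Sum the estimated cost per model, in USD.
function costByModel(entries: ManifestEntry[]): Map<string, number> {
  const totals = new Map<string, number>();
  for (const entry of entries) {
    const usd = Number.parseFloat(entry.estimatedCost.replace("$", ""));
    if (Number.isNaN(usd)) continue; // skip entries whose cost is not a number
    totals.set(entry.model, (totals.get(entry.model) ?? 0) + usd);
  }
  return totals;
}
```

In practice the input string would come from reading generations.jsonl in your output directory.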

Setup

1. Get a Gemini API Key

Go to Google AI Studio and create an API key. It's free to start with generous rate limits.

2. Set the API Key

The server reads your key from the GEMINI_API_KEY environment variable. Set it once so it's available in every session:

Windows (PowerShell; setting a User-scope variable does not need an elevated shell):

[System.Environment]::SetEnvironmentVariable('GEMINI_API_KEY', 'your-key-here', 'User')

Then restart your terminal.

macOS / Linux:

echo 'export GEMINI_API_KEY="your-key-here"' >> ~/.bashrc
source ~/.bashrc

(Use ~/.zshrc if you're on zsh.)

Verify it's set (in PowerShell, run echo $env:GEMINI_API_KEY instead):

echo $GEMINI_API_KEY

3. Connect to Your MCP Client

Pick the method that matches how you use MCP:

Claude Code (one-liner)

claude mcp add gemini-image -- npx -y @jimothy-snicket/gemini-image-mcp

Claude Code will pick up GEMINI_API_KEY from your environment automatically.

Claude Code (manual .mcp.json)

Add to .mcp.json in your project root or ~/.claude/.mcp.json for global access:

{
  "mcpServers": {
    "gemini-image": {
      "command": "npx",
      "args": ["-y", "@jimothy-snicket/gemini-image-mcp"],
      "env": {
        "GEMINI_API_KEY": "${GEMINI_API_KEY}"
      }
    }
  }
}

The ${GEMINI_API_KEY} syntax reads the value from your shell environment — your actual key never gets written into config files.

Claude Desktop

Edit claude_desktop_config.json:

  • macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
  • Windows: %APPDATA%\Claude\claude_desktop_config.json

{
  "mcpServers": {
    "gemini-image": {
      "command": "npx",
      "args": ["-y", "@jimothy-snicket/gemini-image-mcp"],
      "env": {
        "GEMINI_API_KEY": "${GEMINI_API_KEY}"
      }
    }
  }
}

Restart Claude Desktop after saving.

Other MCP Clients

Any client that supports stdio transport works. Point it at npx -y @jimothy-snicket/gemini-image-mcp and pass GEMINI_API_KEY in the environment.
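
For a client you write yourself, it may help to see what goes over the wire. MCP's stdio transport carries newline-delimited JSON-RPC 2.0 messages, and a tool is invoked with the `tools/call` method. The sketch below only builds and frames such a request; it assumes the usual `initialize` handshake has already completed, and the helper names `buildToolCall` and `frame` are made up for this example.

```typescript
interface JsonRpcRequest {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params: Record<string, unknown>;
}

// Build an MCP tools/call request for the named tool.
function buildToolCall(id: number, tool: string, args: Record<string, unknown>): JsonRpcRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name: tool, arguments: args } };
}

// stdio framing: one JSON message per line, with no embedded newlines.
function frame(message: JsonRpcRequest): string {
  return JSON.stringify(message) + "\n";
}

const request = buildToolCall(2, "generate_image", {
  prompt: "A hero image for a SaaS landing page, modern gradient style",
  aspectRatio: "16:9",
});
console.log(frame(request));
```

The framed string is what a client would write to the server process's stdin. In most cases an existing MCP client library handles this for you.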

Security Notes

  • Never commit your API key to version control. The ${GEMINI_API_KEY} syntax in config files references your environment — the key itself stays in your shell profile.
  • If your .mcp.json is in a project repo, add it to .gitignore or use the global config at ~/.claude/.mcp.json instead.
  • For extra security, you can use a wrapper script that reads the key from your OS keychain (macOS Keychain, Windows Credential Manager) and launches the server with it injected.

Configuration

All optional. The only required setup is GEMINI_API_KEY (covered above).

| Variable | Default | Description |
| --- | --- | --- |
| OUTPUT_DIR | ~/gemini-images | Default directory for saved images |
| DEFAULT_MODEL | gemini-2.5-flash-image | Default Gemini model |
| LOG_LEVEL | info | debug, info, or error |
| REQUEST_TIMEOUT_MS | 60000 | API request timeout in milliseconds |
| MAX_REQUESTS_PER_HOUR | 0 (unlimited) | Max image generations per rolling hour |
| MAX_COST_PER_HOUR | 0 (unlimited) | Max estimated cost (USD) per rolling hour |
| SESSION_TIMEOUT_MS | 1800000 (30 min) | Multi-turn session expiry |
| GEMINI_IMAGE_AUTO_INSTALL | 1 (on) | Auto-install the AI matte engine on first removeBackground: { mode: "auto" } use. Set 0 to disable (then auto falls back to chroma/threshold with instructions) |

Set these the same way as GEMINI_API_KEY, or pass them in the env block of your MCP config.

Rate limiting is recommended when agents have access to this tool. An agent in a loop can generate images quickly — set MAX_REQUESTS_PER_HOUR=20 and MAX_COST_PER_HOUR=5 as sensible defaults.
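
To make "per rolling hour" concrete, here is a minimal sketch of a rolling-window limiter that enforces both caps. It illustrates the idea only and is not the server's actual implementation; the class and field names are invented for this example, and 0 is treated as "unlimited" to match the defaults in the table.

```typescript
interface Limits {
  maxRequestsPerHour: number; // 0 = unlimited
  maxCostPerHour: number;     // USD, 0 = unlimited
}

class RollingHourLimiter {
  private readonly limits: Limits;
  private readonly windowMs: number;
  private events: { at: number; cost: number }[] = [];

  constructor(limits: Limits, windowMs: number = 60 * 60 * 1000) {
    this.limits = limits;
    this.windowMs = windowMs;
  }

  // True if one more generation may start at time `now` (milliseconds).
  allow(now: number): boolean {
    // Keep only the events that are still inside the rolling window.
    this.events = this.events.filter((e) => now - e.at < this.windowMs);
    const spent = this.events.reduce((sum, e) => sum + e.cost, 0);
    const { maxRequestsPerHour, maxCostPerHour } = this.limits;
    if (maxRequestsPerHour > 0 && this.events.length >= maxRequestsPerHour) return false;
    if (maxCostPerHour > 0 && spent >= maxCostPerHour) return false;
    return true;
  }

  // Record a completed generation and its estimated cost in USD.
  record(now: number, cost: number): void {
    this.events.push({ at: now, cost });
  }
}
```

Unlike a fixed-hour counter that resets on the clock hour, a rolling window frees capacity gradually as old generations age out.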

Config File

Instead of environment variables, you can use a JSON config file. Create one with:

npx @jimothy-snicket/gemini-image-mcp --init

This creates ~/.gemini-image-mcp.json with all defaults and inline documentation. Edit it to set your preferences.

Priority: env vars > local config (.gemini-image-mcp.json in CWD) > global config (~/.gemini-image-mcp.json) > defaults.
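
The priority order amounts to a layered merge in which later layers overwrite earlier ones. A minimal sketch follows, assuming each source has already been read into a flat object. Apart from defaultModel, the key names are illustrative guesses, as is the mapping from environment variable names to config keys.

```typescript
type Settings = Record<string, string | number>;

// Later spreads overwrite earlier ones, so the argument order encodes the priority:
// defaults < global config < local config < environment variables.
function resolveSettings(
  defaults: Settings,
  globalFile: Settings,
  localFile: Settings,
  env: Settings,
): Settings {
  return { ...defaults, ...globalFile, ...localFile, ...env };
}

const resolved = resolveSettings(
  { outputDir: "~/gemini-images", defaultModel: "gemini-2.5-flash-image", maxRequestsPerHour: 0 },
  { defaultModel: "gemini-3.1-flash-image", maxRequestsPerHour: 20 }, // ~/.gemini-image-mcp.json
  { outputDir: "./assets" },                                           // .gemini-image-mcp.json in CWD
  { maxRequestsPerHour: 10 },                                          // environment variables
);
console.log(resolved);
```

Here the environment wins for maxRequestsPerHour, the local file wins for outputDir, and the global file supplies defaultModel because nothing higher up sets it.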

You can also set per-tool defaults so every request uses your preferred settings:

{
  "defaultModel": "gemini-3.1-flash-image",
  "defaults": {
    "generate": {
      "aspectRatio": "16:9",
      "resolution": "2K"
    },
    "process": {
      "removeBackground": { "color": "#00FF00" },
      "trim": true
    }
  }
}

Per-request parameters always override config defaults.

Custom pricing. Cost estimates come from a built-in per-token rate table (there's no pricing API to fetch live). If you use a model the table doesn't know yet — or Google changes a rate before this package updates — add pricingOverrides so cost reporting stays accurate without waiting for a release:

{
  "pricingOverrides": {
    "some-new-image-model": {
      "inputPerMillion": 0.5,
      "textOutputPerMillion": 60,
      "imageOutputPerMillion": 60,
      "thinkingPerMillion": 60
    }
  }
}

Models with no entry (built-in or override) still generate — their cost is reported as unknown rather than guessed.
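
As a rough illustration of how per-million rates turn into a cost figure, the sketch below applies a rate entry to token counts shaped like the usage block of a response. The exact formula the server uses is not documented here. In particular, treating text output as outputTokens minus imageTokens is an assumption, and `estimateCost` is a made-up helper name.

```typescript
interface Rates {
  inputPerMillion: number;
  textOutputPerMillion: number;
  imageOutputPerMillion: number;
  thinkingPerMillion: number;
}

interface Usage {
  promptTokens: number;
  outputTokens: number; // assumed to include imageTokens
  imageTokens: number;
  thinkingTokens: number;
}

// Returns the estimated USD cost, or null when no rate entry exists for the model.
function estimateCost(
  model: string,
  usage: Usage,
  builtIn: Record<string, Rates>,
  overrides: Record<string, Rates>,
): number | null {
  // An override takes precedence over the built-in table.
  const rates: Rates | undefined = overrides[model] ?? builtIn[model];
  if (rates === undefined) return null; // unknown rather than guessed
  const textTokens = Math.max(0, usage.outputTokens - usage.imageTokens);
  const total =
    usage.promptTokens * rates.inputPerMillion +
    textTokens * rates.textOutputPerMillion +
    usage.imageTokens * rates.imageOutputPerMillion +
    usage.thinkingTokens * rates.thinkingPerMillion;
  return total / 1e6;
}
```

With the override rates shown above (0.5 for input, 60 for the rest), 1,000 prompt tokens plus 1,100 output tokens of which 1,000 are image tokens would come to (500 + 6,000 + 60,000) / 1,000,000 = $0.0665 under these assumptions.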

Tool: generate_image

Parameters

| Parameter | Required | Description |
| --- | --- | --- |
| prompt | Yes | Text description or editing instruction |
| images | No | Array of file paths to input/reference images |
| model | No | Gemini model ID |
| aspectRatio | No | 1:1, 2:3, 3:2, 3:4, 4:3, 4:5, 5:4, 9:16, 16:9, 21:9, plus 1:4, 4:1, 1:8, 8:1 (gemini-3.1-flash-image). Validated by the API. |
| resolution | No | 512 (gemini-3.1-flash-image only), 1K, 2K, 4K |
| outputDir | No | Override output directory for this request |
| filename | No | Base name for saved file (e.g. hero-banner). Auto-versioned if duplicate. |
| subfolder | No | Subfolder within output directory (e.g. landing-page) |
| sessionId | No | Continue a multi-turn editing session from a previous response |
| seed | No | Integer seed for reproducible generation |
| useSearchGrounding | No | Enable Google Search grounding (gemini-3.x image models) |
| removeBackground | No | Return a transparent PNG cutout. { "mode": "auto" } = local AI matte (any subject; default); { "mode": "chroma" } = green screen; { "mode": "threshold" } = white removal (line art). No extra API cost |

Example Response

{
  "imagePath": "/home/user/gemini-images/hero-banner.png",
  "mimeType": "image/png",
  "model": "gemini-2.5-flash-image",
  "sessionId": "session-1711929600000-a1b2c3",
  "sessionTurn": 1,
  "usage": {
    "promptTokens": 5,
    "outputTokens": 1295,
    "imageTokens": 1290,
    "thinkingTokens": 412,
    "totalTokens": 1712,
    "estimatedCost": "$0.0390",
    "pricingVerifiedDate": "2026-06-15"
  },
  "session": {
    "generationsThisSession": 3,
    "totalCostThisSession": "$0.1161",
    "generationsThisHour": 5,
    "limit": {
      "maxPerHour": 20,
      "maxCostPerHour": 5,
      "remainingThisHour": 15
    }
  }
}

Usage Examples

Text-to-image:

"Generate a hero image for a SaaS landing page, modern gradient style, 16:9"

Image editing:

"Take this screenshot and redesign the header with a dark theme" (with image paths)

Iterative editing (multi-turn):

Generate an image, then call again with the returned sessionId and a refinement like "make it more minimal" — the prior image stays in context.
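
From the caller's side, a multi-turn edit is just a second generate_image call that carries the sessionId from the first response. The sketch below builds the arguments for that follow-up call. The `refine` helper is hypothetical, and the interfaces list only the response and parameter fields this example touches.

```typescript
// Subset of the generate_image response used here.
interface GenerateResponse {
  imagePath: string;
  sessionId: string;
  sessionTurn: number;
}

// Subset of the generate_image parameters used here.
interface GenerateArgs {
  prompt: string;
  sessionId?: string;
}

// Build the arguments for a follow-up turn in the same session.
function refine(previous: GenerateResponse, instruction: string): GenerateArgs {
  return { prompt: instruction, sessionId: previous.sessionId };
}

const firstTurn: GenerateResponse = {
  imagePath: "/home/user/gemini-images/hero-banner.png",
  sessionId: "session-1711929600000-a1b2c3",
  sessionTurn: 1,
};
console.log(refine(firstTurn, "make it more minimal"));
```

Sessions expire after SESSION_TIMEOUT_MS (30 minutes by default), so a follow-up should be sent before then.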

Organized output:

"Generate a hero banner" with filename: "hero", subfolder: "landing-page" → saves to ~/gemini-images/landing-page/hero.png

High quality:

"A photorealistic product shot of headphones on marble, 4K" (using gemini-3-pro-image)

Transparent asset (one call):

"A glossy red sneaker, product shot" with removeBackground: { "mode": "auto" } → a ready-to-place transparent PNG. The local AI matte works on any subject — no green screen needed.

Tool: process_image

Local image processing via sharp. Free, fast, no API calls.

Parameters

| Parameter | Required | Description |
| --- | --- | --- |
| imagePath | Yes | Path to the image file to process |
| crop | No | Crop by pixel dimensions, aspect ratio, or focal point strategy |
| resize | No | Resize to width/height (maintains aspect ratio) |
| removeBackground | No | Remove background: { "mode": "auto" } (AI matte, any subject), { "mode": "chroma" } (green screen), or { "mode": "threshold" } (white). Defaults to chroma if color set, else threshold |
| trim | No | Auto-remove whitespace/transparent borders |
| format | No | Convert to png, jpeg, or webp |
| quality | No | Output quality for JPEG/WebP (1-100) |
| filename | No | Base name for saved file. Auto-versioned if duplicate. |
| subfolder | No | Subfolder within output directory |
| outputDir | No | Override output directory |

Crop Options

// Pixel-exact
{"width": 500, "height": 300, "left": 100, "top": 50}

// Aspect ratio (center crop)
{"aspectRatio": "16:9"}

// Focal point — shifts crop to the most interesting region
{"aspectRatio": "16:9", "strategy": "attention"}

// Detail-based — shifts crop to the most detailed region
{"aspectRatio": "16:9", "strategy": "entropy"}
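
For the plain aspect-ratio case, a center crop reduces to a little arithmetic: take the largest rectangle of the target ratio that fits inside the source, then center it. The sketch below shows one way to compute that rectangle. It illustrates the geometry only, and `centerCrop` is a name chosen for this example. The attention and entropy strategies pick the offset differently and are not modelled here.

```typescript
interface Rect {
  left: number;
  top: number;
  width: number;
  height: number;
}

// Largest centered rectangle with the given "W:H" ratio inside the source image.
function centerCrop(srcWidth: number, srcHeight: number, aspectRatio: string): Rect {
  const [w, h] = aspectRatio.split(":").map(Number);
  if (!w || !h) throw new Error(`Invalid aspect ratio: ${aspectRatio}`);
  const target = w / h;

  // Start from full width; if the resulting height does not fit, use full height instead.
  let width = srcWidth;
  let height = Math.round(srcWidth / target);
  if (height > srcHeight) {
    height = srcHeight;
    width = Math.round(srcHeight * target);
  }

  return {
    left: Math.floor((srcWidth - width) / 2),
    top: Math.floor((srcHeight - height) / 2),
    width,
    height,
  };
}

console.log(centerCrop(1600, 1200, "16:9"));
```

The result uses the same left/top/width/height keys as the pixel-exact crop option, so it can be read as the pixel-exact equivalent of a center crop.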

Background Removal Options

// AI semantic matte — best quality, works on ANY subject
{"mode": "auto"}

// White/light background (threshold)
{"mode": "threshold", "threshold": 240}

// Green screen (chroma key)
{"mode": "chroma", "color": "#00FF00"}

// Any solid colour
{"mode": "chroma", "color": "#0000FF", "tolerance": 60}

mode: "auto" runs a local BiRefNet matte that isolates the subject semantically — so it handles hair, glass, and green/yellow subjects that chroma key can't. The matte engine isn't bundled (keeps the base install ~65 MB). On your first auto call the server auto-installs it (@huggingface/transformers, ~340 MB) plus the fp16 model (~109 MB) — a one-time pause of a minute or two, then it runs locally with no extra API cost. Set GEMINI_IMAGE_AUTO_INSTALL=0 to disable auto-install (then auto falls back to returning the image with instructions to install it manually). chroma and threshold need nothing extra.

Chroma key (mode: "chroma") uses HSV keying with smoothstep feathering, spill suppression, and 5-pass edge anti-aliasing (default tolerance 80). Use #00FF00 for AI-generated green screens — it works better than matching the exact shade Gemini produces.
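
To give a feel for hue-based keying with a smoothstep feather, here is a simplified per-pixel sketch. It is not the package's algorithm: it omits spill suppression and edge anti-aliasing, and the `innerDeg` and `outerDeg` hue thresholds and the 0.2 saturation cutoff are illustrative values that do not correspond to the tool's tolerance parameter.

```typescript
// Standard smoothstep: 0 at or below edge0, 1 at or above edge1, smooth in between.
function smoothstep(edge0: number, edge1: number, x: number): number {
  const t = Math.min(1, Math.max(0, (x - edge0) / (edge1 - edge0)));
  return t * t * (3 - 2 * t);
}

// Hue in degrees [0, 360) from 8-bit RGB; returns 0 for pure greys.
function hueOf(r: number, g: number, b: number): number {
  const max = Math.max(r, g, b);
  const delta = max - Math.min(r, g, b);
  if (delta === 0) return 0;
  let sector: number;
  if (max === r) sector = ((g - b) / delta) % 6;
  else if (max === g) sector = (b - r) / delta + 2;
  else sector = (r - g) / delta + 4;
  return (sector * 60 + 360) % 360;
}

// Alpha in [0, 1]: 0 means keyed out, 1 means fully kept.
function chromaAlpha(
  pixel: [number, number, number],
  key: [number, number, number],
  innerDeg: number = 20,
  outerDeg: number = 40,
): number {
  const [r, g, b] = pixel;
  const max = Math.max(r, g, b);
  const saturation = max === 0 ? 0 : (max - Math.min(r, g, b)) / max;
  if (saturation < 0.2) return 1; // greys and near-whites carry no reliable hue
  const diff = Math.abs(hueOf(r, g, b) - hueOf(key[0], key[1], key[2]));
  const distance = Math.min(diff, 360 - diff); // hue is circular
  return smoothstep(innerDeg, outerDeg, distance);
}
```

Pixels whose hue is within `innerDeg` of the key become fully transparent, pixels beyond `outerDeg` stay opaque, and the band in between fades smoothly, which is what softens the cutout edge.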

Note: Chroma key destroys subjects that share the key colour (green or yellow) as well as transparent or reflective subjects such as glass: a green parrot shot against a green screen simply vanishes. For those, use mode: "auto" (the AI matte preserves them), or the canvas approach: feed a solid-colour background image to generate_image and let Gemini place the subject with correct lighting. The canvas approach is still best for truly transparent objects like glass, which should transmit the final background rather than be cut out.

Common Pipelines

Subject on a specific background (canvas approach):

generate_image → "Place a [subject] on this background" with images: [solid colour canvas]

One API call. Best for yellow, green, or glass subjects where chroma key struggles.

Transparent asset (one call):

generate_image → "A product photo of <subject>" with removeBackground: {mode: "auto"}

One API call → a transparent PNG. The local AI matte works on any subject. (For truly transparent/reflective objects like glass, the canvas approach above is still best.)

Transparent asset from green screen (zero-dependency):

generate_image → "A product photo on a bright green background"
process_image → removeBackground {mode: "chroma"} + trim

Avoids the matte model entirely — best for high-contrast subjects on locked-down/offline machines.

Favicon from a generated logo:

process_image → removeBackground {threshold: 230} + trim + resize {width: 192, height: 192}

Social card from a photo:

process_image → crop {aspectRatio: "16:9", strategy: "attention"} + resize {width: 1200}

WebP conversion for web:

process_image → format: "webp" + quality: 85

Models

| Model | Strengths | Resolution | Notes |
| --- | --- | --- | --- |
| gemini-2.5-flash-image | Cheapest (~$0.04/image) | 1K | Default. Shuts down 2026-10-02 |
| gemini-3.1-flash-image | Speed + quality, Google Search grounding | 512, 1K, 2K, 4K | ~$0.07/1K image. ~14 reference images |
| gemini-3-pro-image | Best quality, text rendering | 1K, 2K, 4K | ~$0.13/1K image. ~11 reference images |

The -preview IDs (gemini-3-pro-image-preview, gemini-3.1-flash-image-preview) are still accepted during Google's cutover but retire 2026-06-25 — use the GA IDs above. The server discovers whichever image models your API key supports at startup and validates each request against that live list, so new models work without an update.
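
One plausible way to do this kind of discovery is to filter the Gemini API's model listing, whose entries carry a `name` such as `models/<id>` and a `supportedGenerationMethods` array. The sketch below works on an already-fetched list. The rule "supports generateContent and has image in the ID" is a guess at a reasonable filter, not the server's documented logic, and both helper names are invented.

```typescript
// Subset of a model entry as returned by the Gemini API model listing.
interface ListedModel {
  name: string; // e.g. "models/gemini-2.5-flash-image"
  supportedGenerationMethods?: string[];
}

// Keep models that support generateContent and look like image models (heuristic).
function imageModelIds(models: ListedModel[]): string[] {
  return models
    .filter((m) => (m.supportedGenerationMethods ?? []).includes("generateContent"))
    .map((m) => m.name.replace(/^models\//, ""))
    .filter((id) => id.includes("image"));
}

// Reject a request whose model is not in the discovered list.
function assertModelAvailable(requested: string, available: string[]): void {
  if (!available.includes(requested)) {
    throw new Error(`Model "${requested}" is not available. Available: ${available.join(", ")}`);
  }
}

const discovered = imageModelIds([
  { name: "models/gemini-2.5-flash-image", supportedGenerationMethods: ["generateContent"] },
  { name: "models/gemini-3-pro-image", supportedGenerationMethods: ["generateContent"] },
  { name: "models/gemini-2.5-flash", supportedGenerationMethods: ["generateContent"] },
  { name: "models/some-embedding-model", supportedGenerationMethods: ["embedContent"] },
]);
console.log(discovered);
```

Validating against the live list rather than a hard-coded one is what lets a newly released model ID pass without a package update.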

Development

bun install
bun run build     # TypeScript -> dist/
bun run dev       # Run directly with Bun

License

MIT

Configuration

GEMINI_API_KEY (required, secret): Google Gemini API key from https://aistudio.google.com/apikey

Category: AI & LLM Tools
Registry: active
Package: @jimothy-snicket/gemini-image-mcp
Transport: STDIO
Auth: Required
Updated: Apr 2, 2026
