Built for the Claude Code community with Claude Code by @mertduzgun

Independent project, not affiliated with Anthropic

Gemini

houtini-ai/gemini-mcp
258 tools · auth · STDIO · registry active
Summary

Wraps Google's Gemini models as MCP tools inside Claude Desktop. The real utility goes beyond chat to what Gemini does natively: grounded search that pulls live Google results before answering, Nano Banana Pro image generation with current data lookup, Veo for video synthesis, and SVG diagram rendering that outputs clean vector code you can commit straight to a repo. Deep research runs multi-iteration searches, then synthesizes a full report. Image editing maintains conversational context across turns via thought signatures. Thirteen tools total, all over stdio transport. Runs via npx with just an API key from Google AI Studio. The free tier covers most development use.


Tools

Public tool metadata for what this MCP can expose to an agent.

8 tools
GEMINI_COUNT_TOKENS (2 params)

Counts the number of tokens in text using Gemini tokenization. Useful for estimating costs, checking input limits, and optimizing prompts before making API calls.

Parameters (* required):
  • text (string): Text to count tokens for
  • model (string): Model to use for token counting. Examples: 'gemini-1.5-flash', 'gemini-1.5-pro'. Default: gemini-1.5-flash

GEMINI_EMBED_CONTENT (4 params)

Generates text embeddings using Gemini embedding models. Converts text into numerical vectors for semantic search, similarity comparison, clustering, and classification tasks.

Parameters (* required):
  • text (string): Text to generate embeddings for
  • model (string): Embedding model to use. Examples: 'text-embedding-004', 'embedding-001'. Default: text-embedding-004
  • title (string): Optional title for the content (for document embeddings)
  • task_type (string): Task type: 'RETRIEVAL_QUERY', 'RETRIEVAL_DOCUMENT', 'SEMANTIC_SIMILARITY', 'CLASSIFICATION', 'CLUSTERING'

GEMINI_GENERATE_CONTENT (9 params)

Generates text content from prompts using Gemini models. Supports various models like Gemini Flash and Pro with configurable temperature, token limits, and safety settings for diverse text generation tasks.

Parameters (* required):
  • model (string): Model to use. Examples: 'gemini-1.5-flash', 'gemini-1.5-pro', 'gemini-2.0-flash-exp'. Default: gemini-1.5-flash
  • top_k (integer): Top-k sampling parameter
  • top_p (number): Nucleus sampling parameter (0.0 to 1.0)
  • prompt (string): Text prompt for content generation
  • temperature (number): Controls randomness (0.0 to 2.0)
  • stop_sequences (array): Sequences where generation should stop
  • safety_settings (array): Safety filter settings
  • max_output_tokens (integer): Maximum number of tokens to generate
  • system_instruction (string): System instruction to guide the model's behavior

GEMINI_GENERATE_IMAGE (9 params)

Generates images from text prompts using Gemini 2.5 Flash Image Preview model (Nano Banana). Supports creative image generation with customizable parameters like aspect ratio, safety settings, and optional local file saving. Generated images are automatically uploaded to S3 an...

Parameters (* required):
  • model (string): Model to use. Use 'gemini-2.5-flash-image-preview' for image generation. Default: gemini-2.5-flash-image-preview
  • top_k (integer): Top-k sampling parameter
  • top_p (number): Nucleus sampling parameter (0.0 to 1.0)
  • prompt (string): Text prompt for image generation
  • save_path (string): Optional local path to save the generated image
  • temperature (number): Controls randomness (0.0 to 2.0)
  • safety_settings (array): Safety filter settings
  • max_output_tokens (integer): Maximum number of tokens to generate (max 32,768)
  • system_instruction (string): System instruction to guide image generation behavior

GEMINI_GENERATE_VIDEOS (4 params)

Generates videos from text prompts using Google's Veo models. Creates high-quality video content. Returns operation ID for tracking progress. After this, call GEMINI_WAIT_FOR_VIDEO to download the video using the operation ID.

Parameters (* required):
  • model (string): Model to use. Examples: 'veo-3.0-generate-preview', 'veo-3.0-fast-generate-preview', 'veo-2.0-generate-001'. Default: veo-3.0-generate-preview
  • extras (object): Additional parameters passed through to API
  • prompt (string): Text prompt for Veo video generation
  • person_generation (string): Controls person generation in videos. Values: 'allow_adult' or 'dont_allow'. IMPORTANT: Veo 3 models in EU/UK/CH/MENA regions ONLY support 'allow_adult'. Veo 2 models support both values in all regions.

GEMINI_GET_VIDEOS_OPERATION (1 param)

Checks the status of a Veo video generation operation. Use the operation name from GenerateVideos to track progress and get the download URL when complete.

Parameters (* required):
  • operation_name (string): Operation resource name returned by predictLongRunning

GEMINI_LIST_MODELS (1 param)

Lists available Gemini and Veo models with their capabilities and limits. Useful for discovering supported models and their features before making generation requests.

Parameters (* required):
  • filter_prefix (string): Filter models by name prefix (client-side). Leave empty to get all models. Default: (empty)

GEMINI_WAIT_FOR_VIDEO (1 param)

Polls a Veo video generation operation until completion, then downloads and returns the video as a FileDownloadable with public URL.

Parameters (* required):
  • operation_name (string): The operation name from video generation (e.g., 'models/...')
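
The video tools above describe a start-then-track flow: GEMINI_GENERATE_VIDEOS starts the job and GEMINI_GET_VIDEOS_OPERATION reports its status. A rough sketch of that flow as a manual polling loop follows. Everything about the response shape is an assumption: `call_tool` is a hypothetical stand-in for whatever MCP client function you use, and the `operation_name` and `done` response fields are guesses, not documented output. In practice GEMINI_WAIT_FOR_VIDEO already does the polling for you.

```python
import time
from typing import Any, Callable, Dict

# Hypothetical signature: (tool_name, arguments) -> parsed tool result.
ToolCaller = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def generate_and_wait(
    call_tool: ToolCaller,
    prompt: str,
    poll_seconds: float = 10.0,
    max_polls: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Start a Veo job, then poll its operation until it reports completion."""
    started = call_tool("GEMINI_GENERATE_VIDEOS", {"prompt": prompt})
    # Assumed field name; the listing only says an operation ID is returned.
    operation_name = started["operation_name"]

    for _ in range(max_polls):
        status = call_tool(
            "GEMINI_GET_VIDEOS_OPERATION", {"operation_name": operation_name}
        )
        if status.get("done"):  # assumed completion flag
            return status
        sleep(poll_seconds)

    raise TimeoutError(f"Operation {operation_name} did not finish in time")
```

The `sleep` parameter is injectable so the loop can be exercised without real delays.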

@houtini/gemini-mcp


I've been running this MCP server in my Claude Desktop setup for months. It's one of the few I leave on permanently — not because Gemini replaces Claude, but because grounded search, image generation, SVG diagrams, and video are things Gemini does genuinely well. Having them as tools inside Claude beats switching browser tabs.

Thirteen tools. One npx command.

Gemini MCP server


Quick Navigation

Get started | What it does | SVG generation | Image output | Configuration | Tools | Models | Requirements


What it looks like

Generated images, SVGs, and videos render inline in Claude Desktop with zoom controls, file paths, and prompt context:

[Screenshots: image generation preview, SVG / diagram generation preview, and inline image, SVG, and video embeds]

Get started in two minutes

Step 1: Get a Gemini API key

Go to Google AI Studio and create one. The free tier covers most development use — you'll hit rate limits on deep research if you're hammering it, but for day-to-day work it's fine.

Step 2: Add to your Claude Desktop config

Config file locations:

  • Windows: C:\Users\{username}\AppData\Roaming\Claude\claude_desktop_config.json
  • macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
{
  "mcpServers": {
    "gemini": {
      "command": "npx",
      "args": ["@houtini/gemini-mcp"],
      "env": {
        "GEMINI_API_KEY": "your-api-key-here"
      }
    }
  }
}

Step 3: Restart Claude Desktop

That's it. Tools show up automatically. npx pulls the package on first run — no separate install needed.
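
If you would rather script Step 2 than edit the file by hand, something like the following could work. It is only a sketch: the function names are made up for illustration, and it assumes the config file is plain JSON with a top-level `mcpServers` object, as in the snippet above. Existing servers and unrelated settings are left untouched.

```python
import json
from pathlib import Path
from typing import Any, Dict


def add_gemini_server(config: Dict[str, Any], api_key: str) -> Dict[str, Any]:
    """Return a copy of `config` with the gemini entry added under mcpServers."""
    updated = dict(config)
    servers = dict(updated.get("mcpServers", {}))  # copy so the input is not mutated
    servers["gemini"] = {
        "command": "npx",
        "args": ["@houtini/gemini-mcp"],
        "env": {"GEMINI_API_KEY": api_key},
    }
    updated["mcpServers"] = servers
    return updated


def update_config_file(path: Path, api_key: str) -> None:
    """Read the config at `path` (if it exists), add the entry, and write it back."""
    config = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    merged = add_gemini_server(config, api_key)
    path.write_text(json.dumps(merged, indent=2) + "\n", encoding="utf-8")
```

Pass the path for your platform from the list above to `update_config_file`, then restart Claude Desktop as in Step 3.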

Local build instead

For development, or if you'd rather not rely on npx:

git clone https://github.com/houtini-ai/gemini-mcp
cd gemini-mcp
npm install --include=dev
npm run build

Then point your config at the local build:

{
  "mcpServers": {
    "gemini": {
      "command": "node",
      "args": ["C:/path/to/gemini-mcp/dist/index.js"],
      "env": {
        "GEMINI_API_KEY": "your-api-key-here"
      }
    }
  }
}

Claude Code (CLI)

Claude Code uses a different registration mechanism — it doesn't read claude_desktop_config.json. Use claude mcp add instead:

claude mcp add -e GEMINI_API_KEY=your-api-key-here -s user gemini -- npx -y @houtini/gemini-mcp

With optional image output directory:

claude mcp add \
  -e GEMINI_API_KEY=your-api-key-here \
  -e GEMINI_IMAGE_OUTPUT_DIR=/path/to/output \
  -s user \
  gemini -- npx -y @houtini/gemini-mcp

Verify with claude mcp get gemini — you should see Status: Connected.


What it does

Chat with Google Search grounding

Use gemini:gemini_chat to ask: "What changed in the MCP spec in the last month?"

Grounding is on by default. Gemini searches Google before answering, so you get current information rather than training cutoff answers. Sources come back as markdown links. For questions where you want pure reasoning — "explain this code" or similar — set grounding: false.

gemini_chat supports thinking_level on Gemini 3 models: high for maximum reasoning depth, low to keep it fast. The medium and minimal levels are available on Gemini 3 Flash only.

Deep research

Use gemini:gemini_deep_research with:
  research_question="What are the current approaches to AI agent memory management?"
  max_iterations=5

Runs multiple grounded search iterations then synthesises a full report. Takes 2-5 minutes depending on complexity — worth it for anything needing comprehensive coverage rather than a quick answer.

Set max_iterations to 3-4 in Claude Desktop (4-minute tool timeout). In IDEs (Cursor, Windsurf, VS Code) or agent frameworks, 7-10 iterations produce noticeably better synthesis. Pass focus_areas as an array to steer toward specific angles.
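
For agent frameworks that talk to the server directly, the call above corresponds to an MCP `tools/call` JSON-RPC request. Here is a minimal sketch of building that message. The argument names (research_question, max_iterations, focus_areas) come from this README; the helper name and the default of 4 iterations are illustrative choices, and a real MCP client library would normally construct and send this for you.

```python
import json
from typing import Any, Dict, List, Optional


def deep_research_request(
    question: str,
    max_iterations: int = 4,
    focus_areas: Optional[List[str]] = None,
    request_id: int = 1,
) -> Dict[str, Any]:
    """Build a JSON-RPC 2.0 tools/call message for gemini_deep_research."""
    arguments: Dict[str, Any] = {
        "research_question": question,
        "max_iterations": max_iterations,
    }
    if focus_areas:
        # Only include the optional array when the caller supplies one.
        arguments["focus_areas"] = focus_areas
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "gemini_deep_research", "arguments": arguments},
    }


if __name__ == "__main__":
    request = deep_research_request(
        "What are the current approaches to AI agent memory management?",
        max_iterations=7,
        focus_areas=["vector stores", "summarisation"],
    )
    print(json.dumps(request, indent=2))
```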

Image generation with search grounding

Use gemini:generate_image with:
  prompt="Stock price chart showing Apple (AAPL) closing prices for the last 5 trading days"
  use_search=true
  aspectRatio="16:9"

Default model is gemini-3-pro-image-preview (Nano Banana Pro). Also supports gemini-2.5-flash-image for faster generation.

When use_search=true, Gemini searches Google for current data before generating. Financial and news queries work reliably. The full-resolution image saves to disk automatically — the inline preview is resized for transport but the original is untouched.

Video generation with Veo 3.1

Use gemini:generate_video with:
  prompt="A close-up shot of a futuristic coffee machine brewing a glowing blue espresso, steam rising dramatically. Cinematic lighting."
  resolution="1080p"
  durationSeconds=8

Uses Google's Veo 3.1 model. Generates 4-8 second videos at up to 4K with native synchronised audio. Processing takes 2-5 minutes — the tool polls automatically until ready.

Options worth knowing:

  • aspectRatio — 16:9 landscape or 9:16 portrait/vertical
  • generateAudio — on by default, produces dialogue and sound effects matching the prompt
  • sampleCount — generate up to 4 variations in one call
  • seed — deterministic output across runs
  • generateThumbnail — extracts a frame via ffmpeg (needs ffmpeg in PATH)
  • firstFrameImage — animate from a starting image (image-to-video)

SVG generation

This is the one people underestimate. SVG output isn't just diagrams — it's production-ready vector graphics you can drop straight into a codebase, a presentation, or a web page. Clean, scalable, no raster artefacts.

Use gemini:generate_svg with:
  prompt="Architecture diagram showing a microservices system with API gateway, three services, and a shared database"
  style="technical"
  width=1000
  height=600

Four styles:

| Style | Best for |
|---|---|
| technical | Architecture diagrams, flowcharts, system maps |
| artistic | Illustrations, decorative graphics, icons |
| minimal | Clean data visualisations, simple charts |
| data-viz | Complex charts, dashboards, infographics |

The output is actual SVG code — edit it, animate it, embed it in HTML, commit it to a repo. No rasterising, no export steps, no Figma required.

SVG generation in Claude Desktop

Image editing and analysis

Conversational editing — Gemini 3 Pro Image maintains context across editing turns. Pass thought signatures back on subsequent edit_image calls for full continuity:

Use gemini:edit_image with:
  prompt="Change the colour scheme to blue and green"
  images=[{data: imageBase64, mimeType: "image/png", thoughtSignature: "fromPreviousCall"}]

Analysis — two tools for different purposes:

  • describe_image — Fast general descriptions using Gemini 3 Flash
  • analyze_image — Structured extraction and detailed reasoning using Gemini 3.1 Pro

Load local files:

Use gemini:load_image_from_path with filePath="C:/screenshots/error.png"

Media resolution control

Reduce token usage by up to 75% whilst maintaining quality for the task:

| Level | Tokens | Savings | Best for |
|---|---|---|---|
| MEDIA_RESOLUTION_LOW | 280 | 75% | Simple tasks, bulk operations |
| MEDIA_RESOLUTION_MEDIUM | 560 | 50% | PDFs/documents (OCR saturates here) |
| MEDIA_RESOLUTION_HIGH | 1120 | default | Detailed analysis |
| MEDIA_RESOLUTION_ULTRA_HIGH | 2000+ | per-image only | Maximum detail |

For PDF OCR, MEDIUM gives identical text extraction quality to HIGH at half the tokens.
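
The savings column follows directly from the token counts, taking HIGH as the baseline. A quick sketch of the arithmetic, assuming the token figures in the table are per image or page (the table does not say so explicitly) and leaving out ULTRA_HIGH because "2000+" is not an exact figure:

```python
# Token counts copied from the table above; treated here as per-image costs.
TOKENS_PER_IMAGE = {
    "MEDIA_RESOLUTION_LOW": 280,
    "MEDIA_RESOLUTION_MEDIUM": 560,
    "MEDIA_RESOLUTION_HIGH": 1120,
}


def image_tokens(level: str, image_count: int) -> int:
    """Total media tokens for `image_count` images at the given level."""
    return TOKENS_PER_IMAGE[level] * image_count


def savings_vs_high(level: str) -> float:
    """Fraction of tokens saved relative to MEDIA_RESOLUTION_HIGH."""
    baseline = TOKENS_PER_IMAGE["MEDIA_RESOLUTION_HIGH"]
    return 1 - TOKENS_PER_IMAGE[level] / baseline


if __name__ == "__main__":
    # A 40-page PDF: 22400 tokens at MEDIUM versus 44800 at HIGH.
    print(image_tokens("MEDIA_RESOLUTION_MEDIUM", 40))
    print(image_tokens("MEDIA_RESOLUTION_HIGH", 40))
    print(savings_vs_high("MEDIA_RESOLUTION_LOW"))  # 0.75
```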

Landing page generation

Use gemini:generate_landing_page with:
  brief="A SaaS tool that helps developers monitor API latency"
  companyName="PingWatch"
  primaryColour="#6366F1"
  style="startup"
  sections=["hero", "features", "pricing", "cta"]

Returns a self-contained HTML file — inline CSS and vanilla JS, no external dependencies. Styles: minimal, bold, corporate, startup.

Professional chart design systems

gemini_prompt_assistant includes 9 professional chart design systems:

| System | Inspiration | Best for |
|---|---|---|
| storytelling | Cole Nussbaumer Knaflic | Executive presentations |
| financial | Financial Times | Editorial journalism (FT Pink, serif titles) |
| terminal | Bloomberg / Fintech | High-density dark mode with neon |
| modernist | W.E.B. Du Bois | Bold geometric blocks, stark contrasts |
| professional | IBM Carbon / Tailwind | Enterprise dashboards |
| editorial | FiveThirtyEight / Economist | Data journalism |
| scientific | Nature / Science | Academic rigour |
| minimal | Edward Tufte | Maximum data-ink ratio |
| dark | Observable | Modern dark mode |

Help system

Use gemini:gemini_help with topic="overview"

Full documentation without leaving Claude. Topics: overview, image_generation, image_editing, image_analysis, chat, deep_research, grounding, media_resolution, models, all.


Image output and storage

By default, images return as inline previews rendered directly in Claude. Set GEMINI_IMAGE_OUTPUT_DIR to auto-save everything:

"env": {
  "GEMINI_API_KEY": "your-api-key-here",
  "GEMINI_IMAGE_OUTPUT_DIR": "C:/Users/username/Pictures/gemini-output"
}

The server uses a two-tier approach to handle the MCP protocol's 1MB JSON-RPC limit whilst preserving full-resolution files:

| Tier | Purpose |
|---|---|
| Full-res | Saved to disk immediately, untouched |
| Preview | Resized JPEG for inline transport, dynamically sized to fit under the cap |

Gemini returns 2-5MB images. The resize is smart — it measures the non-image overhead in each response and calculates the exact binary budget available, stepping down dimensions (800→600→400→300→200px) until it fits. The full image is always there on disk.
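
The step-down logic described above might look roughly like this. It is a sketch under stated assumptions, not the server's actual code: the cap is taken as 1,000,000 bytes, the preview is assumed to travel base64-encoded (4 output characters per 3 raw bytes), and `encode_at_width` is a hypothetical callback that returns the JPEG bytes for a given width.

```python
from typing import Callable, Optional, Tuple

WIDTH_STEPS = (800, 600, 400, 300, 200)  # step-down sequence from the text
RESPONSE_CAP_BYTES = 1_000_000  # assumed value for the "1MB" limit


def binary_budget(overhead_bytes: int, cap: int = RESPONSE_CAP_BYTES) -> int:
    """Largest raw byte count whose base64 form fits in cap minus overhead."""
    available = cap - overhead_bytes
    if available <= 0:
        return 0
    # base64 emits 4 characters per 3-byte group, so invert that ratio.
    return (available // 4) * 3


def pick_preview(
    encode_at_width: Callable[[int], bytes],
    overhead_bytes: int,
) -> Optional[Tuple[int, bytes]]:
    """Try each width in order and return the first preview that fits."""
    budget = binary_budget(overhead_bytes)
    for width in WIDTH_STEPS:
        data = encode_at_width(width)
        if len(data) <= budget:
            return width, data
    return None  # nothing fits; the caller can fall back to the file path only
```

Because the budget is computed from the measured overhead rather than a fixed guess, a response with a long text part automatically gets a smaller preview.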


Configuration reference

| Variable | Required | Default | Description |
|---|---|---|---|
| GEMINI_API_KEY | Yes | (none) | Google AI API key from AI Studio |
| GEMINI_DEFAULT_MODEL | No | gemini-3.1-pro-preview | Default model for gemini_chat and analyze_image |
| GEMINI_DEFAULT_GROUNDING | No | true | Enable Google Search grounding by default |
| GEMINI_IMAGE_OUTPUT_DIR | No | (none) | Auto-save directory for generated images and videos |
| GEMINI_ALLOW_EXPERIMENTAL | No | false | Include experimental/preview models in auto-discovery |
| GEMINI_MCP_LOG_FILE | No | false | Write logs to ~/.gemini-mcp/logs/ |
| DEBUG_MCP | No | false | Log to stderr for debugging tool calls |
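
To make the defaults concrete, here is one way the table could be resolved from the environment. This is an illustration of the documented defaults only: the function and key names are invented, and treating just the string "true" (case-insensitive) as true is an assumption about parsing that the README does not specify.

```python
import os
from typing import Any, Dict, Mapping, Optional


def _flag(value: Optional[str], default: bool) -> bool:
    """Parse a boolean variable; unset means the documented default."""
    if value is None:
        return default
    return value.strip().lower() == "true"  # assumed parsing rule


def resolve_settings(env: Mapping[str, str] = os.environ) -> Dict[str, Any]:
    """Apply the defaults from the configuration table to `env`."""
    api_key = env.get("GEMINI_API_KEY")
    if not api_key:
        raise ValueError("GEMINI_API_KEY is required")
    return {
        "api_key": api_key,
        "default_model": env.get("GEMINI_DEFAULT_MODEL", "gemini-3.1-pro-preview"),
        "default_grounding": _flag(env.get("GEMINI_DEFAULT_GROUNDING"), True),
        "image_output_dir": env.get("GEMINI_IMAGE_OUTPUT_DIR"),  # None when unset
        "allow_experimental": _flag(env.get("GEMINI_ALLOW_EXPERIMENTAL"), False),
        "log_to_file": _flag(env.get("GEMINI_MCP_LOG_FILE"), False),
        "debug": _flag(env.get("DEBUG_MCP"), False),
    }
```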

Tools reference

| Tool | Description |
|---|---|
| gemini_chat | Chat with Gemini 3.1 Pro. Google Search grounding on by default. Supports thinking_level |
| gemini_deep_research | Multi-step iterative research with Google Search. Synthesises comprehensive reports |
| gemini_list_models | Lists available models from the Gemini API |
| gemini_help | Documentation for all features without leaving Claude |
| gemini_prompt_assistant | Expert guidance for image generation with 9 chart design systems |
| generate_image | Image generation with optional search grounding. Full-res saved to disk |
| edit_image | Edit images with natural-language instructions. Multi-turn continuity via thought signatures |
| describe_image | Fast image descriptions using Gemini 3 Flash |
| analyze_image | Structured extraction and analysis using Gemini 3.1 Pro |
| load_image_from_path | Read a local image file and return base64 for any image tool |
| generate_video | Video generation with Veo 3.1: 4-8 seconds at up to 4K with native audio |
| generate_svg | Production-ready SVG: diagrams, illustrations, icons, data visualisations |
| generate_landing_page | Self-contained HTML landing pages with inline CSS/JS |

Model reference

| Model | Used by | Notes |
|---|---|---|
| gemini-3.1-pro-preview | gemini_chat, analyze_image | Default. Advanced reasoning |
| gemini-3-pro-image-preview | generate_image, edit_image | Nano Banana Pro: highest quality image generation |
| gemini-2.5-flash-image | generate_image (optional) | Faster generation, higher volume |
| gemini-3-flash-preview | describe_image | Fast general descriptions |
| veo-3.1-generate-preview | generate_video | Veo 3.1: 4K video with native audio |

Gemini 3 notes: Temperature is forced to 1.0 on Gemini 3 models (Google's requirement — lower values cause looping). Thinking level only applies to gemini_chat.


Requirements

  • Node.js 18+
  • A Gemini API key from Google AI Studio
  • ffmpeg (optional, for video thumbnail extraction)

Licence

Apache-2.0


Configuration

GEMINI_API_KEY (required, secret): Google AI Studio API key (get from https://aistudio.google.com/apikey)

Categories: AI & LLM Tools, Communication & Messaging
Registry: active
Package: @houtini/gemini-mcp
Transport: STDIO
Auth: Required
Updated: Jan 29, 2026
