Imagine Mcp

Summary

A production-grade bridge to image and video understanding plus generation across Gemini, OpenAI, and Grok. You get two core tools: `understand` for reasoning over image and video URLs with configurable context length, and `generate` for text-to-image, image-to-image, and text-to-video workflows. Each provider exposes "poor" and "rich" tiers so you can trade speed for quality on the fly. The server ranks models weekly against Artificial Analysis and LMArena leaderboards, caches responses to disk with configurable TTL, and degrades gracefully when credentials are missing. Ships with stdio and HTTP transports, plus a config tool that surfaces relay forms, credential state checks, and runtime knobs like log level and default provider. Reach for this when you need multimodal ops without hardcoding a single provider.

imagine-mcp

mcp-name: io.github.n24q02m/imagine-mcp

Image and video understanding + generation for AI agents -- across Gemini, OpenAI, and Grok.

Sister projects from n24q02m

Project	Tagline	Tag
better-code-review-graph	Knowledge graph for token-efficient code reviews -- semantic search and call-...	MCP
better-email-mcp	IMAP/SMTP email for AI agents -- read, send, organize folders, and manage att...	MCP
better-godot-mcp	Composite MCP server for Godot Engine -- 17 composite tools for AI-assisted g...	MCP
better-notion-mcp	Markdown-first Notion for AI agents -- pages, databases, blocks, and comments...	MCP
better-telegram-mcp	Telegram for AI agents -- messages, chats, media, and contacts across both bo...	MCP
claude-plugins	Claude Code plugin marketplace for the n24q02m MCP servers -- install web sea...	Marketplace
imagine-mcp	Image and video understanding + generation for AI agents -- across Gemini, Op...	MCP
jules-task-archiver	Chrome Extension for bulk operations on Jules tasks via batchexecute API -- a...	Tooling
mcp-core	Shared foundation for building MCP servers -- Streamable HTTP transport, OAut...	MCP
mnemo-mcp	Persistent AI memory with hybrid search and embedded sync. Open, free, unlimi...	MCP
qwen3-embed	Lightweight Qwen3 text embedding and reranking via ONNX Runtime and GGUF	Library
skret	Secrets without the server.	CLI
tacet	TACET: a self-distilling neuro-symbolic cascade that amortises LLM cost in kn...	Tooling
web-core	Shared web infrastructure package for search, scraping, HTTP security, and st...	Library
wet-mcp	Open-source MCP server for AI agents: web search, content extraction, and lib...	MCP

Features
Install
Configuration
Documentation
Tools
Comparison
Security
Build from Source
Trust Model
Contributing
License

Features

Multimodal understanding -- Describe, classify, or reason over images and videos (Gemini handles mixed image + video in one call)
Image generation -- Text-to-image and image-to-image (edit / inpaint) across Gemini Imagen, OpenAI gpt-image, Grok Imagine
Video generation -- Text-to-video and image-to-video (Gemini Veo 3.1, Grok Imagine Video)
3 providers x 2 tiers -- Same interface for gemini / openai / grok at poor (cheap/fast) or rich (high quality); swap via parameter
Leaderboard-ranked models -- Provider ordering auto-refreshed weekly from Artificial Analysis + LMArena leaderboards
Degraded mode -- Server starts with zero credentials and surfaces remaining providers as you add keys
Response cache -- Disk-based caching of understand responses with configurable TTL
Dual transport -- pure stdio with provider env vars (default) or HTTP multi-user with paste-token relay form
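The response cache listed above could work roughly as follows. This is a minimal sketch, not the project's implementation: the function names, the JSON file layout, and the choice of SHA-256 over the request parameters as the cache key are all assumptions made for illustration.

```python
import hashlib
import json
import time
from pathlib import Path


def cache_key(media_urls, prompt, provider, tier):
    """Derive a stable file name from the request parameters (assumed key scheme)."""
    payload = json.dumps(
        {"media_urls": media_urls, "prompt": prompt, "provider": provider, "tier": tier},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def cache_get(cache_dir, key, ttl_seconds, now=None):
    """Return the cached response, or None if it is missing or older than the TTL."""
    path = Path(cache_dir) / f"{key}.json"
    if not path.exists():
        return None
    entry = json.loads(path.read_text(encoding="utf-8"))
    now = time.time() if now is None else now
    if now - entry["stored_at"] > ttl_seconds:
        return None
    return entry["response"]


def cache_put(cache_dir, key, response, now=None):
    """Write the response to disk together with the time it was stored."""
    directory = Path(cache_dir)
    directory.mkdir(parents=True, exist_ok=True)
    entry = {"stored_at": time.time() if now is None else now, "response": response}
    (directory / f"{key}.json").write_text(json.dumps(entry), encoding="utf-8")
```

Keying on the full parameter set means a change of provider or tier never returns a response produced by a different model.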

Install

Run with uvx (no install step) or pull the container image:

# uvx -- recommended, runs the published PyPI package
uvx imagine-mcp

# Docker
docker run -it --rm ghcr.io/n24q02m/imagine-mcp:latest

Add it to an MCP client by pointing the client at the uvx imagine-mcp command and supplying at least one provider key (see Configuration):

{
  "mcpServers": {
    "imagine": {
      "command": "uvx",
      "args": ["imagine-mcp"],
      "env": { "GEMINI_API_KEY": "AIza..." }
    }
  }
}

For per-client snippets (Claude Code, Codex, Gemini CLI, Cursor, Windsurf) and the browser-based HTTP setup, see the Setup docs.

Install with an AI agent -- paste this to your AI coding agent:

Install MCP server imagine-mcp following the steps at
https://raw.githubusercontent.com/n24q02m/claude-plugins/main/plugins/imagine-mcp/setup-with-agent.md

Configuration

Two transports (default stdio; opt into http with --http, MCP_TRANSPORT=http, or TRANSPORT_MODE=http):

stdio (default) -- single-user, reads credentials from env vars only. Exits if none of the three provider keys are set.
http -- HTTP daemon. Local self-host on 127.0.0.1 by default, or multi-user remote (per-JWT-sub credential isolation) when PUBLIC_URL + MCP_DCR_SERVER_SECRET are set. In HTTP mode credentials are entered through a browser form at /authorize.
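The three opt-ins for HTTP mode can be pictured as a single check. The sketch below is illustrative only: `resolve_transport` is a hypothetical helper, and treating the environment values case-insensitively is an assumption the README does not state.

```python
def resolve_transport(argv, env):
    """Return "http" if any documented opt-in is present, otherwise "stdio"."""
    if "--http" in argv:
        return "http"
    for name in ("MCP_TRANSPORT", "TRANSPORT_MODE"):
        # Assumption: values are compared case-insensitively after trimming.
        if env.get(name, "").strip().lower() == "http":
            return "http"
    return "stdio"
```

For example, `resolve_transport([], {"MCP_TRANSPORT": "http"})` selects the HTTP daemon, while an empty argument list and environment fall back to stdio.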

Provider keys

Each key is optional on its own -- the server surfaces whichever providers have a key and runs in degraded mode for the rest. Set at least one; in stdio mode the server exits if none of the three is set.

Env var	Provider	Get a key at
`GEMINI_API_KEY`	Gemini (image + video)	aistudio.google.com/apikey
`OPENAI_API_KEY`	OpenAI (image)	platform.openai.com/api-keys
`XAI_API_KEY`	Grok / xAI (image + video)	console.x.ai

When a tool is called without an explicit provider, the first key present wins in the order XAI_API_KEY -> OPENAI_API_KEY -> GEMINI_API_KEY.
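The default-provider rule above amounts to a first-match scan over the three keys. A minimal sketch, assuming the provider names `grok`, `openai`, and `gemini` used elsewhere in this README; `pick_default_provider` is a hypothetical name, not a function exported by the package.

```python
# Documented default order: the first key present wins.
KEY_ORDER = (
    ("XAI_API_KEY", "grok"),
    ("OPENAI_API_KEY", "openai"),
    ("GEMINI_API_KEY", "gemini"),
)


def pick_default_provider(env):
    """Return the first provider whose key is set and non-empty, else None."""
    for env_var, provider in KEY_ORDER:
        if env.get(env_var, "").strip():
            return provider
    return None
```

With both `OPENAI_API_KEY` and `GEMINI_API_KEY` set and no `XAI_API_KEY`, the scan returns `openai`.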

Model chains (optional)

Override the built-in provider/tier catalog with explicit model chains. Each is a CSV of litellm provider/model entries; the order is the fallback order.

Env var	Purpose
`UNDERSTAND_MODELS`	Ordered model chain for `understand` (litellm fallback). Empty -> catalog default.
`GENERATE_MODELS`	Ordered model chain for `generate`. The first entry selects the native provider + model. Empty -> catalog default.
`GENERATE_PROVIDER_PRIORITY`	CSV of provider names reordering generation auto-fallback. Defaults to `grok,openai,gemini`.

Understanding is routed through litellm (provider/model passthrough), so any litellm provider works -- supply that provider's <PROVIDER>_API_KEY. Generation stays on the native provider SDKs (Gemini, OpenAI, Grok). Example:

{
  "mcpServers": {
    "imagine": {
      "command": "uvx",
      "args": ["imagine-mcp"],
      "env": {
        "UNDERSTAND_MODELS": "gemini/gemini-3.1-pro-preview,openai/gpt-5.4",
        "GEMINI_API_KEY": "AIza...",
        "OPENAI_API_KEY": "sk-..."
      }
    }
  }
}
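To make the chain format concrete, here is one way a CSV such as the `UNDERSTAND_MODELS` value above could be split into an ordered fallback list. `parse_model_chain` is a hypothetical helper written for this example; how the server actually parses the variable is not documented here.

```python
def parse_model_chain(raw):
    """Split a CSV of "provider/model" entries into ordered (provider, model) pairs."""
    chain = []
    for entry in raw.split(","):
        entry = entry.strip()
        if not entry:
            continue  # tolerate trailing commas and an empty variable
        # Split on the first slash only, so model names may contain slashes.
        provider, sep, model = entry.partition("/")
        if not sep or not provider or not model:
            raise ValueError(f"expected provider/model, got {entry!r}")
        chain.append((provider, model))
    return chain
```

List order is preserved, so the first pair is tried first and later pairs act as fallbacks. An empty string yields an empty list, which would correspond to the catalog default.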

Runtime knobs

config(action="set", key=..., value=...) adjusts log_level, default_provider, default_tier, and cache_ttl_seconds at runtime.
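On the wire, a call like this is an MCP `tools/call` request. The sketch below builds such a request as a plain dictionary; the request id and the example value are arbitrary, and whether `value` should be sent as a string or a number is an assumption to verify against the `help` tool output.

```python
import json


def config_set_request(request_id, key, value):
    """Build a JSON-RPC 2.0 tools/call request for config(action="set", ...)."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "config",
            "arguments": {"action": "set", "key": key, "value": value},
        },
    }


# Example: set the cache TTL (value type is an assumption).
print(json.dumps(config_set_request(1, "cache_ttl_seconds", "3600"), indent=2))
```

Most MCP clients build this envelope for you, so in practice only the `arguments` object matters.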

Documentation

Full docs at mcp.n24q02m.com/servers/imagine-mcp/setup/:

Setup -- install methods for Claude Code, Codex, Gemini CLI, Cursor, Windsurf, mcp.json
Modes overview -- stdio / local-relay / remote-relay / remote-oauth
Multi-user setup -- per-JWT-sub credential model

Tools

Tool	Actions	Description
`understand`	--	Describe or reason over one or more image/video URLs. `media_urls: list[str]`, `prompt: str`, `provider`, `tier`, `max_tokens`.
`generate`	--	Generate an image or video from a text prompt. `media_type: image|video`, optional `reference_image_url`, optional `job_id` (video poll), `aspect_ratio`, `duration_seconds`.
`config`	`open_relay`, `relay_status`, `relay_skip`, `relay_reset`, `relay_complete`, `warmup`, `status`, `set`, `cache_clear`	Credential + runtime config: open relay form, check credential state, set runtime knobs (log level, default provider, TTL), clear response cache.
`help`	--	Full Markdown documentation for `understand`, `generate`, or `config` topics.
`config__open_relay`	--	Framework-injected helper (mcp-core) equivalent to `config(action="open_relay")`; opens the browser credential form.

Model IDs per provider x action x tier are leaderboard-ranked; see docs/models.md (auto-regenerated from src/imagine_mcp/models.py).
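As an illustration of the parameters in the table, the helpers below assemble argument objects for `understand` and `generate`. They are hypothetical and exist only for this example. The `prompt` parameter name for `generate` is an assumption (the table only says "from a text prompt"), as are the sample values in the usage note.

```python
def understand_args(media_urls, prompt, provider=None, tier=None, max_tokens=None):
    """Arguments for `understand`; unset optional fields are omitted."""
    args = {
        "media_urls": list(media_urls),
        "prompt": prompt,
        "provider": provider,
        "tier": tier,
        "max_tokens": max_tokens,
    }
    return {k: v for k, v in args.items() if v is not None}


def generate_args(prompt, media_type, reference_image_url=None, job_id=None,
                  aspect_ratio=None, duration_seconds=None):
    """Arguments for `generate`; `prompt` as the parameter name is assumed."""
    if media_type not in ("image", "video"):
        raise ValueError("media_type must be 'image' or 'video'")
    args = {
        "prompt": prompt,
        "media_type": media_type,
        "reference_image_url": reference_image_url,  # image-to-image / image-to-video
        "job_id": job_id,  # used when polling a video job
        "aspect_ratio": aspect_ratio,
        "duration_seconds": duration_seconds,
    }
    return {k: v for k, v in args.items() if v is not None}
```

A text-to-video request might then be `generate_args("a paper boat on a river", "video", duration_seconds=8)`, with a follow-up call that passes the returned `job_id` to poll for the result.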

Comparison

How imagine-mcp stacks up against direct competitors in each pillar:

Capability	imagine-mcp	EverArt MCP	fal.ai MCP	Replicate Flux MCP
Image/video understanding	Yes (describe / classify / reason over image + video URLs)	No	No	No
Image generation	Yes (text-to-image + image-to-image via `reference_image_url`)	Yes (single `generate_image`)	Yes (text/image-to-image, edit, inpaint)	Yes (single `generate_image`)
Video generation	Yes (text-to-video + image-to-video, async `job_id` poll)	No	Yes (text/image-to-video)	No
Multi-provider backends	Yes (Gemini / OpenAI / Grok, auto-fallback)	No (EverArt only)	No (fal.ai only)	No (Replicate Flux only)
Quality/cost tiers	Yes (`poor` cheap-fast vs `rich` high-quality per provider)	No	No	No
Self-hostable / open source	Yes (MIT, stdio + HTTP self-host)	Yes (MIT, archived)	Yes (MIT)	Yes (MIT, archived)

Security

SSRF + LFI prevention -- All media_urls and reference_image_url are validated at the dispatch boundary; only http:// and https:// schemes reach the providers. file://, ftp://, gopher://, and scheme-less URLs are rejected.
No credentials in errors -- Provider-side errors are sanitized before being returned.
Degraded start -- Missing credentials do not prevent the server from starting; affected actions surface actionable errors instead of crashing at boot.
Credential storage -- Credentials submitted through the browser credential form are stored encrypted via mcp-core (AES-GCM, machine-bound key) at ~/.imagine-mcp/config.json.
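The scheme allow-list described in the first item can be sketched with the standard library. This is not the server's validator: `validate_media_url` is a hypothetical name, the host check is an extra assumption, and the sketch covers only scheme filtering, not any further checks the server may apply.

```python
from urllib.parse import urlsplit

ALLOWED_SCHEMES = {"http", "https"}


def validate_media_url(url):
    """Return the URL unchanged if it is http(s) with a host, else raise ValueError."""
    parts = urlsplit(url.strip())
    if parts.scheme.lower() not in ALLOWED_SCHEMES:
        # Covers file://, ftp://, gopher://, and scheme-less input.
        raise ValueError(f"unsupported or missing URL scheme: {url!r}")
    if not parts.hostname:
        raise ValueError(f"URL has no host: {url!r}")
    return url
```

Using an allow-list rather than a deny-list means any scheme not explicitly permitted is rejected by default.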

Build from Source

git clone https://github.com/n24q02m/imagine-mcp.git
cd imagine-mcp
mise run setup      # or: uv sync --group dev
mise run dev        # run the server in stdio mode (add --http for the HTTP daemon)

Trust Model

This plugin implements TC-Local (machine-bound, single trust principal). See mcp-core trust model for full classification.

Mode	Storage	Encryption	Who can read your data?
stdio (default)	`~/.imagine-mcp/config.json`	AES-GCM, machine-bound key	Only your OS user (file perm 0600)
HTTP self-host	Same as stdio	Same	Only you (admin = user)

Contributing

See CONTRIBUTING.md for the full development workflow, commit convention, and release process. Issues + Discussions welcome.

License

MIT -- see LICENSE.

Configuration

GOOGLE_AI_STUDIO_API_KEY (secret)

Google AI Studio API key (aistudio.google.com/apikey)

OPENAI_API_KEY (secret)

OpenAI API key (platform.openai.com)

XAI_API_KEY (secret)

xAI (Grok) API key (console.x.ai)
