A production-grade bridge to image and video understanding plus generation across Gemini, OpenAI, and Grok. You get two core tools: `understand` for reasoning over image and video URLs with configurable context length, and `generate` for text-to-image, image-to-image, and text-to-video workflows. Each provider exposes "poor" and "rich" tiers so you can trade speed for quality on the fly. The server ranks models weekly against Artificial Analysis and LMArena leaderboards, caches responses to disk with configurable TTL, and degrades gracefully when credentials are missing. Ships with stdio and HTTP transports, plus a config tool that surfaces relay forms, credential state checks, and runtime knobs like log level and default provider. Reach for this when you need multimodal ops without hardcoding a single provider.
mcp-name: io.github.n24q02m/imagine-mcp
Image and video understanding + generation for AI agents -- across Gemini, OpenAI, and Grok.
| Project | Tagline | Tag |
|---|---|---|
| better-code-review-graph | Knowledge graph for token-efficient code reviews -- semantic search and call-... | MCP |
| better-email-mcp | IMAP/SMTP email for AI agents -- read, send, organize folders, and manage att... | MCP |
| better-godot-mcp | Composite MCP server for Godot Engine -- 17 composite tools for AI-assisted g... | MCP |
| better-notion-mcp | Markdown-first Notion for AI agents -- pages, databases, blocks, and comments... | MCP |
| better-telegram-mcp | Telegram for AI agents -- messages, chats, media, and contacts across both bo... | MCP |
| claude-plugins | Claude Code plugin marketplace for the n24q02m MCP servers -- install web sea... | Marketplace |
| imagine-mcp | Image and video understanding + generation for AI agents -- across Gemini, Op... | MCP |
| jules-task-archiver | Chrome Extension for bulk operations on Jules tasks via batchexecute API -- a... | Tooling |
| mcp-core | Shared foundation for building MCP servers -- Streamable HTTP transport, OAut... | MCP |
| mnemo-mcp | Persistent AI memory with hybrid search and embedded sync. Open, free, unlimi... | MCP |
| qwen3-embed | Lightweight Qwen3 text embedding and reranking via ONNX Runtime and GGUF | Library |
| skret | Secrets without the server. | CLI |
| tacet | TACET: a self-distilling neuro-symbolic cascade that amortises LLM cost in kn... | Tooling |
| web-core | Shared web infrastructure package for search, scraping, HTTP security, and st... | Library |
| wet-mcp | Open-source MCP server for AI agents: web search, content extraction, and lib... | MCP |
- gemini / openai / grok at poor (cheap/fast) or rich (high quality); swap via parameter
- Disk cache for understand responses with configurable TTL

Run with uvx (no install step) or pull the container image:
# uvx -- recommended, runs the published PyPI package
uvx imagine-mcp
# Docker
docker run -it --rm ghcr.io/n24q02m/imagine-mcp:latest
Add it to an MCP client by pointing the client at the uvx imagine-mcp command and
supplying at least one provider key (see Configuration):
{
  "mcpServers": {
    "imagine": {
      "command": "uvx",
      "args": ["imagine-mcp"],
      "env": { "GEMINI_API_KEY": "AIza..." }
    }
  }
}
For per-client snippets (Claude Code, Codex, Gemini CLI, Cursor, Windsurf) and the browser-based HTTP setup, see the Setup docs.
Install with an AI agent -- paste this to your AI coding agent:
Install MCP server imagine-mcp following the steps at https://raw.githubusercontent.com/n24q02m/claude-plugins/main/plugins/imagine-mcp/setup-with-agent.md
Two transports (default stdio; opt into http with --http, MCP_TRANSPORT=http,
or TRANSPORT_MODE=http):
- http: 127.0.0.1 by default, or multi-user remote (per-JWT-sub credential isolation) when PUBLIC_URL + MCP_DCR_SERVER_SECRET are set. In HTTP mode credentials are entered through a browser form at /authorize.

All provider keys are optional -- the server starts in degraded mode and surfaces whichever providers have a key. Set at least one.
| Env var | Provider | Get a key at |
|---|---|---|
| GEMINI_API_KEY | Gemini (image + video) | aistudio.google.com/apikey |
| OPENAI_API_KEY | OpenAI (image) | platform.openai.com/api-keys |
| XAI_API_KEY | Grok / xAI (image + video) | console.x.ai |
When a tool is called without an explicit provider, the first key present wins in the
order XAI_API_KEY -> OPENAI_API_KEY -> GEMINI_API_KEY.
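
The key-presence rule above can be sketched in a few lines of Python. This is an illustration only, not the server's actual code: the function name `pick_default_provider` and the provider labels attached to each key are assumptions made for the example.

```python
import os

# Key order taken from the text above; the provider labels are assumed.
KEY_ORDER = [
    ("XAI_API_KEY", "grok"),
    ("OPENAI_API_KEY", "openai"),
    ("GEMINI_API_KEY", "gemini"),
]


def pick_default_provider(env=None):
    """Return the first provider whose key is set, or None if no key is present."""
    env = os.environ if env is None else env
    for key, provider in KEY_ORDER:
        if env.get(key):  # treat empty strings as "not set"
            return provider
    return None


# With both OpenAI and Gemini keys present, OpenAI comes first in the order.
print(pick_default_provider({"GEMINI_API_KEY": "AIza...", "OPENAI_API_KEY": "sk-..."}))
```

Returning `None` here stands in for the degraded mode in which no provider is available.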
Override the built-in provider/tier catalog with explicit model chains. Each is a CSV of
litellm provider/model entries; the order is the fallback order.
| Env var | Purpose |
|---|---|
| UNDERSTAND_MODELS | Ordered model chain for understand (litellm fallback). Empty -> catalog default. |
| GENERATE_MODELS | Ordered model chain for generate. The first entry selects the native provider + model. Empty -> catalog default. |
| GENERATE_PROVIDER_PRIORITY | CSV of provider names reordering generation auto-fallback. Defaults to grok,openai,gemini. |
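
As a rough sketch of how such a CSV chain might be read, the snippet below splits a value into ordered (provider, model) pairs. The helper name `parse_model_chain` and the choice to raise on malformed entries are assumptions for illustration; the server's real parsing may differ.

```python
def parse_model_chain(raw):
    """Split a CSV of provider/model entries into ordered (provider, model) pairs.

    An empty result stands for "use the catalog default".
    """
    chain = []
    for entry in raw.split(","):
        entry = entry.strip()
        if not entry:
            continue  # tolerate empty values and trailing commas
        # Split on the first slash only, so model names may contain slashes.
        provider, sep, model = entry.partition("/")
        if not sep or not provider or not model:
            raise ValueError(f"expected provider/model, got {entry!r}")
        chain.append((provider, model))
    return chain


chain = parse_model_chain("gemini/gemini-3.1-pro-preview, openai/gpt-5.4")
print(chain)  # list order is the fallback order
```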
Understanding is routed through litellm (provider/model passthrough), so any litellm
provider works -- supply that provider's <PROVIDER>_API_KEY. Generation stays on the
native provider SDKs (Gemini, OpenAI, Grok). Example:
{
  "mcpServers": {
    "imagine": {
      "command": "uvx",
      "args": ["imagine-mcp"],
      "env": {
        "UNDERSTAND_MODELS": "gemini/gemini-3.1-pro-preview,openai/gpt-5.4",
        "GEMINI_API_KEY": "AIza...",
        "OPENAI_API_KEY": "sk-..."
      }
    }
  }
}
config(action="set", key=..., value=...) adjusts log_level, default_provider,
default_tier, and cache_ttl_seconds at runtime.
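
For clients that speak MCP directly, a config call like the one above travels as a JSON-RPC `tools/call` request. The sketch below only builds that message. The helper name `config_set_request` is hypothetical, and passing the value as a string is an assumption, since the accepted value types are not stated here.

```python
import json


def config_set_request(key, value, request_id=1):
    """Build a JSON-RPC 2.0 tools/call message for config(action="set", ...)."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "config",
            "arguments": {"action": "set", "key": key, "value": value},
        },
    }


# Value type (string vs. integer) is an assumption for this example.
print(json.dumps(config_set_request("cache_ttl_seconds", "3600"), indent=2))
```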
Full docs at mcp.n24q02m.com/servers/imagine-mcp/setup/.
| Tool | Actions | Description |
|---|---|---|
| understand | -- | Describe or reason over one or more image/video URLs. media_urls: list[str], prompt: str, provider, tier, max_tokens. |
| generate | -- | Generate an image or video from a text prompt. media_type: image\|video, optional reference_image_url, optional job_id (video poll), aspect_ratio, duration_seconds. |
| config | open_relay, relay_status, relay_skip, relay_reset, relay_complete, warmup, status, set, cache_clear | Credential + runtime config: open relay form, check credential state, set runtime knobs (log level, default provider, TTL), clear response cache. |
| help | -- | Full Markdown documentation for understand, generate, or config topics. |
| config__open_relay | -- | Framework-injected helper (mcp-core) equivalent to config(action="open_relay"); opens the browser credential form. |
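
The job_id parameter on generate suggests a submit-then-poll flow for video. The loop below is a sketch under several assumptions: `call_tool` is a hypothetical callable supplied by your MCP client, the `prompt` argument name is borrowed from understand, and the `status` and `job_id` response fields are invented for illustration. Check the help tool for the real response shape.

```python
import time


def poll_video(call_tool, prompt, interval_s=5.0, max_polls=60):
    """Submit a video job, then re-call generate with job_id until it finishes.

    call_tool(name, arguments) -> dict is a hypothetical client-side helper.
    """
    args = {"prompt": prompt, "media_type": "video"}
    result = call_tool("generate", args)
    for _ in range(max_polls):
        if result.get("status") != "pending":  # assumed response field
            return result
        time.sleep(interval_s)
        # Pass the job handle back so the server polls rather than resubmits.
        result = call_tool("generate", {**args, "job_id": result["job_id"]})
    raise TimeoutError("video job did not finish in time")
```

Bounding the loop with `max_polls` keeps a stuck job from blocking the caller indefinitely.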
Model IDs per provider x action x tier are leaderboard-ranked; see docs/models.md (auto-regenerated from src/imagine_mcp/models.py).
How imagine-mcp stacks up against direct competitors in each pillar:
| Capability | imagine-mcp | EverArt MCP | fal.ai MCP | Replicate Flux MCP |
|---|---|---|---|---|
| Image/video understanding | Yes (describe / classify / reason over image + video URLs) | No | No | No |
| Image generation | Yes (text-to-image + image-to-image via reference_image_url) | Yes (single generate_image) | Yes (text/image-to-image, edit, inpaint) | Yes (single generate_image) |
| Video generation | Yes (text-to-video + image-to-video, async job_id poll) | No | Yes (text/image-to-video) | No |
| Multi-provider backends | Yes (Gemini / OpenAI / Grok, auto-fallback) | No (EverArt only) | No (fal.ai only) | No (Replicate Flux only) |
| Quality/cost tiers | Yes (poor cheap-fast vs rich high-quality per provider) | No | No | No |
| Self-hostable / open source | Yes (MIT, stdio + HTTP self-host) | Yes (MIT, archived) | Yes (MIT) | Yes (MIT, archived) |
- media_urls and reference_image_url are validated at the dispatch boundary; only http:// and https:// schemes reach the providers. file://, ftp://, gopher://, and scheme-less URLs are rejected.
- Credentials are encrypted by mcp-core (AES-GCM, machine-bound key) at ~/.imagine-mcp/config.json.

git clone https://github.com/n24q02m/imagine-mcp.git
cd imagine-mcp
mise run setup # or: uv sync --group dev
mise run dev # run the server in stdio mode (add --http for the HTTP daemon)
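
The URL rule from the security notes (only http:// and https:// reach the providers) can be approximated with the standard library. This is a minimal sketch, not the project's validator: the name `is_allowed_media_url` is hypothetical, and requiring a non-empty host is an extra assumption.

```python
from urllib.parse import urlsplit

ALLOWED_SCHEMES = {"http", "https"}


def is_allowed_media_url(url):
    """Accept only http(s) URLs that also carry a host component."""
    try:
        parts = urlsplit(url)
    except ValueError:  # e.g. malformed IPv6 literal
        return False
    return parts.scheme.lower() in ALLOWED_SCHEMES and bool(parts.netloc)


for candidate in (
    "https://example.com/cat.png",
    "file:///etc/passwd",
    "ftp://example.com/cat.png",
    "example.com/cat.png",
):
    print(candidate, is_allowed_media_url(candidate))
```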
This plugin implements TC-Local (machine-bound, single trust principal). See mcp-core trust model for full classification.
| Mode | Storage | Encryption | Who can read your data? |
|---|---|---|---|
| stdio (default) | ~/.imagine-mcp/config.json | AES-GCM, machine-bound key | Only your OS user (file perm 0600) |
| HTTP self-host | Same as stdio | Same | Only you (admin = user) |
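
The 0600 file permission in the table can be illustrated on POSIX systems as follows. This sketch covers only the permission bits, not the AES-GCM encryption that mcp-core applies. The helper name `write_private` is hypothetical, and the payload is a placeholder.

```python
import os
import stat
import tempfile


def write_private(path, data):
    """Write bytes to path so that only the owning OS user can read or write it."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "wb") as fh:
        # Re-apply the mode in case the file already existed with looser bits.
        os.fchmod(fh.fileno(), 0o600)
        fh.write(data)


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "config.json")
    write_private(target, b"<ciphertext placeholder>")
    print(oct(stat.S_IMODE(os.stat(target).st_mode)))
```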
See CONTRIBUTING.md for the full development workflow, commit convention, and release process. Issues + Discussions welcome.
MIT -- see LICENSE.
| Env var | Type | Description |
|---|---|---|
| GOOGLE_AI_STUDIO_API_KEY | secret | Google AI Studio API key (aistudio.google.com/apikey) |
| OPENAI_API_KEY | secret | OpenAI API key (platform.openai.com) |
| XAI_API_KEY | secret | xAI (Grok) API key (console.x.ai) |