Model ID Cheatsheet

Transport: SSE. Registry: active.

Summary

Pulls current model IDs, pricing, and specs for 107 models across 19 AI providers into any MCP client. Exposes six tools: get_model_info for full specs, check_model_status to verify if a model is current or deprecated, compare_models for side-by-side analysis, recommend_model for task-based suggestions, list_models with filters for provider and capability, and search_models for free-text queries. Built in Go with sub-millisecond responses and zero external API calls. The registry updates daily and catches common mistakes like using gpt-4o (deprecated) instead of gpt-5 or outdated Claude model IDs. Useful when writing API integration code, comparing costs across providers, or preventing your agent from hallucinating model names that no longer exist.

Model ID Cheatsheet

Stop your AI coding agent from hallucinating outdated model names. This MCP server gives any AI assistant instant access to accurate, up-to-date API model IDs, pricing, and specs for 107 models across 19 providers.

Built in Go. Single 10MB binary. Zero external calls. Sub-millisecond responses. Auto-updated daily.

- model = "gpt-4-turbo"           # Outdated - not a current model ID
+ model = "gpt-5.4"               # Correct - listed as current in the registry

- model = "claude-3-opus-20240229" # Deprecated
+ model = "claude-opus-4-6"        # Current - latest Anthropic flagship

Quick Start

Pick one option below. You'll be up and running in under a minute.

Option A: Claude Code (one command)

claude mcp add --transport sse --scope user model-id-cheatsheet \
  https://universal-model-registry-production.up.railway.app/sse

Verify it works:

claude mcp list
# Should show: model-id-cheatsheet ... Connected

Then start a new Claude Code session and ask: "What's the latest OpenAI model?" - it will use the tools automatically.

Option B: Cursor

Add to ~/.cursor/mcp.json:

{
  "mcpServers": {
    "model-id-cheatsheet": {
      "url": "https://universal-model-registry-production.up.railway.app/sse"
    }
  }
}

Restart Cursor to pick up the change.
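
If ~/.cursor/mcp.json already lists other servers, the entry needs to be merged in rather than pasted over the file. The following is a minimal Python sketch of that merge, assuming the file follows the `mcpServers` shape shown above. The helper names `add_server` and `update_file` are hypothetical and not part of this project.

```python
import json
from pathlib import Path

SSE_URL = "https://universal-model-registry-production.up.railway.app/sse"


def add_server(config: dict, name: str, url: str) -> dict:
    """Return a copy of an mcp.json-style config with one server entry added."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))  # keep any existing servers
    servers[name] = {"url": url}
    merged["mcpServers"] = servers
    return merged


def update_file(path: Path) -> None:
    """Read the config (or start empty), merge the entry, and write it back."""
    existing = json.loads(path.read_text()) if path.exists() else {}
    updated = add_server(existing, "model-id-cheatsheet", SSE_URL)
    path.write_text(json.dumps(updated, indent=2) + "\n")
```

Calling `update_file(Path.home() / ".cursor" / "mcp.json")` would apply the change in place; back up the file first if it holds other settings.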

Option C: Windsurf

Add to Settings > MCP Servers (or edit ~/.codeium/windsurf/mcp_config.json):

{
  "mcpServers": {
    "model-id-cheatsheet": {
      "serverUrl": "https://universal-model-registry-production.up.railway.app/sse"
    }
  }
}

Option D: Codex CLI

Add to ~/.codex/config.toml:

[mcp_servers.model-id-cheatsheet]
command = "uvx"
args = ["mcp-proxy", "--transport", "sse", "https://universal-model-registry-production.up.railway.app/sse"]

Option E: OpenCode

Add to ~/.config/opencode/opencode.json:

{
  "$schema": "https://opencode.ai/config.json",
  "mcp": {
    "model-id-cheatsheet": {
      "type": "remote",
      "url": "https://universal-model-registry-production.up.railway.app/sse"
    }
  }
}

Option F: Any MCP Client

Connect to the SSE endpoint directly (no API key, no auth):

https://universal-model-registry-production.up.railway.app/sse

Or use the Streamable HTTP transport:

https://universal-model-registry-production.up.railway.app/mcp
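
For clients without built-in MCP support, the first message over Streamable HTTP is a JSON-RPC `initialize` request sent as an HTTP POST. The sketch below only builds that request with the Python standard library and does not send it. The protocol version string and the `clientInfo` values are assumptions; this README does not state which protocol versions the server negotiates.

```python
import json
import urllib.request

MCP_URL = "https://universal-model-registry-production.up.railway.app/mcp"


def build_initialize_request(url: str = MCP_URL) -> urllib.request.Request:
    """Build (but do not send) the JSON-RPC initialize POST for Streamable HTTP."""
    body = {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "initialize",
        "params": {
            "protocolVersion": "2025-03-26",  # assumed; use the version your client targets
            "capabilities": {},
            "clientInfo": {"name": "example-client", "version": "0.0.1"},  # placeholder
        },
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # The server may answer with plain JSON or with an event stream.
            "Accept": "application/json, text/event-stream",
        },
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` would send it. A complete client also has to send the `notifications/initialized` notification before calling tools, which is why an MCP SDK or one of the clients above is usually the easier route.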

Verify Your Setup

Once connected, try asking your AI assistant any of these:

"What's the correct model ID for Claude Opus 4.6?"
"Is gpt-4o still available?"
"Compare gpt-5.2 vs claude-opus-4-6"
"What's the cheapest model with vision?"

If the agent calls a tool like get_model_info or check_model_status before answering, it's working.

How It Works

Your AI agent gains 6 tools that it calls automatically before writing any model ID:

| Tool | What It Does | Example Prompt |
| --- | --- | --- |
| `get_model_info(model_id)` | Full specs: API ID, pricing, context window, capabilities | "What's the model ID for Claude Sonnet?" |
| `list_models(provider?, status?, capability?)` | Browse and filter the registry | "Show me all current Google models" |
| `recommend_model(task, budget?)` | Ranked recommendations for a task | "Best model for coding, cheap budget" |
| `check_model_status(model_id)` | Verify if a model is current, legacy, or deprecated | "Is gpt-4o still available?" |
| `compare_models(model_ids)` | Side-by-side comparison table | "Compare gpt-5.2 vs claude-opus-4-6" |
| `search_models(query)` | Free-text search across all fields | "Search for reasoning models" |
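
On the wire, each of these tools is invoked through the MCP `tools/call` JSON-RPC method. The sketch below shows roughly what a client sends. The argument keys (`model_id`, `model_ids`, and so on) are taken from the signatures in the table and are assumed to match the server's actual input schema; `tool_call` is a hypothetical helper.

```python
import json
from itertools import count

_ids = count(1)  # JSON-RPC request ids, unique per message


def tool_call(name: str, **arguments) -> str:
    """Serialize an MCP tools/call request for the given tool and arguments."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    })


print(tool_call("get_model_info", model_id="gpt-5.4"))
print(tool_call("compare_models", model_ids=["gpt-5.2", "claude-opus-4-6"]))
```

In normal use the MCP client builds these messages for you; the sketch is only meant to show how small each call is.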

Resources

| URI | Description |
| --- | --- |
| `model://registry/all` | Full JSON dump of all 107 models |
| `model://registry/current` | Only current (non-deprecated) models as JSON |
| `model://registry/pricing` | Pricing table sorted cheapest-first (markdown) |

What Happens Under the Hood

1. You ask your agent to write code or answer a model question.
2. The agent calls the appropriate tool (e.g., get_model_info).
3. The server responds in under a millisecond with verified data (no external API calls).
4. The agent writes code with the correct, current model ID.

The server instructions tell the agent: "NEVER use a model ID from your training data without verifying it first." An agent that follows these instructions checks the registry before writing a model ID.

Real-World Examples

Writing an API call:

# You: "Call the OpenAI API with their best coding model"
# Agent calls: get_model_info("gpt-5.4")
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
response = client.chat.completions.create(
    model="gpt-5.4",  # Verified via model registry
    messages=[...]
)

Catching deprecated models:

# You: "Use gpt-4o for this task"
# Agent calls: check_model_status("gpt-4o")
# Agent: "gpt-4o is deprecated. I'll use gpt-5 instead."
response = client.chat.completions.create(
    model="gpt-5",  # Updated automatically
    messages=[...]
)

Finding the cheapest option:

# You: "Use the cheapest model that supports vision"
# Agent calls: list_models(capability="vision", status="current")
response = client.chat.completions.create(
    model="gpt-5-nano",  # $0.05 input / $0.40 output per 1M tokens
    messages=[...]
)
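
The selection step in this example amounts to a filter followed by a minimum. Here is a rough Python sketch of that logic, not the server's Go implementation. It assumes "cheapest" means the lowest combined input and output price; every record except the gpt-5-nano prices is a placeholder invented for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Model:
    id: str
    input_price: float   # USD per 1M input tokens
    output_price: float  # USD per 1M output tokens
    capabilities: frozenset
    status: str = "current"


def cheapest(models: list, capability: str) -> Optional[Model]:
    """Pick the current model with `capability` and the lowest combined price."""
    candidates = [
        m for m in models
        if m.status == "current" and capability in m.capabilities
    ]
    return min(candidates, key=lambda m: m.input_price + m.output_price, default=None)


# Placeholder records: only the gpt-5-nano prices come from this README.
MODELS = [
    Model("gpt-5-nano", 0.05, 0.40, frozenset({"vision"})),
    Model("example-text-only", 0.01, 0.02, frozenset()),
    Model("example-pricey-vision", 3.00, 15.00, frozenset({"vision"})),
    Model("example-retired-vision", 0.01, 0.01, frozenset({"vision"}), status="deprecated"),
]

print(cheapest(MODELS, "vision").id)  # gpt-5-nano
```

Filtering on status first matters: the cheapest matching record here is deprecated and must not win.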

Comparing options:

# You: "Should I use Claude or GPT for this?"
# Agent calls: compare_models(["claude-opus-4-6", "gpt-5.2"])
# Agent gets a side-by-side table and makes a recommendation

Resource Footprint

A common concern: "Will this slow down my agent or eat tokens?"

| Metric | Value |
| --- | --- |
| Binary size | ~10MB |
| Runtime memory | Minimal (static in-memory map, no database) |
| External API calls | Zero (all data is baked in) |
| Response time | Sub-millisecond |
| Token cost per tool call | ~200-500 tokens (small text response) |
| Tool schema overhead | ~500-800 tokens in system prompt |

For comparison, the results of a single web search typically cost more tokens than all 6 tool schemas combined.
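
To turn these figures into a per-session estimate, add the schema overhead to the per-call cost times the number of calls. The sketch below does that arithmetic in Python. It assumes the schema overhead is counted once per session, which this README does not state; `session_overhead` is a hypothetical helper.

```python
SCHEMA_TOKENS = (500, 800)  # tool schema overhead (low, high), from the figures above
CALL_TOKENS = (200, 500)    # cost per tool call (low, high), from the figures above


def session_overhead(tool_calls: int) -> tuple:
    """Return the (low, high) token overhead for a session with `tool_calls` calls."""
    if tool_calls < 0:
        raise ValueError("tool_calls must be non-negative")
    low = SCHEMA_TOKENS[0] + tool_calls * CALL_TOKENS[0]
    high = SCHEMA_TOKENS[1] + tool_calls * CALL_TOKENS[1]
    return low, high


print(session_overhead(3))  # (1100, 2300)
```

Three lookups in one session therefore cost roughly 1,100 to 2,300 tokens under these assumptions.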

Covered Models (107 total)

Current Models (79)

| Provider | Models | API IDs |
| --- | --- | --- |
| OpenAI (15) | GPT-5.4, GPT-5.4 Pro, GPT-5.3 Instant, GPT-5.2, GPT-5.2 Pro, GPT-5.1, GPT-5.1 Codex, GPT-5.1 Mini, GPT-5, GPT-5 Mini, GPT-5 Nano, GPT-4.1 Mini, GPT-4.1 Nano, o3, o4-mini | `gpt-5.4`, `gpt-5.4-pro`, `gpt-5.3-chat-latest`, `gpt-5.2`, `gpt-5.2-pro`, `gpt-5.1`, `gpt-5.1-codex`, `gpt-5.1-mini`, `gpt-5`, `gpt-5-mini`, `gpt-5-nano`, `gpt-4.1-mini`, `gpt-4.1-nano`, `o3`, `o4-mini` |
| Anthropic (4) | Claude Opus 4.6, Claude Sonnet 4.6, Claude Sonnet 4.5, Claude Haiku 4.5 | `claude-opus-4-6`, `claude-sonnet-4-6`, `claude-sonnet-4-5-20250929`, `claude-haiku-4-5-20251001` |
| Mistral (11) | Mistral Large 3, Mistral Medium 3, Mistral Small 3.2, Mistral Saba, Ministral 3B, Ministral 8B, Ministral 14B, Magistral Small 1.2, Magistral Medium 1.2, Devstral 2, Devstral Small 2 | `mistral-large-2512`, `mistral-medium-2505`, `mistral-small-2506`, `mistral-saba-2502`, `ministral-3b-2512`, `ministral-8b-2512`, `ministral-14b-2512`, `magistral-small-2509`, `magistral-medium-2509`, `devstral-2512`, `devstral-small-2512` |
| Amazon (6) | Nova Micro, Nova Lite, Nova Pro, Nova Premier, Nova 2 Lite, Nova 2 Pro | `amazon-nova-micro`, `amazon-nova-lite`, `amazon-nova-pro`, `amazon-nova-premier`, `amazon-nova-2-lite`, `amazon-nova-2-pro` |
| Google (5) | Gemini 3.1 Pro, Gemini 3.1 Flash Lite, Gemini 3 Flash, Gemini 2.5 Pro, Gemini 2.5 Flash | `gemini-3.1-pro-preview`, `gemini-3.1-flash-lite-preview`, `gemini-3-flash-preview`, `gemini-2.5-pro`, `gemini-2.5-flash` |
| Cohere (5) | Command A, Command A Reasoning, Command A Vision, Command A Translate, Command R7B | `command-a-03-2025`, `command-a-reasoning-08-2025`, `command-a-vision-07-2025`, `command-a-translate-08-2025`, `command-r7b-12-2024` |
| xAI (4) | Grok 4, Grok 4.1 Fast, Grok 4 Fast, Grok Code Fast 1 | `grok-4`, `grok-4.1-fast`, `grok-4-fast`, `grok-code-fast-1` |
| Microsoft (4) | Phi-4, Phi-4 Multimodal, Phi-4 Reasoning, Phi-4 Reasoning Plus | `phi-4`, `phi-4-multimodal-instruct`, `phi-4-reasoning`, `phi-4-reasoning-plus` |
| Perplexity (4) | Sonar, Sonar Pro, Sonar Reasoning Pro, Sonar Deep Research | `sonar`, `sonar-pro`, `sonar-reasoning-pro`, `sonar-deep-research` |
| Moonshot (3) | Kimi K2.5, Kimi K2 Thinking, Kimi K2 (0905) | `kimi-k2.5`, `kimi-k2-thinking`, `kimi-k2-0905-preview` |
| Tencent (3) | Hunyuan TurboS, Hunyuan T1, Hunyuan A13B | `hunyuan-turbos`, `hunyuan-t1`, `hunyuan-a13b` |
| Zhipu (3) | GLM-5, GLM-4.7, GLM-4.7 FlashX | `glm-5`, `glm-4.7`, `glm-4.7-flashx` |
| Meta (2) | Llama 4 Maverick, Llama 4 Scout | `llama-4-maverick`, `llama-4-scout` |
| DeepSeek (2) | DeepSeek Reasoner, DeepSeek Chat | `deepseek-reasoner`, `deepseek-chat` |
| NVIDIA (2) | Nemotron 3 Nano 30B, Nemotron Ultra 253B | `nvidia/nemotron-3-nano-30b-a3b`, `nvidia/llama-3.1-nemotron-ultra-253b-v1` |
| AI21 (2) | Jamba Large 1.7, Jamba Mini 1.7 | `jamba-large-1.7`, `jamba-mini-1.7` |
| MiniMax (2) | MiniMax M2.5, MiniMax M2.5 Lightning | `minimax-m2.5`, `minimax-m2.5-lightning` |
| Kuaishou (1) | KAT-Coder Pro | `kat-coder-pro` |
| Xiaomi (1) | MiMo V2 Flash | `mimo-v2-flash` |

Legacy & Deprecated Models (30)

Tracked so your agent can detect outdated model IDs and suggest current replacements:

OpenAI: gpt-5.3-codex (deprecated), gpt-5.2-codex (deprecated), gpt-5.1-codex-mini (deprecated), o3-pro (deprecated), o3-deep-research (deprecated), o3-mini (legacy), gpt-4.1 (deprecated), gpt-4o (deprecated), gpt-4o-mini (deprecated)
Anthropic: claude-opus-4-5 (legacy), claude-opus-4-1 (legacy), claude-opus-4-0 (legacy), claude-sonnet-4-0 (legacy), claude-3-7-sonnet-20250219 (deprecated)
Google: gemini-3-pro-preview (deprecated), gemini-3-pro-image-preview (deprecated), gemini-2.5-flash-lite (deprecated), gemini-2.0-flash-lite (deprecated), gemini-2.0-flash (deprecated)
xAI: grok-4.1 (deprecated), grok-3 (legacy), grok-3-mini (legacy)
Mistral: mistral-small-2503 (legacy), codestral-2508 (legacy)
MiniMax: minimax-m2.1 (legacy), minimax-01 (deprecated)
Meta: llama-3.3-70b (legacy)
DeepSeek: deepseek-r1 (legacy), deepseek-v3 (deprecated)
Zhipu: glm-4.6v (deprecated)
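
Tracking three statuses lets an agent distinguish "still works but superseded" (legacy) from "do not use" (deprecated) and from IDs the registry has never seen. The Python sketch below illustrates that lookup on a small excerpt of the lists above. It is not the server's implementation, and the replacement table is hypothetical apart from the gpt-4o to gpt-5 suggestion quoted earlier in this README.

```python
from enum import Enum


class Status(Enum):
    CURRENT = "current"
    LEGACY = "legacy"
    DEPRECATED = "deprecated"


# Tiny excerpt of the lists above; the real registry holds every model.
STATUS = {
    "gpt-5": Status.CURRENT,
    "gpt-4o": Status.DEPRECATED,
    "o3-mini": Status.LEGACY,
    "claude-opus-4-6": Status.CURRENT,
    "claude-opus-4-5": Status.LEGACY,
}

# Hypothetical replacement table; only gpt-4o -> gpt-5 appears in this README.
REPLACEMENT = {"gpt-4o": "gpt-5"}


def check(model_id: str) -> str:
    """Describe a model ID's status and suggest a replacement when one is known."""
    status = STATUS.get(model_id)
    if status is None:
        return f"{model_id}: not in registry, do not use without verifying"
    if status is Status.CURRENT:
        return f"{model_id}: current"
    hint = REPLACEMENT.get(model_id)
    suffix = f", consider {hint}" if hint else ""
    return f"{model_id}: {status.value}{suffix}"


print(check("gpt-4o"))   # gpt-4o: deprecated, consider gpt-5
print(check("o3-mini"))  # o3-mini: legacy
```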

Self-Hosting

If you prefer to run the server locally instead of using the hosted endpoint:

Option 1: Build from Source (recommended for local use)

Requires Go 1.23+.

git clone https://github.com/aezizhu/universal-model-registry.git
cd universal-model-registry/go-server
go build -o model-id-cheatsheet ./cmd/server

Then add it to Claude Code as a local stdio server (no network round trip):

claude mcp add --scope user model-id-cheatsheet -- /path/to/model-id-cheatsheet

Or run in SSE mode for other clients:

MCP_TRANSPORT=sse PORT=8000 ./model-id-cheatsheet
# Endpoint: http://localhost:8000/sse

Option 2: Docker

git clone https://github.com/aezizhu/universal-model-registry.git
cd universal-model-registry
docker build -t model-id-cheatsheet .
docker run -p 8000:8000 model-id-cheatsheet

Your SSE endpoint will be at http://localhost:8000/sse.

Option 3: Deploy to Railway

Deploy with the Railway CLI:

railway login
railway init
railway up

Staying Up to Date

Model data is checked and updated automatically every day at 7 PM Pacific Time. Deprecations are merged without human intervention; newly detected models are queued for review.

How it works:

1. A Railway cron job runs the updater daily, scraping the public documentation pages of 6 providers (no API keys needed).
2. Models removed from the docs are auto-deprecated via a PR (status changed to "deprecated" in code).
3. Newly detected models trigger a GitHub issue for review.
4. CI runs on the auto-generated PR; if the tests pass, the PR is auto-merged into main.
5. Railway auto-deploys from main.

No provider API keys required. The updater reads publicly available documentation pages to detect model changes. Only GITHUB_TOKEN and GITHUB_REPO are needed for creating PRs and issues.
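
The core decision in this pipeline is a set comparison between the IDs the registry marks as current and the IDs found in a provider's docs. The real updater is part of the Go codebase; the Python sketch below only illustrates that comparison, and `UpdatePlan` and `plan_update` are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class UpdatePlan:
    deprecate: list = field(default_factory=list)  # goes into an auto-generated PR
    review: list = field(default_factory=list)     # goes into a GitHub issue


def plan_update(registry_current: set, seen_in_docs: set) -> UpdatePlan:
    """Compare one provider's current registry IDs with the IDs found in its docs."""
    return UpdatePlan(
        deprecate=sorted(registry_current - seen_in_docs),  # vanished from the docs
        review=sorted(seen_in_docs - registry_current),     # new, needs a human look
    )


plan = plan_update({"model-a", "model-b"}, {"model-b", "model-c"})
print(plan.deprecate)  # ['model-a']
print(plan.review)     # ['model-c']
```

One design point this makes visible: a failed scrape that returns an empty set would mark every model for deprecation, so an updater built this way would likely want a guard against empty or implausibly small results.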

Auto-Update Pipeline Details

Railway Cron (primary) -- The hosted instance uses a Railway cron service that runs the updater daily. See configs/railway-updater.toml for the configuration.

Required env vars (set in Railway dashboard):

GITHUB_TOKEN -- GitHub personal access token with repo scope
GITHUB_REPO -- Repository in "owner/repo" format (e.g. "aezizhu/universal-model-registry")

Providers checked (via public docs):

OpenAI (via GitHub SDK source), Anthropic, Google, Mistral, xAI, DeepSeek

CI/CD Workflows:

.github/workflows/ci.yml -- runs tests on every PR
.github/workflows/auto-merge.yml -- auto-merges bot PRs (labeled auto-update) after CI passes

GitHub Actions (alternative) -- A GitHub Actions workflow is also included at .github/workflows/auto-update.yml for users who self-host without Railway. No API keys needed -- only GITHUB_TOKEN (automatically provided by GitHub Actions).

Security

Rate limiting: 60 requests/minute per IP
Connection limits: Max 5 SSE connections per IP, 100 total
Request body limit: 64KB max
Input sanitization: All string inputs truncated to safe lengths
HTTP hardening: ReadTimeout 15s, ReadHeaderTimeout 5s, IdleTimeout 120s, 64KB max headers
Non-root Docker: Containers run as unprivileged user
Graceful shutdown: Clean connection draining on SIGINT/SIGTERM
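
A limit such as 60 requests per minute per IP can be enforced in several ways. This README does not say which algorithm the server uses, so the Python sketch below assumes a sliding-window log purely for illustration. The class name and the injectable `clock` parameter are hypothetical.

```python
import time
from collections import defaultdict, deque


class PerIPRateLimiter:
    """Allow at most `limit` requests per `window` seconds for each client IP."""

    def __init__(self, limit: int = 60, window: float = 60.0, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self._hits = defaultdict(deque)  # ip -> timestamps of accepted requests

    def allow(self, ip: str) -> bool:
        now = self.clock()
        hits = self._hits[ip]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True
```

Each IP gets its own queue, so one noisy client cannot exhaust another client's budget. A production version would also evict idle IPs to bound memory.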

Tech Stack

Language: Go 1.23
MCP SDK: github.com/modelcontextprotocol/go-sdk v1.3.0 (official)
Transports: stdio, SSE, Streamable HTTP
Binary size: ~10MB
Tests: 156 unit tests
Security: Per-IP rate limiting, connection limits, input sanitization
Deploy: Docker (alpine), Railway

Contributing

Contributions are welcome! Whether it's adding a new model, fixing data, or improving the server:

1. Fork the repo and clone it locally.
2. Edit model data in go-server/internal/models/data.go.
3. Update test counts in go-server/internal/models/data_test.go.
4. Run the tests:
```
cd go-server && go test ./... -v
```
5. Submit a PR -- we'll review it quickly.

If you spot an outdated model or incorrect pricing, opening an issue is just as helpful.

License

MIT

Updated: Feb 9, 2026
