This is a proxy layer that sits between Claude and any OpenAI-compatible LLM API. You point it at a downstream provider via DOWNSTREAM_URL, set a default model, and it exposes two MCP tools: list_models() to see what's available and complete() to send prompts with configurable temperature and max tokens. It also surfaces resources for the model list and current config. Reach for this when you want Claude to route completion requests to a different LLM provider without changing your client setup, or when you need to query multiple models through a single gateway. It handles API key passthrough and model discovery from endpoints like models.dev, so you can swap providers by changing environment variables.
MCP-compatible LLM gateway that proxies completion requests to downstream OpenAI-compatible providers.
mcp-name: io.github.daedalus/mcp-llm-gateway
pip install mcp-llm-gateway
Set the following environment variables:
DOWNSTREAM_URL: Base URL for the OpenAI-compatible downstream API (required)DEFAULT_MODEL: Default model to use for completions (required)MODEL_LIST_URL: URL to fetch available models from (optional, defaults to models.dev)API_KEY: Optional API key for downstream (passthrough)TIMEOUT: Request timeout in seconds (optional, default: 60)Run the MCP server with stdio transport:
mcp-llm-gateway
The server exposes the following tools:
list_models(): List all available models from the remote endpointcomplete(prompt, model, max_tokens, temperature): Send a completion request to the downstream LLM providermodels://list: Returns the list of available modelsconfig://info: Returns current gateway configurationgit clone https://github.com/daedalus/mcp-llm-gateway.git
cd mcp-llm-gateway
pip install -e ".[test]"
# run tests
pytest
# format
ruff format src/ tests/
# lint
ruff check src/ tests/
# type check
mypy src/
Model: Dataclass representing an available LLM modelCompletionRequest: Dataclass for completion request payloadsGatewayConfig: Dataclass for gateway configurationHTTPAdapter: HTTP client for downstream API communicationModelListAdapter: Adapter for fetching model list from remote endpointsModelService: Service for managing model discovery and cachingCompletionService: Service for handling completion requestsConfigService: Service for managing gateway configurationio.github.ericm1018/skillfm-llm-cost-optimizer-openai-anthropic-usage
io.github.mikerawsonnz/llm-orchestration-agent
io.github.mikerawsonnz/authenticated-llm-agent
labforgedev/copilot-memory-mcp
csoai-org/agent-prompt-injection-firewall-mcp
io.github.mikerawsonnz/authenticated-multi-llm-agent