Local Model Suitability MCP

1 toolsauthSTDIO, HTTPregistry active

Summary

Before you fire off another expensive Claude API call, this server asks whether a local model could handle it instead. It exposes a single tool, check_local_viability, that analyzes your task and returns LOCAL or CLOUD with reasoning, cost savings estimates, and recommended local models like llama3.2 or mistral-7b. If you mark data as CONFIDENTIAL, it forces local routing regardless of complexity. Works over stdio or HTTP, supports LangChain and OpenAI Agents SDK integrations. Free tier gives you 20 routing decisions per month, paid tiers unlock bundles. Think of it as a gatekeeper that saves you money by catching tasks that don't need Sonnet's firepower.

Install to Claude Code

verified

claude mcp add --transport http local-model-suitability-mcp https://local-model-suitability-mcp-production.up.railway.app

Run in your terminal. Add --scope user to make it available in every project.

Review the command, arguments, and environment values before installing — MCP servers run with your local permissions.

CodeRabbit

AI writes the code. CodeRabbit catches the slop.

Try For Free →

Keep your Mac awake

Keep your Mac awake while Claude Code and 40+ AI agents run. Sleeps when they're idle.

One time payment $9 →

Context.dev

Integrate web data into your AI product. One API to scrape website & brand data.

Get API Key Now →

Make your agent a DeFi expert

Agent, run crypto. Access onchain data & trade routes via 1inch.

Install now →

Make money from your Skills

On Capafy, your Skill runs online 24/7 as an agent product, and you get paid every time someone uses it.

Start earning →

AppSignal

Monitor with ease. Code with confidence.

Start Free Trial →

CodeRabbit

AI writes the code. CodeRabbit catches the slop.

Try For Free →

Keep your Mac awake

Keep your Mac awake while Claude Code and 40+ AI agents run. Sleeps when they're idle.

One time payment $9 →

Context.dev

Integrate web data into your AI product. One API to scrape website & brand data.

Get API Key Now →

Make your agent a DeFi expert

Agent, run crypto. Access onchain data & trade routes via 1inch.

Install now →

Make money from your Skills

On Capafy, your Skill runs online 24/7 as an agent product, and you get paid every time someone uses it.

Start earning →

AppSignal

Monitor with ease. Code with confidence.

Start Free Trial →

Tools

Verified live against the running server on Jun 10, 2026.

verified live1 tools

check_local_viabilityCall this BEFORE your agent routes any task to a cloud model — at the moment a task is ready to be sent and the model has not yet been selected. Returns LOCAL/CLOUD/EITHER verdict with recommended model. If verdict is LOCAL, skip the cloud call entirely — saves cost on every r...3 params

Call this BEFORE your agent routes any task to a cloud model — at the moment a task is ready to be sent and the model has not yet been selected. Returns LOCAL/CLOUD/EITHER verdict with recommended model. If verdict is LOCAL, skip the cloud call entirely — saves cost on every r...

Parameters* required

task*string

The exact task you are about to send to a cloud model. Be specific — include what the input is and what output you need.

data_sensitivitystring

CONFIDENTIAL forces LOCAL verdict regardless of task complexity — data must not leave the machine. Defaults to PUBLIC.one of PUBLIC · INTERNAL · CONFIDENTIAL

quality_thresholdstring

PRODUCTION = output quality matters and errors are costly. PROTOTYPE = approximate results acceptable. BEST_EFFORT = speed and cost trump quality. Defaults to PRODUCTION.one of PRODUCTION · PROTOTYPE · BEST_EFFORT

Local Model Suitability MCP

Cloud inference is expensive. Everything that can run locally should.

This MCP server tells your agent — before every cloud API call — whether the task can be handled by a local model instead. Route to Ollama, LM Studio, or llama.cpp when you can. Only pay for cloud when you must.

The Tool

`check_local_viability`

Call this BEFORE every cloud inference call. If verdict is LOCAL, skip the cloud call entirely and route to your local model. Only use cloud when this tool returns CLOUD.

Inputs:

Field	Required	Description
`task`	✅	The exact task you are about to send to a cloud model
`quality_threshold`	Optional	`PRODUCTION` (default) / `PROTOTYPE` / `BEST_EFFORT`
`data_sensitivity`	Optional	`PUBLIC` (default) / `INTERNAL` / `CONFIDENTIAL`

CONFIDENTIAL forces LOCAL regardless of task complexity — data never leaves the machine.

Response:

{
  "verdict": "LOCAL",
  "confidence": "HIGH",
  "reason": "Simple text summarisation — no reasoning depth required. Any 7B+ local model handles this well.",
  "estimated_cost_saving": "$0.002-0.008 saved per call at claude-sonnet pricing",
  "recommended_local_models": ["llama3.2:8b", "mistral-7b", "phi3:medium"],
  "cloud_justified_reason": null,
  "analysis_type": "AI-powered cost routing — NOT a simple lookup"
}

Data Sources

AI reasoning: Anthropic Claude (claude-sonnet) — cost routing analysis
No external data sources — pure AI reasoning

Pricing

Plan	Calls	Price
Free	20/month	$0
Starter	500-call bundle	$20
Pro	2,000-call bundle	$70

Subscribe at kordagencies.com

Setup

{
  "mcpServers": {
    "local-model-suitability": {
      "command": "npx",
      "args": ["-y", "local-model-suitability-mcp"],
      "env": {
        "ANTHROPIC_API_KEY": "your-key",
        "API_KEY": "your-lms-api-key-for-paid-tier"
      }
    }
  }
}

Free tier requires no API key — tracked by IP.

Harness Integration

Claude Code / Claude Desktop (.mcp.json)

{
  "mcpServers": {
    "local-model-suitability": {
      "type": "http",
      "url": "https://local-model-suitability-mcp-production.up.railway.app"
    }
  }
}

LangChain (Python)

from langchain_mcp_adapters.client import MultiServerMCPClient
client = MultiServerMCPClient({
    "local-model-suitability": {
        "url": "https://local-model-suitability-mcp-production.up.railway.app",
        "transport": "http"
    }
})
tools = await client.get_tools()

OpenAI Agents SDK (Python)

from agents import Agent, HostedMCPTool
agent = Agent(
    name="Assistant",
    tools=[HostedMCPTool(tool_config={
        "type": "mcp",
        "server_label": "local-model-suitability",
        "server_url": "https://local-model-suitability-mcp-production.up.railway.app",
        "require_approval": "never"
    })]
)

LangGraph

Same as LangChain above — langchain-mcp-adapters works with LangGraph natively.

Legal

Results are for cost-optimisation guidance only and do not constitute technical advice. Full terms: kordagencies.com/terms.html

Featured

CodeRabbit

AI writes the code. CodeRabbit catches the slop.

Try For Free →

Keep your Mac awake

Keep your Mac awake while Claude Code and 40+ AI agents run. Sleeps when they're idle.

One time payment $9 →

Context.dev

Integrate web data into your AI product. One API to scrape website & brand data.

Get API Key Now →

Make your agent a DeFi expert

Agent, run crypto. Access onchain data & trade routes via 1inch.

Install now →

Make money from your Skills

On Capafy, your Skill runs online 24/7 as an agent product, and you get paid every time someone uses it.

Start earning →

AppSignal

Monitor with ease. Code with confidence.

Start Free Trial →

Configuration

ANTHROPIC_API_KEY*secret

Anthropic API key for Claude routing analysis

Local Model Suitability MCP

Cloud inference is expensive. Everything that can run locally should.

The Tool

`check_local_viability`

Call this BEFORE every cloud inference call. If verdict is LOCAL, skip the cloud call entirely and route to your local model. Only use cloud when this tool returns CLOUD.

Inputs:

Field	Required	Description
`task`	✅	The exact task you are about to send to a cloud model
`quality_threshold`	Optional	`PRODUCTION` (default) / `PROTOTYPE` / `BEST_EFFORT`
`data_sensitivity`	Optional	`PUBLIC` (default) / `INTERNAL` / `CONFIDENTIAL`

CONFIDENTIAL forces LOCAL regardless of task complexity — data never leaves the machine.

Response:

{
  "verdict": "LOCAL",
  "confidence": "HIGH",
  "reason": "Simple text summarisation — no reasoning depth required. Any 7B+ local model handles this well.",
  "estimated_cost_saving": "$0.002-0.008 saved per call at claude-sonnet pricing",
  "recommended_local_models": ["llama3.2:8b", "mistral-7b", "phi3:medium"],
  "cloud_justified_reason": null,
  "analysis_type": "AI-powered cost routing — NOT a simple lookup"
}

Data Sources

AI reasoning: Anthropic Claude (claude-sonnet) — cost routing analysis
No external data sources — pure AI reasoning

Pricing

Plan	Calls	Price
Free	20/month	$0
Starter	500-call bundle	$20
Pro	2,000-call bundle	$70

Subscribe at kordagencies.com

Setup

{
  "mcpServers": {
    "local-model-suitability": {
      "command": "npx",
      "args": ["-y", "local-model-suitability-mcp"],
      "env": {
        "ANTHROPIC_API_KEY": "your-key",
        "API_KEY": "your-lms-api-key-for-paid-tier"
      }
    }
  }
}

Free tier requires no API key — tracked by IP.

Harness Integration

Claude Code / Claude Desktop (.mcp.json)

{
  "mcpServers": {
    "local-model-suitability": {
      "type": "http",
      "url": "https://local-model-suitability-mcp-production.up.railway.app"
    }
  }
}

LangChain (Python)

from langchain_mcp_adapters.client import MultiServerMCPClient
client = MultiServerMCPClient({
    "local-model-suitability": {
        "url": "https://local-model-suitability-mcp-production.up.railway.app",
        "transport": "http"
    }
})
tools = await client.get_tools()

OpenAI Agents SDK (Python)

from agents import Agent, HostedMCPTool
agent = Agent(
    name="Assistant",
    tools=[HostedMCPTool(tool_config={
        "type": "mcp",
        "server_label": "local-model-suitability",
        "server_url": "https://local-model-suitability-mcp-production.up.railway.app",
        "require_approval": "never"
    })]
)

LangGraph

Same as LangChain above — langchain-mcp-adapters works with LangGraph natively.

Legal

Results are for cost-optimisation guidance only and do not constitute technical advice. Full terms: kordagencies.com/terms.html

Local Model Suitability MCP

Install to Claude Code

Tools

Local Model Suitability MCP

The Tool

`check_local_viability`

Data Sources

Pricing

Setup

Harness Integration

Claude Code / Claude Desktop (.mcp.json)

LangChain (Python)

OpenAI Agents SDK (Python)

LangGraph

Legal

Configuration

Local Model Suitability MCP

Install to Claude Code

Tools

Local Model Suitability MCP

The Tool

`check_local_viability`

Data Sources

Pricing

Setup

Harness Integration

Claude Code / Claude Desktop (.mcp.json)

LangChain (Python)

OpenAI Agents SDK (Python)

LangGraph

Legal

Configuration

Related Productivity & Office MCP Servers

Related Productivity & Office MCP Servers