If you're running Claude with a dozen MCP servers and watching your context window fill up with tool schemas on every turn, this proxy collapses them down to three tools: mcp_search, mcp_call, and mcp_schema. The LLM queries in natural language to find the right tool, then calls it on demand. The proxy sits between your client and upstream servers like Google Workspace or custom APIs, handling discovery with BM25 lexical search and lazy loading tools only when needed. Pure JavaScript, no native dependencies. Most helpful when you're juggling 50+ tools across multiple servers and token overhead is eating into your actual work. Works over stdio or HTTP, and the included systemd service keeps it running in the background.
A context-aware MCP proxy that reduces token usage by exposing only 3 tools (mcp_search, mcp_call, mcp_schema) to LLMs instead of the full catalog.
When you connect multiple MCP servers to an LLM, every tool from every server is listed in the LLM's context window. For a typical workspace with 50-100 tools across multiple MCP servers, that's thousands of tokens of schema documentation on every request.
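To get a feel for that overhead, the sketch below compares the schema cost of a full catalog with the cost of three proxy tools. The catalog shape, the `estimateTokens` helper, and the four-characters-per-token heuristic are all assumptions made for illustration; a real tokenizer will give different numbers.

```javascript
// Rough estimate: ~4 characters per token (an assumption, not a real tokenizer).
function estimateTokens(value) {
  return Math.ceil(JSON.stringify(value).length / 4);
}

// Hypothetical catalog: 60 tools, each with a small JSON Schema.
const catalog = Array.from({ length: 60 }, (_, i) => ({
  name: `tool_${i}`,
  description: "Does one specific thing against an upstream API.",
  inputSchema: {
    type: "object",
    properties: { id: { type: "string" }, limit: { type: "number" } },
    required: ["id"],
  },
}));

// What the LLM sees through the proxy: three tools, whatever the catalog size.
// The schemas are reused from the catalog only to keep the comparison simple.
const proxied = ["mcp_search", "mcp_call", "mcp_schema"].map((name, i) => ({
  ...catalog[i],
  name,
}));

const fullCost = estimateTokens(catalog);
const proxyCost = estimateTokens(proxied);
console.log({ fullCost, proxyCost, saved: fullCost - proxyCost });
```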
MCP Proxy Gateway sits between your LLM and your MCP servers:

    ┌─────────────────────────────────────────────────────────────────┐
    │                            Your LLM                             │
    │          (sees only: mcp_search, mcp_call, mcp_schema)          │
    └────────────────────┬────────────────────────────────────────────┘
                         │
             ┌───────────▼──────────────┐
             │    MCP Proxy Gateway     │
             │ ┌──────────────────────┐ │
             │ │    Tool Registry     │ │
             │ │    (BM25 lexical)    │ │
             │ └──────────────────────┘ │
             │ ┌──────────────────────┐ │
             │ │  Connector Manager   │ │
             │ │ (Idle timeout reap)  │ │
             │ └──────────────────────┘ │
             └────────────┬─────────────┘
                          │
            ┌─────────────┼─────────────┐
            │             │             │
            ▼             ▼             ▼
      ┌───────────┐ ┌───────────┐ ┌───────────┐
      │ Google    │ │MailerLite │ │ Your Svc  │
      │ Gmail     │ │ Campaigns │ │ Custom    │
      │ Calendar  │ │           │ │ Tools     │
      │ Drive     │ │           │ │           │
      └───────────┘ └───────────┘ └───────────┘
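The Tool Registry box relies on BM25 lexical ranking. The following is a minimal sketch of that technique over tool names and descriptions, not the proxy's actual code: the tool entries, the `bm25Search` function, and the `k1`/`b` constants (common textbook defaults) are assumptions.

```javascript
// Hypothetical tool entries; real names come from the upstream servers.
const tools = [
  { name: "google_send_email", description: "Send an email message through Gmail" },
  { name: "google_list_events", description: "List upcoming calendar events" },
  { name: "google_search_files", description: "Search files stored in Drive" },
];

// Lowercase and split on anything that is not a letter or digit.
const tokenize = (text) => text.toLowerCase().match(/[a-z0-9]+/g) ?? [];

function bm25Search(query, docs, limit = 3, k1 = 1.2, b = 0.75) {
  const tokenized = docs.map((d) => tokenize(`${d.name} ${d.description}`));
  const avgLen = tokenized.reduce((sum, t) => sum + t.length, 0) / docs.length;

  // Document frequency: how many documents contain each term.
  const df = new Map();
  for (const terms of tokenized) {
    for (const term of new Set(terms)) df.set(term, (df.get(term) ?? 0) + 1);
  }

  const queryTerms = new Set(tokenize(query));
  const scored = docs.map((doc, i) => {
    const terms = tokenized[i];
    let score = 0;
    for (const term of queryTerms) {
      const tf = terms.filter((t) => t === term).length;
      if (tf === 0) continue;
      const n = df.get(term);
      const idf = Math.log(1 + (docs.length - n + 0.5) / (n + 0.5));
      const norm = tf + k1 * (1 - b + (b * terms.length) / avgLen);
      score += (idf * tf * (k1 + 1)) / norm;
    }
    return { name: doc.name, score };
  });

  // Keep only matching tools, best score first.
  return scored
    .filter((r) => r.score > 0)
    .sort((x, y) => y.score - x.score)
    .slice(0, limit);
}

const hits = bm25Search("send an email", tools);
console.log(hits.map((h) => h.name));
```

Because the ranking is purely lexical, a query only matches tools that share at least one word with it, which is the trade-off for avoiding an embedding model.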
git clone https://github.com/steveweltman/4q-tokenz.git
cd 4q-tokenz
pnpm install
pnpm build
# Install to ~/.local/bin and configure
./install.sh
npm install -g 4q-tokenz
Here's a concrete walkthrough to connect Google Workspace (Gmail, Calendar, Drive) to your LLM through the proxy:
You need an MCP server that wraps Google APIs. Options:
@antidrift/mcp-google (recommended) — A collection of MCP server implementations for Google Workspace (Gmail, Calendar, Drive, Docs, Sheets). Works out of the box with this proxy.
npm install @antidrift/mcp-google
# or
npx @antidrift/mcp-google --help
@modelcontextprotocol/server-gmail — Gmail-only, official MCP server
Build your own — See the MCP spec to wrap your own APIs
Run the server once with your Google OAuth credentials file to generate token.json:
GOOGLE_CREDENTIAL_FILE=~/Downloads/credentials.json \
npx @antidrift/mcp-google
This opens a browser for you to authorize. Once done, it saves token.json locally.

Create ~/.config/4q-tokens/config.json:
{
"upstreams": [
{
"name": "google-workspace",
"transport": "stdio",
"command": "npx",
"args": ["@antidrift/mcp-google"],
"env": {
"GOOGLE_TOKEN_FILE": "~/.local/share/google-mcp/token.json",
"GOOGLE_CONNECTORS": "gmail,calendar,drive"
}
}
],
"searchLimit": 5,
"callItemLimit": 30,
"maxTextLength": 800,
"maxOutputTokens": 10000,
"idleTimeoutMs": 600000
}
mcp-proxy
# Or via systemd if installed:
systemctl --user start mcp-proxy
Configure your LLM to use http://127.0.0.1:9200/mcp as its MCP server. It will see:
- mcp_search: find tools by natural language
- mcp_call: invoke a tool
- mcp_schema: see tool details

Example query:
mcp_search("send an email")
# Returns: google_send_email (Gmail)
mcp_call(ref="google_send_email", args={"to": "user@example.com", "subject": "Hello", "body": "Test"})
export MCP_PROXY_UPSTREAMS='[
{
"name": "google",
"transport": "stdio",
"command": "node",
"args": ["/path/to/google/server.mjs"],
"env": {
"GOOGLE_TOKEN_FILE": "token.json"
}
}
]'
export MCP_PROXY_SINGLETON_PORT=9200
export MCP_PROXY_DASHBOARD_PORT=9100
node dist/index.js
Create ~/.config/4q-tokens/config.json:
{
"upstreams": [
{
"name": "google-workspace",
"transport": "stdio",
"command": "npx",
"args": ["@antidrift/mcp-google"],
"env": {
"GOOGLE_TOKEN_FILE": "token.json",
"GOOGLE_CONNECTORS": "gmail,calendar,drive"
}
},
{
"name": "external-api",
"transport": "http",
"url": "https://mcp.example.com/",
"auth": {
"apiKey": "MY_API_KEY_ENV_VAR"
}
}
],
"searchLimit": 3,
"callItemLimit": 20,
"maxTextLength": 500,
"maxOutputTokens": 8000,
"idleTimeoutMs": 300000
}
Then run:
node dist/index.js
The proxy will load the config from ~/.config/4q-tokens/config.json if it exists, otherwise fall back to the MCP_PROXY_UPSTREAMS environment variable.
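That precedence could be sketched roughly as follows. The `loadUpstreams` helper and its return shape are hypothetical; only the config path and the `MCP_PROXY_UPSTREAMS` variable come from the text above.

```javascript
const fs = require("node:fs");
const os = require("node:os");
const path = require("node:path");

// Default location taken from the text above.
const DEFAULT_CONFIG = path.join(os.homedir(), ".config", "4q-tokens", "config.json");

// Hypothetical helper: prefer the config file, else fall back to MCP_PROXY_UPSTREAMS.
function loadUpstreams(configPath = DEFAULT_CONFIG, env = process.env) {
  if (fs.existsSync(configPath)) {
    const config = JSON.parse(fs.readFileSync(configPath, "utf8"));
    return { source: "file", upstreams: config.upstreams ?? [] };
  }
  if (env.MCP_PROXY_UPSTREAMS) {
    return { source: "env", upstreams: JSON.parse(env.MCP_PROXY_UPSTREAMS) };
  }
  return { source: "none", upstreams: [] };
}

console.log(loadUpstreams().source);
```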
{
"name": "unique-id",
"transport": "stdio" | "http",
// For stdio transport:
"command": "node",
"args": ["path/to/server.mjs"],
"cwd": "/working/dir", // optional
"env": { "KEY": "value" }, // optional
// For http transport:
"url": "https://example.com/mcp",
"auth": {
"apiKey": "ENV_VAR_NAME" // reads from process.env[ENV_VAR_NAME]
}
}
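As an illustration of how these fields fit together, here is a small validator for a single upstream entry. The `resolveUpstream` function and its error messages are hypothetical; the one behavior taken from the text is that `auth.apiKey` names an environment variable rather than holding the secret itself.

```javascript
// Hypothetical validator for one upstream entry, following the fields above.
function resolveUpstream(entry, env = process.env) {
  if (!entry.name) throw new Error("upstream needs a unique name");

  if (entry.transport === "stdio") {
    if (!entry.command) throw new Error(`${entry.name}: stdio needs "command"`);
    return {
      name: entry.name,
      transport: "stdio",
      command: entry.command,
      args: entry.args ?? [],
      cwd: entry.cwd, // optional
      env: entry.env ?? {}, // optional
    };
  }

  if (entry.transport === "http") {
    if (!entry.url) throw new Error(`${entry.name}: http needs "url"`);
    // auth.apiKey holds the NAME of an environment variable, not the secret.
    const keyName = entry.auth?.apiKey;
    const apiKey = keyName ? env[keyName] : undefined;
    if (keyName && !apiKey) throw new Error(`${entry.name}: ${keyName} is not set`);
    return { name: entry.name, transport: "http", url: entry.url, apiKey };
  }

  throw new Error(`${entry.name}: unknown transport "${entry.transport}"`);
}

const example = resolveUpstream(
  { name: "external-api", transport: "http", url: "https://mcp.example.com/", auth: { apiKey: "MY_API_KEY_ENV_VAR" } },
  { MY_API_KEY_ENV_VAR: "secret-value" }
);
console.log(example.transport, Boolean(example.apiKey));
```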
| Option | Default | Description |
|---|---|---|
| searchLimit | 3 | Max tools returned by mcp_search |
| callItemLimit | 20 | Max items in mcp_call response |
| maxTextLength | 500 | Truncate text fields to N chars (detail=false: 500, detail=true: 1500) |
| maxOutputTokens | 8000 | Hard cap on response size |
| idleTimeoutMs | 300000 | Disconnect upstream servers after N ms of inactivity (0 = disabled) |
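The idle timeout can be pictured as a periodic sweep over last-used timestamps. The sketch below is an assumption about how such a reaper might work, not the proxy's Connector Manager: the `ConnectorPool` class only tracks timestamps, whereas a real one would also close the underlying transport.

```javascript
// Hypothetical pool that records when each upstream was last used.
class ConnectorPool {
  constructor(idleTimeoutMs) {
    this.idleTimeoutMs = idleTimeoutMs;
    this.active = new Map(); // upstream name -> last-used timestamp (ms)
  }

  // Called whenever an upstream is used: mark it active and refresh its timestamp.
  touch(name, now = Date.now()) {
    this.active.set(name, now);
  }

  // Drop upstreams that have been idle for longer than the timeout.
  reap(now = Date.now()) {
    if (this.idleTimeoutMs === 0) return []; // 0 disables reaping
    const reaped = [];
    for (const [name, lastUsed] of this.active) {
      if (now - lastUsed > this.idleTimeoutMs) {
        this.active.delete(name);
        reaped.push(name);
      }
    }
    return reaped;
  }
}

const pool = new ConnectorPool(300000);
pool.touch("google-workspace", 0);
pool.touch("external-api", 200000);
// At t = 400000 ms only google-workspace has been idle for more than 300000 ms.
console.log(pool.reap(400000));
```

A reaped upstream would simply be reconnected on its next call, which is what keeps rarely used servers from holding resources.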
Environment variable overrides:
export MCP_PROXY_SEARCH_LIMIT=5
export MCP_PROXY_CALL_ITEM_LIMIT=30
export MCP_PROXY_MAX_TEXT_LENGTH=800
export MCP_PROXY_MAX_OUTPUT_TOKENS=10000
export MCP_PROXY_IDLE_TIMEOUT_MS=600000
node dist/index.js
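One plausible way to combine these sources is sketched here. The `resolveOptions` function is hypothetical, and the precedence it applies (environment variable, then config file value, then default) is an assumption based on the word "overrides"; the defaults and variable names are the ones listed above.

```javascript
// Option name -> [environment variable, default value].
const OPTIONS = {
  searchLimit: ["MCP_PROXY_SEARCH_LIMIT", 3],
  callItemLimit: ["MCP_PROXY_CALL_ITEM_LIMIT", 20],
  maxTextLength: ["MCP_PROXY_MAX_TEXT_LENGTH", 500],
  maxOutputTokens: ["MCP_PROXY_MAX_OUTPUT_TOKENS", 8000],
  idleTimeoutMs: ["MCP_PROXY_IDLE_TIMEOUT_MS", 300000],
};

// Hypothetical resolver: env var wins, then the config file, then the default.
function resolveOptions(fileConfig = {}, env = process.env) {
  const resolved = {};
  for (const [key, [envName, fallback]] of Object.entries(OPTIONS)) {
    const fromEnv = Number.parseInt(env[envName] ?? "", 10);
    resolved[key] = Number.isNaN(fromEnv) ? (fileConfig[key] ?? fallback) : fromEnv;
  }
  return resolved;
}

console.log(resolveOptions({ searchLimit: 5 }, { MCP_PROXY_IDLE_TIMEOUT_MS: "600000" }));
```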
The proxy connects via stdio to your LLM. Use it with Claude or other MCP clients.
The install script can set up the systemd user service for you (see below), or you can create it manually:
~/.config/systemd/user/mcp-proxy.service:
[Unit]
Description=MCP Proxy Gateway
After=network.target
[Service]
Type=simple
ExecStart=%h/.local/bin/mcp-proxy
Restart=on-failure
RestartSec=5s
Environment="PATH=%h/.local/bin:/usr/local/bin:/usr/bin"
[Install]
WantedBy=default.target
systemctl --user daemon-reload
systemctl --user enable mcp-proxy
systemctl --user start mcp-proxy
journalctl --user -u mcp-proxy -f
The dashboard exposes a Prometheus-compatible /metrics endpoint:
curl http://localhost:9100/metrics
Metrics exposed:
| Metric | Type | Description |
|---|---|---|
| mcp_proxy_uptime_seconds | gauge | Seconds since process started |
| mcp_proxy_registered_tools | gauge | Tools in the registry |
| mcp_proxy_upstream_up | gauge | 1 if upstream is connected/idle, 0 if error |
| mcp_proxy_upstream_tools | gauge | Tools discovered per upstream |
| mcp_proxy_calls_total | counter | Calls by tool, provider, status |
| mcp_proxy_call_duration_ms_total | counter | Cumulative call duration (ms) |
| mcp_proxy_output_bytes_total | counter | Cumulative output bytes |
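For orientation, this is roughly what such metrics look like in the Prometheus text exposition format. The `renderMetrics` helper, the `upstream` label name, and the sample values are assumptions; only the metric names and types come from the table. Label values are not escaped in this sketch.

```javascript
// Render metrics in the Prometheus text exposition format.
function renderMetrics(metrics) {
  const lines = [];
  for (const { name, type, help, samples } of metrics) {
    lines.push(`# HELP ${name} ${help}`, `# TYPE ${name} ${type}`);
    for (const { labels = {}, value } of samples) {
      const pairs = Object.entries(labels).map(([k, v]) => `${k}="${v}"`);
      lines.push(pairs.length ? `${name}{${pairs.join(",")}} ${value}` : `${name} ${value}`);
    }
  }
  return lines.join("\n") + "\n";
}

const text = renderMetrics([
  {
    name: "mcp_proxy_registered_tools",
    type: "gauge",
    help: "Tools in the registry",
    samples: [{ value: 42 }],
  },
  {
    name: "mcp_proxy_upstream_up",
    type: "gauge",
    help: "1 if upstream is connected/idle, 0 if error",
    // "upstream" is a hypothetical label name.
    samples: [{ labels: { upstream: "google-workspace" }, value: 1 }],
  },
]);
console.log(text);
```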
Add a scrape job in your Prometheus/Alloy config:
scrape_configs:
- job_name: mcp-proxy
static_configs:
- targets: ['localhost:9100']
metrics_path: /metrics
The proxy always starts an HTTP transport, on port 9200 by default. Set MCP_PROXY_SINGLETON_PORT to use a different port. This lets multiple clients connect to a single proxy instance.
export MCP_PROXY_SINGLETON_PORT=9200
node dist/index.js &
# From another process:
curl -X POST http://127.0.0.1:9200/mcp -H "Content-Type: application/json" \
-d '{"jsonrpc": "2.0", "method": "tools/call", "params": {...}}'
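The same request can be assembled from JavaScript. This sketch only builds the request; the `buildToolCall` helper and the `query` argument name are assumptions, while the `tools/call` parameter shape (`name`, `arguments`) follows the MCP specification. Depending on how the proxy implements the HTTP transport, a client may also need to complete the MCP `initialize` handshake before calling tools.

```javascript
// Build a JSON-RPC 2.0 "tools/call" request for the proxy's HTTP endpoint.
function buildToolCall(id, toolName, args, port = 9200) {
  return {
    url: `http://127.0.0.1:${port}/mcp`,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        // MCP's Streamable HTTP transport expects clients to accept both types.
        Accept: "application/json, text/event-stream",
      },
      body: JSON.stringify({
        jsonrpc: "2.0",
        id,
        method: "tools/call",
        params: { name: toolName, arguments: args },
      }),
    },
  };
}

// "query" is a hypothetical argument name; check mcp_search's schema for the real one.
const request = buildToolCall(1, "mcp_search", { query: "send an email" });
console.log(request.url);
console.log(request.init.body);
// To send it on Node 18+: const res = await fetch(request.url, request.init);
```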
Check the server logs in the dashboard (port 9100 by default) or daemon logs:
journalctl --user -u mcp-proxy -e
The proxy logs:
The proxy has comprehensive error handling to gracefully degrade on upstream failures:
- NO_UPSTREAMS

For unhandled errors, check:
journalctl --user -u mcp-proxy -n 50 # Last 50 lines
When a tool returns null or malformed data, the output shaper handles it gracefully:
- []
- {value: string}
- _rawContent

If a tool response looks truncated, retry with detail=true in mcp_call to disable output shaping:
mcp_call(ref="google_send_email", args={...}, detail=true)
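To make the shaping behavior concrete, here is a simplified sketch of an output shaper. It is an assumption, not the proxy's implementation: `shapeOutput` is a hypothetical name, it follows the options table in raising the text limit to 1500 characters when detail is true, and it treats null or non-list data as an empty list.

```javascript
// Hypothetical shaper: cap the number of items and truncate long string fields.
function shapeOutput(items, { callItemLimit = 20, maxTextLength = 500, detail = false } = {}) {
  // Assumption: null or non-list data becomes an empty list.
  if (!Array.isArray(items)) return [];

  // Per the options table: 1500 characters when detail is true.
  const limit = detail ? 1500 : maxTextLength;

  return items.slice(0, callItemLimit).map((item) =>
    Object.fromEntries(
      Object.entries(item).map(([key, value]) => [
        key,
        typeof value === "string" && value.length > limit ? value.slice(0, limit) : value,
      ])
    )
  );
}

// 25 hypothetical messages with 2000-character bodies.
const messages = Array.from({ length: 25 }, (_, i) => ({ id: i, body: "x".repeat(2000) }));
const brief = shapeOutput(messages);
const detailed = shapeOutput(messages, { detail: true });
console.log(brief.length, brief[0].body.length, detailed[0].body.length);
```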
The proxy binds to 127.0.0.1 only for security — it's not accessible from the network by default. To access remotely:
127.0.0.1:9200
ssh -L 9200:127.0.0.1:9200 user@remote-host
- /metrics endpoint on the dashboard port (9100 by default)
- mcp_proxy_uptime_seconds, mcp_proxy_registered_tools, mcp_proxy_upstream_up, mcp_proxy_upstream_tools, mcp_proxy_calls_total, mcp_proxy_call_duration_ms_total, mcp_proxy_output_bytes_total
- Removed @xenova/transformers entirely: eliminates the protobufjs/ONNX runtime supply chain
- Upgraded @modelcontextprotocol/sdk from ~1.22.0 to ^1.26.0: resolves 3 high-severity CVEs (ReDoS, cross-client data leak, DNS rebinding)
- protobufjs >=7.5.8: resolves a critical arbitrary code execution issue and multiple high CVEs in the @xenova/transformers transitive dependency chain
- qs >=6.15.2: resolves a moderate DoS vulnerability in a transitive dependency of express

MCP Proxy Gateway is a fork of @arvoretech/mcp-proxy, originally created by João Augusto and Árvore Educação.
Forked and extended with:
MIT. See LICENSE.