Connects Claude and other MCP clients to Forge, Voxell's hosted text-embedding API running the Qwen3-Embedding family. You get two tools: embed text into vectors (with input_type flags for query vs. document, Matryoshka dimension truncation, and three quality tiers) and list_models to see what's available. The ultra tier runs the 8B model that sits at #4 on MTEB English. Voxell doesn't store your text or vectors, just token counts for billing. Useful when you need semantic search, RAG retrieval, or similarity scoring without managing your own embedding infrastructure. Also speaks OpenAI's embeddings API if you want to swap it into existing code that already calls text-embedding-3-large.
An MCP server for Forge, Voxell's hosted text-embedding API. It exposes Forge to any MCP client (Claude, Cursor, Cline, Windsurf, VS Code, …) as two tools:
- embed: turn text into vectors
- list_models: list available models and their dimensions

You bring a Forge API key. The server is stateless, and Voxell does not store the text you send or the vectors it returns; only usage metadata (token counts) is recorded, for billing. It does embeddings only: no storage, no search, no RAG. Those are different products.
One-click install in your editor (then replace your-key-here with a real key from
dash.voxell.ai):
Claude Code — one command:
```bash
claude mcp add forge -e FORGE_API_KEY=your-key-here -- npx -y @voxell/forge-mcp
```
Any other client (Claude Desktop, Cline, Windsurf, Zed, …) uses the standard mcpServers
block — see Use it below.
- ultra is the 8B model: ~75+ average task score on MTEB, currently #4 on MTEB (English), and the top usable model (the three ranked above it are research-only). turbo (0.6B) is the fast/cheap default. Pick your quality/cost point.
- Pass dim to truncate (re-normalized) for ~4× smaller, cheaper vectors.
- Embed each document with input_type: "document" and each query with input_type: "query", then rank by cosine similarity.
- Pass dim to truncate (Matryoshka) and trade a little accuracy for smaller, cheaper vectors.
- Everything goes through the embed tool, with no separate script.

Most MCP clients run it on demand with npx. Add this to your client's MCP config:
```json
{
  "mcpServers": {
    "forge": {
      "command": "npx",
      "args": ["-y", "@voxell/forge-mcp"],
      "env": { "FORGE_API_KEY": "your-key-here" }
    }
  }
}
```
(Cursor, Claude Desktop, Cline, Windsurf, and VS Code all use this mcpServers shape.)
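The query/document pattern described earlier (embed documents with input_type "document", embed the query with input_type "query", then rank by cosine similarity) can be sketched in a few lines. This is a minimal illustration that assumes you already have the vectors back from the embed tool; `cosine_similarity`, `rank_documents`, and the toy 3-d vectors are hypothetical stand-ins, not part of Forge.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector has no direction; treat it as unrelated
    return dot / (norm_a * norm_b)


def rank_documents(query_vec, doc_vecs):
    """Return (doc_index, score) pairs, most similar first."""
    scored = [(i, cosine_similarity(query_vec, v)) for i, v in enumerate(doc_vecs)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy 3-d vectors standing in for real embed output (real ones are 1024-d or larger).
doc_vecs = [[1.0, 0.0, 0.0], [0.6, 0.8, 0.0], [0.0, 0.0, 1.0]]
query_vec = [0.8, 0.6, 0.0]

ranking = rank_documents(query_vec, doc_vecs)
for index, score in ranking:
    print(index, round(score, 3))
```

If the vectors are already unit length, the dot product alone gives the same ordering; the explicit normalization here just keeps the sketch safe for arbitrary input.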
embed

| arg | type | default | notes |
|---|---|---|---|
| input | string or string[] | (none) | text(s) to embed (required) |
| model | string | turbo | turbo (1024-d), pro (2560-d), ultra (4096-d) |
| dim | number | model default | truncate to N dimensions (Matryoshka); works on every model |
| input_type | "query" \| "document" | document | use query for search queries |
Returns the vectors plus the model, dimension, and token count.
Default is turbo — the one you probably want. pro/ultra trade size and speed for more
dimensions.
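For readers wiring this up by hand, the sketch below shows roughly what a call to the embed tool looks like on the wire. MCP clients invoke tools with a JSON-RPC 2.0 `tools/call` request; the argument names come from the table above, while `build_embed_call`, `request_id`, and the rule that dim may not exceed the model's native size are assumptions made for illustration.

```python
import json

# Tier names and native dimensions, copied from the embed argument table.
MODEL_DIMS = {"turbo": 1024, "pro": 2560, "ultra": 4096}


def build_embed_call(texts, model="turbo", dim=None, input_type="document", request_id=1):
    """Build a JSON-RPC 2.0 tools/call request for the embed tool.

    build_embed_call and request_id are illustrative names, not part of Forge.
    """
    if model not in MODEL_DIMS:
        raise ValueError(f"unknown model: {model!r}")
    if input_type not in ("query", "document"):
        raise ValueError(f"unknown input_type: {input_type!r}")
    # Assumption: a truncation target lies between 1 and the model's native size.
    if dim is not None and not 1 <= dim <= MODEL_DIMS[model]:
        raise ValueError(f"dim must be in 1..{MODEL_DIMS[model]} for {model}")

    arguments = {"input": texts, "model": model, "input_type": input_type}
    if dim is not None:
        arguments["dim"] = dim  # omit to get the model's default dimension

    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "embed", "arguments": arguments},
    }


request = build_embed_call(
    ["how do I reset my password?"], model="ultra", dim=1024, input_type="query"
)
print(json.dumps(request, indent=2))
```

In normal use the MCP client builds this request for you; the sketch is mainly useful for debugging or for checking arguments before they reach the server.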
list_models

Lists the available models and their dimensions.
| env | required | default |
|---|---|---|
| FORGE_API_KEY | yes | (none) |
| FORGE_BASE_URL | no | https://api.voxell.ai |
Forge speaks the OpenAI embeddings API. Point any OpenAI client at Forge: apart from the base URL and API key, nothing in your code changes, and your existing vector dimensions are preserved:
```python
from openai import OpenAI

# Point the standard OpenAI client at Forge's OpenAI-compatible endpoint.
client = OpenAI(base_url="https://api.voxell.ai/v1", api_key="your-forge-key")

# the exact call you already make, now on a higher-ranked engine:
client.embeddings.create(model="text-embedding-3-large", input=["hello world"])  # -> 3072-d
```
Your OpenAI model names map to a matching-dimension Forge tier (text-embedding-3-small / ada-002 → 1536-d, text-embedding-3-large → 3072-d), so existing vector stores keep their schema. The stored vectors themselves still need re-embedding, because vectors produced by different models are not comparable. Or address Forge tiers directly: turbo | pro | ultra. Forge also supports dimensions (Matryoshka, re-normalized) and encoding_format: "base64".
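To make "Matryoshka, re-normalized" concrete: truncation keeps the first N components of a vector, and re-normalization rescales what is left back to unit length so cosine and dot-product scores stay well behaved. The sketch below shows that arithmetic on the client side. Whether Forge performs exactly these steps server-side is an assumption; the source only states that truncated vectors are re-normalized, and `truncate_and_renormalize` is a hypothetical helper name.

```python
import math


def truncate_and_renormalize(vec, dim):
    """Keep the first `dim` components of `vec`, then rescale to unit L2 norm."""
    if not 1 <= dim <= len(vec):
        raise ValueError(f"dim must be in 1..{len(vec)}")
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return head  # nothing to rescale in an all-zero prefix
    return [x / norm for x in head]


# A unit-length 4-d toy vector standing in for a full-size embedding.
full = [0.5, 0.5, 0.5, 0.5]
short = truncate_and_renormalize(full, 2)

print(len(short))
print(math.sqrt(sum(x * x for x in short)))
```

Vectors truncated to different sizes cannot be compared with each other, so pick one dim per index and use it for both documents and queries.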
It's an upgrade on every path. Forge's smallest tier (turbo, Qwen3-Embedding-0.6B)
outranks OpenAI's largest embedding model (text-embedding-3-large) on MTEB — so there's no
drop-in that lands worse. ultra (Qwen3-Embedding-8B, ~75+ average task score, #4 on MTEB English)
is a different league.
Why re-embedding onto Forge is worth it. Embedding is a one-way door: whatever an encoder discards at write time is gone — no reranker, longer prompt, or bigger LLM downstream reconstructs what the vectors never captured. The model you embed with sets the ceiling on everything above it. Re-embed once onto a higher-ranked engine and that ceiling rises — permanently.
MIT © Voxell, Inc.
FORGE_API_KEY (required, secret): your Forge API key. Create one at https://dash.voxell.ai (free tokens to start, no card).