Self Hosted Ai

4 tools · HTTP · registry active

Summary

Exposes 60+ engineering articles from a self-hosted AI blog running on NVIDIA DGX Spark hardware (GB10/SM121A). You get four read-only tools: TF-IDF full-text search with optional tag filtering, a tag browser, article fetcher by slug, and a pattern-matching diagnostic for SGLang runtime errors. The corpus covers niche ground like SGLang and vLLM patches for GB10, ARM64 deployment quirks, KV-cache tuning, and podcast-quality TTS pipelines. Runs over Streamable HTTP via FastMCP with DNS rebinding protection. Reach for this when you're debugging similar hardware stacks and need version-current setup notes instead of stale LLM training data.

Tools

Public tool metadata for what this MCP can expose to an agent.

4 tools

search_blog (4 params)

Search the Sovereign AI Blog for articles matching a natural language query, optionally filtered by tag and sorted by relevance or date. Behaviour matrix:
- query='', sort=* -> list newest-first, optionally tag-filtered
- query!='', sort=relevance -> TF-IDF ranked, optionally...

Parameters (* = required)

n (integer)

Maximum number of results to return. Default: 5

tag (value)

Optional tag filter (e.g. 'setup', 'fixes', 'strategy'). Only articles with this tag are considered. Use list_tags to discover available tags.

sort (string)

Result ordering. 'relevance' uses TF-IDF score (default for non-empty query). 'date_desc' sorts newest first (default behaviour when query is empty). When query is empty, 'relevance' is treated as 'date_desc'. One of: relevance · date_desc. Default: relevance

query (string)

Natural language search query (e.g. 'flashinfer OOM on GB10'). Multi-word queries are tokenized and TF-IDF ranked. Pass an empty string to list articles without ranking by relevance. Default: '' (empty string)

get_article (1 param)

Retrieve the full content of a blog article by its slug. Returns the article body (Markdown) plus metadata. If the slug does not match any article, returns an Article with `error='article_not_found'` and other fields at their defaults.

Parameters (* = required)

slug (string)

Article slug as returned by search_blog (e.g. 'setup-mistral-sglang-setup'). Lower-case, hyphenated.
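
Because a missing slug produces a normal result rather than an exception, callers need to check the `error` field before using the body. A minimal sketch of that contract, using a dataclass as a stand-in for the server's Pydantic model; every field name except `error` is an assumption, as is the use of `None` for "no error":

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Article:  # stand-in model; only `error` is named in the tool description
    slug: str = ""
    body: str = ""
    tags: List[str] = field(default_factory=list)
    error: Optional[str] = None  # assumed to be None when the lookup succeeds


def get_article(kb: Dict[str, Article], slug: str) -> Article:
    found = kb.get(slug)
    if found is None:
        # Not found: all other fields stay at their defaults.
        return Article(error="article_not_found")
    return found


def is_found(article: Article) -> bool:
    # Callers should branch on this instead of relying on an exception.
    return article.error is None
```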

diagnose_sglang (6 params)

Validate an SGLang configuration for NVIDIA DGX Spark (GB10/SM121A). Pure pattern-matching against known failure modes documented in the Sovereign AI Blog. No inference, no external calls. Returns critical issues, non-fatal warnings, and a recommended baseline config. All para...

Parameters (* = required)

hardware (string)

Hardware description (e.g. 'GB10', 'DGX Spark', 'SM121A'). Empty = skip GB10-specific rules. Default: '' (empty string)

image_tag (string)

Docker image tag in use (e.g. 'lmsysorg/sglang:latest', 'lmsysorg/sglang:v0.4.0'). Empty = skip. Default: '' (empty string)

mem_fraction (number)

SGLang --mem-fraction-static value (e.g. 0.88). 0.0 = skip this check. Default: 0

error_message (string)

Paste error log output here for pattern matching against known failure modes. Default: '' (empty string)

attention_backend (string)

SGLang --attention-backend value (e.g. 'flashinfer', 'triton'). Empty string = skip this check. Default: '' (empty string)

cuda_graph_max_bs (integer)

SGLang --cuda-graph-max-bs value. 0 = skip this check. Default: 0
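
Every parameter has a sentinel default (empty string or 0) that means "skip this check", so an agent can pass only what it knows. The sketch below shows that skip logic together with case-insensitive substring matching on the error log. The three rules and the `Diagnosis` shape are placeholders invented for illustration; the real rule set lives in the server and is not reproduced here.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Diagnosis:  # hypothetical result shape
    checked: List[str] = field(default_factory=list)
    critical: List[str] = field(default_factory=list)
    warnings: List[str] = field(default_factory=list)


# Placeholder pattern -> finding table; NOT the server's real rules.
ERROR_PATTERNS = {
    "out of memory": "log mentions an out-of-memory condition",
}


def diagnose(hardware: str = "", image_tag: str = "", mem_fraction: float = 0.0,
             error_message: str = "", attention_backend: str = "",
             cuda_graph_max_bs: int = 0) -> Diagnosis:
    result = Diagnosis()
    inputs = {
        "hardware": hardware,
        "image_tag": image_tag,
        "mem_fraction": mem_fraction,
        "error_message": error_message,
        "attention_backend": attention_backend,
        "cuda_graph_max_bs": cuda_graph_max_bs,
    }
    # Sentinel values ('' or 0) mean the caller did not supply that input.
    result.checked = [name for name, value in inputs.items() if value not in ("", 0)]

    # Placeholder rule: an unpinned image tag is worth a warning.
    if image_tag and image_tag.endswith(":latest"):
        result.warnings.append("image tag is unpinned")
    # Placeholder rule: a memory fraction must lie in (0, 1].
    if mem_fraction != 0.0 and not 0.0 < mem_fraction <= 1.0:
        result.critical.append("mem_fraction must be in (0, 1]")
    # Pure pattern matching on the pasted log, no inference involved.
    lowered = error_message.lower()
    for needle, finding in ERROR_PATTERNS.items():
        if needle in lowered:
            result.critical.append(finding)
    return result
```

Calling `diagnose()` with no arguments runs no checks at all, which mirrors the "empty = skip" wording in the parameter descriptions.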

list_tags (1 param)

List all topic tags used across the Sovereign AI Blog corpus, with article counts. Use this to browse the topic space before calling search_blog with a tag filter.

Parameters (* = required)

sort (string)

Result ordering. 'count_desc' lists most-used tags first (default). 'alpha' sorts alphabetically. One of: count_desc · alpha. Default: count_desc
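
The two orderings amount to a tag count followed by a sort. A small sketch, assuming each article contributes a list of tags; the alphabetical tie-break under 'count_desc' is an assumption, since the listing does not say how ties are ordered:

```python
from collections import Counter
from typing import Iterable, List, Tuple


def list_tags(article_tags: Iterable[List[str]],
              sort: str = "count_desc") -> List[Tuple[str, int]]:
    # Count how many articles carry each tag.
    counts = Counter(tag for tags in article_tags for tag in tags)
    if sort == "alpha":
        return sorted(counts.items())
    # count_desc: most-used first; ties broken alphabetically (assumed).
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
```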

Sovereign AI MCP

MCP server exposing the Sovereign AI Blog to AI agents. The blog is a hands-on engineering log of self-hosted AI on NVIDIA DGX Spark (GB10/SM121A).

Live endpoint: https://mcp.sovgrid.org/self-hosted-ai
Transport: Streamable HTTP (FastMCP)
Auth: none (free tier, 60 req/min/IP)

Why use it

Training data on niche hardware (GB10, SM121A, SGLang on ARM64) is sparse and stale. This MCP gives agents direct, structured access to 60+ articles documenting actual setups, fixes, and benchmarks. If you're building or debugging on similar stacks, your agent can pull verified, version-current information instead of hallucinating.

The corpus covers SGLang and vLLM patches for GB10, voxtral and TTS pipelines on ARM64, KV-cache and quantization tradeoffs, podcast-grade audio generation, MCP server design, knowledge-base construction, and the operational side of running it all on a hardened European VPS.

Tools

Tool	Purpose
`search_blog(query, tag?, sort?, n?)`	TF-IDF full-text search. Optional `tag` filter, `sort` by relevance or `date_desc`. Empty `query` lists newest articles. Returns ranked `SearchResult` items with quality score, style, slug, and excerpt.
`list_tags(sort?)`	List all topic tags across the corpus with article counts. Sort by `count_desc` (default) or `alpha`. Use to discover the topic space before filtering `search_blog`.
`get_article(slug)`	Fetch full article body and frontmatter by slug. Returns markdown content plus tags, quality score, publish date.
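`diagnose_sglang(error_message?, hardware?, image_tag?, mem_fraction?, attention_backend?, cuda_graph_max_bs?)`	Pattern-match a runtime error and optional configuration values against a curated rule set for SGLang on GB10/SM121A. Returns critical issues, non-fatal warnings, and a recommended baseline config, with links to setup articles.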

All tools are read-only, idempotent, and declared with ToolAnnotations so MCP clients can calibrate retry policy and trust signals. Inputs use Pydantic Annotated[type, Field(description=...)] so parameter docs reach agents through introspection. Outputs are typed BaseModel shapes, so the schemas are real rather than vacuous dicts.

Quick start

With Claude Code

claude mcp add sovereign-ai --transport http https://mcp.sovgrid.org/self-hosted-ai

Verify:

claude mcp list | grep sovereign-ai

With Cline / Continue / other MCP clients

Add to your client's MCP server config:

{
  "sovereign-ai": {
    "type": "http",
    "url": "https://mcp.sovgrid.org/self-hosted-ai"
  }
}

Run locally

From source (uv)

git clone https://github.com/cipherfoxie/sovereign-mcp.git
cd sovereign-mcp
uv sync
uv run uvicorn src.main:app --host 127.0.0.1 --port 8002

Docker

git clone https://github.com/cipherfoxie/sovereign-mcp.git
cd sovereign-mcp
docker build -t sovereign-mcp .
docker run -p 8002:8002 sovereign-mcp

The repo ships a placeholder data/knowledge-base.json (zero articles, valid schema) so the server starts and answers MCP introspection cleanly out-of-the-box. To populate it with real content, generate from the sovgrid.org blog source using scripts/generate_knowledge_base.py, or build your own KB matching the schema in src/knowledge.py. Or just use the live endpoint at https://mcp.sovgrid.org/self-hosted-ai.

A walk-through of the same KB pattern (Markdown plus JSON index, no vector store) is documented in Build a Self-Hosted Knowledge Base with Plain Text and LLMs.

Architecture

FastMCP 1.27+ with Streamable HTTP transport at path /self-hosted-ai
DNS rebinding protection via TransportSecuritySettings: only allows requests with Host: mcp.sovgrid.org (or localhost for healthchecks)
Health endpoint at /health returns article count and KB generation timestamp
Knowledge base is a flat JSON file generated from blog Markdown content; loaded at startup, queried via TF-IDF for search_blog
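
DNS rebinding works by pointing an attacker-controlled hostname at the server's address, so the browser sends a Host header the server never expected; rejecting unknown Host values blocks it. In the real server this check is done by the MCP SDK's TransportSecuritySettings. The standalone function below only illustrates the idea, and its name and port handling are assumptions:

```python
from typing import FrozenSet

# Hosts named in the architecture notes above.
ALLOWED_HOSTS: FrozenSet[str] = frozenset({"mcp.sovgrid.org", "localhost"})


def host_allowed(host_header: str, allowed: FrozenSet[str] = ALLOWED_HOSTS) -> bool:
    host = host_header.strip().lower()
    # Drop an optional :port suffix (IPv6 literals are out of scope here).
    name, sep, port = host.rpartition(":")
    if sep and port.isdigit():
        host = name
    # Exact match only: suffix or substring matches would let
    # 'mcp.sovgrid.org.evil.example' through.
    return host in allowed
```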

The server is stateless. All blog content is already public (CC BY-SA 4.0). No PII, no auth tokens, no secrets.

Operations

Live deployment runs on a privacy-focused European VPS via Docker, fronted by Caddy with TLS. Server logs flow into a privacy-respecting analytics pipeline (Caddy JSON access logs, no client-side tracking, no JS pixels).

License

Server code: MIT, see LICENSE
Blog content (returned by tools): CC BY-SA 4.0, see creativecommons.org/licenses/by-sa/4.0/

Contact

Blog: sovgrid.org
Nostr: cipherfox@sovgrid.org (NIP-05) — npub1ndrjgfcwkc0y4753zyj3p7qjf795pvjq2dn4m7y7f72vmu7t0nrs6y363u
Bug reports / questions: open an issue
