AI SEO

20 tools · STDIO · registry active

Summary

Ships 20 tools that audit why AI engines do or don't cite your pages. You get schema validation against citation best practices, robots.txt parsing for GPTBot and friends, llms.txt linting and generation, canonical URL hygiene checks, and scoring functions that predict AI Overview eligibility and answer extractability per content block. The rewrite tools restructure pages for AEO and GEO formats. Runs as an MCP server inside Claude or any stdio client, doubles as a GitHub Action to gate PRs on minimum citation scores, and includes headless rendering for SPAs. No API keys. The audit response hands you ranked findings with exact fixes and estimated impact, not opaque scores.

Tools

Public tool metadata for what this MCP can expose to an agent.

20 tools

audit.page (5 params)

Full AI-SEO audit of a single URL: returns categorized findings (info/warning/error) with severity, fix instructions, and a 0-100 composite score plus per-dimension subscores. Read-only. Fetches the URL once and runs every sub-audit (schema, robots, technical, sitemap, AI-Over...

Parameters

url (string)

Public URL to audit. Must be a fully-qualified http(s) URL that returns HTTP 200 (redirects are followed). The tool fetches this URL once and runs every sub-audit (schema, robots, technical, sitemap, AI-Overview eligibility) against the response.

render (string)

Rendering mode. `static` (default) fetches raw HTML via HTTP: fast (<1s), but it misses JS-rendered content typical of SPAs (React/Vue/Angular landing pages). `headless` spins up Playwright Chromium, waits for networkidle, and audits the rendered DOM; it adds 3-10s per audit and requires `playwright-core` installed plus a one-time `npx playwright install chromium`. Use `headless` when the static audit shows `content_quality: "spa_empty"` or you know the target is JS-rendered. One of: static, headless. Default: static.

respect_robots (boolean)

If true (default), the tool checks robots.txt before fetching and skips disallowed paths, returning a robots_blocked finding instead. Set to false ONLY for auditing your own site where you've intentionally blocked crawlers and need the audit to bypass that block. Default: true.

generate_report (boolean)

If true, return a standalone HTML scorecard in the `report_html` field. The HTML is self-contained (no external dependencies) and can be saved as a .html file or pasted to Gist/CodePen. Off by default to keep audits cheap. Default: false.

include_raw_html (boolean)

If true, return the full raw HTML in the response under `raw_html`. Set true only when you need to inspect markup that wasn't captured by the structured findings; the payload can be large. Default: false.
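For orientation, an MCP client invokes a tool by sending a JSON-RPC 2.0 `tools/call` request over the stdio transport. The sketch below only builds that request body for the parameters documented above. The helper name `build_tool_call` is hypothetical, and the tool name is an assumption: this listing shows `audit.page`, while the README refers to `audit_page`, so check the name the server actually advertises via `tools/list`.

```python
import json

def build_tool_call(request_id, tool_name, arguments):
    """Build a JSON-RPC 2.0 `tools/call` request body for an MCP server."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }

# Assumed tool name; the argument keys are the parameters listed above.
request = build_tool_call(
    1,
    "audit.page",
    {"url": "https://example.com/", "render": "static", "respect_robots": True},
)
print(json.dumps(request, indent=2))
```

In practice an MCP host such as Claude Desktop builds and sends this request for you; the sketch is only meant to show which fields the parameters end up in.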

audit.schema (3 params)

Validate JSON-LD structured data against Schema.org rules and AI-citation best practices. Accepts either a URL (fetched) or a raw JSON string (parsed directly). Read-only when given `url` (one HTTP GET). Zero network when given `schema_json`. No writes. Deterministic, rule-bas...

Parameters

url (string)

Public URL to fetch and audit. Either this OR `schema_json` is required. Read-only HTTP GET.

schema_json (string)

Raw JSON-LD as a string (the contents of a `<script type="application/ld+json">` block). Use this to validate a schema block offline without fetching a URL. Either this OR `url` is required.

respect_robots (boolean)

If true (default), respect robots.txt before fetching `url`. Ignored when `schema_json` is used. Default: true.
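To get a `schema_json` value out of a page you already have on disk, you can pull the contents of its `<script type="application/ld+json">` blocks with the standard library. This is a minimal sketch, not the tool's implementation: `JsonLdExtractor` and `check_schema` are hypothetical names, and the two checks shown (`@context` and `@type` present) are only illustrative stand-ins for the tool's much larger rule set.

```python
import json
from html.parser import HTMLParser

class JsonLdExtractor(HTMLParser):
    """Collect the raw text of every application/ld+json script block."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            self.blocks.append("".join(self._buffer))

def check_schema(schema_json):
    """Illustrative checks only: valid JSON, and @context/@type on each node."""
    try:
        data = json.loads(schema_json)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]
    nodes = data if isinstance(data, list) else [data]
    findings = []
    for node in nodes:
        if not isinstance(node, dict):
            findings.append("node is not an object")
            continue
        for key in ("@context", "@type"):
            if key not in node:
                findings.append(f"missing {key}")
    return findings

html = '<head><script type="application/ld+json">{"@type": "Article"}</script></head>'
extractor = JsonLdExtractor()
extractor.feed(html)
for block in extractor.blocks:
    print(check_schema(block))
```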

audit.canonical (2 params)

Audit a page's canonical link integrity: presence, self-reference, cross-domain mismatches, trailing-slash hygiene, and og:url consistency. Read-only. One HTTP GET to fetch the HEAD section. Deterministic, rule-based; no LLM. When to use: a focused canonical-only audit (e.g. d...

Parameters

url (string)

Public URL whose canonical link tag and og:url consistency you want to audit. Must be a fully-qualified http(s) URL. The tool fetches the URL (following redirects) and inspects only the <head> section; the body is not parsed.

respect_robots (boolean)

If true (default), respect robots.txt before fetching. Set false only for auditing your own site where you've intentionally blocked crawlers. Default: true.
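The checks named above (presence, self-reference, cross-domain mismatch, trailing-slash hygiene, og:url consistency) can be approximated in a few lines. This is a rough sketch under assumptions: the finding identifiers such as `canonical_missing` are invented for illustration and are not the tool's real finding codes, and the comparison rules are a plausible reading of the description rather than the tool's exact logic.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit

class HeadLinks(HTMLParser):
    """Capture <link rel="canonical"> and <meta property="og:url">."""

    def __init__(self):
        super().__init__()
        self.canonical = None
        self.og_url = None

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "link" and (a.get("rel") or "").lower() == "canonical":
            self.canonical = a.get("href")
        elif tag == "meta" and a.get("property") == "og:url":
            self.og_url = a.get("content")

def canonical_findings(page_url, html):
    """Return hypothetical finding codes for one page's canonical setup."""
    parser = HeadLinks()
    parser.feed(html)
    if not parser.canonical:
        return ["canonical_missing"]
    findings = []
    page, canon = urlsplit(page_url), urlsplit(parser.canonical)
    if canon.netloc != page.netloc:
        findings.append("canonical_cross_domain")
    elif canon.path != page.path:
        if canon.path.rstrip("/") == page.path.rstrip("/"):
            findings.append("canonical_trailing_slash_mismatch")
        else:
            findings.append("canonical_not_self_referencing")
    if parser.og_url and parser.og_url != parser.canonical:
        findings.append("og_url_mismatch")
    return findings

head = (
    '<head><link rel="canonical" href="https://example.com/post/">'
    '<meta property="og:url" content="https://example.com/post"></head>'
)
print(canonical_findings("https://example.com/post", head))
```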

check.robots (1 param)

Fetch and parse a domain's robots.txt; report per-crawler allow/disallow posture for every known AI training crawler (GPTBot, CCBot, Anthropic-AI, Google-Extended, etc.), AI search crawlers (ChatGPT-User, PerplexityBot, OAI-SearchBot), and user-triggered fetchers. Read-only. O...

Parameters

domain (string)

Hostname or origin to inspect. Examples: `example.com`, `https://example.com`, `https://example.com/`. The tool fetches `https://<domain>/robots.txt` and reports per-crawler allow/disallow posture for all known AI training crawlers (GPTBot, CCBot, etc.), AI search crawlers (ChatGPT-User, PerplexityBot), and user-triggered fetchers. Read-only HTTP GET to /robots.txt only.
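A per-crawler posture report of this kind can be reproduced offline with Python's standard `urllib.robotparser`. The sketch below is not the tool's parser: the crawler list is a small subset taken from the description above, `crawler_posture` is a hypothetical helper, and it only tests whether each user agent may fetch the site root.

```python
from urllib.robotparser import RobotFileParser

# Subset of the crawlers named in the tool description.
AI_CRAWLERS = [
    "GPTBot", "CCBot", "Google-Extended",
    "ChatGPT-User", "PerplexityBot", "OAI-SearchBot",
]

def crawler_posture(robots_txt, site_root="https://example.com/"):
    """Map each crawler to 'allow' or 'disallow' for the site root."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {
        ua: "allow" if parser.can_fetch(ua, site_root) else "disallow"
        for ua in AI_CRAWLERS
    }

sample = """User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

posture = crawler_posture(sample)
for crawler, verdict in posture.items():
    print(f"{crawler}: {verdict}")
```

With the sample file, only GPTBot is blocked; every other crawler falls through to the wildcard group.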

check.sitemap (2 params)

Validate a domain's XML sitemap: presence, accessibility, URL count, lastmod freshness, sitemap-index handling, and image/video sitemap extensions. Read-only. Issues N+1 HTTP GETs: one for robots.txt + sitemap, then up to `max_urls_to_check` HEADs against sampled URLs. Determi...

Parameters

domain (string)

Hostname or origin to inspect. Examples: `example.com`, `https://example.com`. The tool tries `/sitemap.xml` then the sitemap URL declared in robots.txt; follows sitemap index files one level deep. Read-only HTTP GETs against the domain only.

max_urls_to_check (integer)

Cap on how many URLs from the sitemap to sample for lastmod, image/video extension, and structural checks. Increase up to 500 for large sites where you want a more representative sample; each extra URL is one HTTP HEAD. Default: 100.
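As an illustration of the lastmod freshness check, the sketch below parses a sitemap with the standard library and lists URLs whose `lastmod` is missing or older than a cutoff. The 365-day threshold and the `stale_urls` helper are assumptions made for the example; the listing does not say what cutoff the tool applies.

```python
import xml.etree.ElementTree as ET
from datetime import date

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

def stale_urls(sitemap_xml, today, max_age_days=365):
    """Return (loc, reason) pairs for URLs with missing or old lastmod."""
    root = ET.fromstring(sitemap_xml)
    stale = []
    for url in root.findall("sm:url", NS):
        loc = url.findtext("sm:loc", namespaces=NS)
        lastmod = url.findtext("sm:lastmod", namespaces=NS)
        if lastmod is None:
            stale.append((loc, "no lastmod"))
            continue
        # lastmod may be a date or a full W3C datetime; keep the date part.
        age = (today - date.fromisoformat(lastmod.strip()[:10])).days
        if age > max_age_days:
            stale.append((loc, f"{age} days old"))
    return stale

sample_sitemap = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/fresh</loc><lastmod>2026-05-01</lastmod></url>
  <url><loc>https://example.com/old</loc><lastmod>2024-01-01T00:00:00+00:00</lastmod></url>
  <url><loc>https://example.com/undated</loc></url>
</urlset>"""

result = stale_urls(sample_sitemap, today=date(2026, 5, 15))
for loc, reason in result:
    print(loc, "->", reason)
```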

check.technical (2 params)

Audit a page's HEAD section for technical signals relevant to AI crawlers: HTTPS, canonical, OpenGraph, Twitter Card, hreflang, noindex, and title-vs-H1 hygiene. Read-only. One HTTP GET, inspects HEAD only (body is not parsed). Deterministic, rule-based; no LLM. When to use: w...

Parameters

url (string)

Public URL to audit. The tool fetches the URL once and inspects HEAD-section signals: HTTPS, canonical, OpenGraph, Twitter Card, hreflang, noindex, title length and overlap with H1. Body content is not parsed. Read-only HTTP GET.

respect_robots (boolean)

If true (default), respect robots.txt before fetching. Set false only for auditing your own site where you've intentionally blocked crawlers. Default: true.

score.ai_overview_eligibility (2 params)

Score a page's probability of appearing in Google AI Overviews. Returns an overall 0-100 score plus six factor subscores: semantic completeness, structured data, E-E-A-T signals, entity density, freshness, and technical hygiene. Read-only. One HTTP GET. Deterministic, rule-bas...

Parameters

url (string)

Public URL to score. The tool fetches the URL once and runs deterministic, rule-based scoring across six factors (semantic completeness, structured data, E-E-A-T signals, entity density, freshness, technical hygiene) using published 2025-2026 correlation studies. No LLM calls. Read-only HTTP GET.

respect_robots (boolean)

If true (default), respect robots.txt before fetching. Set false only for auditing your own site where you've intentionally blocked crawlers. Default: true.

score.agentic_browsing (5 params)

Score a page against the four signals Google added to the Lighthouse "Agentic Browsing" category in May 2026: presence of an llms.txt, WebMCP integration, accessibility-tree integrity, and layout stability. Returns an overall 0-100 score, a letter grade, and a per-factor break...

Parameters

url (string)

Public URL to fetch and score. Either this OR `html` is required.

html (string)

Raw HTML to score offline without fetching. Either this OR `url` is required. llms.txt is treated as absent in this mode.

render (string)

Rendering mode for `url`. `static` (default) reads raw HTML; `headless` runs Playwright Chromium (adds 3-10s; requires `playwright-core`). Ignored when `html` is used. One of: static, headless. Default: static.

check_llms_txt (boolean)

If true (default), probe /llms.txt for the host to score the llms.txt factor. Set false to skip that extra HTTP GET. Default: true.

respect_robots (boolean)

If true (default), respect robots.txt when fetching `url`. Ignored when `html` is used. Default: true.

llms_txt.generate (5 params)

Generate a spec-compliant llms.txt (and optionally llms-full.txt) for a domain by reading its sitemap, sampling up to `max_pages` pages, and synthesizing a grouped, sectioned summary. Read-only. Issues one HTTP GET for the sitemap then one per sampled page. Deterministic; no L...

Parameters

domain (string)

Hostname or origin to generate llms.txt for. Examples: `example.com`, `https://example.com`. The tool reads the domain's sitemap, fetches up to `max_pages` of the listed pages, and synthesizes a spec-compliant llms.txt grouped by section. Issues N+1 HTTP GETs: one for the sitemap, then one per sampled page. Read-only.

max_pages (integer)

How many pages to sample from the sitemap when building section groupings. Each page is fetched (one HTTP GET per page), so keep this low for large sites or rate-limited hosts. Default: 30.

site_name (string)

Override the site name used in the generated llms.txt header. If omitted, inferred from the homepage's <title> tag.

include_full (boolean)

If true, also generate llms-full.txt (the expanded variant containing full page text, not just URLs and titles). The llms-full.txt output can be large; only enable when you actually plan to host both files. Default: false.

site_description (string)

Override the site description used in the generated llms.txt header. If omitted, inferred from the homepage's meta description.
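The shape of the generated file follows the llms.txt convention: an H1 with the site name, a blockquote summary, then H2 sections holding markdown link lists. Here is a minimal sketch of that assembly step. Grouping pages by their first path segment is an assumption made for the example (the listing does not say how the tool forms sections), and `build_llms_txt` is a hypothetical helper that takes already-fetched page titles.

```python
from collections import defaultdict
from urllib.parse import urlsplit

def build_llms_txt(site_name, site_description, pages):
    """Assemble llms.txt text from a list of {'url', 'title'} dicts."""
    sections = defaultdict(list)
    for page in pages:
        parts = [p for p in urlsplit(page["url"]).path.split("/") if p]
        # Assumed grouping rule: first path segment, else a catch-all section.
        section = parts[0].capitalize() if len(parts) > 1 else "Pages"
        sections[section].append(page)

    lines = [f"# {site_name}", "", f"> {site_description}", ""]
    for section in sorted(sections):
        lines.append(f"## {section}")
        lines.append("")
        for page in sections[section]:
            lines.append(f"- [{page['title']}]({page['url']})")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"

pages = [
    {"url": "https://example.com/docs/install", "title": "Install"},
    {"url": "https://example.com/about", "title": "About"},
]
llms_txt = build_llms_txt("Example", "Demo site", pages)
print(llms_txt)
```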

pricing.generate (2 params)

Generate a machine-readable /pricing.md for AI shopping/agent flows. Finds the site's pricing page (or uses `pricing_url`), extracts named tiers and price lines, and returns the file content as a string. Read-only. Issues a few HTTP GETs probing common pricing paths. Determini...

Parameters

domain (string)

Hostname or origin to generate pricing.md for, e.g. `example.com`. The tool finds the pricing page (or uses `pricing_url`), extracts tiers and prices, and returns a machine-readable pricing.md string. Read-only.

pricing_url (string)

Explicit pricing page URL. If omitted, the tool probes common paths (/pricing, /plans, /pricing/).

llms_txt.validate (3 params)

Validate an existing llms.txt or llms-full.txt against the spec: structure, section ordering, link format, and (optionally) broken-link detection. Read-only. One HTTP GET when given `url`; zero network when given `content`. Optional link-check issues HEAD requests against each...

Parameters

url (string)

Public URL of an existing llms.txt or llms-full.txt to validate (e.g. `https://example.com/llms.txt`). Either this OR `content` is required.

content (string)

Raw llms.txt content as a string. Use this to validate a file offline without fetching. Either this OR `url` is required.

check_links (boolean)

If true (default), HEAD each linked URL to detect broken links. Set false to skip link checks for faster, network-light validation of just the structural rules. Default: true.
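The structural part of such a validation (no network, equivalent in spirit to `check_links: false`) can be sketched as below. The three rules shown are a simplified reading of the llms.txt convention, namely a single leading H1 and markdown-link list items inside H2 sections; `validate_llms_txt` and its error strings are hypothetical and cover far less than the tool does.

```python
import re

# "- [name](https://url)" with an optional ": notes" suffix.
LINK_ITEM = re.compile(r"^- \[[^\]]+\]\((https?://[^\s)]+)\)(: .+)?$")

def validate_llms_txt(content):
    """Return a list of structural errors; an empty list means it passed."""
    errors = []
    lines = [ln.rstrip() for ln in content.splitlines()]
    non_empty = [ln for ln in lines if ln]
    if not non_empty or not non_empty[0].startswith("# "):
        errors.append("first line must be an H1 title")
    if sum(1 for ln in non_empty if ln.startswith("# ")) > 1:
        errors.append("only one H1 is allowed")
    in_section = False
    for number, ln in enumerate(lines, start=1):
        if ln.startswith("## "):
            in_section = True
        elif in_section and ln.startswith("- ") and not LINK_ITEM.match(ln):
            errors.append(f"line {number}: list item is not a markdown link")
    return errors

good = "# Example\n\n> Demo\n\n## Docs\n\n- [Install](https://example.com/docs/install): setup steps\n"
bad = "Example\n\n## Docs\n- Install guide\n"
print(validate_llms_txt(good))
print(validate_llms_txt(bad))
```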

score.citation_worthiness (4 params)

Score how citable a page or text block is for AI engines (ChatGPT, Claude, Perplexity, Google AI Overviews). Evaluates BLUF (bottom-line-up-front) opening, FAQ patterns, statistic density, entity clarity, and answer-shape fit for the optional `target_query`. Also returns `extr...

Parameters

url (string)

Public URL to fetch and score. Either this OR `text` is required.

text (string)

Raw text/markdown/HTML to score directly without fetching. Either this OR `url` is required.

target_query (string)

Optional target search query the content is supposed to answer (e.g. `how to fix CORS errors in Next.js`). When provided, scoring weights answer-shape fit and query-term coverage. Omit if you want a query-agnostic citability score.

respect_robots (boolean)

If true (default), respect robots.txt when fetching `url`. Ignored when `text` is used. Default: true.
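To make a few of these signals concrete, here is a toy heuristic for three of them: statistic density, question-shaped headings, and a short opening sentence as a stand-in for BLUF. It is explicitly not the tool's rubric. The function name, the regexes, and the 30-word threshold are all assumptions chosen for illustration.

```python
import re

def citability_signals(text):
    """Compute three rough, illustrative signals from plain text or markdown."""
    words = text.split()
    numbers = re.findall(r"\b\d+(?:\.\d+)?%?", text)
    question_headings = [
        ln for ln in text.splitlines()
        if ln.lstrip("# ").strip().endswith("?")
    ]
    # First sentence: everything up to the first ., ! or ? followed by whitespace.
    first_sentence = re.split(r"(?<=[.!?])\s+", text.strip(), maxsplit=1)[0]
    return {
        "stat_density_per_100_words": round(100 * len(numbers) / max(len(words), 1), 1),
        "question_headings": len(question_headings),
        "bluf_opening": len(first_sentence.split()) <= 30,
    }

sample_text = (
    "RAG retrieves documents before generation. It cut errors by 40% in 3 tests.\n\n"
    "## What is RAG?\n"
    "RAG is retrieval-augmented generation."
)
signals = citability_signals(sample_text)
print(signals)
```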

rewrite.aeo (6 params)

Rewrite a content block for Answer Engine Optimization. Adds a BLUF opening, FAQ structure, schema additions, and concise question-shaped headings tuned for ChatGPT / Perplexity / Google AI Overviews. Read-only when given `url` (one HTTP GET). Zero network when given `text`. T...

Parameters

url (string)

Public URL whose content should be fetched and rewritten. Either this OR `text` is required.

text (string)

Raw content (markdown or HTML) to rewrite directly. Either this OR `url` is required.

format (string)

Output shape. `article` for prose-with-headings. `faq` for Q&A list. `howto` for numbered-step procedural content with HowTo schema hints. `comparison` for X-vs-Y tables. One of: article, faq, howto, comparison. Default: article.

max_words (integer)

Soft word budget for the rewrite. Range 100-5000. The rewrite tries to stay under this; very small budgets may force truncation. Default: 1500.

target_query (string)

The user query the rewrite should answer (e.g. `what is RAG`, `how to deploy Ghost to Docker`). Required - drives heading shape and BLUF wording.

respect_robots (boolean)

If true (default), respect robots.txt when fetching `url`. Ignored when `text` is used. Default: true.

rewrite.geo (6 params)

Rewrite a content block for Generative Engine Optimization: entity-rich, comparison-ready, synthesis-friendly. Tuned for surfaces that summarize across sources (Perplexity, Google AI Mode, Claude search). Read-only on input. Does NOT write back to the source URL - returns the...

Parameters

url (string)

Public URL whose content should be fetched and rewritten. Either this OR `text` is required.

text (string)

Raw content to rewrite directly. Either this OR `url` is required.

max_words (integer)

Soft word budget. Range 100-5000. Default: 1500.

target_query (string)

The user query the rewrite should answer. Required - drives entity selection and comparison framing.

respect_robots (boolean)

If true (default), respect robots.txt when fetching `url`. Ignored when `text` is used. Default: true.

add_comparison_table (boolean)

If true, inject an explicit X-vs-Y comparison table into the rewrite (useful for `X vs Y` queries). Default: false.

extract.entities (4 params)

Extract named entities, linked concepts, and sameAs graph nodes from a page's content and structured data. Combines body-text NER with JSON-LD `@type` / `sameAs` walking. Read-only when given `url` (one HTTP GET). Zero network when given `text`. Primary path: MCP sampling - th...

Parameters

url (string)

Public URL to fetch and analyze. Either this OR `text` is required.

text (string)

Raw text/HTML to analyze directly. Either this OR `url` is required.

render (string)

Rendering mode for `url`. `static` (default) reads raw HTML. `headless` runs Playwright Chromium to capture JS-rendered content (adds 3-10s; requires `playwright-core` + `npx playwright install chromium`). Ignored when `text` is used. One of: static, headless. Default: static.

respect_robots (boolean)

If true (default), respect robots.txt when fetching `url`. Ignored when `text` is used. Default: true.

score.test_citation (5 params)

Simulate `would an AI engine cite this page for this query?`. The host LLM role-plays the chosen engine (chatgpt / claude / perplexity / google_ai_overviews / any), reads the page content, and returns a cite/no-cite verdict with the verbatim excerpt it would surface plus ranke...

Parameters

url (string)

Public URL to fetch and test. Either this OR `text` is required.

text (string)

Raw text/HTML to test directly. Either this OR `url` is required.

engine (string)

Which engine to simulate. `any` (default) uses a generic AI-search persona. Specific engines tune the cite criteria (e.g. perplexity favors statistic-dense excerpts; google_ai_overviews favors schema + freshness). One of: chatgpt, claude, perplexity, google_ai_overviews, any. Default: any.

target_query (string)

The user query the engine is answering. Required. Example: `how to add JSON-LD to a Next.js app`.

respect_robots (boolean)

If true (default), respect robots.txt when fetching `url`. Ignored when `text` is used. Default: true.

diff.pages (4 params)

Compare two URLs for AI citation-worthiness and return a structured breakdown of which page is more likely to be cited and why. Typical use: your page (url_a) vs a competitor's page (url_b). Read-only. Runs audit.page on both URLs in parallel (2 HTTP fetches per URL), then dif...

Parameters

query (string)

Optional target search query both pages are competing for (e.g. 'how to connect Zapier to Notion'). When provided, it is surfaced in fix_recommendations_for_a as context. Does not alter the scoring algorithm - scoring is based on audit_page's existing rubric.

url_a (string)

First URL to compare - typically your own page. Must be a fully-qualified http(s) URL that returns HTTP 200 (redirects are followed).

url_b (string)

Second URL to compare - typically a competitor's page. Must be a fully-qualified http(s) URL that returns HTTP 200 (redirects are followed).

respect_robots (boolean)

If true (default), respect robots.txt before fetching each URL. Set false only when auditing your own sites where you have intentionally blocked crawlers. Default: true.
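The core of a page-vs-page comparison is a per-dimension gap. Assuming each audit returns a `dimension_scores` mapping like the one in the README's sample result, a sketch of the diff step could look like this. `diff_dimensions` is a hypothetical helper, and the competitor numbers are made up for the example.

```python
def diff_dimensions(scores_a, scores_b):
    """Return (dimension, a_minus_b) pairs, largest deficit for page A first."""
    gaps = {
        dim: scores_a[dim] - scores_b[dim]
        for dim in scores_a
        if dim in scores_b
    }
    return sorted(gaps.items(), key=lambda item: item[1])

# Page A values echo the README sample; page B values are invented.
page_a = {"schema": 45, "structure": 40, "robots": 90}
page_b = {"schema": 80, "structure": 40, "robots": 70}

gaps = diff_dimensions(page_a, page_b)
for dimension, gap in gaps:
    print(f"{dimension}: {gap:+d}")
```

Negative gaps mark the dimensions where page A trails page B, which is where fix recommendations would be most useful.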

audit.site (2 params)

Single-call site sweep: runs audit.page (homepage), check.robots, check.sitemap, and audit.schema in parallel and returns an overall grade (A–F) plus top-5 highest-impact fixes. Read-only. Issues several HTTP GETs against the domain (homepage fetch, robots.txt, sitemap.xml, an...

Parameters

domain (string)

Hostname or origin to audit. Examples: `example.com`, `https://example.com`. The tool resolves the homepage and runs audit_page + check_robots + check_sitemap + audit_schema in parallel against it, then returns an overall grade plus top-5 fixes. Issues several HTTP GETs against the domain.

respect_robots (boolean)

If true (default), respect robots.txt before fetching the homepage. Set false ONLY to audit a site you own that has temporarily blocked crawlers. Default: true.

audit.sitemap (4 params)

Site-wide content audit: discovers the sitemap, samples N URLs by deterministic uniform stride, runs audit.page on each, and returns score distribution + worst pages + most-common findings. Read-only. One HTTP GET for sitemap discovery, optionally a few more for sitemap-index...

Parameters

domain (string)

Hostname or origin to audit. Examples: `example.com`, `https://example.com`. The tool discovers the sitemap, samples N URLs by uniform stride, and runs audit_page on each.

concurrency (integer)

Parallel audit_page calls. Max 5; the default of 2 is deliberately gentle. The shared politeFetch host-delay is still enforced, so this sets per-batch dispatch concurrency and is not a way to bypass throttling. Default: 2.

sample_size (integer)

Number of URLs to sample from the sitemap. Max 50 (sampling is capped to avoid runaway audits; each sample is one full audit_page call, ~1-3s with polite throttling). Sampling is deterministic uniform-stride: if the sitemap has 1000 URLs and sample_size=10, every 100th URL is picked. Default: 10.

respect_robots (boolean)

If true (default), respect robots.txt for each sampled URL. Set false only for self-audits where you've intentionally blocked crawlers. Default: true.
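Deterministic uniform-stride sampling is simple to reproduce. The sketch below picks every `len(urls) // sample_size`-th URL, so the same sitemap always yields the same sample. Starting at index 0 is an assumption (the listing says "every 100th URL" but not where the stride begins), and `stride_sample` is a hypothetical helper.

```python
def stride_sample(urls, sample_size):
    """Pick up to sample_size URLs at a fixed stride, starting at index 0."""
    if sample_size <= 0 or not urls:
        return []
    stride = max(len(urls) // sample_size, 1)
    return urls[::stride][:sample_size]

# 1000 URLs with sample_size=10 gives a stride of 100.
urls = [f"https://example.com/page-{i}" for i in range(1000)]
sample = stride_sample(urls, 10)
print(sample[:3])
```

Because no randomness is involved, two runs against an unchanged sitemap audit the same pages, which keeps score comparisons between runs meaningful.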

report.save (3 params)

Render an audit.page or audit.site result as a Markdown report and write it to a file under MCP_WORKSPACE_ROOT (defaults to cwd).

Parameters

path (string)

Target file path. May be relative to MCP_WORKSPACE_ROOT (or cwd if unset). Paths that escape the workspace root are rejected.

overwrite (boolean)

If true (default), overwrite an existing file. If false, the write fails when the target already exists. Default: true.

audit_result (value)

The return value of `audit.page` or `audit.site`. Pass the structured result verbatim - the tool detects which shape it is and renders the matching Markdown report.
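Rejecting paths that escape a workspace root is usually done by resolving the candidate path and checking that it still sits under the resolved root. A minimal sketch of that check follows; `resolve_in_workspace` is a hypothetical name, and the server (which runs on Node) will implement its own version of this guard.

```python
from pathlib import Path

def resolve_in_workspace(workspace_root, target):
    """Resolve target against the root; raise if it lands outside the root."""
    root = Path(workspace_root).resolve()
    candidate = (root / target).resolve()
    if candidate != root and root not in candidate.parents:
        raise ValueError(f"path escapes workspace root: {target}")
    return candidate

print(resolve_in_workspace("/tmp/ws", "reports/audit.md"))

try:
    resolve_in_workspace("/tmp/ws", "../outside.md")
except ValueError as exc:
    print("rejected:", exc)
```

Resolving before comparing is what catches `..` segments and absolute paths; a plain string prefix check would miss both.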

@automatelab/ai-seo-mcp

AI Citation Toolkit for the Model Context Protocol

Audit why AI systems do or do not cite your pages. MCP server. No API keys.

Works inside Claude, Cursor, Windsurf, Codex, and any MCP client that speaks stdio.

What it checks

AI crawler access - GPTBot, OAI-SearchBot, ClaudeBot, and PerplexityBot allowed or blocked in robots.txt
llms.txt - present, spec-compliant, links alive
Structured answer extraction - FAQ headings, BLUF paragraphs, answer-ready blocks
Schema completeness - FAQPage, Article, Organization, Person; flags deprecated patterns
Entity clarity - named entity density and sameAs coverage that help AI systems identify the subject
Citation formatting - canonical URL hygiene, og:url, hreflang, noindex traps
Sitemap freshness - lastmod signals that tell crawlers the page is current

Run an audit. Get a list of citation-blockers, ranked.

You: Run an AI-SEO audit on https://automatelab.tech/launching-the-ai-seo-mcp/.

Result (truncated):

{
  "url": "https://automatelab.tech/launching-the-ai-seo-mcp/",
  "score": 61,
  "grade": "C",
  "dimension_scores": {
    "schema": 45, "technical": 80, "structure": 40,
    "robots": 90, "freshness": 85, "authority": 40,
    "entity_density": 21, "sitemap": 100
  },
  "findings": [
    {
      "severity": "critical",
      "category": "structure",
      "message": "No FAQ structure found (no FAQPage schema or H3 question headings).",
      "fix": "Add FAQ H3 headings ending in '?' with answer paragraphs, and a FAQPage JSON-LD block.",
      "estimated_impact": "high"
    },
    {
      "severity": "warning",
      "category": "authority",
      "message": "Low authority signals - missing Organization or author Person schema.",
      "fix": "Add Organization JSON-LD and Article.author as a Person node with sameAs links.",
      "estimated_impact": "high"
    }
  ]
}

Each finding names the exact fix. No opaque scores, no guesswork.
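If you post-process the result yourself, the `findings` array is easy to re-rank. The sketch below orders findings by severity and then by estimated impact. Only `critical`, `warning`, and `high` appear in the sample above; the other values in the rank tables (`error`, `info`, `medium`, `low`) are assumptions, and `rank_findings` is a hypothetical helper.

```python
# Lower rank sorts first. Values not seen in the sample output are assumed.
SEVERITY_RANK = {"critical": 0, "error": 1, "warning": 2, "info": 3}
IMPACT_RANK = {"high": 0, "medium": 1, "low": 2}

def rank_findings(findings):
    """Sort findings by severity, then by estimated impact."""
    return sorted(
        findings,
        key=lambda f: (
            SEVERITY_RANK.get(f.get("severity"), 99),
            IMPACT_RANK.get(f.get("estimated_impact"), 99),
        ),
    )

findings = [
    {"severity": "warning", "category": "authority", "estimated_impact": "high"},
    {"severity": "critical", "category": "structure", "estimated_impact": "high"},
]
ranked = rank_findings(findings)
for finding in ranked:
    print(finding["severity"], finding["category"])
```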

Install

npx -y @automatelab/ai-seo-mcp

Requires Node 20 or later.

Claude Desktop

Add to %APPDATA%\Claude\claude_desktop_config.json (Windows) or ~/Library/Application Support/Claude/claude_desktop_config.json (macOS):

{
  "mcpServers": {
    "ai-seo": {
      "command": "npx",
      "args": ["-y", "@automatelab/ai-seo-mcp"]
    }
  }
}

Restart Claude Desktop. Any MCP client that supports stdio transport works - same command / args pattern.

Optional: headless rendering for SPAs

By default audit_page reads raw HTML — fast, but misses content on React/Vue/Angular SPAs. Pass render: "headless" to spin up Chromium and audit the rendered DOM (adds 3-10s per audit).

One-time install:

npm install playwright-core
npx playwright install chromium

Then call audit_page with render: "headless". Use static for everything else — most marketing sites and docs render fine without it.

Run it in CI (GitHub Action)

This repo doubles as a GitHub Action. Drop it in a workflow to fail a PR when any page regresses below an AI-citation score - the same audit engine, gated on every change.

- uses: actions/checkout@v4
- name: AI-SEO audit
  uses: AutomateLab-tech/ai-seo-mcp@v0.5.0
  with:
    urls: "https://example.com,https://example.com/pricing"
    min-score: "70"            # fail if any URL scores below this
    respect-robots: "true"     # set false for staging / sites you own
    report-path: "ai-seo-report.md"   # optional Markdown report artifact
    fail-on-regression: "true"

The Action builds the auditor from the pinned ref, runs audit_page on each URL, writes a scorecard to the job summary, and exits non-zero if any URL falls below min-score (when fail-on-regression is true). Outputs: min_score_observed, urls_audited, report_path. Full example: examples/github-action-usage.yml.
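The gating rule described above amounts to a threshold check over per-URL scores. A sketch of that logic, assuming you already have a URL-to-score mapping: `gate` is a hypothetical function, the scores are invented, and the real Action's internals may differ.

```python
def gate(scores, min_score, fail_on_regression=True):
    """Return an exit code plus the details a CI summary would need."""
    failing = {url: s for url, s in scores.items() if s < min_score}
    exit_code = 1 if (failing and fail_on_regression) else 0
    return {
        "exit_code": exit_code,
        "min_score_observed": min(scores.values()) if scores else None,
        "urls_audited": len(scores),
        "failing": failing,
    }

# Invented scores for the two URLs used in the workflow example.
outcome = gate(
    {"https://example.com": 82, "https://example.com/pricing": 64},
    min_score=70,
)
print(outcome)
```

With `fail_on_regression` set to false, the same failing URLs are still reported but the exit code stays at zero, which matches the behaviour described for the `fail-on-regression` input.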

Contributing

Bug reports, feature ideas, and PRs welcome. See CONTRIBUTING.md.

Security

To report a vulnerability, see SECURITY.md.

License

MIT - see LICENSE.

Built by automatelab.tech


Registry: active

Package: @automatelab/ai-seo-mcp

Transport: STDIO

Updated: May 15, 2026

View on GitHub

@automatelab/ai-seo-mcp

AI Citation Toolkit for the Model Context Protocol

Audit why AI systems do or do not cite your pages. MCP server. No API keys.

Works inside Claude, Cursor, Windsurf, Codex, and any MCP client that speaks stdio.

What it checks

AI crawler access - GPTBot, OAI-SearchBot, ClaudeBot, and PerplexityBot allowed or blocked in robots.txt
llms.txt - present, spec-compliant, links alive
Structured answer extraction - FAQ headings, BLUF paragraphs, answer-ready blocks
Schema completeness - FAQPage, Article, Organization, Person; flags deprecated patterns
Entity clarity - named entity density and sameAs coverage that help AI systems identify the subject
Citation formatting - canonical URL hygiene, og:url, hreflang, noindex traps
Sitemap freshness - lastmod signals that tell crawlers the page is current
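To make the first check concrete, here is a minimal sketch of a per-crawler robots.txt lookup using Python's standard `urllib.robotparser`. It is not how the server implements the check; the helper name `crawler_access` and the sample robots.txt are assumptions for illustration. The sample shows a common mismatch: GPTBot is blocked while the other crawlers fall through to the wildcard rule.

```python
from typing import Dict
from urllib.robotparser import RobotFileParser

# Crawler names taken from the list above.
AI_CRAWLERS = ["GPTBot", "OAI-SearchBot", "ClaudeBot", "PerplexityBot"]


def crawler_access(robots_txt: str, path: str = "/") -> Dict[str, bool]:
    """Return {crawler: allowed?} for one path, given robots.txt text.

    Hypothetical helper; it parses the text locally and does no
    network I/O.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {bot: parser.can_fetch(bot, path) for bot in AI_CRAWLERS}


# Illustrative robots.txt: GPTBot is blocked, everything else is allowed.
SAMPLE = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

for bot, allowed in crawler_access(SAMPLE).items():
    print(f"{bot}: {'allowed' if allowed else 'blocked'}")
```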

Run an audit. Get a list of citation-blockers, ranked.

You: Run an AI-SEO audit on https://automatelab.tech/launching-the-ai-seo-mcp/.

Result (truncated):

{
  "url": "https://automatelab.tech/launching-the-ai-seo-mcp/",
  "score": 61,
  "grade": "C",
  "dimension_scores": {
    "schema": 45, "technical": 80, "structure": 40,
    "robots": 90, "freshness": 85, "authority": 40,
    "entity_density": 21, "sitemap": 100
  },
  "findings": [
    {
      "severity": "critical",
      "category": "structure",
      "message": "No FAQ structure found (no FAQPage schema or H3 question headings).",
      "fix": "Add FAQ H3 headings ending in '?' with answer paragraphs, and a FAQPage JSON-LD block.",
      "estimated_impact": "high"
    },
    {
      "severity": "warning",
      "category": "authority",
      "message": "Low authority signals - missing Organization or author Person schema.",
      "fix": "Add Organization JSON-LD and Article.author as a Person node with sameAs links.",
      "estimated_impact": "high"
    }
  ]
}

Each finding names the exact fix. No opaque scores, no guesswork.
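If you post-process the audit result yourself, a small script can order the findings and surface the weakest dimensions. The sketch below assumes the field names shown in the truncated result above. The ordering tables are an assumption: only "critical", "warning", and "high" appear in the sample, so the other values ("info", "medium", "low") are guesses, and both function names are hypothetical.

```python
from typing import Any, Dict, List

# Assumed orderings; values other than "critical", "warning" and
# "high" do not appear in the sample result and are guesses.
SEVERITY_ORDER = {"critical": 0, "warning": 1, "info": 2}
IMPACT_ORDER = {"high": 0, "medium": 1, "low": 2}


def rank_findings(audit: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Sort findings by severity, then by estimated impact."""
    def sort_key(finding: Dict[str, Any]):
        return (
            SEVERITY_ORDER.get(finding.get("severity"), 99),
            IMPACT_ORDER.get(finding.get("estimated_impact"), 99),
        )
    return sorted(audit.get("findings", []), key=sort_key)


def weakest_dimensions(audit: Dict[str, Any], n: int = 3) -> List[str]:
    """Return the n lowest-scoring dimension names (ties keep input order)."""
    scores = audit.get("dimension_scores", {})
    return sorted(scores, key=scores.get)[:n]


# Trimmed copy of the sample result shown above.
audit = {
    "score": 61,
    "dimension_scores": {
        "schema": 45, "technical": 80, "structure": 40,
        "robots": 90, "freshness": 85, "authority": 40,
        "entity_density": 21, "sitemap": 100,
    },
    "findings": [
        {"severity": "warning", "category": "authority",
         "estimated_impact": "high"},
        {"severity": "critical", "category": "structure",
         "estimated_impact": "high"},
    ],
}

print([f["category"] for f in rank_findings(audit)])
print(weakest_dimensions(audit))
```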

Install

npx -y @automatelab/ai-seo-mcp

Requires Node 20 or later.

Claude Desktop

Add to %APPDATA%\Claude\claude_desktop_config.json (Windows) or ~/Library/Application Support/Claude/claude_desktop_config.json (macOS):

{
  "mcpServers": {
    "ai-seo": {
      "command": "npx",
      "args": ["-y", "@automatelab/ai-seo-mcp"]
    }
  }
}

Restart Claude Desktop. Any MCP client that supports stdio transport works - same command / args pattern.
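If the config file already lists other servers, merge the entry rather than overwriting the file. A minimal Python sketch, assuming the file is plain JSON with a top-level `mcpServers` object as shown above; `merge_server` and `update_config_file` are hypothetical helper names, not part of this package.

```python
import json
from pathlib import Path
from typing import Any, Dict

# The entry from the snippet above.
AI_SEO_ENTRY = {"command": "npx", "args": ["-y", "@automatelab/ai-seo-mcp"]}


def merge_server(config: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of config with the ai-seo server added or replaced."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers["ai-seo"] = dict(AI_SEO_ENTRY)
    merged["mcpServers"] = servers
    return merged


def update_config_file(path: Path) -> None:
    """Read, merge, and rewrite the config file (hypothetical helper)."""
    text = path.read_text(encoding="utf-8") if path.exists() else ""
    config = json.loads(text) if text.strip() else {}
    path.write_text(json.dumps(merge_server(config), indent=2) + "\n",
                    encoding="utf-8")


# Existing servers are preserved; "other" is a made-up example entry.
existing = {"mcpServers": {"other": {"command": "node", "args": ["x.js"]}}}
print(json.dumps(merge_server(existing), indent=2))
```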

Optional: headless rendering for SPAs

By default, audit_page reads the raw HTML. That is fast, but it misses content that React, Vue, or Angular SPAs render client-side. Pass render: "headless" to launch Chromium and audit the rendered DOM (adds 3-10s per audit).

One-time install:

npm install playwright-core
npx playwright install chromium

Then call audit_page with render: "headless". Use static for everything else; most marketing sites and docs render fine without it.
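For reference, this is roughly what such a call looks like on the wire. MCP clients normally build it for you, and a real session starts with the initialize handshake first. The sketch follows the MCP `tools/call` request shape; the argument name `url` and the literal value "static" are assumptions based on the wording above, and `build_audit_request` is a hypothetical helper.

```python
import json


def build_audit_request(url: str, render: str = "static",
                        request_id: int = 1) -> str:
    """Build a JSON-RPC 2.0 tools/call message for audit_page.

    The "url" argument name and the "static" value are assumptions;
    only render: "headless" is confirmed by the text above.
    """
    if render not in ("static", "headless"):
        raise ValueError(f"unsupported render mode: {render!r}")
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "audit_page",
            "arguments": {"url": url, "render": render},
        },
    }
    # The stdio transport sends one JSON message per line, so the
    # serialized message must not contain embedded newlines.
    return json.dumps(message)


print(build_audit_request("https://example.com/app", render="headless"))
```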

Run it in CI (GitHub Action)

This repo doubles as a GitHub Action. Drop it into a workflow to fail a PR when any page scores below a minimum AI-citation score. It runs the same audit engine, gated on every change.

- uses: actions/checkout@v4
- name: AI-SEO audit
  uses: AutomateLab-tech/ai-seo-mcp@v0.5.0
  with:
    urls: "https://example.com,https://example.com/pricing"
    min-score: "70"            # fail if any URL scores below this
    respect-robots: "true"     # set false for staging / sites you own
    report-path: "ai-seo-report.md"   # optional Markdown report artifact
    fail-on-regression: "true"


Tool	Purpose
`audit_page`	Composite AI-SEO audit with 8-dimension scoring (schema, technical, structure, robots, freshness, authority, entity density, sitemap).
`audit_schema`	Validate JSON-LD against Schema.org rules and AI-citation best practice. Flags deprecated patterns.
`audit_canonical`	Canonical link integrity, trailing-slash hygiene, `og:url` consistency.
`audit_site`	Single-call site sweep: `audit_page` + `check_robots` + `check_sitemap` + `audit_schema` with overall grade and top-5 fixes.
`audit_sitemap`	Site-wide content audit: stride-sample N URLs from the sitemap, run `audit_page` on each, return distribution + worst pages + top findings.
`check_robots`	Parse `robots.txt` and report per-crawler allow/disallow for all known AI crawlers. Surfaces the GPTBot-blocked-but-OAI-SearchBot-allowed trap.
`check_sitemap`	Validate XML sitemaps: presence, URL count, `lastmod` freshness, image/video extensions.
`check_technical`	`<head>` tag audit: canonical, OpenGraph, Twitter Card, hreflang, HTTPS, noindex, title hygiene.
`score_ai_overview_eligibility`	Score a page's probability of appearing in Google AI Overviews using current correlation factors.
`score_citation_worthiness`	Score how citable a page or text block is for Perplexity, ChatGPT, Google AI Overviews, and Claude. Includes per-section `chunk_analysis` / `extractability_score`: how cleanly an LLM can lift a standalone answer from each heading.
`score_agentic_browsing`	Score a page against the Lighthouse "Agentic Browsing" category (May 2026): llms.txt, WebMCP, accessibility-tree integrity, and layout stability.
`score_test_citation`	Simulate "would an AI engine cite this for this query?" via MCP sampling, with deterministic heuristic fallback.
`llms_txt_generate`	Generate `llms.txt` and optionally `llms-full.txt` from a domain's sitemap.
`llms_txt_validate`	Lint an existing `llms.txt` for spec compliance and broken links.
`rewrite_aeo`	Rewrite content for Answer Engine Optimization (BLUF structure, FAQ format, schema additions).
`rewrite_geo`	Rewrite content for Generative Engine Optimization (entity definitions, comparison tables, synthesis-ready structure).
`extract_entities`	Extract named entities, `sameAs` links, and citation-density score from a page's content and structured data.
`diff_pages`	Compare two URLs for AI citation-worthiness: side-by-side dimension scores, gap analysis, and prioritized fix recommendations for url_a.
`report_save`	Render an `audit_page` / `audit_site` result as a Markdown report and write it to disk under `MCP_WORKSPACE_ROOT`.
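The stride sampling mentioned for `audit_sitemap` can be illustrated with a short sketch: instead of auditing the first N URLs, it picks N URLs spread evenly across the sitemap so that every section of the site is represented. This is one plausible reading of "stride-sample", not the tool's confirmed algorithm, and `stride_sample` is a hypothetical name.

```python
from typing import List


def stride_sample(urls: List[str], n: int) -> List[str]:
    """Pick n URLs at evenly spaced positions across the list.

    Hypothetical helper; the real tool may choose positions differently.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if len(urls) <= n:
        return list(urls)  # nothing to thin out
    stride = len(urls) / n  # fractional step, always > 1 here
    return [urls[int(i * stride)] for i in range(n)]


# Ten example URLs, sampled down to three.
urls = [f"https://example.com/page-{i}" for i in range(10)]
print(stride_sample(urls, 3))
```

With ten URLs and n=3 the step is about 3.33, so the sketch picks positions 0, 3, and 6.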


