Datadog

Summary

A comprehensive Datadog integration that exposes 163 tools across metrics, monitors, logs, APM, RUM, incidents, CI/CD, and fleet automation. Unlike read-only alternatives, it supports full SLO lifecycle operations (create, update, delete), fleet deployment scheduling, and status page management. The aggregation tools are the standout feature: analyze-monitor-state and slo-compliance-snapshot collapse 5-7 sequential API calls into single structured responses with partial failure handling. It ships with category toggles and field projection to keep token usage manageable despite the wide API surface. It is self-hosted via stdio or HTTP, so you control the deployment and can point it at internal or sovereign Datadog sites. Reach for it when you need write operations or fleet automation that the official Bits AI MCP doesn't cover.

Datadog MCP Server

The Datadog MCP that answers "why is this happening?" — not just "what's the value?"

Aggregation tools that fold 5–7 sequential API calls into one structured response. Full SLO CRUD. Fleet automation. The widest Datadog API coverage in any MCP — 163 tools built on the @us-all MCP standard.

What it does that others don't

Aggregation tools — analyze-monitor-state and slo-compliance-snapshot collapse 5–7 sequential API calls into one structured response with a caveats array for partial failures. No other Datadog MCP ships this pattern.
Full SLO CRUD — create, update, delete SLOs (and their corrections). The official Bits AI MCP and community alternatives are read-only on SLOs.
Fleet Automation — 15 tools across deployments and schedules. Only this server.
Status Pages — 21 tools for full status-page lifecycle (components, degradations, maintenances). Only this server.
Token-efficient by design — extractFields projection, DD_TOOLS/DD_DISABLE 16-category toggles, and a search-tools meta-tool keep LLM context low across 163 tools.
Apps SDK card — slo-compliance-snapshot renders as a visual card on ChatGPT clients via _meta["openai/outputTemplate"]. Claude clients receive the same JSON content (non-breaking).
stdio + Streamable HTTP — defaults to stdio (Claude Desktop / Code). Set MCP_TRANSPORT=http for ChatGPT Apps SDK or remote clients (Bearer auth via MCP_HTTP_TOKEN).
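
The caveats pattern from the first bullet can be pictured with a small fan-out helper. This is a minimal sketch, not the server's code: `aggregateSketch`, `Fetchers`, and `AggregateResult` are hypothetical names, and the real helper in @us-all/mcp-toolkit may have a different signature. The idea is that every sub-call runs concurrently, successes land in `data`, and each failure becomes one entry in `caveats` instead of failing the whole response.

```typescript
// Hypothetical shapes for illustration; the real toolkit helper may differ.
type Fetchers = Record<string, () => Promise<unknown>>;

interface AggregateResult {
  data: Record<string, unknown>;
  caveats: string[]; // one entry per sub-call that failed
}

// Run every fetcher concurrently. A failing fetcher never rejects the
// whole aggregation; it is recorded as a caveat instead.
async function aggregateSketch(fetchers: Fetchers): Promise<AggregateResult> {
  const outcomes = await Promise.all(
    Object.entries(fetchers).map(async ([name, fetcher]) => {
      try {
        return { name, ok: true as const, value: await fetcher() };
      } catch (err) {
        const reason = err instanceof Error ? err.message : String(err);
        return { name, ok: false as const, reason };
      }
    }),
  );

  const result: AggregateResult = { data: {}, caveats: [] };
  for (const outcome of outcomes) {
    if (outcome.ok) {
      result.data[outcome.name] = outcome.value;
    } else {
      result.caveats.push(`${outcome.name} unavailable: ${outcome.reason}`);
    }
  }
  return result;
}
```

A caller can still act on a partial answer and inspect `caveats` to see which sections are missing.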

Try this — 5 prompts

Connect the server to Claude Desktop or Claude Code, then paste any of these:

SLO health — "List my SLOs and their error budget remaining this month. Group by status: compliant, at-risk, breached."
Incident triage — "There's an active incident on checkout-service. Pull the linked monitors, the recent error spikes from APM, and which deployments touched the service in the last 24h."
Monitor noise audit — "Find monitors that alerted more than 10 times in the last 7 days but had MTTR under 5 minutes — these are probably flapping."
RUM error spike — "RUM error rate jumped on the checkout funnel between 14:00 and 14:30 today. Show me the top error groups, affected sessions, and the user actions before the errors."
Fleet rollout — "Schedule the datadog-agent 7.55.0 rollout to the staging cluster, weekends only, starting next Saturday."

When to use this vs Datadog's official MCP

Datadog's official MCP (Bits AI MCP, GA 2026-03-09) is complementary, not a replacement:

	Official Datadog MCP	`@us-all/datadog-mcp` (this)
Tool count	16+ core toolsets	163 tools across full API surface
Deployment	Remote (managed by Datadog)	Self-hosted, stdio or Streamable HTTP (npx / Docker / npm)
Auth	Datadog SSO	API + APP key
Sites	Public Datadog sites	Any site, incl. internal/sovereign; US5 default
SLO writes	❌	✅ create/update/delete SLOs + corrections
Fleet automation	❌	✅ 15 tools
Status pages	❌	✅ 21 tools
Aggregation tools	❌	✅ `analyze-monitor-state`, `slo-compliance-snapshot`
MCP Prompts	❌	✅ 4 (`triage-incident`, `audit-monitor-noise`, `analyze-rum-error-spike`, `investigate-slow-trace`)
MCP Resources	❌	✅ `dd://service/{serviceName}`, `dd://team/{teamId}`, `dd://synthetics/{testId}`, etc.

Use the official Bits AI MCP for fast managed onboarding and SSO. Use this when you need full API coverage, SLO/fleet/status-page write parity, or self-hosting (internal sites, isolated networks, dev/CI sandboxes).

Install

Claude Desktop

Add to ~/Library/Application Support/Claude/claude_desktop_config.json:

{
  "mcpServers": {
    "datadog": {
      "command": "npx",
      "args": ["-y", "@us-all/datadog-mcp"],
      "env": {
        "DD_API_KEY": "<your-api-key>",
        "DD_APP_KEY": "<your-app-key>",
        "DD_SITE": "datadoghq.com"
      }
    }
  }
}

Claude Code

claude mcp add datadog -s user \
  -e DD_API_KEY=<your-api-key> -e DD_APP_KEY=<your-app-key> -e DD_SITE=datadoghq.com \
  -- npx -y @us-all/datadog-mcp

Docker

docker run -i --rm -e DD_API_KEY=... -e DD_APP_KEY=... -e DD_SITE=datadoghq.com \
  ghcr.io/us-all/datadog-mcp-server:latest

Build from source

git clone https://github.com/us-all/datadog-mcp-server.git
cd datadog-mcp-server && pnpm install && pnpm build
node dist/index.js

Configuration

Variable	Required	Default	Description
`DD_API_KEY`	✅	—	Datadog API key
`DD_APP_KEY`	✅	—	Datadog Application key
`DD_SITE`	❌	`us5.datadoghq.com`	Datadog site (see table below)
`DD_ALLOW_WRITE`	❌	`false`	Set `true` to enable mutations (create/update/delete)
`DD_TOOLS`	❌	—	Comma-sep allowlist of categories. Only these load — biggest token saver.
`DD_DISABLE`	❌	—	Comma-sep denylist. Ignored when `DD_TOOLS` is set.
`MCP_TRANSPORT`	❌	`stdio`	`http` to enable Streamable HTTP transport
`MCP_HTTP_TOKEN`	conditional	—	Bearer token. Required when `MCP_TRANSPORT=http`
`MCP_HTTP_PORT`	❌	`3000`	HTTP listen port
`MCP_HTTP_HOST`	❌	`127.0.0.1`	HTTP bind host (DNS rebinding protection auto-enabled for localhost)
`MCP_HTTP_SKIP_AUTH`	❌	`false`	Skip Bearer auth — e.g. behind a reverse proxy that handles it

Categories (16): metrics, monitors, dashboards, logs, apm, rum, incidents, security, synthetics, ci, infra, fleet, status-pages, oncall, teams, account.
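
The precedence between DD_TOOLS and DD_DISABLE described in the table can be sketched as a pure function. This is an illustration under stated assumptions, not the server's implementation: `resolveCategories` and `parseList` are hypothetical names, and trimming whitespace, lowercasing, and silently ignoring unknown category names are guesses about reasonable behavior.

```typescript
const ALL_CATEGORIES = [
  "metrics", "monitors", "dashboards", "logs", "apm", "rum", "incidents", "security",
  "synthetics", "ci", "infra", "fleet", "status-pages", "oncall", "teams", "account",
] as const;

type Env = Record<string, string | undefined>;

// Split a comma-separated variable into clean, lowercase entries.
function parseList(raw: string | undefined): string[] {
  return (raw ?? "")
    .split(",")
    .map((entry) => entry.trim().toLowerCase())
    .filter((entry) => entry.length > 0);
}

// DD_TOOLS (allowlist) wins; DD_DISABLE (denylist) only applies without it.
function resolveCategories(env: Env): string[] {
  const allow = parseList(env["DD_TOOLS"]);
  if (allow.length > 0) {
    return ALL_CATEGORIES.filter((category) => allow.includes(category));
  }
  const deny = parseList(env["DD_DISABLE"]);
  return ALL_CATEGORIES.filter((category) => !deny.includes(category));
}
```

With `DD_TOOLS=metrics,monitors` and `DD_DISABLE=metrics` both set, this sketch still returns `metrics` and `monitors`, matching the "ignored when DD_TOOLS is set" rule.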

When MCP_TRANSPORT=http: POST /mcp (Bearer-auth JSON-RPC) + GET /health (public liveness).
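
For orientation, a request to the HTTP endpoint could be assembled as below. This is a sketch: `buildToolsListRequest` and `McpHttpRequest` are hypothetical names, the base URL comes from the defaults in the table above, and a real MCP client sends an `initialize` request before calling `tools/list`. The `Accept` header follows the MCP Streamable HTTP transport, which lets the server answer with either JSON or an event stream.

```typescript
interface McpHttpRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

// Build (but do not send) a Bearer-authenticated JSON-RPC request for POST /mcp.
function buildToolsListRequest(baseUrl: string, token: string, id = 1): McpHttpRequest {
  return {
    url: `${baseUrl.replace(/\/+$/, "")}/mcp`, // tolerate a trailing slash
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${token}`, // value of MCP_HTTP_TOKEN
        "Content-Type": "application/json",
        Accept: "application/json, text/event-stream",
      },
      body: JSON.stringify({ jsonrpc: "2.0", id, method: "tools/list" }),
    },
  };
}
```

The result can be passed straight to `fetch(request.url, request.init)` in Node.js 22.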

Sites:

Site	Value	Region
US1	`datadoghq.com`	US (Virginia)
US3	`us3.datadoghq.com`	US (Virginia)
US5	`us5.datadoghq.com`	US (Oregon)
EU1	`datadoghq.eu`	EU (Frankfurt)
AP1	`ap1.datadoghq.com`	Asia-Pacific (Tokyo)
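
The DD_SITE value determines which API host requests go to; Datadog API hosts take the form `api.<site>`. The sketch below only illustrates that mapping. `apiBaseUrl` and `isListedSite` are hypothetical helpers, the fallback mirrors the `us5.datadoghq.com` default from the configuration table, and in practice the official client resolves the host itself. Unlisted sites are accepted on purpose, since the server is meant to work with internal or sovereign deployments.

```typescript
const LISTED_SITES = [
  "datadoghq.com",
  "us3.datadoghq.com",
  "us5.datadoghq.com",
  "datadoghq.eu",
  "ap1.datadoghq.com",
];

// Map a DD_SITE value to an API base URL, falling back to the documented default.
function apiBaseUrl(site: string | undefined, fallback = "us5.datadoghq.com"): string {
  const value = (site ?? "").trim() || fallback;
  return `https://api.${value}`;
}

// True only for the sites in the table above; other values are still usable.
function isListedSite(site: string): boolean {
  return LISTED_SITES.includes(site);
}
```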

Token efficiency

A naive setup loads ~25K tokens of tool schema before any conversation starts. Three knobs reduce that cost:

Scenario	Tools	Schema tokens	vs default
default (all categories)	163	25,200	—
typical (`DD_TOOLS=metrics,monitors,logs,apm,dashboards`)	55	9,300	−63%
narrow (`DD_TOOLS=metrics,monitors`)	24	3,800	−85%

Category toggles — DD_TOOLS=metrics,monitors,logs,apm (biggest win).
extractFields response projection — get-dashboard { dashboardId: "abc", extractFields: "id,title,widgets.*.definition.type" }.
search-tools meta-tool — always enabled; lets the LLM discover tools at runtime instead of preloading all schemas.
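
To make the `extractFields` path syntax concrete, here is one way a projection like `id,title,widgets.*.definition.type` could work. It is a sketch under assumptions, not the toolkit's implementation: `parseFields`, `project`, and `extractFieldsSketch` are hypothetical names, `*` is assumed to mean "every array element", and missing paths are assumed to be dropped silently.

```typescript
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };
interface FieldTree { [segment: string]: FieldTree }

// Turn "a,b.c" into a nested tree: { a: {}, b: { c: {} } }.
function parseFields(spec: string): FieldTree {
  const root: FieldTree = {};
  for (const path of spec.split(",")) {
    let node = root;
    for (const segment of path.trim().split(".")) {
      if (segment === "") continue;
      const next = node[segment] ?? {};
      node[segment] = next;
      node = next;
    }
  }
  return root;
}

// Keep only the parts of `source` named by `tree`; an empty tree keeps everything.
function project(source: Json, tree: FieldTree): Json | undefined {
  const segments = Object.keys(tree);
  if (segments.length === 0) return source;
  if (Array.isArray(source)) {
    const each = tree["*"];
    if (each === undefined) return undefined;
    return source
      .map((item) => project(item, each))
      .filter((item): item is Json => item !== undefined);
  }
  if (source === null || typeof source !== "object") return undefined;
  const out: { [key: string]: Json } = {};
  for (const segment of segments) {
    const child = source[segment];
    if (child === undefined) continue;
    const projected = project(child, tree[segment] ?? {});
    if (projected !== undefined) out[segment] = projected;
  }
  return Object.keys(out).length > 0 ? out : undefined;
}

function extractFieldsSketch(source: Json, spec: string): Json {
  return project(source, parseFields(spec)) ?? {};
}
```

Applied to a dashboard payload, the projection keeps the ID, the title, and each widget's type while dropping bulky fields such as widget queries.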

Read-only mode

By default, all writes are blocked to prevent accidental mutations by AI agents. The following require DD_ALLOW_WRITE=true:

create-monitor, update-monitor, delete-monitor, mute-monitor, create-dashboard, update-dashboard, delete-dashboard, send-logs, post-event, trigger-synthetics, create-synthetics-test, update-synthetics-test, delete-synthetics-test, create-downtime, cancel-downtime, create-case, update-case-status, send-dora-deployment, send-dora-incident, create-slo, update-slo, delete-slo, plus all fleet/status-page/security writes.
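
The guard behind read-only mode can be sketched as a check that runs before any tool handler. The names `WRITE_TOOLS`, `isWriteAllowed`, and `assertToolAllowed` are hypothetical, the set holds only a subset of the tools listed above, and accepting exactly the string `true` is an assumption about how strictly the flag is parsed.

```typescript
// Illustrative subset of the write tools listed above.
const WRITE_TOOLS = new Set([
  "create-monitor", "update-monitor", "delete-monitor",
  "create-slo", "update-slo", "delete-slo",
]);

// Writes are enabled only by the exact value "true" (an assumption of this sketch).
function isWriteAllowed(env: Record<string, string | undefined>): boolean {
  return env["DD_ALLOW_WRITE"] === "true";
}

// Throw before the handler runs if a write tool is called in read-only mode.
function assertToolAllowed(tool: string, env: Record<string, string | undefined>): void {
  if (WRITE_TOOLS.has(tool) && !isWriteAllowed(env)) {
    throw new Error(`${tool} is a write tool; set DD_ALLOW_WRITE=true to enable it`);
  }
}
```

Failing closed like this means a missing or misspelled variable leaves the server read-only.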

MCP Prompts (4)

Workflow templates the model can invoke directly:

triage-incident — given an incident ID, walks linked monitors, recent error spikes, and recent deploys.
audit-monitor-noise — flags flapping monitors via alert frequency × MTTR.
analyze-rum-error-spike — diffs RUM error rates across two windows and attributes the change to top error groups.
investigate-slow-trace — given a slow trace ID, traverses the span tree and surfaces bottleneck spans.
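
The "alert frequency × MTTR" heuristic behind audit-monitor-noise can be illustrated with a simple filter. This is a sketch only: `MonitorAlertStats` and `findFlappingMonitors` are hypothetical, the input is assumed to be pre-computed per monitor over a 7-day window, and the default thresholds are borrowed from the example prompt earlier on this page (more than 10 alerts, MTTR under 5 minutes).

```typescript
// Hypothetical per-monitor summary over the audit window.
interface MonitorAlertStats {
  monitorId: number;
  name: string;
  alertCount: number;
  meanTimeToResolveMinutes: number;
}

// Frequent alerts that resolve quickly suggest flapping rather than real incidents.
function findFlappingMonitors(
  stats: MonitorAlertStats[],
  minAlerts = 10,
  maxMttrMinutes = 5,
): MonitorAlertStats[] {
  return stats
    .filter((m) => m.alertCount > minAlerts && m.meanTimeToResolveMinutes < maxMttrMinutes)
    .sort((a, b) => b.alertCount - a.alertCount); // noisiest first
}
```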

MCP Resources

Read-only entities by URI: dd://monitor/{id}, dd://dashboard/{id}, dd://slo/{id}, dd://incident/{id}, dd://service/{serviceName}, dd://team/{teamId} (team + members), dd://synthetics/{testId}, dd://host/{name}.
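
A client that wants to validate one of these URIs before requesting it could parse it as follows. This is a sketch: `parseResourceUri`, `ResourceKind`, and `ResourceRef` are hypothetical names, and the set of kinds is taken only from the URIs listed above.

```typescript
const RESOURCE_KINDS = [
  "monitor", "dashboard", "slo", "incident", "service", "team", "synthetics", "host",
] as const;
type ResourceKind = (typeof RESOURCE_KINDS)[number];

interface ResourceRef {
  kind: ResourceKind;
  id: string;
}

// Parse "dd://<kind>/<id>"; return undefined for anything else.
function parseResourceUri(uri: string): ResourceRef | undefined {
  const match = /^dd:\/\/([a-z]+)\/(.+)$/.exec(uri);
  if (match === null) return undefined;
  const kind = match[1];
  const id = match[2];
  if (kind === undefined || id === undefined) return undefined;
  const known = RESOURCE_KINDS.find((candidate) => candidate === kind);
  return known === undefined ? undefined : { kind: known, id };
}
```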

Tool reference

163 tools across 16 categories. Use the search-tools meta-tool to discover them at runtime; the full list is below.

Domain	Tools
Status Pages	21
RUM (events + apps + metrics + retention)	27
Metrics, Hosts, SLOs, Downtimes, Containers, Processes	19
Fleet Automation	15
Synthetics, Logs/Spans Metrics, SLO Corrections	16
Monitors, Dashboards, Notebooks, Events	16
Incidents, Cases, Error Tracking, Audit	13
OnCall, Teams, Users, Services, Bots	11
Security signals + rules + suppressions	9
APM, CI Visibility, DORA, Network Devices	9
+ aggregations	`analyze-monitor-state`, `slo-compliance-snapshot`
+ meta	`search-tools`

Full tool list

Metrics (5)

query-metrics, get-metrics, get-metric-metadata, list-active-metrics, list-metric-tags

Monitors (7)

get-monitors, get-monitor, create-monitor, update-monitor, delete-monitor, mute-monitor, validate-monitor, analyze-monitor-state (aggregation)

Dashboards (5)

get-dashboards, get-dashboard, create-dashboard, update-dashboard, delete-dashboard

Logs (3)

search-logs, aggregate-logs, send-logs

Events (2)

get-events, post-event

Incidents (6)

get-incidents, get-incident, search-incidents, create-incident, update-incident, delete-incident

APM (1)

search-spans

RUM (17)

search-rum-events, aggregate-rum, list-rum-applications, get-rum-application, create-rum-application, update-rum-application, delete-rum-application, list-rum-metrics, get-rum-metric, create-rum-metric, update-rum-metric, delete-rum-metric, list-rum-retention-filters, get-rum-retention-filter, create-rum-retention-filter, update-rum-retention-filter, delete-rum-retention-filter

SLOs (6)

list-slos, get-slo, get-slo-history, create-slo, update-slo, delete-slo, slo-compliance-snapshot (aggregation), plus 5 SLO-correction tools

Synthetics (6)

list-synthetics, get-synthetics-result, trigger-synthetics, create-synthetics-test, update-synthetics-test, delete-synthetics-test

Hosts / Containers / Processes (4)

list-hosts, get-host-totals, list-containers, list-processes

Downtimes (3)

list-downtimes, create-downtime, cancel-downtime

Security (9)

search-security-signals, get-security-signal, list-security-rules, get-security-rule, delete-security-rule, list-security-suppressions, get-security-suppression, create-security-suppression, delete-security-suppression

CI Visibility (4)

search-ci-pipelines, aggregate-ci-pipelines, search-ci-tests, aggregate-ci-tests

Cases (4)

list-cases, get-case, create-case, update-case-status

Error Tracking (2)

list-error-tracking-issues, get-error-tracking-issue

DORA (2)

send-dora-deployment, send-dora-incident

Network Devices (2)

list-network-devices, get-network-device

Notebooks (2)

list-notebooks, get-notebook

OnCall (2)

get-team-oncall, get-oncall-schedule

Services & Software Catalog (2)

list-services, get-service-definition

Teams (6)

list-teams, get-team, create-team, update-team, delete-team, get-team-members

Account & Users (2)

get-usage-summary, list-users

Logs metrics, spans metrics, and APM retention filters (15)

5 each for logs-metrics, spans-metrics, apm-retention-filters (list/get/create/update/delete)

Status Pages (21)

Full lifecycle: pages, components, degradations, maintenances. See src/tools/status-pages.ts.

Fleet Automation (15)

Agents, deployments, schedules. See src/tools/fleet.ts.

Audit (1)

search-audit-logs

Meta (1)

search-tools — query other tools by keyword; always enabled regardless of DD_TOOLS.

Architecture

Claude → MCP stdio → index.ts → tools/*.ts → @datadog/datadog-api-client → Datadog API

Built on @us-all/mcp-toolkit:

extractFields — token-efficient response projections
aggregate(fetchers, caveats) — fan-out helper for aggregation tools
createWrapToolHandler — domain-specific redaction (DD_API_KEY/DD_APP_KEY) + Datadog ApiException error extraction
search-tools meta-tool
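
The redaction step attributed to createWrapToolHandler can be approximated as shown below. This is a sketch of the idea, not the toolkit's API: `redactSecrets` and `wrapHandlerSketch` are hypothetical, and the `{ error }` return shape is an assumption. The point is that a key value appearing in an upstream error message is replaced before the message reaches the model.

```typescript
// Replace every occurrence of each known secret with a placeholder.
function redactSecrets(message: string, secrets: Array<string | undefined>): string {
  let redacted = message;
  for (const secret of secrets) {
    if (secret === undefined || secret.length === 0) continue;
    redacted = redacted.split(secret).join("[REDACTED]");
  }
  return redacted;
}

// Wrap a tool handler so thrown errors come back as redacted error objects.
function wrapHandlerSketch<A, R>(
  handler: (args: A) => Promise<R>,
  secrets: Array<string | undefined>,
): (args: A) => Promise<R | { error: string }> {
  return async (args) => {
    try {
      return await handler(args);
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      return { error: redactSecrets(message, secrets) };
    }
  };
}
```

In a server, the secrets array would hold the values of DD_API_KEY and DD_APP_KEY read at startup.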

Tech stack

Node.js 22+ • TypeScript strict ESM • pnpm • @modelcontextprotocol/sdk • @datadog/datadog-api-client (official) • zod • dotenv • vitest + dd-trace.

Contributing

See CONTRIBUTING.md. New shared patterns belong in @us-all/mcp-toolkit — single source of truth for the 7-server suite.

License

MIT

Configuration

DD_API_KEY (required, secret)

Datadog API key

DD_APP_KEY (required, secret)

Datadog application key

DD_SITE (default: datadoghq.com)

Datadog site (datadoghq.com, datadoghq.eu, us3.datadoghq.com, etc.)

DD_TOOLS

Comma-separated category allowlist (e.g. metrics,monitors,logs). Default: all categories enabled.

DD_DISABLE

Comma-separated category denylist.

DD_ALLOW_WRITE (default: false)

Set to 'true' to enable write/destructive tools. Default read-only.
