Wraps xAI's official Grok CLI to give Claude and other MCP hosts four tools for adversarial code review: grok_review scores diffs across correctness, security, and architecture dimensions; grok_challenge hunts for bugs, races, and edge cases; grok_consult handles multi-turn technical discussions; and grok_chat answers one-off questions. Authentication works via browser OAuth or XAI_API_KEY for CI. The standout use case is letting your primary coding agent ask a different model to attack its own work before you ship. Includes a grok-review-ci binary and GitHub Action for blocking PRs on critical findings. Timeout defaults to five minutes since Grok's reasoning models take longer than chat completions.
Use Grok as a peer code reviewer and rigorous second-opinion consultant inside Claude Code, Cursor, Cline, OpenClaw, and any other MCP host, talking to xAI's API directly (just an XAI_API_KEY, no install) or via the official Grok CLI.
grok-mcp (npm: grok-cli-mcp) is a Model Context Protocol server for Grok. It gives your primary agent (Claude, Cursor, etc.) four tools so it can delegate to Grok for high-quality second opinions and rigorous validation without leaving the session. As of v0.3.0 it talks to xAI's API directly — no grok binary required — and still supports the CLI for OAuth users:
- `grok_review`: structured diff review with per-dimension scores
- `grok_challenge`: thorough analysis for bugs, races, edge cases, and security issues
- `grok_consult`: multi-turn consultation (caller owns history)
- `grok_chat`: one-shot questions
Most "Grok MCP" packages expose Grok's chat/search/image capabilities so Claude can use Grok. grok-mcp lets your main coding agent (Claude/Cursor/…) ask Grok for a rigorous second opinion on its own work. A different model providing thorough review often catches issues that single-model loops miss.
Four tools, all stateless, all stdout-only:
| Tool | Use it for |
|---|---|
| `grok_chat` | One-shot prompt → Grok's reply |
| `grok_review` | Pass a unified diff (or auto-grab `git diff main...HEAD`) and get a per-dimension code review |
| `grok_consult` | Replay a message history for multi-turn; the caller owns the thread |
| `grok_challenge` | Rigorous analysis: ask Grok to surface bugs, race conditions, edge cases, and security issues |
- An XAI_API_KEY from console.x.ai. The server calls xAI's HTTP API directly, so no extra binary is needed.
- Alternatively, the official Grok CLI, used when no XAI_API_KEY is set:
curl -fsSL https://x.ai/cli/install.sh | bash
Then authenticate with browser OAuth (run grok once interactively). See Authentication below.

npm install -g grok-cli-mcp
# or use npx — no install needed
npx grok-cli-mcp
Why is the npm name `grok-cli-mcp` instead of `grok-mcp`? The bare `grok-mcp` name on npm was already taken by an unrelated project (a Grok HTTP-API integration). The brand, GitHub repo, and MCP server identity stay `grok-mcp`; only the npm install identifier is `grok-cli-mcp`, chosen to highlight that this server wraps the official Grok CLI.
There are two auth methods, each tied to a backend:
| Method | Backend | Best for | Rate limits |
|---|---|---|---|
| API key (`XAI_API_KEY` env var) | API mode (no `grok` binary needed) | MCP / CI / automation | Pay-per-call, no subscription cap |
| Browser OAuth (`grok` interactive login) | CLI mode | Local interactive use | Subject to your grok.com plan tier |
Setting XAI_API_KEY switches the server to API mode, so you can keep your browser login for interactive grok use and use a key just for this MCP server via its env block:
{
  "mcpServers": {
    "grok": {
      "command": "npx",
      "args": ["-y", "grok-cli-mcp"],
      "env": {
        "XAI_API_KEY": "xai-...",
        "GROK_MCP_TIMEOUT": "600000"
      }
    }
  }
}
Treat the key as a secret: it ends up in your MCP host's config (e.g. ~/.claude.json), which is plain JSON on disk.
Recommended — use add-json so the env block parses cleanly:
claude mcp add-json -s user grok '{
  "command": "npx",
  "args": ["-y", "grok-cli-mcp"],
  "env": { "XAI_API_KEY": "xai-...", "GROK_MCP_TIMEOUT": "600000" }
}'
Why `add-json` and not `claude mcp add -e ...`? The `-e KEY=val` flag is variadic and will greedily consume the server name as another env value if you pass more than one. `add-json` sidesteps that footgun entirely.
Or edit ~/.claude.json directly. Minimal (OAuth fallback):
{
  "mcpServers": {
    "grok": {
      "command": "npx",
      "args": ["-y", "grok-cli-mcp"]
    }
  }
}
Create .cursor/mcp.json (project) or ~/.cursor/mcp.json (global):
{
  "mcpServers": {
    "grok": {
      "command": "npx",
      "args": ["-y", "grok-cli-mcp"]
    }
  }
}
Settings → Cline → MCP Servers:
{
  "grok": {
    "command": "npx",
    "args": ["-y", "grok-cli-mcp"]
  }
}
grok-mcp speaks plain stdio MCP. Point any client at npx -y grok-cli-mcp and it works.
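For a client that is not one of the hosts above, the sketch below shows roughly what a tool invocation looks like on the wire. It assumes the standard MCP stdio transport (newline-delimited JSON-RPC 2.0) and the `tools/call` method from the MCP specification. The helper name `tool_call_line` is hypothetical, and a real client has to complete the `initialize` handshake before sending this message.

```python
import json
from typing import Any, Dict


def tool_call_line(request_id: int, name: str, arguments: Dict[str, Any]) -> str:
    """Serialize one MCP `tools/call` request as a single newline-terminated line."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }
    # json.dumps escapes newlines inside strings, so the message stays on one line,
    # which is what newline-delimited framing needs.
    return json.dumps(message) + "\n"


# Hypothetical usage: the line a client would write to the server's stdin.
line = tool_call_line(
    1, "grok_chat", {"prompt": "Explain consistent hashing in two sentences."}
)
```

In practice an MCP client library handles this framing for you; the sketch is only meant to make "plain stdio MCP" concrete.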
grok_chat

{ "prompt": "Explain consistent hashing in two sentences." }
Optional: model to override the default Grok model; timeout (seconds) to extend the per-call limit for long grok-4 reasoning. All four tools accept timeout.
grok_review

{ "base_ref": "main", "focus": "security" }
If diff is omitted, runs git diff <base_ref>...HEAD in cwd (defaults to your host's working directory). Returns a markdown review by default with verdict, per-dimension scores (correctness / readability / architecture / security / performance), and concrete fix-it items.
Pass "format": "json" to get machine-parseable output suitable for CI gating — see Use as a PR gate.
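The diff fallback described above can be sketched as follows. This is an illustration under stated assumptions, not the server's actual code: `diff_command` and `resolve_diff` are hypothetical names, and the `main` default for `base_ref` is inferred from the tool table rather than confirmed.

```python
import subprocess
from typing import List, Optional


def diff_command(base_ref: str = "main") -> List[str]:
    """Build the three-dot diff command: changes on HEAD since it diverged from base_ref."""
    return ["git", "diff", f"{base_ref}...HEAD"]


def resolve_diff(diff: Optional[str], base_ref: str = "main", cwd: Optional[str] = None) -> str:
    """Return the caller's diff if given, else run git in cwd to produce one."""
    if diff:
        # An explicit diff wins; an empty string is treated like an omitted one (assumption).
        return diff
    result = subprocess.run(
        diff_command(base_ref),
        cwd=cwd,
        check=True,           # raise if git exits non-zero (e.g. unknown ref)
        capture_output=True,
        text=True,
    )
    return result.stdout
```

The three-dot form compares `HEAD` against the merge base with `base_ref`, so commits that landed on `main` after the branch was cut do not show up in the review.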
grok_consult

{
  "messages": [
    { "role": "system", "content": "You are a senior backend engineer." },
    { "role": "user", "content": "How would you cache this query?" },
    { "role": "assistant", "content": "Two options..." },
    { "role": "user", "content": "What's the failure mode of option 2?" }
  ]
}
The server is stateless — the caller passes the full thread each time. Most MCP hosts handle this naturally.
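If you are driving `grok_consult` from your own client, "caller owns the thread" might look like the sketch below. The helper names `append_turn` and `consult_arguments` are hypothetical; the only thing taken from the tool itself is the `messages` shape shown above.

```python
from typing import Dict, List

Message = Dict[str, str]


def append_turn(history: List[Message], role: str, content: str) -> List[Message]:
    """Return a new history with one more message; the input list is not mutated."""
    if role not in {"system", "user", "assistant"}:
        raise ValueError(f"unexpected role: {role!r}")
    return history + [{"role": role, "content": content}]


def consult_arguments(history: List[Message]) -> Dict[str, List[Message]]:
    """Build the grok_consult arguments: the whole thread, on every call."""
    return {"messages": list(history)}


history: List[Message] = []
history = append_turn(history, "system", "You are a senior backend engineer.")
history = append_turn(history, "user", "How would you cache this query?")
first_call = consult_arguments(history)

# Once Grok's reply comes back, the caller records it and asks the follow-up.
history = append_turn(history, "assistant", "Two options...")
history = append_turn(history, "user", "What's the failure mode of option 2?")
second_call = consult_arguments(history)
```

Because nothing is stored server-side, dropping or editing earlier messages before the next call is entirely the caller's decision.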
grok_challenge

{
  "code": "function transfer(from, to, amount) { from.balance -= amount; to.balance += amount; }",
  "context": "Node.js, called concurrently from HTTP handlers"
}
Returns severity-ranked issues (Critical / High / Medium / Low) with concrete reproductions and patches.
| Env var | Default | Purpose |
|---|---|---|
| `XAI_API_KEY` | (unset; falls back to OAuth) | API key from console.x.ai. When set, the server uses API mode (direct HTTP) and bills pay-per-call with no subscription rate cap. See Authentication. |
| `GROK_MCP_BACKEND` | `auto` | Which backend to use: `api` (direct HTTP), `cli` (shell out to `grok`), or `auto` (API when `XAI_API_KEY` is set, else CLI). See Backends. |
| `GROK_MCP_MODEL` | `grok-4` | Model used in API mode. (CLI mode reads `~/.grok/config.toml`.) |
| `GROK_MCP_BASE_URL` | `https://api.x.ai/v1` | API base URL; point it at a proxy or compatible gateway in API mode. |
| `GROK_MCP_BIN` | `grok` | Path to the `grok` binary (CLI mode only) |
| `GROK_MCP_TIMEOUT` | `300000` | Default per-call timeout in milliseconds |
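As an illustration of how the API-mode variables in this table fit together, here is a sketch that builds (but does not send) a chat-completions request. It assumes the OpenAI-style request body that xAI's `/chat/completions` endpoint accepts; `build_chat_request` is a hypothetical helper, not the server's actual code, which uses Node's `fetch`.

```python
import json
from typing import Mapping
from urllib.request import Request


def build_chat_request(prompt: str, env: Mapping[str, str]) -> Request:
    """Assemble a POST to <base URL>/chat/completions from the documented env vars."""
    base_url = env.get("GROK_MCP_BASE_URL", "https://api.x.ai/v1").rstrip("/")
    model = env.get("GROK_MCP_MODEL", "grok-4")
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return Request(
        url=f"{base_url}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {env['XAI_API_KEY']}",  # API mode requires a key
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Hypothetical usage with a placeholder key; nothing is sent over the network here.
request = build_chat_request("Explain consistent hashing.", {"XAI_API_KEY": "xai-..."})
```

Pointing `GROK_MCP_BASE_URL` at a gateway only changes the URL prefix in this sketch; the path and body stay the same.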
The server can reach Grok two ways and chooses one at startup (it logs which to stderr):
- API mode: calls xAI's /chat/completions endpoint directly using Node's built-in fetch. No grok binary required, cleaner errors, pay-per-call. Selected when XAI_API_KEY is set, or forced with GROK_MCP_BACKEND=api.
- CLI mode: shells out to the grok binary (supports browser OAuth). Selected when no XAI_API_KEY is set, or forced with GROK_MCP_BACKEND=cli.

Force a mode with GROK_MCP_BACKEND. In API mode, set the model with GROK_MCP_MODEL; in CLI mode, model defaults live in ~/.grok/config.toml.
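The selection rule above is small enough to write down. The following is a sketch of that rule only; `resolve_backend` is a hypothetical name, and how the real server reacts to an unrecognized value, or to `api` being forced without a key, is not stated here and is an assumption in the code.

```python
from typing import Mapping


def resolve_backend(env: Mapping[str, str]) -> str:
    """Pick "api" or "cli" from GROK_MCP_BACKEND and XAI_API_KEY."""
    mode = env.get("GROK_MCP_BACKEND", "auto").strip().lower()
    if mode in ("api", "cli"):
        # An explicit choice is honored as-is.
        return mode
    if mode != "auto":
        # Assumption: reject unknown values rather than guessing.
        raise ValueError(f"unknown GROK_MCP_BACKEND: {mode!r}")
    # auto: API when a key is present, otherwise fall back to the CLI.
    return "api" if env.get("XAI_API_KEY") else "cli"
```

For example, `resolve_backend({"XAI_API_KEY": "xai-...", "GROK_MCP_BACKEND": "cli"})` returns `"cli"`, which matches keeping a key in the environment while still forcing the OAuth-backed CLI.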
grok-4 is a reasoning model and long prompts routinely take longer than two minutes. The server's default per-call limit is 300s (5 min). You can change it three ways:
- Per call: pass timeout (seconds) to any tool: { "prompt": "...", "timeout": 600 }.
- Per server: set GROK_MCP_TIMEOUT (milliseconds) in the MCP server's env.
- Host side: raise the host's own limits, MCP_TIMEOUT (server startup) and MCP_TOOL_TIMEOUT (per tool call), both in milliseconds.

On timeout the error includes any partial output Grok produced before the deadline, so you don't lose a near-complete answer.
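The two server-side knobs use different units, which is easy to trip over. Here is a minimal sketch of how they might combine, assuming a per-call `timeout` takes precedence over `GROK_MCP_TIMEOUT`; that precedence and the name `effective_timeout_ms` are assumptions, not confirmed behavior.

```python
from typing import Mapping, Optional

DEFAULT_TIMEOUT_MS = 300_000  # documented default: 300 s


def effective_timeout_ms(
    call_timeout_s: Optional[float] = None,
    env: Optional[Mapping[str, str]] = None,
) -> int:
    """Resolve the per-call limit in milliseconds."""
    if call_timeout_s is not None:
        # Tool argument is in seconds; convert to milliseconds.
        return int(call_timeout_s * 1000)
    raw = (env or {}).get("GROK_MCP_TIMEOUT")
    if raw:
        # Env var is already in milliseconds.
        return int(raw)
    return DEFAULT_TIMEOUT_MS
```

Whatever the server resolves, the host's own tool timeout still applies on top, so a 600-second call needs the host limit raised as well.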
grok-mcp ships a grok-review-ci bin and a composite GitHub Action so Grok can review every PR and fail the check on block.
Drop this into .github/workflows/grok-review.yml in your repo:
name: Grok review
on: { pull_request: { branches: [main] } }
permissions: { contents: read, pull-requests: write }
jobs:
  grok:
    runs-on: ubuntu-latest
    if: ${{ github.event.pull_request.head.repo.full_name == github.repository }}
    steps:
      - uses: actions/checkout@v4
        with: { fetch-depth: 0 }
      - uses: howardpen9/grok-mcp/.github/actions/grok-review@main
        with:
          xai-api-key: ${{ secrets.XAI_API_KEY }}
          gate-on: block # also accepts: block,request_changes
          # focus: security # optional
          # min-score: 6 # optional: fail any dimension below this
The action posts a sticky PR comment with verdict + per-dimension scores + concrete blockers, and exits non-zero (failing the check) when the verdict matches gate-on. Full example with comments: examples/workflows/grok-review.yml.
Want JSON straight from the tool instead? Pass format: "json" to grok_review — same schema as the bin emits, suitable for any pipeline:
{
  "verdict": "block",
  "summary": "Unparameterised SQL query in src/db.ts.",
  "scores": { "correctness": 4, "readability": 7, "architecture": 5, "security": 2, "performance": 8 },
  "blockers": [
    {
      "severity": "critical", "title": "SQL injection", "file": "src/db.ts", "line": 42,
      "reason": "User input concatenated directly into the query.",
      "fix": "Use the parameterised form `db.query(sql, [userId])`."
    }
  ],
  "notes": []
}
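If you consume this JSON in your own pipeline rather than through the action, a gate could be sketched like this. It mirrors the `gate-on` and `min-score` options described above, but it is an assumption about their semantics, and `should_fail` is a hypothetical helper rather than part of `grok-review-ci`.

```python
import json
from typing import Any, Dict, Iterable, Optional


def should_fail(
    review: Dict[str, Any],
    gate_on: Iterable[str] = ("block",),
    min_score: Optional[int] = None,
) -> bool:
    """True if the verdict is gated, or any dimension scores below min_score."""
    if review["verdict"] in set(gate_on):
        return True
    if min_score is not None:
        return any(score < min_score for score in review["scores"].values())
    return False


# A trimmed copy of the sample output above.
review = json.loads("""
{
  "verdict": "block",
  "scores": { "correctness": 4, "readability": 7, "architecture": 5,
              "security": 2, "performance": 8 },
  "blockers": [ { "severity": "critical", "title": "SQL injection" } ]
}
""")
```

A CI step would then end with something like `sys.exit(1 if should_fail(review) else 0)` so that a gated verdict fails the job.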
See docs/improvement-plan.md and CHANGELOG.md.

- grok_review JSON mode + grok-review-ci bin + GitHub Action for PR gating.
- Direct API backend (no grok CLI required); GROK_MCP_BACKEND api/cli/auto.
- grok_consult can take a conversation_id
- progress notifications

git clone https://github.com/howardpen9/grok-mcp.git
cd grok-mcp
npm install
npm test
npm run build
Bug reports & feature requests → GitHub issues. DMs welcome on X: @0xHoward_Peng.
MIT