Agent Replay

STDIO · registry active

Summary

Records every action an AI agent takes so you can replay sessions step by step and catch regressions. Provides tools to start and stop recording, log individual actions with inputs, outputs, reasoning and timing, then replay the full sequence. The real utility is in compare_sessions, which diffs two runs and shows you exactly where behavior diverged, and find_divergence_point, which pinpoints the first step that deviated from expected output. Export sessions as JSON for tooling or markdown for human review. Useful when debugging non-deterministic agent behavior, comparing model versions, or proving that a workflow changed between runs. Works over stdio and stores sessions in memory keyed by session ID.

agent-replay-mcp

MCP server for agent session recording and replay — debug non-deterministic agent behavior with session comparison and divergence detection.

Record every action an agent takes, replay sessions step by step, diff two runs to find behavioral regressions, and pinpoint exactly where an agent diverged from expected output.

Install

npx agent-replay-mcp

Claude Desktop

Add to claude_desktop_config.json:

{
  "mcpServers": {
    "agent-replay": {
      "command": "npx",
      "args": ["agent-replay-mcp"]
    }
  }
}

From source

git clone https://github.com/mdfifty50-boop/agent-replay-mcp.git
cd agent-replay-mcp
npm install
node src/index.js

Tools

record_session

Start recording all actions for an agent session.

Param	Type	Default	Description
`agent_id`	string	required	Unique agent identifier
`metadata`	object	`{}`	Optional metadata (task, model, environment)

Returns a session_id for use with other tools.
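
For readers wiring this up by hand rather than through a client library, an MCP tool invocation is a JSON-RPC 2.0 `tools/call` request whose `params` carry the tool `name` and its `arguments`. The sketch below builds that message for record_session. The helper name `buildToolCall`, the request id, and the metadata values are illustrative assumptions, and the layout of the server's response is not shown because the README does not document it.

```javascript
// Build a JSON-RPC 2.0 "tools/call" request as described by the MCP specification.
// buildToolCall is a hypothetical helper, not part of agent-replay-mcp.
function buildToolCall(id, name, args) {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

const request = buildToolCall(1, "record_session", {
  agent_id: "support-bot", // required
  metadata: { task: "refund-flow", model: "example-model" }, // optional, defaults to {}
});

// Over stdio, each message is sent as a single line of JSON.
console.log(JSON.stringify(request));
```

In practice an MCP client such as Claude Desktop produces this message for you; the sketch is only meant to show which fields map to the parameters in the table above.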

stop_recording

Stop recording and return a session summary.

Param	Type	Description
`session_id`	string	Session ID from record_session

Returns: action count, total duration, action type breakdown.

log_action

Log a single action during a recording session.

Param	Type	Default	Description
`session_id`	string	required	Active session ID
`action_type`	string	required	Type (tool_call, llm_response, decision, error)
`input`	any	required	Input to the action
`output`	any	required	Output from the action
`reasoning`	string	`""`	Agent reasoning for this step
`duration_ms`	number	`0`	Action duration in milliseconds
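
To make the relationship between record_session, log_action, and stop_recording concrete, here is a minimal sketch of an in-memory store keyed by session ID. It is not the server's actual implementation: the function names, the `session-N` ID scheme, the result field names, and the choice to compute total duration as the sum of `duration_ms` values are all assumptions made for illustration.

```javascript
// Hypothetical in-memory store; the real server's internals may differ.
const sessions = new Map();
let nextId = 1;

function recordSession(agentId, metadata = {}) {
  const sessionId = `session-${nextId++}`; // assumed ID scheme
  sessions.set(sessionId, { agentId, metadata, status: "recording", actions: [] });
  return sessionId;
}

function logAction(sessionId, { action_type, input, output, reasoning = "", duration_ms = 0 }) {
  const session = sessions.get(sessionId);
  if (!session || session.status !== "recording") {
    throw new Error(`No active session: ${sessionId}`);
  }
  session.actions.push({ action_type, input, output, reasoning, duration_ms });
}

function stopRecording(sessionId) {
  const session = sessions.get(sessionId);
  if (!session) throw new Error(`Unknown session: ${sessionId}`);
  session.status = "stopped";
  const breakdown = {};
  let totalDurationMs = 0;
  for (const action of session.actions) {
    breakdown[action.action_type] = (breakdown[action.action_type] || 0) + 1;
    totalDurationMs += action.duration_ms; // assumption: sum of logged durations
  }
  return { action_count: session.actions.length, total_duration_ms: totalDurationMs, breakdown };
}

const id = recordSession("support-bot", { model: "example-model" });
logAction(id, { action_type: "tool_call", input: { q: "order 42" }, output: { status: "shipped" }, duration_ms: 120 });
logAction(id, { action_type: "llm_response", input: "summarize", output: "Order 42 has shipped.", duration_ms: 800 });
console.log(stopRecording(id));
```

Because the sessions live in a `Map` inside the process, they disappear when the server exits, which matches the README's note that sessions are stored in memory.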

replay_session

Replay a recorded session step by step with full action detail.

Param	Type	Description
`session_id`	string	Session ID to replay

Returns: complete action sequence with timing, reasoning, inputs, and outputs.

compare_sessions

Behavioral diff between two sessions. Aligns actions by step index and highlights differences.

Param	Type	Description
`session_id_1`	string	First session
`session_id_2`	string	Second session

Returns: similarity ratio, identical/divergent step counts, first divergence step, and per-step diffs.
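
The index-aligned comparison described above can be sketched as follows. This is an illustration under stated assumptions, not the server's code: two steps are treated as identical when their `action_type` and JSON-serialized `output` match, steps present in only one session count as divergent, and the result field names are made up for the example.

```javascript
// Compare two action lists by step index. Equality is approximated with
// JSON.stringify, which is sensitive to key order; a real implementation
// may use a proper deep-equal instead.
function compareSessions(actionsA, actionsB) {
  const steps = Math.max(actionsA.length, actionsB.length);
  const diffs = [];
  let identical = 0;
  for (let i = 0; i < steps; i++) {
    const a = actionsA[i];
    const b = actionsB[i];
    const same =
      a !== undefined &&
      b !== undefined &&
      a.action_type === b.action_type &&
      JSON.stringify(a.output) === JSON.stringify(b.output);
    if (same) identical++;
    else diffs.push({ step: i, a: a ?? null, b: b ?? null });
  }
  return {
    similarity: steps === 0 ? 1 : identical / steps,
    identical_steps: identical,
    divergent_steps: diffs.length,
    first_divergence: diffs.length ? diffs[0].step : null,
    diffs,
  };
}

const yesterday = [
  { action_type: "tool_call", output: { status: "shipped" } },
  { action_type: "llm_response", output: "Order 42 has shipped." },
];
const today = [
  { action_type: "tool_call", output: { status: "shipped" } },
  { action_type: "llm_response", output: "I cannot find that order." },
  { action_type: "error", output: "timeout" },
];
const result = compareSessions(yesterday, today);
console.log(result.first_divergence, result.divergent_steps);
```

One consequence of aligning by step index is that a single inserted or skipped action shifts every later step, so all following steps are reported as divergent even if the agent recovered.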

find_divergence_point

Find where an agent first deviated from expected output.

Param	Type	Description
`session_id`	string	Session to analyze
`expected_output`	any	Expected final output, or array of per-step expected outputs

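If expected_output is an array, each element is compared against the output of the step at the same index, and the first mismatch is reported. If it is a single value, the tool finds the last step whose output matches that value and flags the following step as the divergence point.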
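
The two modes can be sketched roughly as below. This is one reading of the description above rather than the server's implementation; the function name, the JSON-based equality check, and the `null` return for "no divergence found" are assumptions.

```javascript
// Equality via JSON serialization; an assumption, and sensitive to key order.
const sameValue = (x, y) => JSON.stringify(x) === JSON.stringify(y);

function findDivergencePoint(actions, expected) {
  if (Array.isArray(expected)) {
    // Per-step mode: first index whose output differs from the expectation.
    for (let i = 0; i < expected.length; i++) {
      if (i >= actions.length || !sameValue(actions[i].output, expected[i])) return i;
    }
    return null; // every expected step matched
  }
  // Single-value mode: locate the last step that produced the expected value.
  let lastMatch = -1;
  actions.forEach((action, i) => {
    if (sameValue(action.output, expected)) lastMatch = i;
  });
  // If the session ended on the expected output, nothing diverged.
  if (actions.length > 0 && lastMatch === actions.length - 1) return null;
  return lastMatch + 1; // 0 when no step matched at all
}

const actions = [{ output: "a" }, { output: "b" }, { output: "c" }];
console.log(findDivergencePoint(actions, ["a", "x", "c"])); // per-step mode
console.log(findDivergencePoint(actions, "b")); // single-value mode
```

In the single-value case the flagged step is the first one after the agent last had the right answer, which is a useful starting point for reading the replay.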

export_session

Export a session for sharing and offline analysis.

Param	Type	Default	Description
`session_id`	string	required	Session to export
`format`	string	`"json"`	`"json"` or `"markdown"`

Markdown format produces a readable transcript with step headers, reasoning, and code blocks.
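
As a rough picture of what the two formats might look like, the sketch below renders a session either as pretty-printed JSON or as a markdown transcript with a header, reasoning line, and code blocks per step. The session shape, heading text, and function name are assumptions; the server's real output may be laid out differently.

```javascript
// Hypothetical exporter; field names and markdown layout are illustrative only.
function exportSession(session, format = "json") {
  if (format === "json") return JSON.stringify(session, null, 2);
  if (format !== "markdown") throw new Error(`Unsupported format: ${format}`);

  const fence = "`".repeat(3); // markdown code fence
  const lines = [`# Session ${session.session_id}`, ""];
  session.actions.forEach((action, i) => {
    lines.push(`## Step ${i + 1}: ${action.action_type}`, "");
    if (action.reasoning) lines.push(`Reasoning: ${action.reasoning}`, "");
    lines.push("Input:", `${fence}json`, JSON.stringify(action.input, null, 2), fence, "");
    lines.push("Output:", `${fence}json`, JSON.stringify(action.output, null, 2), fence, "");
  });
  return lines.join("\n");
}

const session = {
  session_id: "demo-1",
  actions: [
    {
      action_type: "tool_call",
      input: { order: 42 },
      output: { status: "shipped" },
      reasoning: "look up the order",
    },
  ],
};
console.log(exportSession(session, "markdown"));
```

JSON keeps the full structure for scripts and diff tools, while the markdown form is meant to be pasted into an issue or review thread.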

Resources

URI	Description
`agent-replay://sessions`	All recorded sessions with status and action counts

Usage Pattern

1. record_session — start recording at agent launch
2. For each agent action:
   - log_action — capture input, output, reasoning, timing
3. stop_recording — finalize the session
4. Debug:
   - replay_session — review what happened step by step
   - compare_sessions — diff today's run vs yesterday's
   - find_divergence_point — pinpoint where it went wrong
5. Share:
   - export_session — JSON for tooling, markdown for humans
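
Steps 1 to 3 above can be sketched as a small wrapper around an agent run. To stay independent of any particular client library, the sketch takes a `callTool(name, args)` function as a parameter; that function, the step object shape, and the assumption that record_session resolves to an object with a `session_id` field are all hypothetical.

```javascript
// callTool is injected so the sketch does not depend on a specific MCP client.
// It is assumed to take (toolName, args) and resolve to the tool's parsed result.
async function recordRun(callTool, agentId, steps) {
  const { session_id } = await callTool("record_session", { agent_id: agentId });
  try {
    for (const step of steps) {
      const started = Date.now();
      const output = await step.run();
      await callTool("log_action", {
        session_id,
        action_type: step.type,
        input: step.input,
        output,
        reasoning: step.reasoning ?? "",
        duration_ms: Date.now() - started,
      });
    }
  } finally {
    // Finalize the session even if a step throws.
    await callTool("stop_recording", { session_id });
  }
  return session_id;
}

// Stand-in for a real MCP client, used only to show the call order.
const calls = [];
const fakeCallTool = async (name, args) => {
  calls.push(name);
  return name === "record_session" ? { session_id: "demo-1" } : {};
};

const done = recordRun(fakeCallTool, "support-bot", [
  { type: "tool_call", input: { order: 42 }, run: async () => ({ status: "shipped" }) },
  { type: "decision", input: "shipped?", run: async () => "notify customer" },
]).then((sessionId) => {
  console.log(sessionId, calls.join(" -> "));
  return sessionId;
});
```

The returned session ID is what you would later pass to replay_session, compare_sessions, find_divergence_point, or export_session.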

Tests

npm test

License

MIT
