Streams through massive NDJSON log files without loading them into memory, which matters when your service crashes and the log is 2GB. Exposes query_log_pattern for field filtering, detect_error_anomalies for Z-score spike detection, and summarize_log_timeline for chronological severity bucketing. Also includes correlate_request for distributed trace reconstruction across multiple files, discover_log_schema for format inference, and group_semantic_patterns using the Drain algorithm for clustering message templates. The start_live_triage tool tails logs with real-time anomaly alerts, and query_external_logs bridges to Datadog, Splunk, and Elasticsearch with OpenTelemetry output mapping. Reach for this when you need to triage production incidents without waiting for your editor to choke on gigabyte files.
Your service just crashed. The log file is 2GB. Your AI agent can't help.
MCP server that stream-parses NDJSON log files without loading them into memory — filter by pattern, detect error spikes via Z-score analysis, summarize severity timelines by time window.
A service crashes at 3am. The log file is app.log.ndjson and it's 2GB. You ask your agent to find what caused the spike in errors around 03:17. The agent can't read 2GB. It can't even try.
ndjson-local-log-triage-mcp streams the file line by line — never loading it into memory — and gives the agent exactly the slice it needs.
query_log_pattern
Filter log entries by a field/value match. Returns up to N matching entries while streaming the file rather than loading it whole. Pass lineStartPattern (e.g. "^{") to reassemble multiline stack traces that the default line-by-line parser would otherwise silently drop.
Log Query Results
File: /var/log/app.log.ndjson
Filter: service contains "auth"
Lines read: 847,293
Matches: 50 (limit 50 reached)
{"timestamp":"2025-01-15T03:17:02Z","level":"error","service":"auth","msg":"token validation failed","userId":"u_abc123"}
...
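The streaming field filter can be pictured with a short Python sketch. This is an illustration only, not the server's code: the function name stream_filter and its parameters are hypothetical, and "contains" is assumed to mean a substring match on the stringified field value.

```python
import json
from typing import Iterator


def stream_filter(path: str, field: str, needle: str, limit: int = 50) -> Iterator[dict]:
    """Yield up to `limit` NDJSON entries whose `field` contains `needle`."""
    matches = 0
    with open(path, "r", encoding="utf-8") as fh:
        # Iterating the file object reads one line at a time,
        # so memory use does not grow with file size.
        for line in fh:
            line = line.strip()
            if not line:
                continue
            try:
                entry = json.loads(line)
            except json.JSONDecodeError:
                continue  # skip lines that are not valid JSON
            if not isinstance(entry, dict):
                continue
            if needle in str(entry.get(field, "")):
                yield entry
                matches += 1
                if matches >= limit:
                    return  # stop reading as soon as the limit is reached
```

Because the function is a generator that returns once the limit is hit, a query with a small limit can finish long before the end of a multi-gigabyte file.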
detect_error_anomalies
Z-score frequency analysis. Buckets errors by time window, computes the mean and standard deviation of the per-window counts, and flags windows where the error rate is anomalously high.
Error Anomaly Detection
File: /var/log/app.log.ndjson
Window: 5min
Z-score cutoff: 2.0
Baseline: mean=3.2 errors/window, stdDev=1.8
Anomalies found: 2
[z=4.71] 2025-01-15T03:15:00.000Z 23 errors
[z=2.33] 2025-01-15T03:20:00.000Z 9 errors
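A minimal Python sketch of windowed Z-score detection follows. It uses the textbook formula z = (count - mean) / stddev over all windows, including empty ones between the first and last error. The function name detect_spikes is hypothetical, and how the tool itself computes its baseline is not stated here, so figures from this sketch need not match the sample output above.

```python
from collections import Counter
from datetime import datetime
from statistics import mean, pstdev


def detect_spikes(error_timestamps, window_s=300, cutoff=2.0):
    """Return (window_start_epoch, count, z) for windows with z >= cutoff."""
    buckets = Counter()
    for ts in error_timestamps:
        epoch = int(datetime.fromisoformat(ts.replace("Z", "+00:00")).timestamp())
        buckets[epoch - epoch % window_s] += 1  # floor to the window start
    if not buckets:
        return []
    # Count empty windows too, so quiet periods pull the baseline down.
    start, end = min(buckets), max(buckets)
    counts = {t: buckets.get(t, 0) for t in range(start, end + window_s, window_s)}
    values = list(counts.values())
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return []  # perfectly flat error rate: nothing to flag
    spikes = []
    for t, c in sorted(counts.items()):
        z = (c - mu) / sigma
        if z >= cutoff:
            spikes.append((t, c, z))
    return spikes
```

One design consequence worth noting: a large spike inflates both the mean and the standard deviation of the very baseline it is scored against, which is why some detectors score each window against earlier windows only.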
summarize_log_timeline
Chronological aggregation of error, warning, and info counts per time window. A quick visual of where the incident is.
Pass adaptive: true to auto-scale bucket size to actual event density and zoom in on the peak error window at 10× finer resolution.
Log Timeline Summary
File: /var/log/app.log.ndjson
Window: 5min
Buckets: 48
  Time (UTC)            Errors  Warnings  Info  Other
  ─────────────────────────────────────────────────────────
  2025-01-15 03:00:00Z       2         8   142      0
  2025-01-15 03:05:00Z       1         5   138      0
  2025-01-15 03:10:00Z       3         9   141      0
! 2025-01-15 03:15:00Z      23        14   119      0
  2025-01-15 03:20:00Z       9        11   133      0
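The bucketing behind a timeline like this can be sketched in a few lines of Python. The timestamp and level field names follow the sample entry shown earlier; the function name summarize_timeline and the accepted spellings of each level ("warn", "warning") are assumptions made for the example.

```python
from collections import defaultdict
from datetime import datetime, timezone

# Assumed mapping from level strings to timeline columns.
COLUMNS = {"error": "errors", "warn": "warnings", "warning": "warnings", "info": "info"}


def summarize_timeline(entries, window_s=300):
    """Return [(window_start_utc, counts)] sorted chronologically."""
    buckets = defaultdict(lambda: {"errors": 0, "warnings": 0, "info": 0, "other": 0})
    for entry in entries:
        ts = datetime.fromisoformat(entry["timestamp"].replace("Z", "+00:00"))
        epoch = int(ts.timestamp())
        start = epoch - epoch % window_s  # floor to the window start
        column = COLUMNS.get(str(entry.get("level", "")).lower(), "other")
        buckets[start][column] += 1
    return [
        (datetime.fromtimestamp(start, tz=timezone.utc), buckets[start])
        for start in sorted(buckets)
    ]
```

Entries can be any iterable, including a generator that streams the file, so only the per-window counters are held in memory.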
correlate_request
Reconstructs a distributed trace from multiple NDJSON log files. Given a trace_id, collects all correlated events in chronological order across all files and surfaces the services involved and the total duration.
Request Correlation
Trace ID: trace-8f7a9b2c
Files scanned: 2
Events found: 10
Services involved: api, worker
Duration: 890ms
[2025-01-15T14:00:00.001Z] api {"level":"info","msg":"incoming request",...}
[2025-01-15T14:00:00.045Z] api {"level":"info","msg":"auth token validated",...}
[2025-01-15T14:00:00.112Z] worker {"level":"info","msg":"job queued",...}
...
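Cross-file correlation reduces to collecting matching entries, sorting them by timestamp, and measuring the span between first and last. The Python sketch below assumes each entry carries trace_id, timestamp, and service fields; the function name correlate and the returned dictionary shape are hypothetical.

```python
import json
from datetime import datetime, timedelta


def _parse(ts: str) -> datetime:
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def correlate(paths, trace_id, trace_field="trace_id"):
    """Collect events for one trace across several NDJSON files."""
    events = []
    for path in paths:
        with open(path, "r", encoding="utf-8") as fh:
            for line in fh:  # stream each file line by line
                try:
                    entry = json.loads(line)
                except json.JSONDecodeError:
                    continue  # blank or malformed line
                if (
                    isinstance(entry, dict)
                    and entry.get(trace_field) == trace_id
                    and "timestamp" in entry
                ):
                    events.append(entry)
    events.sort(key=lambda e: _parse(e["timestamp"]))
    duration_ms = 0
    if events:
        span = _parse(events[-1]["timestamp"]) - _parse(events[0]["timestamp"])
        duration_ms = span // timedelta(milliseconds=1)  # exact integer milliseconds
    services = sorted({e.get("service", "unknown") for e in events})
    return {"events": events, "services": services, "duration_ms": duration_ms}
```

Only the events belonging to the requested trace are kept in memory, which is usually a small fraction of each file.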
discover_log_schema
Analyzes a log file to infer its wrapper format (NDJSON, Syslog, Kubernetes container logs) and extract type schemas, identifying polymorphic keys, timestamp patterns, and severity fields.
{
"fileFormat": "NDJSON",
"detectedKeys": {
"timestamp": { "type": "string", "format": "date-time", "isChronologicalIndex": true },
"level": { "type": "string", "isSeverityField": true, "possibleValues": ["info", "error"] }
}
}
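One piece of schema inference, spotting polymorphic keys, can be sketched by sampling lines and recording every JSON type seen per key. This Python example is a simplification under stated assumptions: the function name infer_key_types, the sample_size parameter, and the output shape are hypothetical, and format, timestamp, and severity detection are left out.

```python
import json
from collections import defaultdict


def infer_key_types(lines, sample_size=1000):
    """Map each top-level key to the set of value types seen in a sample."""
    seen = defaultdict(set)
    for index, line in enumerate(lines):
        if index >= sample_size:
            break  # a bounded sample keeps the scan cheap on huge files
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue
        if not isinstance(entry, dict):
            continue
        for key, value in entry.items():
            seen[key].add(type(value).__name__)
    return {
        key: {"types": sorted(types), "polymorphic": len(types) > 1}
        for key, types in seen.items()
    }
```

A key is reported as polymorphic when the sample contains more than one value type for it, for example a userId logged as a string by one service and as an integer by another.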
group_semantic_patterns
Clusters log messages using Drain, a fixed-depth tree-based parsing algorithm, to isolate distinct log templates and analyze the value distribution of each wildcard parameter.
Processed Logs: 1500
Unique Patterns: 2
- Template: "connection failed from * port *"
Occurrences: 1200
Parameters:
- param_0 (client_ip): 192.168.1.1 (80%), 10.0.0.5 (20%)
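To give a feel for template clustering, here is a deliberately simplified Python sketch. It keeps two ideas from Drain, grouping messages by token count and merging a message into the most similar template while replacing differing tokens with *, but it omits Drain's fixed-depth prefix tree entirely. The function name cluster_templates and the 0.5 similarity threshold are assumptions for the example.

```python
def cluster_templates(messages, sim_threshold=0.5):
    """Return [(template, occurrences)] sorted by occurrences, descending."""
    groups = {}  # token count -> list of [template_tokens, occurrences]
    for message in messages:
        tokens = message.split()
        candidates = groups.setdefault(len(tokens), [])
        best, best_sim = None, 0.0
        for cluster in candidates:
            # Similarity = fraction of positions where tokens match exactly.
            same = sum(1 for a, b in zip(cluster[0], tokens) if a == b)
            sim = same / len(tokens) if tokens else 1.0
            if sim > best_sim:
                best, best_sim = cluster, sim
        if best is not None and best_sim >= sim_threshold:
            # Merge: positions that differ become wildcards.
            best[0] = [a if a == b else "*" for a, b in zip(best[0], tokens)]
            best[1] += 1
        else:
            candidates.append([tokens, 1])
    flat = [(" ".join(t), n) for clusters in groups.values() for t, n in clusters]
    return sorted(flat, key=lambda item: -item[1])
```

Collecting the concrete values that landed on each wildcard position would then give per-parameter distributions like the client_ip breakdown shown above.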
start_live_triage
Starts tailing a log in the background, raising real-time Z-score alerts on error-frequency spikes while enforcing heap memory protection limits. Notifications are dispatched directly over standard JSON-RPC channels.
{
"method": "notifications/triage",
"params": {
"type": "anomaly",
"message": "Live Anomaly Detected: 45 errors in current window (Z-score: 3.52)",
"z_score": 3.52,
"error_count": 45
}
}
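Live alerting needs a baseline that updates as windows close, without storing every past count. One common way to do that is Welford's online algorithm for mean and variance, sketched below in Python. Whether the server uses this method is not stated; the class name OnlineBaseline and the rule of requiring two prior windows before scoring are assumptions.

```python
import math


class OnlineBaseline:
    """Running mean and variance of per-window error counts (Welford's method)."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # sum of squared deviations from the running mean

    def score_then_update(self, count):
        """Score `count` against earlier windows, then fold it into the baseline."""
        z = None
        if self.n >= 2:
            std = math.sqrt(self.m2 / self.n)  # population standard deviation
            if std > 0:
                z = (count - self.mean) / std
        # Welford update: constant memory regardless of how long the tail runs.
        self.n += 1
        delta = count - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (count - self.mean)
        return z
```

Scoring before updating means a spike is measured against the history that preceded it rather than against a baseline it has already inflated.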
query_external_logs
A unified gateway for querying central log providers (Datadog, Splunk, Elasticsearch). It converts search patterns to vendor-specific dialects and maps the output into the standardized OpenTelemetry Log Data Model structure.
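The output mapping can be illustrated with a small Python sketch that converts one flat log record into OpenTelemetry Log Data Model fields. The target field names (Timestamp, SeverityText, SeverityNumber, Body, Resource, Attributes, TraceId) and the severity number ranges come from the OpenTelemetry specification; the input record shape, the function name to_otel_log, and the choice of which keys count as attributes are assumptions, not the gateway's actual mapping.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

# Lowest value of each OpenTelemetry severity range.
SEVERITY_NUMBER = {
    "trace": 1, "debug": 5, "info": 9,
    "warn": 13, "warning": 13, "error": 17, "fatal": 21,
}

# Keys of the assumed input record that map to dedicated fields.
RESERVED = {"timestamp", "level", "msg", "service", "trace_id"}


def to_otel_log(record: dict) -> dict:
    ts = datetime.fromisoformat(record["timestamp"].replace("Z", "+00:00"))
    level = str(record.get("level", "")).lower()
    out = {
        # Nanoseconds since the Unix epoch, computed with integer arithmetic.
        "Timestamp": ((ts - EPOCH) // timedelta(microseconds=1)) * 1000,
        "SeverityText": level.upper(),
        "SeverityNumber": SEVERITY_NUMBER.get(level, 0),  # 0 = unspecified
        "Body": record.get("msg", ""),
        "Resource": {"service.name": record.get("service", "unknown")},
        # Everything else is carried over as a log attribute.
        "Attributes": {k: v for k, v in record.items() if k not in RESERVED},
    }
    if "trace_id" in record:
        out["TraceId"] = record["trace_id"]
    return out
```

Normalizing every provider's results into one shape like this is what lets the agent reason over Datadog, Splunk, and Elasticsearch output with the same logic.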
{
"mcpServers": {
"log-triage": {
"command": "npx",
"args": ["-y", "ndjson-local-log-triage-mcp"]
}
}
}
"Analyze /var/log/app.log.ndjson — summarize the error timeline in 5-minute windows, detect any anomalous spikes, and show me the error entries around the spike."
Works great alongside:
MIT
io.github.infoinlet-marketplace/mcp-observability
betterdb-inc/monitor
com.mcparmory/datadog
thotischner/observability-mcp
io.github.tantiope/datadog-mcp
io.github.us-all/datadog