Gives Claude the ability to capture screenshots across Linux, macOS, and Windows using native APIs. Exposes tools to grab full screens, specific windows by ID or title pattern, and custom regions with pixel coordinates. The standout feature is OCR-based PII masking that redacts emails, phone numbers, and credit cards before sending images to the AI. Built for visual debugging workflows where you want Claude to analyze actual UI layouts, create documentation with real screenshots, or audit accessibility issues like contrast ratios and spacing. Includes security controls like path validation, rate limiting, and audit logging. Ships with multi-monitor support and configurable image formats with quality settings.
Give AI agents visual superpowers to see, analyze, and document your applications like senior UX designers.
This enterprise-grade MCP server transforms AI from code-only assistants into visual experts capable of UI analysis, accessibility auditing, documentation generation, and responsive design testing.
This package is now maintained in its own repository: https://github.com/Digital-Defiance/mcp-screenshot
This repository is part of the AI Capabilities Suite on GitHub.
AI agents today are powerful but visually blind:
Result: You're stuck manually creating screenshots, documentation, and visual analysis that AI should handle.
You: "Document the checkout process with screenshots"
AI: *Captures each step, analyzes UI flow*
AI: "Generated complete user guide with 6 annotated screenshots, highlighting key interactions and error states..."
You: "Check if this form meets WCAG guidelines"
AI: *Captures form, analyzes visual elements*
AI: "Found 4 accessibility issues: contrast ratio 2.1:1 (needs 4.5:1), missing focus indicators, touch targets too small..."
You: "The mobile navigation is broken"
AI: *Captures different viewport sizes*
AI: "Created bug report with before/after screenshots, exact breakpoint where it fails, and CSS fix suggestions..."
You: "Which hero section design works better?"
AI: *Captures both versions, analyzes visual hierarchy*
AI: "Version B has 28% better visual flow—CTA more prominent, text hierarchy clearer, better use of whitespace..."
You: "How does this look on different screen sizes?"
AI: *Captures multiple viewport sizes*
AI: "Layout breaks at 768px—sidebar overlaps content. Here's the media query fix with visual proof..."
Before: AI worked blind, relying on code descriptions
After: AI sees and analyzes your actual user interface
npm install @ai-capabilities-suite/mcp-screenshot
Linux:
- imagemagick package (provides import command)
- grim package
# Ubuntu/Debian
sudo apt-get install imagemagick grim
# Fedora
sudo dnf install ImageMagick grim
# Arch
sudo pacman -S imagemagick grim
macOS:
- screencapture command (no additional dependencies)
Windows:
Add to your MCP settings file (e.g., ~/.kiro/settings/mcp.json or .kiro/settings/mcp.json):
{
  "mcpServers": {
    "screenshot": {
      "command": "node",
      "args": ["/path/to/mcp-screenshot/dist/cli.js"],
      "env": {
        "SCREENSHOT_ALLOWED_DIRS": "/home/user/screenshots,/tmp",
        "SCREENSHOT_MAX_CAPTURES_PER_MIN": "60",
        "SCREENSHOT_ENABLE_AUDIT_LOG": "true"
      }
    }
  }
}
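The `env` values in this configuration are plain strings, so something has to turn them into typed settings. The following is a minimal sketch of that parsing, not the package's actual code: `policyFromEnv` and `PolicySettings` are hypothetical names, and the fallback values follow the defaults documented for these variables in this README.

```typescript
// Hypothetical shape mirroring the security policy fields described in this README.
interface PolicySettings {
  allowedDirectories: string[];
  maxCapturesPerMinute: number;
  enableAuditLog: boolean;
}

// Parse the SCREENSHOT_* variables. `env` would normally be process.env.
function policyFromEnv(env: Record<string, string | undefined>): PolicySettings {
  // Comma-separated list; blank entries are dropped.
  const dirs = (env["SCREENSHOT_ALLOWED_DIRS"] ?? "")
    .split(",")
    .map((d) => d.trim())
    .filter((d) => d.length > 0);

  // Fall back to the documented default of 60 when the value is missing or invalid.
  const parsedMax = Number.parseInt(env["SCREENSHOT_MAX_CAPTURES_PER_MIN"] ?? "", 10);
  const maxCaptures = Number.isFinite(parsedMax) && parsedMax > 0 ? parsedMax : 60;

  // Assumption: audit logging stays on unless the value is exactly "false".
  const auditLog = (env["SCREENSHOT_ENABLE_AUDIT_LOG"] ?? "true").toLowerCase() !== "false";

  return {
    allowedDirectories: dirs,
    maxCapturesPerMinute: maxCaptures,
    enableAuditLog: auditLog,
  };
}
```

Passing the environment in as a parameter, rather than reading `process.env` inside the function, keeps the parsing easy to test.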
Purpose-built for AI agents to capture, analyze, and work with visual information:
The server exposes 5 comprehensive MCP tools that enable AI agents to see and understand your applications:
Capture full screen or specific display.
Parameters:
- display (string, optional): Display ID to capture (defaults to primary display)
- format (string, optional): Image format - png, jpeg, webp, or bmp (default: png)
- quality (number, optional): Compression quality 1-100 for lossy formats (default: 90)
- savePath (string, optional): File path to save screenshot (returns base64 if not provided)
- enablePIIMasking (boolean, optional): Enable PII detection and masking (default: false)
Example:
{
"name": "screenshot_capture_full",
"arguments": {
"format": "png",
"savePath": "/home/user/screenshots/desktop.png",
"enablePIIMasking": true
}
}
Response:
{
"status": "success",
"filePath": "/home/user/screenshots/desktop.png",
"metadata": {
"width": 1920,
"height": 1080,
"format": "png",
"fileSize": 245678,
"timestamp": "2024-12-01T10:30:00.000Z",
"display": {
"id": "0",
"name": "Primary Display",
"resolution": { "width": 1920, "height": 1080 },
"position": { "x": 0, "y": 0 },
"isPrimary": true
},
"piiMasking": {
"emailsRedacted": 2,
"phonesRedacted": 1,
"creditCardsRedacted": 0,
"customPatternsRedacted": 0
}
}
}
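A client that consumes this response usually needs to know two things: whether the image was written to disk or returned inline, and how much was redacted. The sketch below models the documented fields as TypeScript types and builds a one-line summary. The type and function names (`CaptureSuccess`, `summarizeCapture`) are illustrative assumptions, not exports of the package.

```typescript
// Counts reported under metadata.piiMasking when masking is enabled.
interface PiiMaskingStats {
  emailsRedacted: number;
  phonesRedacted: number;
  creditCardsRedacted: number;
  customPatternsRedacted: number;
}

// Subset of the success response shown in this README.
interface CaptureSuccess {
  status: "success";
  filePath?: string; // present when savePath was supplied
  data?: string; // base64 payload when no savePath was supplied
  mimeType?: string;
  metadata: {
    width: number;
    height: number;
    format: string;
    fileSize: number;
    piiMasking?: PiiMaskingStats;
  };
}

// Sum every redaction counter; a missing block means masking was off.
function totalRedactions(stats: PiiMaskingStats | undefined): number {
  if (stats === undefined) return 0;
  return (
    stats.emailsRedacted +
    stats.phonesRedacted +
    stats.creditCardsRedacted +
    stats.customPatternsRedacted
  );
}

function summarizeCapture(result: CaptureSuccess): string {
  const where =
    result.filePath !== undefined
      ? `saved to ${result.filePath}`
      : "returned inline as base64";
  const { width, height, format } = result.metadata;
  const redacted = totalRedactions(result.metadata.piiMasking);
  return `${width}x${height} ${format} ${where}, ${redacted} PII item(s) redacted`;
}
```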
Capture specific application window by ID or title pattern.
Parameters:
- windowId (string, optional): Window identifier (use windowId or windowTitle)
- windowTitle (string, optional): Window title pattern to match (use windowId or windowTitle)
- includeFrame (boolean, optional): Include window frame and title bar (default: false)
- format (string, optional): Image format (default: png)
- quality (number, optional): Compression quality 1-100 (default: 90)
- savePath (string, optional): File path to save screenshot
Example:
{
"name": "screenshot_capture_window",
"arguments": {
"windowTitle": "Chrome",
"includeFrame": false,
"format": "jpeg",
"quality": 85
}
}
Response:
{
"status": "success",
"data": "/9j/4AAQSkZJRgABAQAA...",
"mimeType": "image/jpeg",
"metadata": {
"width": 1280,
"height": 720,
"format": "jpeg",
"fileSize": 89234,
"timestamp": "2024-12-01T10:31:00.000Z",
"window": {
"id": "12345",
"title": "Google Chrome",
"processName": "chrome",
"pid": 5678,
"bounds": { "x": 100, "y": 100, "width": 1280, "height": 720 }
}
}
}
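Because the window tool accepts either `windowId` or `windowTitle`, a caller can catch obviously incomplete requests before sending them. This is a small client-side check under stated assumptions: `checkWindowCaptureArgs` is a hypothetical helper, and the rules it applies (at least one identifier, quality between 1 and 100) come from the parameter descriptions above rather than from the server's source.

```typescript
type CaptureFormat = "png" | "jpeg" | "webp" | "bmp";

// Arguments accepted by screenshot_capture_window, as documented above.
interface WindowCaptureArgs {
  windowId?: string;
  windowTitle?: string;
  includeFrame?: boolean;
  format?: CaptureFormat;
  quality?: number;
  savePath?: string;
}

// Returns a list of problems; an empty list means the arguments look usable.
function checkWindowCaptureArgs(args: WindowCaptureArgs): string[] {
  const problems: string[] = [];

  // The tool needs some way to identify the target window.
  if (args.windowId === undefined && args.windowTitle === undefined) {
    problems.push("provide windowId or windowTitle");
  }

  // Quality is documented as 1-100.
  if (args.quality !== undefined) {
    const q = args.quality;
    if (!Number.isFinite(q) || q < 1 || q > 100) {
      problems.push("quality must be between 1 and 100");
    }
  }

  return problems;
}
```

The server remains the source of truth. A check like this only saves a round trip for mistakes that are visible locally.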
Capture specific rectangular region of the screen.
Parameters:
- x (number, required): X coordinate of top-left corner
- y (number, required): Y coordinate of top-left corner
- width (number, required): Width of region in pixels
- height (number, required): Height of region in pixels
- format (string, optional): Image format (default: png)
- quality (number, optional): Compression quality 1-100 (default: 90)
- savePath (string, optional): File path to save screenshot
Example:
{
"name": "screenshot_capture_region",
"arguments": {
"x": 100,
"y": 100,
"width": 800,
"height": 600,
"format": "png"
}
}
Response:
{
"status": "success",
"data": "iVBORw0KGgoAAAANSUhEUgAA...",
"mimeType": "image/png",
"metadata": {
"width": 800,
"height": 600,
"format": "png",
"fileSize": 123456,
"timestamp": "2024-12-01T10:32:00.000Z",
"region": {
"x": 100,
"y": 100,
"width": 800,
"height": 600
}
}
}
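The `INVALID_REGION` error is documented as requiring non-negative coordinates and positive dimensions. Those two rules are easy to check before calling the tool. The helper below is an illustrative sketch (`regionProblem` is not part of the package), and the server may apply further checks, such as display bounds, that are not shown here.

```typescript
interface CaptureRegion {
  x: number;
  y: number;
  width: number;
  height: number;
}

// Returns a description of the first problem found, or null if the region passes.
function regionProblem(region: CaptureRegion): string | null {
  const values = [region.x, region.y, region.width, region.height];
  if (!values.every((v) => Number.isFinite(v))) {
    return "coordinates and dimensions must be finite numbers";
  }
  if (region.x < 0 || region.y < 0) {
    return "coordinates must be non-negative";
  }
  if (region.width <= 0 || region.height <= 0) {
    return "dimensions must be positive";
  }
  return null;
}
```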
List all connected displays with resolution and position information.
Parameters: None
Example:
{
"name": "screenshot_list_displays",
"arguments": {}
}
Response:
{
"status": "success",
"displays": [
{
"id": "0",
"name": "Primary Display",
"resolution": { "width": 1920, "height": 1080 },
"position": { "x": 0, "y": 0 },
"isPrimary": true
},
{
"id": "1",
"name": "Secondary Display",
"resolution": { "width": 1920, "height": 1080 },
"position": { "x": 1920, "y": 0 },
"isPrimary": false
}
]
}
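The `position` and `resolution` fields together describe where each display sits on the virtual desktop, which is enough to work out which display contains a given point. The sketch below assumes that both fields use the same coordinate units. `displayAt` is a hypothetical helper, not a package export.

```typescript
// One entry of the `displays` array returned by screenshot_list_displays.
interface DisplayInfo {
  id: string;
  name: string;
  resolution: { width: number; height: number };
  position: { x: number; y: number };
  isPrimary: boolean;
}

// Find the display whose rectangle contains the point (px, py), if any.
function displayAt(displays: DisplayInfo[], px: number, py: number): DisplayInfo | undefined {
  return displays.find(
    (d) =>
      px >= d.position.x &&
      px < d.position.x + d.resolution.width &&
      py >= d.position.y &&
      py < d.position.y + d.resolution.height,
  );
}
```

With the two displays from the response above, the point (2000, 100) falls on the display with id "1", because that display starts at x = 1920.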
List all visible windows with title, process, and position information.
Parameters: None
Example:
{
"name": "screenshot_list_windows",
"arguments": {}
}
Response:
{
"status": "success",
"windows": [
{
"id": "12345",
"title": "Google Chrome",
"processName": "chrome",
"pid": 5678,
"bounds": { "x": 100, "y": 100, "width": 1280, "height": 720 },
"isMinimized": false
},
{
"id": "67890",
"title": "Terminal",
"processName": "gnome-terminal",
"pid": 9012,
"bounds": { "x": 200, "y": 200, "width": 800, "height": 600 },
"isMinimized": false
}
]
}
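A common next step after listing windows is to pick one to capture. The sketch below selects the first window whose title matches a pattern and skips minimized windows, on the assumption that a minimized window has nothing useful to capture. `findCapturableWindow` is an illustrative name, not part of the package.

```typescript
// One entry of the `windows` array returned by screenshot_list_windows.
interface WindowInfo {
  id: string;
  title: string;
  processName: string;
  pid: number;
  bounds: { x: number; y: number; width: number; height: number };
  isMinimized: boolean;
}

// Return the first non-minimized window whose title matches the pattern.
// Avoid the `g` flag on the pattern: RegExp.test keeps state between calls with it.
function findCapturableWindow(windows: WindowInfo[], titlePattern: RegExp): WindowInfo | undefined {
  return windows.find((w) => !w.isMinimized && titlePattern.test(w.title));
}
```

The returned `id` can then be passed as `windowId` to `screenshot_capture_window`.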
The server enforces security policies to control screenshot operations. Configure them via environment variables or a security policy file.
- SCREENSHOT_ALLOWED_DIRS: Comma-separated list of allowed directories for saving screenshots
- SCREENSHOT_MAX_CAPTURES_PER_MIN: Maximum captures per minute (default: 60)
- SCREENSHOT_ENABLE_AUDIT_LOG: Enable audit logging (default: true)
- SCREENSHOT_BLOCKED_WINDOWS: Comma-separated list of window title patterns to exclude
Create a security-policy.json file:
{
  "allowedDirectories": ["/home/user/screenshots", "/tmp/screenshots"],
  "blockedWindowPatterns": [
    ".*Password.*",
    ".*1Password.*",
    ".*LastPass.*",
    ".*Bitwarden.*",
    ".*Authentication.*"
  ],
  "maxCapturesPerMinute": 60,
  "enableAuditLog": true
}
Load the policy when starting the server:
import { MCPScreenshotServer } from "@ai-capabilities-suite/mcp-screenshot";
import * as fs from "fs";
const policy = JSON.parse(fs.readFileSync("security-policy.json", "utf-8"));
const server = new MCPScreenshotServer(policy);
await server.start();
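To make the effect of `blockedWindowPatterns` concrete, here is a sketch of how a title could be tested against the list. It assumes the patterns are JavaScript regular expressions matched case-sensitively. The README does not state the matching rules, so treat `isWindowBlocked` as an illustration rather than the server's implementation.

```typescript
// Returns true when any pattern matches the window title.
// Assumption: patterns are JavaScript regular expressions, case-sensitive.
function isWindowBlocked(title: string, blockedPatterns: string[]): boolean {
  return blockedPatterns.some((pattern) => new RegExp(pattern).test(title));
}

const examplePatterns = [".*Password.*", ".*LastPass.*", ".*Authentication.*"];

console.log(isWindowBlocked("1Password 8", examplePatterns)); // true: contains "Password"
console.log(isWindowBlocked("Google Chrome", examplePatterns)); // false: no pattern matches
```

Under the case-sensitive assumption, a title such as "Reset password" would not be blocked by `.*Password.*`. If that matters for your setup, include both spellings in the policy. An invalid pattern makes `new RegExp` throw, so a real loader would validate patterns once at startup.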
All tools return structured error responses with error codes and remediation suggestions.
| Code | Description | Remediation |
|---|---|---|
| PERMISSION_DENIED | Insufficient permissions to capture | Grant Screen Recording permission (macOS) or check user permissions |
| INVALID_PATH | File path outside allowed directories | Use a path within configured allowed directories |
| WINDOW_NOT_FOUND | Specified window does not exist | Use screenshot_list_windows to find available windows |
| DISPLAY_NOT_FOUND | Specified display does not exist | Use screenshot_list_displays to find available displays |
| UNSUPPORTED_FORMAT | Requested format not supported | Use png, jpeg, webp, or bmp |
| CAPTURE_FAILED | Screenshot capture failed | Check permissions and try again |
| RATE_LIMIT_EXCEEDED | Too many captures in time window | Wait before making additional requests |
| INVALID_REGION | Invalid region coordinates or dimensions | Ensure coordinates are non-negative and dimensions are positive |
| OUT_OF_MEMORY | Insufficient memory for operation | Reduce capture size or close other applications |
| ENCODING_FAILED | Image encoding failed | Try different format or reduce quality |
| FILE_SYSTEM_ERROR | File system operation failed | Check permissions and disk space |
{
"status": "error",
"error": {
"code": "WINDOW_NOT_FOUND",
"message": "Window with ID '12345' not found",
"details": {
"windowId": "12345"
},
"remediation": "Verify the window exists and is visible. Use screenshot_list_windows to see available windows."
}
}
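Since every error carries a machine-readable `code`, a client can branch on it instead of parsing the message. The sketch below types the documented error codes and maps a few of them to a follow-up action, based on the remediation column of the error table. The action names and `suggestNextAction` are hypothetical, and this is one possible policy rather than prescribed behavior.

```typescript
// Error codes listed in the error table of this README.
type ScreenshotErrorCode =
  | "PERMISSION_DENIED"
  | "INVALID_PATH"
  | "WINDOW_NOT_FOUND"
  | "DISPLAY_NOT_FOUND"
  | "UNSUPPORTED_FORMAT"
  | "CAPTURE_FAILED"
  | "RATE_LIMIT_EXCEEDED"
  | "INVALID_REGION"
  | "OUT_OF_MEMORY"
  | "ENCODING_FAILED"
  | "FILE_SYSTEM_ERROR";

interface ToolErrorResult {
  status: "error";
  error: {
    code: ScreenshotErrorCode;
    message: string;
    details?: Record<string, unknown>;
    remediation?: string;
  };
}

// Hypothetical follow-up actions a client might take.
type NextAction = "wait-and-retry" | "list-windows" | "list-displays" | "report-to-user";

function suggestNextAction(result: ToolErrorResult): NextAction {
  switch (result.error.code) {
    case "RATE_LIMIT_EXCEEDED":
      return "wait-and-retry"; // the limit is per minute, so waiting helps
    case "WINDOW_NOT_FOUND":
      return "list-windows"; // refresh IDs via screenshot_list_windows
    case "DISPLAY_NOT_FOUND":
      return "list-displays"; // refresh IDs via screenshot_list_displays
    default:
      return "report-to-user"; // surface message and remediation as-is
  }
}
```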
Problem: import: command not found or grim: command not found
Solution: Install required packages:
# X11
sudo apt-get install imagemagick
# Wayland
sudo apt-get install grim
Problem: Black screen or empty captures
Solution: Check display server environment variables:
echo $DISPLAY # Should show :0 or similar for X11
echo $WAYLAND_DISPLAY # Should show wayland-0 or similar for Wayland
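The same two variables can be checked programmatically to decide which capture tool applies. This sketch assumes that a set `WAYLAND_DISPLAY` should take precedence (both variables are often set under XWayland) and pairs each display server with the tool named in the install notes above: `grim` for Wayland and ImageMagick's `import` for X11. The function names are illustrative, and the package's own detection logic may differ.

```typescript
type DisplayServer = "wayland" | "x11" | "unknown";

// Inspect the environment; `env` would normally be process.env.
function detectDisplayServer(env: Record<string, string | undefined>): DisplayServer {
  // Assumption: prefer Wayland when both variables are present.
  if (env["WAYLAND_DISPLAY"]) return "wayland";
  if (env["DISPLAY"]) return "x11";
  return "unknown";
}

// Map the display server to the command-line tool mentioned in this README.
function captureCommandFor(server: DisplayServer): string | null {
  if (server === "wayland") return "grim";
  if (server === "x11") return "import";
  return null;
}
```

An `unknown` result usually means the process was started outside a graphical session, for example over SSH or from a service manager, which matches the black or empty capture symptom.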
Problem: PERMISSION_DENIED error
Solution: Grant Screen Recording permission:
Problem: Retina display captures are double resolution
Solution: This is expected behavior. Retina displays have 2x pixel density. Use the width and height from metadata to determine actual dimensions.
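If you need the size in logical points rather than pixels, divide the reported pixel dimensions by the display's scale factor. The sketch below assumes you know the scale factor yourself (2 for the Retina case described above), since the metadata shown in this README does not include one. `toLogicalSize` is a hypothetical helper.

```typescript
interface PixelSize {
  width: number;
  height: number;
}

// Convert pixel dimensions from capture metadata to logical (point) dimensions.
function toLogicalSize(pixels: PixelSize, scaleFactor: number): PixelSize {
  if (!(scaleFactor > 0)) {
    throw new RangeError("scaleFactor must be positive");
  }
  return {
    width: Math.round(pixels.width / scaleFactor),
    height: Math.round(pixels.height / scaleFactor),
  };
}

// A 2880x1800 pixel capture on a 2x display covers 1440x900 logical points.
const logicalExample = toLogicalSize({ width: 2880, height: 1800 }, 2);
console.log(`${logicalExample.width}x${logicalExample.height}`); // 1440x900
```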
Problem: Capture fails with access denied
Solution: Run the application with administrator privileges or check Windows Defender settings.
Problem: Multi-monitor captures show wrong display
Solution: Use screenshot_list_displays to get correct display IDs and positions.
Problem: RATE_LIMIT_EXCEEDED error
Solution: The server limits captures to prevent abuse. Wait 60 seconds or adjust maxCapturesPerMinute in the security policy.
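To see why waiting clears the error, it helps to picture a per-minute limit as a sliding window over recent capture times. The class below is a generic sliding-window limiter written for illustration. The README does not say which algorithm the server uses, so `CaptureRateLimiter` is an assumption, not the server's code.

```typescript
// Generic sliding-window limiter: at most `maxPerMinute` captures in any 60-second span.
class CaptureRateLimiter {
  private readonly maxPerMinute: number;
  private timestamps: number[] = [];

  constructor(maxPerMinute: number) {
    this.maxPerMinute = maxPerMinute;
  }

  // Returns true and records the capture if it is allowed at time `nowMs`.
  tryAcquire(nowMs: number): boolean {
    const windowStart = nowMs - 60000;
    // Forget captures that have aged out of the window.
    this.timestamps = this.timestamps.filter((t) => t > windowStart);
    if (this.timestamps.length >= this.maxPerMinute) {
      return false; // the caller would see RATE_LIMIT_EXCEEDED
    }
    this.timestamps.push(nowMs);
    return true;
  }
}
```

With this model, a rejected request becomes valid again as soon as the oldest recorded capture is more than 60 seconds old. Taking the time as a parameter keeps the class deterministic in tests, and production code would pass `Date.now()`.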
Problem: INVALID_PATH error when saving
Solution: Ensure the save path is within allowed directories configured in security policy.
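A quick way to reason about `INVALID_PATH` is to check whether the resolved save path sits inside one of the allowed directories. The sketch below does this with Node's `path` module. It is an illustration of the general technique, not the server's validator: `isPathAllowed` is a hypothetical name, and symbolic links are not resolved here.

```typescript
import * as path from "node:path";

// True when savePath resolves to a location strictly inside an allowed directory.
function isPathAllowed(savePath: string, allowedDirectories: string[]): boolean {
  const target = path.resolve(savePath);
  return allowedDirectories.some((dir) => {
    const rel = path.relative(path.resolve(dir), target);
    // Empty means the directory itself; a leading ".." segment means outside it.
    const escapes = rel === ".." || rel.startsWith(".." + path.sep);
    return rel !== "" && !escapes && !path.isAbsolute(rel);
  });
}
```

Comparing via `path.relative` instead of a plain string prefix avoids two classic mistakes: `/tmp-evil/a.png` is not treated as being inside `/tmp`, and `..` segments in the requested path are resolved before the comparison.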
Problem: PII masking not working
Solution:
- Verify that the eng.traineddata language file is available
Problem: Large file sizes
Solution:
Problem: Out of memory errors
Solution:
import { MCPScreenshotServer } from "@ai-capabilities-suite/mcp-screenshot";
// Create server with custom security policy
const server = new MCPScreenshotServer({
  allowedDirectories: ["/home/user/screenshots"],
  maxCapturesPerMinute: 30,
  enableAuditLog: true,
  blockedWindowPatterns: [".*Password.*"],
});
// Start server
await server.start();
// Server will handle MCP protocol requests via stdio
// Stop the server cleanly when the process receives Ctrl+C (SIGINT)
process.on("SIGINT", async () => {
  await server.stop();
  process.exit(0);
});
import { createCaptureEngine } from "@ai-capabilities-suite/mcp-screenshot";
// Create platform-specific capture engine
const engine = createCaptureEngine();
// Capture full screen
const fullScreen = await engine.captureScreen();
// List and capture windows
const windows = await engine.getWindows();
const chromeWindow = windows.find((w) => w.title.includes("Chrome"));
if (chromeWindow) {
  const buffer = await engine.captureWindow(chromeWindow.id, false);
}
// Capture region
const region = await engine.captureRegion(100, 100, 800, 600);
// List displays
const displays = await engine.getDisplays();
console.log(`Found ${displays.length} displays`);
This package is part of the AI Capabilities Suite monorepo.
npm run build
# Run all tests
npm test
# Run specific test suites
npm test -- capture
npm test -- security
npm test -- property
# Run with coverage
npm test -- --coverage
packages/mcp-screenshot/
├── src/
│ ├── capture/ # Platform-specific capture engines
│ ├── processing/ # Image processing and encoding
│ ├── privacy/ # PII detection and masking
│ ├── security/ # Security policy enforcement
│ ├── storage/ # File operations
│ ├── tools/ # MCP tool implementations
│ ├── interfaces/ # TypeScript interfaces
│ ├── types/ # Type definitions
│ ├── errors/ # Error classes
│ ├── server.ts # MCP server implementation
│ └── cli.ts # CLI entry point
├── README.md
├── TESTING.md
└── package.json
Contributions are welcome! Please ensure:
- All tests pass (npm test)
MIT
For issues and questions: