Wraps Ultralytics YOLOE to expose zero-shot object detection and segmentation through two MCP tools. You pass an image as a local path, URL, or Base64 string, optionally specify text prompts like "blue coffee cup next to the spoon", and get back bounding boxes or polygon masks without training a model. It uses yoloe-26l-seg.pt by default, which reaches 55.0 mAP on COCO at 6.2 ms per image on a T4 GPU. Reach for this when you need Claude or another agent to parse objects from images using arbitrary natural language queries instead of fixed class lists.
mcp-name: io.github.rjn32s/mcp-yolo
MCP-YOLO is an agent-first development platform that provides Zero-Shot Object Detection and Segmentation as a Model Context Protocol (MCP) server. Powered by Ultralytics YOLOE, it enables developers and AI agents to detect and segment objects using arbitrary text prompts without retraining.
YOLOE builds upon the latest YOLO architectures (like YOLO11 and YOLO26) to provide state-of-the-art open-vocabulary performance.
| Model | Based On | mAP (COCO) | Latency (T4, ms) | Params (M) |
|---|---|---|---|---|
| YOLOE26-N | YOLO26-N | 40.9 | 1.7 | ~3.0 |
| YOLOE26-S | YOLO26-S | 48.6 | 2.5 | ~10.0 |
| YOLOE26-L | YOLO26-L | 55.0 | 6.2 | ~40.0 |
| YOLOE-L | YOLO11-L | ~52.0 | ~5.0 | ~26.0 |
Note: Performance varies depending on the hardware and input resolution. mcp-yolo uses yoloe-26l-seg.pt by default for high precision.
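
To make the accuracy/latency trade-off in the table concrete, here is a minimal sketch that picks the highest-mAP variant fitting a latency budget. The numbers are copied from the table (the YOLOE-L row is approximate there, so it is approximate here too). `MODELS` and `pick_model` are hypothetical names used only for illustration; this README does not describe how to switch the model mcp-yolo loads, so treat this as a way to reason about the table rather than as a configuration recipe.

```python
# (name, COCO mAP, T4 latency in ms), copied from the table above.
# The YOLOE-L values are listed as approximate in the table.
MODELS = [
    ("YOLOE26-N", 40.9, 1.7),
    ("YOLOE26-S", 48.6, 2.5),
    ("YOLOE26-L", 55.0, 6.2),
    ("YOLOE-L", 52.0, 5.0),
]


def pick_model(latency_budget_ms):
    """Return the name of the highest-mAP model within the budget, or None."""
    candidates = [m for m in MODELS if m[2] <= latency_budget_ms]
    if not candidates:
        return None
    # Among the models that fit the budget, prefer the most accurate one.
    return max(candidates, key=lambda m: m[1])[0]


if __name__ == "__main__":
    for budget in (1.0, 3.0, 10.0):
        print(budget, pick_model(budget))
```

With these figures, a 3 ms budget selects YOLOE26-S, while anything at or above 6.2 ms selects YOLOE26-L, the default. Real latency depends on hardware and input resolution, as noted above.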
uv pip install mcp-yolo
uv run mcp-yolo
detect_objects: Performs zero-shot detection.
- image_source (str): Path, URL, or Base64.
- classes (list[str], optional): Custom text prompts to detect.

segment_objects: Performs zero-shot instance segmentation.
- image_source (str): Path, URL, or Base64.
- classes (list[str], optional): Custom text prompts to segment.

This project is configured for automated PyPI publishing. See the pypi_setup_guide.md for details.
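
For reference, the `detect_objects` and `segment_objects` parameters map onto the `arguments` object of a standard MCP `tools/call` request. The sketch below builds such a request with only the Python standard library. `build_tool_call` is a hypothetical helper, not part of mcp-yolo, and it assumes the server accepts a plain Base64 string (no `data:` URI prefix) as `image_source`; check the server's behavior before relying on that.

```python
import base64
import json


def build_tool_call(tool_name, image_bytes, classes=None, request_id=1):
    """Build a JSON-RPC 2.0 `tools/call` request for an MCP server.

    Assumption: image_source may be a bare Base64 string of the image bytes.
    """
    arguments = {
        "image_source": base64.b64encode(image_bytes).decode("ascii"),
    }
    # `classes` is optional; omit it when no text prompts are given.
    if classes:
        arguments["classes"] = list(classes)
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


if __name__ == "__main__":
    # Placeholder bytes; in practice, read them from a real image file.
    fake_image = b"\x89PNG\r\n\x1a\n"
    request = build_tool_call(
        "detect_objects", fake_image, classes=["blue coffee cup"]
    )
    print(json.dumps(request, indent=2))
```

In practice an MCP client library or an agent host such as Claude builds and sends this message for you; the sketch only shows which fields carry the tool name and its two parameters.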