Gives Claude the ability to actually browse the web through a native Chromium instance, not a headless scraper or extension. Ships with 12 tools including page description, click/fill automation, structured data extraction, multi-step pipelines, and even shell command execution. The clever part is token efficiency: around 30-80 tokens per page view instead of burning through screenshot tokens. Built as an Electron app with a stdio-to-HTTP bridge since the main process can't touch webview content directly. Includes stealth fingerprinting patches, PII redaction, credential vault integration, and four permission levels from auto-approve to blocked. Reach for this when you need Claude to log into sites, fill forms, extract tables, or run repeated workflows with DOM diffing and selector caching for speed.
Public tool metadata for what this MCP can expose to an agent.
pageDescribe the current page in Oculo browser. Default: compact (~30-80 tokens). Use detail="a11y" for ref-tagged accessibility tree, detail="markdown" for full article content.3 paramsDescribe the current page in Oculo browser. Default: compact (~30-80 tokens). Use detail="a11y" for ref-tagged accessibility tree, detail="markdown" for full article content.
scopestringdetailstringcompact · a11y · markdownscreenshotbooleanactPerform an action in Oculo browser: click, navigate, scroll, press key, hover, type, login. Elements found by ref (from a11y snapshot), text, role, label, placeholder, or CSS selector.6 paramsPerform an action in Oculo browser: click, navigate, scroll, press key, hover, type, login. Elements found by ref (from a11y snapshot), text, role, label, placeholder, or CSS selector.
keystringrefstringurlstringtextstringactionstringclick · navigate · scroll · press · hover · typeselectorstringfillFill form fields in Oculo browser by label, placeholder, or data-placeholder text. Handles text, select, checkbox, textarea, and contenteditable (DraftJS, ProseMirror).2 paramsFill form fields in Oculo browser by label, placeholder, or data-placeholder text. Handles text, select, checkbox, textarea, and contenteditable (DraftJS, ProseMirror).
fieldsobjectsubmitvaluereadExtract structured data from the page in Oculo browser (search results, tables, lists, articles).4 paramsExtract structured data from the page in Oculo browser (search results, tables, lists, articles).
whatstringlimitnumberscopestringformatstringtext · jsonrunExecute a multi-step pipeline in Oculo browser. PREFERRED for any task with 2+ actions. Each step is an object with exactly ONE key: page, act, fill, read, wait, or if.3 paramsExecute a multi-step pipeline in Oculo browser. PREFERRED for any task with 2+ actions. Each step is an object with exactly ONE key: page, act, fill, read, wait, or if.
stepsarrayreturnAllbooleandescriptionstringmediaGenerate images (Nano Banana / DALL-E 3) or videos (Veo 3.1) via Oculo browser.2 paramsGenerate images (Nano Banana / DALL-E 3) or videos (Veo 3.1) via Oculo browser.
typestringimage · videopromptstringshellExecute shell commands non-interactively via Oculo.1 paramsExecute shell commands non-interactively via Oculo.
commandstringtabsList all open browser tabs in Oculo.List all open browser tabs in Oculo.
No parameter schema in public metadata yet.
researchDeep web research across multiple tabs via Oculo browser.2 paramsDeep web research across multiple tabs via Oculo browser.
depthstringquick · medium · deepquerystringpreviewPre-fetch URL without navigating in Oculo browser.1 paramsPre-fetch URL without navigating in Oculo browser.
urlstringtranslateTranslate page or text to any language via Oculo browser.2 paramsTranslate page or text to any language via Oculo browser.
tostringtextstringlensVisual analysis via screenshot in Oculo browser.1 paramsVisual analysis via screenshot in Oculo browser.
promptstring
AI-Powered Native Browser
Website · Download · Quick Start · MCP Tools · Contributing
Cursor : VSCode :: Oculo : Chrome
Open-source AI browser that gives Claude Code, Cursor, Windsurf, and any MCP client the ability to see and interact with any website. 12 tools, under 300 tokens per flow.
| Feature | |
|---|---|
| Native browser | Full Chromium engine -- not a wrapper, extension, or headless scraper |
| 12 MCP tools | page, act, fill, read, run, media, shell, tabs, research, preview, translate, lens |
| < 300 tokens/flow | Compact responses by default -- cheaper than screenshot-based approaches |
| Self-healing automation | Selector caching + DOM diffing -- 44%+ faster on repeated workflows |
| Multi-provider AI | Built-in chat with Claude, OpenAI, Gemini, Grok, OpenClaw, Ollama |
| 4-level security | auto / notify / confirm / blocked permission gate on every action |
| OS keychain vault | Credentials encrypted via electron.safeStorage (macOS Keychain / Windows DPAPI) |
| PII redaction | Credit cards, SSNs, JWTs, API keys, Bearer tokens stripped from all MCP responses |
| Anti-injection | Content boundary markers + regex-based injection detection |
| 19 stealth patches | Navigator, WebGL, canvas, WebRTC, audio, font, battery, screen fingerprint defenses |
| Headless mode | Run without UI -- Docker support included |
| Cross-platform | macOS, Windows, Linux |
| Python SDK | pip install oculo -- sync and async clients |
Grab the latest release from Releases, or build from source:
git clone https://github.com/xidik12/oculo.git
cd oculo
npm install
npm run dev
claude mcp add oculo -- node ~/oculo/bin/oculo-mcp.mjs
Add to your MCP config (.cursor/mcp.json or equivalent):
{
"mcpServers": {
"oculo": {
"command": "node",
"args": ["/path/to/oculo/bin/oculo-mcp.mjs"]
}
}
}
Tools are always discoverable (static definitions in the bridge), but Oculo must be running for tool calls to succeed.
| Tool | What it does | Token cost |
|---|---|---|
page | Describe current page -- headings, forms, buttons, links. Supports compact, a11y (ref-tagged), and markdown modes | ~30-80 |
act | Navigate, click, hover, scroll, type, press keys, login via vault, manage tabs, cookies, proxy, recording | ~1 line |
fill | Fill form fields by label/placeholder matching, optional submit. Handles text, select, checkbox, contenteditable | ~1 line |
read | Extract structured data -- search results, tables, lists, articles | compact |
run | Multi-step pipeline with conditionals (page/act/fill/read/wait/if). Cached for replay | header + last |
media | Generate images (Nano Banana 2 / DALL-E 3) or videos (Veo 3.1). Image-to-image editing | file path |
shell | Execute shell commands non-interactively (ls, npm, git, python, etc.) | stdout+stderr |
tabs | List all open browser tabs with URLs and titles | compact |
research | Deep web research -- opens multiple tabs, reads pages, synthesizes findings | synthesized |
preview | Pre-fetch a URL without navigating away from the current page | page description |
translate | Translate page content or specific text to any language | translated text |
lens | Visual analysis of the current page via screenshot + AI vision | description |
Bonus: webmcp_list and webmcp_call discover and invoke page-declared tools via the WebMCP protocol.
You: "Log into GitHub and star the oculo repo"
Claude Code calls:
1. act({action: "navigate", url: "https://github.com/login"})
2. act({action: "login", site: "github.com"}) # vault lookup
3. act({action: "navigate", url: "https://github.com/xidik12/oculo"})
4. act({action: "click", text: "Star"})
Total: 4 tool calls, <100 tokens response
You: "Fill out the contact form on example.com"
Claude Code calls:
1. act({action: "navigate", url: "https://example.com/contact"})
2. page() # see the form
3. fill({fields: {"Name": "...", "Email": "..."}, submit: true})
Total: 3 tool calls
Run Oculo without a visible window for CI/CD, scraping, or server-side automation:
# Via convenience script
node bin/oculo-headless.mjs
# Or with flags
npx electron . --headless
npx electron . --headless --headless-auto-approve # auto-approve CONFIRM actions
# Environment variable
OCULO_HEADLESS=1 npm run dev
docker compose up
The included Dockerfile and docker-compose.yml run Oculo headless in a container with Xvfb.
from oculo import OculoClient
# Auto-discovers port from ~/.oculo-port
client = OculoClient()
# Describe the page
print(client.page())
# Navigate
client.act("navigate", url="https://example.com")
# Fill a form
client.fill({"Email": "hi@oculo.com", "Message": "Hello!"}, submit=True)
# Extract data
results = client.read("search results", format="json")
Async version available:
from oculo import AsyncOculoClient
async_client = AsyncOculoClient()
await async_client.act("navigate", url="https://example.com")
Install from the SDK directory:
pip install oculo
Claude Code / Cursor / Windsurf
|
| stdio (MCP protocol)
v
bin/oculo-mcp.mjs <-- stdio-to-HTTP bridge
|
| HTTP POST :19516/mcp (auth token)
v
McpServerManager <-- Electron main process
|
| IPC
v
Renderer (React 19) <-- Chromium process
|
| webview.executeJavaScript()
v
<webview> tags <-- Actual web pages
Why HTTP instead of stdio? Electron's <webview> is only accessible from the renderer process. The main process (where stdio lives) can't touch page content. The HTTP bridge solves this via main-to-renderer IPC.
Port discovery: Oculo writes port:authtoken to ~/.oculo-port on startup. The bridge reads this file automatically.
| Level | Actions | Behavior |
|---|---|---|
| Auto | navigate, page, read, scroll, screenshot, back, forward, reload, hover, listTabs, switchTab, preview, translate, lens | Executes silently |
| Notify | click, type, fill, select, press, submit, newTab, closeTab | Executes + OS notification |
| Confirm | payment, delete_account, change_password, send_email, download, oauth, shell, evaluate, setProxy, startRecording | Native dialog approval required |
| Blocked | read_vault, export_cookies, export_tokens, disable_security | Always rejected |
electron.safeStorage (OS Keychain on macOS, DPAPI on Windows)act({action: "login", site: "github.com"}) retrieves and fills credentials automaticallyAll MCP responses pass through a redactor before reaching the AI client. Stripped patterns: credit card numbers, SSNs, JWTs, API keys, private keys, Bearer tokens.
MCP content is wrapped in boundary markers. Regex-based detection blocks prompt injection attempts embedded in page content.
Navigator (webdriver, languages, plugins, mimeTypes, connection, hardwareConcurrency, deviceMemory), window (chrome API, dimensions), WebGL (vendor/renderer spoofing), canvas (fingerprint randomization), WebRTC (IP leak prevention), AudioContext, font enumeration blocking, Battery API, screen resolution randomization.
After successful act or fill calls, element selectors are cached with stability scores:
| Selector type | Score |
|---|---|
id | 10 |
data-testid | 10 |
aria-label | 9 |
role + name | 8 |
text | 7 |
css | 5 |
On subsequent runs, DOM diffing determines the strategy:
Built-in chat panel supports multiple providers:
| Provider | Auth | Models |
|---|---|---|
| Claude | API Key or CLI subscription | Opus, Sonnet, Haiku |
| OpenAI | API Key or Codex CLI | GPT-4o, GPT-4o mini, o1, o3 |
| Gemini | API Key | 2.0 Flash, 1.5 Pro, 1.5 Flash |
| Grok | API Key | Grok 2, Grok 2 Mini |
| Ollama | Local (no key) | Any pulled model |
| OpenClaw | API Key | OpenClaw models |
# Production build
npm run build
# Platform distributables
npm run dist:mac # macOS DMG + ZIP
npm run dist:win # Windows NSIS + portable
npm run dist:linux # Linux AppImage + deb
# Other commands
npm run typecheck # TypeScript checking
npm run lint # ESLint
npm run test # Vitest
npm run clean # Remove build artifacts
src/
main/ Electron main process
ai/agent.ts Multi-provider AI controller
captcha/ CAPTCHA detection + solvers
data/ Bookmarks, downloads, history, session recording
engine/ Page describer, extractor, form-detector, pipeline, resolver,
selector-cache, dom-differ, tab-manager
mcp/server.ts HTTP MCP server (port 19516-19520, auth token)
mcp/tools/ act, fill, page, read, run tool handlers
network/proxy.ts HTTP/SOCKS proxy manager
security/ Vault, permissions, redactor, audit, anti-injection
preload/index.ts contextBridge API
renderer/
App.tsx Root browser UI component
components/ TabBar, AddressBar, ChatPanel, WebViewContainer,
bookmarks, downloads, find, history, layout, common
shared/ Types, constants, IPC channels, AI provider definitions
bin/
oculo-mcp.mjs stdio-to-HTTP MCP bridge (for Claude Code / Cursor)
oculo-headless.mjs Headless mode launcher
sdk/python/ Python SDK (pip install oculo)
Dockerfile Container deployment
docker-compose.yml Docker Compose for headless mode
See CONTRIBUTING.md for development setup, architecture details, and how to add new MCP tools.
If Oculo saves you time, consider supporting development:
BTC: 12yRGpUfFznzZoz4yVfZKRxLSkAwbanw2B
Built by Salakhitdinov Khidayotullo | getoculo.com
therealtimex/browser-use
jae-jae/fetcher-mcp
merajmehrabi/puppeteer-mcp-server
com.thenextgennexus/playwright-mcp-server
saik0s/mcp-browser-use