Browser Tools

224 installs1.9k stars

Summary

Connects to Chrome via DevTools Protocol for programmatic browser automation. You get tools to navigate, evaluate JavaScript, take screenshots, and extract page content as markdown. The browser-pick tool is interesting for letting users click elements directly when you need selectors. The efficiency guide pushes you toward DOM inspection over screenshots and batching interactions in single eval calls, which is the right instinct. Use this when you're testing frontends, dealing with JavaScript-heavy pages, or need to preserve login state by copying the user's browser profile. It's real browser automation, not headless scraping, so you see what actually renders.

Install to Claude Code

npx -y skills add badlogic/pi-skills --skill browser-tools --agent claude-code

Installs into .claude/skills of the current project.

CodeRabbit

AI writes the code. CodeRabbit catches the slop.

Try For Free →

Keep your Mac awake

Keep your Mac awake while Claude Code and 40+ AI agents run. Sleeps when they're idle.

One time payment $9 →

Context.dev

Integrate web data into your AI product. One API to scrape website & brand data.

Get API Key Now →

Make your agent a DeFi expert

Agent, run crypto. Access onchain data & trade routes via 1inch.

Install now →

Make money from your Skills

On Capafy, your Skill runs online 24/7 as an agent product, and you get paid every time someone uses it.

Start earning →

AppSignal

Monitor with ease. Code with confidence.

Start Free Trial →

CodeRabbit

AI writes the code. CodeRabbit catches the slop.

Try For Free →

Keep your Mac awake

Keep your Mac awake while Claude Code and 40+ AI agents run. Sleeps when they're idle.

One time payment $9 →

Context.dev

Integrate web data into your AI product. One API to scrape website & brand data.

Get API Key Now →

Make your agent a DeFi expert

Agent, run crypto. Access onchain data & trade routes via 1inch.

Install now →

Make money from your Skills

On Capafy, your Skill runs online 24/7 as an agent product, and you get paid every time someone uses it.

Start earning →

AppSignal

Monitor with ease. Code with confidence.

Start Free Trial →

Files

SKILL.mdView on GitHub

Browser Tools

Chrome DevTools Protocol tools for agent-assisted web automation. These tools connect to Chrome running on :9222 with remote debugging enabled.

Setup

Run once before first use:

cd {baseDir}/browser-tools
npm install

Start Chrome

{baseDir}/browser-start.js              # Fresh profile
{baseDir}/browser-start.js --profile    # Copy user's profile (cookies, logins)

Launch Chrome with remote debugging on :9222. Use --profile to preserve user's authentication state.

Navigate

{baseDir}/browser-nav.js https://example.com
{baseDir}/browser-nav.js https://example.com --new

Navigate to URLs. Use --new flag to open in a new tab instead of reusing current tab.

Evaluate JavaScript

{baseDir}/browser-eval.js 'document.title'
{baseDir}/browser-eval.js 'document.querySelectorAll("a").length'

Execute JavaScript in the active tab. Code runs in async context. Use this to extract data, inspect page state, or perform DOM operations programmatically.

Screenshot

{baseDir}/browser-screenshot.js

Capture current viewport and return temporary file path. Use this to visually inspect page state or verify UI changes.

Pick Elements

{baseDir}/browser-pick.js "Click the submit button"

IMPORTANT: Use this tool when the user wants to select specific DOM elements on the page. This launches an interactive picker that lets the user click elements to select them. The user can select multiple elements (Cmd/Ctrl+Click) and press Enter when done. The tool returns CSS selectors for the selected elements.

Common use cases:

User says "I want to click that button" → Use this tool to let them select it
User says "extract data from these items" → Use this tool to let them select the elements
When you need specific selectors but the page structure is complex or ambiguous

Cookies

{baseDir}/browser-cookies.js

Display all cookies for the current tab including domain, path, httpOnly, and secure flags. Use this to debug authentication issues or inspect session state.

Extract Page Content

{baseDir}/browser-content.js https://example.com

Navigate to a URL and extract readable content as markdown. Uses Mozilla Readability for article extraction and Turndown for HTML-to-markdown conversion. Works on pages with JavaScript content (waits for page to load).

When to Use

Testing frontend code in a real browser
Interacting with pages that require JavaScript
When user needs to visually see or interact with a page
Debugging authentication or session issues
Scraping dynamic content that requires JS execution

Efficiency Guide

DOM Inspection Over Screenshots

Don't take screenshots to see page state. Do parse the DOM directly:

// Get page structure
document.body.innerHTML.slice(0, 5000)

// Find interactive elements
Array.from(document.querySelectorAll('button, input, [role="button"]')).map(e => ({
  id: e.id,
  text: e.textContent.trim(),
  class: e.className
}))

Complex Scripts in Single Calls

Wrap everything in an IIFE to run multi-statement code:

(function() {
  // Multiple operations
  const data = document.querySelector('#target').textContent;
  const buttons = document.querySelectorAll('button');
  
  // Interactions
  buttons[0].click();
  
  // Return results
  return JSON.stringify({ data, buttonCount: buttons.length });
})()

Batch Interactions

Don't make separate calls for each click. Do batch them:

(function() {
  const actions = ["btn1", "btn2", "btn3"];
  actions.forEach(id => document.getElementById(id).click());
  return "Done";
})()

Typing/Input Sequences

(function() {
  const text = "HELLO";
  for (const char of text) {
    document.getElementById("key-" + char).click();
  }
  document.getElementById("submit").click();
  return "Submitted: " + text;
})()

Reading App/Game State

Extract structured state in one call:

(function() {
  const state = {
    score: document.querySelector('.score')?.textContent,
    status: document.querySelector('.status')?.className,
    items: Array.from(document.querySelectorAll('.item')).map(el => ({
      text: el.textContent,
      active: el.classList.contains('active')
    }))
  };
  return JSON.stringify(state, null, 2);
})()

Waiting for Updates

If DOM updates after actions, add a small delay with bash:

sleep 0.5 && {baseDir}/browser-eval.js '...'

Investigate Before Interacting

Always start by understanding the page structure:

(function() {
  return {
    title: document.title,
    forms: document.forms.length,
    buttons: document.querySelectorAll('button').length,
    inputs: document.querySelectorAll('input').length,
    mainContent: document.body.innerHTML.slice(0, 3000)
  };
})()

Then target specific elements based on what you find.

Featured

CodeRabbit

AI writes the code. CodeRabbit catches the slop.

Try For Free →

Keep your Mac awake

Keep your Mac awake while Claude Code and 40+ AI agents run. Sleeps when they're idle.

One time payment $9 →

Context.dev

Integrate web data into your AI product. One API to scrape website & brand data.

Get API Key Now →

Make your agent a DeFi expert

Agent, run crypto. Access onchain data & trade routes via 1inch.

Install now →

Make money from your Skills

On Capafy, your Skill runs online 24/7 as an agent product, and you get paid every time someone uses it.

Start earning →

AppSignal

Monitor with ease. Code with confidence.

Start Free Trial →

Browser Tools

Chrome DevTools Protocol tools for agent-assisted web automation. These tools connect to Chrome running on :9222 with remote debugging enabled.

Setup

Run once before first use:

cd {baseDir}/browser-tools
npm install

Start Chrome

{baseDir}/browser-start.js              # Fresh profile
{baseDir}/browser-start.js --profile    # Copy user's profile (cookies, logins)

Launch Chrome with remote debugging on :9222. Use --profile to preserve user's authentication state.

Navigate

{baseDir}/browser-nav.js https://example.com
{baseDir}/browser-nav.js https://example.com --new

Navigate to URLs. Use --new flag to open in a new tab instead of reusing current tab.

Evaluate JavaScript

{baseDir}/browser-eval.js 'document.title'
{baseDir}/browser-eval.js 'document.querySelectorAll("a").length'

Execute JavaScript in the active tab. Code runs in async context. Use this to extract data, inspect page state, or perform DOM operations programmatically.

Screenshot

{baseDir}/browser-screenshot.js

Capture current viewport and return temporary file path. Use this to visually inspect page state or verify UI changes.

Pick Elements

{baseDir}/browser-pick.js "Click the submit button"

Common use cases:

User says "I want to click that button" → Use this tool to let them select it
User says "extract data from these items" → Use this tool to let them select the elements
When you need specific selectors but the page structure is complex or ambiguous

Cookies

{baseDir}/browser-cookies.js

Display all cookies for the current tab including domain, path, httpOnly, and secure flags. Use this to debug authentication issues or inspect session state.

Extract Page Content

{baseDir}/browser-content.js https://example.com

When to Use

Testing frontend code in a real browser
Interacting with pages that require JavaScript
When user needs to visually see or interact with a page
Debugging authentication or session issues
Scraping dynamic content that requires JS execution

Efficiency Guide

DOM Inspection Over Screenshots

Don't take screenshots to see page state. Do parse the DOM directly:

// Get page structure
document.body.innerHTML.slice(0, 5000)

// Find interactive elements
Array.from(document.querySelectorAll('button, input, [role="button"]')).map(e => ({
  id: e.id,
  text: e.textContent.trim(),
  class: e.className
}))

Complex Scripts in Single Calls

Wrap everything in an IIFE to run multi-statement code:

(function() {
  // Multiple operations
  const data = document.querySelector('#target').textContent;
  const buttons = document.querySelectorAll('button');
  
  // Interactions
  buttons[0].click();
  
  // Return results
  return JSON.stringify({ data, buttonCount: buttons.length });
})()

Batch Interactions

Don't make separate calls for each click. Do batch them:

(function() {
  const actions = ["btn1", "btn2", "btn3"];
  actions.forEach(id => document.getElementById(id).click());
  return "Done";
})()

Typing/Input Sequences

(function() {
  const text = "HELLO";
  for (const char of text) {
    document.getElementById("key-" + char).click();
  }
  document.getElementById("submit").click();
  return "Submitted: " + text;
})()

Reading App/Game State

Extract structured state in one call:

(function() {
  const state = {
    score: document.querySelector('.score')?.textContent,
    status: document.querySelector('.status')?.className,
    items: Array.from(document.querySelectorAll('.item')).map(el => ({
      text: el.textContent,
      active: el.classList.contains('active')
    }))
  };
  return JSON.stringify(state, null, 2);
})()

Waiting for Updates

If DOM updates after actions, add a small delay with bash:

sleep 0.5 && {baseDir}/browser-eval.js '...'

Investigate Before Interacting

Always start by understanding the page structure:

(function() {
  return {
    title: document.title,
    forms: document.forms.length,
    buttons: document.querySelectorAll('button').length,
    inputs: document.querySelectorAll('input').length,
    mainContent: document.body.innerHTML.slice(0, 3000)
  };
})()

Then target specific elements based on what you find.

Browser Tools

Install to Claude Code

Browser Tools

Setup

Start Chrome

Navigate

Evaluate JavaScript

Screenshot

Pick Elements

Cookies

Extract Page Content

When to Use

Efficiency Guide

DOM Inspection Over Screenshots

Complex Scripts in Single Calls

Batch Interactions

Typing/Input Sequences

Reading App/Game State

Waiting for Updates

Investigate Before Interacting

Browser Tools

Install to Claude Code

Browser Tools

Setup

Start Chrome

Navigate

Evaluate JavaScript

Screenshot

Pick Elements

Cookies

Extract Page Content

When to Use

Efficiency Guide

DOM Inspection Over Screenshots

Complex Scripts in Single Calls

Batch Interactions

Typing/Input Sequences

Reading App/Game State

Waiting for Updates

Investigate Before Interacting

Recommended

Recommended