Scrapeninja

131 installs, 63 stars

Summary

ScrapeNinja gives Claude two endpoints for web scraping: a fast non-JS mode with Chrome TLS fingerprinting, and a full browser mode for JavaScript-heavy sites. The smart retry system lets you specify status codes or text patterns that should trigger another attempt, which helps with flaky targets. It includes geo-based proxies (us, eu, br, fr, de, and 4g-eu), plus a Cheerio extractor for pulling structured data without writing separate parsing code. AJAX interception lets you capture the API responses that populate a page. Start with the fast endpoint and move to JS rendering only if you need it, since it is slower and burns more credits.

Install to Claude Code

npx -y skills add vm0-ai/vm0-skills --skill scrapeninja --agent claude-code

Installs into .claude/skills of the current project.

Files

SKILL.md

Troubleshooting

If requests fail, run one of the following:

zero doctor check-connector --env-name SCRAPENINJA_TOKEN

zero doctor check-connector --url https://scrapeninja.p.rapidapi.com/scrape --method POST

How to Use

1. Basic Scrape (Non-JS, Fast)

High-performance scraping with Chrome TLS fingerprint, no JavaScript:

Write to /tmp/scrapeninja_request.json:

{
  "url": "https://example.com"
}

Then run:

curl -s -X POST "https://scrapeninja.p.rapidapi.com/scrape" --header "Content-Type: application/json" --header "X-RapidAPI-Key: $SCRAPENINJA_TOKEN" -d @/tmp/scrapeninja_request.json | jq '{status: .info.statusCode, url: .info.finalUrl, bodyLength: (.body | length)}'

With custom headers and retries:

Write to /tmp/scrapeninja_request.json:

{
  "url": "https://example.com",
  "headers": ["Accept-Language: en-US"],
  "retryNum": 3,
  "timeout": 15
}

Then run:

curl -s -X POST "https://scrapeninja.p.rapidapi.com/scrape" --header "Content-Type: application/json" --header "X-RapidAPI-Key: $SCRAPENINJA_TOKEN" -d @/tmp/scrapeninja_request.json
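Every call in this document repeats the same curl invocation with a different endpoint path. As a minimal sketch, the wrapper below factors that out and fails early when the token or the request file is missing. The function name `scrapeninja_post` and its exit codes are hypothetical, not part of ScrapeNinja or the skill.

```shell
# Hypothetical helper: POST a JSON request file to a ScrapeNinja endpoint path.
#   $1 = endpoint path, e.g. /scrape or /scrape-js
#   $2 = request file, e.g. /tmp/scrapeninja_request.json
# Returns 2 if SCRAPENINJA_TOKEN is unset or empty, 3 if the file is unreadable.
scrapeninja_post() {
  if [ -z "${SCRAPENINJA_TOKEN:-}" ]; then
    echo "SCRAPENINJA_TOKEN is not set" >&2
    return 2
  fi
  if [ ! -r "$2" ]; then
    echo "cannot read request file: $2" >&2
    return 3
  fi

  # Same request as the commands above, with the path and file parameterized.
  curl -s -X POST "https://scrapeninja.p.rapidapi.com$1" \
    --header "Content-Type: application/json" \
    --header "X-RapidAPI-Key: $SCRAPENINJA_TOKEN" \
    -d @"$2"
}
```

With the function defined, the basic scrape becomes `scrapeninja_post /scrape /tmp/scrapeninja_request.json | jq .info`.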

2. Scrape with JavaScript Rendering

For JavaScript-heavy sites (React, Vue, etc.):

Write to /tmp/scrapeninja_request.json:

{
  "url": "https://example.com",
  "waitForSelector": "h1",
  "timeout": 20
}

Then run:

curl -s -X POST "https://scrapeninja.p.rapidapi.com/scrape-js" --header "Content-Type: application/json" --header "X-RapidAPI-Key: $SCRAPENINJA_TOKEN" -d @/tmp/scrapeninja_request.json | jq '{status: .info.statusCode, bodyLength: (.body | length)}'

With screenshot:

Write to /tmp/scrapeninja_request.json:

{
  "url": "https://example.com",
  "screenshot": true
}

Then run:

# Print the screenshot value (info.screenshot) from the response
curl -s -X POST "https://scrapeninja.p.rapidapi.com/scrape-js" --header "Content-Type: application/json" --header "X-RapidAPI-Key: $SCRAPENINJA_TOKEN" -d @/tmp/scrapeninja_request.json | jq -r '.info.screenshot'

3. Geo-Based Proxy Selection

Use proxies from specific regions:

Write to /tmp/scrapeninja_request.json:

{
  "url": "https://example.com",
  "geo": "eu"
}

Then run:

curl -s -X POST "https://scrapeninja.p.rapidapi.com/scrape" --header "Content-Type: application/json" --header "X-RapidAPI-Key: $SCRAPENINJA_TOKEN" -d @/tmp/scrapeninja_request.json | jq .info

Available geos: us, eu, br (Brazil), fr (France), de (Germany), 4g-eu
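A script that takes the geo from user input could check it against the list above before spending a request. This is a small sketch; the function name `is_valid_geo` is hypothetical, and the accepted values are only those listed in this document.

```shell
# Return 0 if the argument is one of the geo codes listed above, 1 otherwise.
is_valid_geo() {
  case "$1" in
    us|eu|br|fr|de|4g-eu) return 0 ;;
    *) return 1 ;;
  esac
}
```

For example, `is_valid_geo "$GEO" || GEO=us` falls back to the documented default when the value is not recognized.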

4. Smart Retries

Retry on specific HTTP status codes or text patterns:

Write to /tmp/scrapeninja_request.json:

{
  "url": "https://example.com",
  "retryNum": 3,
  "statusNotExpected": [403, 429, 503],
  "textNotExpected": ["captcha", "Access Denied"]
}

Then run:

curl -s -X POST "https://scrapeninja.p.rapidapi.com/scrape" --header "Content-Type: application/json" --header "X-RapidAPI-Key: $SCRAPENINJA_TOKEN" -d @/tmp/scrapeninja_request.json
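To make the retry rule concrete, the sketch below applies the same two conditions locally to a status code and a body. It is an illustration of the request above, not ScrapeNinja's implementation: the function name `is_unexpected` is hypothetical, and case-sensitive substring matching is an assumption.

```shell
# Illustrative only: would this response count as "unexpected" under the
# request above?
#   $1 = HTTP status code, $2 = response body text
# Assumes case-sensitive substring matching for the text patterns.
is_unexpected() {
  # Mirrors statusNotExpected: [403, 429, 503]
  case "$1" in
    403|429|503) return 0 ;;
  esac
  # Mirrors textNotExpected: ["captcha", "Access Denied"]
  case "$2" in
    *captcha*|*"Access Denied"*) return 0 ;;
  esac
  return 1
}
```

A caller might run the same check on the final response to detect that every attempt came back blocked.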

5. Extract Data with Cheerio

Extract structured JSON using Cheerio extractor functions:

Write to /tmp/scrapeninja_request.json:

{
  "url": "https://news.ycombinator.com",
  "extractor": "function(input, cheerio) { let $ = cheerio.load(input); return $(\".titleline > a\").slice(0,5).map((i,el) => ({title: $(el).text(), url: $(el).attr(\"href\")})).get(); }"
}

Then run:

curl -s -X POST "https://scrapeninja.p.rapidapi.com/scrape" --header "Content-Type: application/json" --header "X-RapidAPI-Key: $SCRAPENINJA_TOKEN" -d @/tmp/scrapeninja_request.json | jq '.extractor'

6. Intercept AJAX Requests

Capture XHR/fetch responses:

Write to /tmp/scrapeninja_request.json:

{
  "url": "https://example.com",
  "catchAjaxHeadersUrlMask": "api/data"
}

Then run:

curl -s -X POST "https://scrapeninja.p.rapidapi.com/scrape-js" --header "Content-Type: application/json" --header "X-RapidAPI-Key: $SCRAPENINJA_TOKEN" -d @/tmp/scrapeninja_request.json | jq '.info.catchedAjax'

7. Block Resources for Speed

Speed up JS rendering by blocking images and media (CSS and fonts):

Write to /tmp/scrapeninja_request.json:

{
  "url": "https://example.com",
  "blockImages": true,
  "blockMedia": true
}

Then run:

curl -s -X POST "https://scrapeninja.p.rapidapi.com/scrape-js" --header "Content-Type: application/json" --header "X-RapidAPI-Key: $SCRAPENINJA_TOKEN" -d @/tmp/scrapeninja_request.json
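The blocking flags can be combined with `screenshot: false` in a single request body. The sketch below generates that body for a given URL. The function name `js_fast_request_body` is hypothetical, and building JSON with printf is only safe under the stated assumption that the URL contains no double quotes or backslashes, so the function rejects such input.

```shell
# Hypothetical helper: print a /scrape-js request body that blocks images and
# media and skips the screenshot.
#   $1 = URL to scrape
# Returns 1 for URLs that would need JSON escaping (double quote or backslash).
js_fast_request_body() {
  case "$1" in
    *'"'*|*\\*)
      echo "URL needs JSON escaping, refusing: $1" >&2
      return 1
      ;;
  esac
  printf '{"url": "%s", "blockImages": true, "blockMedia": true, "screenshot": false}\n' "$1"
}
```

Usage: `js_fast_request_body "https://example.com" > /tmp/scrapeninja_request.json`, then run the `/scrape-js` command above.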

API Endpoints

Endpoint	Description
`/scrape`	Fast non-JS scraping with Chrome TLS fingerprint
`/scrape-js`	Full Chrome browser with JS rendering
`/v2/scrape-js`	Enhanced JS rendering for protected sites (APIRoad only)

Request Parameters

Common Parameters (all endpoints)

Parameter	Type	Default	Description
`url`	string	required	URL to scrape
`headers`	string[]	-	Custom HTTP headers
`retryNum`	int	1	Number of retry attempts
`geo`	string	`us`	Proxy geo: us, eu, br, fr, de, 4g-eu
`proxy`	string	-	Custom proxy URL (overrides geo)
`timeout`	int	10 (`/scrape`), 16 (`/scrape-js`)	Timeout per attempt in seconds
`textNotExpected`	string[]	-	Text patterns that trigger retry
`statusNotExpected`	int[]	[403, 502]	HTTP status codes that trigger retry
`extractor`	string	-	Cheerio extractor function

JS Rendering Parameters (`/scrape-js`, `/v2/scrape-js`)

Parameter	Type	Default	Description
`waitForSelector`	string	-	CSS selector to wait for
`postWaitTime`	int	-	Extra wait time after load (1-12s)
`screenshot`	bool	true	Take page screenshot
`blockImages`	bool	false	Block image loading
`blockMedia`	bool	false	Block CSS/fonts loading
`catchAjaxHeadersUrlMask`	string	-	URL pattern to intercept AJAX
`viewport`	object	1920x1080	Custom viewport size
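The table gives `postWaitTime` a range of 1 to 12 seconds. A script that accepts this value from a caller could validate it before sending, as in this sketch. The function name `is_valid_post_wait` is hypothetical, and it assumes the range is inclusive at both ends.

```shell
# Return 0 if $1 is a whole number of seconds between 1 and 12 (inclusive),
# 1 otherwise.
is_valid_post_wait() {
  case "$1" in
    ''|*[!0-9]*) return 1 ;;  # empty or contains a non-digit
  esac
  [ "$1" -ge 1 ] && [ "$1" -le 12 ]
}
```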

Response Format

{
  "info": {
    "statusCode": 200,
    "finalUrl": "https://example.com",
    "headers": ["content-type: text/html"],
    "screenshot": "base64-encoded-png",
    "catchedAjax": {
      "url": "https://example.com/api/data",
      "method": "GET",
      "body": "...",
      "status": 200
    }
  },
  "body": "<html>...</html>",
  "extractor": { "extracted": "data" }
}

Guidelines

Start with /scrape: Use the fast non-JS endpoint first, and switch to /scrape-js only if the page needs JavaScript rendering
Retries: Set retryNum to 2-3 for unreliable sites
Geo Selection: Use eu for European sites and us for American sites
Extractors: Test extractors at https://scrapeninja.net/cheerio-sandbox/
Blocked Sites: For sites protected by Cloudflare or Datadome, use /v2/scrape-js via APIRoad
Screenshots: Set screenshot: false to speed up JS rendering
Rate Limits: Check your plan limits on the RapidAPI or APIRoad dashboard

Tools

Playground: https://scrapeninja.net/scraper-sandbox
Cheerio Sandbox: https://scrapeninja.net/cheerio-sandbox
cURL Converter: https://scrapeninja.net/curl-to-scraper

First Seen: Jun 3, 2026

