webEmbedding

jongko54/webembedding
26 tools · STDIO, HTTP · registry active
Summary

This is a Playwright-based URL cloning engine that tries to reuse embeddable sources before rebuilding blocked pages from captured DOM, styles, assets, and HAR network traces. It exposes MCP tools for URL inspection, clone route classification, live browser capture, and visual/DOM/computed-style verification across desktop, tablet, and mobile breakpoints. The stdio server runs full capture and rebuild locally, while the hosted endpoint at webembedding-mcp.vercel.app provides read-only routing helpers for Apps SDK integrations. Reach for it when you need to recreate marketing pages, documentation sites, or iframe-blocked surfaces with self-verified fidelity scores, or when you want HAR replay and responsive breakpoint evidence instead of raw screenshots.

Install to Claude Code

claude mcp add --transport http webembedding https://webembedding-mcp.vercel.app/mcp

Run in your terminal. Add --scope user to make it available in every project.

Review the command, arguments, and environment values before installing — MCP servers run with your local permissions.


Tools

Verified live against the running server on Jun 10, 2026.

Verified live: 6 tools
detect_runtime_capabilities

Report the hosted Apps SDK intake runtime capabilities and explain when the local stdio MCP is required.

No parameters; call it with no arguments.

inspect_url (2 params)

Fetch a public or user-authorized URL and inspect title, metadata, frame policy, and likely source/embed candidates. Does not capture screenshots or persist artifacts.

Parameters (* required)
  • url* (string)
  • timeout_seconds (integer)

discover_embed_candidates (2 params)

Extract likely embed, preview, viewer, remix, and source URLs from a public or user-authorized page.

Parameters (* required)
  • url* (string)
  • timeout_seconds (integer)

classify_clone_mode (5 params)

Decide whether a reference should be embedded, sourced, locally captured, bounded-rebuilt, or blocked before reproduction.

Parameters (* required)
  • candidates (array)
  • license_text (string)
  • site_profile (object)
  • source_signals (array)
  • exact_requested (boolean)

generate_embed_snippet (3 params)

Generate an iframe snippet for a known frameable and authorized URL. Does not verify frameability by itself.

Parameters (* required)
  • url* (string)
  • title (string)
  • framework (string; one of html, nextjs)

plan_reproduction_path (6 params)

Create a source-first plan that separates exact embed/source reuse from local capture and bounded rebuild work.

Parameters (* required)
  • candidates (array)
  • license_text (string)
  • site_profile (object)
  • capture_bundle (object)
  • source_signals (array)
  • exact_requested (boolean)
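
For clients that talk to the server directly, the tools above are invoked through the standard MCP tools/call method. The sketch below only builds the JSON-RPC request bodies; it does not open a connection. The helper name build_tool_call and the timeout value of 20 are illustrative assumptions, not part of the webEmbedding package.

```python
import json
from typing import Any, Dict, Optional


def build_tool_call(name: str,
                    arguments: Optional[Dict[str, Any]] = None,
                    request_id: int = 1) -> Dict[str, Any]:
    """Build an MCP `tools/call` JSON-RPC 2.0 request body (hypothetical helper)."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments or {}},
    }


# inspect_url: `url` is required, `timeout_seconds` is optional.
# The value 20 is an arbitrary example, not a documented default.
inspect_request = build_tool_call(
    "inspect_url",
    {"url": "https://developer.mozilla.org/en-US/", "timeout_seconds": 20},
)

# detect_runtime_capabilities takes no parameters, so arguments stay empty.
capabilities_request = build_tool_call("detect_runtime_capabilities", request_id=2)

print(json.dumps(inspect_request, indent=2))
```

An MCP client library would normally build these bodies for you; the sketch is only meant to show which argument names each tool expects.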

webEmbedding

webEmbedding is a source-first website cloning engine for AI coding agents: it captures live pages with Playwright, replays network evidence from HAR artifacts, rebuilds only when direct reuse is blocked, and self-verifies the result.

It ships as a Skill + MCP server. Instead of asking a model to "clone this site" from a screenshot, it inspects the URL, chooses a reuse or rebuild route, captures DOM/runtime HTML/styles/assets/network traces, generates bounded frontend reconstruction artifacts, and checks the output with visual, DOM, computed-style, interaction, and responsive-breakpoint verification.
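
The source-first ordering described above can be sketched as a small decision function. This is a minimal illustration, not the engine's actual classifier: the Reference fields and the returned route labels are hypothetical names chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Reference:
    # All fields are hypothetical inputs for illustration only.
    access_controlled: bool = False  # e.g. paywall, captcha, private dashboard
    frameable: bool = False          # direct iframe/embed reuse is allowed
    has_source_route: bool = False   # a preview/export/remix/source route exists


def choose_route(ref: Reference) -> str:
    """Prefer reuse over rebuild; rebuild only when exact reuse is unavailable."""
    if ref.access_controlled:
        return "blocked"
    if ref.frameable:
        return "embed"
    if ref.has_source_route:
        return "source"
    return "bounded-rebuild"


print(choose_route(Reference(frameable=True)))
print(choose_route(Reference()))
```

The point of the ordering is that the cheapest faithful option wins: a page that can be embedded or sourced directly never reaches the capture-and-rebuild path.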

[Figure: webEmbedding Skill and MCP workflow]

GitHub listing, social preview, and launch-copy recommendations are in docs/github-listing.md.

Current Status

The current pipeline is strongest for static and semi-static web pages:

  • company, brand, marketing, and documentation pages
  • public landing pages
  • iframe-blocked pages that need capture-based reconstruction
  • responsive page snapshots across desktop, tablet, and mobile

It is not a full backend or app-logic clone engine. Login-only screens, app-first or native-app-required services, captcha-heavy sites, maps, games, canvas/WebGL-heavy pages, real-time feeds, payments, booking flows, and private server behavior still need separate handling.

Operationally, the repo is now a production-candidate clone engine for URL-based capture and bounded reconstruction: jobs can be queued, network evidence can be replay-audited from HAR artifacts, authenticated dashboard runs can be driven from user-owned browser state, and local gates verify the route corpus, score checks, package contents, and CI wiring. The remaining hard boundary is server-side product behavior, not front-end evidence capture and reconstruction.

Measured Checkpoints

Recent local benchmark runs from this repo:

URL | Path | Score
https://developer.mozilla.org/en-US/ | iframe-blocked bounded rebuild | root 94, visual 95, mobile 94, tablet 94, breakpoint average 94
https://www.mozilla.org/ | bounded rebuild | root 94, visual 100
https://www.python.org | harder bounded rebuild sample | root 90, visual 100
https://www.example.com | exact reuse | ready yes

These are generated by the local self-verify pipeline, not manually assigned ratings. The reproducible commands and score thresholds are tracked in docs/benchmark-evidence.json. Production readiness gates are tracked in docs/production-pipeline-gates.json.
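
As a quick sanity check on the MDN row, the reported breakpoint average is consistent with a plain mean of the mobile and tablet scores. The sketch below assumes an unweighted arithmetic mean; the pipeline's actual aggregation may differ, and the function name is hypothetical.

```python
from statistics import mean
from typing import Mapping


def breakpoint_average(scores: Mapping[str, float]) -> float:
    """Unweighted arithmetic mean of per-breakpoint scores (assumed formula)."""
    return mean(scores.values())


mdn = {"mobile": 94, "tablet": 94}  # values from the MDN row in the table above
average = breakpoint_average(mdn)
print(average)
```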

Core Features

  • Source-first routing:
    • direct iframe or embed reuse when it is safe and frameable
    • original preview, export, remix, or source routes when available
    • bounded rebuild only when exact reuse is unavailable
  • Live browser capture:
    • DOM snapshot
    • runtime HTML
    • full-page screenshot
    • computed style summaries
    • CSS analysis
    • asset inventory
    • HAR-like network metadata
    • interaction states and replay traces
    • storage state export for session-aware flows
  • Blocked-site rebuild:
    • handles X-Frame-Options and CSP-blocked pages by rebuilding from captured evidence
    • generates reusable frontend reconstruction artifacts from captured page structure
    • preserves custom tags, shadow-root host structure, and semantic document structure where captured
  • Evidence limitation reporting:
    • separates directly captured artifacts from inferred or missing evidence in reproduction results and prompts
    • marks app-gated, auth-gated, and native-app-led surfaces as bounded evidence, with recommendations for user screenshots or authenticated session capture
  • Operational failure classification:
    • reports typed pipeline action codes such as network-replay-limited, auth-session-missing, public-app-gate, and canvas-visual-fallback
    • exposes HAR/network replay_readiness before treating captured network evidence as replay-grade
  • Production pipeline helpers:
    • filesystem-backed async clone job queue with durable JSON records, worker locks, retry scheduling, cancellation, and manifest annotation
    • deterministic HAR replay engine for standard HAR, near-HAR, and captured network/manifest.json artifacts
    • authenticated dashboard live corpus runner that accepts user-provided storage_state_path or user_data_dir outside the repo
  • Self-verification:
    • screenshot similarity
    • DOM snapshot similarity
    • computed-style similarity
    • hover/focus/click interaction state parity
    • interaction trace parity
    • desktop/mobile/tablet breakpoint reports
  • Responsive benchmark support:
    • primary desktop viewport: 1440x1200
    • tablet profile: 768x1024
    • mobile profile: 390x844
  • Repair loop:
    • bounded self-repair can run when the first scaffold misses the readiness threshold
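
To make the "X-Frame-Options and CSP-blocked" case in the list above concrete, here is a minimal sketch of a frame-policy check on response headers. It is an assumption about how such a check could look, not the engine's implementation. It treats only the explicit 'none' and 'self' values of frame-ancestors as blocking; host allow-lists would need the embedding origin and are not evaluated.

```python
from typing import Mapping


def is_frame_blocked(headers: Mapping[str, str]) -> bool:
    """Return True when response headers forbid third-party framing."""
    normalized = {k.lower(): v for k, v in headers.items()}

    # X-Frame-Options: DENY and SAMEORIGIN both block third-party iframes.
    xfo = normalized.get("x-frame-options", "").strip().upper()
    if xfo in {"DENY", "SAMEORIGIN"}:
        return True

    # CSP frame-ancestors 'none' or 'self' blocks third-party iframes too.
    csp = normalized.get("content-security-policy", "")
    for directive in csp.split(";"):
        parts = directive.strip().split()
        if parts and parts[0].lower() == "frame-ancestors":
            sources = {p.lower() for p in parts[1:]}
            if sources and sources <= {"'none'", "'self'"}:
                return True
    return False


print(is_frame_blocked({"X-Frame-Options": "DENY"}))
print(is_frame_blocked({"Content-Type": "text/html"}))
```

A page that fails this check is a candidate for the bounded rebuild path rather than direct iframe reuse.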

Install

Requirements

  • Node.js 18 or newer
  • Python 3.9 or newer
  • Chrome or Chromium available locally for Playwright runtime capture

The package uses playwright-core; it does not download a browser by itself.

Installing this project adds the source-first-clone plugin bundle, the exact-clone-intake skill, and the MCP server that exposes the URL inspection, capture, rebuild, and verification tools.

Install From npm

npm install -g web-embedding
web-embedding install
web-embedding doctor

Clone a public URL after installing:

web-embedding clone \
  --url https://developer.mozilla.org/en-US/ \
  --output-dir ./.tmp/mdn-clone \
  --wait-seconds 2 \
  --timeout-seconds 35 \
  --breakpoints mobile tablet

If you already have an older local plugin installed, overwrite it with:

web-embedding install --force
web-embedding doctor

You can also run the installer without a global install:

npx web-embedding install

Use As An MCP Server

For MCP clients that can launch npm stdio servers:

{
  "mcpServers": {
    "source-first-clone": {
      "command": "npx",
      "args": ["-y", "web-embedding@latest", "mcp"]
    }
  }
}
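
If your client keeps other servers in the same JSON file, you may prefer to merge the entry rather than paste over the file. The following is a minimal sketch assuming a config file with a top-level mcpServers object; the helper name add_mcp_server and the file name mcp.json are hypothetical, and the real config location depends on your MCP client.

```python
import json
import tempfile
from pathlib import Path
from typing import List


def add_mcp_server(config_path: Path, name: str, command: str, args: List[str]) -> dict:
    """Merge one server entry into a JSON config file, keeping existing entries."""
    config = json.loads(config_path.read_text()) if config_path.exists() else {}
    config.setdefault("mcpServers", {})[name] = {"command": command, "args": args}
    config_path.write_text(json.dumps(config, indent=2) + "\n")
    return config


# Demo against a throwaway file so nothing in your real agent home is touched.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "mcp.json"  # hypothetical file name
    path.write_text(json.dumps({"mcpServers": {"other": {"command": "node", "args": []}}}))
    merged = add_mcp_server(
        path, "source-first-clone", "npx", ["-y", "web-embedding@latest", "mcp"]
    )
    print(sorted(merged["mcpServers"]))
```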

For local smoke testing:

npx web-embedding@latest mcp

The MCP Registry identity is io.github.jongko54/web-embedding; server.json and package.json#mcpName are kept in sync for registry ownership verification.

Hosted Apps SDK Intake Endpoint

The public remote MCP intake endpoint for Apps SDK Developer Mode is:

https://webembedding-mcp.vercel.app/mcp

It exposes low-risk source-first routing tools such as URL inspection, embed candidate discovery, clone-mode classification, and embed snippet generation. Full browser capture, HAR replay, queues, bounded rebuilds, and one-pass clone execution remain local-first through the stdio MCP package.

Apps SDK review pages are hosted alongside the endpoint: https://webembedding-mcp.vercel.app/privacy.html, https://webembedding-mcp.vercel.app/terms.html, and https://webembedding-mcp.vercel.app/submission.html.

Sandboxing And Approvals

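webEmbedding has two execution boundaries, the hosted intake endpoint and the local stdio MCP/CLI, plus explicit rules for authenticated capture and access-controlled surfaces: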

  • Hosted Apps SDK intake: read-only URL routing and classification only. It accepts absolute http and https URLs, does not run Playwright, does not read local files, does not use browser profiles or storage state, and does not persist capture artifacts.
  • Local stdio MCP and CLI: full capture, HAR replay, queues, rebuild scaffolds, and self-verify run on the user's machine under the user's local agent and filesystem permissions. Output is written only to caller-provided paths such as output_dir or queue_root.
  • Authenticated capture: session-aware runs require the caller to intentionally provide a storage_state_path or user_data_dir. webEmbedding does not collect credentials, perform login bypasses, or treat a public login shell as private app evidence.
  • Access-controlled surfaces: paywalls, captcha flows, private dashboards, payment/checkout/account/admin flows, and native-app-led screens should be blocked, marked needs_session, or sent to manual review unless the user has explicit authorization and supplies the needed evidence.

Local URL entrypoints reject non-HTTP schemes such as file:// so an agent cannot use clone/capture tools as a local file reader. Telemetry is disabled by default and, when enabled, excludes target URLs, local paths, captured HTML, screenshots, storage state, environment variables, API keys, and command output.
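
The scheme restriction can be illustrated with a small validator. This is a sketch of the idea using the standard library, not the package's own check; the function name require_http_url is hypothetical.

```python
from urllib.parse import urlsplit


def require_http_url(raw: str) -> str:
    """Accept only absolute http(s) URLs; reject file:// and other schemes."""
    parts = urlsplit(raw.strip())
    if parts.scheme not in {"http", "https"}:
        raise ValueError(f"unsupported URL scheme: {parts.scheme or '(none)'}")
    if not parts.netloc:
        raise ValueError("URL must be absolute and include a host")
    return parts.geturl()


for candidate in ("https://www.mozilla.org/", "file:///etc/passwd", "/relative/path"):
    try:
        print("ok:", require_http_url(candidate))
    except ValueError as error:
        print("rejected:", candidate, "->", error)
```

Validating before any browser launch or filesystem write keeps a misdirected agent from turning a capture tool into a local file reader.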

Agent Marketplaces

This repository includes marketplace metadata for the two local agent surfaces:

  • Codex: .agents/plugins/marketplace.json points to ./bundle/source-first-clone.
  • Claude Code: .claude-plugin/marketplace.json points to the same bundle and the bundle includes .claude-plugin/plugin.json.

Claude Code users can add the marketplace from GitHub with:

/plugin marketplace add jongko54/webEmbedding
/plugin install source-first-clone@webembedding

AI auto-selection expectations and golden prompts live in docs/ai-distribution.md and evals/ai-selection/webembedding-golden-prompts.json.

Install From Release

curl -fsSL https://github.com/jongko54/webEmbedding/releases/latest/download/install.sh | bash

Install From This Checkout

git clone https://github.com/jongko54/webEmbedding.git
cd webEmbedding
npm install
node ./bin/web-embedding.mjs install
node ./bin/web-embedding.mjs doctor

Install Into A Temporary Home

Useful for testing without touching your real agent home:

python3 python/web_embedding/installer.py install --target-home ./.tmp/home
python3 python/web_embedding/installer.py doctor --target-home ./.tmp/home
python3 python/web_embedding/installer.py uninstall --target-home ./.tmp/home

Opt-in Telemetry

Telemetry is disabled by default. On an interactive first install, web-embedding install asks once and defaults to No. Non-interactive installs such as CI and curl | bash do not prompt. If you opt in, web-embedding sends a small anonymous command-completion event to a JSON POST endpoint you control. It does not send target URLs, local paths, captured HTML, screenshots, storage state, environment variables, API keys, or command output.

Enable it during install:

web-embedding install --telemetry --telemetry-endpoint https://your-collector.example/events

Or manage it later:

web-embedding telemetry enable --endpoint https://your-collector.example/events
web-embedding telemetry status
web-embedding telemetry disable
web-embedding telemetry reset-id

Each event contains an anonymous install id, package version, command name, success/failure status, OS/runtime basics, and coarse option flags such as breakpoint_count or install_source.
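
One way to guarantee that shape is to build events from an allow-list, so that anything not explicitly permitted is dropped. The sketch below is an assumption about how that could be done: the key names other than breakpoint_count and install_source, the placeholder version, and the build_event helper are all hypothetical.

```python
import platform
from typing import Any, Dict

# Coarse flags named in this README; any other option key is dropped.
ALLOWED_OPTION_FLAGS = {"breakpoint_count", "install_source"}


def build_event(install_id: str, version: str, command: str, ok: bool,
                options: Dict[str, Any]) -> Dict[str, Any]:
    """Assemble a command-completion event from an allow-list of fields."""
    return {
        "install_id": install_id,  # anonymous id, not tied to a user
        "version": version,
        "command": command,
        "status": "success" if ok else "failure",
        "os": platform.system(),
        "python": platform.python_version(),
        "options": {k: v for k, v in options.items() if k in ALLOWED_OPTION_FLAGS},
    }


event = build_event(
    "00000000-0000-0000-0000-000000000000",  # placeholder install id
    "0.0.0",                                  # placeholder version
    "clone",
    True,
    {"breakpoint_count": 2, "url": "https://example.com/private"},
)
print(event["options"])
```

Because the target URL is not on the allow-list, it never reaches the event, even when a caller passes it by mistake.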

Environment controls:

WEB_EMBEDDING_TELEMETRY=1
WEB_EMBEDDING_NO_TELEMETRY=1
WEB_EMBEDDING_TELEMETRY_PROMPT=0
WEB_EMBEDDING_TELEMETRY_ENDPOINT=https://your-collector.example/events
WEB_EMBEDDING_TELEMETRY_LOG=./telemetry.jsonl

Run a local/self-hosted JSONL collector:

npm run telemetry:collector -- --host 127.0.0.1 --port 8765 --out ./telemetry.jsonl
WEB_EMBEDDING_TELEMETRY=1 \
WEB_EMBEDDING_TELEMETRY_ENDPOINT=http://127.0.0.1:8765/events \
web-embedding doctor

Summarize collected usage:

npm run telemetry:summarize -- ./telemetry.jsonl

The summary includes install and clone executions, total command executions, unique anonymous install IDs, command counts, and version counts. See docs/telemetry.md for collector and analyzer details.
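
For a rough idea of what such a summary involves, here is a minimal sketch that aggregates JSONL events with the standard library. It is not the bundled telemetry:summarize script, and the event keys (command, version, install_id) are assumed names for illustration.

```python
import json
from collections import Counter
from typing import Iterable


def summarize(lines: Iterable[str]) -> dict:
    """Aggregate one-event-per-line JSON into command, version, and install counts."""
    commands, versions, install_ids = Counter(), Counter(), set()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines in the log
        event = json.loads(line)
        commands[event.get("command", "unknown")] += 1
        versions[event.get("version", "unknown")] += 1
        if "install_id" in event:
            install_ids.add(event["install_id"])
    return {
        "install_executions": commands["install"],
        "clone_executions": commands["clone"],
        "total_commands": sum(commands.values()),
        "unique_installs": len(install_ids),
        "command_counts": dict(commands),
        "version_counts": dict(versions),
    }


sample = [
    '{"install_id": "a", "version": "0.0.0", "command": "install"}',
    '{"install_id": "a", "version": "0.0.0", "command": "clone"}',
    '{"install_id": "b", "version": "0.0.0", "command": "clone"}',
]
summary = summarize(sample)
print(summary)
```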

Quick Start

Inspect a URL and get route hints:

node ./bin/web-embedding.mjs inspect \
  --url https://developer.mozilla.org/en-US/

Run a safe preflight audit before capture or clone:

node ./bin/web-embedding.mjs audit \
  --url https://developer.mozilla.org/en-US/

The audit reports whether the reference is ready for exact/embed reuse, needs local capture, needs an authenticated session, requires manual review, or should be blocked before any browser capture or filesystem output runs.

Run the full clone workflow:

node ./bin/web-embedding.mjs clone \
  --url https://developer.mozilla.org/en-US/ \
  --output-dir ./.tmp/mdn-clone \
  --wait-seconds 2 \
  --timeout-seconds 35 \
  --breakpoints mobile tablet

Run a lightweight quality benchmark:

python3 scripts/check_clone_quality_bench.py \
  https://developer.mozilla.org/en-US/ \
  --output-root ./.tmp/clone-quality-bench \
  --wait-seconds 1 \
  --timeout-seconds 35 \
  --breakpoints mobile tablet

The benchmark prints compact rows for root, visual, and breakpoint scores. The full artifacts are written under the output directory.

CLI Commands

node ./bin/web-embedding.mjs capabilities
node ./bin/web-embedding.mjs install
node ./bin/web-embedding.mjs doctor
node ./bin/web-embedding.mjs uninstall
node ./bin/web-embedding.mjs paths
node ./bin/web-embedding.mjs telemetry status
node ./bin/web-embedding.mjs inspect --url https://www.mozilla.org/
node ./bin/web-embedding.mjs audit --url https://www.mozilla.org/
node ./bin/web-embedding.mjs capture \
  --url https://www.mozilla.org/ \
  --output-dir ./.tmp/capture-mozilla \
  --breakpoints mobile tablet
node ./bin/web-embedding.mjs reproduce \
  --url https://www.mozilla.org/ \
  --output-dir ./.tmp/reproduce-mozilla \
  --breakpoints mobile tablet
node ./bin/web-embedding.mjs clone \
  --url https://www.mozilla.org/ \
  --output-dir ./.tmp/clone-mozilla \
  --breakpoints mobile tablet
node ./bin/web-embedding.mjs verify \
  --reference-bundle ./.tmp/reference/capture.json \
  --candidate-bundle ./.tmp/candidate/capture.json
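
The verify command at the end of the list above compares a reference bundle with a candidate bundle. How the engine scores DOM similarity is not described here; as a rough illustration of the idea, the sketch below compares two flattened tag sequences with difflib and scales the result to 0-100. The function name and the sample tag lists are hypothetical, and the layout of capture.json is not assumed.

```python
from difflib import SequenceMatcher
from typing import Sequence


def tag_similarity(reference: Sequence[str], candidate: Sequence[str]) -> int:
    """Score two tag sequences from 0 to 100 using difflib's matching ratio."""
    return round(100 * SequenceMatcher(None, reference, candidate).ratio())


reference_tags = ["html", "body", "header", "main", "h1", "p", "footer"]
candidate_tags = ["html", "body", "header", "main", "h1", "footer"]

print(tag_similarity(reference_tags, reference_tags))  # identical sequences score 100
print(tag_similarity(reference_tags, candidate_tags))
```

A sequence-based score like this only captures document order; a real verifier would also weigh attributes, text, and computed styles.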

Output Artifacts

A clone run can produce:

  • capture.json
  • pipeline-run-manifest.json
  • dom/snapshot.json
  • dom/runtime.html
  • styles/computed-summary.json
  • styles/css-analysis.json
  • network/manifest.json
  • network/har.json
  • network/har-like.json
  • network/replay-report.json
  • assets/inventory.json
  • interactions/states.json
  • interactions/trace.json
  • screenshots/runtime.png
  • session/storage-state.json
  • reproduction/plan.json
  • reproduction/evidence-limitations.json
  • reproduction/rebuild-prompt.txt
  • reproduction/rebuild/starter.html
  • reproduction/rebuild/starter.css
  • reproduction/rebuild/starter.tsx
  • reproduction/rebuild/next-app/
  • reproduction/self-verify/summary.json
  • reproduction/self-verify/renderers/*/verification.json
  • reproduction/self-verify/renderers/*/visual-qa.json
  • reproduction/self-verify/renderers/*/breakpoints/*-verification.json
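
If network/har.json follows the standard HAR 1.2 layout (log.entries with request and response objects), it can be indexed for deterministic lookup with a few lines of Python. This is a sketch under that assumption, not the project's replay engine; index_har and its output keys are hypothetical.

```python
from typing import Dict, Tuple


def index_har(har: dict) -> Dict[Tuple[str, str], dict]:
    """Map (METHOD, url) to the first recorded response so lookups are repeatable."""
    index: Dict[Tuple[str, str], dict] = {}
    for entry in har.get("log", {}).get("entries", []):
        request = entry.get("request", {})
        response = entry.get("response", {})
        key = (request.get("method", "GET").upper(), request.get("url", ""))
        # setdefault keeps the first response when a URL was fetched twice.
        index.setdefault(key, {
            "status": response.get("status"),
            "mime_type": response.get("content", {}).get("mimeType"),
        })
    return index


sample_har = {"log": {"entries": [
    {"request": {"method": "get", "url": "https://example.com/"},
     "response": {"status": 200, "content": {"mimeType": "text/html"}}},
    {"request": {"method": "GET", "url": "https://example.com/"},
     "response": {"status": 500, "content": {"mimeType": "text/html"}}},
]}}
har_index = index_har(sample_har)
print(har_index[("GET", "https://example.com/")])
```

Keeping the first response per key is one simple way to make replay deterministic; other policies, such as replaying in recorded order, are equally valid.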

Quality Benchmark

Run the default small benchmark:

npm run check:clone-bench:local

Run the universal route regression corpus and expectations gate:

npm run check:benchmark-routes:local

Run a lightweight clone score gate:

npm run check:clone-score-gate:local

Validate the committed benchmark evidence manifest:

npm run check:benchmark-evidence:local

Validate production pipeline gates:

npm run check:production-readiness:local

Run the operational smokes individually:

npm run check:job-queue:local
npm run check:har-replay:local
npm run check:authenticated-corpus:local

Classify failure/action codes from a route report:

npm run classify:pipeline-failures -- --report ./.tmp/universal-route-benchmark/universal-route-report.json

Find low-scoring persisted benchmark artifacts:

npm run summarize:benchmark-scores -- --root ./.tmp --min-score 60 --max-score 70

Run specific URLs:

python3 scripts/check_clone_quality_bench.py \
  https://www.example.com \
  https://www.mozilla.org/ \
  --no-breakpoints

Run a responsive benchmark:

python3 scripts/check_clone_quality_bench.py \
  https://developer.mozilla.org/en-US/ \
  --breakpoints mobile tablet

Development Checks

python3 -m py_compile \
  bundle/source-first-clone/mcp/source_first_clone/*.py \
  scripts/check_integration_smoke.py \
  scripts/check_clone_quality_bench.py
npm run check:integration:local
git diff --check

Repo Layout

  • bundle/source-first-clone: Installed plugin bundle, MCP server, and exact-clone intake skill.
  • bundle/source-first-clone/mcp/source_first_clone: Capture, planning, rebuild, repair, and verification engine.
  • bin/web-embedding.mjs: Node CLI wrapper.
  • python/web_embedding/installer.py: Shared installer and command dispatcher.
  • scripts/check_clone_quality_bench.py: URL clone quality benchmark helper.
  • scripts/benchmark_routes.py: Universal route/capture-depth regression benchmark helper.
  • scripts/check_benchmark_report.py: Benchmark expectation validator for exact, minimum, and contains-style checks.
  • scripts/check_benchmark_evidence.py: Benchmark evidence manifest validator.
  • scripts/check_job_queue_smoke.py: Filesystem async clone job queue smoke test.
  • scripts/check_har_replay_smoke.py: Deterministic HAR replay engine smoke test.
  • scripts/benchmark_authenticated_corpus.py: User-provided authenticated dashboard corpus runner.
  • scripts/summarize_benchmark_scores.py: Utility for finding low or high scoring persisted benchmark artifacts under an output root.
  • scripts/classify_pipeline_failures.py: Operational failure/action taxonomy summarizer for reports and capture artifacts.
  • scripts/check_production_readiness.py: Production readiness gate validator for corpus, failure taxonomy, CI wiring, and policy docs.
  • scripts/check_integration_smoke.py: Release, install, and URL-only clone smoke test.
  • scripts/release_bundle.py: Release artifact builder.
  • docs/: Architecture notes and universal benchmark documentation.

Positioning

The strongest claim for this project is:

A source-first website cloning engine that combines Playwright capture, HAR replay, MCP tools, and self-verification to rebuild iframe-blocked public pages with reproducible visual, DOM, style, interaction, and responsive scores.

Avoid treating the output as a legal or ownership bypass. The engine can reconstruct public page structure, but permission, licensing, and acceptable use still matter.

License

MIT

Categories: AI & LLM Tools
Registry: active
Package: web-embedding
Transport: STDIO, HTTP
Resources: 1
Tools verified: Jun 10, 2026
Updated: May 21, 2026