
ClicheFactory Document Intelligence

clichefactory/clichefactory-mcp
auth | STDIO | registry active
Summary

Connects ClicheFactory's document extraction API to Claude and other MCP clients through three tools: extract, to_markdown, and doctor. You give it a file (PDF, DOCX, XLSX, images, EML) and a JSON schema, and it returns structured data. It runs in two modes: local mode uses your own LLM key with OCR on your machine, while service mode calls the ClicheFactory API and supports trained pipelines and robust verification. The to_markdown tool is handy for previewing documents before deciding what to extract. Reach for this when you need to pull invoice line items, form fields, or tabular data from documents without writing parsers.


clichefactory-mcp

MCP (Model Context Protocol) server for ClicheFactory — structured data extraction from documents.

This server exposes ClicheFactory's extraction and document conversion capabilities as MCP tools, allowing AI assistants in Cursor, Claude Desktop, OpenClaw, and other MCP-compatible clients to extract structured data from PDFs, images, DOCX, XLSX, CSV, EML, and more.

Quick start (recommended — service mode)

Service mode uses the ClicheFactory cloud for the best extraction quality. You only need one API key.

  1. Sign up at clichefactory.com — free pages included, no credit card required.

  2. Create an API key in Settings → API Keys (format: cliche-...).

  3. Install the MCP server:

    pip install clichefactory-mcp
    
  4. Configure — either paste the key into your MCP client (see below) or run once in a terminal:

    pip install clichefactory   # if you don't have the CLI yet
    clichefactory configure
    

    The interactive wizard saves credentials to ~/.clichefactory/config.toml, which the MCP server reads automatically.

That's it — one env var (CLICHEFACTORY_API_KEY) or a config file, and you're on hosted extraction.

Tools

| Tool | Description |
| --- | --- |
| extract | Extract structured JSON from a document using a schema |
| to_markdown | Convert a document to markdown text |
| doctor | Check configuration, dependencies, and system binaries |

extract

The main tool. Pass a document file and a JSON schema — get structured data back.

Supports all extraction modes:

| Mode | Description | Requires |
| --- | --- | --- |
| (default) | OCR + LLM extraction | Service API key (recommended) |
| fast | Fastest pipeline | Service API key |
| trained | Trained pipeline artifact | Service + artifact_id |
| robust | Two-stage extract + verify | Service only |
| robust-trained | Trained extract + verification | Service + artifact_id |
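
The requirements in the table can be checked before a tool call is made. The following is a minimal sketch, not part of clichefactory-mcp: the `mode_problems` helper, the `"default"` label for "no mode given", and the reading that the default mode does not strictly need a service key are all assumptions made for illustration.

```python
from typing import Optional

# (needs_service, needs_artifact_id) per mode, as read from the table above.
# "default" is a label used only in this sketch for "no mode given".
MODE_REQUIREMENTS = {
    "default": (False, False),  # service key is recommended, not required
    "fast": (True, False),
    "trained": (True, True),
    "robust": (True, False),
    "robust-trained": (True, True),
}


def mode_problems(
    mode: str, has_service_key: bool, artifact_id: Optional[str] = None
) -> list[str]:
    """Return human-readable problems; an empty list means the call looks fine."""
    if mode not in MODE_REQUIREMENTS:
        return [f"unknown mode: {mode}"]
    needs_service, needs_artifact = MODE_REQUIREMENTS[mode]
    problems = []
    if needs_service and not has_service_key:
        problems.append(f"{mode} needs a ClicheFactory API key (service mode)")
    if needs_artifact and not artifact_id:
        problems.append(f"{mode} needs an artifact_id")
    return problems
```

For example, `mode_problems("trained", has_service_key=True)` reports the missing artifact_id instead of letting the request fail later.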

The schema can be provided as:

  • File path: absolute path to a .json schema file
  • Inline dict: the LLM constructs a JSON schema from the conversation (e.g., the user says "extract the invoice number and total" and the LLM builds {"type": "object", "properties": {"invoice_number": {"type": "string"}, "total": {"type": "number"}}})
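
To make the two schema forms concrete, here is a small sketch using only the standard library. It builds the inline schema from the example above and adds a hypothetical `mismatched_fields` helper that checks a returned record against the declared top-level types. The helper is an illustration and not something the server provides; it also ignores nested schemas and `required`.

```python
import json

# The inline schema from the example above.
invoice_schema = {
    "type": "object",
    "properties": {
        "invoice_number": {"type": "string"},
        "total": {"type": "number"},
    },
}

# Python types accepted for each JSON Schema primitive type.
_JSON_TYPES = {
    "string": str,
    "number": (int, float),
    "integer": int,
    "boolean": bool,
    "object": dict,
    "array": list,
}


def mismatched_fields(record: dict, schema: dict) -> list[str]:
    """Return names of present fields whose value does not match the declared type."""
    bad = []
    for name, spec in schema.get("properties", {}).items():
        if name not in record:
            continue  # missing fields are not checked in this sketch
        declared = spec.get("type")
        expected = _JSON_TYPES.get(declared)
        if expected is None:
            continue  # unknown or absent type: nothing to compare against
        value = record[name]
        # bool is a subclass of int in Python, so reject it for numeric types.
        if isinstance(value, bool) and declared in ("number", "integer"):
            bad.append(name)
        elif not isinstance(value, expected):
            bad.append(name)
    return bad
```

For the file-path form, the same dict can be written out with `json.dump(invoice_schema, open("invoice.schema.json", "w"))` and the absolute path of that file passed instead.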

to_markdown

Converts any supported document to markdown. Useful for inspecting document contents or feeding them to the LLM for analysis before deciding on an extraction schema.

doctor

Runs diagnostics on the ClicheFactory setup — config file, API keys, Python dependencies, system binaries. Call this when things aren't working.

Execution Modes

The server defaults to service mode (ClicheFactory cloud). Local mode is available for BYOK / air-gapped use.

  • service (recommended) — Uses the ClicheFactory cloud service. Requires a ClicheFactory API key. Supports all extraction modes including trained pipelines and robust verification. Best extraction quality out of the box.

  • local (advanced) — Runs extraction on your machine. You bring your own LLM key (BYOK). Requires pip install "clichefactory-mcp[local]" (~2 GB of parsing/OCR dependencies) plus system binaries (tesseract, LibreOffice). Quality depends on your local setup.

Installation

Prerequisites

  • Python ≥ 3.12
  • uv (recommended) or pip

From PyPI

pip install clichefactory-mcp

For local-mode extraction (BYOK, runs on your machine), install with the local extras:

pip install "clichefactory-mcp[local]"

Configuration

Environment Variables

Set these in your MCP client configuration (see below) or in ~/.clichefactory/config.toml via clichefactory configure.

| Variable | Required | Description |
| --- | --- | --- |
| CLICHEFACTORY_API_KEY | Yes (service mode) | ClicheFactory API key from Settings → API Keys (cliche-...) |
| CLICHEFACTORY_API_URL | No | Override the default service URL (https://api.clichefactory.com); useful for local development against a self-hosted ClicheFactory backend |
| LLM_MODEL_NAME | Local mode only | Model name, e.g. gemini/gemini-3-flash-preview |
| LLM_API_KEY | Local mode only | API key for the LLM provider |
| OCR_MODEL_NAME | No | Separate OCR/VLM model (defaults to main model) |
| OCR_API_KEY | No | API key for OCR model (defaults to main key) |

Environment variables take precedence over the config file at ~/.clichefactory/config.toml.

Cursor

Add to .cursor/mcp.json in your project (or global Cursor settings):

{
  "mcpServers": {
    "clichefactory": {
      "command": "uvx",
      "args": ["clichefactory-mcp"],
      "env": {
        "CLICHEFACTORY_API_KEY": "cliche-your-key-here"
      }
    }
  }
}

For local development from a git checkout, replace uvx with:

"command": "uv",
"args": ["--directory", "/absolute/path/to/cliche-mcp", "run", "clichefactory-mcp"]

Claude Desktop

Add to ~/Library/Application Support/Claude/claude_desktop_config.json (macOS) or %APPDATA%\Claude\claude_desktop_config.json (Windows):

{
  "mcpServers": {
    "clichefactory": {
      "command": "uvx",
      "args": ["clichefactory-mcp"],
      "env": {
        "CLICHEFACTORY_API_KEY": "cliche-your-key-here"
      }
    }
  }
}
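
The Cursor and Claude Desktop files share the same mcpServers shape, so the entry can also be added programmatically without overwriting servers that are already configured. The helper below is a hypothetical convenience script, not something shipped with clichefactory-mcp, and it assumes the existing file (if any) contains valid JSON.

```python
import json
from pathlib import Path


def add_clichefactory_server(config_path: Path, api_key: str) -> dict:
    """Insert or update the clichefactory entry and keep all other servers."""
    config = {}
    if config_path.is_file():
        config = json.loads(config_path.read_text(encoding="utf-8"))
    servers = config.setdefault("mcpServers", {})
    servers["clichefactory"] = {
        "command": "uvx",
        "args": ["clichefactory-mcp"],
        "env": {"CLICHEFACTORY_API_KEY": api_key},
    }
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

A call such as `add_clichefactory_server(Path(".cursor/mcp.json"), "cliche-your-key-here")` would produce the same entry as the JSON shown above.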

OpenClaw

Register the MCP server with your OpenClaw agent:

openclaw mcp set clichefactory '{"command":"uvx","args":["clichefactory-mcp"],"env":{"CLICHEFACTORY_API_KEY":"cliche-your-key-here"}}'

Verify with openclaw mcp list. The agent can now use the extract, to_markdown, and doctor tools in any conversation.

An OpenClaw skill with agent instructions is also available in integrations/openclaw/. To install it into your workspace:

cp -r /path/to/cliche-mcp/integrations/openclaw ~/.openclaw/skills/clichefactory

Or, once published to ClawHub:

openclaw skills install clichefactory

Local mode (advanced)

If you prefer BYOK extraction on your machine, install the local extras and set LLM credentials:

{
  "mcpServers": {
    "clichefactory": {
      "command": "uvx",
      "args": ["clichefactory-mcp"],
      "env": {
        "LLM_MODEL_NAME": "gemini/gemini-3-flash-preview",
        "LLM_API_KEY": "your-gemini-api-key"
      }
    }
  }
}

Pass mode="local" explicitly in tool calls, or run clichefactory configure --local to set local as the default in ~/.clichefactory/config.toml.

Supported File Types

PDF, PNG, JPG, JPEG, WebP, GIF, BMP, DOCX, DOC, ODT, XLSX, CSV, EML, TXT, MD.
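
A client-side pre-check against this list can save a round trip for files the server will not accept. This is a minimal sketch that matches on the file extension only; whether the server itself decides by extension or by content is not stated here, so treat `is_supported` as a hypothetical helper.

```python
from pathlib import Path

# Extensions taken from the supported file types listed above.
SUPPORTED_EXTENSIONS = frozenset({
    ".pdf", ".png", ".jpg", ".jpeg", ".webp", ".gif", ".bmp",
    ".docx", ".doc", ".odt", ".xlsx", ".csv", ".eml", ".txt", ".md",
})


def is_supported(path: str) -> bool:
    """Return True if the file extension (case-insensitive) is in the list."""
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS
```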

Differences from the CLI

This MCP server covers the core extraction and conversion workflows. The following CLI features are not included in v1:

| Feature | Reason |
| --- | --- |
| Batch operations (extract-batch, to-markdown-batch) | MCP tools are typically called one at a time by the LLM. For multiple documents, the LLM calls extract in sequence. Batch support may be added in a future version. |
| configure | Interactive prompts don't work in MCP. Use env vars or run clichefactory configure in a terminal. |
| --output / -o flag | MCP tools return results directly to the LLM rather than writing to files. |
| allow_partial | Not exposed as a tool parameter in v1. |
| OCR engine selection | Uses the SDK defaults (RapidOCR). Configure via ~/.clichefactory/config.toml or pass parsing options through the SDK if needed. |

Development

# Install in development mode
uv sync

# Run the server directly (stdio transport, for testing with MCP clients)
uv run clichefactory-mcp

# Inspect available tools (requires mcp CLI)
uv run mcp dev cliche_mcp/server.py

License

MIT — Copyright (c) 2026 Urban Susnik s.p.


Configuration

CLICHEFACTORY_API_KEY (secret)

ClicheFactory API key for service mode.

CLICHEFACTORY_API_URL

Optional ClicheFactory API base URL override.

LLM_MODEL_NAME

Model name for local mode, for example gemini/gemini-3-flash-preview.

LLM_API_KEY (secret)

LLM provider API key for local mode.

OCR_MODEL_NAME

Optional OCR/VLM model override.

OCR_API_KEY (secret)

Optional OCR/VLM provider API key.

Categories: Search & Web Crawling, Data & Analytics
Registry: active
Package: clichefactory-mcp
Transport: STDIO
Auth: Required
Updated: May 17, 2026