ThumbGate

2368 tools · STDIO · registry active

Summary

Turns thumbs-down feedback into executable prevention rules that block AI agents from repeating mistakes before the tool call executes. The PreToolUse hook intercepts risky actions such as force-pushes, destructive file operations, or API calls that match patterns learned from prior sessions. One thumbs-down creates a rule; in the next session the agent is blocked before the call runs, with zero tokens spent. Ships with a context brain generator that consolidates lessons, guardrails, and gates into a single agent-readable markdown file your repo can version in git. Free tier gives you unlimited feedback capture and five active rules. Works with Claude Code, Cursor, Gemini CLI, Cline, and any MCP-compatible agent. Designed for workflows where one repeated mistake is a liability event or a token bill you don't want to pay twice.
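The blocking step described above can be pictured with a minimal Python sketch. This is not ThumbGate's implementation: the `Rule` shape, the regex-based matching, and the `check_tool_call` function are all hypothetical names chosen for illustration, assuming a rule is a pattern tied to one tool name.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """Hypothetical prevention rule: a regex tied to one tool name."""
    rule_id: str
    tool_name: str
    pattern: str
    reason: str


def check_tool_call(rules, tool_name, command):
    """Return a block decision if any rule matches, otherwise allow."""
    for rule in rules:
        if rule.tool_name == tool_name and re.search(rule.pattern, command):
            return {"decision": "block", "rule": rule.rule_id, "reason": rule.reason}
    return {"decision": "allow", "rule": None, "reason": ""}


# Illustrative rule, as if learned from one thumbs-down on a force-push.
RULES = [
    Rule(
        rule_id="no-force-push",
        tool_name="Bash",
        pattern=r"git\s+push\b.*(--force|-f)\b",
        reason="Force-push was thumbed down in a prior session",
    )
]

blocked = check_tool_call(RULES, "Bash", "git push --force origin main")
allowed = check_tool_call(RULES, "Bash", "git push origin main")
```

Because the check is a local pattern match that runs before the tool call, a blocked action never reaches the model or the shell, which is the sense in which no tokens are spent.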

Tools

Public tool metadata for what this MCP can expose to an agent.

68 tools

capture_feedback (13 params)

Capture an up/down signal plus one line of why. Vague feedback is logged, then returned with a clarification prompt instead of memory promotion.

Parameters (* required)

tags (array)

skill (string)

signal (string): one of up · down

context (string): One-sentence reason describing what worked or failed

guardrails (object)

whatWorked (string)

chatHistory (array): Optional caller-supplied recent conversation window used for history-aware lesson distillation. The current Claude auto-capture path sends up to 8 prior recorded entries for vague negative inline signals.

failureType (string): Dual-signal: "decision" = wrong tool/action chosen, "execution" = right tool but bad parameters/output. Improves Thompson Sampling precision. One of decision · execution

rubricScores (array)

whatToChange (string)

whatWentWrong (string)

relatedFeedbackId (string): Optional prior feedback event to merge with later follow-up context.

conversationWindow (array): Recent conversation turns before the feedback signal. Raw messages, not summaries.
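As a rough illustration of how an MCP client might invoke this tool, the sketch below builds a JSON-RPC `tools/call` request. The envelope (`jsonrpc`, `method`, `params.name`, `params.arguments`) follows the general MCP convention; the argument names come from the parameter list above, while the argument values and the `build_tool_call` helper are made up for the example.

```python
import json


def build_tool_call(request_id, tool_name, arguments):
    """Wrap tool arguments in an MCP-style JSON-RPC tools/call request."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Example values are invented; only the keys come from the public metadata.
request = build_tool_call(
    1,
    "capture_feedback",
    {
        "signal": "down",
        "context": "Agent force-pushed to main without checking the open PR",
        "tags": ["git", "force-push"],
        "failureType": "decision",
        "whatWentWrong": "Rewrote shared history",
        "whatToChange": "Open a PR instead of pushing to main",
    },
)

# Over a stdio transport the request would be sent as one JSON message.
wire_message = json.dumps(request)
```

A specific `context` sentence matters here: per the description above, vague feedback is logged but returned with a clarification prompt rather than promoted to memory.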

feedback_summary (1 param)

Get summary of recent feedback

Parameters (* required)

recent (number)

search_lessons (4 params)

Search promoted lessons and show the corrective actions, lifecycle state, prevention rules, gates, and next harness fixes linked to each result.

Parameters (* required)

tags (array): Require all tags to be present on a lesson

limit (number): Maximum results to return (default 10)

query (string): Search query. Leave empty to list the most recent lessons.

category (string): one of error · learning · preference

retrieve_lessons (3 params)

Retrieve the most relevant lessons for a given tool/action context. Use in PreToolUse hooks for per-action guidance.

Parameters (* required)

toolName (string): The tool being called (e.g., Bash, Edit, Read)

maxResults (number): Max lessons to return (default 5)

actionContext (string): Description of what the tool call is doing

search_thumbgate (4 params)

Search raw ThumbGate state across feedback logs, ContextFS memory, prevention rules, and imported policy documents.

Parameters (* required)

limit (number): Maximum results to return (default 10)

query (string): Search query for ThumbGate state.

signal (string): Optional feedback-signal filter when searching feedback data. One of up · down · positive · negative

source (string): Restrict search to a single ThumbGate source. One of all · feedback · context · rules · documents

import_document (7 params)

Import a local policy or runbook document into ThumbGate, normalize it for search, and propose provenance-backed gate candidates.

Parameters (* required)

tags (array): Optional tags such as policy, runbook, or team.

title (string): Optional display title override.

content (string): Inline document content for hosted or generated imports.

filePath (string): Local file path inside the active workspace or ThumbGate runtime.

sourceUrl (string): Optional external URL or provenance label for the imported document.

proposeGates (boolean): When true (default), derive reviewable gate proposals from the document.

sourceFormat (string): Optional source format override when importing inline content. One of markdown · text · yaml · json · html

list_imported_documents (3 params)

List imported policy and runbook documents stored in local ThumbGate state.

Parameters (* required)

tag (string): Optional tag or matched template id filter.

limit (number): Maximum documents to return (default 20).

query (string): Optional title or excerpt filter.

get_imported_document (1 param)

Read a previously imported document with its proposed gate candidates and provenance.

Parameters (* required)

documentId (string): Imported document id.

feedback_stats

Get feedback stats and recommendations

No parameter schema in public metadata yet.

diagnose_failure (13 params)

Diagnose a failed or suspect workflow step using MCP schema, workflow, gate, and approval constraints.

Parameters (* required)

step (string)

error (string)

output (string)

context (string)

approved (boolean)

exitCode (number)

intentId (string)

toolArgs (object)

toolName (string)

guardrails (object)

mcpProfile (string)

rubricScores (array)

verification (object)

infer_lesson_from_history (2 params)

Perform autonomous inference on chat history to identify why a failure occurred and what rule should be recorded.

Parameters (* required)

lastAction (object)

chatHistory (array)

list_intents (3 params)

List available intent plans and whether each requires human approval in the active profile

Parameters (* required)

bundleId (string)

mcpProfile (string)

partnerProfile (string)

plan_intent (8 params)

Generate an intent execution plan with policy checkpoints

Parameters (* required)

context (string)

approved (boolean)

bundleId (string)

intentId (string)

repoPath (string)

mcpProfile (string)

delegationMode (string): one of off · auto · sequential

partnerProfile (string)

start_handoff (9 params)

Start a sequential delegation handoff from a delegation-eligible intent plan

Parameters (* required)

context (string)

approved (boolean)

bundleId (string)

intentId (string)

repoPath (string)

mcpProfile (string)

plannedChecks (array)

partnerProfile (string)

delegateProfile (string)

complete_handoff (8 params)

Complete a sequential delegation handoff and record verification outcomes

Parameters (* required)

outcome (string): one of accepted · rejected · aborted

summary (string)

attempts (number)

handoffId (string)

latencyMs (number)

resultContext (string)

tokenEstimate (number)

violationCount (number)

describe_reliability_entity (1 param)

Get the definition and state of a business entity (Customer, Revenue, Funnel). Aliased to describe_semantic_entity.

Parameters (* required)

type (string): one of Customer · Revenue · Funnel

get_reliability_rules

Retrieve active prevention rules and success patterns. Aliased to prevention_rules.

No parameter schema in public metadata yet.

enforcement_matrix

Show the full Enforcement Matrix: feedback pipeline stats, active pre-action gates, and rejection ledger with revival conditions.

No parameter schema in public metadata yet.

security_scan (3 params)

Scan code for OWASP vulnerabilities (injection, XSS, path traversal, SSRF, prototype pollution) and supply chain risks (typosquatting, install script abuse, wildcard versions). Returns findings with severity, category, and line numbers.

Parameters (* required)

content (string): Code content to scan

diffMode (boolean): When true, treats content as git diff output

filePath (string): File path for language-aware scanning

capture_memory_feedback (3 params)

Capture success/failure feedback to harden future workflows. Aliased to capture_feedback.

Parameters (* required)

tags (array)

signal (string): one of up · down

context (string)

bootstrap_internal_agent (15 params)

Normalize a GitHub/Slack/Linear trigger into startup context, construct a recall pack, prepare a git worktree sandbox, and emit an execution plus reviewer-lane plan.

Parameters (* required)

task (object)

source (string): one of github · slack · linear · api · cli

thread (object)

context (string)

trigger (object)

approved (boolean)

comments (array)

intentId (string)

messages (array)

repoPath (string)

mcpProfile (string)

sandboxRoot (string)

delegationMode (string): one of off · auto · sequential

partnerProfile (string)

prepareSandbox (boolean)

prevention_rules (2 params)

Generate prevention rules from repeated mistake patterns

Parameters (* required)

outputPath (string)

minOccurrences (number)

export_dpo_pairs (1 param)

Export DPO preference pairs from local memory log

Parameters (* required)

memoryLogPath (string)

export_hf_dataset (2 params)

Export ThumbGate agent traces and DPO preference pairs as a HuggingFace-compatible dataset. Produces traces.jsonl, preferences.jsonl, and dataset_info.json with PII-redacted paths. Ready for huggingface-cli upload.

Parameters (* required)

outputDir (string): Output directory (default: feedback-dir/hf-dataset)

includeProvenance (boolean): Include provenance events in traces (default: true)

export_databricks_bundle (1 param)

Export ThumbGate logs and proof artifacts as a Databricks-ready analytics bundle

Parameters (* required)

outputPath (string)

construct_context_pack (4 params)

Construct a bounded context pack from contextfs

Parameters (* required)

query (string)

maxChars (number)

maxItems (number)

namespaces (array)

evaluate_context_pack (6 params)

Record evaluation outcome for a context pack

Parameters (* required)

notes (string)

packId (string)

signal (string)

outcome (string)

guardrails (object)

rubricScores (array)

context_provenance (1 param)

Get recent context/provenance events

Parameters (* required)

limit (number)

generate_skill (2 params)

Auto-generate Claude skills from repeated feedback patterns. Clusters failure patterns by tags and produces SKILL.md files with DO/INSTEAD rules.

Parameters (* required)

tags (array): Filter to specific tags

minOccurrences (number): Minimum pattern occurrences to trigger skill generation (default 3)

recall (3 params)

Recall relevant past feedback, memories, and prevention rules for the current task. Call this at the start of any task to inject past learnings into the conversation.

Parameters (* required)

limit (number): Max memories to return (default 5)

query (string): Describe the current task or context to find relevant past feedback

repoPath (string): Optional repository path for structural impact analysis on coding tasks

unified_context (5 params)

Assemble a complete, role-aware context object in one call. Combines session state, user profile, relevant lessons, prevention guards, context pack, and code-graph impact, with tiered graceful degradation (full → warm → cold). Replaces multiple recall/retrieve/session_primer...

Parameters (* required)

query (string): Describe the current task to find relevant context

repoPath (string): Repository path for code-graph impact analysis

toolName (string): Current tool being invoked (improves lesson matching)

agentType (string): Agent type; shapes context budget and feature inclusion. One of claude · cursor · forgecode · codex

toolInput (object): Current tool input (for guard evaluation)

satisfy_gate (3 params)

Satisfy a gate condition with optional structured reasoning. Evidence is stored with a 5-minute TTL. When structuredReasoning is provided, the premise/evidence/conclusion chain is stored in the audit trail.

Parameters (* required)

gate (string): Gate condition ID to satisfy (e.g., pr_threads_checked)

evidence (string): Evidence text (e.g., "0 unresolved threads")

structuredReasoning (object): Structured pre-gate reasoning: state premises, trace evidence, assess risk, derive conclusion before unlocking.
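The 5-minute evidence TTL can be sketched as a small in-memory store. This is an assumption-laden illustration rather than ThumbGate's storage layer: the `GateEvidenceStore` class and its method names are hypothetical, and the clock is injectable only so that expiry can be exercised without waiting.

```python
import time

EVIDENCE_TTL_SECONDS = 5 * 60  # the 5-minute TTL stated in the tool description


class GateEvidenceStore:
    """Hypothetical in-memory store: evidence unlocks a gate until it expires."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # gate id -> (evidence text, expiry timestamp)

    def satisfy(self, gate, evidence):
        self._entries[gate] = (evidence, self._clock() + EVIDENCE_TTL_SECONDS)

    def is_satisfied(self, gate):
        entry = self._entries.get(gate)
        if entry is None:
            return False
        _, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[gate]  # stale evidence no longer unlocks the gate
            return False
        return True


# Fake clock so the expiry path can be shown deterministically.
now = [0.0]
store = GateEvidenceStore(clock=lambda: now[0])
store.satisfy("pr_threads_checked", "0 unresolved threads")
fresh = store.is_satisfied("pr_threads_checked")
now[0] = 300.0
expired = store.is_satisfied("pr_threads_checked")
```

A short TTL like this means evidence such as "0 unresolved threads" has to be re-established shortly before the gated action, rather than being reused from much earlier in the session.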

set_task_scope (7 params)

Declare or clear the current task scope so ThumbGate can compare affected files and diffs against the approved path set.

Parameters (* required)

clear (boolean): Clear the current task scope instead of setting one

taskId (string): Optional stable task identifier (ticket, issue, or work item id)

summary (string): Short summary of the task being worked

repoPath (string): Optional repo root used when evaluating git diff scope

localOnly (boolean): When true, also marks the task as local-only

allowedPaths (array): Glob patterns that define the allowed file scope for this task

protectedPaths (array): Optional protected-file globs that require explicit approval before editing or publishing
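One way to picture the scope comparison is to classify each changed path against the allowed and protected globs. The sketch below uses Python's `fnmatch.fnmatchcase`, whose `*` also matches `/`; ThumbGate's actual glob semantics are not documented here and may differ, and the `classify_path` helper is hypothetical.

```python
from fnmatch import fnmatchcase


def classify_path(path, allowed_paths, protected_paths):
    """Classify a changed file as protected, in_scope, or out_of_scope."""
    # Protected globs are checked first so they win over broad allowed globs.
    if any(fnmatchcase(path, pattern) for pattern in protected_paths):
        return "protected"
    if any(fnmatchcase(path, pattern) for pattern in allowed_paths):
        return "in_scope"
    return "out_of_scope"


# Illustrative scope for a task limited to src/ and markdown docs.
ALLOWED = ["src/*", "docs/*.md"]
PROTECTED = ["package.json", ".github/*"]

results = {
    path: classify_path(path, ALLOWED, PROTECTED)
    for path in ["src/app/main.py", "package.json", "infra/deploy.sh"]
}
```

Checking protected globs before allowed ones is a design choice in this sketch: it keeps a broad pattern such as `*` in `allowedPaths` from silently covering files that should need explicit approval.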

get_scope_state

Return the active task scope and any unexpired protected-file approvals.

No parameter schema in public metadata yet.

set_branch_governance (11 params)

Declare or clear branch and release governance so PR, merge, release, and publish actions can be evaluated against explicit workflow state.

Parameters (* required)

clear (boolean): Clear the current branch governance state instead of setting it

prUrl (string): Optional pull request URL once a PR exists

prNumber (string): Optional pull request number once a PR exists

localOnly (boolean): When true, PR, merge, release, and publish actions are blocked for this lane

baseBranch (string): Protected base branch for merge and release operations (defaults to main)

branchName (string): Optional branch name the governance applies to

prRequired (boolean): Whether this lane must go through a pull request (defaults to true)

queueRequired (boolean): Whether the target branch requires a merge queue

releaseVersion (string): Expected package version for release or publish actions

releaseEvidence (string): Optional evidence or release plan note for the governed version

releaseSensitiveGlobs (array): Optional custom globs that define release-sensitive files for this branch lane

get_branch_governance

Return the active branch and release governance state.

No parameter schema in public metadata yet.

approve_protected_action (5 params)

Grant a time-limited approval for edits or publish actions that touch protected files.

Parameters (* required)

ttlMs (number): Optional approval lifetime in milliseconds (defaults to 1 hour, max 24 hours)

reason (string): Why this protected-file action is approved

taskId (string): Optional task id this approval is tied to

evidence (string): Optional supporting evidence or approval note

pathGlobs (array): Protected-file globs covered by this approval

track_action (2 params)

Record a verification action in the current session (for example figma_verified or tests_passed). Session actions expire after one hour.

Parameters (* required)

actionId (string): Verification action ID to record

metadata (object): Optional structured metadata describing the evidence source

verify_claim (1 param)

Check whether a claim has enough tracked evidence before the agent asserts it.

Parameters (* required)

claim (string): The claim text to verify

check_operational_integrity (5 params)

Evaluate whether the current repo state is safe for PR, merge, release, and publish operations.

Parameters (* required)

command (string): Optional git, PR, or publish command to evaluate against the current governance state

repoPath (string): Optional repository path to inspect

baseBranch (string): Protected base branch to compare against (defaults to main)

requireVersionNotBehindBase (boolean): When true, release-sensitive changes cannot lag behind the base branch package version

requirePrForReleaseSensitive (boolean): When true, release-sensitive changes on non-base branches require an open PR

workflow_sentinel (8 params)

Predict pre-action workflow risk, blast radius, and remediations before a tool call executes.

Parameters (* required)

command (string): Optional shell command when toolName is Bash

filePath (string): Optional primary file path for edit-like tools

repoPath (string): Optional repository path used for git-aware integrity checks

toolName (string): Tool being assessed, such as Bash, Edit, or Write

baseBranch (string): Optional protected base branch override (defaults to main)

changedFiles (array): Optional affected-file list used to estimate blast radius

requireVersionNotBehindBase (boolean): When true, release-sensitive changes cannot lag behind the base branch package version

requirePrForReleaseSensitive (boolean): When true, release-sensitive changes on non-base branches require an open PR

register_claim_gate (3 params)

Register a custom claim verification rule in local runtime state without editing tracked repo config.

Parameters (* required)

message (string): Custom message returned when evidence is missing

claimPattern (string): Regex pattern that should trigger claim verification

requiredActions (array): Tracked actions that must be present before the claim is verified
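The interplay between register_claim_gate, track_action, and verify_claim can be sketched as follows. The `ClaimVerifier` class is hypothetical and only mirrors the documented parameters (`claimPattern`, `requiredActions`, `message`) plus the one-hour session-action expiry mentioned under track_action; the real matching and storage logic are not described in the public metadata.

```python
import re
import time

SESSION_ACTION_TTL_SECONDS = 60 * 60  # session actions expire after one hour


class ClaimVerifier:
    """Hypothetical model of claim gates backed by tracked session actions."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._actions = {}  # action id -> time it was recorded
        self._gates = []    # (compiled pattern, required action ids, message)

    def register_claim_gate(self, claim_pattern, required_actions, message):
        compiled = re.compile(claim_pattern, re.IGNORECASE)
        self._gates.append((compiled, list(required_actions), message))

    def track_action(self, action_id):
        self._actions[action_id] = self._clock()

    def verify_claim(self, claim):
        now = self._clock()
        live = {
            action_id
            for action_id, recorded_at in self._actions.items()
            if now - recorded_at < SESSION_ACTION_TTL_SECONDS
        }
        for pattern, required, message in self._gates:
            if pattern.search(claim):
                missing = [a for a in required if a not in live]
                if missing:
                    return {"verified": False, "missing": missing, "message": message}
        return {"verified": True, "missing": [], "message": ""}


now = [0.0]  # fake clock for a deterministic walk-through
verifier = ClaimVerifier(clock=lambda: now[0])
verifier.register_claim_gate(r"tests?\s+pass", ["tests_passed"], "Run the test suite first")

before = verifier.verify_claim("All tests pass")
verifier.track_action("tests_passed")
after = verifier.verify_claim("All tests pass")
now[0] = 3600.0
stale = verifier.verify_claim("All tests pass")
```

In this model a claim that matches no registered pattern is treated as verified, which is an assumption of the sketch rather than documented behavior.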

gate_stats

Get gate enforcement statistics: blocked count, warned count, top gates

No parameter schema in public metadata yet.

dashboard

Get full ThumbGate dashboard: Harness Score, gate stats, prevention impact, proof, and system health

No parameter schema in public metadata yet.

org_dashboard (1 param)

Org-wide multi-agent dashboard: shows all active agents, gate decisions, adherence rates, risk agents, and top blocked gates across the organization. Team rollout: full visibility. Free preview: limited to 3 agents.

Parameters (* required)

windowHours (number): Lookback window in hours (default 24)

settings_status

Resolve managed, user, project, and local ThumbGate settings with per-field origin metadata for policy visibility.

No parameter schema in public metadata yet.

commerce_recall (3 params)

Recall past feedback filtered by commerce categories (product_recommendation, brand_compliance, sizing, pricing, regulatory). Returns quality scores alongside memories for agentic commerce agents.

Parameters (* required)

limit (number): Max memories to return (default 5)

query (string): Product or brand context to find relevant past feedback

categories (array): Commerce categories to filter (default: all commerce categories)

get_business_metrics (1 param)

Retrieve high-level business metrics (Revenue, Conversion, Customers) from the Semantic Layer.

Parameters (* required)

window (string): Analytics window (today, 7d, 30d, all)

describe_semantic_entity (1 param)

Get the canonical definition and state of a business entity (Customer, Revenue, Funnel).

Parameters (* required)

type (string): one of Customer · Revenue · Funnel

estimate_uncertainty (1 param)

Estimate Bayesian uncertainty for a set of tags based on past feedback.

Parameters (* required)

tags (array): Tags to analyze for uncertainty
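The public metadata does not say which Bayesian model is used. As one plausible reading, the sketch below treats each up/down signal for a tag as a Bernoulli outcome and summarizes it with a Beta posterior; the uniform Beta(1, 1) prior and the `beta_posterior` helper are assumptions made for illustration.

```python
def beta_posterior(ups, downs, prior_alpha=1.0, prior_beta=1.0):
    """Return (mean, variance) of a Beta posterior over the success rate."""
    alpha = prior_alpha + ups
    beta = prior_beta + downs
    total = alpha + beta
    mean = alpha / total
    variance = (alpha * beta) / (total ** 2 * (total + 1))
    return mean, variance


# No feedback yet: the posterior is the uniform prior, so uncertainty is highest.
prior_mean, prior_variance = beta_posterior(0, 0)

# Eight thumbs-up and two thumbs-down for a tag narrow the estimate.
tag_mean, tag_variance = beta_posterior(8, 2)
```

Under this model, a tag with little feedback has high variance, which is the kind of signal a caller could use to decide whether more evidence is needed before trusting a lesson.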

session_handoff (6 params)

Write a session handoff primer that auto-captures git state (branch, last 5 commits, modified files), last completed task, next step, and blockers. The next session reads this automatically for seamless context continuity.

Parameters (* required)

project (string): Project name (auto-detected from cwd if omitted)

blockers (array): Open blockers or unresolved issues

lastTask (string): What was completed this session

nextStep (string): Exact next action for the next session

openFiles (array): Key files being worked on

customContext (string): Any additional context for the next session

session_primer

Read the most recent session handoff primer to restore context from the previous session. Call at session start.

No parameter schema in public metadata yet.

list_harnesses (1 param)

List natural-language harness specs for portable workflow control, proof-backed verification, and GTM execution.

Parameters (* required)

tag (string): Optional tag filter such as verification, acquisition, or workflow.

run_harness (3 params)

Execute a natural-language harness through the async job runner with checkpoints, verification, and proof-backed outcomes.

Parameters (* required)

jobId (string): Optional stable job id for the resulting runtime.

inputs (object): Optional input overrides for template variables.

harness (string): Harness id or file basename to execute.

schedule (6 params)

Create, list, or delete scheduled tasks. Supports natural language scheduling like "daily 9:00", "weekly monday 8:30", "hourly". Installs as macOS LaunchAgent or Linux crontab.

Parameters (* required)

name (string): Schedule name/ID

action (string): Schedule action. One of create · list · delete

command (string): Node.js code to execute on schedule

schedule (string): Schedule spec: "daily 9:00", "weekly monday 8:30", "hourly"

description (string): What this schedule does

workingDirectory (string): Working directory for the command
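For the Linux crontab case, the three spec forms quoted above map naturally onto five-field cron expressions. The sketch below is a hypothetical translator written for illustration (`to_cron` and `_parse_time` are not ThumbGate functions), and it handles only those three forms.

```python
import re

# Cron day-of-week numbers as used by standard crontab (0 = Sunday).
DAYS = {
    "sunday": 0, "monday": 1, "tuesday": 2, "wednesday": 3,
    "thursday": 4, "friday": 5, "saturday": 6,
}


def _parse_time(text):
    """Parse 'H:MM' or 'HH:MM' into (hour, minute)."""
    match = re.fullmatch(r"(\d{1,2}):(\d{2})", text)
    if match is None:
        raise ValueError(f"expected H:MM, got {text!r}")
    hour, minute = int(match.group(1)), int(match.group(2))
    if hour > 23 or minute > 59:
        raise ValueError(f"time out of range: {text!r}")
    return hour, minute


def to_cron(spec):
    """Translate 'hourly', 'daily H:MM', or 'weekly <day> H:MM' to cron."""
    parts = spec.strip().lower().split()
    if parts == ["hourly"]:
        return "0 * * * *"
    if len(parts) == 2 and parts[0] == "daily":
        hour, minute = _parse_time(parts[1])
        return f"{minute} {hour} * * *"
    if len(parts) == 3 and parts[0] == "weekly" and parts[1] in DAYS:
        hour, minute = _parse_time(parts[2])
        return f"{minute} {hour} * * {DAYS[parts[1]]}"
    raise ValueError(f"unsupported schedule spec: {spec!r}")


daily = to_cron("daily 9:00")
weekly = to_cron("weekly monday 8:30")
hourly = to_cron("hourly")
```

Rejecting anything outside the three known forms keeps a mistyped spec from being installed as an unintended schedule.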

user_profile (3 params)

Manage persistent user profile: preferences, style, domain knowledge that persists across sessions. Actions: add, remove, replace, view.

Parameters (* required)

action (string): Profile action. One of add · remove · replace · view

content (string): Content to add or new content for replace

old_text (string): Substring to match for remove/replace

session_search (2 params)

Search past session notes and conversations using full-text search. Returns relevant sessions from the SQLite FTS5 index for cross-session recall.

Parameters (* required)

limit (number): Max results to return (default 10)

query (string): Search query to find relevant past sessions

open_feedback_session (3 params)

Open a feedback session after thumbs up/down. Follow-up messages will be captured for 60s.

Parameters (* required)

signal (string): one of up · down

initialContext (string)

feedbackEventId (string): The feedback event ID from capture_feedback

append_feedback_context (3 params)

Append a follow-up message to an open feedback session. Call this when the user types additional context after giving thumbs up/down.

Parameters (* required)

role (string): one of user · assistant (default: user)

message (string): The follow-up message from the user

sessionId (string)

finalize_feedback_session (1 param)

Finalize a feedback session and re-infer the lesson with all follow-up context.

Parameters (* required)

sessionId (string)

webhook_deliver (4 params)

Send a message to Teams, Slack, or Discord via webhook. Use for status reports, alerts, and notifications.

Parameters (* required)

title (string): Message title

message (string): Message body (markdown supported)

platform (string): Target platform. One of teams · slack · discord

webhook_url (string): Webhook URL for the target channel

reflect_on_feedback (4 params)

Run a post-mortem analysis on negative feedback. Returns a proposed rule and recurrence info.

Parameters (* required)

context (string): One-line context from the caller

whatWentWrong (string): What the caller said went wrong

feedbackEventId (string): ID of a previously captured feedback event

conversationWindow (array): Last 5-10 conversation turns before the feedback signal.

report_product_issue (3 params)

Report a bug, suggestion, or complaint about ThumbGate itself (not project feedback). Auto-files a GitHub issue with system context. Use when the user expresses frustration or requests a feature for the thumbgate tool.

Parameters (* required)

body (string): Description of the problem or suggestion, in the user's own words

title (string): Short issue title (e.g. "Gate blocks valid migration")

category (string): Issue category. One of bug · feature · question

run_managed_lesson_agent (3 params)

Run the LLM-powered lesson inference and rule generation agent over accumulated feedback. Requires ANTHROPIC_API_KEY for LLM mode; falls back to heuristics if unavailable.

Parameters (* required)

limit (number): Max feedback entries to process (default: 20)

model (string): Override the Claude model (default: claude-haiku-4-5)

dryRun (boolean): Preview what would be written without persisting

managed_agent_status

Show status of the last managed lesson agent run: entries processed, lessons created, gates promoted, and total runs.

No parameter schema in public metadata yet.

run_self_distill (3 params)

Run the self-distillation agent to auto-evaluate recent agent sessions and generate improvement lessons without human feedback. Reads conversation logs, detects success/failure signals, and persists lessons.

Parameters (* required)

limit (number): Max conversation logs to process (default 20)

model (string): LLM model to use for analysis (requires ANTHROPIC_API_KEY)

dryRun (boolean): If true, analyzes but does not persist lessons

self_distill_status

Show status of the last self-distillation run: sessions analyzed, lessons generated, signals detected.

No parameter schema in public metadata yet.

context_stuff_lessons (3 params)

Dump ALL prevention lessons into a single text block for context-window injection. Bypasses RAG/search — returns every lesson sorted by confidence. For most projects (20-200 lessons), fits in 1K-10K tokens.

Parameters (* = required)

format (string)

Output format (default: compact); one of compact · full

signal (string)

Filter by signal type; one of positive · negative

maxTokenBudget (number)

Approximate token budget (default: 10000)
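
The budget behaviour of context_stuff_lessons can be pictured with a minimal sketch. It assumes a hypothetical lesson record with text, confidence, and signal fields and a rough 4-characters-per-token estimate; none of these names or heuristics come from ThumbGate itself.

```python
def estimate_tokens(text: str) -> int:
    """Very rough token estimate: about 4 characters per token (an assumption)."""
    return max(1, len(text) // 4)


def stuff_lessons(lessons, max_token_budget=10000, signal=None):
    """Return lessons as one text block, highest confidence first, within a budget.

    `lessons` is a list of dicts with hypothetical keys: text, confidence, signal.
    """
    selected = [item for item in lessons if signal is None or item["signal"] == signal]
    selected.sort(key=lambda item: item["confidence"], reverse=True)

    lines, used = [], 0
    for lesson in selected:
        cost = estimate_tokens(lesson["text"])
        if used + cost > max_token_budget:
            break  # stop at the first lesson that would exceed the budget
        lines.append(f"- [{lesson['signal']}] {lesson['text']}")
        used += cost
    return "\n".join(lines)


lessons = [
    {"text": "Never delete test directories", "confidence": 0.9, "signal": "negative"},
    {"text": "Use --force-with-lease on feature branches", "confidence": 0.7, "signal": "positive"},
    {"text": "Never run DROP on production tables", "confidence": 0.95, "signal": "negative"},
]
block = stuff_lessons(lessons, signal="negative")
print(block)
```

Sorting before truncating means a tight budget drops the least confident lessons first.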

ThumbGate

AI coding agents repeat mistakes — and one wrong tool call can wipe a directory, leak a key, or push broken code.

ThumbGate is the local-first firewall for AI coding agents. It runs in the PreToolUse hook on your machine and blocks dangerous tool calls — rm -rf, secret exfiltration, off-scope edits, a bad git push — before they execute, across Claude Code, Cursor, Codex, Gemini, Amp, Cline, and OpenCode. No server, no gateway. (Regulated-industry policy templates — legal intake, financial compliance, healthcare — build on the same engine.)

The product is a self-improving enforcement layer: thumbs-down feedback, prompt evaluation, and proof from prior runs become prevention rules that permanently stop repeated failures before the next tool call.

ThumbGate blocking an AI agent's dangerous commands (rm -rf, force-push, chmod 777) in real time, while letting safe commands through

  Agent tries:   rm -rf tests/
  ThumbGate:     ⛔ BLOCKED — "Never delete test directories"
                 Pattern matched: rm.*-rf.*tests
                 Source: your thumbs-down from last Tuesday
                 Tokens spent on this repeat: 0

npx thumbgate init   # auto-detects your agent, wires hooks, 30 seconds

Works with Claude Code, Cursor, Codex, Gemini CLI, Amp, Cline, OpenCode and any MCP-compatible agent. Free tier: 2 feedback captures/day (10 total) and up to 3 active auto-promoted prevention rules. Pro: $19/mo or $149/yr — unlimited rules, history-aware lessons, feedback sessions, dashboard, DPO export. Enterprise (custom pricing, scoped after intake) adds a shared hosted lesson DB, org dashboard, and shared org-wide enforcement.

"A better dashboard doesn't make the agents more reliable. The hard part isn't visibility. It's trust."

— Rob May, CEO & co-founder, Neurometric AI, quoted in The New Stack on Anthropic's Claude Code Agent View (May 2026).

ThumbGate is the open-source layer that makes the trust part real: PreToolUse gates, thumbs-down to rule, audit trail on every interception.

Agentic development cycle fit

Agentic development is becoming a loop: Guide → Generate → Verify → Solve. ThumbGate gives that loop a hard execution boundary.

Guide: standards, prior thumbs-downs, and approval policies become concrete context.
Generate: Claude Code, Cursor, Codex, Gemini, Amp, Cline, OpenCode, and MCP agents keep producing plans and tool calls.
Verify: risky actions need evidence before execution, not just after PR review.
Solve: blocked failures become reusable lessons, shared prevention rules, DPO exports, and audit events.

In that stack, ThumbGate is the pre-action gate between generated intent and executed action.

Discoverable slash-commands — the guardrail layer for spec-driven agents

Spec-driven agent frameworks like GSD (get-shit-done) and GitHub Spec Kit are great at planning and generating work — they expose dozens of discoverable /gsd-* / /specify commands in the agent command palette. ThumbGate is the guardrail layer for spec-driven agents: it sits after the plan, on the boundary between a generated tool call and its execution. It works alongside GSD / Spec-Kit, not instead of them — they decide what to build; ThumbGate enforces what the agent must never do while building it.

npx thumbgate init installs these commands into your agent's palette (.claude/commands/, .gemini/commands/, .antigravitycli/commands/) so the enforcement layer is as browsable as the planning layer:

Command	What it does	Wraps (existing capability)
`/thumbgate-guard`	Turn the last agent mistake into a hard prevention rule	`capture_feedback` + `thumbgate force-gate`
`/thumbgate-rules`	List the active prevention rules + lessons guarding this repo	`prevention_rules`, `get_reliability_rules`, `search_lessons`
`/thumbgate-blocked`	Show what's actually been blocked — gate stats + enforcement matrix	`gate_stats`, `enforcement_matrix`
`/thumbgate-protect`	Show branch/release governance; grant a scoped, expiring approval	`get_branch_governance`, `approve_protected_action`
`/thumbgate-doctor`	Health-check the wiring (hooks, MCP, agent-readiness)	`thumbgate doctor`

Each is a thin wrapper over an existing MCP tool or CLI command — no new enforcement logic, just discoverability.

🎬 90-second demo

Watch the force-push scenario: agent tries to git push --force, one thumbs-down, next session it's blocked — zero tokens spent on the repeat.

▶ Watch the 90-second demo · Script · ElevenLabs narration: npm run demo:voiceover

First-dollar activation path

If someone is not already bought into ThumbGate, do not lead with architecture. Lead with one repeated mistake.

Show the pain: open the ThumbGate GPT and paste the bad answer, risky command, deploy, PR action, or agent plan before it runs again.
Capture the lesson: type thumbs down: or thumbs up: with one concrete sentence. Native ChatGPT rating buttons are not the ThumbGate capture path; typed feedback is.
Enforce the repeat: run npx thumbgate init where the agent executes so the lesson can become one of your Pre-Action Checks instead of another reminder.
Upgrade only after proof: Solo Pro is for the dashboard, DPO export, proof-ready evidence, and higher capture limits after one real blocked repeat. Team starts with the Workflow Hardening Sprint around one repeated failure, one owner, and one proof review.

The buying question is simple: what repeated AI mistake would be worth blocking before the next tool call?

The Problem — the bill nobody talks about

Frontier-model calls are not cheap. Sonnet 4.5 is ~$3 / 1M input tokens and ~$15 / 1M output tokens. Opus is 5× that. Every time your agent:

hallucinates a function name and you have to correct it,
retries the same failing tool call until it gives up,
regenerates a 4,000-token plan you already approved last session,
repeats a destructive command you blocked manually yesterday,

…you are paying for that round-trip. Twice if it retries. Three times if you re-prompt. And the agent has no memory across sessions, so the meter resets every Monday.

Session 1:  Agent force-pushes to main.     You fix it.    +4,200 tokens
Session 2:  Agent force-pushes again.       You fix it.    +4,200 tokens
Session 3:  Same mistake. Again.            You lose 45m.  +5,800 tokens

That's 14,200 tokens, or ~$0.21 at the ~$15 / 1M output rate, just to fix the same mistake three times. Multiply that by every developer, every repeated-mistake class, every week, and the math gets ugly fast.
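
The ~$0.21 figure can be reproduced with a few lines of arithmetic. This sketch assumes every token is billed at the ~$15 / 1M output rate quoted above, which is an upper bound since some of those tokens would be cheaper input tokens.

```python
OUTPUT_PRICE_PER_MILLION = 15.00  # USD, the ~$15 / 1M output-token rate quoted above

# Tokens spent fixing the same force-push mistake in sessions 1, 2, and 3.
session_tokens = [4_200, 4_200, 5_800]

total_tokens = sum(session_tokens)
cost = total_tokens / 1_000_000 * OUTPUT_PRICE_PER_MILLION

print(total_tokens)    # 14200
print(round(cost, 2))  # 0.21
```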

The Solution — fix it once, the bill never sees it again

Session 1:  Agent force-pushes to main.     You 👎 it.       +4,200 tokens
Session 2:  ⛔ Check blocks the force-push.  Zero round-trip. +0 tokens
Session 3+: Never happens again.                              +0 tokens

One thumbs-down. The PreToolUse hook intercepts the tool call before it executes: no input tokens, no output tokens, no retry loop. The dashboard tracks tokens saved this week as a live counter so you can see exactly what your prevention rules are worth. Mark a review checkpoint once, and the dashboard narrows the next pass to only the feedback, lessons, and check blocks that landed since your last review.

ThumbGate doesn't make your agent smarter. It makes your agent cheaper to be wrong with.

🧠 The Context Brain

Every coding agent starts each session amnesiac — it has no memory of the mistakes it made yesterday, the fixes your team already rejected, or the rules this repo enforces. So it repeats them, and you pay for it again.

ThumbGate gives your repo a context brain: a single, versioned, agent-readable artifact that consolidates everything the agent should know before it acts — the lessons it has learned, the guardrails it must not cross, the gates that are enforced, and the project's own instruction files.

npx thumbgate brain --write     # → .thumbgate/BRAIN.md

Then point your agent at it — add Read .thumbgate/BRAIN.md first to your CLAUDE.md / AGENTS.md, and every Claude Code, Codex, Cursor, or Gemini CLI session boots with your repo's institutional memory already loaded. The output is deterministic, so BRAIN.md lives in git and only changes when the underlying memory does — review it like any other file.

# ThumbGate Context Brain
## What this codebase taught its agents (lessons)
- ⛔ Force-pushing to main was rejected — use --force-with-lease on feature branches only
## Guardrails — do NOT repeat these (prevention rules)
- Never run DROP on production tables
## Active enforcement (gates)
- `DROP.*production` → block
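
Why a deterministic render matters for git can be shown with a small sketch. The function below is hypothetical (it is not ThumbGate's generator, and the section titles are simplified); it only illustrates that sorting and de-duplicating the inputs makes the output independent of the order in which lessons were captured.

```python
def render_brain(lessons, rules, gates):
    """Render a context-brain document whose text depends only on the content."""
    sections = [
        ("Lessons", lessons),
        ("Guardrails", rules),
        ("Active enforcement (gates)", gates),
    ]
    out = ["# ThumbGate Context Brain"]
    for title, items in sections:
        out.append(f"## {title}")
        # Sort and de-duplicate so input order never changes the output.
        out.extend(f"- {item}" for item in sorted(set(items)))
    return "\n".join(out) + "\n"


rules = ["Never run DROP on production tables"]
gates = ["DROP.*production -> block"]
a = render_brain(["b lesson", "a lesson"], rules, gates)
b = render_brain(["a lesson", "b lesson", "a lesson"], rules, gates)
print(a == b)  # True
```

With a render like this, a git diff on BRAIN.md shows up only when a lesson, rule, or gate actually changed.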

Same idea the SEO world is now calling a "client brain" — persistent context that AI reads before doing the work — applied to engineering: the institutional memory that stops your coding agent from relearning the same lesson on your dime.

Quick Start

npx thumbgate init                                                         # auto-detects your agent, wires everything
npx thumbgate capture down "Never run DROP on production tables"

The capture command creates a prevention rule. Next time any AI agent tries to run DROP on production:

⛔ Check blocked: "Never run DROP on production tables"
   Pattern: DROP.*production
   Verdict: BLOCK
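
A minimal sketch of how a pattern like the one above could be checked against a command. The rule record and the check function are hypothetical illustrations, not ThumbGate's internal format.

```python
import re

# Hypothetical rule record mirroring the blocked-check output above.
rule = {
    "message": "Never run DROP on production tables",
    "pattern": r"DROP.*production",
}


def check(command: str, rule: dict) -> str:
    """Return BLOCK when the rule's pattern is found in the command, else ALLOW."""
    return "BLOCK" if re.search(rule["pattern"], command) else "ALLOW"


print(check("psql -c 'DROP TABLE production.users'", rule))  # BLOCK
print(check("psql -c 'SELECT 1 FROM staging.users'", rule))  # ALLOW
```

As written the match is case-sensitive, so a lowercase "drop table" would pass; a real matcher would likely need to normalize case first.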

Architecture

ThumbGate operates as a 4-layer enforcement stack between your AI agent and your codebase:

ThumbGate Architecture

Layer 1: Feedback Capture

Your thumbs-up/down reactions are captured via MCP protocol, CLI, or the ChatGPT GPT surface. Each reaction is stored as a structured lesson with context, timestamp, and severity.

Layer 2: Check Engine

The check engine converts lessons into enforceable rules. The runtime gate decision is deterministic — literal pattern match → AST match → scoped rule lookup. No LLM call on the enforcement path.
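
The fixed evaluation order can be sketched as follows. This is an illustration under stated assumptions: the three rule tables are hypothetical, and the AST stage is approximated by tokenizing the command with shlex rather than building a real syntax tree.

```python
import shlex

LITERAL_RULES = {"git push --force origin main"}  # exact command strings
STRUCTURAL_RULES = [("rm", {"-rf"})]              # (program, flags that must all be present)
SCOPED_RULES = {"migrations/": {"rm", "mv"}}      # path prefix -> programs blocked there


def decide(command: str) -> str:
    """Evaluate the stages in a fixed order and return the first verdict reached."""
    # Stage 1: literal match on the whole command string.
    if command.strip() in LITERAL_RULES:
        return "BLOCK:literal"

    # Stage 2: structural match on the parsed command (stand-in for an AST match).
    argv = shlex.split(command)
    if not argv:
        return "ALLOW"
    program, args = argv[0], set(argv[1:])
    for rule_program, required in STRUCTURAL_RULES:
        if program == rule_program and required <= args:
            return "BLOCK:structural"

    # Stage 3: scoped lookup, keyed by the path prefix an argument touches.
    for prefix, programs in SCOPED_RULES.items():
        if program in programs and any(arg.startswith(prefix) for arg in args):
            return "BLOCK:scoped"

    return "ALLOW"


print(decide("git push --force origin main"))  # BLOCK:literal
print(decide("rm   -rf   build"))               # BLOCK:structural
print(decide("mv migrations/001.sql /tmp"))     # BLOCK:scoped
print(decide("ls -la"))                         # ALLOW
```

Every stage is a plain lookup or set comparison, so the same command always produces the same verdict.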

Where retrieval is needed (an agent is about to run a destructive command not on the literal block list, but semantically similar to one we've blocked before), ThumbGate uses local CPU-only bge-small embeddings via LanceDB's built-in pipeline. No external API call, no inference cost beyond CPU. So "no LLM in enforcement" holds: the gate decision uses no LLM; the rule corpus is just searchable via local embeddings.

Thompson Sampling tunes per-rule confidence weights for soft-gating rules so high-noise rules quiet down and high-signal rules sharpen. It never decides whether a rule fires — a hard rule like "block git push --force on main" always fires deterministically. Bandit exploration would be terrifying for hard rules; we don't do it.
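
A minimal Beta-Bernoulli sketch of how Thompson Sampling can tune a soft rule's confidence weight. The SoftRule class, the rule names, and the useful/noise feedback signal are assumptions made for illustration; hard rules are deliberately absent because they never consult a weight.

```python
import random


class SoftRule:
    """Soft-gating rule with a Beta posterior over how often its fires are useful."""

    def __init__(self, name):
        self.name = name
        self.alpha = 1.0  # prior plus count of fires the operator confirmed as useful
        self.beta = 1.0   # prior plus count of fires the operator marked as noise

    def record(self, useful: bool):
        """Update the posterior with one observed outcome."""
        if useful:
            self.alpha += 1
        else:
            self.beta += 1

    def sample_weight(self, rng):
        """Draw a confidence weight from the Beta posterior (Thompson Sampling)."""
        return rng.betavariate(self.alpha, self.beta)

    def mean_weight(self):
        """Posterior mean, shown here to make the trend easy to read."""
        return self.alpha / (self.alpha + self.beta)


rng = random.Random(0)
noisy, sharp = SoftRule("warn-on-large-diff"), SoftRule("warn-on-env-read")
for _ in range(20):
    noisy.record(useful=False)
    sharp.record(useful=True)

print(round(noisy.mean_weight(), 3))  # 0.045
print(round(sharp.mean_weight(), 3))  # 0.955
print(noisy.sample_weight(rng), sharp.sample_weight(rng))
```

After twenty noisy fires the first rule's weight collapses toward zero, while the second rule's weight approaches one.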

Rules stay in local ThumbGate runtime state.

Layer 3: Pre-Action Interception

Before any agent action executes, ThumbGate's PreToolUse hook intercepts the command and evaluates it against all active checks. This happens at the MCP protocol level — the agent physically cannot bypass it.
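
For readers who have not written a PreToolUse hook, here is a generic sketch of the decision step. It assumes the Claude Code hook contract (the hook receives a JSON payload with tool_name and tool_input, and exit code 2 blocks the call); the pattern list and the evaluate function are hypothetical and are not ThumbGate's implementation.

```python
import json
import re

# Hypothetical rule list; a real install would load learned rules instead.
BLOCK_PATTERNS = [
    (r"git\s+push\s+.*--force", "Never force-push"),
    (r"rm\s+-rf\s+tests", "Never delete test directories"),
]


def evaluate(payload: dict):
    """Return (exit_code, message); exit code 2 asks the agent to block the call."""
    if payload.get("tool_name") != "Bash":
        return 0, ""
    command = payload.get("tool_input", {}).get("command", "")
    for pattern, message in BLOCK_PATTERNS:
        if re.search(pattern, command):
            return 2, f"BLOCKED: {message}"
    return 0, ""


def handle(stdin_text: str):
    """Parse the JSON payload a hook would receive on stdin and evaluate it."""
    return evaluate(json.loads(stdin_text))


blocked = handle('{"tool_name": "Bash", "tool_input": {"command": "git push origin main --force"}}')
allowed = handle('{"tool_name": "Bash", "tool_input": {"command": "git status"}}')
print(blocked)  # (2, 'BLOCKED: Never force-push')
print(allowed)  # (0, '')
```

A real hook script would read the payload from stdin, print the message to stderr, and exit with the returned code.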

Layer 4: Multi-Agent Distribution (the actual moat vs hand-rolled hooks)

Claude Code already ships permissions.deny and PreToolUse hooks. Cursor and Codex have their own. So why ThumbGate over a hand-written hook?

Two things hand-written hooks structurally cannot do:

Cross-agent propagation. A permissions.deny pattern lives in one agent's config and stays there. ThumbGate's checks distribute across every connected agent over MCP stdio — thumbs-down once in Cursor, the same pattern blocks on Claude Code, Codex, Gemini CLI, Cline, OpenCode, Amp in the next session, no copy-paste between configs.
Learning loop. A hand-written hook covers exactly the patterns you wrote. ThumbGate promotes every thumbs-down into a fresh rule, tunes existing rules' confidence weights from outcomes (Thompson Sampling, see Layer 2), and pulls semantically-near patterns into scope via local embeddings. The rule corpus sharpens without an operator hand-writing a regex for every new mistake shape.

Hand-rolled hooks are the right tool for a small, static denylist you maintain by hand. ThumbGate is the right tool when you want corrections from any agent to harden every agent automatically.

Prompt engineering still matters, but it is only the starting point. ThumbGate adds prompt evaluation on top: proof lanes, benchmarks, and self-heal checks tell you whether your prompt and workflow actually held up under execution instead of leaving you to guess from vibes. Run npx thumbgate eval --from-feedback --write-report=.thumbgate/prompt-eval-proof.md to turn real thumbs-up/down feedback into reusable eval cases and a buyer-ready proof report.

Retrieval & latency: local-first, zero network hops

ThumbGate's latency advantage is structural, not a tuned cloud cluster: there is no retrieval service and no model on the enforcement path, so the gate decision never leaves your machine.

flowchart LR
    A["Agent about to run<br/>a tool call"] --> B{"Literal / AST match<br/>on an active rule?"}
    B -- "exact match" --> D["Deterministic gate decision<br/>(no model, on-device)"]
    B -- "no exact match, but<br/>semantically near a<br/>blocked pattern" --> C["Local CPU embeddings<br/>bge-small via LanceDB<br/>(no external API)"]
    C --> D
    D -- "known-bad" --> E["⛔ BLOCK before execution"]
    D -- "safe" --> F["✓ Allow"]

Deterministic first. Most decisions are a literal or AST pattern match against your active rules — sub-millisecond, on-device, no embeddings.
Local semantic fallback. When an action isn't on the literal block list but is semantically near one you've blocked before, ThumbGate searches the rule corpus with CPU-only bge-small embeddings via LanceDB — still local, still no external API call.
No LLM on the enforcement path. The gate never calls a model to decide block/allow. Thompson Sampling only tunes soft-rule confidence weights; hard rules always fire deterministically (see Layer 2).

The fastest network round-trip is the one you never make: enforcement is fully local, so it adds negligible latency to the agent loop — no cloud retrieval, no inference hop, no data leaving the machine.
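
The semantic fallback amounts to a nearest-neighbour search with a similarity threshold. The sketch below uses toy 3-dimensional vectors in place of real bge-small embeddings and a plain dictionary in place of LanceDB; the 0.9 threshold is likewise an assumption.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Toy vectors standing in for the embeddings of previously blocked patterns.
BLOCKED = {
    "rm -rf tests/": [0.9, 0.1, 0.0],
    "git push --force": [0.0, 0.2, 0.9],
}


def nearest_blocked(vector, threshold=0.9):
    """Return the closest blocked pattern if it clears the similarity threshold."""
    best = max(BLOCKED, key=lambda key: cosine(vector, BLOCKED[key]))
    return best if cosine(vector, BLOCKED[best]) >= threshold else None


near = nearest_blocked([0.8, 0.2, 0.1])  # close to the "rm -rf tests/" vector
far = nearest_blocked([0.1, 0.9, 0.1])   # not close to anything blocked
print(near)  # rm -rf tests/
print(far)   # None
```

The lookup only retrieves a candidate rule; the block or allow verdict is still a deterministic comparison against the threshold.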

Managed model benchmark lane

When a new managed model drops, do not swap ThumbGate over on vendor claims alone. Rank it against the actual ThumbGate workload first:

npx thumbgate model-candidates --workload=pretool-gating --json
npx thumbgate model-candidates --workload=long-trace-review --provider=openai-compatible --gateway=tinker --json

The catalog currently includes the April 23, 2026 Tinker additions:

tinker/qwen3.6-35b-a3b for pre-action gating, agentic coding, and tool-use
tinker/qwen3.6-27b for the cheap fast-path
tinker/kimi-k2.6-128k for long-trace review and multi-agent sessions

Each recommendation ships with the benchmark commands to run next: feedback-derived prompt eval, gate-eval, and thumbgate bench. For whole-repo clone claims, add npx thumbgate bench --programbench-smoke to generate a ProgramBench-style cleanroom proof report without claiming an official ProgramBench score. That keeps model selection evidence-backed instead of hype-driven.

Feedback Pipeline

Agent Integration

Install for Your Agent

Agent	Command
Claude Code	`npx thumbgate init --agent claude-code`
Cursor	`npx thumbgate init --agent cursor`
VS Code / Open VSX	plugins/vscode-extension/README.md
Antigravity-compatible	plugins/antigravity-extension/INSTALL.md
JetBrains	plugins/jetbrains-plugin/README.md
Codex	`npx thumbgate init --agent codex`
Gemini CLI	`npx thumbgate init --agent gemini`
Amp	`npx thumbgate init --agent amp`
Cline (Roo Code successor)	`npx thumbgate init --agent cline`
Claude Desktop	Download extension bundle
Any MCP agent	`npx thumbgate serve`

Works with Claude Code, Cursor, Codex, Gemini CLI, Amp, Cline, OpenCode, and any MCP-compatible agent. Migrating from Roo Code (sunsetting 2026-05-15)? See adapters/cline/INSTALL.md.

Install scope: machine-wide vs per-project

ThumbGate supports two install scopes. Pick once when you install — you can switch later by re-running with the other flag.

Scope	Command	Settings file	Lesson DB + dashboard live in	When to use
Machine-wide (default)	`npx thumbgate init`	`~/.claude/settings.json`	`~/.claude/memory/feedback/`	Solo dev — one shared dashboard across every repo on this machine. A lesson learned in `repo-A` blocks the same mistake in `repo-B` automatically.
Per-project	`npx thumbgate init --project` (in the repo root)	`<repo>/.claude/settings.json`	`<repo>/.claude/memory/feedback/`	Client work, compliance, or multi-tenant — separate dashboard per repo, lessons stay isolated, audit trail belongs to the repo.

Both scopes write mcpServers.thumbgate + the PreToolUse / UserPromptSubmit / PostToolUse / SessionStart hooks; the only difference is where. Machine-wide is the right default for most developers. Switch to --project only when you have a reason to keep lessons from bleeding between repos.

Per-project lesson DBs live under each repo's .claude/memory/feedback/ and must stay gitignored — they're a runtime store, not source. ThumbGate's bundled .gitignore template handles this.

Status bar proof

Claude Code ThumbGate footer

Codex ThumbGate test lane

Claude renders the live ThumbGate footer today. npx thumbgate init --agent codex now installs the full Codex hook bundle and writes the ThumbGate statusLine target into ~/.codex/config.toml so you can test it on your local Codex build immediately.

Install Codex Plugin

Open the Codex plugin install page or download the standalone bundle from GitHub Releases. The Codex launcher resolves thumbgate@latest when MCP and hooks start, so published npm fixes reach active Codex installs without hand-editing ~/.codex/config.toml.

Install page: thumbgate.ai/codex-plugin
Direct zip: thumbgate-codex-plugin.zip
Follow: plugins/codex-profile/INSTALL.md

Install ChatGPT App / GPT Action

ChatGPT is the advice, checkpointing, and typed-feedback surface; ThumbGate's hard enforcement still runs locally in Codex, Claude Code, Cursor, Gemini CLI, Amp, OpenCode, MCP, or CI after install.

App page: thumbgate.ai/chatgpt-app
Live GPT: thumbgate.ai/go/gpt
GPT Action schema: thumbgate.ai/openapi.yaml
Follow: adapters/chatgpt/INSTALL.md

How It Works

  STEP 1              STEP 2                 STEP 3
  ────────            ────────               ────────

  You react           ThumbGate learns       The check holds

  👎 on a bad    ──►  Feedback becomes  ──►  Next time the
  agent action        a saved lesson         agent tries the
                      and a block rule       same thing:
  👍 on a good   ──►  Good pattern gets      ⛔ BLOCKED
  agent action        reinforced                 (or ✅ allowed)

No manual rule-writing. No config files. Your reactions teach the agent what your team actually wants.

ThumbGate sells four concrete outcomes:

Prevent expensive AI mistakes — catch bad commands, destructive database actions, unsafe publishes, and risky API calls before they run.
Make AI stop repeating mistakes — fix it once, turn the lesson into a rule, and block the repeat before the next tool call lands.
Turn AI into a reliable operator — move from a smart assistant that apologizes after damage to a production-ready operator with checkpoints, proof, and enforcement.
Measure prompts instead of rewriting them blindly — use thumbgate eval --from-feedback, proof lanes, ThumbGate Bench, and self-heal:check to evaluate whether prompts and workflows actually improved behavior.

Use Cases

Developer Workflows

Stop force-push to main — Check blocks git push --force on protected branches before it runs
Prevent repeated migration failures — Each mistake becomes a searchable lesson that fires before the next attempt
Block unauthorized file edits — Control which files agents can touch with path-based rules
Memory across sessions — The agent remembers your feedback from yesterday
Shared team safety — One developer's thumbs-down protects the whole team
Auto-improving without feedback — Self-improvement mode evaluates outcomes and generates rules automatically

Enterprise & Regulated Industries

Legal AI intake governance — Block unauthorized practice of law (ABA Rule 5.5), require conflict-of-interest clearance before fact collection (Rules 1.7/1.9/1.10), prevent privileged content from leaving firm boundaries (Rule 1.6)
Financial compliance — Gate AI-generated trade recommendations, block unauthorized disclosures, enforce approval chains before customer-facing outputs
Healthcare — Prevent AI agents from providing medical diagnoses, enforce HIPAA-compliant data routing, require clinician review before patient-facing content
Audit trail — Every gate decision (block, allow, reroute) is preserved with rule version, timestamp, and reviewer path for compliance review

See the legal-intake demo →

Built-in Checks

⛔ force-push          → blocks git push --force
⛔ protected-branch    → blocks direct push to main
⛔ unresolved-threads  → blocks push with open reviews
⛔ package-lock-reset  → blocks destructive lock edits
⛔ env-file-edit       → blocks .env secret exposure

+ custom prevention rules for project-specific failures

CLI Reference

npx thumbgate init                                              # detect agent, wire hooks
npx thumbgate doctor                                            # health check
npx thumbgate capture up|down "<text>"                         # capture a signal as a stored lesson (positional format)
npx thumbgate lessons                                           # see what's been learned
npx thumbgate brain --write                                     # build .thumbgate/BRAIN.md — the agent-readable context brain
npx thumbgate explore    # terminal explorer for lessons, checks, stats
npx thumbgate background-governance  # review background-agent run risk
npx thumbgate model-candidates --workload=dashboard-analysis --provider=openai --json  # evaluate GPT-5.5 routing
npx thumbgate native-messaging-audit  # inspect local browser bridges and extension hosts
npx thumbgate dashboard --open                                  # open local project-scoped dashboard in browser
thumbgate-dashboard                                             # standalone browser dashboard shortcut (run '/project:thumbgate-dashboard' in Claude/Grok)
npx thumbgate check-update                                      # check if a new version is available on npm/GitHub
npx thumbgate self-update                                       # update ThumbGate to the latest version globally
npx thumbgate serve      # start MCP server on stdio
npx thumbgate bench      # run reliability benchmark
npx thumbgate bench --programbench-smoke  # include cleanroom whole-repo proof lane
npx thumbgate break-glass --reason="ThumbGate over-fired"  # short TTL recovery for gate over-fire

Recovery if a gate over-fires

ThumbGate should block repeated unsafe actions, not trap the operator. If a noisy rule or stale memory pattern blocks the hook/settings change you need to recover, open a short-lived break-glass window:

npx thumbgate break-glass --reason="ThumbGate over-fired and blocked operator recovery"

What this unlocks for up to 5 minutes:

Edits to .claude/settings.local.json, .claude/settings.json, .codex/config.toml, and the same files inside nested workspaces.
The short-lived proof gates used for PR recovery: pr_create_allowed and pr_threads_checked.

What stays gated:

Force pushes, protected-branch pushes, broad rm -rf, unsafe chmod, package publishes/releases, and local-only remote side effects.
Arbitrary protected files such as README.md, AGENTS.md, policy bundles, or credentials.
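
The recovery window behaves like a time-boxed allowlist. The sketch below is a hypothetical model of that idea, not the break-glass implementation: the class name, the suffix-based path check, and the injectable clock are all assumptions.

```python
import time


class BreakGlass:
    """Time-boxed allowlist: a narrow set of files is unlocked until the window expires."""

    TTL_SECONDS = 5 * 60  # "up to 5 minutes" per the text above
    UNLOCKED = (
        ".claude/settings.local.json",
        ".claude/settings.json",
        ".codex/config.toml",
    )

    def __init__(self, reason, now=None):
        if not reason:
            raise ValueError("a reason is required")
        self.reason = reason
        self.opened_at = time.time() if now is None else now

    def allows_edit(self, path, now=None):
        """True only for an unlocked settings file while the window is still open."""
        now = time.time() if now is None else now
        within_window = (now - self.opened_at) <= self.TTL_SECONDS
        # A suffix match also covers the same files inside nested workspaces.
        return within_window and path.endswith(self.UNLOCKED)


window = BreakGlass("ThumbGate over-fired", now=1000.0)
print(window.allows_edit("app/.claude/settings.json", now=1060.0))  # True
print(window.allows_edit("README.md", now=1060.0))                  # False
print(window.allows_edit(".codex/config.toml", now=1301.0))         # False
```

Anything outside the unlocked list stays gated even while the window is open, which matches the "what stays gated" list above.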

Verify the recovery window and runtime health before continuing:

npx thumbgate break-glass --reason="verify recovery path" --json
npx thumbgate doctor

If you change MCP or hook settings, restart the affected agent session so Claude Code, Cursor, Codex, or another runtime reloads .mcp.json and local settings.

Pricing

	Free	Pro ($19/mo)	Enterprise
Local CLI + enforced checks	✅	✅	✅
Feedback captures	2/day (10 total)	Unlimited	Unlimited
Active auto-promoted prevention rules	3	Unlimited	Unlimited
MCP agent integrations	All	All	All
Personal dashboard	—	✅	✅
DPO export (model fine-tuning)	—	✅	✅
Lesson export/import	—	✅	✅
Shared hosted lesson DB	—	—	✅
Org-wide dashboard	—	—	✅
Approval + audit proof	—	—	✅
Regulatory gate templates	—	—	✅
Custom policy layers (firm/practice-area)	—	—	✅
Compliance audit export	—	—	✅
Dedicated onboarding + SLA	—	—	✅

The free tier gives you 2 feedback captures/day (10 total) and up to 3 active auto-promoted prevention rules — enough to make ThumbGate part of your daily flow before you upgrade. MCP integrations for all agents (Claude Code, Cursor, Codex, Gemini, Amp, Cline, OpenCode) ship free.

Pro ($19/mo or $149/yr) removes the rule cap and adds history-aware lesson recall, lesson search, DPO export, and a personal dashboard. Enterprise (custom pricing, scoped after intake) adds a shared hosted lesson DB, org dashboard, and shared enforcement across the org, plus regulatory gate templates (legal intake, financial compliance, healthcare), custom policy layers scoped to firm/practice-area, compliance audit export, and dedicated onboarding with SLA.

Best first paid motion for teams: the Workflow Hardening Sprint — qualify one repeated failure before committing to a full rollout. Start intake →

Best first technical motion: install the CLI first and let init wire hooks for the agent you already use.

Paid path for individual operators: ThumbGate Pro is the self-serve side lane for a personal dashboard and export-ready evidence.

Start free · See Pro · Team Sprint intake

Team Lesson Sharing (Pro + Team)

One team's hard-won lessons shouldn't stay trapped on one laptop. ThumbGate Pro and Team can export lessons as portable bundles and import them into any other ThumbGate instance — so a mistake caught by Team A becomes a prevention rule for Team B.

Export lessons from one project:

curl -X POST http://localhost:3456/v1/lessons/export \
  -H "Authorization: Bearer $THUMBGATE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"outputPath": "./lessons-export.json"}'

Filter by signal or tags:

curl -X POST http://localhost:3456/v1/lessons/export \
  -H "Authorization: Bearer $THUMBGATE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"signal": "down", "tags": ["push-notifications", "ci"]}'

Import into another team's ThumbGate:

curl -X POST http://localhost:3456/v1/lessons/import \
  -H "Authorization: Bearer $THUMBGATE_API_KEY" \
  -H "Content-Type: application/json" \
  -d @lessons-export.json

What happens on import:

Deduplication — lessons with the same ID or title+signal are skipped
Provenance tracking — every imported lesson is tagged team-import with original source project, export timestamp, and original ID
No overwrite — import is additive; existing lessons are never modified
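
The three import guarantees can be sketched in a few lines. The lesson fields (id, title, signal, tags) follow the export description, but the provenance key names and the function itself are hypothetical.

```python
def import_lessons(existing, bundle, source_project, exported_at):
    """Additively merge `bundle` into `existing`, skipping duplicates.

    Returns (merged_list, skipped_count); `existing` itself is never modified.
    """
    seen_ids = {item["id"] for item in existing}
    seen_keys = {(item["title"], item["signal"]) for item in existing}
    merged, skipped = list(existing), 0

    for lesson in bundle:
        key = (lesson["title"], lesson["signal"])
        if lesson["id"] in seen_ids or key in seen_keys:
            skipped += 1  # deduplication: same ID or same title+signal
            continue
        imported = dict(lesson)
        imported["tags"] = sorted(set(lesson.get("tags", [])) | {"team-import"})
        imported["provenance"] = {  # hypothetical key names
            "sourceProject": source_project,
            "exportedAt": exported_at,
            "originalId": lesson["id"],
        }
        merged.append(imported)
        seen_ids.add(lesson["id"])
        seen_keys.add(key)
    return merged, skipped


existing = [{"id": "a1", "title": "No force-push", "signal": "down"}]
bundle = [
    {"id": "a1", "title": "No force-push", "signal": "down"},  # same ID
    {"id": "b2", "title": "No force-push", "signal": "down"},  # same title+signal
    {"id": "c3", "title": "No DROP on production", "signal": "down", "tags": ["ci"]},
]
merged, skipped = import_lessons(existing, bundle, "repo-A", "2026-01-01T00:00:00Z")
print(len(merged), skipped)  # 2 2
```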

The export bundle includes full lesson metadata: signal, title, context, tags, failure type, skill, structured rules, and diagnosis. It's the same data you see in the lesson detail dashboard — portable as JSON.

Use cases:

Share enforcement patterns across repos in the same org
Onboard a new team with pre-built lessons from a mature project
Export lessons before a project handoff so institutional knowledge transfers
Feed lessons from multiple teams into a centralized DPO training pipeline

DPO Export for Fine-Tuning (Pro + Team)

Every thumbs-up and thumbs-down becomes a training signal. ThumbGate Pro exports your captured feedback as DPO (Direct Preference Optimization) pairs — ready to feed into a LoRA fine-tune so your model stops repeating known mistakes at the weight level, not just the check level.

Export DPO pairs:

curl -X POST http://localhost:3456/v1/dpo/export \
  -H "Authorization: Bearer $THUMBGATE_API_KEY" \
  -o dpo-pairs.jsonl

What you get: JSONL where each line is a preference pair:

chosen — the agent action you thumbed up
rejected — the action you thumbed down for the same task context
prompt — the originating user intent
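
A sketch of how such preference pairs could be assembled into JSONL. The input feedback records (prompt, signal, action) are a hypothetical shape; only the three output keys come from the description above.

```python
import json


def to_dpo_lines(feedback):
    """Pair thumbs-up and thumbs-down actions that share a prompt into JSONL lines."""
    by_prompt = {}
    for item in feedback:
        by_prompt.setdefault(item["prompt"], {})[item["signal"]] = item["action"]

    lines = []
    for prompt, actions in by_prompt.items():
        if "up" in actions and "down" in actions:  # a pair needs both sides
            pair = {"prompt": prompt, "chosen": actions["up"], "rejected": actions["down"]}
            lines.append(json.dumps(pair))
    return lines


feedback = [
    {"prompt": "Deploy the hotfix", "signal": "down", "action": "git push --force origin main"},
    {"prompt": "Deploy the hotfix", "signal": "up", "action": "git push --force-with-lease origin hotfix"},
    {"prompt": "Clean the workspace", "signal": "down", "action": "rm -rf tests/"},
]
lines = to_dpo_lines(feedback)
print(len(lines))  # 1
print(json.loads(lines[0])["rejected"])  # git push --force origin main
```

Feedback without a counterpart for the same prompt produces no pair. This sketch also keeps only the latest action per signal.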

Use cases:

Fine-tune Llama 3 / Mistral / local models with a LoRA adapter trained on your real mistakes
Feed into RLAIF or KTO pipelines (KTO export also available via /v1/kto/export)
Build a model that natively avoids your team's known failure patterns — no check at inference time needed

Why this matters: Checks block mistakes. Fine-tuning prevents them from being attempted. Combine both for belt-and-suspenders governance.

Tech Stack

Layer	Technology
Storage	SQLite + FTS5, LanceDB vectors, JSONL logs
Capture	2/day, 10 total on Free; unlimited on Pro, Team, and Enterprise
Intelligence	MemAlign dual recall, Thompson Sampling
Enforcement	PreToolUse hook engine, Checks config
Interfaces	MCP stdio, HTTP API, CLI (Node.js >=18)
Billing	Stripe
Execution	Railway, Cloudflare Workers, Docker Sandboxes
Governance	Workflow Sentinel, control plane, Docker Sandboxes

Every Changeset is tied to the exact main merge commit and generates Verification Evidence for Release Confidence.

Conversational ad / AI-search answer assets: AI Mode ads for agent governance · MCP tool governance · AI agent pre-action approval gates

Workflow Hardening Sprint · Live Dashboard

Integrations

ChatGPT App / GPT Action — First-class ChatGPT distribution page with the live GPT, public OpenAPI Action schema, and local enforcement install path
Open ThumbGate GPT — ThumbGate GPT: start here. Paste agent actions, get advice + checkpointing. No, users do not have to keep chatting inside the ThumbGate GPT to use ThumbGate — the hard enforcement layer still runs where the work happens.
Claude Desktop Extension — One-click install for Claude Desktop
Codex Plugin — Auto-updating standalone bundle and install page for Codex CLI
VS Code / Open VSX Extension — Marketplace-ready MCP provider and .vscode/mcp.json fallback for VS Code-compatible IDEs
Antigravity-compatible VSIX — Open VSX/direct VSIX install path while Antigravity-specific marketplace support is still unproven
JetBrains Plugin Scaffold — IntelliJ/PyCharm Marketplace path for the same thumbgate@latest runtime
Perplexity Command Center — AI-search visibility + lead discovery
ThumbGate Bench — Reliability benchmark and ProgramBench-style cleanroom proof lane
Manus AI Skill — ThumbGate integration for Manus AI agents

Feedback Sessions

Give the agent more context when a thumbs-down isn't enough:

👎 thumbs down
  └─► open_feedback_session
        └─► "you lied about deployment"    (append_feedback_context)
        └─► "tests were actually failing"  (append_feedback_context)
        └─► finalize_feedback_session
              └─► lesson inferred from full conversation

Free and self-hosted users can invoke search_lessons directly through MCP, and via the CLI with npx thumbgate lessons. History-aware feedback sessions give the agent full context for each lesson.

Enterprise Data Chat and Optional Google Adapters

The Enterprise dashboard chat is local/open-source first: it answers over local ThumbGate data using lesson retrieval, LanceDB-backed vectors, and your configured LLM. Set THUMBGATE_LOCAL_LLM_ENDPOINT to an OpenAI-compatible local endpoint (Ollama, llama.cpp, vLLM, LM Studio, etc.) when you want generated answers without sending dashboard data to Google.

Google Cloud is an optional regulated-enterprise adapter, not a dashboard chatbot requirement. If a buyer already standardizes on Vertex AI or Dialogflow CX, ThumbGate can verify that posture and deploy guard adapters in their tenancy.

Optional Vertex Setup

To wire local ThumbGate scoring to Vertex AI, run:

npx thumbgate setup-vertex

Auto-Discovery: Automatically detects your active authenticated gcloud session and active project ID.
Auto-Enablement: Programmatically enables the Vertex AI API in your project.
Auto-Configuration: Writes local Vertex routing settings to your .env file.

This command does not create or verify a live Dialogflow CX agent. Dialogflow is only relevant when a customer wants ThumbGate guard adapters in front of their own production DFCX agents. On current Google Cloud CLI installs, the old alpha gcloud CX command group is not available; verify Conversational Agents / Dialogflow CX with the Google Cloud console or the official Dialogflow CX REST API (projects.locations.agents) before claiming a live DFCX deployment.

Zero-Friction Cost Containment ($10/mo Hard Cap)

Google Cloud budget alerts are alert-only and do not stop API traffic, so a runaway agent can still produce an unexpected bill. ThumbGate addresses this on the client side:

Instant Shutdown: ThumbGate maintains a lightweight, local token ledger and halts outgoing API traffic as soon as your monthly token spend reaches the $10 limit (500k tokens of Gemini 1.5 Flash).
Bypasses extra shutdown plumbing: Requires no Pub/Sub or Cloud Functions for the local ThumbGate-side stop condition. You still need normal Google Cloud billing/API setup and live-agent verification for DFCX pilots.
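
A local hard cap of this kind can be sketched in a few lines. This is only an illustration of the idea, not ThumbGate's ledger: the `TokenLedger` and `BudgetExceeded` names are hypothetical, and the per-token rate is a placeholder rather than a real price.

```python
class BudgetExceeded(Exception):
    """Raised when a request would push spend past the monthly cap."""


class TokenLedger:
    """Hypothetical in-memory ledger that refuses spend beyond a monthly cap."""

    def __init__(self, monthly_cap_usd=10.0):
        self.monthly_cap_usd = monthly_cap_usd
        self.spent_usd = 0.0

    def cost(self, tokens, usd_per_million):
        return tokens * usd_per_million / 1_000_000

    def charge(self, tokens, usd_per_million):
        """Record the spend, or refuse before any traffic is sent."""
        new_total = self.spent_usd + self.cost(tokens, usd_per_million)
        if new_total > self.monthly_cap_usd:
            raise BudgetExceeded(
                f"cap of ${self.monthly_cap_usd:.2f} would be exceeded"
            )
        self.spent_usd = new_total


ledger = TokenLedger()
ledger.charge(400_000, usd_per_million=20.0)  # placeholder rate, not a real price
try:
    ledger.charge(200_000, usd_per_million=20.0)
except BudgetExceeded as exc:
    print("halted:", exc)
```

Because the check runs before the request is sent, the stop condition needs no cloud-side plumbing, which is the point the list above makes.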

FAQ

Is ThumbGate a model fine-tuning tool? No. ThumbGate does not update model weights. It captures feedback, stores lessons, injects context at runtime, and blocks bad actions before they execute.

How is this different from CLAUDE.md or .cursorrules? Those are suggestions the agent can ignore. ThumbGate checks are enforced: they block the action before it runs. They are also generated automatically from feedback instead of being written by hand.

Does it work with my agent? If it supports MCP or pre-action hooks, yes. Claude Code, Claude Desktop, Cursor, Codex, Gemini CLI, Amp, Cline, OpenCode all work out of the box.

Is it free? The free tier gives you 2 feedback captures/day, 10 total captures, and up to 3 active auto-promoted prevention rules — enough for solo devs to prove a blocked repeat before upgrading. MCP integrations ship free for every agent.

Pro ($19/mo or $149/yr) removes the rule cap and adds history-aware lesson recall, lesson search, and a personal dashboard. Enterprise (custom pricing, scoped after intake) adds a shared hosted lesson DB, org dashboard, and shared enforcement.

Docs

ThumbGate for Federal Agencies — pilot-ready posture, NIST 800-53 control mapping, OMB M-24-10 / EO 14110 alignment, ThumbGate-Core gov deployment mode, public/Core boundary invariants. Landing page: thumbgate.ai/federal.
First Dollar Playbook — turning one painful workflow into the next booked pilot
Commercial Truth — pricing, claims, what we don't say
Goal Contracts — evidence-before-done contracts for multi-agent handoffs
Changeset Strategy — release notes and version bump enforcement
Release Confidence — changesets, version checks, proof lanes
Verification Evidence — proof artifacts
Claude Desktop Extension Guide
Agent Workflow Contract — the agent-run contract for all ThumbGate operations
Ready for Agent Intake — ready-for-agent intake template
SEO Guide: Claude Code Guardrails
Unsupervised Learning Signals — silent-failure clustering (on by default as of 2026-05-21; opt out via THUMBGATE_SILENT_FAILURE_CLUSTERING=0; only meaningfully active on workspaces with ≥ 50 tool calls/day)
ThumbGate-Core — private core for hosted overlays, ranking, policy synthesis, billing intelligence, and org/team workflows

License

MIT. See LICENSE.

Registry: active

Package: thumbgate

Transport: STDIO

Updated: May 29, 2026

ThumbGate

AI coding agents repeat mistakes — and one wrong tool call can wipe a directory, leak a key, or push broken code.

ThumbGate blocking an AI agent's dangerous commands (rm -rf, force-push, chmod 777) in real time, while letting safe commands through

  Agent tries:   rm -rf tests/
  ThumbGate:     ⛔ BLOCKED — "Never delete test directories"
                 Pattern matched: rm.*-rf.*tests
                 Source: your thumbs-down from last Tuesday
                 Tokens spent on this repeat: 0

npx thumbgate init   # auto-detects your agent, wires hooks, 30 seconds

"A better dashboard doesn't make the agents more reliable. The hard part isn't visibility. It's trust."

— Rob May, CEO & co-founder, Neurometric AI, quoted in The New Stack on Anthropic's Claude Code Agent View (May 2026).

ThumbGate is the open-source layer that makes the trust part real: PreToolUse gates, thumbs-down to rule, audit trail on every interception.

Agentic development cycle fit

Agentic development is becoming a loop: Guide → Generate → Verify → Solve. ThumbGate gives that loop a hard execution boundary.

Guide: standards, prior thumbs-downs, and approval policies become concrete context.
Generate: Claude Code, Cursor, Codex, Gemini, Amp, Cline, OpenCode, and MCP agents keep producing plans and tool calls.
Verify: risky actions need evidence before execution, not just after PR review.
Solve: blocked failures become reusable lessons, shared prevention rules, DPO exports, and audit events.

In that stack, ThumbGate is the pre-action gate between generated intent and executed action.

Discoverable slash-commands — the guardrail layer for spec-driven agents

| Command | What it does | Wraps (existing capability) |
| --- | --- | --- |
| `/thumbgate-guard` | Turn the last agent mistake into a hard prevention rule | `capture_feedback` + `thumbgate force-gate` |
| `/thumbgate-rules` | List the active prevention rules + lessons guarding this repo | `prevention_rules`, `get_reliability_rules`, `search_lessons` |
| `/thumbgate-blocked` | Show what's actually been blocked: gate stats + enforcement matrix | `gate_stats`, `enforcement_matrix` |
| `/thumbgate-protect` | Show branch/release governance; grant a scoped, expiring approval | `get_branch_governance`, `approve_protected_action` |
| `/thumbgate-doctor` | Health-check the wiring (hooks, MCP, agent-readiness) | `thumbgate doctor` |

Each is a thin wrapper over an existing MCP tool or CLI command — no new enforcement logic, just discoverability.

🎬 90-second demo

Watch the force-push scenario: agent tries to git push --force, one thumbs-down, next session it's blocked — zero tokens spent on the repeat.

▶ Watch the 90-second demo · Script · ElevenLabs narration: npm run demo:voiceover

First-dollar activation path

If someone is not already bought into ThumbGate, do not lead with architecture. Lead with one repeated mistake.

Show the pain: open the ThumbGate GPT and paste the bad answer, risky command, deploy, PR action, or agent plan before it runs again.
Capture the lesson: type thumbs down: or thumbs up: with one concrete sentence. Native ChatGPT rating buttons are not the ThumbGate capture path; typed feedback is.
Enforce the repeat: run npx thumbgate init where the agent executes so the lesson can become one of your Pre-Action Checks instead of another reminder.
Upgrade only after proof: Solo Pro is for the dashboard, DPO export, proof-ready evidence, and higher capture limits after one real blocked repeat. Team starts with the Workflow Hardening Sprint around one repeated failure, one owner, and one proof review.

The buying question is simple: what repeated AI mistake would be worth blocking before the next tool call?

The Problem — the bill nobody talks about

Frontier-model calls are not cheap. Sonnet 4.5 is ~$3 / 1M input tokens and ~$15 / 1M output tokens. Opus is 5× that. Every time your agent:

hallucinates a function name and you have to correct it,
retries the same failing tool call until it gives up,
regenerates a 4,000-token plan you already approved last session,
repeats a destructive command you blocked manually yesterday,

…you are paying for that round-trip. Twice if it retries. Three times if you re-prompt. And the agent has no memory across sessions, so the meter resets every Monday.

Session 1:  Agent force-pushes to main.     You fix it.    +4,200 tokens
Session 2:  Agent force-pushes again.       You fix it.    +4,200 tokens
Session 3:  Same mistake. Again.            You lose 45m.  +5,800 tokens

That's 14,200 tokens, or ~$0.21 at the ~$15 / 1M output rate, just to fix the same mistake three times. Multiply that by every developer, every repeated-mistake class, and every week, and the math gets ugly fast.
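
The ~$0.21 figure can be reproduced with a few lines of arithmetic. This sketch assumes every token is billed at the ~$15 / 1M output rate quoted above, which is a simplification (input tokens are cheaper); the variable names are illustrative.

```python
# Token counts from the three sessions above.
session_tokens = [4_200, 4_200, 5_800]

# Assumption: bill everything at the Sonnet 4.5 output rate (~$15 per 1M tokens).
usd_per_million_tokens = 15.0

total_tokens = sum(session_tokens)
cost_usd = total_tokens * usd_per_million_tokens / 1_000_000

print(f"{total_tokens} tokens -> ${cost_usd:.2f}")  # 14200 tokens -> $0.21
```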

The Solution — fix it once, the bill never sees it again

Session 1:  Agent force-pushes to main.     You 👎 it.       +4,200 tokens
Session 2:  ⛔ Check blocks the force-push.  Zero round-trip. +0 tokens
Session 3+: Never happens again.                              +0 tokens

ThumbGate doesn't make your agent smarter. It makes your agent cheaper to be wrong with.

🧠 The Context Brain

npx thumbgate brain --write     # → .thumbgate/BRAIN.md

# ThumbGate Context Brain
## What this codebase taught its agents (lessons)
- ⛔ Force-pushing to main was rejected — use --force-with-lease on feature branches only
## Guardrails — do NOT repeat these (prevention rules)
- Never run DROP on production tables
## Active enforcement (gates)
- `DROP.*production` → block

Quick Start

npx thumbgate init                                                         # auto-detects your agent, wires everything
npx thumbgate capture down "Never run DROP on production tables"

The capture command alone creates a prevention rule. The next time any AI agent tries to run DROP on production:

⛔ Check blocked: "Never run DROP on production tables"
   Pattern: DROP.*production
   Verdict: BLOCK
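
To make the pattern line concrete, here is a minimal sketch of how a regex rule like the ones shown could be evaluated against a proposed command. It is an assumption about the general shape, not ThumbGate's engine: the `Rule` class and `check` function are hypothetical, and matching is case-sensitive here for simplicity.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class Rule:
    pattern: str  # regular expression, as shown in the "Pattern:" line
    message: str  # lesson text reported when the rule fires


RULES = [
    Rule(r"DROP.*production", "Never run DROP on production tables"),
    Rule(r"rm.*-rf.*tests", "Never delete test directories"),
]


def check(command: str, rules=RULES) -> Optional[Rule]:
    """Return the first rule whose pattern matches the command, else None."""
    for rule in rules:
        if re.search(rule.pattern, command):
            return rule
    return None


for cmd in ["psql -c 'DROP TABLE production.users'", "git status"]:
    hit = check(cmd)
    print(cmd, "->", f"BLOCK ({hit.message})" if hit else "ALLOW")
```

The check only inspects the command string, so no model call or token spend is involved in the decision.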

Architecture

ThumbGate operates as a 4-layer enforcement stack between your AI agent and your codebase:

ThumbGate Architecture

Layer 1: Feedback Capture

Your thumbs-up/down reactions are captured via MCP protocol, CLI, or the ChatGPT GPT surface. Each reaction is stored as a structured lesson with context, timestamp, and severity.

Layer 2: Check Engine

Rules stay in local ThumbGate runtime state.

Layer 3: Pre-Action Interception

Layer 4: Multi-Agent Distribution (the actual moat vs hand-rolled hooks)

Claude Code already ships permissions.deny and PreToolUse hooks. Cursor and Codex have their own. So why ThumbGate over a hand-written hook?

Two things hand-written hooks structurally cannot do:

Cross-agent propagation. A permissions.deny pattern lives in one agent's config and stays there. ThumbGate's checks distribute across every connected agent over MCP stdio — thumbs-down once in Cursor, the same pattern blocks on Claude Code, Codex, Gemini CLI, Cline, OpenCode, Amp in the next session, no copy-paste between configs.
Learning loop. A hand-written hook covers exactly the patterns you wrote. ThumbGate promotes every thumbs-down into a fresh rule, tunes existing rules' confidence weights from outcomes (Thompson Sampling, see Layer 2), and pulls semantically-near patterns into scope via local embeddings. The rule corpus sharpens without an operator hand-writing a regex for every new mistake shape.

Hand-rolled hooks are the right tool for a small, static denylist you maintain by hand. ThumbGate is the right tool when you want corrections from any agent to harden every agent automatically.
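
The learning loop above mentions Thompson Sampling for tuning soft-rule confidence weights. As a rough illustration of the technique (not ThumbGate's implementation), each soft rule can keep a Beta distribution over "this block was correct" and update it from outcomes. The `SoftRule` class and its fields are hypothetical.

```python
import random


class SoftRule:
    """Hypothetical soft rule with a Beta(alpha, beta) confidence estimate."""

    def __init__(self, name):
        self.name = name
        self.alpha = 1  # prior + outcomes where the block was correct
        self.beta = 1   # prior + outcomes where the block was a false alarm

    def update(self, block_was_correct):
        if block_was_correct:
            self.alpha += 1
        else:
            self.beta += 1

    def mean(self):
        """Posterior mean confidence."""
        return self.alpha / (self.alpha + self.beta)

    def sample(self, rng):
        """Thompson step: draw one plausible confidence from the posterior."""
        return rng.betavariate(self.alpha, self.beta)


rng = random.Random(0)
rule = SoftRule("env-file-edit")
for outcome in [True] * 8 + [False] * 2:
    rule.update(outcome)

print(rule.mean())  # 0.75
print(0.0 <= rule.sample(rng) <= 1.0)
```

A rule with few outcomes has a wide posterior, so its sampled confidence varies more; a rule with many consistent outcomes settles near its mean.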

Retrieval & latency: local-first, zero network hops

ThumbGate's latency advantage is structural rather than the result of a tuned cloud cluster: there is no retrieval service and no model on the enforcement path, so the gate decision never leaves your machine.

flowchart LR
    A["Agent about to run<br/>a tool call"] --> B{"Literal / AST match<br/>on an active rule?"}
    B -- "exact match" --> D["Deterministic gate decision<br/>(no model, on-device)"]
    B -- "no exact match, but<br/>semantically near a<br/>blocked pattern" --> C["Local CPU embeddings<br/>bge-small via LanceDB<br/>(no external API)"]
    C --> D
    D -- "known-bad" --> E["⛔ BLOCK before execution"]
    D -- "safe" --> F["✓ Allow"]

Deterministic first. Most decisions are a literal or AST pattern match against your active rules — sub-millisecond, on-device, no embeddings.
Local semantic fallback. When an action isn't on the literal block list but is semantically near one you've blocked before, ThumbGate searches the rule corpus with CPU-only bge-small embeddings via LanceDB — still local, still no external API call.
No LLM on the enforcement path. The gate never calls a model to decide block/allow. Thompson Sampling only tunes soft-rule confidence weights; hard rules always fire deterministically (see Layer 2).
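
The two-stage decision described above can be sketched as follows. This is a toy under stated assumptions: the three-dimensional vectors stand in for real bge-small embeddings, the `decide` function and the 0.85 threshold are hypothetical, and no LanceDB call is made.

```python
import math
import re


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def decide(command, command_vec, rules, threshold=0.85):
    """rules: list of (regex, embedding) pairs for blocked patterns."""
    # Stage 1: deterministic pattern match, no embeddings involved.
    for pattern, _ in rules:
        if re.search(pattern, command):
            return "BLOCK (exact)"
    # Stage 2: local semantic fallback against the blocked patterns.
    for _, rule_vec in rules:
        if cosine(command_vec, rule_vec) >= threshold:
            return "BLOCK (semantic)"
    return "ALLOW"


rules = [(r"git push.*--force", [0.9, 0.1, 0.0])]
print(decide("git push --force origin main", [0.9, 0.1, 0.0], rules))
print(decide("git push -f origin main", [0.88, 0.15, 0.05], rules))
print(decide("git status", [0.0, 0.2, 0.95], rules))
```

The `-f` variant misses the literal pattern but sits close to it in vector space, which is the case the semantic fallback exists for.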

Managed model benchmark lane

When a new managed model drops, do not swap ThumbGate over on vendor claims alone. Rank it against the actual ThumbGate workload first:

npx thumbgate model-candidates --workload=pretool-gating --json
npx thumbgate model-candidates --workload=long-trace-review --provider=openai-compatible --gateway=tinker --json

The catalog currently includes the April 23, 2026 Tinker additions:

tinker/qwen3.6-35b-a3b for pre-action gating, agentic coding, and tool-use
tinker/qwen3.6-27b for the cheap fast-path
tinker/kimi-k2.6-128k for long-trace review and multi-agent sessions

Feedback Pipeline

Agent Integration

Install for Your Agent

| Agent | Command |
| --- | --- |
| Claude Code | `npx thumbgate init --agent claude-code` |
| Cursor | `npx thumbgate init --agent cursor` |
| VS Code / Open VSX | plugins/vscode-extension/README.md |
| Antigravity-compatible | plugins/antigravity-extension/INSTALL.md |
| JetBrains | plugins/jetbrains-plugin/README.md |
| Codex | `npx thumbgate init --agent codex` |
| Gemini CLI | `npx thumbgate init --agent gemini` |
| Amp | `npx thumbgate init --agent amp` |
| Cline (Roo Code successor) | `npx thumbgate init --agent cline` |
| Claude Desktop | Download extension bundle |
| Any MCP agent | `npx thumbgate serve` |

Works with Claude Code, Cursor, Codex, Gemini CLI, Amp, Cline, OpenCode, and any MCP-compatible agent. Migrating from Roo Code (sunsetting 2026-05-15)? See adapters/cline/INSTALL.md.

Install scope: machine-wide vs per-project

ThumbGate supports two install scopes. Pick once when you install — you can switch later by re-running with the other flag.

| Scope | Command | Settings file | Lesson DB + dashboard live in | When to use |
| --- | --- | --- | --- | --- |
| Machine-wide (default) | `npx thumbgate init` | `~/.claude/settings.json` | `~/.claude/memory/feedback/` | Solo dev: one shared dashboard across every repo on this machine. A lesson learned in `repo-A` blocks the same mistake in `repo-B` automatically. |
| Per-project | `npx thumbgate init --project` (in the repo root) | `<repo>/.claude/settings.json` | `<repo>/.claude/memory/feedback/` | Client work, compliance, or multi-tenant: separate dashboard per repo, lessons stay isolated, audit trail belongs to the repo. |

Per-project lesson DBs live under each repo's .claude/memory/feedback/ and must stay gitignored — they're a runtime store, not source. ThumbGate's bundled .gitignore template handles this.
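
The two scopes differ only in which base directory holds `.claude/`. A small sketch of that mapping, using the paths from the table above (the `thumbgate_paths` helper is hypothetical and only illustrates the layout):

```python
from pathlib import PurePosixPath


def thumbgate_paths(scope, home="~", repo_root="."):
    """Return the settings file and lesson DB directory for an install scope."""
    if scope == "machine":
        base = PurePosixPath(home)
    elif scope == "project":
        base = PurePosixPath(repo_root)
    else:
        raise ValueError(f"unknown scope: {scope!r}")
    claude_dir = base / ".claude"
    return {
        "settings": claude_dir / "settings.json",
        "lesson_db": claude_dir / "memory" / "feedback",
    }


print(thumbgate_paths("machine")["settings"])
print(thumbgate_paths("project", repo_root="/work/repo-A")["lesson_db"])
```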

Status bar proof

Claude Code ThumbGate footer

Codex ThumbGate test lane

Install Codex Plugin

Install page: thumbgate.ai/codex-plugin
Direct zip: thumbgate-codex-plugin.zip
Follow: plugins/codex-profile/INSTALL.md

Install ChatGPT App / GPT Action

ChatGPT is the advice, checkpointing, and typed-feedback surface; ThumbGate's hard enforcement still runs locally in Codex, Claude Code, Cursor, Gemini CLI, Amp, OpenCode, MCP, or CI after install.

App page: thumbgate.ai/chatgpt-app
Live GPT: thumbgate.ai/go/gpt
GPT Action schema: thumbgate.ai/openapi.yaml
Follow: adapters/chatgpt/INSTALL.md

How It Works

  STEP 1              STEP 2                 STEP 3
  ────────            ────────               ────────

  You react           ThumbGate learns       The check holds

  👎 on a bad    ──►  Feedback becomes  ──►  Next time the
  agent action        a saved lesson         agent tries the
                      and a block rule       same thing:
  👍 on a good   ──►  Good pattern gets      ⛔ BLOCKED
  agent action        reinforced                 (or ✅ allowed)

No manual rule-writing. No config files. Your reactions teach the agent what your team actually wants.

ThumbGate sells four concrete outcomes:

Prevent expensive AI mistakes — catch bad commands, destructive database actions, unsafe publishes, and risky API calls before they run.
Make AI stop repeating mistakes — fix it once, turn the lesson into a rule, and block the repeat before the next tool call lands.
Turn AI into a reliable operator — move from a smart assistant that apologizes after damage to a production-ready operator with checkpoints, proof, and enforcement.
Measure prompts instead of rewriting them blindly — use thumbgate eval --from-feedback, proof lanes, ThumbGate Bench, and self-heal:check to evaluate whether prompts and workflows actually improved behavior.

Use Cases

Developer Workflows

Stop force-push to main — Check blocks git push --force on protected branches before it runs
Prevent repeated migration failures — Each mistake becomes a searchable lesson that fires before the next attempt
Block unauthorized file edits — Control which files agents can touch with path-based rules
Memory across sessions — The agent remembers your feedback from yesterday
Shared team safety — One developer's thumbs-down protects the whole team
Auto-improving without feedback — Self-improvement mode evaluates outcomes and generates rules automatically

Enterprise & Regulated Industries

Legal AI intake governance — Block unauthorized practice of law (ABA Rule 5.5), require conflict-of-interest clearance before fact collection (Rules 1.7/1.9/1.10), prevent privileged content from leaving firm boundaries (Rule 1.6)
Financial compliance — Gate AI-generated trade recommendations, block unauthorized disclosures, enforce approval chains before customer-facing outputs
Healthcare — Prevent AI agents from providing medical diagnoses, enforce HIPAA-compliant data routing, require clinician review before patient-facing content
Audit trail — Every gate decision (block, allow, reroute) is preserved with rule version, timestamp, and reviewer path for compliance review

See the legal-intake demo →

Built-in Checks

⛔ force-push          → blocks git push --force
⛔ protected-branch    → blocks direct push to main
⛔ unresolved-threads  → blocks push with open reviews
⛔ package-lock-reset  → blocks destructive lock edits
⛔ env-file-edit       → blocks .env secret exposure

+ custom prevention rules for project-specific failures

CLI Reference

npx thumbgate init                                              # detect agent, wire hooks
npx thumbgate doctor                                            # health check
npx thumbgate capture up|down "<text>"                         # capture a signal as a stored lesson (positional format)
npx thumbgate lessons                                           # see what's been learned
npx thumbgate brain --write                                     # build .thumbgate/BRAIN.md — the agent-readable context brain
npx thumbgate explore    # terminal explorer for lessons, checks, stats
npx thumbgate background-governance  # review background-agent run risk
npx thumbgate model-candidates --workload=dashboard-analysis --provider=openai --json  # evaluate GPT-5.5 routing
npx thumbgate native-messaging-audit  # inspect local browser bridges and extension hosts
npx thumbgate dashboard --open                                  # open local project-scoped dashboard in browser
thumbgate-dashboard                                             # standalone browser dashboard shortcut (run '/project:thumbgate-dashboard' in Claude/Grok)
npx thumbgate check-update                                      # check if a new version is available on npm/GitHub
npx thumbgate self-update                                       # update ThumbGate to the latest version globally
npx thumbgate serve      # start MCP server on stdio
npx thumbgate bench      # run reliability benchmark
npx thumbgate bench --programbench-smoke  # include cleanroom whole-repo proof lane
npx thumbgate break-glass --reason="ThumbGate over-fired"  # short TTL recovery for gate over-fire

Recovery if a gate over-fires

npx thumbgate break-glass --reason="ThumbGate over-fired and blocked operator recovery"

What this unlocks for up to 5 minutes:

Edits to .claude/settings.local.json, .claude/settings.json, .codex/config.toml, and the same files inside nested workspaces.
The short-lived proof gates used for PR recovery: pr_create_allowed and pr_threads_checked.

What stays gated:

Force pushes, protected-branch pushes, broad rm -rf, unsafe chmod, package publishes/releases, and local-only remote side effects.
Arbitrary protected files such as README.md, AGENTS.md, policy bundles, or credentials.
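
The recovery window above is a scoped, expiring allowance. One way to picture it, as a sketch only (the `BreakGlass` class is hypothetical; the file list and 5-minute limit come from the text above):

```python
import time

# Files the recovery window unlocks, per the list above.
UNLOCKABLE = (
    ".claude/settings.local.json",
    ".claude/settings.json",
    ".codex/config.toml",
)
TTL_SECONDS = 5 * 60


class BreakGlass:
    """Hypothetical short-lived grant that only covers agent settings files."""

    def __init__(self, reason, now=None):
        if not reason:
            raise ValueError("a reason is required")
        self.reason = reason
        start = time.time() if now is None else now
        self.expires_at = start + TTL_SECONDS

    def allows_edit(self, path, now=None):
        now = time.time() if now is None else now
        if now >= self.expires_at:
            return False
        # Also match the same files inside nested workspaces.
        return any(path == f or path.endswith("/" + f) for f in UNLOCKABLE)


grant = BreakGlass("ThumbGate over-fired", now=0)
print(grant.allows_edit(".claude/settings.json", now=100))
print(grant.allows_edit("README.md", now=100))
print(grant.allows_edit(".claude/settings.json", now=301))
```

Everything outside the allowlist stays gated even while the window is open, and the allowlisted files are gated again once it expires.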

Verify the recovery window and runtime health before continuing:

npx thumbgate break-glass --reason="verify recovery path" --json
npx thumbgate doctor

If you change MCP or hook settings, restart the affected agent session so Claude Code, Cursor, Codex, or another runtime reloads .mcp.json and local settings.

Pricing

| Feature | Free | Pro ($19/mo) | Enterprise |
| --- | --- | --- | --- |
| Local CLI + enforced checks | ✅ | ✅ | ✅ |
| Feedback captures | 2/day (10 total) | Unlimited | Unlimited |
| Active auto-promoted prevention rules | 3 | Unlimited | Unlimited |
| MCP agent integrations | All | All | All |
| Personal dashboard | - | ✅ | ✅ |
| DPO export (model fine-tuning) | - | ✅ | ✅ |
| Lesson export/import | - | ✅ | ✅ |
| Shared hosted lesson DB | - | - | ✅ |
| Org-wide dashboard | - | - | ✅ |
| Approval + audit proof | - | - | ✅ |
| Regulatory gate templates | - | - | ✅ |
| Custom policy layers (firm/practice-area) | - | - | ✅ |
| Compliance audit export | - | - | ✅ |
| Dedicated onboarding + SLA | - | - | ✅ |

Best first paid motion for teams: the Workflow Hardening Sprint — qualify one repeated failure before committing to a full rollout. Start intake →

Best first technical motion: install the CLI first and let init wire the hooks for the agent you already use.

Paid path for individual operators: ThumbGate Pro is the self-serve side lane for a personal dashboard and export-ready evidence.

Start free · See Pro · Team Sprint intake

Team Lesson Sharing (Pro + Team)

Export lessons from one project:

curl -X POST http://localhost:3456/v1/lessons/export \
  -H "Authorization: Bearer $THUMBGATE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"outputPath": "./lessons-export.json"}'

Filter by signal or tags:

curl -X POST http://localhost:3456/v1/lessons/export \
  -H "Authorization: Bearer $THUMBGATE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"signal": "down", "tags": ["push-notifications", "ci"]}'

Import into another team's ThumbGate:

curl -X POST http://localhost:3456/v1/lessons/import \
  -H "Authorization: Bearer $THUMBGATE_API_KEY" \
  -H "Content-Type: application/json" \
  -d @lessons-export.json

What happens on import:

Deduplication — lessons with the same ID or title+signal are skipped
Provenance tracking — every imported lesson is tagged team-import with original source project, export timestamp, and original ID
No overwrite — import is additive; existing lessons are never modified
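
The three import behaviors above (deduplication, provenance tagging, additive merge) can be sketched like this. The lesson field names (`id`, `title`, `signal`, `tags`, `provenance`) and the `import_lessons` function are assumptions for illustration, not the documented export schema.

```python
def import_lessons(existing, incoming, source_project, exported_at):
    """Merge incoming lessons additively; return only the lessons that were added."""
    seen_ids = {lesson["id"] for lesson in existing}
    seen_keys = {(lesson["title"], lesson["signal"]) for lesson in existing}
    added = []
    for lesson in incoming:
        key = (lesson["title"], lesson["signal"])
        if lesson["id"] in seen_ids or key in seen_keys:
            continue  # deduplicated: same ID or same title+signal
        imported = dict(lesson)  # copy, so the incoming record is not mutated
        imported["tags"] = list(lesson.get("tags", [])) + ["team-import"]
        imported["provenance"] = {
            "source_project": source_project,
            "exported_at": exported_at,
            "original_id": lesson["id"],
        }
        existing.append(imported)  # additive: existing lessons are never modified
        seen_ids.add(lesson["id"])
        seen_keys.add(key)
        added.append(imported)
    return added


local = [{"id": "a1", "title": "No force-push to main", "signal": "down"}]
export = [
    {"id": "zz", "title": "No force-push to main", "signal": "down"},
    {"id": "b2", "title": "Never edit .env", "signal": "down", "tags": ["ci"]},
]
added = import_lessons(local, export, "repo-A", "2026-05-29T00:00:00Z")
print(len(local), [lesson["id"] for lesson in added])
```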

Use cases:

Share enforcement patterns across repos in the same org
Onboard a new team with pre-built lessons from a mature project
Export lessons before a project handoff so institutional knowledge transfers
Feed lessons from multiple teams into a centralized DPO training pipeline

DPO Export for Fine-Tuning (Pro + Team)

Export DPO pairs:

curl -X POST http://localhost:3456/v1/dpo/export \
  -H "Authorization: Bearer $THUMBGATE_API_KEY" \
  -o dpo-pairs.jsonl

What you get: JSONL where each line is a preference pair:

chosen — the agent action you thumbed up
rejected — the action you thumbed down for the same task context
prompt — the originating user intent
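
Before feeding the export into a training pipeline, it is worth checking that every line carries the three fields listed above. A minimal validation sketch follows; the `load_pairs` helper and the sample record are illustrative, and only the three field names come from the text.

```python
import json

REQUIRED = {"prompt", "chosen", "rejected"}


def load_pairs(lines):
    """Parse JSONL lines and verify each record is a complete preference pair."""
    pairs = []
    for number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        record = json.loads(line)
        missing = REQUIRED - record.keys()
        if missing:
            raise ValueError(f"line {number}: missing {sorted(missing)}")
        pairs.append(record)
    return pairs


sample = [
    '{"prompt": "Deploy the fix", '
    '"chosen": "git push --force-with-lease origin feature", '
    '"rejected": "git push --force origin main"}',
]
pairs = load_pairs(sample)
print(len(pairs), pairs[0]["rejected"])
```

With a real export, pass an open file handle for `dpo-pairs.jsonl` instead of the `sample` list.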

Use cases:

Fine-tune Llama 3 / Mistral / local models with a LoRA adapter trained on your real mistakes
Feed into RLAIF or KTO pipelines (KTO export also available via /v1/kto/export)
Build a model that natively avoids your team's known failure patterns — no check at inference time needed

Why this matters: Checks block mistakes. Fine-tuning prevents them from being attempted. Combine both for belt-and-suspenders governance.

Tech Stack

| Layer | Technology |
| --- | --- |
| Storage | SQLite + FTS5, LanceDB vectors, JSONL logs |
| Capture | 2/day, 10 total on Free; unlimited on Pro, Team, and Enterprise |
| Intelligence | MemAlign dual recall, Thompson Sampling |
| Enforcement | PreToolUse hook engine, Checks config |
| Interfaces | MCP stdio, HTTP API, CLI (Node.js >=18) |
| Billing | Stripe |
| Execution | Railway, Cloudflare Workers, Docker Sandboxes |
| Governance | Workflow Sentinel, control plane, Docker Sandboxes |
