GoldenCheck

2 · 19 tools · STDIO, HTTP · registry active

Summary

This server brings zero-config data validation into Claude. Instead of writing rules upfront, it scans CSVs or tabular data to auto-discover validation issues like nulls, outliers, format problems, and drift from learned baselines. You can scan a dataset, get findings with severity and confidence scores, pin the ones you care about, and export them to YAML for CI validation. It exposes operations to profile data, create baselines, detect drift, and optionally enhance findings with LLM-generated context. Available as both stdio and HTTP transports. Note that the repo has moved to a monorepo at benzsevern/goldenmatch, so check there for active development. Useful when you need quick data quality checks without setting up heavy validation frameworks first.

Install to Claude Code

verified

claude mcp add --transport http goldencheck https://goldencheck-mcp-production.up.railway.app/mcp/

Run in your terminal. Add --scope user to make it available in every project.

Review the command, arguments, and environment values before installing — MCP servers run with your local permissions.

Tools

Verified live against the running server on Jun 10, 2026.

scan (4 params)

Scan a data file (CSV, Parquet, Excel) for data quality issues. Returns findings with severity, confidence, affected rows, and sample values. No configuration needed: rules are discovered from the data.

Parameters (* = required)

file_path* (string): Path to the data file (CSV, Parquet, Excel)

llm_boost (boolean): Enable LLM enhancement (requires API key env var). Default: false

sample_size (integer): Max rows to sample. Default: 100000

llm_provider (string): LLM provider, one of 'anthropic' or 'openai'. Default: anthropic
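As a rough illustration of how an MCP client would invoke this tool, the sketch below builds a JSON-RPC 2.0 `tools/call` request for `scan` using the parameter names listed above. The helper name `build_scan_request` and the file name `data.csv` are hypothetical; the transport (stdio or HTTP) and session setup are left out.

```python
import json


def build_scan_request(file_path, request_id=1, **options):
    """Build a JSON-RPC 2.0 `tools/call` request for the `scan` tool.

    `options` may carry the optional parameters listed above
    (llm_boost, sample_size, llm_provider).
    """
    arguments = {"file_path": file_path, **options}
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "scan", "arguments": arguments},
    }


# Hypothetical call: scan data.csv, sampling at most 50,000 rows.
request = build_scan_request("data.csv", sample_size=50000, llm_boost=False)
print(json.dumps(request, indent=2))
```

An MCP client library normally builds this envelope for you; the sketch only shows which fields carry the tool name and its arguments.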

validate (2 params)

Validate a data file against pinned rules in goldencheck.yml. Returns validation findings (existence, required, unique, enum, range checks).

Parameters (* = required)

file_path* (string): Path to the data file

config_path (string): Path to goldencheck.yml. Default: goldencheck.yml

profile (2 params)

Profile a data file and return column-level statistics: type, null%, unique%, min/max, top values, detected formats. Also returns a health score (A-F) based on finding severity.

Parameters (* = required)

file_path* (string): Path to the data file

sample_size (integer): Max rows to sample. Default: 100000

health_score (1 param)

Get the health score (A-F, 0-100) for a data file. Quick summary of overall data quality.

Parameters (* = required)

file_path* (string): Path to the data file

list_checks (no params)

List all available profiler checks and what they detect.

No parameters; call it with no arguments.

get_column_detail (2 params)

Get detailed profile and findings for a specific column.

Parameters (* = required)

column* (string): Column name to inspect

file_path* (string): Path to the data file

list_domains (no params)

List all available domain packs (healthcare, finance, ecommerce, etc.). Domain packs provide specialized semantic type definitions for specific data domains.

No parameters; call it with no arguments.

get_domain_info (1 param)

Get detailed info about a specific domain pack: lists all semantic types, their name hints, and suppression rules.

Parameters (* = required)

domain* (string): Domain pack name (e.g., healthcare, finance, ecommerce)

install_domain (2 params)

Download a community domain pack from the goldencheck-types repository and save it for use in future scans.

Parameters (* = required)

domain* (string): Domain pack name to install

output_path (string): Output path. Default: goldencheck_domain.yaml

analyze_data (1 param)

Analyze a data file to detect its domain, profile columns, and recommend a scanning strategy. Returns domain detection, column count, row count, strategy decisions, and alternative approaches.

Parameters (* = required)

file_path* (string): Path to the data file (CSV, Parquet, Excel)

auto_configure (2 params)

Scan a data file, triage findings by confidence, and generate goldencheck.yml content from the pinned findings. Optionally accepts constraints to filter or adjust the generated config.

Parameters (* = required)

file_path* (string): Path to the data file

constraints (object): Optional constraints: {min_confidence, severity_filter, include_columns, exclude_columns}

explain_finding (2 params)

Explain a single finding in natural language. Requires the finding as a JSON dict and the file_path to load a profile for context.

Parameters (* = required)

finding* (object): Finding dict with keys: severity, column, check, message, affected_rows, confidence, sample_values

file_path* (string): Path to the data file (needed for profile context)

explain_column (2 params)

Get a natural-language health narrative for a specific column. Scans the file, profiles the column, and explains all findings.

Parameters (* = required)

column* (string): Column name to explain

file_path* (string): Path to the data file

review_queue (1 param)

List all pending review items for a given job. Returns items that need human decision (medium-confidence findings).

Parameters (* = required)

job_name* (string): Job name to filter review items

approve_reject (3 params)

Approve (pin) or reject (dismiss) a review queue item. Decision must be 'pin' or 'dismiss'.

Parameters (* = required)

reason (string): Optional reason for the decision

item_id* (string): Review item ID to update

decision* (string): Decision, one of 'pin' (approve) or 'dismiss' (reject)
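The review tools imply a three-way triage: high-confidence findings are pinned, low-confidence ones dismissed, and medium-confidence ones queued for a human decision. The sketch below shows that bucketing in plain Python. It is not GoldenCheck's implementation: the numeric confidence scale and both thresholds (0.8 and 0.5) are assumptions chosen for illustration.

```python
def triage(findings, pin_at=0.8, dismiss_below=0.5):
    """Split findings into pin / review / dismiss buckets by confidence.

    The thresholds are illustrative assumptions, not GoldenCheck defaults.
    """
    buckets = {"pin": [], "review": [], "dismiss": []}
    for finding in findings:
        confidence = finding["confidence"]
        if confidence >= pin_at:
            buckets["pin"].append(finding)
        elif confidence < dismiss_below:
            buckets["dismiss"].append(finding)
        else:
            # Medium confidence: leave for a human decision.
            buckets["review"].append(finding)
    return buckets


# Hypothetical findings using a subset of the keys listed for explain_finding.
sample_findings = [
    {"column": "email", "check": "format", "confidence": 0.95},
    {"column": "age", "check": "range", "confidence": 0.60},
    {"column": "notes", "check": "nullability", "confidence": 0.20},
]
buckets = triage(sample_findings)
print({name: [f["column"] for f in items] for name, items in buckets.items()})
```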

compare_domains (1 param)

Scan a file with every available domain pack (plus base/no-domain) and compare health scores. Recommends the best-fitting domain.

Parameters (* = required)

file_path* (string): Path to the data file

suggest_fix (2 params)

Preview fixes for a data file without applying them. Shows what would change (columns, fix types, rows affected, before/after samples).

Parameters (* = required)

mode (string): Fix mode, one of 'safe' or 'aggressive'. Default: safe

file_path* (string): Path to the data file

pipeline_handoff (2 params)

Generate a structured quality attestation JSON for a data file. Includes health score, findings summary, pinned rules, and attestation status (PASS, PASS_WITH_WARNINGS, REVIEW_REQUIRED, FAIL).

Parameters (* = required)

job_name* (string): Job name for the handoff record

file_path* (string): Path to the data file

review_stats (1 param)

Get review queue statistics for a job: counts of pending, pinned, and dismissed items.

Parameters (* = required)

job_name* (string): Job name to get stats for

Moved. This repo has moved into the benzsevern/goldenmatch monorepo at packages/python/goldencheck (and packages/typescript/goldencheck). This repo is archived; new development happens in the monorepo.

GoldenCheck

Data validation that discovers rules from your data so you don't have to write them. Built by Ben Severn.

Every competitor makes you write rules first. GoldenCheck flips it: validate first, keep the rules you care about.

Why GoldenCheck?

	GoldenCheck	Great Expectations	Pandera	Pointblank
Rules	Discovered from data	Written by hand	Written by hand	Written by hand
Config	Zero to start	Heavy YAML/Python setup	Decorators/schemas	YAML/Python
Interface	CLI + interactive TUI	HTML reports	Exceptions	HTML/notebook
Learning curve	One command	Hours/days	Moderate	Moderate
LLM enhancement	Yes ($0.01/scan)	No	No	No
Fix suggestions	Yes, in TUI	No	No	No
Confidence scoring	Yes (H/M/L per finding)	No	No	No
DQBench Score	88.40	21.68 (best-effort)	32.51 (best-effort)	6.94 (auto)

Install

pip install goldencheck

With LLM boost support:

pip install goldencheck[llm]

With deep profiling & baseline support (scipy, numpy):

pip install goldencheck[baseline]

With semantic type inference for baseline (sentence-transformers):

pip install goldencheck[baseline,semantic]

JavaScript / TypeScript

npm install goldencheck

Edge-safe core (browsers, Cloudflare Workers, Vercel Edge):

import { scanData, TabularData } from "goldencheck/core";

Node.js (file reading, CLI, MCP):

import { readFile, scanData } from "goldencheck/node";

Quick Start

# Scan a file — discovers issues, launches interactive TUI
goldencheck data.csv

# CLI-only output (no TUI)
goldencheck data.csv --no-tui

# With LLM enhancement (requires API key)
goldencheck data.csv --llm-boost --no-tui

# Validate against saved rules (for CI/pipelines)
goldencheck validate data.csv

# JSON output for CI integration
goldencheck data.csv --no-tui --json

# Learn baseline (one-time, deep analysis)
goldencheck baseline data.csv

# Scan with drift detection (fast, uses saved baseline)
goldencheck scan new_data.csv

TypeScript Quick Start

// Scan an array of records (edge-safe — works anywhere)
import { scanData, TabularData, Severity } from "goldencheck";

const data = new TabularData([
  { id: 1, email: "alice@example.com", age: 30, status: "active" },
  { id: 2, email: "bob@test.com", age: -5, status: "inactive" },
  { id: 3, email: "not-an-email", age: 25, status: "active" },
]);

const { findings, profile } = scanData(data);
for (const f of findings) {
  console.log(`[${f.severity === Severity.ERROR ? "ERROR" : "WARNING"}] ${f.column}: ${f.message}`);
}

// Scan a CSV file (Node.js)
import { readFile, scanData, applyConfidenceDowngrade, healthScore } from "goldencheck/node";

const data = readFile("data.csv");
const result = scanData(data, { domain: "healthcare" });
const findings = applyConfidenceDowngrade(result.findings, false);

// Health score
const byCol = {};
for (const f of findings) {
  if (f.severity >= 2) {
    byCol[f.column] ??= { errors: 0, warnings: 0 };
    byCol[f.column][f.severity === 3 ? "errors" : "warnings"]++;
  }
}
const { grade, points } = healthScore(byCol);
console.log(`Health: ${grade} (${points}/100)`);

// Validate against pinned rules
import { readFile, scanData, validateConfig, validateData } from "goldencheck/node";
import { readFileSync } from "node:fs";
import YAML from "yaml";

const config = validateConfig(YAML.parse(readFileSync("goldencheck.yml", "utf-8")));
const data = readFile("data.csv");
const findings = validateData(data, config);

// Create baseline and detect drift
import { readFile, createBaseline, serializeBaseline, scanData } from "goldencheck/node";
import { runDriftChecks, deserializeBaseline } from "goldencheck";
import { writeFileSync, readFileSync } from "node:fs";

// Learn baseline
const data = readFile("reference.csv");
const baseline = createBaseline(data);
writeFileSync("baseline.json", serializeBaseline(baseline));

// Later: detect drift
const newData = readFile("production.csv");
const saved = deserializeBaseline(readFileSync("baseline.json", "utf-8"));
const driftFindings = runDriftChecks(newData, saved);

// LLM-enhanced scanning (edge-safe)
import { scanData, TabularData, callLlm, parseLlmResponse, mergeLlmFindings, buildSampleBlocks } from "goldencheck";

const data = new TabularData(records);
const result = scanData(data, { returnSample: true });
const blocks = buildSampleBlocks(result.sample, result.findings);
const { text } = await callLlm("anthropic", JSON.stringify(blocks));
const llmResponse = parseLlmResponse(text);
if (llmResponse) {
  const enhanced = mergeLlmFindings(result.findings, llmResponse);
}

How It Works

1. SCAN     →  goldencheck data.csv
                GoldenCheck profiles your data and discovers what "healthy" looks like

2. REVIEW   →  Interactive TUI shows findings sorted by severity
                Each finding has: description, affected rows, sample values

3. PIN      →  Press Space to promote findings into permanent rules
                Dismiss false positives — they won't come back

4. EXPORT   →  Press F2 to save rules to goldencheck.yml
                Human-readable YAML with your pinned rules

5. VALIDATE →  goldencheck validate data.csv
                Enforce rules in CI with exit codes (0 = pass, 1 = fail)
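Step 5 comes down to mapping findings onto a process exit code. Here is a minimal sketch of that gate, assuming findings arrive as dicts with a `severity` string. The names `info`, `warning`, and `error` and their ranking are assumptions that mirror the numeric comparison in the TypeScript example, not a documented schema.

```python
# Assumed severity ranking (higher is worse); not taken from GoldenCheck's source.
SEVERITY_RANK = {"info": 1, "warning": 2, "error": 3}


def exit_code(findings, fail_on="error"):
    """Return 1 if any finding is at or above the `fail_on` level, else 0."""
    threshold = SEVERITY_RANK[fail_on]
    failed = any(
        SEVERITY_RANK.get(finding["severity"], 0) >= threshold
        for finding in findings
    )
    return 1 if failed else 0


# Hypothetical findings: one warning, no errors.
ci_findings = [{"column": "age", "severity": "warning"}]
print(exit_code(ci_findings, fail_on="error"))    # 0: warnings do not fail the build
print(exit_code(ci_findings, fail_on="warning"))  # 1: warnings fail the build
```

In a CI job, the return value would be passed to `sys.exit` so the pipeline step fails when the gate trips.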

What It Detects

Column-Level Profilers

Profiler	What It Catches	Example
Type inference	String columns that are actually numeric	"Column `age` is string but 98% are integer"
Nullability	Required vs. optional columns	"0 nulls across 50k rows — likely required"
Uniqueness	Primary key candidates, near-duplicates	"100% unique — likely primary key"
Format detection	Emails, phones, URLs, dates	"94% email format, 6% malformed"
Range & distribution	Outliers, min/max bounds	"3 rows have values >10,000"
Cardinality	Low-cardinality enum suggestions	"4 unique values — possible enum"
Pattern consistency	Mixed formats within a column	"3 phone formats detected"
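To make the format-detection row concrete, the sketch below computes the share of non-null values in a column that look like email addresses. The regular expression is a deliberately loose assumption for illustration; GoldenCheck's actual detection logic is not shown in this document.

```python
import re

# Loose illustrative pattern: something@something.something, no whitespace.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def email_format_share(values):
    """Return the fraction of non-null values that match the email pattern."""
    non_null = [v for v in values if v is not None and v != ""]
    if not non_null:
        return 0.0
    matches = sum(1 for v in non_null if EMAIL_RE.match(v))
    return matches / len(non_null)


values = ["alice@example.com", "bob@test.com", "not-an-email", None]
share = email_format_share(values)
print(f"{share:.0%} email format, {1 - share:.0%} malformed")
# prints "67% email format, 33% malformed"
```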

Cross-Column Profilers

Profiler	What It Catches
Temporal ordering	start_date > end_date violations
Null correlation	Columns that are null together (e.g., address + city + zip)
Numeric cross-column	value > max violations (e.g., claim_amount > policy_max)
Age vs DOB	Age column doesn't match calculated age from date_of_birth
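A temporal-ordering check of the kind listed above can be sketched in a few lines: flag every row whose start value is later than its end value. The function and column names are hypothetical, and rows with a missing value on either side are skipped by assumption.

```python
from datetime import date


def temporal_order_violations(rows, start_col, end_col):
    """Return indexes of rows where start_col is later than end_col."""
    violations = []
    for index, row in enumerate(rows):
        start, end = row.get(start_col), row.get(end_col)
        # Rows with a missing value on either side are not counted.
        if start is not None and end is not None and start > end:
            violations.append(index)
    return violations


rows = [
    {"start_date": date(2024, 1, 1), "end_date": date(2024, 3, 1)},
    {"start_date": date(2024, 5, 1), "end_date": date(2024, 4, 1)},  # violation
    {"start_date": date(2024, 6, 1), "end_date": None},              # skipped
]
print(temporal_order_violations(rows, "start_date", "end_date"))  # [1]
```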

Baseline Deep Profiling & Drift Detection

Run goldencheck baseline once to build a statistical profile of healthy data. On every subsequent scan, GoldenCheck compares the new data against the saved baseline and reports drift across 13 check types, including:

Check Type	What It Catches
`distribution_drift`	Value distribution has shifted significantly
`entropy_drift`	Entropy of column values has changed
`bound_violation`	Values exceed historical min/max bounds
`benford_drift`	Leading-digit distribution deviates from Benford's Law
`fd_violation`	Functional dependency between columns is broken
`key_uniqueness_loss`	Previously unique column now has duplicates
`temporal_order_drift`	Historical column ordering constraint violated
`type_drift`	Dominant semantic type of column has changed
`correlation_break`	Previously correlated columns are no longer correlated
`new_correlation`	New unexpected correlation appeared
`pattern_drift`	Value format/pattern distribution has shifted
`new_pattern`	New structural patterns appeared in a column

The baseline is built using 6 techniques: statistical profiler (distributions, Benford's Law, entropy), constraint miner (functional dependencies, temporal orders), semantic type inferrer (embeddings + keywords), correlation analyzer (Pearson, Cramér's V), pattern grammar inducer, and confidence prior builder.
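As one worked example of these checks, the sketch below measures how far a column's leading-digit distribution sits from Benford's Law, where digit d is expected with probability log10(1 + 1/d). The deviation metric (largest absolute gap per digit) is an assumption for illustration; the document does not say how `benford_drift` is scored.

```python
import math
from collections import Counter

# Expected leading-digit probabilities under Benford's Law.
BENFORD = {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def leading_digit(value):
    """Return the first non-zero digit of a number, or None for zero."""
    for ch in str(abs(value)):
        if ch in "123456789":
            return int(ch)
    return None


def benford_deviation(values):
    """Largest absolute gap between observed and expected digit frequency."""
    digits = [d for d in map(leading_digit, values) if d is not None]
    if not digits:
        return 0.0
    counts = Counter(digits)
    total = len(digits)
    return max(abs(counts.get(d, 0) / total - BENFORD[d]) for d in range(1, 10))


# Powers of two track Benford's Law closely; a flat range of 3-digit numbers does not.
benford_like = [2 ** n for n in range(1, 200)]
uniform_like = list(range(100, 1000))
print(round(benford_deviation(benford_like), 3))
print(round(benford_deviation(uniform_like), 3))
```

A drift check would compare this deviation for new data against the value recorded in the baseline and raise a finding when the gap exceeds a chosen tolerance.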

Domain Packs

Improve detection accuracy with domain-specific type definitions:

goldencheck scan data.csv --domain healthcare   # NPI, ICD, insurance, patient types
goldencheck scan data.csv --domain finance      # accounts, routing, CUSIP, transactions
goldencheck scan data.csv --domain ecommerce    # SKUs, orders, tracking, products

Domain packs add semantic types that reduce false positives and improve classification for industry-specific data.

Schema Diff

Compare two versions of a data file:

goldencheck diff data.csv                  # compare against git HEAD
goldencheck diff old.csv new.csv           # compare two files
goldencheck diff data.csv --ref main       # compare against a branch

Auto-Fix

Apply automated fixes to clean your data:

goldencheck fix data.csv                          # safe: trim, normalize, fix encoding
goldencheck fix data.csv --mode moderate          # + standardize case
goldencheck fix data.csv --mode aggressive --force # + coerce types
goldencheck fix data.csv --dry-run                # preview changes

Watch Mode

Continuously monitor a directory for data quality:

goldencheck watch data/ --interval 30        # re-scan every 30s
goldencheck watch data/ --exit-on error      # CI mode: fail on first error

REST API

Run GoldenCheck as a microservice:

goldencheck serve --port 8000

# Scan via file upload
curl -X POST http://localhost:8000/scan --data-binary @data.csv

# Scan via URL
curl -X POST http://localhost:8000/scan/url -d '{"url": "https://example.com/data.csv"}'
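The same two calls can be prepared from Python with the standard library. This sketch only builds the requests, mirroring the curl commands above; the response format is not documented here, so nothing is parsed, and the helper names are hypothetical.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # matches `goldencheck serve --port 8000`


def build_scan_request(csv_bytes):
    """POST raw file bytes to /scan, like `curl --data-binary @data.csv`."""
    return urllib.request.Request(f"{BASE_URL}/scan", data=csv_bytes, method="POST")


def build_scan_url_request(data_url):
    """POST a JSON body with the file URL to /scan/url."""
    body = json.dumps({"url": data_url}).encode("utf-8")
    return urllib.request.Request(f"{BASE_URL}/scan/url", data=body, method="POST")


upload = build_scan_request(b"id,email\n1,alice@example.com\n")
by_url = build_scan_url_request("https://example.com/data.csv")
print(upload.get_method(), upload.full_url)
print(by_url.get_method(), by_url.full_url)

# To send a request against a running server:
# with urllib.request.urlopen(upload) as response:
#     print(response.read().decode("utf-8"))
```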

Database Scanning

Scan tables directly — no CSV export needed:

pip install goldencheck[db]
goldencheck scan-db "postgresql://user:pass@host/db" --table orders
goldencheck scan-db "snowflake://..." --query "SELECT * FROM orders WHERE date > '2024-01-01'"

Scheduled Runs

Cron-like scheduling with webhook notifications:

goldencheck schedule data/*.csv --interval hourly --webhook https://hooks.slack.com/...
goldencheck schedule data/*.csv --interval daily --notify-on grade-drop

LLM Boost

Add --llm-boost to enhance profiler findings with LLM intelligence. The LLM receives a representative sample of your data and:

Finds issues profilers miss — semantic understanding (e.g., "12345" in a name column)
Upgrades severity — knows "emails should be required" even if the profiler only says "INFO"
Discovers relationships — identifies temporal ordering between columns like signup_date and last_login
Downgrades false positives — "mixed phone formats are common, not an error"

# Using OpenAI
export OPENAI_API_KEY=sk-...
goldencheck data.csv --llm-boost --llm-provider openai --no-tui

# Using Anthropic
export ANTHROPIC_API_KEY=sk-ant-...
goldencheck data.csv --llm-boost --no-tui

Cost: ~$0.01 per scan (one API call with representative samples, not per-row).

Budget control:

export GOLDENCHECK_LLM_BUDGET=0.50  # max spend per scan in USD

Configuration (goldencheck.yml)

version: 1

settings:
  sample_size: 100000
  fail_on: error

columns:
  email:
    type: string
    required: true
    format: email
    unique: true

  age:
    type: integer
    range: [0, 120]

  status:
    type: string
    enum: [active, inactive, pending, closed]

relations:
  - type: temporal_order
    columns: [start_date, end_date]

ignore:
  - column: notes
    check: nullability

Only pinned rules appear in this file — not every finding. The ignore list prevents dismissed findings from reappearing.
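To show what enforcing such a file involves, the sketch below applies `required`, `unique`, `range`, and `enum` rules to a list of records. It is a simplified stand-in, not GoldenCheck's validator: the config is written as a Python dict mirroring the YAML above, and the `(row index, column, rule)` result shape is an assumption.

```python
# Python mirror of part of the YAML config above.
config = {
    "columns": {
        "email": {"required": True, "unique": True},
        "age": {"range": [0, 120]},
        "status": {"enum": ["active", "inactive", "pending", "closed"]},
    }
}


def validate_rows(rows, config):
    """Return (row_index, column, rule) tuples for every rule violation."""
    problems = []
    for column, rules in config["columns"].items():
        seen = set()
        for index, row in enumerate(rows):
            value = row.get(column)
            if value is None:
                if rules.get("required"):
                    problems.append((index, column, "required"))
                continue  # remaining rules need a value
            if "range" in rules:
                low, high = rules["range"]
                if not low <= value <= high:
                    problems.append((index, column, "range"))
            if "enum" in rules and value not in rules["enum"]:
                problems.append((index, column, "enum"))
            if rules.get("unique"):
                if value in seen:
                    problems.append((index, column, "unique"))
                seen.add(value)
    return problems


records = [
    {"email": "a@x.com", "age": 30, "status": "active"},
    {"email": "a@x.com", "age": -5, "status": "archived"},
    {"email": None, "age": 25, "status": "pending"},
]
for problem in validate_rows(records, config):
    print(problem)
```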

CLI Reference

Command	Description
`goldencheck <file>`	Scan and launch TUI
`goldencheck scan <file>`	Explicit scan (supports `--smart`, `--guided`)
`goldencheck validate <file>`	Validate against goldencheck.yml
`goldencheck review <file>`	Scan + validate, launch TUI
`goldencheck init <file>`	Interactive setup wizard (scan → config → CI)
`goldencheck diff <file> [file2]`	Compare two files or against git HEAD
`goldencheck watch <dir>`	Poll directory, re-scan on change
`goldencheck fix <file>`	Auto-fix data quality issues
`goldencheck baseline <file>`	Deep-profile data and save statistical baseline to YAML
`goldencheck learn <file>`	Generate LLM validation rules
`goldencheck history`	Show scan history and trends
`goldencheck serve`	Start REST API server
`goldencheck scan-db <conn>`	Scan a database table directly
`goldencheck schedule <files>`	Run scans on a cron schedule
`goldencheck mcp-serve`	Start MCP server (19 tools)

Flags

Flag	Description
`--no-tui`	Print results to console
`--json`	JSON output
`--fail-on <level>`	Exit 1 on severity: `error` or `warning`
`--domain <name>`	Domain pack: `healthcare`, `finance`, `ecommerce`
`--llm-boost`	Enable LLM enhancement
`--llm-provider <name>`	LLM provider: `anthropic` (default) or `openai`
`--mode <level>`	Fix mode: `safe`, `moderate`, `aggressive`
`--smart`	Auto-triage: pin high-confidence, dismiss low
`--guided`	Walk through findings one-by-one
`--webhook <url>`	POST findings to Slack/PagerDuty/any URL
`--notify-on <trigger>`	Webhook trigger: `grade-drop`, `any-error`, `any-warning`
`--baseline <path>`	Path to baseline YAML for drift detection
`--no-baseline`	Skip auto-discovery of `goldencheck_baseline.yaml`
`--skip <technique>`	Skip a baseline technique (can repeat)
`--update`	Update existing baseline instead of overwriting
`-o <path>`	Output path for baseline file (default: `goldencheck_baseline.yaml`)
`--version`	Show version

TypeScript CLI

npx goldencheck-js scan data.csv --json
npx goldencheck-js scan data.csv --domain healthcare
npx goldencheck-js health-score data.csv
npx goldencheck-js profile data.csv
npx goldencheck-js validate data.csv --config goldencheck.yml
npx goldencheck-js baseline data.csv --output baseline.json
npx goldencheck-js fix data.csv --mode safe
npx goldencheck-js diff old.csv new.csv
npx goldencheck-js demo

TypeScript Architecture

goldencheck (npm)
├── goldencheck/core    # Edge-safe: browsers, Workers, Edge Runtime
│   ├── types           # Finding, Severity, DatasetProfile, Config types
│   ├── data            # TabularData — zero-dep columnar abstraction
│   ├── profilers       # 10 column profilers + 4 relation profilers
│   ├── semantic        # Type classifier, suppression, 3 domain packs
│   ├── engine          # Scanner, confidence, validator, triage, differ, fixer
│   ├── baseline        # Statistical profiling, constraints, correlation, patterns
│   ├── drift           # 13 drift checks against saved baseline
│   ├── llm             # Anthropic + OpenAI via fetch(), merger, budget
│   ├── agent           # Strategy, handoff, review queue
│   └── reporters       # JSON, CI
└── goldencheck/node    # Node.js >= 20
    ├── reader          # CSV, Parquet (via nodejs-polars)
    ├── mcp             # MCP server (7 tools)
    ├── a2a             # Agent-to-Agent HTTP server
    ├── tui             # ANSI terminal output
    ├── db-scanner      # Postgres, MySQL, SQLite
    └── watcher         # Directory polling

Benchmarks

Speed

Dataset	Time	Throughput
1K rows	0.05s	19K rows/sec
10K rows	0.23s	43K rows/sec
100K rows	2.29s	44K rows/sec
1M rows	2.07s	482K rows/sec

DQBench v1.0 — Head-to-Head

Tool	Mode	DQBench Score
GoldenCheck	zero-config	88.40
Pandera	best-effort rules	32.51
Soda Core	best-effort rules	22.36
Great Expectations	best-effort rules	21.68

GoldenCheck's zero-config discovery outperforms every competitor — even when they have hand-written rules.

Run the benchmark yourself:

pip install dqbench goldencheck
dqbench run goldencheck

Detection Accuracy

Mode	Column Recall	Cost
Profiler-only (v0.1.0)	87%	$0
Profiler-only (v0.2.0 with confidence)	100%	$0
With LLM Boost	100%	~$0.003-0.01

Tested on a custom benchmark with 341 planted data quality issues across 9 categories.

v0.2.0 improvements: minority wrong-type detection, range profiler chaining, broader temporal heuristics, and confidence scoring pushed profiler-only recall from 87% to 100%.

Raha Benchmark Datasets

Dataset	Column Recall
Flights (2,376 rows)	100% (4/4 columns)
Beers (2,410 rows)	80% (4/5 columns)

Tech Stack

Dependency	Purpose
Polars	All data operations
Typer	CLI framework
Textual	Interactive TUI
Rich	CLI output formatting
Pydantic 2	Config validation

Optional: Anthropic SDK / OpenAI SDK for LLM Boost | MCP SDK for MCP server | scipy + numpy for deep baseline profiling ([baseline]) | sentence-transformers for semantic type inference in baseline ([semantic])

TypeScript / Node.js

Dependency	Purpose
Zero runtime deps	Core package has no dependencies (edge-safe)
nodejs-polars	Parquet reading (optional, Node.js only)
csv-parse	CSV reading (Node.js only)
@modelcontextprotocol/sdk	MCP server (Node.js only)

MCP Server (Claude Desktop)

GoldenCheck includes an MCP server for Claude Desktop integration:

pip install goldencheck[mcp]

Add to your Claude Desktop config (claude_desktop_config.json):

{
  "mcpServers": {
    "goldencheck": {
      "command": "goldencheck",
      "args": ["mcp-serve"]
    }
  }
}

Available tools:

Tool	Description
`scan`	Scan a file for data quality issues (with optional LLM boost)
`validate`	Validate against pinned rules in goldencheck.yml
`profile`	Get column-level statistics and health score
`health_score`	Quick A-F grade for a data file
`get_column_detail`	Deep-dive into a specific column
`list_checks`	List all available profiler checks

Remote MCP Server

GoldenCheck is available as a hosted MCP server on Smithery — connect from any MCP client without installing anything.

Claude Desktop / Claude Code:

{
  "mcpServers": {
    "goldencheck": {
      "url": "https://goldencheck-mcp-production.up.railway.app/mcp/"
    }
  }
}

Local server:

pip install goldencheck[mcp]
goldencheck mcp-serve

19 tools available: scan files, validate rules, profile columns, health-score datasets, auto-configure validation, explain findings, compare domains, suggest fixes.

Jupyter / Colab

GoldenCheck renders rich HTML in Jupyter notebooks:

from goldencheck.engine.scanner import scan_file
from goldencheck.engine.confidence import apply_confidence_downgrade
from goldencheck.notebook import ScanResult

findings, profile = scan_file("data.csv")
findings = apply_confidence_downgrade(findings, llm_boost=False)

# Rich HTML display in notebooks
ScanResult(findings=findings, profile=profile)

API Quick Reference

Python

import goldencheck

# Scan a CSV for quality issues (scan_file returns a (findings, profile) pair)
findings, profile = goldencheck.scan_file("data.csv")
for f in findings:
    print(f"[{f.severity}] {f.column}: {f.check} — {f.message}")

# Create baseline and detect drift
from goldencheck import create_baseline, scan_file
baseline = create_baseline("data.csv")
baseline.save("goldencheck_baseline.yaml")
findings, profile = scan_file("data.csv", baseline="goldencheck_baseline.yaml")

# Health score
score = goldencheck.health_score("data.csv")
print(score)  # e.g. "B (78/100)"

TypeScript

import { scanData, TabularData, Severity } from "goldencheck";

// Scan records (edge-safe)
const data = new TabularData(records);
const { findings, profile } = scanData(data);
for (const f of findings) {
  console.log(`[${f.severity === Severity.ERROR ? "ERROR" : "WARNING"}] ${f.column}: ${f.message}`);
}

import { readFile, scanData, applyConfidenceDowngrade, healthScore } from "goldencheck/node";

// Scan a CSV file (Node.js)
const data = readFile("data.csv");
const result = scanData(data, { domain: "healthcare" });
const findings = applyConfidenceDowngrade(result.findings, false);

// Health score
const byCol = {};
for (const f of findings) {
  if (f.severity >= 2) {
    byCol[f.column] ??= { errors: 0, warnings: 0 };
    byCol[f.column][f.severity === 3 ? "errors" : "warnings"]++;
  }
}
const { grade, points } = healthScore(byCol);
console.log(`Health: ${grade} (${points}/100)`);

import { readFile, createBaseline, serializeBaseline } from "goldencheck/node";
import { runDriftChecks, deserializeBaseline } from "goldencheck";
import { writeFileSync, readFileSync } from "node:fs";

// Create baseline and detect drift
const data = readFile("reference.csv");
const baseline = createBaseline(data);
writeFileSync("baseline.json", serializeBaseline(baseline));

const newData = readFile("production.csv");
const saved = deserializeBaseline(readFileSync("baseline.json", "utf-8"));
const driftFindings = runDriftChecks(newData, saved);

Contributing

See CONTRIBUTING.md for development setup and guidelines.

Author

Ben Severn

License

MIT — see LICENSE

Part of the Golden Suite

Tool	Purpose	Install
GoldenCheck	Validate & profile data quality	`pip install goldencheck` / `npm install goldencheck`
GoldenFlow	Transform & standardize data	`pip install goldenflow`
GoldenMatch	Deduplicate & match records	`pip install goldenmatch`
GoldenPipe	Orchestrate the full pipeline	`pip install goldenpipe`

Companion projects:

dbt-goldencheck — data validation as a dbt test.
goldencheck-types — community-contributed domain type packs.
goldencheck-action — GitHub Action for CI with PR comments.

}

// Scan a CSV file (Node.js)
import { readFile, scanData, applyConfidenceDowngrade, healthScore } from "goldencheck/node";

const data = readFile("data.csv");
const result = scanData(data, { domain: "healthcare" });
const findings = applyConfidenceDowngrade(result.findings, false);

// Health score
const byCol: Record<string, { errors: number; warnings: number }> = {};
for (const f of findings) {
  if (f.severity >= 2) {
    byCol[f.column] ??= { errors: 0, warnings: 0 };
    byCol[f.column][f.severity === 3 ? "errors" : "warnings"]++;
  }
}
const { grade, points } = healthScore(byCol);
console.log(`Health: ${grade} (${points}/100)`);

// Validate against pinned rules
import { readFile, scanData, validateConfig, validateData } from "goldencheck/node";
import { readFileSync } from "node:fs";
import YAML from "yaml";

const config = validateConfig(YAML.parse(readFileSync("goldencheck.yml", "utf-8")));
const data = readFile("data.csv");
const findings = validateData(data, config);

// Create baseline and detect drift
import { readFile, createBaseline, serializeBaseline } from "goldencheck/node";
import { runDriftChecks, deserializeBaseline } from "goldencheck";
import { writeFileSync, readFileSync } from "node:fs";

// Learn baseline
const data = readFile("reference.csv");
const baseline = createBaseline(data);
writeFileSync("baseline.json", serializeBaseline(baseline));

// Later: detect drift
const newData = readFile("production.csv");
const saved = deserializeBaseline(readFileSync("baseline.json", "utf-8"));
const driftFindings = runDriftChecks(newData, saved);

// LLM-enhanced scanning (edge-safe)
import { scanData, TabularData, callLlm, parseLlmResponse, mergeLlmFindings, buildSampleBlocks } from "goldencheck";

const data = new TabularData(records);
const result = scanData(data, { returnSample: true });
const blocks = buildSampleBlocks(result.sample, result.findings);
const { text } = await callLlm("anthropic", JSON.stringify(blocks));
const llmResponse = parseLlmResponse(text);
if (llmResponse) {
  const enhanced = mergeLlmFindings(result.findings, llmResponse);
}

How It Works

1. SCAN     →  goldencheck data.csv
                GoldenCheck profiles your data and discovers what "healthy" looks like

2. REVIEW   →  Interactive TUI shows findings sorted by severity
                Each finding has: description, affected rows, sample values

3. PIN      →  Press Space to promote findings into permanent rules
                Dismiss false positives — they won't come back

4. EXPORT   →  Press F2 to save rules to goldencheck.yml
                Human-readable YAML with your pinned rules

5. VALIDATE →  goldencheck validate data.csv
                Enforce rules in CI with exit codes (0 = pass, 1 = fail)
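
The exit-code contract in step 5 is what makes the validate step usable as a CI gate. The following is a minimal sketch of such a gate, assuming only the documented codes (0 = pass, 1 = fail) and treating any other non-zero code as a failure. The helper name `run_gate` is hypothetical and not part of GoldenCheck.

```python
import subprocess
import sys


def run_gate(command: list) -> bool:
    """Run a validation command and report whether it passed.

    Hypothetical helper: it relies only on the documented exit codes
    (0 = pass, 1 = fail) and treats any other non-zero code as a failure.
    """
    completed = subprocess.run(command, capture_output=True, text=True)
    if completed.returncode != 0:
        # Surface whatever the tool printed so the CI log explains the failure.
        sys.stderr.write(completed.stdout)
        sys.stderr.write(completed.stderr)
    return completed.returncode == 0
```

In a pipeline this would be called as `run_gate(["goldencheck", "validate", "data.csv"])`, with the job failing when it returns `False`.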

What It Detects

Column-Level Profilers

Profiler	What It Catches	Example
Type inference	String columns that are actually numeric	"Column `age` is string but 98% are integer"
Nullability	Required vs. optional columns	"0 nulls across 50k rows — likely required"
Uniqueness	Primary key candidates, near-duplicates	"100% unique — likely primary key"
Format detection	Emails, phones, URLs, dates	"94% email format, 6% malformed"
Range & distribution	Outliers, min/max bounds	"3 rows have values >10,000"
Cardinality	Low-cardinality enum suggestions	"4 unique values — possible enum"
Pattern consistency	Mixed formats within a column	"3 phone formats detected"
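
To make the type inference row concrete, here is a small illustrative sketch of how a message like "Column `age` is string but 98% are integer" could be derived. It is not GoldenCheck's implementation: the function name `integer_share` and the 90% threshold are assumptions chosen for the example.

```python
from typing import Optional, Sequence


def integer_share(values: Sequence[Optional[str]]) -> float:
    """Return the fraction of non-empty string values that parse as integers."""
    non_null = [v for v in values if v is not None and v.strip() != ""]
    if not non_null:
        return 0.0

    def is_int(text: str) -> bool:
        try:
            int(text.strip())
            return True
        except ValueError:
            return False

    return sum(is_int(v) for v in non_null) / len(non_null)


# 49 integer-looking strings plus one stray token.
ages = [str(n) for n in range(18, 67)] + ["n/a"]
share = integer_share(ages)
if share >= 0.9:  # threshold is an assumption for illustration
    print(f"Column `age` is string but {share:.0%} are integer")
```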

Cross-Column Profilers

Profiler	What It Catches
Temporal ordering	start_date > end_date violations
Null correlation	Columns that are null together (e.g., address + city + zip)
Numeric cross-column	value > max violations (e.g., claim_amount > policy_max)
Age vs DOB	Age column doesn't match calculated age from date_of_birth
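
The temporal ordering check in the table above can be sketched in a few lines. This is an illustration of the idea only, under the assumption that rows with a missing date are skipped rather than flagged; the function name is hypothetical.

```python
from datetime import date


def temporal_order_violations(rows, start_key, end_key):
    """Return indices of rows where the start date is later than the end date.

    Rows with a missing value in either column are skipped.
    """
    violations = []
    for index, row in enumerate(rows):
        start, end = row.get(start_key), row.get(end_key)
        if start is None or end is None:
            continue
        if start > end:
            violations.append(index)
    return violations


rows = [
    {"start_date": date(2024, 1, 1), "end_date": date(2024, 3, 1)},
    {"start_date": date(2024, 5, 9), "end_date": date(2024, 5, 1)},
    {"start_date": date(2024, 6, 1), "end_date": None},
]
print(temporal_order_violations(rows, "start_date", "end_date"))  # [1]
```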

Baseline Deep Profiling & Drift Detection

Check Type	What It Catches
`distribution_drift`	Value distribution has shifted significantly
`entropy_drift`	Entropy of column values has changed
`bound_violation`	Values exceed historical min/max bounds
`benford_drift`	Leading-digit distribution deviates from Benford's Law
`fd_violation`	Functional dependency between columns is broken
`key_uniqueness_loss`	Previously unique column now has duplicates
`temporal_order_drift`	Historical column ordering constraint violated
`type_drift`	Dominant semantic type of column has changed
`correlation_break`	Previously correlated columns are no longer correlated
`new_correlation`	New unexpected correlation appeared
`pattern_drift`	Value format/pattern distribution has shifted
`new_pattern`	New structural patterns appeared in a column
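
As an illustration of what a check like `entropy_drift` measures, the sketch below compares the Shannon entropy of a column in a reference sample and in new data. The function names and the 0.5-bit tolerance are assumptions for this example; the source does not state how GoldenCheck computes or thresholds the change.

```python
import math
from collections import Counter


def shannon_entropy(values) -> float:
    """Shannon entropy, in bits, of the empirical value distribution."""
    counts = Counter(values)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def entropy_drifted(baseline_values, new_values, tolerance: float = 0.5) -> bool:
    """True when entropy moved by more than `tolerance` bits (assumed threshold)."""
    delta = shannon_entropy(new_values) - shannon_entropy(baseline_values)
    return abs(delta) > tolerance


# Reference data: four statuses, evenly spread (2 bits of entropy).
baseline = ["active", "inactive", "pending", "closed"] * 25
# New data: almost every row has collapsed to a single status.
current = ["active"] * 97 + ["inactive"] * 3

print(entropy_drifted(baseline, current))  # True
```

A collapse toward a single value lowers entropy sharply, which is the kind of shift a bounds check alone would not notice.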

Domain Packs

Improve detection accuracy with domain-specific type definitions:

goldencheck scan data.csv --domain healthcare   # NPI, ICD, insurance, patient types
goldencheck scan data.csv --domain finance      # accounts, routing, CUSIP, transactions
goldencheck scan data.csv --domain ecommerce    # SKUs, orders, tracking, products

Domain packs add semantic types that reduce false positives and improve classification for industry-specific data.

Schema Diff

Compare two versions of a data file:

goldencheck diff data.csv                  # compare against git HEAD
goldencheck diff old.csv new.csv           # compare two files
goldencheck diff data.csv --ref main       # compare against a branch

Auto-Fix

Apply automated fixes to clean your data:

goldencheck fix data.csv                          # safe: trim, normalize, fix encoding
goldencheck fix data.csv --mode moderate          # + standardize case
goldencheck fix data.csv --mode aggressive --force # + coerce types
goldencheck fix data.csv --dry-run                # preview changes

Watch Mode

Continuously monitor a directory for data quality:

goldencheck watch data/ --interval 30        # re-scan every 30s
goldencheck watch data/ --exit-on error      # CI mode: fail on first error
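
Polling a directory comes down to comparing file snapshots between intervals. The sketch below shows one way to detect which CSV files changed; it is an assumption about the general technique, not a description of how `goldencheck watch` is implemented, and both function names are hypothetical.

```python
import tempfile
from pathlib import Path


def snapshot(directory: Path) -> dict:
    """Map each CSV file name in `directory` to its (mtime_ns, size) signature."""
    result = {}
    for path in directory.glob("*.csv"):
        stat = path.stat()
        result[path.name] = (stat.st_mtime_ns, stat.st_size)
    return result


def changed_files(before: dict, after: dict) -> list:
    """Names of files that are new or whose signature differs between snapshots."""
    return sorted(name for name, sig in after.items() if before.get(name) != sig)


with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    first = snapshot(folder)  # empty directory
    (folder / "orders.csv").write_text("id\n1\n")
    second = snapshot(folder)
    print(changed_files(first, second))  # ['orders.csv']
```

A watcher loop would take a snapshot, sleep for the interval, take another, and re-scan only the files returned by `changed_files`.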

REST API

Run GoldenCheck as a microservice:

goldencheck serve --port 8000

# Scan via file upload
curl -X POST http://localhost:8000/scan --data-binary @data.csv

# Scan via URL
curl -X POST http://localhost:8000/scan/url -d '{"url": "https://example.com/data.csv"}'
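
The same two requests can be built from Python with the standard library. This is a sketch that mirrors the curl commands above and uses only the endpoints shown there. The helper names are hypothetical, and because the response format is not described here, the example stops at constructing the request.

```python
import json
import urllib.request


def build_file_scan_request(base_url: str, csv_bytes: bytes) -> urllib.request.Request:
    """POST raw CSV bytes to /scan, mirroring `curl --data-binary @data.csv`."""
    return urllib.request.Request(f"{base_url}/scan", data=csv_bytes, method="POST")


def build_url_scan_request(base_url: str, data_url: str) -> urllib.request.Request:
    """POST a JSON body naming a remote CSV to /scan/url."""
    body = json.dumps({"url": data_url}).encode("utf-8")
    return urllib.request.Request(f"{base_url}/scan/url", data=body, method="POST")


request = build_url_scan_request("http://localhost:8000", "https://example.com/data.csv")
print(request.get_method(), request.full_url)  # POST http://localhost:8000/scan/url
```

With the server running, `urllib.request.urlopen(request)` sends the request.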

Database Scanning

Scan tables directly — no CSV export needed:

pip install goldencheck[db]
goldencheck scan-db "postgresql://user:pass@host/db" --table orders
goldencheck scan-db "snowflake://..." --query "SELECT * FROM orders WHERE date > '2024-01-01'"

Scheduled Runs

Cron-like scheduling with webhook notifications:

goldencheck schedule data/*.csv --interval hourly --webhook https://hooks.slack.com/...
goldencheck schedule data/*.csv --interval daily --notify-on grade-drop

LLM Boost

Add --llm-boost to enhance profiler findings with LLM intelligence. The LLM receives a representative sample of your data and:

Finds issues profilers miss — semantic understanding (e.g., "12345" in a name column)
Upgrades severity — knows "emails should be required" even if the profiler only says "INFO"
Discovers relationships — identifies temporal ordering between columns like signup_date and last_login
Downgrades false positives — "mixed phone formats are common, not an error"

# Using OpenAI
export OPENAI_API_KEY=sk-...
goldencheck data.csv --llm-boost --llm-provider openai --no-tui

# Using Anthropic
export ANTHROPIC_API_KEY=sk-ant-...
goldencheck data.csv --llm-boost --no-tui

Cost: ~$0.01 per scan (one API call with representative samples, not per-row).

Budget control:

export GOLDENCHECK_LLM_BUDGET=0.50  # max spend per scan in USD

Configuration (goldencheck.yml)

version: 1

settings:
  sample_size: 100000
  fail_on: error

columns:
  email:
    type: string
    required: true
    format: email
    unique: true

  age:
    type: integer
    range: [0, 120]

  status:
    type: string
    enum: [active, inactive, pending, closed]

relations:
  - type: temporal_order
    columns: [start_date, end_date]

ignore:
  - column: notes
    check: nullability

Only pinned rules appear in this file — not every finding. The ignore list prevents dismissed findings from reappearing.
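
To show what the `range` and `enum` rules express, here is a minimal sketch that applies them to a few records. The dictionary mirrors part of the YAML above. The function `check_rows` is hypothetical, and treating `range` as inclusive on both ends is an assumption, since the source does not say.

```python
config = {
    "columns": {
        "age": {"type": "integer", "range": [0, 120]},
        "status": {"type": "string", "enum": ["active", "inactive", "pending", "closed"]},
    }
}


def check_rows(rows, config):
    """Return (row_index, column, reason) tuples for range and enum violations."""
    problems = []
    for index, row in enumerate(rows):
        for column, rule in config["columns"].items():
            value = row.get(column)
            if value is None:
                continue  # nullability is a separate rule (`required`)
            if "range" in rule:
                low, high = rule["range"]
                if not (low <= value <= high):  # inclusive bounds assumed
                    problems.append((index, column, f"outside range [{low}, {high}]"))
            if "enum" in rule and value not in rule["enum"]:
                problems.append((index, column, "not in enum"))
    return problems


rows = [
    {"age": 30, "status": "active"},
    {"age": -5, "status": "inactive"},
    {"age": 25, "status": "archived"},
]
for problem in check_rows(rows, config):
    print(problem)
```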

CLI Reference

Command	Description
`goldencheck <file>`	Scan and launch TUI
`goldencheck scan <file>`	Explicit scan (supports `--smart`, `--guided`)
`goldencheck validate <file>`	Validate against goldencheck.yml
`goldencheck review <file>`	Scan + validate, launch TUI
`goldencheck init <file>`	Interactive setup wizard (scan → config → CI)
`goldencheck diff <file> [file2]`	Compare two files or against git HEAD
`goldencheck watch <dir>`	Poll directory, re-scan on change
`goldencheck fix <file>`	Auto-fix data quality issues
`goldencheck baseline <file>`	Deep-profile data and save statistical baseline to YAML
`goldencheck learn <file>`	Generate LLM validation rules
`goldencheck history`	Show scan history and trends
`goldencheck serve`	Start REST API server
`goldencheck scan-db <conn>`	Scan a database table directly
`goldencheck schedule <files>`	Run scans on a cron schedule
`goldencheck mcp-serve`	Start MCP server (19 tools)

Flags

Flag	Description
`--no-tui`	Print results to console
`--json`	JSON output
`--fail-on <level>`	Exit 1 on severity: `error` or `warning`
`--domain <name>`	Domain pack: `healthcare`, `finance`, `ecommerce`
`--llm-boost`	Enable LLM enhancement
`--llm-provider <name>`	LLM provider: `anthropic` (default) or `openai`
`--mode <level>`	Fix mode: `safe`, `moderate`, `aggressive`
`--smart`	Auto-triage: pin high-confidence, dismiss low
`--guided`	Walk through findings one-by-one
`--webhook <url>`	POST findings to Slack/PagerDuty/any URL
`--notify-on <trigger>`	Webhook trigger: `grade-drop`, `any-error`, `any-warning`
`--baseline <path>`	Path to baseline YAML for drift detection
`--no-baseline`	Skip auto-discovery of `goldencheck_baseline.yaml`
`--skip <technique>`	Skip a baseline technique (can repeat)
`--update`	Update existing baseline instead of overwriting
`-o <path>`	Output path for baseline file (default: `goldencheck_baseline.yaml`)
`--version`	Show version

TypeScript CLI

npx goldencheck-js scan data.csv --json
npx goldencheck-js scan data.csv --domain healthcare
npx goldencheck-js health-score data.csv
npx goldencheck-js profile data.csv
npx goldencheck-js validate data.csv --config goldencheck.yml
npx goldencheck-js baseline data.csv --output baseline.json
npx goldencheck-js fix data.csv --mode safe
npx goldencheck-js diff old.csv new.csv
npx goldencheck-js demo

TypeScript Architecture

goldencheck (npm)
├── goldencheck/core    # Edge-safe: browsers, Workers, Edge Runtime
│   ├── types           # Finding, Severity, DatasetProfile, Config types
│   ├── data            # TabularData — zero-dep columnar abstraction
│   ├── profilers       # 10 column profilers + 4 relation profilers
│   ├── semantic        # Type classifier, suppression, 3 domain packs
│   ├── engine          # Scanner, confidence, validator, triage, differ, fixer
│   ├── baseline        # Statistical profiling, constraints, correlation, patterns
│   ├── drift           # 13 drift checks against saved baseline
│   ├── llm             # Anthropic + OpenAI via fetch(), merger, budget
│   ├── agent           # Strategy, handoff, review queue
│   └── reporters       # JSON, CI
└── goldencheck/node    # Node.js >= 20
    ├── reader          # CSV, Parquet (via nodejs-polars)
    ├── mcp             # MCP server (7 tools)
    ├── a2a             # Agent-to-Agent HTTP server
    ├── tui             # ANSI terminal output
    ├── db-scanner      # Postgres, MySQL, SQLite
    └── watcher         # Directory polling

Benchmarks

Speed

Dataset	Time	Throughput
1K rows	0.05s	19K rows/sec
10K rows	0.23s	43K rows/sec
100K rows	2.29s	44K rows/sec
1M rows	2.07s	482K rows/sec

DQBench v1.0 — Head-to-Head

Tool	Mode	DQBench Score
GoldenCheck	zero-config	88.40
Pandera	best-effort rules	32.51
Soda Core	best-effort rules	22.36
Great Expectations	best-effort rules	21.68

On DQBench v1.0, GoldenCheck's zero-config discovery scores higher than every other tool in the table above, even though those tools ran with hand-written rules.

Run the benchmark yourself:

pip install dqbench goldencheck
dqbench run goldencheck

Detection Accuracy

Mode	Column Recall	Cost
Profiler-only (v0.1.0)	87%	$0
Profiler-only (v0.2.0 with confidence)	100%	$0
With LLM Boost	100%	~$0.003-0.01

Tested on a custom benchmark with 341 planted data quality issues across 9 categories.

v0.2.0 improvements: minority wrong-type detection, range profiler chaining, broader temporal heuristics, and confidence scoring pushed profiler-only recall from 87% to 100%.

Raha Benchmark Datasets

Dataset	Column Recall
Flights (2,376 rows)	100% (4/4 columns)
Beers (2,410 rows)	80% (4/5 columns)
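
The "4/5 columns" notation suggests that column recall is the share of error-containing columns that received at least one finding. Under that assumption, which the source does not spell out, the figure can be reproduced as follows. The column names are placeholders.

```python
def column_recall(flagged_columns: set, error_columns: set) -> float:
    """Fraction of columns with known errors that received at least one finding."""
    if not error_columns:
        raise ValueError("error_columns must not be empty")
    return len(flagged_columns & error_columns) / len(error_columns)


# Hypothetical column names; only the 4-of-5 ratio comes from the table above.
error_columns = {"col_a", "col_b", "col_c", "col_d", "col_e"}
flagged_columns = {"col_a", "col_b", "col_c", "col_d"}
print(f"{column_recall(flagged_columns, error_columns):.0%}")  # 80%
```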

Tech Stack

Dependency	Purpose
Polars	All data operations
Typer	CLI framework
Textual	Interactive TUI
Rich	CLI output formatting
Pydantic 2	Config validation

TypeScript / Node.js

Dependency	Purpose
Zero runtime deps	Core package has no dependencies (edge-safe)
nodejs-polars	Parquet reading (optional, Node.js only)
csv-parse	CSV reading (Node.js only)
@modelcontextprotocol/sdk	MCP server (Node.js only)

MCP Server (Claude Desktop)

GoldenCheck includes an MCP server for Claude Desktop integration:

pip install goldencheck[mcp]

Add to your Claude Desktop config (claude_desktop_config.json):

{
  "mcpServers": {
    "goldencheck": {
      "command": "goldencheck",
      "args": ["mcp-serve"]
    }
  }
}

Available tools:

Tool	Description
`scan`	Scan a file for data quality issues (with optional LLM boost)
`validate`	Validate against pinned rules in goldencheck.yml
`profile`	Get column-level statistics and health score
`health_score`	Quick A-F grade for a data file
`get_column_detail`	Deep-dive into a specific column
`list_checks`	List all available profiler checks

Remote MCP Server

GoldenCheck is available as a hosted MCP server on Smithery — connect from any MCP client without installing anything.

Claude Desktop / Claude Code:

{
  "mcpServers": {
    "goldencheck": {
      "url": "https://goldencheck-mcp-production.up.railway.app/mcp/"
    }
  }
}

Local server:

pip install goldencheck[mcp]
goldencheck mcp-serve

19 tools available: scan files, validate rules, profile columns, health-score datasets, auto-configure validation, explain findings, compare domains, suggest fixes.

Jupyter / Colab

GoldenCheck renders rich HTML in Jupyter notebooks:

from goldencheck.engine.scanner import scan_file
from goldencheck.engine.confidence import apply_confidence_downgrade
from goldencheck.notebook import ScanResult

findings, profile = scan_file("data.csv")
findings = apply_confidence_downgrade(findings, llm_boost=False)

# Rich HTML display in notebooks
ScanResult(findings=findings, profile=profile)

API Quick Reference

Python

import goldencheck

# Scan a CSV for quality issues (scan_file returns findings and a profile)
findings, profile = goldencheck.scan_file("data.csv")
for f in findings:
    print(f"[{f.severity}] {f.column}: {f.check} — {f.message}")

# Create baseline and detect drift
from goldencheck import create_baseline, scan_file
baseline = create_baseline("data.csv")
baseline.save("goldencheck_baseline.yaml")
findings, profile = scan_file("data.csv", baseline="goldencheck_baseline.yaml")

# Health score
score = goldencheck.health_score("data.csv")
print(score)  # e.g. "B (78/100)"

TypeScript

import { scanData, TabularData, Severity } from "goldencheck";

// Scan records (edge-safe)
const data = new TabularData(records);
const { findings, profile } = scanData(data);
for (const f of findings) {
  console.log(`[${f.severity === Severity.ERROR ? "ERROR" : "WARNING"}] ${f.column}: ${f.message}`);
}

import { readFile, scanData, applyConfidenceDowngrade, healthScore } from "goldencheck/node";

// Scan a CSV file (Node.js)
const data = readFile("data.csv");
const result = scanData(data, { domain: "healthcare" });
const findings = applyConfidenceDowngrade(result.findings, false);

// Health score
const byCol: Record<string, { errors: number; warnings: number }> = {};
for (const f of findings) {
  if (f.severity >= 2) {
    byCol[f.column] ??= { errors: 0, warnings: 0 };
    byCol[f.column][f.severity === 3 ? "errors" : "warnings"]++;
  }
}
const { grade, points } = healthScore(byCol);
console.log(`Health: ${grade} (${points}/100)`);

import { readFile, createBaseline, serializeBaseline } from "goldencheck/node";
import { runDriftChecks, deserializeBaseline } from "goldencheck";
import { writeFileSync, readFileSync } from "node:fs";

// Create baseline and detect drift
const data = readFile("reference.csv");
const baseline = createBaseline(data);
writeFileSync("baseline.json", serializeBaseline(baseline));

const newData = readFile("production.csv");
const saved = deserializeBaseline(readFileSync("baseline.json", "utf-8"));
const driftFindings = runDriftChecks(newData, saved);

Contributing

See CONTRIBUTING.md for development setup and guidelines.

Author

Ben Severn

License

MIT — see LICENSE

Part of the Golden Suite

Tool	Purpose	Install
GoldenCheck	Validate & profile data quality	`pip install goldencheck` / `npm install goldencheck`
GoldenFlow	Transform & standardize data	`pip install goldenflow`
GoldenMatch	Deduplicate & match records	`pip install goldenmatch`
GoldenPipe	Orchestrate the full pipeline	`pip install goldenpipe`

Companion projects:

dbt-goldencheck — data validation as a dbt test.
goldencheck-types — community-contributed domain type packs.
goldencheck-action — GitHub Action for CI with PR comments.