A policy enforcement layer that sits between Claude and any MCP server, logging tool calls and applying allow/deny rules before execution. You give it a policy YAML with path constraints, tool allowlists, and optional evidence requirements, then wrap your existing MCP server with `assay mcp wrap`. Every tool invocation gets audited into a tamper-evident bundle with cryptographic verification. The trust-basis compiler turns those bundles into claim artifacts (verified, self-reported, inferred, absent) for CI gates or SARIF output. Useful when you need runtime guardrails on filesystem access, exec boundaries, or sensitive tool usage without rewriting the upstream server. Ships as a Rust CLI with stdio transport, no hosted backend required.
Policy-as-code for MCP agents: enforce what a tool call can do, prove what it did, stay honest about what you can't.
A deterministic, fail-closed gate for MCP tool calls — with real kernel-level (eBPF/LSM) enforcement on Linux and offline-verifiable evidence. CI-native, no backend, bounded by design.
Agents got real tool access through MCP — and tool poisoning, rug pulls, and confused-deputy OAuth came with it. Most tools scan a server or filter a prompt. Assay sits at the tool-call boundary and does three things, in order.
1. Enforce what a tool call can do. Assay decides each `tools/call` before it runs, with the precise reason for each allow or deny. On Linux it adds real kernel enforcement: an eBPF/LSM IPv4/TCP connect-egress block and a Landlock TCP-connect port allowlist, both opt-in and fail-closed. A policy it cannot express exactly is refused, never half-applied.
2. Prove what it did. Decisions are recorded as offline-verifiable evidence, each claim is classified (`verified`, `self_reported`, `inferred`, `absent`), and a gate refuses to let a claim exceed what was observed. A tool returning "success" is the provider's assertion, never proof.
3. Stay honest about what it can't. Assay ships no single safety score and never claims more than it can prove.

```bash
cargo install assay-cli
mkdir -p /tmp/assay-demo && echo "safe content" > /tmp/assay-demo/safe.txt
assay mcp wrap --policy examples/mcp-quickstart/policy.yaml \
  -- npx @modelcontextprotocol/server-filesystem /tmp/assay-demo
```
```text
✅ ALLOW read_file path=/tmp/assay-demo/safe.txt reason=policy_allow
❌ DENY read_file path=/tmp/outside-demo.txt reason=path_constraint_violation
❌ DENY exec cmd=ls reason=tool_denied
```
Wire it into Cursor, Claude Code, or Codex in one line with `assay mcp config-path <editor>`. Python SDK: `pip install assay-it`. CI: GitHub Action. No hosted backend, no API keys for core flows, deterministic by design. New to the threat model? The OWASP MCP Top 10 mapping lays out, per risk, what Assay covers and what it deliberately does not.
| Output | What it is |
|---|---|
| Policy gate | assay mcp wrap — deterministic allow/deny before tools run, with the reason. |
| Evidence bundle | Offline-verifiable, tamper-evident archive for audit and replay. |
| Trust Basis / Trust Card | Canonical trust-basis.json (bounded claim classification) plus review-friendly trustcard.{json,md,html}. |
| External receipts | Eval outcomes, runtime decisions, and model inventory as bounded receipts with JSON Schema contracts. |
| Tool-decision surface | Each privileged tools/call recorded as assay.tool_decision_surface.v0 — sensitive ids hashed, raw arguments never stored. |
| SARIF / CI | GitHub Action, Security-tab integration, policy gates on PRs. |
| Attestation | Export a bundle as an in-toto / DSSE statement (v0), anchor-pluggable. |
```text
Agent ──► Assay ──► MCP Server
            ├─ ✅ ALLOW / ❌ DENY (policy, with reason)
            ├─► 📋 Evidence bundle (offline-verifiable)
            └─► 📊 Trust Basis → Trust Card → SARIF / CI
```
New in 3.30.0: an evidence event can carry an optional soft semantic_digest (with its digest_profile) beside the hard content_hash — a correlation/equivalence overlay for grouping records by canonical content across producers or points in time, computed via the assay-canonical crate (RFC 8785 / JCS). It is never part of content_hash, never on the verify or admission path, and never substitutes integrity. CHANGELOG.md and release notes remain the authority for what is public; crates.io publication is separate from merge state.
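To make the hard/soft distinction concrete, here is a minimal sketch of why a canonical digest can group records that a byte-level hash treats as different. It is not the `assay-canonical` implementation. The choice of SHA-256, the idea that `content_hash` covers raw bytes, and both function names are assumptions for illustration. The canonical form is only an approximation of RFC 8785 that holds for ASCII keys, strings and small integers.

```python
import hashlib
import json


def content_hash(raw: bytes) -> str:
    """Hard integrity hash over the exact bytes (assumed: SHA-256)."""
    return hashlib.sha256(raw).hexdigest()


def semantic_digest(raw: bytes) -> str:
    """Soft digest over a canonical re-serialization of the JSON value.

    Approximates RFC 8785 (JCS): sorted keys, no insignificant whitespace,
    UTF-8 output. Full JCS also fixes number formatting and sorts keys by
    UTF-16 code units, which this sketch does not handle.
    """
    value = json.loads(raw)
    canonical = json.dumps(
        value, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Two producers emit the same logical record with different key order
# and whitespace.
record_a = b'{"tool": "read_file", "path": "/app/a.txt"}'
record_b = b'{"path":"/app/a.txt","tool":"read_file"}'

same_bytes = content_hash(record_a) == content_hash(record_b)
same_meaning = semantic_digest(record_a) == semantic_digest(record_b)
print(same_bytes, same_meaning)
```

The byte-level hashes differ while the canonical digests match, which is why the soft digest is useful for correlation but says nothing about whether either record's bytes are intact.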
Yes if you already have eval output, runtime decisions, inventory artifacts, or MCP tool-call tests, and you want a small reviewable CI artifact instead of a dashboard — bounded auditability, not a scalar trust badge.
Not yet if you need Assay to judge model correctness for you, want a hosted dashboard as the product, or want a compliance claim rather than a bounded evidence boundary. Assay is not a trust-score engine, a generic eval dashboard, or a hosted observability product — see what it is and is not.
An agent tries a privileged action — github.add_deploy_key — through the enforcing proxy, decided per call before it forwards, offline against a local mock (no real credentials):
```bash
cd examples/privileged-action-gate && ./run.sh
```

A deny is fail-closed caution, not a verdict on intent; an allow is the decision to forward, never proof the action happened. Declared-vs-observed conformance is recorded beside the verdict, never as a gate. Full walkthrough: privileged-action-gate.
| You have | What you get | Start here |
|---|---|---|
| Promptfoo JSONL from CI evals | Eval outcome receipts + verified bundle + Trust Basis diff | Promptfoo JSONL |
| OpenFeature `EvaluationDetails` | Decision receipt + verified bundle | OpenFeature |
| CycloneDX ML-BOM model component | Inventory receipt + verified bundle | CycloneDX ML-BOM |
| MCP tool calls | Allow/deny audit trail + observed-behavior evidence | MCP Quick Start |
| A GitHub PR gate | Trust Basis diff, gate status, SARIF/JUnit-ready output | CI Guide |
| A Runner archive / coverage annotation | Coverage descriptors + claim-class cells + a claimed-vs-observed check | Coverage-honesty walkthrough |
The workflow stays small: import or record a bounded outcome, bundle and verify it, compile trust-basis.json, gate the Trust Basis diff. Assay doesn't make the upstream tool the source of truth; it makes the evidence boundary inspectable. For privileged tool actions, the MCP proxy records each tools/call as a structured tool-decision surface — keeping the asserted-versus-verified line honest.
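As an illustration of what "sensitive ids hashed, raw arguments never stored" could look like in practice, the sketch below builds a decision record that keeps only argument names and hashed identifiers. Only the schema string `assay.tool_decision_surface.v0` comes from this README. Every field name, the `hash_id` and `decision_record` helpers, the salt parameter, and the sample arguments are hypothetical and do not describe the actual record layout.

```python
import hashlib
import json


def hash_id(value: str, salt: bytes = b"") -> str:
    """Hash a sensitive identifier (assumed: SHA-256, optional salt)."""
    return "sha256:" + hashlib.sha256(salt + value.encode("utf-8")).hexdigest()


def decision_record(tool, args, verdict, reason, sensitive_keys=()):
    """Build a record for one tools/call without storing raw argument values.

    All field names here are hypothetical.
    """
    return {
        "schema": "assay.tool_decision_surface.v0",
        "tool": tool,
        "verdict": verdict,
        "reason": reason,
        # Names only: useful for review, reveals no values.
        "argument_names": sorted(args),
        # Selected identifiers are kept in hashed form for correlation.
        "hashed_ids": {
            key: hash_id(str(args[key])) for key in sensitive_keys if key in args
        },
    }


record = decision_record(
    "github.add_deploy_key",
    {"repo": "acme/app", "key": "ssh-ed25519 AAAA"},  # made-up arguments
    verdict="DENY",
    reason="tool_denied",
    sensitive_keys=("repo",),
)
print(json.dumps(record, indent=2))
```

An unsalted hash of a short, guessable identifier can be reversed by enumeration, so a real recorder would likely need a salt or keyed hash. The sketch leaves that choice open.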
```yaml
version: "2.0"
name: "my-policy"
tools:
  allow: ["read_file", "list_dir"]
  deny: ["exec", "shell", "write_file"]
schemas:
  read_file:
    type: object
    properties:
      path: { type: string, pattern: "^/app/.*" }
    required: ["path"]
```
Generate one from observed behaviour with `assay init --from-trace trace.jsonl`, or migrate a legacy `constraints:` policy with `assay policy migrate`. See Policy Files.
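To show how a policy of this shape can yield deterministic allow/deny decisions, here is a small sketch that evaluates the example policy against a tool call. It is not Assay's evaluator. The `decide` helper, the deny-before-allow precedence, and the reason strings `tool_not_allowed`, `missing_required_argument` and `argument_type_mismatch` are assumptions. The other three reason strings are taken from the quickstart output.

```python
import re

# The example policy as a plain dict, so the sketch needs no YAML parser.
POLICY = {
    "tools": {
        "allow": ["read_file", "list_dir"],
        "deny": ["exec", "shell", "write_file"],
    },
    "schemas": {
        "read_file": {
            "type": "object",
            "properties": {"path": {"type": "string", "pattern": "^/app/.*"}},
            "required": ["path"],
        },
    },
}


def decide(policy, tool, args):
    """Return (verdict, reason) for one tool call. Hypothetical helper."""
    tools = policy["tools"]
    # Assumption: an explicit deny takes precedence over the allowlist.
    if tool in tools["deny"]:
        return ("DENY", "tool_denied")
    # Fail closed: anything not explicitly allowed is refused.
    if tool not in tools["allow"]:
        return ("DENY", "tool_not_allowed")
    schema = policy["schemas"].get(tool)
    if schema is None:
        return ("ALLOW", "policy_allow")
    for name in schema.get("required", []):
        if name not in args:
            return ("DENY", "missing_required_argument")
    for name, rule in schema.get("properties", {}).items():
        if name not in args:
            continue
        value = args[name]
        if rule.get("type") == "string" and not isinstance(value, str):
            return ("DENY", "argument_type_mismatch")
        pattern = rule.get("pattern")
        # JSON Schema patterns are unanchored searches, hence re.search.
        if pattern and (not isinstance(value, str) or not re.search(pattern, value)):
            return ("DENY", "path_constraint_violation")
    return ("ALLOW", "policy_allow")


print(decide(POLICY, "read_file", {"path": "/app/data.txt"}))
print(decide(POLICY, "read_file", {"path": "/etc/passwd"}))
print(decide(POLICY, "exec", {"cmd": "ls"}))
```

The pattern check in this sketch is purely textual: a path such as `/app/../etc/passwd` would match `^/app/.*`, so a real gate would also need to normalize paths before matching.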
| Principle | What it means |
|---|---|
| Canonical evidence | Assay's evidence model is the stable contract; OpenTelemetry and protocol adapters (ACP / A2A / UCP) map into it. |
| Deterministic | Same input, same decision — not probabilistic. |
| Bounded claims | Explicit about verified vs visible vs absent — no score-first UX. |
| Offline-first | No backend required for core enforcement and bundle verification. |
Trust claims use explicit epistemology, not a single safety score: verified (direct evidence or offline verification), self_reported (emitted without independent corroboration), inferred (bounded, documented rules), absent (no trustworthy evidence). Assay ships no aggregate trust score or safe/unsafe badge as the main output — see ADR-033.
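The rule that a claim may not exceed what was observed can be sketched as a comparison over ordered claim classes. This is an illustration, not the trust-basis compiler. The strength ordering (in particular `inferred` below `self_reported`), the `overclaims` helper, and the claim ids are assumptions.

```python
from enum import IntEnum


class ClaimClass(IntEnum):
    """Claim classes from the README, in an assumed order of strength."""
    ABSENT = 0
    INFERRED = 1
    SELF_REPORTED = 2
    VERIFIED = 3


def overclaims(claimed, observed):
    """Return ids of claims whose claimed class exceeds the observed class.

    A claim with no observation is treated as ABSENT (fail closed).
    """
    failures = []
    for claim_id, claimed_class in sorted(claimed.items()):
        observed_class = observed.get(claim_id, ClaimClass.ABSENT)
        if claimed_class > observed_class:
            failures.append(claim_id)
    return failures


# Hypothetical claim ids. The tool said "success", which is only the
# provider's assertion, so the observation is SELF_REPORTED.
claimed = {
    "bundle_integrity": ClaimClass.VERIFIED,
    "deploy_key_added": ClaimClass.VERIFIED,
}
observed = {
    "bundle_integrity": ClaimClass.VERIFIED,
    "deploy_key_added": ClaimClass.SELF_REPORTED,
}

failing = overclaims(claimed, observed)
print(failing)
```

A CI step built on this idea would fail the build whenever the returned list is non-empty, without collapsing the classes into a single score.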
Tool-decision path latency on an M1 Pro fragmented-IPI harness: main protection 0.771ms p50 / 1.913ms p95; fast-path 0.345ms p50 / 1.145ms p95. These are tool-decision timings, not end-to-end model latency.
Assay-Runner is an internal measured-run subsystem behind the delegated Linux/eBPF acceptance path — publish = false, not a standalone product, no release commitment.
```bash
cargo test --workspace
cargo clippy --workspace --all-targets -- -D warnings
```
See CONTRIBUTING.md and GitHub Discussions.