Built for the Claude Code community with Claude Code by @mertduzgun

Independent project, not affiliated with Anthropic

Mcp Datahub

txn2/mcp-datahub
Summary

Connects Claude to DataHub's GraphQL API for metadata discovery and lineage exploration. Exposes 12 tools covering search, schema inspection, glossary terms, domains, and data lineage tracing. Ships as both a standalone MCP server and a composable Go library, so you can embed it in custom servers with your own auth and tenant logic. Built to work alongside mcp-trino and mcp-s3 for a full data platform stack. Supports multi-instance connections if you run staging and production DataHub clusters. The library includes a QueryProvider interface that lets query engines inject execution context back into metadata responses, turning URNs into queryable table identifiers with sample SQL.


txn2/mcp-datahub


An MCP server and composable Go library that connects AI assistants to DataHub metadata catalogs. Search datasets, explore schemas, trace lineage, and access glossary terms and domains.

mcp-datahub.txn2.com | Installation | Library Docs

MCP Data Platform Ecosystem

mcp-datahub is part of a broader suite of open-source MCP servers designed to work together as a composable data platform. Each component can run standalone or be combined to give AI assistants unified access to storage, query engines, and metadata catalogs.

  • txn2/mcp-data-platform
  • txn2/mcp-s3
  • txn2/mcp-trino

Two Ways to Use

1. Standalone MCP Server

Install and connect to Claude Desktop, Cursor, or any MCP client:

Claude Desktop (Easiest) - Download the .mcpb bundle from releases and double-click to install:

  • macOS Apple Silicon: mcp-datahub_X.X.X_darwin_arm64.mcpb
  • macOS Intel: mcp-datahub_X.X.X_darwin_amd64.mcpb
  • Windows: mcp-datahub_X.X.X_windows_amd64.mcpb

Other Installation Methods:

# Homebrew (macOS)
brew install txn2/tap/mcp-datahub

# Go install
go install github.com/txn2/mcp-datahub/cmd/mcp-datahub@latest

Manual Claude Desktop Configuration (if not using MCPB):

{
  "mcpServers": {
    "datahub": {
      "command": "/opt/homebrew/bin/mcp-datahub",
      "env": {
        "DATAHUB_URL": "https://datahub.example.com",
        "DATAHUB_TOKEN": "your_token"
      }
    }
  }
}

Multi-Server Configuration

Connect to multiple DataHub instances simultaneously:

# Primary server
export DATAHUB_URL=https://prod.datahub.example.com/api/graphql
export DATAHUB_TOKEN=prod-token
export DATAHUB_CONNECTION_NAME=prod

# Additional servers (JSON)
export DATAHUB_ADDITIONAL_SERVERS='{"staging":{"url":"https://staging.datahub.example.com/api/graphql","token":"staging-token"}}'

Use datahub_list_connections to discover available connections, then pass the connection parameter to any tool.
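
To illustrate the shape of the DATAHUB_ADDITIONAL_SERVERS value, here is a minimal Go sketch that decodes it into a map keyed by connection name. The serverEntry struct and both helper functions are hypothetical; they mirror only the url and token fields visible in the example above and are not the library's own types.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// serverEntry mirrors the fields visible in the example JSON.
// Hypothetical type for illustration, not the library's struct.
type serverEntry struct {
	URL   string `json:"url"`
	Token string `json:"token"`
}

// parseAdditionalServers decodes a DATAHUB_ADDITIONAL_SERVERS-style value
// into a map keyed by connection name. An empty value yields an empty map.
func parseAdditionalServers(raw string) (map[string]serverEntry, error) {
	servers := map[string]serverEntry{}
	if raw == "" {
		return servers, nil
	}
	if err := json.Unmarshal([]byte(raw), &servers); err != nil {
		return nil, fmt.Errorf("parse additional servers: %w", err)
	}
	for name, s := range servers {
		if s.URL == "" {
			return nil, fmt.Errorf("connection %q has no url", name)
		}
	}
	return servers, nil
}

// connectionNames returns the primary name followed by the sorted additional names.
func connectionNames(primary string, servers map[string]serverEntry) []string {
	names := make([]string, 0, len(servers))
	for name := range servers {
		names = append(names, name)
	}
	sort.Strings(names)
	return append([]string{primary}, names...)
}

func main() {
	raw := `{"staging":{"url":"https://staging.datahub.example.com/api/graphql","token":"staging-token"}}`
	servers, err := parseAdditionalServers(raw)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(connectionNames("prod", servers))
}
```

Validating the JSON this way before starting the server can catch a malformed value early, since a single quoting mistake in the shell export would otherwise only surface at connection time.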

2. Composable Go Library

Import into your own MCP server for custom authentication, tenant isolation, and audit logging:

import (
    "log"

    "github.com/txn2/mcp-datahub/pkg/client"
    "github.com/txn2/mcp-datahub/pkg/tools"
)

// Create the client from the DATAHUB_* environment variables
datahubClient, err := client.NewFromEnv()
if err != nil {
    log.Fatalf("create DataHub client: %v", err)
}
defer datahubClient.Close()

// Register all tools with your MCP server
toolkit := tools.NewToolkit(datahubClient, tools.Config{})
toolkit.RegisterAll(yourMCPServer)

Customizing Tool Descriptions

Override tool descriptions to match your deployment:

toolkit := tools.NewToolkit(datahubClient, tools.Config{},
    tools.WithDescriptions(map[tools.ToolName]string{
        tools.ToolSearch: "Search our internal data catalog for datasets and dashboards",
    }),
)

Customizing Tool Annotations

Override MCP tool annotations (behavior hints for AI clients):

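// boolPtr returns a pointer to b, for hints such as OpenWorldHint that take a *bool
func boolPtr(b bool) *bool { return &b }

toolkit := tools.NewToolkit(datahubClient, tools.Config{},
    tools.WithAnnotations(map[tools.ToolName]*mcp.ToolAnnotations{
        tools.ToolSearch: {ReadOnlyHint: true, OpenWorldHint: boolPtr(true)},
    }),
)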

All 12 tools ship with default annotations: read tools are marked ReadOnlyHint: true; datahub_create is non-destructive and non-idempotent; datahub_update is non-destructive and idempotent; datahub_delete is destructive and idempotent.
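
The defaults described above can be summarized as a small lookup. This is a sketch only: the hints struct and defaultHints function are hypothetical stand-ins for illustration and do not reproduce the library's annotation types; the values are taken from the sentence above.

```go
package main

import "fmt"

// hints is a hypothetical, simplified stand-in for MCP tool annotations.
type hints struct {
	ReadOnly    bool
	Destructive bool
	Idempotent  bool
}

// defaultHints returns the documented default for a tool name.
// Any name other than the three write tools is treated as a read tool.
func defaultHints(tool string) hints {
	switch tool {
	case "datahub_create":
		return hints{} // non-destructive, non-idempotent
	case "datahub_update":
		return hints{Idempotent: true} // non-destructive, idempotent
	case "datahub_delete":
		return hints{Destructive: true, Idempotent: true}
	default:
		return hints{ReadOnly: true}
	}
}

func main() {
	for _, name := range []string{"datahub_search", "datahub_create", "datahub_update", "datahub_delete"} {
		fmt.Printf("%-16s %+v\n", name, defaultHints(name))
	}
}
```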

Extensions (Logging, Metrics, Error Hints)

Enable optional middleware via the extensions package:

import "github.com/txn2/mcp-datahub/pkg/extensions"

// Load from environment variables (MCP_DATAHUB_EXT_*)
cfg := extensions.FromEnv()
opts := extensions.BuildToolkitOptions(cfg)
toolkit := tools.NewToolkit(datahubClient, toolsCfg, opts...)

// Or load from a YAML/JSON config file
serverCfg, _ := extensions.LoadConfig("config.yaml")

See the library documentation for middleware, selective tool registration, and enterprise patterns.

Combining with mcp-trino

Build a unified data platform MCP server by combining DataHub metadata with Trino query execution:

import (
    datahubClient "github.com/txn2/mcp-datahub/pkg/client"
    datahubTools "github.com/txn2/mcp-datahub/pkg/tools"
    trinoClient "github.com/txn2/mcp-trino/pkg/client"
    trinoTools "github.com/txn2/mcp-trino/pkg/tools"
)

// Add DataHub tools (search, lineage, schema, glossary)
dh, _ := datahubClient.NewFromEnv()
datahubTools.NewToolkit(dh, datahubTools.Config{}).RegisterAll(server)

// Add Trino tools (query execution, catalog browsing)
tr, _ := trinoClient.NewFromEnv()
trinoTools.NewToolkit(tr, trinoTools.Config{}).RegisterAll(server)

// AI assistants can now:
// - Search DataHub for tables -> Get schema -> Query via Trino
// - Explore lineage -> Understand data flow -> Run validation queries

See txn2/mcp-trino for the companion library.

Bidirectional Integration with QueryProvider

The library supports bidirectional context injection. While mcp-trino can pull semantic context from DataHub, mcp-datahub can receive query execution context back from a query engine:

import (
    "context"

    datahubTools "github.com/txn2/mcp-datahub/pkg/tools"
    "github.com/txn2/mcp-datahub/pkg/integration"
)

// QueryProvider enables query engines to inject context into DataHub tools
type myQueryProvider struct {
    trinoClient *trino.Client
}

func (p *myQueryProvider) Name() string { return "trino" }

func (p *myQueryProvider) ResolveTable(ctx context.Context, urn string) (*integration.TableIdentifier, error) {
    // Map DataHub URN to Trino table (catalog.schema.table)
    return &integration.TableIdentifier{
        Catalog: "hive", Schema: "production", Table: "users",
    }, nil
}

func (p *myQueryProvider) GetTableAvailability(ctx context.Context, urn string) (*integration.TableAvailability, error) {
    // Check if table is queryable
    return &integration.TableAvailability{Available: true}, nil
}

func (p *myQueryProvider) GetQueryExamples(ctx context.Context, urn string) ([]integration.QueryExample, error) {
    // Return sample queries for this entity
    return []integration.QueryExample{
        {Name: "sample", SQL: "SELECT * FROM hive.production.users LIMIT 10"},
    }, nil
}

// Wire it up
toolkit := datahubTools.NewToolkit(datahubClient, config,
    datahubTools.WithQueryProvider(&myQueryProvider{trinoClient: trino}),
)

When a QueryProvider is configured, tool responses are enriched:

  • Search results: Include query_context with table availability
  • Entity details: Include query_table, query_examples, query_availability
  • Schema: Include query_table for immediate SQL usage
  • Lineage: Include execution_context mapping URNs to tables

Integration Middleware

Enterprise features like access control and audit logging are enabled through middleware adapters:

import (
    "context"

    datahubTools "github.com/txn2/mcp-datahub/pkg/tools"
    "github.com/txn2/mcp-datahub/pkg/integration"
)

// Access control - filter entities by user permissions
type myAccessFilter struct{}
func (f *myAccessFilter) CanAccess(ctx context.Context, urn string) (bool, error) { /* ... */ }
func (f *myAccessFilter) FilterURNs(ctx context.Context, urns []string) ([]string, error) { /* ... */ }

// Audit logging - track all tool invocations
type myAuditLogger struct{}
func (l *myAuditLogger) LogToolCall(ctx context.Context, tool string, params map[string]any, userID string) error { /* ... */ }

// Wire up with multiple integration options
toolkit := datahubTools.NewToolkit(datahubClient, config,
    datahubTools.WithAccessFilter(&myAccessFilter{}),
    datahubTools.WithAuditLogger(&myAuditLogger{}, func(ctx context.Context) string {
        // Comma-ok assertion avoids a panic when user_id is missing or not a string
        userID, _ := ctx.Value("user_id").(string)
        return userID
    }),
    datahubTools.WithURNResolver(&myURNResolver{}),      // Map external IDs to URNs
    datahubTools.WithMetadataEnricher(&myEnricher{}),    // Add custom metadata
)

See the library documentation for complete integration patterns.
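
The audit logger above takes a function that extracts a user ID from the request context. One common Go pattern for this, sketched here under the assumption that your own server populates the context, is an unexported key type with a fallback value. All names in this block (userIDKey, withUserID, userIDFrom, userIDForRequest) are hypothetical.

```go
package main

import (
	"context"
	"fmt"
)

// userIDKey is an unexported key type, so other packages cannot collide with it.
type userIDKey struct{}

// withUserID stores the caller's ID in the context (for example, in auth middleware).
func withUserID(ctx context.Context, id string) context.Context {
	return context.WithValue(ctx, userIDKey{}, id)
}

// userIDFrom has the func(context.Context) string shape used by the audit
// logger wiring. It falls back to "anonymous" when no ID is present.
func userIDFrom(ctx context.Context) string {
	if id, ok := ctx.Value(userIDKey{}).(string); ok && id != "" {
		return id
	}
	return "anonymous"
}

// userIDForRequest is a small round-trip helper for demonstration:
// it builds a context for the given ID (if any) and reads it back.
func userIDForRequest(id string) string {
	ctx := context.Background()
	if id != "" {
		ctx = withUserID(ctx, id)
	}
	return userIDFrom(ctx)
}

func main() {
	fmt.Println(userIDForRequest("alice"))
	fmt.Println(userIDForRequest(""))
}
```

A typed key avoids the collisions that plain string keys can cause, and the fallback keeps audit records well-formed for unauthenticated calls.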

Available Tools

Read Tools (always available)

Tool                      | Description
datahub_search            | Search for datasets, dashboards, pipelines by query and entity type
datahub_get_entity        | Get entity metadata by URN (description, owners, tags, domain)
datahub_get_schema        | Get dataset schema with field types and descriptions
datahub_get_lineage       | Get upstream/downstream lineage (supports level=column for column-level)
datahub_get_queries       | Get SQL queries associated with a dataset
datahub_browse            | Browse catalog: list tags, domains, or data products
datahub_get_glossary_term | Get glossary term definition and properties
datahub_get_data_product  | Get data product details (owners, domain, properties)
datahub_list_connections  | List configured DataHub server connections (multi-server mode)

Write Tools (require DATAHUB_WRITE_ENABLED=true)

Three CRUD tools that use the "what" discriminator pattern, 35 operations in total:

Tool           | Operations | Description
datahub_create | 10         | Create tags, domains, glossary terms, data products, documents, applications, queries, incidents, structured properties, data contracts
datahub_update | 17         | Update descriptions, tags, glossary terms, links, owners, domains, structured properties, incidents, queries, documents, data contracts
datahub_delete | 8          | Delete queries, tags, domains, glossary entities, data products, applications, documents, structured properties

Write tools are disabled by default for safety.
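
As a sketch of how a write gate and a "what" discriminator might fit together, the following checks a DATAHUB_WRITE_ENABLED-style value and dispatches a create request on its what field. The createRequest type, the dispatchCreate function, and the "tag" and "domain" values are hypothetical illustrations, not the server's actual parameter names or implementation.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

var errWritesDisabled = errors.New("write tools are disabled")

// writeEnabled accepts the two documented values, "true" and "1".
func writeEnabled(value string) bool {
	return value == "true" || value == "1"
}

// createRequest is a hypothetical request carrying a discriminator field.
type createRequest struct {
	What string // selects the kind of entity to create
	Name string
}

// dispatchCreate refuses all work when writes are disabled, then routes
// on the discriminator. Unknown values produce an error.
func dispatchCreate(enabled bool, req createRequest) (string, error) {
	if !enabled {
		return "", errWritesDisabled
	}
	switch req.What {
	case "tag":
		return "create tag " + req.Name, nil
	case "domain":
		return "create domain " + req.Name, nil
	default:
		return "", fmt.Errorf("unsupported what value %q", req.What)
	}
}

func main() {
	enabled := writeEnabled(os.Getenv("DATAHUB_WRITE_ENABLED"))
	msg, err := dispatchCreate(enabled, createRequest{What: "tag", Name: "pii"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(msg)
}
```

Checking the gate before the discriminator means a disabled deployment rejects every write the same way, regardless of what the request targets.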

DataHub Version Compatibility

Minimum: DataHub 1.3.x. Full feature set: DataHub 1.4.x.

DataHub Version  | Features
1.3.x+ (minimum) | All read tools, all write operations except documents (tags, domains, glossary, data products, queries, owners, links, descriptions, incidents, applications, structured properties incl. delete, data contracts)
1.4.x+ (full)    | + Documents (create/update/delete)

The client handles version differences gracefully: read queries return empty results (not errors) when a feature is unavailable on older versions.

See the tools reference for detailed documentation.

Configuration

Variable                   | Description                         | Default
DATAHUB_URL                | DataHub GraphQL API URL             | (required)
DATAHUB_TOKEN              | API token                           | (required)
DATAHUB_TIMEOUT            | Request timeout (seconds)           | 30
DATAHUB_DEFAULT_LIMIT      | Default search limit                | 10
DATAHUB_MAX_LIMIT          | Maximum limit                       | 100
DATAHUB_CONNECTION_NAME    | Display name for primary connection | datahub
DATAHUB_ADDITIONAL_SERVERS | JSON map of additional servers      | (optional)
DATAHUB_WRITE_ENABLED      | Enable write operations (true or 1) | false
DATAHUB_DEBUG              | Enable debug logging (1 or true)    | false
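
To make the defaults in this table concrete, here is a minimal sketch of reading the timeout and limit variables with fallbacks, then clamping a requested limit. The settings type and helper names are hypothetical, and the clamping behavior (fall back to the default for non-positive requests, cap at the maximum) is an assumption about how such limits are typically applied, not a description of the server's code.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"time"
)

// settings holds the three numeric options from the table (hypothetical type).
type settings struct {
	Timeout      time.Duration
	DefaultLimit int
	MaxLimit     int
}

// intOr parses raw as a positive integer, falling back to def
// when raw is empty, invalid, or not positive.
func intOr(raw string, def int) int {
	n, err := strconv.Atoi(raw)
	if err != nil || n <= 0 {
		return def
	}
	return n
}

// loadSettings reads values through get (os.Getenv in production)
// and applies the documented defaults: 30 seconds, 10, and 100.
func loadSettings(get func(string) string) settings {
	return settings{
		Timeout:      time.Duration(intOr(get("DATAHUB_TIMEOUT"), 30)) * time.Second,
		DefaultLimit: intOr(get("DATAHUB_DEFAULT_LIMIT"), 10),
		MaxLimit:     intOr(get("DATAHUB_MAX_LIMIT"), 100),
	}
}

// effectiveLimit applies the default and maximum to a requested limit.
func (s settings) effectiveLimit(requested int) int {
	if requested <= 0 {
		return s.DefaultLimit
	}
	if requested > s.MaxLimit {
		return s.MaxLimit
	}
	return requested
}

func main() {
	s := loadSettings(os.Getenv)
	fmt.Println(s.Timeout, s.effectiveLimit(0), s.effectiveLimit(500))
}
```

Passing the lookup function as a parameter keeps the loader easy to test without touching the process environment.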

Extensions

Variable                 | Description                             | Default
MCP_DATAHUB_EXT_LOGGING  | Enable structured logging of tool calls | false
MCP_DATAHUB_EXT_METRICS  | Enable metrics collection               | false
MCP_DATAHUB_EXT_METADATA | Enable metadata enrichment on results   | false
MCP_DATAHUB_EXT_ERRORS   | Enable error hint enrichment            | true

Config File

As an alternative to environment variables, configure via YAML or JSON:

datahub:
  url: https://datahub.example.com
  token: "${DATAHUB_TOKEN}"
  timeout: "30s"
  write_enabled: true

toolkit:
  default_limit: 20
  descriptions:
    datahub_search: "Custom search description for your deployment"

extensions:
  logging: true
  errors: true

Load with extensions.LoadConfig("config.yaml"). Environment variables override file values for sensitive fields. Token values support $VAR / ${VAR} expansion.
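
The $VAR / ${VAR} expansion and the environment-over-file precedence can be sketched with the standard library's os.Expand. This is an illustration of the general technique, not the library's implementation; expandToken and resolveToken are hypothetical names.

```go
package main

import (
	"fmt"
	"os"
)

// expandToken replaces $VAR and ${VAR} references in raw using lookup.
// os.Expand substitutes whatever lookup returns, including the empty
// string for unknown names.
func expandToken(raw string, lookup func(string) string) string {
	return os.Expand(raw, lookup)
}

// resolveToken prefers a value set directly in the environment and
// otherwise expands the value read from the config file.
func resolveToken(fileValue, envValue string, lookup func(string) string) string {
	if envValue != "" {
		return envValue
	}
	return expandToken(fileValue, lookup)
}

func main() {
	fileValue := "${DATAHUB_TOKEN}" // as written in config.yaml
	token := resolveToken(fileValue, "", os.Getenv)
	if token == "" {
		fmt.Println("no token configured")
		return
	}
	fmt.Println("token loaded, length", len(token))
}
```

Keeping the secret in the environment and only a reference in the file means the config file can be committed or shared without exposing the token.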

See configuration reference for all options.

Development

make build     # Build binary
make test      # Run tests with race detection
make lint      # Run golangci-lint
make security  # Run gosec and govulncheck
make coverage  # Generate coverage report
make verify    # Run tidy, lint, and test
make help      # Show all targets

Related Projects

  • txn2/mcp-trino (docs) - Composable MCP toolkit for Trino query execution
  • DataHub - The open-source metadata platform

Contributing

See CONTRIBUTING.md for guidelines.

License

Apache License 2.0


Open source by Craig Johnston, sponsored by Deasil Works, Inc.

Categories: Data & Analytics
Registry: active
Package: https://github.com/txn2/mcp-datahub/releases/download/v1.9.0/mcp-datahub_1.9.0_darwin_arm64.mcpb
Transport: STDIO
Updated: Jun 9, 2026
