Data Analysis

casper-studios/casper-marketplace
245 installs · 11 stars
Summary

This is a structured workflow for turning messy revenue data into defensible insights. It walks you through a seven-phase process from ingestion to output, with mandatory decision logging at every step (why this filter, why this metric, what assumptions you made). The bias checklist and data quality validator are the real value here: they force you to document survivorship bias, test weighting sensitivity, and add confidence intervals before you ship anything. Built for financial and RevOps contexts where stakeholders will actually question your methodology. Outputs range from exec slide decks to full Marimo notebooks with audit trails. If you've ever had an analysis questioned three months later and couldn't remember why you excluded certain records, this prevents that.

Install to Claude Code

npx -y skills add casper-studios/casper-marketplace --skill data-analysis --agent claude-code

Installs into .claude/skills of the current project.

Files
SKILL.md

Data Analysis

Overview

A comprehensive data analysis and storytelling skill optimized for financial, SaaS, and RevOps contexts. This skill provides structured workflows for turning raw data into actionable insights with full transparency on analytical decisions, bias awareness, and progressive disclosure reporting.

Workflow Overview

Every analysis follows a 7-phase process:

1. SETUP    → Initialize Marimo notebook (run init_marimo_notebook.py)
2. INGEST   → Load data, document sources and assumptions
3. EXPLORE  → EDA with logged decisions (why this viz, why this filter)
4. MODEL    → If needed, with interpretable-first approach
5. INTERPRET → Apply bias checklist, hedge appropriately
6. WISHLIST → Document data gaps and proxies used
7. OUTPUT   → Generate appropriate tier (slides/report/notebook)

Decision Logging Protocol

Every analytical choice must be logged. This creates an audit trail and enables reproducibility.

What to Log

| Decision Type | Example | Log Format |
|---------------|---------|------------|
| Data filtering | Removed 47 records with null revenue | FILTER: [reason] - [count] records affected |
| Metric choice | Used logo churn vs revenue churn | METRIC: [chosen] over [alternative] because [reason] |
| Visualization | Line chart for time series | VIZ: [type] because [reason] |
| Assumption | Assumed linear growth for projection | ASSUMPTION: [statement] - confidence: [H/M/L] |
| Proxy used | Used support tickets as NPS proxy | PROXY: [proxy] for [missing data] - quality: [S/M/W] |

Log Format in Notebook

# === DECISION LOG ===
# FILTER: Excluded trial accounts - 1,247 records removed
# METRIC: NRR over GRR because expansion is a significant factor
# ASSUMPTION: Q4 seasonality similar to prior year - confidence: M
# PROXY: Support ticket sentiment for NPS - quality: Weak
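
The log formats above are regular enough to generate from code, so that every entry comes out in the same shape. The following is a minimal sketch, not part of the skill: the `DecisionLog` class and its method names are hypothetical, and it covers only four of the five decision types from the table.

```python
from dataclasses import dataclass, field


@dataclass
class DecisionLog:
    """Hypothetical helper: collects entries and renders them as comments."""

    entries: list = field(default_factory=list)

    def filter(self, reason: str, count: int) -> None:
        self.entries.append(f"FILTER: {reason} - {count:,} records affected")

    def metric(self, chosen: str, alternative: str, reason: str) -> None:
        self.entries.append(f"METRIC: {chosen} over {alternative} because {reason}")

    def assumption(self, statement: str, confidence: str) -> None:
        # Confidence uses the H/M/L scale from the table above.
        if confidence not in {"H", "M", "L"}:
            raise ValueError("confidence must be H, M, or L")
        self.entries.append(f"ASSUMPTION: {statement} - confidence: {confidence}")

    def proxy(self, proxy: str, missing: str, quality: str) -> None:
        # Quality uses the S/M/W scale from the table above.
        if quality not in {"S", "M", "W"}:
            raise ValueError("quality must be S, M, or W")
        self.entries.append(f"PROXY: {proxy} for {missing} - quality: {quality}")

    def render(self) -> str:
        lines = ["# === DECISION LOG ==="]
        lines += [f"# {entry}" for entry in self.entries]
        return "\n".join(lines)


log = DecisionLog()
log.filter("Excluded trial accounts", 1247)
log.metric("NRR", "GRR", "expansion is a significant factor")
rendered = log.render()
print(rendered)
```

Rejecting unknown confidence or quality codes at write time keeps the audit trail consistent, which is the point of the protocol.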

Analysis Workflow Details

Phase 1: Setup

Run the initialization script to create a new Marimo notebook with pre-built scaffolding:

python scripts/init_marimo_notebook.py <notebook_name>

This creates a .py file with:

  • Decision log cell (markdown)
  • Data loading template
  • EDA template cells
  • Bias checklist cell

Phase 2: Data Ingestion

When loading data:

  1. Document the source (file path, API, database query)
  2. Record row/column counts
  3. Note any immediate data quality issues
  4. Log assumptions about data freshness

# === DATA SOURCE ===
# Source: sales_data_2024.csv
# Loaded: 2024-01-15
# Records: 15,847 rows x 23 columns
# Note: Data through 2024-01-10, 5-day lag from source system
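
A header like the one above can be produced at load time instead of typed by hand. The sketch below is an illustration under assumptions: it uses the standard-library `csv` module in place of pandas so it runs anywhere, `describe_source` is a hypothetical name, and the sample rows are made up.

```python
import csv
import io
from datetime import date


def describe_source(name, handle, loaded):
    """Build a DATA SOURCE comment block from an open CSV handle."""
    reader = csv.reader(handle)
    header = next(reader)
    rows = list(reader)
    # An empty field is a cheap first signal of a data quality issue.
    incomplete = sum(1 for row in rows if any(cell == "" for cell in row))
    return "\n".join([
        "# === DATA SOURCE ===",
        f"# Source: {name}",
        f"# Loaded: {loaded.isoformat()}",
        f"# Records: {len(rows):,} rows x {len(header)} columns",
        f"# Note: {incomplete} row(s) with at least one empty field",
    ])


# Made-up sample standing in for a real file.
sample = io.StringIO("account_id,segment,revenue\nA1,SMB,1200\nA2,ENT,\nA3,SMB,800\n")
header_block = describe_source("sample.csv", sample, date(2024, 1, 15))
print(header_block)
```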

Phase 3: Exploratory Data Analysis

Follow this EDA checklist:

  • Distribution of key numeric variables
  • Missing value patterns
  • Outlier detection
  • Time series patterns (if applicable)
  • Segment breakdowns
  • Correlation exploration

Log every visualization choice and filtering decision.

Phase 4: Modeling (If Needed)

Prioritize interpretability:

  1. First choice: Descriptive statistics, cohort analysis
  2. Second choice: Linear regression, decision trees
  3. Last resort: Complex ML (document why simpler won't work)

Always provide:

  • Model assumptions
  • Feature importance / coefficients
  • Confidence intervals
  • What the model cannot tell us

Phase 5: Interpretation

Before finalizing insights, run the bias checklist. See references/biases.md for full checklist.

Quick check:

  • Survivorship bias: Am I only looking at "survivors"?
  • Simpson's paradox: Do segment trends differ from aggregate?
  • Selection bias: Is my sample representative?
  • Small-n warning: Is sample size sufficient for claims?
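
The Simpson's paradox check can be made mechanical: compute the rate per segment and in aggregate, then see whether the ordering flips. The sketch below uses made-up win/loss counts for two hypothetical reps; it is an illustration of the check, not the contents of references/biases.md.

```python
# Hypothetical deals: {rep: {segment: (won, total)}}
deals = {
    "rep_a": {"SMB": (9, 10), "ENT": (30, 100)},
    "rep_b": {"SMB": (80, 100), "ENT": (2, 10)},
}


def segment_rates(rep):
    """Win rate per segment for one rep."""
    return {seg: won / total for seg, (won, total) in deals[rep].items()}


def aggregate_rate(rep):
    """Win rate across all segments for one rep."""
    won = sum(w for w, _ in deals[rep].values())
    total = sum(t for _, t in deals[rep].values())
    return won / total


a_wins_every_segment = all(
    segment_rates("rep_a")[seg] > segment_rates("rep_b")[seg]
    for seg in deals["rep_a"]
)
a_wins_aggregate = aggregate_rate("rep_a") > aggregate_rate("rep_b")

# The paradox: A is better in each segment yet worse overall,
# because A's deals are concentrated in the harder segment.
paradox = a_wins_every_segment and not a_wins_aggregate
print(paradox)
```

When the flag is true, report the segment-level rates alongside the aggregate and state the mix difference explicitly.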

Hedge appropriately:

  • Use "suggests" not "proves"
  • State confidence levels
  • Note what additional data would strengthen conclusions

Phase 5.5: Validation Gate (MANDATORY)

⚠️ GATE: Before proceeding to output, you MUST run the data quality validation checklist.

This is not optional. Run through references/data-quality-validator.md before finalizing:

Critical Patterns Checklist:

  • Market Context: Are YoY changes compared to market/baseline?
  • Weighting Sensitivity: If using composite scores, tested 5-6 weight scenarios?
  • Bootstrap CIs: For small samples, generated P10/P50/P90 ranges?
  • Survivorship Quantified: Data_Availability column added? Coverage % reported?
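
One way to produce the P10/P50/P90 ranges named in the Bootstrap CIs check is a percentile bootstrap of the sample mean. The sketch below is an assumption about how that could look using only the standard library; the skill's own validator may use a different statistic or resampling scheme, and `deal_sizes` is hypothetical data.

```python
import random
import statistics


def bootstrap_percentiles(values, n_resamples=2000, seed=0):
    """Percentile bootstrap of the mean; returns P10/P50/P90."""
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    means = sorted(
        statistics.fmean(rng.choices(values, k=len(values)))
        for _ in range(n_resamples)
    )

    def pct(p):
        index = min(len(means) - 1, int(p * len(means)))
        return means[index]

    return {"P10": pct(0.10), "P50": pct(0.50), "P90": pct(0.90)}


deal_sizes = [12, 15, 9, 40, 11, 14, 13, 55, 10, 12]  # hypothetical, in $k
ci = bootstrap_percentiles(deal_sizes)
print(ci)
```

With only ten observations and two large deals, the gap between P10 and P90 is wide, which is exactly what a point estimate alone would hide.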

Statistical Checks:

  • Sample sizes disclosed with confidence intervals?
  • Multiple comparisons accounted for?
  • Cherry-picked baselines avoided?
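
For the multiple comparisons check, the simplest conservative correction is Bonferroni: divide the significance level by the number of tests. The source does not name a specific method, so treat this as one option among several; the segment names and p-values below are made up.

```python
def bonferroni(p_values, alpha=0.05):
    """Flag which tests stay significant after a Bonferroni correction."""
    threshold = alpha / len(p_values)
    return {name: p <= threshold for name, p in p_values.items()}


# Hypothetical p-values from testing the same effect in four segments.
p_values = {"SMB": 0.004, "Mid-market": 0.03, "Enterprise": 0.20, "Strategic": 0.012}
significant = bonferroni(p_values)
print(significant)
```

Here the Mid-market result (p = 0.03) would pass an uncorrected 0.05 threshold but fails the corrected threshold of 0.0125.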

Logic Checks:

  • Correlation not claimed as causation?
  • Alternative explanations considered?
  • Back-of-envelope validation passed?
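
The Weighting Sensitivity item in the checklist above asks whether a composite score survives different weightings. A minimal sketch of that test follows, assuming normalized 0-1 metric scores; the rep names, metrics, and five weight scenarios are all hypothetical.

```python
# Hypothetical normalized scores (0-1) per rep and metric.
scores = {
    "rep_a": {"win_rate": 0.9, "deal_size": 0.3, "cycle_time": 0.5},
    "rep_b": {"win_rate": 0.6, "deal_size": 0.8, "cycle_time": 0.6},
    "rep_c": {"win_rate": 0.4, "deal_size": 0.5, "cycle_time": 0.9},
}

# Five weight scenarios; each set of weights sums to 1.
scenarios = {
    "equal": {"win_rate": 1 / 3, "deal_size": 1 / 3, "cycle_time": 1 / 3},
    "win_heavy": {"win_rate": 0.6, "deal_size": 0.2, "cycle_time": 0.2},
    "deal_heavy": {"win_rate": 0.2, "deal_size": 0.6, "cycle_time": 0.2},
    "cycle_heavy": {"win_rate": 0.2, "deal_size": 0.2, "cycle_time": 0.6},
    "no_cycle": {"win_rate": 0.5, "deal_size": 0.5, "cycle_time": 0.0},
}


def top_ranked(weights):
    """Return the name with the highest weighted composite score."""
    totals = {
        name: sum(weights[metric] * value for metric, value in metrics.items())
        for name, metrics in scores.items()
    }
    return max(totals, key=totals.get)


tops = {label: top_ranked(weights) for label, weights in scenarios.items()}
ranking_is_stable = len(set(tops.values())) == 1
print(tops, ranking_is_stable)
```

In this made-up data the top rep changes with the weights, so a claim like "rep_b is the best performer" would need to state which weighting it depends on.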

Methodology Note on Time Horizons: When assessing skill vs luck (e.g., sales rep performance, investment returns):

  • Longer time horizons (3+ years) reveal inconsistency that short windows hide
  • More periods = higher bar for "likely skill"
  • A rep who is "top 10%" for 2 quarters could easily be luck
  • A rep who is "top 10%" for 12 quarters is more likely skill
  • Always state the number of periods analyzed and what that implies for confidence
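
The 2-quarter versus 12-quarter contrast can be quantified with a binomial tail. The sketch below assumes a pure-luck null model in which each rep independently has a 10% chance of landing in the top decile each quarter; that independence assumption and the team size of 200 are illustrative, not from the skill.

```python
from math import comb


def prob_top_decile_at_least(k, n, p=0.10):
    """P(top decile in at least k of n quarters) under pure luck."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def expected_lucky_reps(team_size, k, n, p=0.10):
    """Expected number of reps hitting the streak by chance alone."""
    return team_size * prob_top_decile_at_least(k, n, p)


two_of_two = prob_top_decile_at_least(2, 2)
twelve_of_twelve = prob_top_decile_at_least(12, 12)
lucky_in_team = expected_lucky_reps(200, 2, 2)
print(two_of_two, twelve_of_twelve, lucky_in_team)
```

Under this model a 200-person team is expected to contain about two reps who were top 10% in both of two quarters by chance, while a 12-for-12 streak is vanishingly unlikely without some persistent factor.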

Do not proceed to Phase 6/7 until this checklist is complete.

Phase 6: Data Wishlisting

Document gaps and proxies. See references/data-wishlisting.md for patterns.

Format:

## Data Wishlist

| Missing Data | Proxy Used | Quality | Impact on Analysis |
|--------------|------------|---------|-------------------|
| Customer NPS | Support sentiment | Weak | Core finding, needs validation |
| True LTV | 12-month value | Moderate | Acceptable for segmentation |

Phase 7: Output Generation

Choose output tier based on audience and purpose:

| Tier | When to Use | Tool |
|------|-------------|------|
| Slides | Executive summary, board deck | generate_pptx_summary.py |
| Report | Detailed findings, stakeholder review | Markdown/PDF |
| Notebook | Full analysis, data team handoff | Marimo .py file |

Data Cleaning Workflow

For messy data that needs cleaning before analysis:

1. Profile the Data

python scripts/profile_data.py <csv_file> --output data_quality_report.md

This generates:

  • Column-level statistics (nulls, uniques, types)
  • Data quality score (A-F grading)
  • Suspicious pattern detection
  • Suggested cleaning steps

2. Apply Cleaning Patterns

Reference references/data-cleaning.md for:

  • Missing value strategies (drop, impute, flag)
  • Outlier detection methods (IQR, z-score, domain rules)
  • Common transforms (pivot, melt, merge patterns)
  • Type coercion recipes
  • Deduplication patterns
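
Of the outlier methods listed, the IQR rule is the easiest to show end to end. The sketch below is an illustration using the standard library rather than the recipe in references/data-cleaning.md; the 1.5 multiplier is the conventional default and the revenue values are made up.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


revenue = [100, 102, 98, 101, 99, 103, 97, 500]  # hypothetical values
flagged = iqr_outliers(revenue)
print(flagged)
```

Flagging rather than dropping keeps the decision visible: whether 500 is an error or a genuine large account is a domain call that belongs in the decision log.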

3. Handle Datetime Issues

Reference references/datetime-handling.md for:

  • Timezone conversion patterns
  • Date parsing for mixed formats
  • Fiscal calendar handling (FY vs CY)
  • Period aggregation (daily → weekly → monthly)
  • Business day calculations
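
Fiscal calendar handling mostly comes down to one mapping from a date to a fiscal year and quarter. The sketch below assumes a fiscal year that starts on the first of a given month and is labeled by the calendar year in which it ends; both conventions vary by company, and `fiscal_period` is a hypothetical name.

```python
from datetime import date


def fiscal_period(d, fy_start_month=2):
    """Return (fiscal_year, fiscal_quarter) for a date.

    Assumes the fiscal year is named after the calendar year it ends in.
    """
    months_into_fy = (d.month - fy_start_month) % 12
    quarter = months_into_fy // 3 + 1
    if fy_start_month == 1:
        fiscal_year = d.year  # fiscal year equals calendar year
    elif d.month >= fy_start_month:
        fiscal_year = d.year + 1
    else:
        fiscal_year = d.year
    return fiscal_year, quarter


print(fiscal_period(date(2024, 1, 15)))  # last month of the fiscal year
print(fiscal_period(date(2024, 2, 1)))   # first day of the next fiscal year
```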

Dashboard Building Workflow

For interactive monitoring dashboards:

1. Initialize Dashboard

python scripts/init_dashboard.py <dashboard_name>

This creates a Marimo dashboard with:

  • KPI cards row
  • Filter sidebar (segment, date range, period)
  • Time series trend chart
  • Summary data table
  • Responsive layout

2. Apply Dashboard Patterns

Reference references/dashboard-patterns.md for:

  • Marimo layout patterns (sidebar, tabs, grid)
  • KPI card templates with sparklines
  • Filter/slider patterns for interactivity
  • Data table styling and formatting
  • Time series with range selection
  • Refresh patterns for live data

Data Quality Validation Workflow

Before presenting or accepting analytical claims:

Run the Data Quality Validation Checklist

Reference references/data-quality-validator.md for comprehensive checklists:

Statistical Sins:

  • P-hacking / multiple comparisons
  • Small sample extrapolation
  • Missing confidence intervals
  • Cherry-picked baselines

Chart Crimes:

  • Truncated y-axis
  • Dual y-axis manipulation
  • 3D charts
  • Misleading scales

Logic Fallacies:

  • Correlation ≠ causation
  • Ecological fallacy
  • Base rate neglect
  • Survivorship bias
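
Base rate neglect is easiest to see with numbers. The sketch below applies Bayes' rule to a hypothetical churn-risk flag; the 5% base rate, 90% sensitivity, and 10% false positive rate are made-up inputs chosen for illustration.

```python
def posterior(base_rate, sensitivity, false_positive_rate):
    """P(condition | flagged) via Bayes' rule."""
    true_positives = base_rate * sensitivity
    false_positives = (1 - base_rate) * false_positive_rate
    return true_positives / (true_positives + false_positives)


# Hypothetical: 5% of accounts churn; the flag catches 90% of churners
# and also fires on 10% of healthy accounts.
p_churn_given_flag = posterior(0.05, 0.90, 0.10)
print(round(p_churn_given_flag, 3))
```

Even with a flag that catches 90% of churners, only about a third of flagged accounts actually churn here, because healthy accounts vastly outnumber churning ones.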

Sanity Checks:

  • Does this pass the smell test?
  • Back-of-envelope validation
  • Historical comparison
  • Cross-source validation

Excel Output Workflow

For exporting analysis results to Excel with proper formulas and formatting:

Financial Model Standards

Reference references/xlsx-patterns.md for:

  • Color coding convention (blue=inputs, black=formulas, green=cross-sheet links)
  • Number formatting standards (currency, percentages, multiples)
  • Formula construction rules (use formulas, not hardcoded values)
  • Common formula patterns for analysis

Verification

After creating Excel files with formulas, always recalculate:

python scripts/recalc.py output.xlsx

This ensures:

  • All formulas are calculated (openpyxl doesn't evaluate formulas)
  • Zero formula errors (#REF!, #DIV/0!, etc.)
  • JSON output shows any errors to fix

PDF Handling Workflow

For extracting data from PDFs or creating PDF reports:

Extracting Data

Reference references/pdf-patterns.md for:

  • Text extraction (pypdf, pdfplumber)
  • Table extraction to DataFrame
  • OCR for scanned documents
  • Command-line tools (pdftotext, qpdf)

Creating Reports

Reference references/pdf-patterns.md for:

  • Basic report creation with reportlab
  • Professional reports with sections and tables
  • Embedding matplotlib charts in PDFs
  • Merge/split operations

Reference Files

Load these as needed during analysis:

| Reference | When to Use |
|-----------|-------------|
| references/metrics.md | Calculating SaaS/RevOps metrics |
| references/biases.md | Interpretation phase, before finalizing insights |
| references/report-templates.md | Structuring output (pyramid vs consulting style) |
| references/visualization-guide.md | Choosing chart types, avoiding anti-patterns |
| references/data-wishlisting.md | Documenting gaps, rating proxy quality |
| references/data-cleaning.md | Data quality checks, cleaning patterns |
| references/datetime-handling.md | Timezone, parsing, fiscal calendars |
| references/dashboard-patterns.md | Marimo layouts, KPIs, interactivity |
| references/data-quality-validator.md | Data quality validation, detecting issues |
| references/xlsx-patterns.md | Excel output, financial model standards, formulas |
| references/pdf-patterns.md | PDF extraction, report creation, manipulation |

Scripts

| Script | Purpose | Usage |
|--------|---------|-------|
| scripts/init_marimo_notebook.py | Initialize analysis workspace | python scripts/init_marimo_notebook.py <name> |
| scripts/generate_pptx_summary.py | Create slide deck from findings | python scripts/generate_pptx_summary.py <config.json> |
| scripts/profile_data.py | Generate data quality report | python scripts/profile_data.py <csv_file> |
| scripts/init_dashboard.py | Scaffold interactive dashboard | python scripts/init_dashboard.py <name> |
| scripts/recalc.py | Recalculate Excel formulas | python scripts/recalc.py <xlsx_file> |

Technology Stack

| Tool | Purpose | Why |
|------|---------|-----|
| Marimo | Notebook environment | Pure Python files, reactive, git-friendly |
| pandas | Data manipulation | Reliable LLM code generation, mature ecosystem |
| Matplotlib/Seaborn | Visualization | Publication-quality, static, well-supported |
| python-pptx | Slide generation | Programmatic PowerPoint creation |
| openpyxl | Excel files | Formulas, formatting, financial models |
| pypdf/pdfplumber | PDF handling | Extract text, tables; create reports |
| reportlab | PDF creation | Professional PDF reports |

Example Invocations

Revenue analysis:

"Analyze our ARR trends by segment and identify drivers of growth/churn"

Pipeline analytics:

"Build a win rate analysis by deal size and sales rep"

Cohort analysis:

"Create a retention cohort analysis for customers acquired in 2023"

Forecasting:

"Project next quarter revenue based on current pipeline"

Board deck:

"Create an executive summary deck of our key SaaS metrics"

Data cleaning:

"Clean this messy CSV and profile the data quality"

Dashboard:

"Build a dashboard to monitor our key SaaS metrics"

Data validation:

"Validate these findings before I present them"

Excel output:

"Export this analysis to Excel with proper formulas and formatting"

PDF extraction:

"Extract the tables from this quarterly report PDF"

Financial model:

"Create a revenue projection model in Excel with scenario inputs"

First Seen: Jun 3, 2026