Built for the Claude Code community with Claude Code by @mertduzgun

Independent project, not affiliated with Anthropic

Terraform Skill

daymade/claude-code-skills
254 installs · 1.1k stars
Summary

Battle-tested fixes for Terraform provisioning failures that waste hours in multi-environment setups. Covers the specific races and config traps that break fresh deploys: cloud-init timing issues, SSH connection conflicts during file transfers, Cloudflare API token format errors that only surface in staging, hardcoded domains in Caddyfiles causing cert failures, and the init-data-only-once problem with Casdoor OAuth. Every trap includes the exact error message you'll see and a copy-paste fix. Honest take: this reads like someone's incident post-mortems turned into a runbook, which makes it way more useful than generic Terraform advice. Activate it when you're debugging containers stuck in Restarting state after apply or setting up a second environment that mysteriously breaks differently than production.

Install to Claude Code

npx -y skills add daymade/claude-code-skills --skill terraform-skill --agent claude-code

Installs into .claude/skills of the current project.

Files
SKILL.md

Terraform Operational Traps

Failure patterns from real deployments. Every item caused an incident. Organized as: exact error → root cause → copy-paste fix.

Provisioner traps (symptom → fix)

docker: not found in remote-exec

cloud-init still installing Docker when provisioner SSHs in.

provisioner "remote-exec" {
  inline = [
    "cloud-init status --wait || true",
    "which docker || { echo 'FATAL: Docker not ready'; exit 1; }",
  ]
}
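
If the binary can lag behind `cloud-init status --wait` (or cloud-init is absent from the image), a bounded poll gives a clearer failure than a single `which`. A minimal sketch; `wait_for_cmd` and its arguments are hypothetical names, not part of the skill:

```shell
#!/bin/sh
# wait_for_cmd NAME [TRIES] [DELAY_SECONDS]   (hypothetical helper)
# Polls until NAME resolves on PATH.
# Returns 0 when found, 1 after TRIES unsuccessful attempts.
wait_for_cmd() {
  name=$1
  tries=${2:-30}
  delay=${3:-2}
  i=0
  while [ "$i" -lt "$tries" ]; do
    if command -v "$name" >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "FATAL: $name not ready after $tries attempts" >&2
  return 1
}
```

In a provisioning script this could be called as `wait_for_cmd docker 60 5 || exit 1` before any `docker` command runs.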

rsync: connection unexpectedly closed in local-exec

Terraform holds its SSH connection open; local-exec rsync opens a second one that gets rejected. Never use local-exec for file transfer to remote. Use tarball + file provisioner:

provisioner "local-exec" {
  command = "tar czf /tmp/src.tar.gz --exclude=node_modules --exclude=.git -C ${path.module}/../../.. myproject"
}
provisioner "file" {
  source      = "/tmp/src.tar.gz"
  destination = "/tmp/src.tar.gz"
}
provisioner "remote-exec" {
  inline = ["tar xzf /tmp/src.tar.gz -C /data/ && rm -f /tmp/src.tar.gz"]
}

macOS BSD tar: --exclude must come BEFORE the source argument.

cloud-init status shows "running" forever

apt-get -y does not suppress debconf dialogs. Packages like iptables-persistent block on TTY prompts.

- |
    echo iptables-persistent iptables-persistent/autosave_v4 boolean true | debconf-set-selections
    echo iptables-persistent iptables-persistent/autosave_v6 boolean true | debconf-set-selections
    DEBIAN_FRONTEND=noninteractive apt-get install -y iptables-persistent

Known offenders: iptables-persistent, postfix, mysql-server, wireshark-common.

EACCES: permission denied in container logs, container Restarting

Host volume dirs are root-owned; container runs as non-root (uid 1001). Fix before docker compose up:

mkdir -p /data/myapp/data /data/myapp/logs
chown -R 1001:1001 /data/myapp/data /data/myapp/logs

Find the UID in the Dockerfile: grep for adduser.*-u or a USER instruction.
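
The UID lookup can be scripted so the chown target is not hardcoded. This is a rough sketch that assumes the Dockerfile sets the UID with `adduser`/`useradd ... -u N` or a numeric `USER`; `container_uid` is a hypothetical helper name:

```shell
#!/bin/sh
# container_uid DOCKERFILE   (hypothetical helper)
# Prints the first numeric UID from "adduser/useradd ... -u N",
# falling back to a numeric USER instruction. Returns 1 if none is found.
container_uid() {
  dockerfile=$1
  uid=$(grep -E '(adduser|useradd).*[[:space:]]-u[[:space:]]+[0-9]+' "$dockerfile" \
    | sed -E 's/.*[[:space:]]-u[[:space:]]+([0-9]+).*/\1/' | head -n 1)
  if [ -z "$uid" ]; then
    uid=$(sed -nE 's/^[[:space:]]*USER[[:space:]]+([0-9]+).*/\1/p' "$dockerfile" | head -n 1)
  fi
  if [ -n "$uid" ]; then
    printf '%s\n' "$uid"
  else
    return 1
  fi
}
```

A named `USER app` without a numeric UID is not resolved by this sketch; in that case the UID still has to come from the `adduser` line or from the base image.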

Provisioner fails but no diagnostic output

set -e exits on first error, hiding subsequent docker logs output. Use set -u without -e, put one verification gate at the end:

provisioner "remote-exec" {
  inline = [
    "set -u",
    "docker compose up -d",
    "sleep 15",
    "docker logs myapp --tail 20 2>&1 || true",
    "docker ps --format 'table {{.Names}}\\t{{.Status}}' || true",
    # Match '(healthy)' exactly: a bare 'healthy' also matches '(unhealthy)'
    "docker ps --filter name=myapp --format '{{.Status}}' | grep -q '(healthy)' || exit 1",
  ]
}
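
When gating on `docker ps` status text, keep in mind that `unhealthy` contains the substring `healthy`, so a bare substring match would pass an unhealthy container. Below is a sketch of a stricter check that also polls instead of relying on a fixed sleep. The function names are hypothetical, and it assumes the container defines a HEALTHCHECK:

```shell
#!/bin/sh
# status_is_healthy STATUS   (hypothetical helper)
# True only for status text containing the literal "(healthy)",
# e.g. "Up 15 seconds (healthy)". "(unhealthy)" and "(health: starting)" fail.
status_is_healthy() {
  case $1 in
    *"(healthy)"*) return 0 ;;
    *) return 1 ;;
  esac
}

# wait_healthy CONTAINER [TRIES] [DELAY_SECONDS]   (hypothetical helper)
# Polls docker ps until the container reports healthy or attempts run out.
wait_healthy() {
  name=$1
  tries=${2:-30}
  delay=${3:-2}
  i=0
  while [ "$i" -lt "$tries" ]; do
    status=$(docker ps --filter "name=$name" --format '{{.Status}}')
    if status_is_healthy "$status"; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}
```

Usage would look like `wait_healthy myapp 30 2 || exit 1`, placed after the diagnostic `docker logs` lines so their output is still printed on failure.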

Container Restarting — database tables missing

DB migrations not in provisioner. PostgreSQL docker-entrypoint-initdb.d only runs on empty data dir. Explicitly create DB + run migrations:

# After postgres healthy:
docker exec pg psql -U postgres -tc "SELECT 1 FROM pg_database WHERE datname='mydb'" | grep -q 1 \
  || docker exec pg psql -U postgres -c "CREATE DATABASE mydb;"

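# Idempotent migrations ($PSQL is assumed to hold the psql invocation for the target DB):
for f in migrations/*.sql; do
  VER=$(basename "$f")
  APPLIED=$($PSQL -tAc "SELECT 1 FROM schema_migrations WHERE version='$VER'" | tr -d ' ')
  [ "$APPLIED" = "1" ] && continue
  # ON_ERROR_STOP makes psql exit non-zero on a failed statement,
  # so a broken migration is never recorded as applied
  { echo 'BEGIN;'; cat "$f"; echo 'COMMIT;'; } | $PSQL -v ON_ERROR_STOP=1 || exit 1
  $PSQL -tAc "INSERT INTO schema_migrations(version) VALUES ('$VER') ON CONFLICT DO NOTHING"
done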

docker compose build ignores env var override

Build args here come from the .env file; a one-off shell prefix such as VAR=x docker compose build did NOT change the build. Persist the value in .env instead.

# WRONG
DOCKER_WITH_PROXY_MODE=disabled docker compose build

# RIGHT
grep -q DOCKER_WITH_PROXY_MODE .env || echo 'DOCKER_WITH_PROXY_MODE=disabled' >> .env
docker compose build
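
The grep-or-append line leaves an existing entry untouched, so a stale value already in .env survives. A hedged sketch of an upsert helper that replaces the entry instead; `set_env_var` is a hypothetical name and it assumes keys contain only letters, digits, and underscores:

```shell
#!/bin/sh
# set_env_var FILE KEY VALUE   (hypothetical helper)
# Removes any existing KEY=... line from FILE and appends KEY=VALUE,
# so repeated runs leave exactly one entry for KEY.
set_env_var() {
  file=$1
  key=$2
  value=$3
  tmp=$(mktemp) || return 1
  if [ -f "$file" ]; then
    # grep -v exits 1 when nothing is left to print; that is not an error here
    grep -v "^${key}=" "$file" > "$tmp" || true
  fi
  printf '%s=%s\n' "$key" "$value" >> "$tmp"
  # Write through the original path so the file keeps its permissions
  cat "$tmp" > "$file"
  rm -f "$tmp"
}
```

For the case above this would be `set_env_var .env DOCKER_WITH_PROXY_MODE disabled` followed by `docker compose build`.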

TLS handshake fails: Invalid format for Authorization header

Caddy DNS-01 ACME needs a Cloudflare API Token (cfut_ prefix, 40+ chars, Bearer auth). A Global API Key (37 hex chars, X-Auth-Key auth) causes HTTP 400 Code:6003. Production may appear to work because it has cached certificates; fresh environments fail on first cert request.

# Verify token format before deploy (abort instead of only printing a warning):
TOKEN=$(grep '^CLOUDFLARE_API_TOKEN=' .env | cut -d= -f2-)
echo "$TOKEN" | grep -q "^cfut_" || { echo "FATAL: needs API Token, not Global Key"; exit 1; }
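
To make the failure message say which credential was supplied, the two formats described above (a `cfut_` prefix for tokens, 37 hex characters for the Global API Key) can be told apart explicitly. A small sketch built only on those stated formats; `cf_credential_kind` is a hypothetical name:

```shell
#!/bin/sh
# cf_credential_kind VALUE   (hypothetical helper)
# Prints "token" for a cfut_-prefixed value, "global-key" for exactly
# 37 lowercase hex characters, and "unknown" for anything else.
cf_credential_kind() {
  cred=$1
  case $cred in
    cfut_*)
      echo token
      return 0
      ;;
  esac
  if printf '%s' "$cred" | grep -Eq '^[0-9a-f]{37}$'; then
    echo global-key
  else
    echo unknown
  fi
}
```

A deploy script could then fail with a specific hint, for example when `cf_credential_kind "$TOKEN"` prints `global-key`.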

Create scoped token via API:

curl -s "https://api.cloudflare.com/client/v4/user/tokens" -X POST \
  -H "X-Auth-Email: $CF_EMAIL" -H "X-Auth-Key: $CF_GLOBAL_KEY" \
  -H "Content-Type: application/json" \
  -d '{"name":"caddy-dns-acme","policies":[{"effect":"allow",
    "resources":{"com.cloudflare.api.account.zone.<ZONE_ID>":"*"},
    "permission_groups":[
      {"id":"4755a26eedb94da69e1066d98aa820be","name":"DNS Write"},
      {"id":"c8fed203ed3043cba015a93ad1616f1f","name":"Zone Read"}]}]}'

TLS fails on staging but works on production — hardcoded domains

Caddyfile or compose has literal domain names. Staging Caddy loads production config, tries to get certs for domains it doesn't own → ACME fails.

Caddyfile: Use {$VAR} — Caddy evaluates env vars at startup.

# WRONG
example.com {
  tls {
    dns cloudflare {env.CLOUDFLARE_API_TOKEN}
  }
}

# RIGHT
{$LOBEHUB_DOMAIN} {
  tls {
    dns cloudflare {env.CLOUDFLARE_API_TOKEN}
  }
}

Compose: Use ${VAR:?required} — fail-fast if unset.

# WRONG
- APP_URL=https://example.com

# RIGHT
- APP_URL=${APP_URL:?APP_URL is required}

Pass the env var to the gateway container so Caddy can read it:

environment:
  - LOBEHUB_DOMAIN=${LOBEHUB_DOMAIN:?LOBEHUB_DOMAIN is required}
  - CLOUDFLARE_API_TOKEN=${CLOUDFLARE_API_TOKEN:?required for DNS-01 TLS}
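
A rough way to catch literal site addresses before deploy is to grep the Caddyfile for lines that open a site block with a domain rather than `{$VAR}`. This is a heuristic sketch only: `check_no_hardcoded_sites` is a hypothetical name, and it misses comma-separated address lists and addresses without a dot:

```shell
#!/bin/sh
# check_no_hardcoded_sites CADDYFILE   (hypothetical helper)
# Prints "line:text" for each site block opened with a literal domain
# and returns 1; returns 0 when none are found.
check_no_hardcoded_sites() {
  if grep -nE '^[[:space:]]*[A-Za-z0-9*][A-Za-z0-9.*-]*\.[A-Za-z]{2,}(:[0-9]+)?[[:space:]]*\{' "$1"; then
    return 1
  fi
  return 0
}
```

Lines starting with `{$LOBEHUB_DOMAIN}` do not match because the pattern requires the address to begin with a letter, digit, or `*`.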

OAuth login fails: Social sign in failed

Casdoor init_data.json contains hardcoded redirect URIs. --createDatabase=true only applies init_data on first-ever DB creation — not on restarts. Fix via SQL in provisioner:

# Replace production domain with staging in existing Casdoor DB
$PSQL -c "UPDATE application SET redirect_uris = REPLACE(redirect_uris,
  'example.com', 'staging.example.com')
  WHERE name='lobechat'
  AND redirect_uris LIKE '%example.com%'
  AND redirect_uris NOT LIKE '%staging.example.com%';"

Also check AUTH_CASDOOR_ISSUER — it must match the Casdoor subdomain (auth.staging.example.com), not the app root domain.

Multi-environment isolation

Before creating a second environment, grep .tf files for hardcoded names. See references/multi-env-isolation.md for the complete matrix.

Will fail on apply (globally unique):

Resource             | Scope   | Fix
SSH key pair         | Region  | "${env}-deploy"
SLS log project      | Account | "${env}-logs"
CloudMonitor contact | Account | "${env}-ops"

DNS duplication trap: Two environments creating A records for the same name in the same Cloudflare zone → two independent record IDs → DNS round-robin → ~50% traffic to wrong instance. Fix: use subdomain isolation (staging.example.com) or separate zones. Remember to create DNS records for ALL subdomains Caddy serves (e.g., auth.staging, minio.staging).

Snapshot cross-contamination: Unfiltered data "alicloud_ecs_snapshots" returns ALL account snapshots. New env inherits old 100GB snapshot, fails creating 40GB disk. Gate with variable:

locals {
  # Parentheses are required for an expression that spans multiple lines in HCL
  latest_snapshot_id = (
    var.enable_snapshot_recovery && length(local.available_snapshots) > 0
    ? local.available_snapshots[0].snapshot_id
    : null
  )
}

Do NOT add count to the data source — changes its state address, causes drift.

Pre-deploy validation

Run a validation script before terraform apply to catch configuration errors locally. This eliminates the deploy→discover→fix→redeploy cycle.

Key checks (see references/pre-deploy-validation.md):

  1. terraform validate — syntax
  2. No hardcoded domains in Caddyfiles or compose files
  3. Required env vars present (LOBEHUB_DOMAIN, CLAUDE4DEV_DOMAIN, CLOUDFLARE_API_TOKEN, APP_URL, etc.)
  4. Cloudflare API Token format (not Global API Key)
  5. DNS records exist for all Caddy-served domains
  6. Casdoor issuer URL matches auth.* subdomain
  7. SSH private key exists

Integrate into Makefile: make pre-deploy ENV=staging before make apply.
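
As an illustration of check 3, a validation script might verify that each required variable has a non-empty value in the environment's .env file. A minimal sketch, assuming variables are stored as plain `KEY=value` lines; `missing_env_vars` is a hypothetical name, not taken from references/pre-deploy-validation.md:

```shell
#!/bin/sh
# missing_env_vars FILE KEY...   (hypothetical helper)
# Prints "MISSING: KEY" for every KEY that is absent or empty in FILE.
# Returns 1 if anything is missing, 0 otherwise.
missing_env_vars() {
  file=$1
  shift
  rc=0
  for key in "$@"; do
    if ! grep -Eq "^${key}=.+" "$file"; then
      echo "MISSING: $key"
      rc=1
    fi
  done
  return $rc
}
```

A `pre-deploy` target could call `missing_env_vars .env LOBEHUB_DOMAIN CLAUDE4DEV_DOMAIN CLOUDFLARE_API_TOKEN APP_URL || exit 1` ahead of the other checks.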

Zero-to-deployment

Fresh disks expose every implicit dependency. See references/zero-to-deploy-checklist.md.

Key items that break provisioners on fresh instances:

  1. Directories: mkdir -p /data/svc1 /data/svc2 in cloud-init (runcmd strings run under sh, where {svc1,svc2} brace expansion is not guaranteed); the file provisioner fails if the target dir is missing
  2. Databases: Explicit CREATE DATABASE — PG init scripts only run on empty data dir
  3. Migrations: Tracked in schema_migrations table, applied idempotently
  4. Provisioner ordering: depends_on between resources sharing Docker networks
  5. Memory: Stop non-critical containers during Docker build on small instances (≤8GB)
  6. Domain parameterization: Every domain in Caddyfile/compose must be {$VAR} / ${VAR:?required}
  7. Credential format: Caddy needs API Token (cfut_), not Global API Key
Categories
DevOps & CI/CD, Cloud & Infrastructure
First Seen: Jun 3, 2026

Recommended

More DevOps & CI/CD →
observability-monitoring-monitor-setup

sickn33/antigravity-awesome-skills

observability monitoring monitor setup
262
39.4k
kubesphere-devops-pipeline

kubesphere/kubesphere

This handles CI/CD pipeline operations in KubeSphere's DevOps platform, which wraps Jenkins with Kubernetes custom resources.
17k
monitoring-observability

ahmedasmar/devops-claude-skills

monitoring observability
391
165
gitlab-ci-validator

akin-ozer/cc-devops-skills

gitlab ci validator
236
224
gitlab-ci-generator

akin-ozer/cc-devops-skills

gitlab ci generator
234
224
monitoring-observability

supercent-io/skills-template

Comprehensive monitoring setup with metrics collection, log aggregation, alerting, and health checks.
11k
88