Token Optimizer MCP

4057 tools · STDIO · registry active

Summary

Token Optimizer MCP is a token-reduction layer that sits between Claude and your file system, API calls, and development tools. It replaces standard MCP tools like read_file and grep with cached, compressed alternatives that store content in SQLite and return only diffs on subsequent calls. Its 65 tools (per the list below) cover file operations, API fetching with TTL-based caching, database queries, test runners, and multi-tier cache management with LRU/LFU eviction. The project reports 60-90% token reduction across 38,000+ operations in production use. Reach for it when you are hitting context limits on large codebases or repeatedly fetching the same API responses, and you want automatic optimization without changing your workflow.

Tools

Public tool metadata for what this MCP can expose to an agent.

7 tools

proxy_chat (4 params)

Send a chat completion request through PromptThin's cost-saving proxy. Five optimization routes are applied automatically: (A) semantic cache, (B) prompt compression via LLMLingua 2, (C) model routing to cheaper models, (D) context pruning at 8K tokens, (E) thinking budget cap...

Parameters:

model (string) - The LLM model to use. Examples: 'gpt-4o', 'gpt-4o-mini' (OpenAI), 'claude-sonnet-4-6' (Anthropic), 'gemini-2.5-flash' (Gemini), 'llama-3.1-8b-instant' (Groq).
messages (array) - Conversation messages in OpenAI chat format. Each message needs a 'role' and 'content'.
max_tokens (integer) - Maximum tokens to generate. Higher values allow longer responses but cost more. Default: 1000.
temperature (number) - Controls randomness (0.0 = deterministic, 2.0 = very random). Lower for factual tasks, higher for creative. Default: 0.7.

proxy_predict (3 params)

Estimate cost savings BEFORE making a real LLM call; completely free, no tokens consumed. Returns original token count, estimated tokens after savings, cost comparison, saving percentage, applicable methods, and a recommendation.

Parameters:

model (string) - The target LLM model to estimate costs for. Examples: 'gpt-4o', 'claude-sonnet-4-6', 'gemini-2.5-flash'.
messages (array) - Conversation messages to analyze for savings estimation. Same format as chat_completion.
provider (string) - The LLM provider. Must match the model, e.g. 'openai' for gpt-4o, 'anthropic' for claude models. One of: openai, anthropic, gemini, groq.

billing_start_trial (1 param)

Start a 7-day free Pro trial. Returns a Stripe checkout URL. No charge for 7 days. Pro plan: $4.99 first month, then $11.99/month. 10,000 requests/month. Cancel anytime.

Parameters:

format (string) - Response format: 'text' for a human-readable message with the checkout URL, 'json' for machine-parseable output. One of: text, json. Default: text.

usage_summary (1 param)

Retrieve complete usage summary: total requests, cache hit rate, total tokens saved, estimated cost saved in USD, and requests routed to cheaper models.

Parameters:

format (string) - Response format: 'text' for a human-readable summary, 'json' for structured data. One of: text, json. Default: text.

billing_status (1 param)

Retrieve current plan, monthly request limit, requests used this month, remaining requests, and usage percentage. Warns when usage exceeds 80%.

Parameters:

format (string) - Response format: 'text' for a human-readable status, 'json' for structured data. One of: text, json. Default: text.

cache_flush (1 param)

Mark all cached responses as stale, forcing fresh LLM calls. Use after updating a knowledge base or changing system prompts.

Parameters:

confirm (boolean) - Safety flag to confirm the flush operation. Must be true to proceed. Prevents accidental cache invalidation. Default: true.

usage_recent (1 param)

Retrieve recent proxied API requests with timestamp, provider, model, token counts, cache hit status, routing info, and tokens saved.

Parameters:

limit (integer) - Number of recent requests to return. Maximum is 50. Default: 10.

Token Optimizer MCP

Intelligent token optimization through caching, compression, and smart tooling for Claude Code and Claude Desktop

Overview

Token Optimizer MCP is a Model Context Protocol (MCP) server that reduces context window usage by 60-90% through intelligent caching, compression, and smart tool replacements. By storing compressed content externally in SQLite and providing optimized alternatives to standard tools, the server helps you maximize your available context window.

Production Results: 60-90% token reduction across 38,000+ operations in real-world usage.

Key Features

Smart Tool Replacements: Automatic optimization for Read, Grep, Glob, and more
Context Window Optimization: Store content externally to free up context space
High Compression: Brotli compression (2-4x typical, up to 82x for repetitive content)
Persistent Caching: SQLite-based cache that persists across sessions
Accurate Token Counting: Uses tiktoken for precise token measurements
65 Specialized Tools: File operations, API caching, database optimization, monitoring, and more
Zero External Dependencies: Completely offline operation
Production Ready: Built with TypeScript for reliability

Installation

Quick Install (Recommended)

Windows

# Run PowerShell as Administrator, then:
Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope CurrentUser

# Install globally (hooks install automatically!)
npm install -g @ooples/token-optimizer-mcp

macOS / Linux

# Install globally (hooks install automatically!)
npm install -g @ooples/token-optimizer-mcp

That's it! The postinstall script will automatically:

✅ Install token-optimizer-mcp globally via npm
✅ Auto-detect and configure all installed AI tools (Claude Desktop, Cursor, Cline, etc.)
✅ Set up automatic token optimization on every tool call
✅ Configure workspace trust and execution permissions

Result: 60-90% token reduction across all operations!

Note: If automatic installation is skipped (e.g., in CI environments), you can manually run the installer:

Windows: powershell -ExecutionPolicy Bypass -File install-hooks.ps1
macOS/Linux: bash install-hooks.sh

Manual Configuration

For detailed platform-specific installation instructions, see docs/HOOKS-INSTALLATION.md.

Available Tools (65 Total)

Core Caching & Optimization (8 tools)

optimize_text - Compress and cache text (primary tool for token reduction)
get_cached - Retrieve previously cached text
compress_text - Compress text using Brotli
decompress_text - Decompress Brotli-compressed text
count_tokens - Count tokens using tiktoken (GPT-4 tokenizer)
analyze_optimization - Analyze text and get optimization recommendations
get_cache_stats - View cache hit rates and compression ratios
clear_cache - Clear all cached data

Usage Example:

// Cache large content to remove it from context window
optimize_text({
  text: "Large API response or file content...",
  key: "api-response-key",
  quality: 11
})
// Result: 60-90% token reduction
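
The compression step behind optimize_text can be illustrated with Node.js's built-in Brotli bindings. This is only a sketch under assumptions, not the server's actual code: the helper names compressText and decompressText are hypothetical, and quality 11 simply mirrors the quality: 11 argument in the example above.

```typescript
import { brotliCompressSync, brotliDecompressSync, constants } from "node:zlib";

// Hypothetical helper: compress UTF-8 text with Brotli at the given quality (0-11).
function compressText(text: string, quality: number = 11): Buffer {
  return brotliCompressSync(Buffer.from(text, "utf8"), {
    params: { [constants.BROTLI_PARAM_QUALITY]: quality },
  });
}

// Hypothetical helper: restore the original text from Brotli-compressed bytes.
function decompressText(compressed: Buffer): string {
  return brotliDecompressSync(compressed).toString("utf8");
}

// Highly repetitive content, the case where Brotli does best.
const sample = "GET /api/items 200 OK\n".repeat(500);
const packed = compressText(sample);
const restored = decompressText(packed);

console.log(`original: ${Buffer.byteLength(sample)} bytes, compressed: ${packed.length} bytes`);
console.log(`round trip intact: ${restored === sample}`);
```

The byte savings shown here are storage savings. The context-window saving comes from keeping the content out of the conversation and referring to it by cache key.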

Smart File Operations (10 tools)

Optimized replacements for standard file tools with intelligent caching and diff-based updates:

smart_read - Read files with 80% token reduction through caching and diffs
smart_write - Write files with verification and change tracking
smart_edit - Line-based file editing with diff-only output (90% reduction)
smart_grep - Search file contents with match-only output (80% reduction)
smart_glob - File pattern matching with path-only results (75% reduction)
smart_diff - Git diffs with diff-only output (85% reduction)
smart_branch - Git branch listing with structured JSON (60% reduction)
smart_log - Git commit history with smart filtering (75% reduction)
smart_merge - Git merge management with conflict analysis (80% reduction)
smart_status - Git status with status-only output (70% reduction)

Usage Example:

// Read a file with automatic caching
smart_read({ path: "/path/to/file.ts" })
// First read: full content
// Subsequent reads: only diff (80% reduction)
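
To make the "full content first, diff afterwards" behavior concrete, here is a minimal sketch. It assumes an in-memory Map in place of the SQLite cache and a naive line-by-line comparison in place of a real diff algorithm; cachedRead and ReadResult are hypothetical names, not part of the tool's API.

```typescript
type ReadResult =
  | { kind: "full"; content: string }
  | { kind: "unchanged" }
  | { kind: "diff"; changes: string[] };

// Hypothetical in-memory stand-in for the persistent cache.
const lastSeen = new Map<string, string>();

function cachedRead(path: string, currentContent: string): ReadResult {
  const previous = lastSeen.get(path);
  lastSeen.set(path, currentContent);

  // First read: nothing cached yet, so the full content is returned.
  if (previous === undefined) return { kind: "full", content: currentContent };
  if (previous === currentContent) return { kind: "unchanged" };

  // Naive positional comparison; a real implementation would use a proper diff.
  const oldLines = previous.split("\n");
  const newLines = currentContent.split("\n");
  const changes: string[] = [];
  const lineCount = Math.max(oldLines.length, newLines.length);
  for (let i = 0; i < lineCount; i++) {
    if (oldLines[i] !== newLines[i]) {
      if (oldLines[i] !== undefined) changes.push(`-${i + 1}: ${oldLines[i]}`);
      if (newLines[i] !== undefined) changes.push(`+${i + 1}: ${newLines[i]}`);
    }
  }
  return { kind: "diff", changes };
}

const first = cachedRead("/src/app.ts", "const a = 1;\nconst b = 2;\n");
const second = cachedRead("/src/app.ts", "const a = 1;\nconst b = 3;\n");
console.log(first.kind);
console.log(second);
```

The second call returns only the two changed lines, which is where the reduction on repeat reads would come from.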

API & Database Operations (10 tools)

Intelligent caching and optimization for external data sources:

smart_api_fetch - HTTP requests with caching and retry logic (83% reduction on cache hits)
smart_cache_api - API response caching with TTL/ETag/event-based strategies
smart_database - Database queries with connection pooling and caching (83% reduction)
smart_sql - SQL query analysis with optimization suggestions (83% reduction)
smart_schema - Database schema analysis with intelligent caching
smart_graphql - GraphQL query optimization with complexity analysis (83% reduction)
smart_rest - REST API analysis with endpoint discovery (83% reduction)
smart_orm - ORM query optimization with N+1 detection (83% reduction)
smart_migration - Database migration tracking (83% reduction)
smart_websocket - WebSocket connection management with message tracking

Usage Example:

// Fetch API with automatic caching
smart_api_fetch({
  method: "GET",
  url: "https://api.example.com/data",
  ttl: 300
})
// Cached responses: 95% token reduction
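
The ttl parameter suggests a time-to-live cache keyed by request. The sketch below shows that idea in isolation, assuming ttl is in seconds (consistent with the "1 hour" comment on ttl: 3600 elsewhere in this document). TtlCache and the injected clock are hypothetical and exist only so the expiry can be demonstrated without waiting.

```typescript
interface CacheEntry<T> {
  value: T;
  expiresAt: number; // milliseconds on the injected clock
}

class TtlCache<T> {
  private entries = new Map<string, CacheEntry<T>>();
  private now: () => number;

  constructor(now: () => number = () => Date.now()) {
    this.now = now;
  }

  set(key: string, value: T, ttlSeconds: number): void {
    this.entries.set(key, { value, expiresAt: this.now() + ttlSeconds * 1000 });
  }

  get(key: string): T | undefined {
    const entry = this.entries.get(key);
    if (entry === undefined) return undefined;
    if (this.now() >= entry.expiresAt) {
      // Expired: drop the entry so the caller refetches.
      this.entries.delete(key);
      return undefined;
    }
    return entry.value;
  }
}

// Fake clock so the example is deterministic.
let clock = 0;
const apiCache = new TtlCache<string>(() => clock);
const requestKey = "GET https://api.example.com/data";

apiCache.set(requestKey, '{"items":[]}', 300);
clock = 299000;
const hit = apiCache.get(requestKey);  // still within the 300 s TTL
clock = 300000;
const miss = apiCache.get(requestKey); // TTL elapsed, entry evicted
console.log(hit, miss);
```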

Build & Test Operations (10 tools)

Development workflow optimization with intelligent caching:

smart_build - TypeScript builds with diff-based change detection
smart_test - Test execution with incremental test selection
smart_lint - ESLint with incremental analysis and auto-fix
smart_typecheck - TypeScript type checking with caching
smart_install - Package installation with dependency analysis
smart_docker - Docker operations with layer analysis
smart_logs - Log aggregation with pattern filtering
smart_network - Network diagnostics with anomaly detection
smart_processes - Process monitoring with resource tracking
smart_system_metrics - System resource monitoring with performance recommendations

Usage Example:

// Run tests with caching
smart_test({
  onlyChanged: true,  // Only test changed files
  coverage: true
})

Advanced Caching (10 tools)

Enterprise-grade caching strategies with 87-92% token reduction:

smart_cache - Multi-tier cache (L1/L2/L3) with 6 eviction strategies (90% reduction)
cache_warmup - Intelligent cache pre-warming with schedule support (87% reduction)
cache_analytics - Real-time dashboards and trend analysis (88% reduction)
cache_benchmark - Performance testing and strategy comparison (89% reduction)
cache_compression - 6 compression algorithms with adaptive selection (89% reduction)
cache_invalidation - Dependency tracking and pattern-based invalidation (88% reduction)
cache_optimizer - ML-based recommendations and bottleneck detection (89% reduction)
cache_partition - Sharding and consistent hashing (87% reduction)
cache_replication - Distributed replication with conflict resolution (88% reduction)
predictive_cache - ML-based predictive caching with ARIMA/LSTM (91% reduction)

Usage Example:

// Configure multi-tier cache
smart_cache({
  operation: "configure",
  evictionStrategy: "LRU",
  l1MaxSize: 1000,
  l2MaxSize: 10000
})
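
For readers unfamiliar with the "LRU" eviction strategy named above, the following sketch shows least-recently-used eviction for a single tier. It is a generic illustration built on Map insertion order, not the tool's implementation; LruCache is a hypothetical name.

```typescript
class LruCache<K, V> {
  private entries = new Map<K, V>();
  private maxSize: number;

  constructor(maxSize: number) {
    this.maxSize = maxSize;
  }

  get(key: K): V | undefined {
    if (!this.entries.has(key)) return undefined;
    const value = this.entries.get(key) as V;
    // Re-insert so this key becomes the most recently used.
    this.entries.delete(key);
    this.entries.set(key, value);
    return value;
  }

  set(key: K, value: V): void {
    if (this.entries.has(key)) this.entries.delete(key);
    this.entries.set(key, value);
    if (this.entries.size > this.maxSize) {
      // Map iterates in insertion order, so the first key is the least recently used.
      const oldest = this.entries.keys().next().value as K;
      this.entries.delete(oldest);
    }
  }

  keys(): K[] {
    return Array.from(this.entries.keys());
  }
}

const l1 = new LruCache<string, string>(2);
l1.set("a", "1");
l1.set("b", "2");
l1.get("a");      // touch "a", so "b" is now the oldest
l1.set("c", "3"); // exceeds maxSize, evicts "b"
console.log(l1.keys()); // [ 'a', 'c' ]
```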

Monitoring & Dashboards (7 tools)

Comprehensive monitoring with 88-92% token reduction through intelligent caching:

alert_manager - Multi-channel alerting (email, Slack, webhook) with routing (89% reduction)
metric_collector - Time-series metrics with multi-source support (88% reduction)
monitoring_integration - External platform integration (Prometheus, Grafana, Datadog) (87% reduction)
custom_widget - Dashboard widgets with template caching (88% reduction)
data_visualizer - Interactive visualizations with SVG optimization (92% reduction)
health_monitor - System health checks with state compression (91% reduction)
log_dashboard - Log analysis with pattern detection (90% reduction)

Usage Example:

// Create an alert
alert_manager({
  operation: "create-alert",
  alertName: "high-cpu-usage",
  channels: ["slack", "email"],
  threshold: { type: "above", value: 80 }
})

System Operations (6 tools)

System-level operations with smart caching:

smart_cron - Scheduled task management (cron/Windows Task Scheduler) (85% reduction)
smart_user - User and permission management across platforms (86% reduction)
smart_ast_grep - Structural code search with AST indexing (83% reduction)
get_session_stats - Session-level token usage statistics
analyze_project_tokens - Project-wide token analysis and cost estimation
optimize_session - Compress large file operations from current session

Usage Example:

// View session token usage
get_session_stats({})
// Result: Detailed breakdown of token usage by tool

Token Analytics (4 tools)

Granular token usage analytics for pinpointing optimization opportunities:

get_hook_analytics - Token usage breakdown by hook phase (PreToolUse, PostToolUse, etc.)
get_action_analytics - Token usage breakdown by tool/action (Read, Write, Grep, etc.)
get_mcp_server_analytics - Token usage breakdown by MCP server (token-optimizer, filesystem, etc.)
export_analytics - Export analytics data in JSON or CSV format with filtering

Usage Example:

// Get per-hook analytics
get_hook_analytics({
  startDate: "2025-01-01T00:00:00Z",
  endDate: "2025-12-31T23:59:59Z"
})
// Result: Shows which hooks consume the most tokens

// Get per-action analytics
get_action_analytics({})
// Result: Shows which tools use the most tokens

// Export analytics as CSV
export_analytics({
  format: "csv",
  hookPhase: "PreToolUse"
})
// Result: CSV export filtered by PreToolUse hook

Key Features:

Per-hook phase tracking (PreToolUse, PostToolUse, SessionStart, etc.)
Per-action tracking (Read, Write, count_tokens, etc.)
Per-MCP-server tracking (token-optimizer, filesystem, GitHub, etc.)
Date range filtering
JSON and CSV export
Persistent storage with SQLite
Zero performance impact (async batched writes)

How It Works

Global Hooks System (7-Phase Optimization)

When global hooks are installed, token-optimizer-mcp runs automatically on every tool call:

┌─────────────────────────────────────────────────────────────┐
│ Phase 1: PreToolUse - Tool Replacement                      │
│ ├─ Read   → smart_read   (80% token reduction)             │
│ ├─ Grep   → smart_grep   (80% token reduction)             │
│ └─ Glob   → smart_glob   (75% token reduction)             │
└─────────────────────────────────────────────────────────────┘
                          ↓
┌─────────────────────────────────────────────────────────────┐
│ Phase 2: Input Validation - Cache Lookups                   │
│ └─ get_cached checks if operation was already done          │
└─────────────────────────────────────────────────────────────┘
                          ↓
┌─────────────────────────────────────────────────────────────┐
│ Phase 3: PostToolUse - Output Optimization                  │
│ ├─ optimize_text for large outputs                          │
│ └─ compress_text for repeated content                       │
└─────────────────────────────────────────────────────────────┘
                          ↓
┌─────────────────────────────────────────────────────────────┐
│ Phase 4: Session Tracking                                   │
│ └─ Log all operations to operations-{sessionId}.csv         │
└─────────────────────────────────────────────────────────────┘
                          ↓
┌─────────────────────────────────────────────────────────────┐
│ Phase 5: UserPromptSubmit - Prompt Optimization             │
│ └─ Optimize user prompts before sending to API              │
└─────────────────────────────────────────────────────────────┘
                          ↓
┌─────────────────────────────────────────────────────────────┐
│ Phase 6: PreCompact - Pre-Compaction Optimization           │
│ └─ Optimize before Claude Code compacts the conversation    │
└─────────────────────────────────────────────────────────────┘
                          ↓
┌─────────────────────────────────────────────────────────────┐
│ Phase 7: Metrics & Reporting                                │
│ └─ Track token reduction metrics and generate reports       │
└─────────────────────────────────────────────────────────────┘
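
Phase 1 of the pipeline above amounts to a lookup from a standard tool name to its optimized replacement. A minimal sketch of that mapping follows; the resolveTool function is hypothetical, and the real hooks are shell and PowerShell scripts rather than TypeScript.

```typescript
// Replacements taken from the Phase 1 box in the diagram above.
const replacements = new Map<string, string>([
  ["Read", "smart_read"],
  ["Grep", "smart_grep"],
  ["Glob", "smart_glob"],
]);

// Hypothetical helper: return the optimized tool if one exists, else the original name.
function resolveTool(requested: string): string {
  return replacements.get(requested) ?? requested;
}

console.log(resolveTool("Read"));  // smart_read
console.log(resolveTool("Write")); // Write (no replacement listed in Phase 1)
```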

Production Performance

Based on 38,000+ operations in real-world usage:

Tool Category	Avg Token Reduction	Cache Hit Rate
File Operations	60-90%	>80%
API Responses	83-95%	>75%
Database Queries	83-90%	>70%
Build/Test Output	70-85%	>65%

Per-Session Savings: 300K-700K tokens (worth $0.90-$2.10 at $3/M tokens)
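
The dollar figures follow directly from the token counts and the quoted $3 per million tokens. As a quick check of that arithmetic (tokenCostUsd is a hypothetical helper):

```typescript
// Cost in USD for a number of tokens at a given price per million tokens.
function tokenCostUsd(tokens: number, usdPerMillionTokens: number): number {
  return (tokens * usdPerMillionTokens) / 1000000;
}

const low = tokenCostUsd(300000, 3);
const high = tokenCostUsd(700000, 3);
console.log(`$${low.toFixed(2)}-$${high.toFixed(2)} per session`); // $0.90-$2.10 per session
```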

Usage Examples

Basic Caching

// Cache large content to remove from context window
const result = await optimize_text({
  text: "Large API response or file content...",
  key: "cache-key",
  quality: 11
});
// Result: Original tokens removed, only cache key remains (~50 tokens)

// Retrieve later
const cached = await get_cached({ key: "cache-key" });
// Result: Full original content restored

Smart File Reading

// First read: full content
await smart_read({ path: "/src/app.ts" });

// Subsequent reads: only changes (80% reduction)
await smart_read({ path: "/src/app.ts" });

API Caching

// First request: fetch and cache
await smart_api_fetch({
  method: "GET",
  url: "https://api.example.com/data",
  ttl: 300
});

// Subsequent requests: cached (95% reduction)
await smart_api_fetch({
  method: "GET",
  url: "https://api.example.com/data"
});

Session Analysis

// View token usage for current session
await get_session_stats({});
// Result: Breakdown by tool, operation, and savings

// Analyze entire project
await analyze_project_tokens({
  projectPath: "/path/to/project"
});
// Result: Cost estimation and optimization opportunities

Technology Stack

Runtime: Node.js 20+
Language: TypeScript
Database: SQLite (better-sqlite3)
Token Counting: tiktoken (GPT-4 tokenizer)
Compression: Brotli (built-in Node.js)
Caching: Multi-tier LRU/LFU/FIFO caching
Protocol: MCP SDK (@modelcontextprotocol/sdk)

Supported AI Tools

The automated installer detects and configures token-optimizer-mcp for:

✅ Claude Code - CLI with global hooks integration
✅ Claude Desktop - Native desktop application
✅ Cursor IDE - AI-first code editor
✅ Cline - VS Code extension (formerly Claude Dev)
✅ GitHub Copilot - VS Code with MCP support
✅ Windsurf IDE - AI-powered development environment

No manual configuration needed - the installer automatically detects and configures all installed tools!

Documentation

Detailed Tool Reference - Complete documentation for all 65 tools
Installation Guide - Platform-specific installation instructions
Contributing Guide - Development setup and contribution guidelines

Performance Characteristics

Compression Ratio: 2-4x typical (up to 82x for repetitive content)
Context Window Savings: 60-90% average across all operations
Cache Hit Rate: >80% in typical usage
Operation Overhead: <10ms for cache operations (optimized from 50-70ms)
Compression Speed: ~1ms per KB of text
Hook Overhead: <10ms per operation (7x improvement from in-memory optimizations)

Performance Optimizations

The PowerShell hooks have been optimized to reduce overhead from 50-70ms to <10ms through:

In-Memory Session State: Session data kept in memory instead of disk I/O on every operation
Batched Log Writes: Operation logs buffered and flushed every 5 seconds or 100 operations
Lazy Persistence: Disk writes only occur when necessary (session end, optimization, reports)
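
The batched-write rule above (flush every 5 seconds or 100 operations) can be sketched as follows. This is an illustration in TypeScript rather than the PowerShell the hooks are written in. BatchedLogWriter, the sink callback, and the injected clock are hypothetical, and for simplicity the age check only runs when a new line is written; a real implementation would likely also use a timer.

```typescript
class BatchedLogWriter {
  private buffer: string[] = [];
  private lastFlush: number;
  private sink: (lines: string[]) => void;
  private now: () => number;
  private maxEntries: number;
  private maxAgeMs: number;

  constructor(
    sink: (lines: string[]) => void,
    now: () => number,
    maxEntries: number = 100,
    maxAgeMs: number = 5000
  ) {
    this.sink = sink;
    this.now = now;
    this.maxEntries = maxEntries;
    this.maxAgeMs = maxAgeMs;
    this.lastFlush = now();
  }

  write(line: string): void {
    this.buffer.push(line);
    const tooMany = this.buffer.length >= this.maxEntries;
    const tooOld = this.now() - this.lastFlush >= this.maxAgeMs;
    if (tooMany || tooOld) this.flush();
  }

  flush(): void {
    if (this.buffer.length > 0) {
      this.sink(this.buffer); // one write for the whole batch
      this.buffer = [];
    }
    this.lastFlush = this.now();
  }
}

// Fake clock and an in-memory sink that records batch sizes.
let clockMs = 0;
const batches: number[] = [];
const writer = new BatchedLogWriter((lines) => batches.push(lines.length), () => clockMs);

for (let i = 0; i < 100; i++) writer.write(`op ${i}`); // 100th write triggers a flush
writer.write("op 100");
clockMs = 5000;                                        // 5 seconds pass
writer.write("op 101");                                // age threshold triggers a flush
console.log(batches); // [ 100, 2 ]
```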

Environment Variables

Control hook behavior with these environment variables:

Performance Controls

TOKEN_OPTIMIZER_USE_FILE_SESSION (default: false)
- Set to true to revert to file-based session tracking (legacy mode)
- Use if you encounter issues with in-memory session state
- Example: $env:TOKEN_OPTIMIZER_USE_FILE_SESSION = "true"
TOKEN_OPTIMIZER_SYNC_LOG_WRITES (default: false)
- Set to true to disable batched log writes
- Forces immediate writes to disk (slower but more resilient)
- Use for debugging or if logs are being lost
- Example: $env:TOKEN_OPTIMIZER_SYNC_LOG_WRITES = "true"
TOKEN_OPTIMIZER_DEBUG_LOGGING (default: true)
- Set to false to disable DEBUG-level logging
- Reduces log file size and improves performance
- INFO/WARN/ERROR logs still written
- Example: $env:TOKEN_OPTIMIZER_DEBUG_LOGGING = "false"
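
The defaults listed above can be summarized in code. The sketch below is an assumption about how such flags might be read, written in TypeScript for illustration; the hooks themselves are PowerShell, and readFlag, loadHookSettings, and the case-insensitive "true" comparison are hypothetical.

```typescript
interface HookSettings {
  useFileSession: boolean;
  syncLogWrites: boolean;
  debugLogging: boolean;
}

type Env = Record<string, string | undefined>;

// Hypothetical parser: unset variables fall back to the documented default.
function readFlag(env: Env, variable: string, fallback: boolean): boolean {
  const raw = env[variable];
  if (raw === undefined) return fallback;
  return raw.trim().toLowerCase() === "true";
}

function loadHookSettings(env: Env): HookSettings {
  return {
    useFileSession: readFlag(env, "TOKEN_OPTIMIZER_USE_FILE_SESSION", false),
    syncLogWrites: readFlag(env, "TOKEN_OPTIMIZER_SYNC_LOG_WRITES", false),
    debugLogging: readFlag(env, "TOKEN_OPTIMIZER_DEBUG_LOGGING", true),
  };
}

const defaults = loadHookSettings({});
const tuned = loadHookSettings({ TOKEN_OPTIMIZER_DEBUG_LOGGING: "false" });
console.log(defaults);
console.log(tuned);
```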

Development Path

TOKEN_OPTIMIZER_DEV_PATH
- Path to local development installation
- Automatically set to ~/source/repos/token-optimizer-mcp if not specified
- Override for custom development paths
- Example: $env:TOKEN_OPTIMIZER_DEV_PATH = "C:\dev\token-optimizer-mcp"

Performance Impact: Using in-memory mode (default) provides a 7x improvement in hook overhead:

Before: 50-70ms per hook operation
After: <10ms per hook operation
85% reduction in hook latency

Monitoring Token Savings

Real-Time Session Monitoring

To view your actual token SAVINGS, use the get_session_stats tool:

// View current session statistics with token savings breakdown
await get_session_stats({});

Output includes:

Total tokens saved (this is the actual savings amount!)
Token reduction percentage (e.g., "60% reduction")
Cache hit rate and compression ratios
Breakdown by tool (Read, Grep, Glob, etc.)
Top 10 most optimized operations with before/after comparison

Example Output:

{
  "sessionId": "abc-123",
  "totalTokensSaved": 125430,  // ← THIS is your savings!
  "tokenReductionPercent": 68.2,
  "originalTokens": 184000,
  "optimizedTokens": 58570,
  "cacheHitRate": 72.0,
  "byTool": {
    "smart_read": { "saved": 45000, "percent": 80 },
    "smart_grep": { "saved": 32000, "percent": 75 }
  }
}
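
The headline numbers in this example output are related by simple arithmetic: tokens saved is the difference between the original and optimized counts, and the percentage is that difference over the original. A small sketch (summarizeSavings is a hypothetical helper) reproduces them:

```typescript
// Derive the savings fields from the two raw token counts.
function summarizeSavings(originalTokens: number, optimizedTokens: number) {
  const totalTokensSaved = originalTokens - optimizedTokens;
  // Round to one decimal place; guard against division by zero.
  const tokenReductionPercent =
    originalTokens === 0 ? 0 : Math.round((totalTokensSaved / originalTokens) * 1000) / 10;
  return { totalTokensSaved, tokenReductionPercent };
}

const summary = summarizeSavings(184000, 58570);
console.log(summary); // { totalTokensSaved: 125430, tokenReductionPercent: 68.2 }
```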

Session Tracking Files

All operations are automatically tracked in session data files:

Location: ~/.claude-global/hooks/data/current-session.txt

Format:

{
  "sessionId": "abc-123",
  "sessionStart": "20251031-082211",
  "totalOperations": 1250,      // ← Number of operations
  "totalTokens": 184000,         // ← Cumulative token COUNT
  "lastOptimized": 1698765432,
  "savings": {                   // ← Auto-updated every 10 operations (Issue #113)
    "totalTokensSaved": 125430,  // Tokens saved by compression
    "tokenReductionPercent": 68.2,  // Percentage of tokens saved
    "originalTokens": 184000,    // Original token count before optimization
    "optimizedTokens": 58570,    // Token count after optimization
    "cacheHitRate": 42.5,        // Cache hit rate percentage
    "compressionRatio": 0.32,    // Compression efficiency (lower is better)
    "lastUpdated": "20251031-092500"  // Last savings update timestamp
  }
}

New in v1.x: The savings object is now automatically updated every 10 operations, eliminating the need to manually call get_session_stats() for real-time monitoring. This provides instant visibility into token optimization performance.

How it works:

Every 10 operations, the PowerShell hooks automatically call the get_cache_stats() MCP tool
Savings metrics are calculated from cache performance data (compression ratio, original vs compressed sizes)
The session file is atomically updated with the latest savings data
If the MCP call fails, the update is skipped gracefully without blocking operations
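
The update cycle described above can be sketched as a counter that refreshes savings on every tenth operation and leaves the previous values untouched when the lookup fails. This is a simplified, synchronous illustration; makeOperationTracker and fetchSavings are hypothetical stand-ins for the hook logic and the get_cache_stats() call.

```typescript
type Savings = { totalTokensSaved: number };

function makeOperationTracker(fetchSavings: () => Savings, interval: number = 10) {
  let totalOperations = 0;
  let savings: Savings | undefined;

  return {
    record(): void {
      totalOperations++;
      if (totalOperations % interval !== 0) return;
      try {
        savings = fetchSavings();
      } catch {
        // Lookup failed: skip this update and keep the previous savings.
      }
    },
    snapshot() {
      return { totalOperations, savings };
    },
  };
}

// Simulated stats call that fails once, then succeeds.
let calls = 0;
const tracker = makeOperationTracker(() => {
  calls++;
  if (calls === 1) throw new Error("MCP call failed");
  return { totalTokensSaved: 125430 };
});

for (let i = 0; i < 10; i++) tracker.record();
const afterTen = tracker.snapshot();        // update attempted, failed, skipped
for (let i = 0; i < 15; i++) tracker.record();
const afterTwentyFive = tracker.snapshot(); // update at operation 20 succeeded
console.log(afterTen, afterTwentyFive);
```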

Note: For detailed per-operation analysis, use get_session_stats(). The session file provides high-level aggregate metrics.

Project-Wide Analysis

Analyze token usage across your entire project:

// Analyze project token costs
await analyze_project_tokens({
  projectPath: "/path/to/project"
});

Provides:

Total token cost estimation
Largest files by token count
Optimization opportunities
Cost projections at current API rates

Cache Performance

Monitor cache hit rates and storage efficiency:

// View cache statistics
await get_cache_stats({});

Metrics:

Total entries
Cache hit rate (%)
Average compression ratio
Total storage saved
Most frequently accessed keys

Troubleshooting

Common Issues and Solutions

Issue: "Invalid or malformed JSON" in Claude Code Settings

Symptom: Claude Code shows "Invalid Settings" error after running install-hooks

Cause: UTF-8 BOM (Byte Order Mark) was added to settings.json files

Solution: Upgrade to v3.0.2+ which fixes the BOM issue:

npm install -g @ooples/token-optimizer-mcp@latest

If you're already on v3.0.2+, manually remove the BOM:

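# Windows: rewrite settings.json as UTF-8 without a BOM (works in Windows PowerShell 5.1 and PowerShell 7+)
$path = (Resolve-Path "~/.claude/settings.json").Path
$content = [System.IO.File]::ReadAllText($path)  # the BOM is detected and dropped while decoding
[System.IO.File]::WriteAllText($path, $content, (New-Object System.Text.UTF8Encoding($false)))

# Linux: Remove BOM from settings.json (GNU sed understands \x escapes)
sed -i '1s/^\xEF\xBB\xBF//' ~/.claude/settings.json

# macOS: Remove BOM from settings.json
# BSD sed requires an empty string after -i and does not interpret \x escapes,
# so let the shell expand them with $'...'
sed -i '' $'1s/^\xef\xbb\xbf//' ~/.claude/settings.json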

Issue: Hooks Not Working After Installation

Symptom: Token optimization not occurring automatically

Diagnosis:

Check if hooks are installed:

# Windows
Get-Content ~/.claude/settings.json | ConvertFrom-Json | Select-Object -ExpandProperty hooks

# macOS/Linux
cat ~/.claude/settings.json | jq .hooks

Verify the dispatcher script exists (dispatcher.ps1 on Windows, dispatcher.sh on macOS/Linux):

# Windows
Test-Path ~/.claude-global/hooks/dispatcher.ps1

# macOS/Linux
[ -f ~/.claude-global/hooks/dispatcher.sh ] && echo "Exists" || echo "Missing"

Solution: Re-run the installer:

# Windows
powershell -ExecutionPolicy Bypass -File install-hooks.ps1

# macOS/Linux
bash install-hooks.sh

Issue: Low Cache Hit Rate (<50%)

Symptom: Session stats show cache hit rate below 50%

Causes:

Working with many new files (expected)
Cache was recently cleared
TTL (time-to-live) is too short

Solutions:

Warm up the cache before starting work:

await cache_warmup({
  paths: ["/path/to/frequently/used/files"],
  recursive: true
});

Increase TTL for stable APIs:

await smart_api_fetch({
  url: "https://api.example.com/data",
  ttl: 3600  // 1 hour instead of default 5 minutes
});

Check cache size limits:

await smart_cache({
  operation: "configure",
  l1MaxSize: 2000,  // Increase from default 1000
  l2MaxSize: 20000  // Increase from default 10000
});

Issue: High Memory Usage

Symptom: Node.js process using excessive memory

Cause: Large cache in memory (L1/L2 tiers)

Solution: Configure cache limits:

await smart_cache({
  operation: "configure",
  evictionStrategy: "LRU",  // Least Recently Used
  l1MaxSize: 500,  // Reduce L1 cache
  l2MaxSize: 5000  // Reduce L2 cache
});

Or clear the cache:

await clear_cache({});

Issue: Slow First-Time Operations

Symptom: Initial Read/Grep/Glob operations are slow

Cause: Cache is empty, building indexes

Solution: This is expected behavior. Subsequent operations will be 80-90% faster.

To pre-warm the cache:

await cache_warmup({
  paths: ["/src", "/tests", "/docs"],
  recursive: true,
  schedule: "startup"  // Auto-warm on every session start
});

Issue: "Permission denied" Errors on Windows

Symptom: Cannot write to cache or log files

Cause: PowerShell execution policy or file permissions

Solution:

Set execution policy:

Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope CurrentUser

Check file permissions:

icacls "$env:USERPROFILE\.token-optimizer"

Re-run installer as Administrator if needed

Issue: Cache Files Growing Too Large

Symptom: ~/.token-optimizer/cache.db is >1GB

Cause: Caching very large files or many API responses

Solution:

Clear old entries:

await clear_cache({ olderThan: 7 });  // Clear entries older than 7 days

Reduce cache retention:

await smart_cache({
  operation: "configure",
  defaultTTL: 3600  // 1 hour instead of 7 days
});

Manually delete cache (nuclear option):
```
rm -rf ~/.token-optimizer/cache.db
```

Getting Help

If you encounter issues not covered here:

Check the hook logs: ~/.claude-global/hooks/logs/dispatcher.log
Check session data: ~/.claude-global/hooks/data/current-session.txt
File an issue: GitHub Issues
- Include debug logs
- Include your OS and Node.js version
- Include the output of get_session_stats

Limitations

Small Text: Best for content >500 characters (on smaller snippets the cache overhead outweighs the savings)
One-Time Content: No benefit for content that won't be referenced again
Cache Storage: Entries are cleaned up automatically after 7 days to limit disk usage
Token Counting: Uses the GPT-4 tokenizer via tiktoken, so counts are approximate for Claude models

License

MIT License - see LICENSE for details

Author

Built for optimal Claude Code token efficiency by the ooples team.

Configuration

TOKEN_OPTIMIZER_CACHE_DIR

Directory path for cache storage (optional, defaults to ~/.token-optimizer-cache)

Registry: active

Package: token-optimizer-mcp

Transport: STDIO

Updated: Oct 20, 2025

View on GitHub

Token Optimizer MCP

Intelligent token optimization through caching, compression, and smart tooling for Claude Code and Claude Desktop

Overview

Production Results: 60-90% token reduction across 38,000+ operations in real-world usage.

Key Features

Smart Tool Replacements: Automatic optimization for Read, Grep, Glob, and more
Context Window Optimization: Store content externally to free up context space
High Compression: Brotli compression (2-4x typical, up to 82x for repetitive content)
Persistent Caching: SQLite-based cache that persists across sessions
Token Counting: Uses tiktoken (GPT-4 tokenizer) for token measurements; counts for Claude models are approximate
65 Specialized Tools: File operations, API caching, database optimization, monitoring, and more
No External Services: Caching, compression, and token counting run locally and work offline
Production Ready: Built with TypeScript for reliability

Installation

Quick Install (Recommended)

Windows

# Run PowerShell as Administrator, then:
Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope CurrentUser

# Install globally (hooks install automatically!)
npm install -g @ooples/token-optimizer-mcp

macOS / Linux

# Install globally (hooks install automatically!)
npm install -g @ooples/token-optimizer-mcp

That's it! The npm command and its postinstall script will automatically:

✅ Install token-optimizer-mcp globally via npm
✅ Auto-detect and configure all installed AI tools (Claude Desktop, Cursor, Cline, etc.)
✅ Set up automatic token optimization on every tool call
✅ Configure workspace trust and execution permissions

Result: 60-90% token reduction across all operations!

Note: If automatic installation is skipped (e.g., in CI environments), you can manually run the installer:

Windows: powershell -ExecutionPolicy Bypass -File install-hooks.ps1
macOS/Linux: bash install-hooks.sh

Manual Configuration

For detailed platform-specific installation instructions, see docs/HOOKS-INSTALLATION.md.

Available Tools (65 Total)

Core Caching & Optimization (8 tools)

optimize_text - Compress and cache text (primary tool for token reduction)
get_cached - Retrieve previously cached text
compress_text - Compress text using Brotli
decompress_text - Decompress Brotli-compressed text
count_tokens - Count tokens using tiktoken (GPT-4 tokenizer)
analyze_optimization - Analyze text and get optimization recommendations
get_cache_stats - View cache hit rates and compression ratios
clear_cache - Clear all cached data

Usage Example:

// Cache large content to remove it from context window
optimize_text({
  text: "Large API response or file content...",
  key: "api-response-key",
  quality: 11
})
// Result: 60-90% token reduction

Smart File Operations (10 tools)

Optimized replacements for standard file tools with intelligent caching and diff-based updates:

smart_read - Read files with 80% token reduction through caching and diffs
smart_write - Write files with verification and change tracking
smart_edit - Line-based file editing with diff-only output (90% reduction)
smart_grep - Search file contents with match-only output (80% reduction)
smart_glob - File pattern matching with path-only results (75% reduction)
smart_diff - Git diffs with diff-only output (85% reduction)
smart_branch - Git branch listing with structured JSON (60% reduction)
smart_log - Git commit history with smart filtering (75% reduction)
smart_merge - Git merge management with conflict analysis (80% reduction)
smart_status - Git status with status-only output (70% reduction)

Usage Example:

// Read a file with automatic caching
smart_read({ path: "/path/to/file.ts" })
// First read: full content
// Subsequent reads: only diff (80% reduction)
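
The first-read/diff-on-reread behavior described above can be sketched roughly as follows. This is a minimal illustration, not the package's actual implementation: `ReadCache`, `readWithDiff`, and the naive line-by-line comparison are hypothetical names and choices, and an in-memory Map stands in for the SQLite cache.

```typescript
// Hypothetical sketch: return full content on the first read, only changed lines afterwards.
type ReadResult =
  | { kind: "full"; content: string }
  | { kind: "unchanged" }
  | { kind: "diff"; changes: string[] };

class ReadCache {
  private lastSeen = new Map<string, string>();

  // `content` stands in for whatever was just read from disk.
  readWithDiff(path: string, content: string): ReadResult {
    const previous = this.lastSeen.get(path);
    this.lastSeen.set(path, content);

    if (previous === undefined) return { kind: "full", content };
    if (previous === content) return { kind: "unchanged" };

    // Naive positional diff: compare line i of the old and new versions.
    const oldLines = previous.split("\n");
    const newLines = content.split("\n");
    const changes: string[] = [];
    const max = Math.max(oldLines.length, newLines.length);
    for (let i = 0; i < max; i++) {
      if (oldLines[i] !== newLines[i]) {
        if (oldLines[i] !== undefined) changes.push(`-${i + 1}: ${oldLines[i]}`);
        if (newLines[i] !== undefined) changes.push(`+${i + 1}: ${newLines[i]}`);
      }
    }
    return { kind: "diff", changes };
  }
}

const readCache = new ReadCache();
const first = readCache.readWithDiff("/src/app.ts", "const a = 1;\nconst b = 2;");
const second = readCache.readWithDiff("/src/app.ts", "const a = 1;\nconst b = 3;");
```

A positional comparison like this reports every following line as changed when a line is inserted, so a real tool would more likely use a proper diff algorithm.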

API & Database Operations (10 tools)

Intelligent caching and optimization for external data sources:

smart_api_fetch - HTTP requests with caching and retry logic (83% reduction on cache hits)
smart_cache_api - API response caching with TTL/ETag/event-based strategies
smart_database - Database queries with connection pooling and caching (83% reduction)
smart_sql - SQL query analysis with optimization suggestions (83% reduction)
smart_schema - Database schema analysis with intelligent caching
smart_graphql - GraphQL query optimization with complexity analysis (83% reduction)
smart_rest - REST API analysis with endpoint discovery (83% reduction)
smart_orm - ORM query optimization with N+1 detection (83% reduction)
smart_migration - Database migration tracking (83% reduction)
smart_websocket - WebSocket connection management with message tracking

Usage Example:

// Fetch API with automatic caching
smart_api_fetch({
  method: "GET",
  url: "https://api.example.com/data",
  ttl: 300
})
// Cached responses: 83-95% token reduction
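
As a rough illustration of what the `ttl` parameter implies, here is a minimal TTL cache sketch. The `TtlCache` class and its injectable clock are hypothetical and only show the expiry idea; how smart_api_fetch actually stores and expires responses is not described here.

```typescript
// Hypothetical sketch of TTL-based caching; the clock is injectable so expiry is testable.
interface TtlEntry<V> {
  value: V;
  expiresAt: number; // milliseconds on the injected clock
}

class TtlCache<V> {
  private entries = new Map<string, TtlEntry<V>>();
  private now: () => number;

  constructor(now: () => number = () => Date.now()) {
    this.now = now;
  }

  set(key: string, value: V, ttlSeconds: number): void {
    this.entries.set(key, { value, expiresAt: this.now() + ttlSeconds * 1000 });
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key); // expired: evict and report a miss
      return undefined;
    }
    return entry.value;
  }
}

let fakeTime = 0;
const apiCache = new TtlCache<string>(() => fakeTime);
apiCache.set("GET https://api.example.com/data", '{"ok":true}', 300);

fakeTime = 299000; // just inside the 300 second TTL
const hit = apiCache.get("GET https://api.example.com/data");

fakeTime = 300000; // TTL reached: the entry is treated as expired
const miss = apiCache.get("GET https://api.example.com/data");
```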

Build & Test Operations (10 tools)

Development workflow optimization with intelligent caching:

smart_build - TypeScript builds with diff-based change detection
smart_test - Test execution with incremental test selection
smart_lint - ESLint with incremental analysis and auto-fix
smart_typecheck - TypeScript type checking with caching
smart_install - Package installation with dependency analysis
smart_docker - Docker operations with layer analysis
smart_logs - Log aggregation with pattern filtering
smart_network - Network diagnostics with anomaly detection
smart_processes - Process monitoring with resource tracking
smart_system_metrics - System resource monitoring with performance recommendations

Usage Example:

// Run tests with caching
smart_test({
  onlyChanged: true,  // Only test changed files
  coverage: true
})

Advanced Caching (10 tools)

Enterprise-grade caching strategies with 87-91% token reduction:

smart_cache - Multi-tier cache (L1/L2/L3) with 6 eviction strategies (90% reduction)
cache_warmup - Intelligent cache pre-warming with schedule support (87% reduction)
cache_analytics - Real-time dashboards and trend analysis (88% reduction)
cache_benchmark - Performance testing and strategy comparison (89% reduction)
cache_compression - 6 compression algorithms with adaptive selection (89% reduction)
cache_invalidation - Dependency tracking and pattern-based invalidation (88% reduction)
cache_optimizer - ML-based recommendations and bottleneck detection (89% reduction)
cache_partition - Sharding and consistent hashing (87% reduction)
cache_replication - Distributed replication with conflict resolution (88% reduction)
predictive_cache - ML-based predictive caching with ARIMA/LSTM (91% reduction)

Usage Example:

// Configure multi-tier cache
smart_cache({
  operation: "configure",
  evictionStrategy: "LRU",
  l1MaxSize: 1000,
  l2MaxSize: 10000
})
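
For readers unfamiliar with the "LRU" eviction strategy named in the example above, the following is a minimal single-tier sketch. `LruCache` is a hypothetical class built on Map insertion order; it does not reflect how smart_cache implements its L1/L2/L3 tiers.

```typescript
// Hypothetical sketch of LRU eviction using Map insertion order.
class LruCache<K, V> {
  private map = new Map<K, V>();
  private maxSize: number;

  constructor(maxSize: number) {
    this.maxSize = maxSize;
  }

  get(key: K): V | undefined {
    if (!this.map.has(key)) return undefined;
    const value = this.map.get(key) as V;
    // Re-insert so this key becomes the most recently used.
    this.map.delete(key);
    this.map.set(key, value);
    return value;
  }

  set(key: K, value: V): void {
    if (this.map.has(key)) this.map.delete(key);
    this.map.set(key, value);
    if (this.map.size > this.maxSize) {
      // The first key in iteration order is the least recently used.
      const oldest = this.map.keys().next().value as K;
      this.map.delete(oldest);
    }
  }

  keys(): K[] {
    return Array.from(this.map.keys());
  }
}

const lru = new LruCache<string, number>(2);
lru.set("a", 1);
lru.set("b", 2);
lru.get("a");    // "a" becomes most recently used
lru.set("c", 3); // exceeds maxSize, so "b" is evicted
```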

Monitoring & Dashboards (7 tools)

Comprehensive monitoring with 87-92% token reduction through intelligent caching:

alert_manager - Multi-channel alerting (email, Slack, webhook) with routing (89% reduction)
metric_collector - Time-series metrics with multi-source support (88% reduction)
monitoring_integration - External platform integration (Prometheus, Grafana, Datadog) (87% reduction)
custom_widget - Dashboard widgets with template caching (88% reduction)
data_visualizer - Interactive visualizations with SVG optimization (92% reduction)
health_monitor - System health checks with state compression (91% reduction)
log_dashboard - Log analysis with pattern detection (90% reduction)

Usage Example:

// Create an alert
alert_manager({
  operation: "create-alert",
  alertName: "high-cpu-usage",
  channels: ["slack", "email"],
  threshold: { type: "above", value: 80 }
})

System Operations (6 tools)

System-level operations with smart caching:

smart_cron - Scheduled task management (cron/Windows Task Scheduler) (85% reduction)
smart_user - User and permission management across platforms (86% reduction)
smart_ast_grep - Structural code search with AST indexing (83% reduction)
get_session_stats - Session-level token usage statistics
analyze_project_tokens - Project-wide token analysis and cost estimation
optimize_session - Compress large file operations from current session

Usage Example:

// View session token usage
get_session_stats({})
// Result: Detailed breakdown of token usage by tool

Token Analytics (4 tools)

Granular token usage analytics for pinpointing optimization opportunities:

get_hook_analytics - Token usage breakdown by hook phase (PreToolUse, PostToolUse, etc.)
get_action_analytics - Token usage breakdown by tool/action (Read, Write, Grep, etc.)
get_mcp_server_analytics - Token usage breakdown by MCP server (token-optimizer, filesystem, etc.)
export_analytics - Export analytics data in JSON or CSV format with filtering

Usage Example:

// Get per-hook analytics
get_hook_analytics({
  startDate: "2025-01-01T00:00:00Z",
  endDate: "2025-12-31T23:59:59Z"
})
// Result: Shows which hooks consume the most tokens

// Get per-action analytics
get_action_analytics({})
// Result: Shows which tools use the most tokens

// Export analytics as CSV
export_analytics({
  format: "csv",
  hookPhase: "PreToolUse"
})
// Result: CSV export filtered by PreToolUse hook

Key Features:

Per-hook phase tracking (PreToolUse, PostToolUse, SessionStart, etc.)
Per-action tracking (Read, Write, count_tokens, etc.)
Per-MCP-server tracking (token-optimizer, filesystem, GitHub, etc.)
Date range filtering
JSON and CSV export
Persistent storage with SQLite
Zero performance impact (async batched writes)

How It Works

Global Hooks System (7-Phase Optimization)

When global hooks are installed, token-optimizer-mcp runs automatically on every tool call:

┌─────────────────────────────────────────────────────────────┐
│ Phase 1: PreToolUse - Tool Replacement                      │
│ ├─ Read   → smart_read   (80% token reduction)             │
│ ├─ Grep   → smart_grep   (80% token reduction)             │
│ └─ Glob   → smart_glob   (75% token reduction)             │
└─────────────────────────────────────────────────────────────┘
                          ↓
┌─────────────────────────────────────────────────────────────┐
│ Phase 2: Input Validation - Cache Lookups                   │
│ └─ get_cached checks if operation was already done          │
└─────────────────────────────────────────────────────────────┘
                          ↓
┌─────────────────────────────────────────────────────────────┐
│ Phase 3: PostToolUse - Output Optimization                  │
│ ├─ optimize_text for large outputs                          │
│ └─ compress_text for repeated content                       │
└─────────────────────────────────────────────────────────────┘
                          ↓
┌─────────────────────────────────────────────────────────────┐
│ Phase 4: Session Tracking                                   │
│ └─ Log all operations to operations-{sessionId}.csv         │
└─────────────────────────────────────────────────────────────┘
                          ↓
┌─────────────────────────────────────────────────────────────┐
│ Phase 5: UserPromptSubmit - Prompt Optimization             │
│ └─ Optimize user prompts before sending to API              │
└─────────────────────────────────────────────────────────────┘
                          ↓
┌─────────────────────────────────────────────────────────────┐
│ Phase 6: PreCompact - Pre-Compaction Optimization           │
│ └─ Optimize before Claude Code compacts the conversation    │
└─────────────────────────────────────────────────────────────┘
                          ↓
┌─────────────────────────────────────────────────────────────┐
│ Phase 7: Metrics & Reporting                                │
│ └─ Track token reduction metrics and generate reports       │
└─────────────────────────────────────────────────────────────┘
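
Phase 1 of the diagram above amounts to a lookup from a standard tool name to its optimized replacement. The sketch below shows that idea only; the shipped hooks are shell and PowerShell scripts, and `ToolCall`, `replacements`, and `preToolUse` are hypothetical names used for illustration.

```typescript
// Hypothetical sketch of the PreToolUse replacement step from the diagram.
interface ToolCall {
  tool: string;
  args: Record<string, unknown>;
}

// Mapping taken from Phase 1: Read, Grep, and Glob are swapped for smart_* tools.
const replacements: Record<string, string> = {
  Read: "smart_read",
  Grep: "smart_grep",
  Glob: "smart_glob",
};

function preToolUse(call: ToolCall): ToolCall {
  const replacement = replacements[call.tool];
  // Tools without a replacement pass through unchanged.
  return replacement ? { tool: replacement, args: call.args } : call;
}

const rewritten = preToolUse({ tool: "Read", args: { path: "/src/app.ts" } });
const untouched = preToolUse({ tool: "Bash", args: { command: "ls" } });
```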

Production Performance

Based on 38,000+ operations in real-world usage:

Tool Category	Avg Token Reduction	Cache Hit Rate
File Operations	60-90%	>80%
API Responses	83-95%	>75%
Database Queries	83-90%	>70%
Build/Test Output	70-85%	>65%

Per-Session Savings: 300K-700K tokens (worth $0.90-$2.10 at $3/M tokens)
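
The dollar figures above follow directly from the token counts and the stated $3 per million tokens. A small sketch of that arithmetic, with a hypothetical helper name:

```typescript
// Hypothetical helper: convert saved tokens to dollars at a given price per million tokens.
function savingsUsd(tokensSaved: number, usdPerMillionTokens: number): number {
  // Multiply before dividing to avoid avoidable floating-point drift.
  return (tokensSaved * usdPerMillionTokens) / 1e6;
}

const lowEstimate = savingsUsd(300000, 3).toFixed(2);  // "0.90"
const highEstimate = savingsUsd(700000, 3).toFixed(2); // "2.10"
```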

Usage Examples

Basic Caching

// Cache large content to remove from context window
const result = await optimize_text({
  text: "Large API response or file content...",
  key: "cache-key",
  quality: 11
});
// Result: Original tokens removed, only cache key remains (~50 tokens)

// Retrieve later
const cached = await get_cached({ key: "cache-key" });
// Result: Full original content restored
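
To make the compress-store-restore cycle above concrete, here is a minimal sketch using Node's built-in Brotli bindings. It is an assumption-laden stand-in: `optimizeText`, `getCached`, and the in-memory Map are hypothetical, whereas the real tools persist to SQLite and are invoked as MCP tools rather than local functions.

```typescript
import { brotliCompressSync, brotliDecompressSync, constants } from "node:zlib";

// Hypothetical in-memory store standing in for the persistent SQLite cache.
const store = new Map<string, Buffer>();

function optimizeText(key: string, text: string, quality = 11): { key: string; ratio: number } {
  const input = Buffer.from(text, "utf8");
  const compressed = brotliCompressSync(input, {
    params: { [constants.BROTLI_PARAM_QUALITY]: quality },
  });
  store.set(key, compressed);
  // Ratio of original bytes to compressed bytes; higher means better compression.
  return { key, ratio: input.length / compressed.length };
}

function getCached(key: string): string | undefined {
  const compressed = store.get(key);
  return compressed ? brotliDecompressSync(compressed).toString("utf8") : undefined;
}

// Highly repetitive input compresses far better than typical text.
const original = "GET /api/items 200 OK\n".repeat(500);
const stored = optimizeText("cache-key", original);
const restored = getCached("cache-key");
```

Only the key needs to stay in the conversation; the full text can be restored from the store on demand.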

Smart File Reading

// First read: full content
await smart_read({ path: "/src/app.ts" });

// Subsequent reads: only changes (80% reduction)
await smart_read({ path: "/src/app.ts" });

API Caching

// First request: fetch and cache
await smart_api_fetch({
  method: "GET",
  url: "https://api.example.com/data",
  ttl: 300
});

// Subsequent requests: cached (83-95% reduction)
await smart_api_fetch({
  method: "GET",
  url: "https://api.example.com/data"
});

Session Analysis

// View token usage for current session
await get_session_stats({});
// Result: Breakdown by tool, operation, and savings

// Analyze entire project
await analyze_project_tokens({
  projectPath: "/path/to/project"
});
// Result: Cost estimation and optimization opportunities

Technology Stack

Runtime: Node.js 20+
Language: TypeScript
Database: SQLite (better-sqlite3)
Token Counting: tiktoken (GPT-4 tokenizer)
Compression: Brotli (built-in Node.js)
Caching: Multi-tier LRU/LFU/FIFO caching
Protocol: MCP SDK (@modelcontextprotocol/sdk)

Supported AI Tools

The automated installer detects and configures token-optimizer-mcp for:

✅ Claude Code - CLI with global hooks integration
✅ Claude Desktop - Native desktop application
✅ Cursor IDE - AI-first code editor
✅ Cline - VS Code extension (formerly Claude Dev)
✅ GitHub Copilot - VS Code with MCP support
✅ Windsurf IDE - AI-powered development environment

No manual configuration needed - the installer automatically detects and configures all installed tools!

Documentation

Detailed Tool Reference - Complete documentation for all 65 tools
Installation Guide - Platform-specific installation instructions
Contributing Guide - Development setup and contribution guidelines

Performance Characteristics

Compression Ratio: 2-4x typical (up to 82x for repetitive content)
Context Window Savings: 60-90% average across all operations
Cache Hit Rate: >80% in typical usage
Operation Overhead: <10ms for cache operations (optimized from 50-70ms)
Compression Speed: ~1ms per KB of text
Hook Overhead: <10ms per operation (7x improvement from in-memory optimizations)

Performance Optimizations

The PowerShell hooks have been optimized to reduce overhead from 50-70ms to <10ms through:

In-Memory Session State: Session data kept in memory instead of disk I/O on every operation
Batched Log Writes: Operation logs buffered and flushed every 5 seconds or 100 operations
Lazy Persistence: Disk writes only occur when necessary (session end, optimization, reports)
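
The batched log writes described above (flush every 5 seconds or 100 operations) can be sketched as follows. This is an illustration in TypeScript, not the PowerShell hook code: `BatchedLog` is hypothetical, and for simplicity it checks the age threshold on each write instead of running a timer.

```typescript
// Hypothetical sketch: buffer log lines and flush on a size or age threshold.
class BatchedLog {
  private buffer: string[] = [];
  private lastFlush: number;
  private sink: (lines: string[]) => void;
  private now: () => number;
  private maxEntries: number;
  private maxAgeMs: number;

  constructor(
    sink: (lines: string[]) => void,
    now: () => number = () => Date.now(),
    maxEntries = 100,
    maxAgeMs = 5000
  ) {
    this.sink = sink;
    this.now = now;
    this.maxEntries = maxEntries;
    this.maxAgeMs = maxAgeMs;
    this.lastFlush = now();
  }

  write(line: string): void {
    this.buffer.push(line);
    const tooMany = this.buffer.length >= this.maxEntries;
    const tooOld = this.now() - this.lastFlush >= this.maxAgeMs;
    if (tooMany || tooOld) this.flush();
  }

  flush(): void {
    if (this.buffer.length > 0) this.sink(this.buffer);
    this.buffer = [];
    this.lastFlush = this.now();
  }
}

let logClock = 0;
const batchSizes: number[] = [];
const log = new BatchedLog((lines) => batchSizes.push(lines.length), () => logClock);

for (let i = 0; i < 100; i++) log.write(`op ${i}`); // 100th write triggers a flush
logClock = 1000;
log.write("op 100"); // buffered: below both thresholds
logClock = 6000;
log.write("op 101"); // more than 5 seconds since the last flush, so both lines flush
```

The trade-off is the one the environment variables below expose: buffered lines can be lost if the process dies before a flush, which is why a synchronous mode exists.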

Environment Variables

Control hook behavior with these environment variables:

Performance Controls

TOKEN_OPTIMIZER_USE_FILE_SESSION (default: false)
- Set to true to revert to file-based session tracking (legacy mode)
- Use if you encounter issues with in-memory session state
- Example: $env:TOKEN_OPTIMIZER_USE_FILE_SESSION = "true"
TOKEN_OPTIMIZER_SYNC_LOG_WRITES (default: false)
- Set to true to disable batched log writes
- Forces immediate writes to disk (slower but more resilient)
- Use for debugging or if logs are being lost
- Example: $env:TOKEN_OPTIMIZER_SYNC_LOG_WRITES = "true"
TOKEN_OPTIMIZER_DEBUG_LOGGING (default: true)
- Set to false to disable DEBUG-level logging
- Reduces log file size and improves performance
- INFO/WARN/ERROR logs still written
- Example: $env:TOKEN_OPTIMIZER_DEBUG_LOGGING = "false"

Development Path

TOKEN_OPTIMIZER_DEV_PATH
- Path to local development installation
- Automatically set to ~/source/repos/token-optimizer-mcp if not specified
- Override for custom development paths
- Example: $env:TOKEN_OPTIMIZER_DEV_PATH = "C:\dev\token-optimizer-mcp"

Performance Impact: Using in-memory mode (default) provides a 7x improvement in hook overhead:

Before: 50-70ms per hook operation
After: <10ms per hook operation
85% reduction in hook latency
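
The defaults listed for the environment variables above can be summarized in a small sketch. How the hooks actually parse these values is not documented here, so the `readFlag` and `hookSettings` helpers and the case-insensitive "true" comparison are assumptions made for illustration.

```typescript
// Hypothetical sketch: resolve the documented flags with their documented defaults.
type Env = Record<string, string | undefined>;

function readFlag(env: Env, name: string, defaultValue: boolean): boolean {
  const raw = env[name];
  if (raw === undefined) return defaultValue;
  // Assumption: only the string "true" (any casing) enables a flag.
  return raw.trim().toLowerCase() === "true";
}

function hookSettings(env: Env) {
  return {
    useFileSession: readFlag(env, "TOKEN_OPTIMIZER_USE_FILE_SESSION", false),
    syncLogWrites: readFlag(env, "TOKEN_OPTIMIZER_SYNC_LOG_WRITES", false),
    debugLogging: readFlag(env, "TOKEN_OPTIMIZER_DEBUG_LOGGING", true),
  };
}

const defaults = hookSettings({});
const quieter = hookSettings({ TOKEN_OPTIMIZER_DEBUG_LOGGING: "false" });
```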

Monitoring Token Savings

Real-Time Session Monitoring

To view your actual token SAVINGS, use the get_session_stats tool:

// View current session statistics with token savings breakdown
await get_session_stats({});

Output includes:

Total tokens saved (this is the actual savings amount!)
Token reduction percentage (e.g., "60% reduction")
Cache hit rate and compression ratios
Breakdown by tool (Read, Grep, Glob, etc.)
Top 10 most optimized operations with before/after comparison

Example Output:

{
  "sessionId": "abc-123",
  "totalTokensSaved": 125430,  // ← THIS is your savings!
  "tokenReductionPercent": 68.2,
  "originalTokens": 184000,
  "optimizedTokens": 58570,
  "cacheHitRate": 72.0,
  "byTool": {
    "smart_read": { "saved": 45000, "percent": 80 },
    "smart_grep": { "saved": 32000, "percent": 75 }
  }
}

Session Tracking Files

All operations are automatically tracked in session data files:

Location: ~/.claude-global/hooks/data/current-session.txt

Format:

{
  "sessionId": "abc-123",
  "sessionStart": "20251031-082211",
  "totalOperations": 1250,      // ← Number of operations
  "totalTokens": 184000,         // ← Cumulative token COUNT
  "lastOptimized": 1698765432,
  "savings": {                   // ← Auto-updated every 10 operations (Issue #113)
    "totalTokensSaved": 125430,  // Tokens saved by compression
    "tokenReductionPercent": 68.2,  // Percentage of tokens saved
    "originalTokens": 184000,    // Original token count before optimization
    "optimizedTokens": 58570,    // Token count after optimization
    "cacheHitRate": 42.5,        // Cache hit rate percentage
    "compressionRatio": 0.32,    // Compression efficiency (lower is better)
    "lastUpdated": "20251031-092500"  // Last savings update timestamp
  }
}

How it works:

Every 10 operations, the PowerShell hooks automatically call the get_cache_stats MCP tool
Savings metrics are calculated from cache performance data (compression ratio, original vs compressed sizes)
The session file is atomically updated with the latest savings data
If the MCP call fails, the update is skipped gracefully without blocking operations

Note: For detailed per-operation analysis, use get_session_stats(). The session file provides high-level aggregate metrics.
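
The fields in the savings block are related by simple arithmetic, which the sketch below reproduces using the example values from the session file. The `computeSavings` helper is hypothetical; the hooks derive these figures from cache statistics, and this only shows how the numbers fit together.

```typescript
// Hypothetical sketch: derive the savings fields from original and optimized token counts.
interface Savings {
  totalTokensSaved: number;
  tokenReductionPercent: number; // rounded to one decimal place
  compressionRatio: number;      // optimized / original, lower is better
}

function computeSavings(originalTokens: number, optimizedTokens: number): Savings {
  const round = (x: number, digits: number): number => Number(x.toFixed(digits));
  const totalTokensSaved = originalTokens - optimizedTokens;
  if (originalTokens === 0) {
    return { totalTokensSaved, tokenReductionPercent: 0, compressionRatio: 0 };
  }
  return {
    totalTokensSaved,
    tokenReductionPercent: round((totalTokensSaved / originalTokens) * 100, 1),
    compressionRatio: round(optimizedTokens / originalTokens, 2),
  };
}

// Values from the example session file: 184000 original, 58570 optimized.
const savings = computeSavings(184000, 58570);
// savings.totalTokensSaved === 125430
// savings.tokenReductionPercent === 68.2
// savings.compressionRatio === 0.32
```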

Project-Wide Analysis

Analyze token usage across your entire project:

// Analyze project token costs
await analyze_project_tokens({
  projectPath: "/path/to/project"
});

Provides:

Total token cost estimation
Largest files by token count
Optimization opportunities
Cost projections at current API rates

Cache Performance

Monitor cache hit rates and storage efficiency:

// View cache statistics
await get_cache_stats({});

Metrics:

Total entries
Cache hit rate (%)
Average compression ratio
Total storage saved
Most frequently accessed keys

Troubleshooting

Common Issues and Solutions

Issue: "Invalid or malformed JSON" in Claude Code Settings

Symptom: Claude Code shows "Invalid Settings" error after running install-hooks

Cause: UTF-8 BOM (Byte Order Mark) was added to settings.json files

Solution: Upgrade to v3.0.2+ which fixes the BOM issue:

npm install -g @ooples/token-optimizer-mcp@latest

If you're already on v3.0.2+, manually remove the BOM:

# Windows: Remove BOM from settings.json (works in Windows PowerShell 5.1 and PowerShell 7+)
$path = Join-Path $HOME ".claude\settings.json"
$content = [System.IO.File]::ReadAllText($path).TrimStart([char]0xFEFF)
[System.IO.File]::WriteAllText($path, $content, (New-Object System.Text.UTF8Encoding($false)))

# Linux: Remove BOM from settings.json
sed -i '1s/^\xEF\xBB\xBF//' ~/.claude/settings.json

# macOS: Remove BOM from settings.json (BSD sed requires an empty string after -i
# and does not understand \x escapes, so let the shell expand them with $'...')
sed -i '' $'1s/^\xef\xbb\xbf//' ~/.claude/settings.json
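
For context on what these commands remove: a UTF-8 BOM is the three bytes EF BB BF at the start of a file, and strict JSON parsers reject it. A minimal sketch of the same check in TypeScript, with a hypothetical `stripUtf8Bom` helper:

```typescript
// Hypothetical helper: drop a leading UTF-8 BOM (EF BB BF) if one is present.
function stripUtf8Bom(bytes: Uint8Array): Uint8Array {
  const hasBom =
    bytes.length >= 3 && bytes[0] === 0xef && bytes[1] === 0xbb && bytes[2] === 0xbf;
  return hasBom ? bytes.subarray(3) : bytes;
}

// BOM followed by the two bytes of "{}".
const withBom = new Uint8Array([0xef, 0xbb, 0xbf, 0x7b, 0x7d]);
const cleaned = stripUtf8Bom(withBom);
```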

Issue: Hooks Not Working After Installation

Symptom: Token optimization not occurring automatically

Diagnosis:

Check if hooks are installed:

# Windows
Get-Content ~/.claude/settings.json | ConvertFrom-Json | Select-Object -ExpandProperty hooks

# macOS/Linux
cat ~/.claude/settings.json | jq .hooks

Verify the dispatcher script exists (dispatcher.ps1 on Windows, dispatcher.sh on macOS/Linux):

# Windows
Test-Path ~/.claude-global/hooks/dispatcher.ps1

# macOS/Linux
[ -f ~/.claude-global/hooks/dispatcher.sh ] && echo "Exists" || echo "Missing"

Solution: Re-run the installer:

# Windows
powershell -ExecutionPolicy Bypass -File install-hooks.ps1

# macOS/Linux
bash install-hooks.sh

Issue: Low Cache Hit Rate (<50%)

Symptom: Session stats show cache hit rate below 50%

Causes:

Working with many new files (expected)
Cache was recently cleared
TTL (time-to-live) is too short

Solutions:

Warm up the cache before starting work:

await cache_warmup({
  paths: ["/path/to/frequently/used/files"],
  recursive: true
});

Increase TTL for stable APIs:

await smart_api_fetch({
  url: "https://api.example.com/data",
  ttl: 3600  // 1 hour instead of default 5 minutes
});

Check cache size limits:

await smart_cache({
  operation: "configure",
  l1MaxSize: 2000,  // Increase from default 1000
  l2MaxSize: 20000  // Increase from default 10000
});

Issue: High Memory Usage

Symptom: Node.js process using excessive memory

Cause: Large cache in memory (L1/L2 tiers)

Solution: Configure cache limits:

await smart_cache({
  operation: "configure",
  evictionStrategy: "LRU",  // Least Recently Used
  l1MaxSize: 500,  // Reduce L1 cache
  l2MaxSize: 5000  // Reduce L2 cache
});

Or clear the cache:

await clear_cache({});

Issue: Slow First-Time Operations

Symptom: Initial Read/Grep/Glob operations are slow

Cause: Cache is empty, building indexes

Solution: This is expected behavior. Subsequent operations will be 80-90% faster.

To pre-warm the cache:

await cache_warmup({
  paths: ["/src", "/tests", "/docs"],
  recursive: true,
  schedule: "startup"  // Auto-warm on every session start
});

Issue: "Permission denied" Errors on Windows

Symptom: Cannot write to cache or log files

Cause: PowerShell execution policy or file permissions

Solution:

Set execution policy:

Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope CurrentUser

Check file permissions:

icacls "$env:USERPROFILE\.token-optimizer"

Re-run installer as Administrator if needed

Issue: Cache Files Growing Too Large

Symptom: ~/.token-optimizer/cache.db is >1GB

Cause: Caching very large files or many API responses

Solution:

Clear old entries:

await clear_cache({ olderThan: 7 });  // Clear entries older than 7 days

Reduce cache retention:

await smart_cache({
  operation: "configure",
  defaultTTL: 3600  // 1 hour instead of 7 days
});

Manually delete cache (nuclear option):
```
rm -rf ~/.token-optimizer/cache.db
```

Getting Help

If you encounter issues not covered here:

Check the hook logs: ~/.claude-global/hooks/logs/dispatcher.log
Check session data: ~/.claude-global/hooks/data/current-session.txt
File an issue: GitHub Issues
- Include debug logs
- Include your OS and Node.js version
- Include the output of get_session_stats

Limitations

Small Text: Best for content >500 characters (cache overhead on small snippets)
One-Time Content: No benefit for content that won't be referenced again
Cache Storage: Automatic cleanup after 7 days to prevent disk usage issues
Token Counting: Uses the GPT-4 tokenizer (tiktoken), so counts for Claude models are approximations

License

MIT License - see LICENSE for details

Author

Built for optimal Claude Code token efficiency by the ooples team.