Lets you run Python code on Google Colab's free T4 or premium L4 GPUs from any MCP client without local hardware. Three tools: colab_execute for inline code snippets, colab_execute_file for local .py scripts, and colab_execute_notebook to run code and download all generated artifacts like model weights or CSVs. First run triggers OAuth2 in your browser, then tokens cache locally. Handles timeouts, captures stdout/stderr per cell, and validates zip extractions to prevent path traversal. Useful when you need to prototype CUDA kernels, fine-tune models, or run torch/TensorFlow workloads from Claude without spinning up cloud instances.
MCP server that allocates Google Colab GPU runtimes (T4/L4) and executes Python code on them. Lets any MCP-compatible AI assistant — Claude Code, Claude Desktop, Gemini CLI, Cline, and others — run GPU-accelerated code (CUDA, PyTorch, TensorFlow) without local GPU hardware.
Tokens are cached at ~/.config/colab-exec/token.json for subsequent runs.
Install with pip:
pip install mcp-server-colab-exec
Or run directly with uvx:
uvx mcp-server-colab-exec
For Claude Code, add this to your project's .mcp.json or ~/.claude/.mcp.json:
{
  "mcpServers": {
    "colab-exec": {
      "command": "mcp-server-colab-exec"
    }
  }
}
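If the config file already lists other servers, the entry has to be merged in, not pasted over the file. A minimal sketch of that merge follows; the helper name `add_colab_exec_server` is hypothetical and not part of this package, and it assumes the file is plain JSON with a top-level `mcpServers` object as shown above.

```python
import json
from pathlib import Path


def add_colab_exec_server(config_path: Path) -> dict:
    """Merge the colab-exec entry into an MCP config file, creating it if needed."""
    config = {}
    if config_path.exists():
        # Keep whatever servers are already configured.
        config = json.loads(config_path.read_text(encoding="utf-8"))
    servers = config.setdefault("mcpServers", {})
    servers["colab-exec"] = {"command": "mcp-server-colab-exec"}
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

Calling `add_colab_exec_server(Path(".mcp.json"))` from the project root would update the project-level config and leave existing entries in place.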
Or via the CLI:
claude mcp add colab-exec mcp-server-colab-exec
For Claude Desktop, add this to claude_desktop_config.json:
{
  "mcpServers": {
    "colab-exec": {
      "command": "mcp-server-colab-exec"
    }
  }
}
For Gemini CLI:
gemini mcp add colab-exec -- mcp-server-colab-exec
colab_execute
Execute inline Python code on a Colab GPU runtime.
| Parameter | Type | Default | Description |
|---|---|---|---|
| code | string | - | Python code to execute (required) |
| accelerator | string | "T4" | GPU type: "T4" (free) or "L4" (premium) |
| timeout | int | 300 | Max execution time in seconds |
Returns JSON with per-cell output, errors, and stderr.
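The exact shape of that JSON is not documented here, so the following is only a sketch of how a caller might condense a result. The key names `cells`, `output`, `error`, and `stderr` are assumptions made for illustration, as is the helper name `summarize_result`; check them against real server output before relying on them.

```python
import json


def summarize_result(raw: str) -> dict:
    """Condense a result payload into counts and joined streams.

    The 'cells', 'output', 'error', and 'stderr' keys are assumed for
    illustration; they are not confirmed by the server documentation.
    """
    payload = json.loads(raw)
    cells = payload.get("cells", [])
    # Indices of cells that reported a non-empty error.
    failed = [i for i, cell in enumerate(cells) if cell.get("error")]
    return {
        "cell_count": len(cells),
        "failed_cells": failed,
        "stdout": "".join(cell.get("output") or "" for cell in cells),
        "stderr": "".join(cell.get("stderr") or "" for cell in cells),
    }
```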
colab_execute_file
Execute a local .py file on a Colab GPU runtime.
| Parameter | Type | Default | Description |
|---|---|---|---|
| file_path | string | - | Path to a local .py file (required) |
| accelerator | string | "T4" | GPU type: "T4" (free) or "L4" (premium) |
| timeout | int | 300 | Max execution time in seconds |
Security policy: file_path must be a .py file inside the current workspace (cwd).
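A check along these lines could enforce that policy. This is a sketch, not the server's actual implementation: the function name `validate_script_path` is hypothetical, and it assumes Python 3.9+ for `Path.is_relative_to`.

```python
from pathlib import Path


def validate_script_path(file_path: str, workspace: Path) -> Path:
    """Return the resolved path if it is a .py file inside workspace, else raise ValueError."""
    root = workspace.resolve()
    candidate = Path(file_path)
    if not candidate.is_absolute():
        candidate = root / candidate
    # resolve() collapses ".." segments and follows symlinks before the checks.
    resolved = candidate.resolve()
    if resolved.suffix != ".py":
        raise ValueError(f"not a .py file: {resolved}")
    if not resolved.is_relative_to(root):
        raise ValueError(f"outside workspace: {resolved}")
    if not resolved.is_file():
        raise ValueError(f"file not found: {resolved}")
    return resolved
```

Resolving before comparing matters: a path such as `sub/../../secret.py` looks relative to the workspace as a string but points outside it once normalized.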
colab_execute_notebook
Execute code and collect all generated artifacts (images, CSVs, models, etc.).
| Parameter | Type | Default | Description |
|---|---|---|---|
| code | string | - | Python code to execute (required) |
| output_dir | string | - | Local directory for downloaded artifacts (required) |
| accelerator | string | "T4" | GPU type: "T4" (free) or "L4" (premium) |
| timeout | int | 300 | Max execution time in seconds |
Artifacts are downloaded as a zip and extracted into output_dir.
Zip members are validated before extraction to prevent path traversal and special-file writes.
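The validation described above can be sketched with the standard `zipfile` and `stat` modules. This illustrates the technique and is not the server's actual code: `safe_extract` is a hypothetical name, and the sketch assumes Python 3.9+ and that any Unix mode bits are stored in the upper 16 bits of `ZipInfo.external_attr`.

```python
import stat
import zipfile
from pathlib import Path


def safe_extract(zip_path: Path, output_dir: Path) -> list[Path]:
    """Extract zip_path into output_dir, rejecting unsafe members first."""
    root = output_dir.resolve()
    with zipfile.ZipFile(zip_path) as archive:
        members = archive.infolist()
        for info in members:
            # Reject absolute names and ".." segments that escape output_dir.
            target = (root / info.filename).resolve()
            if not target.is_relative_to(root):
                raise ValueError(f"path traversal in zip member: {info.filename}")
            # Upper 16 bits hold the Unix mode; the type bits are 0 if not recorded.
            file_type = stat.S_IFMT(info.external_attr >> 16)
            if file_type not in (0, stat.S_IFREG, stat.S_IFDIR):
                raise ValueError(f"special file in zip member: {info.filename}")
        # Nothing is written until every member has passed both checks.
        root.mkdir(parents=True, exist_ok=True)
        archive.extractall(root)
    return [root / info.filename for info in members]
```

Validating the whole archive before extracting anything means a single bad member leaves `output_dir` untouched.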
Check GPU availability:
colab_execute(code="import torch; print(torch.cuda.is_available()); print(torch.cuda.get_device_name(0))")
Run nvidia-smi:
colab_execute(code="import subprocess; print(subprocess.run(['nvidia-smi'], capture_output=True, text=True).stdout)")
Train a model and download weights:
colab_execute_notebook(
    code="import torch; model = torch.nn.Linear(10, 1); torch.save(model.state_dict(), '/tmp/model.pt')",
    output_dir="./outputs"
)
On first use, the server opens a browser window for Google OAuth2 consent. The access token and refresh token are cached at ~/.config/colab-exec/token.json. Subsequent runs use the cached token and refresh it automatically.
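To see in advance whether the next run is likely to open the browser, it can help to check for the cache file. The sketch below only looks at the file's presence and age and does not read its contents, since the token file's internal format is not documented here. The helper name `token_cache_status` is hypothetical.

```python
import time
from pathlib import Path

# Cache location as stated in this README.
TOKEN_PATH = Path.home() / ".config" / "colab-exec" / "token.json"


def token_cache_status(path: Path = TOKEN_PATH) -> dict:
    """Report whether the token cache exists and how old it is, without reading it."""
    if not path.is_file():
        return {"exists": False, "age_seconds": None}
    age = time.time() - path.stat().st_mtime
    return {"exists": True, "age_seconds": max(age, 0.0)}
```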
The OAuth2 client credentials are the same ones used by the official Google Colab VS Code extension (google.colab@0.3.0). They are intentionally public.
"GPU quota exceeded" — Colab has usage limits. Wait and retry, or use a different Google account.
"Timed out creating kernel session" — The runtime took too long to start. Retry — Colab sometimes has delays during peak usage.
"Authentication failed" — Delete ~/.config/colab-exec/token.json and re-authenticate.
OAuth browser window doesn't open — Ensure you're running in an environment with a browser. For headless servers, authenticate on a machine with a browser first and copy the token file.
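The two token-related fixes above (deleting the cache, and copying a token file onto a headless machine) can be scripted. This is a sketch with hypothetical helper names, `reset_token_cache` and `install_token`. Restricting the copied file to owner-only permissions is an added precaution assumed here, not something the server is documented to require.

```python
import shutil
from pathlib import Path

# Cache location as stated in this README.
TOKEN_PATH = Path.home() / ".config" / "colab-exec" / "token.json"


def reset_token_cache(path: Path = TOKEN_PATH) -> bool:
    """Delete the cached token so the next run repeats the OAuth2 consent flow.

    Returns True if a file was removed, False if there was nothing to delete.
    """
    try:
        path.unlink()
    except FileNotFoundError:
        return False
    return True


def install_token(source: Path, path: Path = TOKEN_PATH) -> Path:
    """Copy a token file obtained on another machine into the cache location."""
    path.parent.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(source, path)
    path.chmod(0o600)  # owner read/write only; has limited effect on Windows
    return path
```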
License: MIT