A Swift-native macOS server that pipes your voice into Claude and lets it speak back. Exposes two tools: `listen` captures audio via AVFoundation and transcribes with Apple's on-device SFSpeechRecognizer across 50+ languages, and `speak` uses ElevenLabs TTS with automatic fallback to macOS system voices. Ships as a 200KB binary with zero runtime dependencies. Requires macOS 13+, microphone permissions, and optionally an ElevenLabs API key for higher quality output. Useful when you want hands-free interaction with Claude Code or need to prototype voice-driven workflows without spinning up Node or Python tooling. Configuration is a single command path in your MCP settings file.
A native macOS MCP server that gives Claude a voice.
Built in Swift. No Node.js. No Python. No Electron. Just a single binary.
    {
      "mcpServers": {
        "vox": {
          "command": "/path/to/vox"
        }
      }
    }
Then just say: listen — and Claude hears you.
| Tool | What it does |
|---|---|
| `listen` | Activates the mic, waits for you to speak, returns transcript + detected language when silence is detected |
| `speak` | Speaks text aloud: ElevenLabs TTS with automatic fallback to macOS system voice |
Optional: `ELEVENLABS_API_KEY` in `~/.claude/.env` for high-quality multilingual TTS.
Copy and paste this into Claude Code:
    Install vox (native macOS voice input/output MCP server):
    1. Download the code-signed binary to ~/vox:
       curl -L https://github.com/boska/vox/releases/download/v1.1.0/vox-1.1.0-darwin-arm64 -o ~/vox && chmod +x ~/vox
    2. Add to ~/.claude.json in the mcpServers section:
       {
         "mcpServers": {
           "vox": {
             "command": "/Users/$(whoami)/vox"
           }
         }
       }
    3. Restart Claude Code
    4. Test it by saying: listen
Claude will handle the download, configuration, and restart.
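    curl -L https://github.com/boska/vox/releases/download/v1.1.0/vox-1.1.0-darwin-arm64 -o ~/vox && chmod +x ~/vox
Then add to ~/.claude.json, replacing <your-username> with the output of whoami (JSON itself performs no shell expansion, so a literal $(whoami) is unlikely to resolve):
    {
      "mcpServers": {
        "vox": {
          "command": "/Users/<your-username>/vox"
        }
      }
    }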
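If you would rather script the settings change than edit the file by hand, the merge could look roughly like the Swift sketch below. It is only an illustration: the helper name `addingVoxServer` is hypothetical, the demo works on an in-memory sample instead of the real `~/.claude.json`, and it assumes the settings file is a JSON object with an optional `mcpServers` key.

```swift
import Foundation

/// Returns `existing` (the contents of a settings file such as ~/.claude.json)
/// with an `mcpServers.vox` entry pointing at `binaryPath`.
/// Other top-level keys and other servers are preserved.
/// Hypothetical helper, not part of vox.
func addingVoxServer(to existing: Data, binaryPath: String) throws -> Data {
    var root = (try JSONSerialization.jsonObject(with: existing) as? [String: Any]) ?? [:]
    var servers = root["mcpServers"] as? [String: Any] ?? [:]
    servers["vox"] = ["command": binaryPath]
    root["mcpServers"] = servers
    return try JSONSerialization.data(
        withJSONObject: root,
        options: [.prettyPrinted, .sortedKeys, .withoutEscapingSlashes]
    )
}

// Resolve the absolute path in code, because JSON has no shell expansion.
let binaryPath = NSHomeDirectory() + "/vox"

// Demo on an in-memory document; a real script would read and write ~/.claude.json.
let sample = Data(#"{"mcpServers": {"other": {"command": "/usr/local/bin/other"}}}"#.utf8)
let updated = try addingVoxServer(to: sample, binaryPath: binaryPath)
print(String(decoding: updated, as: UTF8.self))
```

Note that `.sortedKeys` reorders the keys of the file, which is harmless for JSON but produces a noisier diff; back up the original before overwriting it.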
Restart Claude Code.
    git clone https://github.com/boska/vox
    cd vox
    swift build -c release --product vox
Add to ~/.claude.json:
    {
      "mcpServers": {
        "vox": {
          "command": "/path/to/vox/.build/release/vox"
        }
      }
    }
Restart Claude Code.
The default TTS voice is Hana via ElevenLabs (model `eleven_multilingual_v2`). It speaks 20+ languages in the same voice, including Chinese, English, Czech, and Vietnamese. Without an ElevenLabs key, vox falls back to the macOS system voice automatically.
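The fallback described above can be pictured as a small selection step. The Swift sketch below is an assumption about the general shape, not vox's actual code: the names `TTSBackend`, `chooseBackend`, and `speak` are hypothetical, the cloud and system speakers are passed in as stand-in closures, and falling back when the cloud call fails (not only when the key is missing) is an extra assumption.

```swift
import Foundation

/// The two output paths: cloud TTS or the built-in system voice.
enum TTSBackend: Equatable {
    case elevenLabs(apiKey: String)
    case systemVoice
}

/// Picks ElevenLabs when a non-empty key is present, otherwise the system voice.
func chooseBackend(environment: [String: String]) -> TTSBackend {
    let key = environment["ELEVENLABS_API_KEY"]?
        .trimmingCharacters(in: .whitespacesAndNewlines) ?? ""
    return key.isEmpty ? .systemVoice : .elevenLabs(apiKey: key)
}

/// Tries the chosen backend; if the cloud call throws, uses the system voice.
/// (Fallback on request failure is an assumption of this sketch.)
func speak(_ text: String,
           backend: TTSBackend,
           cloud: (String, String) throws -> Void,
           system: (String) -> Void) {
    switch backend {
    case .elevenLabs(let apiKey):
        do { try cloud(text, apiKey) } catch { system(text) }
    case .systemVoice:
        system(text)
    }
}

// Stand-in speakers that only record which path was taken.
struct CloudError: Error {}
var log: [String] = []

speak("hi", backend: chooseBackend(environment: [:]),
      cloud: { _, _ in log.append("cloud") },
      system: { _ in log.append("system") })

speak("hi", backend: chooseBackend(environment: ["ELEVENLABS_API_KEY": "sk-test"]),
      cloud: { _, _ in throw CloudError() },
      system: { _ in log.append("system (after cloud failure)") })

print(log)
```

Passing the speakers in as closures keeps the selection logic testable without a network connection or audio hardware.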
    Claude Code
        ↓ JSON-RPC 2.0 over stdio
    vox binary
        ↓                       ↑
    AVAudioEngine           AVSpeechSynthesizer
    SFSpeechRecognizer      ElevenLabs API
    NLLanguageRecognizer
No ports. No sockets. No daemon. Just stdin/stdout.
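To make the stdio transport concrete, here is a minimal Swift sketch of how a client might frame a single request. It assumes MCP's newline-delimited JSON framing and its `tools/call` method; the helper name `makeRequestLine` and the `text` argument key are illustrative guesses, not taken from vox's tool schema.

```swift
import Foundation

/// Builds one newline-terminated JSON-RPC 2.0 request line.
/// Hypothetical helper for illustration only.
func makeRequestLine(id: Int, method: String, params: [String: Any]) throws -> String {
    let message: [String: Any] = [
        "jsonrpc": "2.0",
        "id": id,
        "method": method,
        "params": params,
    ]
    // Compact output (no .prettyPrinted) keeps the whole message on one line,
    // which newline-delimited framing requires.
    let data = try JSONSerialization.data(
        withJSONObject: message,
        options: [.sortedKeys, .withoutEscapingSlashes]
    )
    return String(decoding: data, as: UTF8.self) + "\n"
}

// The "text" argument key is an assumption about the speak tool's input.
let line = try makeRequestLine(
    id: 1,
    method: "tools/call",
    params: ["name": "speak", "arguments": ["text": "Hello from vox"]]
)
print(line, terminator: "")
```

A client would write this line to the server's stdin and read the matching response, identified by the same `id`, from its stdout.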
"No speech detected": start speaking within about 1s of calling listen. Voice activity detection (VAD) ends the capture after 0.8s of silence.
No default input device: the Mac mini has no built-in mic. Connect a USB or Bluetooth mic, or an iPhone via Continuity Camera, then set it as the default in System Settings → Sound → Input.
ElevenLabs silent: add ELEVENLABS_API_KEY=sk-... to ~/.claude/.env, or leave it out to use the system voice.
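The timing behind the "No speech detected" item is easier to reason about with a toy model. The Swift sketch below is a simple energy-threshold endpointer, not vox's actual VAD: the function name `endpoint`, the threshold value, the frame size, and the idea that capture gives up after about 1s of initial silence are all assumptions made for illustration. Only the 0.8s silence cutoff and the "~1s" hint come from the text above.

```swift
/// Possible results of scanning a sequence of audio frames.
enum ListenOutcome: Equatable {
    case noSpeechDetected            // nothing voiced before the start timeout
    case finished(capturedMs: Int)   // first voiced frame through the cutoff
    case stillListening              // ran out of frames before any cutoff
}

/// Scans per-frame loudness levels and decides when capture would stop.
/// Durations are integer milliseconds to avoid floating-point comparisons.
func endpoint(levels: [Double],
              frameMs: Int,
              threshold: Double = 0.02,     // hypothetical loudness threshold
              startTimeoutMs: Int = 1000,   // assumed from "~1s" in the text
              silenceTimeoutMs: Int = 800   // "0.8s silence" in the text
) -> ListenOutcome {
    var firstVoiced: Int? = nil
    var silentFrames = 0
    for (index, level) in levels.enumerated() {
        let voiced = level >= threshold
        if let start = firstVoiced {
            // After speech has started, count consecutive silent frames.
            silentFrames = voiced ? 0 : silentFrames + 1
            if silentFrames * frameMs >= silenceTimeoutMs {
                return .finished(capturedMs: (index + 1 - start) * frameMs)
            }
        } else if voiced {
            firstVoiced = index
        } else if (index + 1) * frameMs >= startTimeoutMs {
            return .noSpeechDetected
        }
    }
    return .stillListening
}

// 300 ms of quiet, 500 ms of speech, then 800 ms of quiet (100 ms frames).
let quiet = Array(repeating: 0.0, count: 3)
let talk = Array(repeating: 0.3, count: 5)
let tail = Array(repeating: 0.0, count: 8)
print(endpoint(levels: quiet + talk + tail, frameMs: 100))

// A full second of quiet with no speech at all.
print(endpoint(levels: Array(repeating: 0.0, count: 10), frameMs: 100))
```

Under this model, a pause longer than the silence timeout in the middle of a sentence ends the capture early, which is why speaking promptly and continuously gives the most reliable transcripts.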
`ELEVENLABS_API_KEY` (secret): ElevenLabs API key for high-quality TTS. Optional; falls back to macOS system voice if not set.