General
clawvox
ClawVox - ElevenLabs voice studio for OpenClaw.
---
name: clawvox
description: ClawVox - ElevenLabs voice studio for OpenClaw. Generate speech, transcribe audio, clone voices, create sound effects, and more.
homepage: https://elevenlabs.io/developers
metadata:
{
"openclaw": {
"emoji": "🎙️",
"skillKey": "clawvox",
"requires": {
"bins": ["curl", "jq"],
"env": ["ELEVENLABS_API_KEY"]
},
"primaryEnv": "ELEVENLABS_API_KEY"
}
}
---
# ClawVox
Transform your OpenClaw assistant into a professional voice production studio with ClawVox - powered by ElevenLabs.
## Quick Reference
| Action | Command | Description |
|--------|---------|-------------|
| Speak | `{baseDir}/scripts/speak.sh 'text'` | Convert text to speech |
| Transcribe | `{baseDir}/scripts/transcribe.sh audio.mp3` | Speech to text |
| Clone | `{baseDir}/scripts/clone.sh --name "Voice" sample.mp3` | Clone a voice |
| SFX | `{baseDir}/scripts/sfx.sh "thunder storm"` | Generate sound effects |
| Voices | `{baseDir}/scripts/voices.sh list` | List available voices |
| Dub | `{baseDir}/scripts/dub.sh --target es audio.mp3` | Translate audio |
| Isolate | `{baseDir}/scripts/isolate.sh audio.mp3` | Remove background noise |
## Setup
1. Get your API key from [elevenlabs.io/app/settings/api-keys](https://elevenlabs.io/app/settings/api-keys)
2. Configure in `~/.openclaw/openclaw.json`:
```json5
{
skills: {
entries: {
"clawvox": {
apiKey: "YOUR_ELEVENLABS_API_KEY",
config: {
defaultVoice: "Rachel",
defaultModel: "eleven_turbo_v2_5",
outputDir: "~/.openclaw/audio"
}
}
}
}
}
```
Or set the environment variable:
```bash
export ELEVENLABS_API_KEY="your_api_key_here"
```
## Voice Generation (TTS)
### Basic Text-to-Speech
```bash
# Quick speak with default voice (Rachel)
{baseDir}/scripts/speak.sh 'Hello, I am your personal AI assistant.'
# Specify voice by name
{baseDir}/scripts/speak.sh --voice Adam 'Hello from Adam'
# Save to file
{baseDir}/scripts/speak.sh --out ~/audio/greeting.mp3 'Welcome to the show'
# Use specific model
{baseDir}/scripts/speak.sh --model eleven_multilingual_v2 'Bonjour'
# Adjust voice settings
{baseDir}/scripts/speak.sh --stability 0.5 --similarity 0.8 'Expressive speech'
# Adjust speed
{baseDir}/scripts/speak.sh --speed 1.2 'Faster speech'
# Use multilingual model for other languages
{baseDir}/scripts/speak.sh --model eleven_multilingual_v2 --voice Rachel 'Hola, que tal'
{baseDir}/scripts/speak.sh --model eleven_multilingual_v2 --voice Adam 'Guten Tag'
```
### Voice Models
| Model | Latency | Languages | Best For |
|-------|---------|-----------|----------|
| `eleven_flash_v2_5` | ~75ms | 32 | Real-time, streaming |
| `eleven_turbo_v2_5` | ~250ms | 32 | Balanced quality/speed |
| `eleven_multilingual_v2` | ~500ms | 29 | Long-form, highest quality |
### Available Voices
Premade voices: Rachel, Adam, Antoni, Bella, Domi, Elli, Josh, Sam, Callum, Charlie, George, Liam, Matilda, Alice, Bill, Brian, Chris, Daniel, Eric, Jessica, Laura, Lily, River, Roger, Sarah, Will
### Long-Form Content
```bash
# Generate audio from text file
{baseDir}/scripts/speak.sh --input chapter.txt --voice "George" --out audiobook.mp3
```
## Speech-to-Text (Transcription)
### Basic Transcription
```bash
# Transcribe audio file
{baseDir}/scripts/transcribe.sh recording.mp3
# Save to file
{baseDir}/scripts/transcribe.sh --out transcript.txt audio.mp3
# Transcribe with language hint
{baseDir}/scripts/transcribe.sh --language es spanish_audio.mp3
# Include timestamps
{baseDir}/scripts/transcribe.sh --timestamps podcast.mp3
```
### Supported Formats
- MP3, MP4, MPEG, MPGA, M4A, WAV, WebM
- Maximum file size: 100MB
## Voice Cloning
### Instant Voice Clone
```bash
# Clone from single sample (minimum 30 seconds recommended)
{baseDir}/scripts/clone.sh --name MyVoice recording.mp3
# Clone with description
{baseDir}/scripts/clone.sh --name BusinessVoice \
--description 'Professional male voice' \
sample.mp3
# Clone with labels
{baseDir}/scripts/clone.sh --name MyVoice \
--labels '{"gender":"male","age":"adult"}' \
sample.mp3
# Remove background noise during cloning
{baseDir}/scripts/clone.sh --name CleanVoice \
--remove-bg-noise \
sample.mp3
# Test cloned voice
{baseDir}/scripts/speak.sh --voice MyVoice 'Testing my cloned voice'
```
## Voice Library Management
```bash
# List all available voices
{baseDir}/scripts/voices.sh list
# Get voice details
{baseDir}/scripts/voices.sh info --name Rachel
{baseDir}/scripts/voices.sh info --id 21m00Tcm4TlvDq8ikWAM
# Search voices (filter output with grep)
{baseDir}/scripts/voices.sh list | grep -i "female"
# Filter by category
{baseDir}/scripts/voices.sh list --category premade
{baseDir}/scripts/voices.sh list --category cloned
# Download voice preview
{baseDir}/scripts/voices.sh preview --name Rachel -o preview.mp3
# Delete custom voice
{baseDir}/scripts/voices.sh delete --id "voice_id"
```
## Sound Effects
```bash
# Generate sound effect
{baseDir}/scripts/sfx.sh 'Heavy rain on a tin roof'
# With duration
{baseDir}/scripts/sfx.sh --duration 5 'Forest ambiance with birds'
# With prompt influence (higher = more accurate)
{baseDir}/scripts/sfx.sh --influence 0.8 'Sci-fi laser gun firing'
# Save to file
{baseDir}/scripts/sfx.sh --out effects/thunder.mp3 'Rolling thunder'
```
**Note:** Duration range is 0.5 to 22 seconds (rounded to nearest 0.5)
## Voice Isolation
```bash
# Remove background noise and isolate voice
{baseDir}/scripts/isolate.sh noisy_recording.mp3
# Save to specific file
{baseDir}/scripts/isolate.sh --out clean_voice.mp3 meeting_recording.mp3
# Don't tag audio events
{baseDir}/scripts/isolate.sh --no-audio-events recording.mp3
```
**Requirements:**
- Minimum duration: 4.6 seconds
- Supported formats: MP3, WAV, M4A, OGG, FLAC
## Dubbing (Multi-Language Translation)
```bash
# Dub audio to Spanish
{baseDir}/scripts/dub.sh --target es audio.mp3
# Dub with source language specified
{baseDir}/scripts/dub.sh --source en --target ja video.mp4
# Check dubbing status
{baseDir}/scripts/dub.sh --status --id "dubbing_id"
# Download dubbed audio
{baseDir}/scripts/dub.sh --download --id "dubbing_id" --out dubbed.mp3
```
**Supported languages:** en, es, fr, de, it, pt, pl, hi, ar, zh, ja, ko, nl, ru, tr, vi, sv, da, fi, cs, el, he, id, ms, no, ro, uk, hu, th
## API Usage Examples
For direct API access, all scripts use curl under the hood:
```bash
# Direct TTS API call
curl -X POST "https://api.elevenlabs.io/v1/text-to-speech/VOICE_ID" \
-H "xi-api-key: $ELEVENLABS_API_KEY" \
-H "Content-Type: application/json" \
-d '{"text": "Hello world", "model_id": "eleven_turbo_v2_5"}' \
--output speech.mp3
```
## Error Handling
All scripts provide helpful error messages:
- **401**: Authentication failed - Check your API key
- **403**: Permission denied - Your API key may not have access
- **429**: Rate limit exceeded - Wait before trying again
- **500/502/503**: ElevenLabs API issues - Try again later
## Testing
Run the test suite to verify everything works:
```bash
{baseDir}/test.sh YOUR_API_KEY
```
Or with environment variable:
```bash
export ELEVENLABS_API_KEY="your_key"
{baseDir}/test.sh
```
## Troubleshooting
### Common Issues
1. **"exec host not allowed (requested gateway)"**
- The skill needs to run commands in a sandbox environment
- Configure OpenClaw to use sandbox: `tools.exec.host: "sandbox"`
- Or enable sandboxing in your OpenClaw config
- Alternative: Configure exec approvals for gateway host (see OpenClaw docs)
2. **Parse errors with quotes or exclamation marks**
- Use single quotes instead of double quotes: `'Hello world'` not `"Hello world!"`
- Avoid exclamation marks (`!`) in text when using double quotes
- For complex text, use the `--input` option with a file
3. **"ELEVENLABS_API_KEY not set"**
- Ensure `ELEVENLABS_API_KEY` is set or configured in openclaw.json
- Check that the API key is at least 20 characters long
2. **"jq is required but not installed"**
- Install jq: `apt-get install jq` (Linux) or `brew install jq` (macOS)
3. **"Rate limited"**
- Check your ElevenLabs plan quota at elevenlabs.io/app/usage
- Free tier: ~10,000 characters/month
4. **"Voice not found"**
- Use `{baseDir}/scripts/voices.sh list` to see available voices
- Check if the voice ID is correct
5. **"Dubbing failed"**
- Ensure source audio is clear and audible
- Check supported language codes
6. **"File too large"**
- Transcription: 100MB max
- Dubbing: 500MB max
- Voice cloning: 50MB per file
### Debug Mode
```bash
# Enable verbose output
DEBUG=1 {baseDir}/scripts/speak.sh 'test'
# Show API request details
DEBUG=1 {baseDir}/scripts/transcribe.sh audio.mp3
```
## Pricing Notes
ElevenLabs API pricing (approximate):
- **Flash v2.5**: ~$0.06/min
- **Turbo v2.5**: ~$0.06/min
- **Multilingual v2**: ~$0.12/min
- **Voice cloning**: Included in plan
- **Sound effects**: ~$0.02/generation
- **Transcription**: ~$0.02/min (Scribe v1)
Free tier: ~10,000 characters/month
## Links
- [ElevenLabs Dashboard](https://elevenlabs.io/app)
- [API Documentation](https://elevenlabs.io/docs)
- [Voice Library](https://elevenlabs.io/voice-library)
- [Pricing](https://elevenlabs.io/pricing)
general
By
Comments
Sign in to leave a comment