
Chatterbox

By scottgl9

Chatterbox plugin for OpenClaw



# OpenClaw Chatterbox TTS Plugin

A native local text-to-speech plugin for [OpenClaw](https://github.com/nicepkg/openclaw) using [Resemble AI's Chatterbox](https://github.com/resemble-ai/chatterbox), an open-source (MIT) TTS system with zero-shot voice cloning support.

## Features

- **Zero-shot voice cloning**: provide a reference WAV file to clone any voice
- **Multiple model variants**: turbo (fast), standard, and multilingual
- **Fully local**: no cloud API keys required; runs entirely on your hardware
- **GPU accelerated**: automatic detection of CUDA and Apple MPS devices
- **Optional ffmpeg integration**: converts to mp3/opus when available, falls back to WAV
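
The ffmpeg fallback described in the last bullet can be sketched roughly as follows. This is an illustrative Python sketch, not the plugin's actual `audio-convert.ts` logic; the function names `pick_output_format` and `convert_wav` are hypothetical, and the ffmpeg invocation relies on ffmpeg choosing the codec from the output file extension.

```python
import shutil
import subprocess
from pathlib import Path
from typing import Optional


def pick_output_format(requested: str, ffmpeg_path: Optional[str]) -> str:
    """Return the requested format if ffmpeg is available, otherwise fall back to WAV."""
    if requested == "wav" or ffmpeg_path is None:
        return "wav"
    return requested


def convert_wav(wav_path: Path, requested: str = "mp3") -> Path:
    """Convert a WAV file with ffmpeg when installed; otherwise return the WAV unchanged."""
    ffmpeg_path = shutil.which("ffmpeg")  # None when ffmpeg is not on PATH
    fmt = pick_output_format(requested, ffmpeg_path)
    if fmt == "wav":
        return wav_path
    out_path = wav_path.with_suffix(f".{fmt}")
    # -y overwrites an existing output file; -i names the input file.
    subprocess.run([ffmpeg_path, "-y", "-i", str(wav_path), str(out_path)], check=True)
    return out_path
```

Keeping the format decision in a separate pure function makes the fallback easy to test without ffmpeg installed.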

## Prerequisites

- **Python 3.10+**: `python3` or `python` must be on your PATH
- **PyTorch**: installed with appropriate CUDA/MPS support for your hardware
- **ffmpeg** (optional): for mp3/opus output; WAV output works without it
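
To check which accelerator your PyTorch install can see, a sketch like the one below may help. It is an assumption about how an `auto` device setting could be resolved (CUDA first, then MPS, then CPU), not a copy of the server's code; `resolve_device` and `detect_device` are hypothetical names.

```python
def resolve_device(requested: str, cuda_available: bool, mps_available: bool) -> str:
    """Map a requested device to a concrete one; only "auto" is resolved."""
    if requested != "auto":
        return requested  # an explicit choice is passed through unchanged
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"


def detect_device(requested: str = "auto") -> str:
    """Probe PyTorch for available backends and resolve the device."""
    import torch  # imported lazily so resolve_device works without PyTorch installed

    return resolve_device(
        requested,
        torch.cuda.is_available(),
        torch.backends.mps.is_available(),
    )
```

Running `detect_device()` in the same environment as the server shows which backend PyTorch would report there.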

## Installation

### 1. Install the Python server dependencies

```bash
pip install -r server/requirements.txt
```

Or install manually:

```bash
pip install chatterbox-tts fastapi 'uvicorn[standard]'
```

### 2. Install the plugin in OpenClaw

Copy or symlink this directory into your OpenClaw extensions folder, or add it as a dependency in your OpenClaw configuration.

## Configuration

Configuration is resolved in order of precedence: **OpenClaw config** → **environment variables** → **defaults**.

### OpenClaw Config

Add to your OpenClaw config under `messages.tts.chatterbox`:

```json
{
  "messages": {
    "tts": {
      "chatterbox": {
        "model": "turbo",
        "device": "auto",
        "port": 8099,
        "referenceAudio": "/path/to/reference.wav",
        "temperature": 1.0,
        "exaggeration": 1.0,
        "cfgWeight": 0.5
      }
    }
  }
}
```

### Environment Variables

| Variable | Default | Description |
|---|---|---|
| `CHATTERBOX_MODEL` | `turbo` | Model variant: `turbo`, `standard`, `multilingual` |
| `CHATTERBOX_DEVICE` | `auto` | PyTorch device: `auto`, `cuda`, `mps`, `cpu` |
| `CHATTERBOX_PORT` | `8099` | Port for the managed Python server |
| `CHATTERBOX_BASE_URL` | (none) | URL of an external Chatterbox server (skips local spawn) |
| `CHATTERBOX_REFERENCE_AUDIO` | (none) | Path to voice cloning reference WAV |
| `CHATTERBOX_TEMPERATURE` | (none) | Sampling temperature |
| `CHATTERBOX_EXAGGERATION` | (none) | Expressiveness exaggeration parameter |
| `CHATTERBOX_CFG_WEIGHT` | (none) | Classifier-free guidance weight |
| `CHATTERBOX_DISABLED` | (none) | Set to any value to disable the provider |
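
The precedence rule (OpenClaw config, then environment variables, then defaults) can be illustrated with a short sketch. It is written in Python for brevity and is not the plugin's TypeScript resolver; `resolve_setting` is a hypothetical helper, and pairing each config key with the similarly named variable is an assumption for every key except `referenceAudio`. `CHATTERBOX_BASE_URL` and `CHATTERBOX_DISABLED` are left out because the config example shows no matching key.

```python
import os
from typing import Any, Mapping

# Defaults taken from the table above; settings without a default resolve to None.
DEFAULTS = {"model": "turbo", "device": "auto", "port": 8099}

# Assumed pairing of config keys with environment variables.
ENV_NAMES = {
    "model": "CHATTERBOX_MODEL",
    "device": "CHATTERBOX_DEVICE",
    "port": "CHATTERBOX_PORT",
    "referenceAudio": "CHATTERBOX_REFERENCE_AUDIO",
    "temperature": "CHATTERBOX_TEMPERATURE",
    "exaggeration": "CHATTERBOX_EXAGGERATION",
    "cfgWeight": "CHATTERBOX_CFG_WEIGHT",
}

# Environment values are strings, so numeric settings need a cast.
CASTS = {"port": int, "temperature": float, "exaggeration": float, "cfgWeight": float}


def resolve_setting(key: str, config: Mapping[str, Any],
                    env: Mapping[str, str] = os.environ) -> Any:
    """Resolve one setting: config value first, then environment, then default."""
    if key in config:
        return config[key]
    raw = env.get(ENV_NAMES[key])
    if raw:  # ignore unset or empty variables
        return CASTS.get(key, str)(raw)
    return DEFAULTS.get(key)
```

For example, with `"model": "standard"` in the config and `CHATTERBOX_MODEL=multilingual` in the environment, this sketch resolves `model` to `standard`.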

### Plugin Config Schema

The `openclaw.plugin.json` file defines the configuration schema for the plugin. See the file for the full list of configurable properties.

## Architecture

```
┌─────────────────────────────────────────────────────┐
│  OpenClaw                                           │
│  ┌───────────────┐    ┌──────────────────────────┐  │
│  │ Speech System │───▶│ Chatterbox Provider      │  │
│  └───────────────┘    │  (chatterbox-provider.ts)│  │
│                       └──────────┬───────────────┘  │
│                                  │                  │
│  ┌───────────────┐    ┌──────────▼───────────────┐  │
│  │ Audio Convert │◀───│ Server Manager           │  │
│  │ (ffmpeg)      │    │  (server-manager.ts)     │  │
│  └───────────────┘    └──────────┬───────────────┘  │
└──────────────────────────────────┼──────────────────┘
                                   │ HTTP (localhost)
                        ┌──────────▼─────────────────┐
                        │ Python FastAPI Server      │
                        │  (chatterbox_server.py)    │
                        │  ┌──────────────────────┐  │
                        │  │ Chatterbox TTS Model │  │
                        │  └──────────────────────┘  │
                        └────────────────────────────┘
```

- **Plugin entry** (`index.ts`) registers the Chatterbox speech provider with OpenClaw
- **Speech provider** (`chatterbox-provider.ts`) handles synthesis requests, config resolution, and audio format conversion
- **Server manager** (`server-manager.ts`) lazily starts and manages the Python FastAPI server process
- **Audio convert** (`audio-convert.ts`) converts WAV output to mp3/opus via ffmpeg
- **Python server** (`chatterbox_server.py`) loads the Chatterbox model and exposes HTTP endpoints
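
The lazy-start behavior of the server manager can be pictured with the following sketch. It is a simplified Python illustration, not the code in `server-manager.ts`; the `LazyServer` class, its timeout values, and the injected `spawn` and `is_healthy` callables are all hypothetical.

```python
import time
from typing import Callable, Optional


class LazyServer:
    """Spawns a server process on first use and waits for a health probe to pass."""

    def __init__(self, spawn: Callable[[], object], is_healthy: Callable[[], bool],
                 timeout_s: float = 60.0, poll_s: float = 0.5) -> None:
        self._spawn = spawn            # e.g. a function wrapping subprocess.Popen
        self._is_healthy = is_healthy  # e.g. a GET against the /health endpoint
        self._timeout_s = timeout_s
        self._poll_s = poll_s
        self._process: Optional[object] = None

    def ensure_started(self) -> None:
        """Start the server if needed, then block until it reports healthy."""
        if self._process is None:
            self._process = self._spawn()  # only the first call spawns
        deadline = time.monotonic() + self._timeout_s
        while not self._is_healthy():
            if time.monotonic() >= deadline:
                raise TimeoutError("server did not become healthy in time")
            time.sleep(self._poll_s)
```

Calling `ensure_started()` before each synthesis request means the model is loaded only when speech is first needed, and later calls return as soon as the health probe passes.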

## Voices

Chatterbox uses voice cloning rather than predefined voices:

- **Default**: uses the model's default voice
- **Clone (reference audio)**: clones the voice from a reference WAV file specified via the `referenceAudio` config key or the `CHATTERBOX_REFERENCE_AUDIO` env var

## Running the Python Server Standalone

For development or debugging, you can run the server directly:

```bash
cd server
pip install -r requirements.txt
python chatterbox_server.py
```

Test with:

```bash
# Health check
curl http://localhost:8099/health

# Synthesize
curl -X POST http://localhost:8099/synthesize \
  -H 'Content-Type: application/json' \
  -d '{"text": "Hello world"}'
```
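
The same synthesize request can be issued from Python using only the standard library. This is a sketch under one loud assumption: that the `/synthesize` response body is the raw audio bytes, which the README does not state. The helper names `build_synthesize_request` and `synthesize_to_file` are hypothetical.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8099"  # default port from the configuration table


def build_synthesize_request(text: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build the same POST request as the curl example above."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/synthesize",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def synthesize_to_file(text: str, out_path: str, base_url: str = BASE_URL) -> None:
    """Send the request and save the response body to out_path."""
    # Assumption: the server replies with raw audio bytes (not confirmed by the README).
    with urllib.request.urlopen(build_synthesize_request(text, base_url), timeout=120) as resp:
        audio = resp.read()
    with open(out_path, "wb") as f:
        f.write(audio)
```

With the server running, `synthesize_to_file("Hello world", "hello.wav")` would write whatever the endpoint returns to `hello.wav`.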

## Running Tests

### TypeScript tests (vitest)

```bash
npm install
npx vitest
```

### Python tests (pytest)

```bash
pip install pytest httpx anyio
pytest test/test_chatterbox_server.py
```

## License

MIT