Voice
Local Voice
OpenClaw plugin + Skill for local voice channel implementation
README
# local-voice
Local-first voice I/O stack for OpenClaw agents.
`local-voice` combines:
- **STT** with `faster-whisper`
- **TTS** with **Kokoro** (and optional **Qwen3-TTS**)
- **Discord voice bubbles** + **Telegram voice notes**
- Persistent **per-user voice preferences**
- Consent-gated **voice cloning** flow (Qwen3)
---
## Status
This project is actively evolving from skill prototype toward plugin architecture.
Current focus:
- reliable local voice pipelines
- channel adapters (Discord/Telegram)
- backend switching + fallback
- portability and operator UX
---
## Features
- Local STT (`faster-whisper`)
- Local TTS (`kokoro-onnx`)
- Optional Qwen3 backend (`ttscli`) with fallback to Kokoro
- Discord native voice-message send path
- Telegram native voice-note send path
- Language handling (EN/ES + locale-aware fallback)
- Voice profile mapping by language
- Persistent voice/backend preferences (default + per user)
- Voice clone tooling with explicit consent requirement
- Smoke tests and preflight checks
---
## Requirements
- Node.js 18+
- Python 3.10+
- ffmpeg + ffprobe
- Python packages:
- `faster-whisper`
- `kokoro-onnx`
- `soundfile`
Optional (Qwen backend):
- `ttscli`
- SoX (commonly needed on Windows)
See `references/requirements.md` and `references/config-reference.md` for full details.
---
## Quick Start
### 1) List available voices
```bash
node ./scripts/list_voices.js
```
### 2) Set user preference (optional)
```bash
node ./scripts/voice_preferences.js set --user <userId> --voice <voiceName> --backend qwen3
```
### 3) Send TTS to Discord
```bash
node ./scripts/speak.js \
--text "Hello from local-voice" \
--channel-kind discord \
--channel <discordChannelId> \
--backend kokoro
```
### 4) Send TTS to Telegram
```bash
node ./scripts/speak.js \
--text "Hola desde local-voice" \
--channel-kind telegram \
--channel <telegramChatId> \
--backend kokoro
```
### 5) Run smoke test
```bash
node ./scripts/voice_smoke_test.js
```
---
## Voice Cloning (Qwen3)
Voice cloning is supported via `clone_voice.js` and requires explicit consent.
```bash
node ./scripts/clone_voice.js \
--input ./sample.ogg \
--voice my-voice \
--language en \
--consent yes \
--channel <discordChannelId>
```
Please only clone voices when you have clear permission.
---
## Documentation Index
- `SKILL.md` β workflow and behavior
- `EVOLUTION_PLAN.md` β roadmap/phases
- `references/config-reference.md` β env vars, args, resolution order
- `references/operator-quick-commands.md` β practical command cheatsheet
- `references/voice-cloning.md` β consent + cloning flow
- `references/bug-reporting.md` β issue pipeline design
---
## License
MIT
voice
Comments
Sign in to leave a comment