# google-tts
English. For the Chinese version, see [README.zh-CN.md](./README.zh-CN.md).
`google-tts` is an OpenClaw plugin that adds:
- Gemini TTS
- Google Cloud Text-to-Speech
- Telegram auto voice-bubble replies
It supports manual `/gtts` synthesis and a chat-level `/autovoice` mode. When `/autovoice` is enabled, the bot sends its normal text reply first, then sends a Telegram voice bubble for the same reply.
The plugin is designed to be scoped to specific bot accounts. You decide which bots can use auto voice in `voice-config.json`.
This public version uses `/autovoice` instead of `/voice`. Some OpenClaw installs already reserve `/voice` for the built-in `talk-voice` plugin, so `/autovoice` is the conflict-free command that works across installs.
License: MIT. See [LICENSE](./LICENSE).
## Requirements
- OpenClaw `2026.3.x` or later
- `ffmpeg` on `PATH`
- A Telegram bot configured in OpenClaw if you want auto voice bubbles
- One of:
  - `GOOGLE_API_KEY` for Gemini TTS
  - OAuth credentials for Google Cloud TTS
Why `ffmpeg` is required:
- Gemini TTS returns raw PCM audio, not a ready-to-send voice file
- The plugin uses `ffmpeg` locally to convert that PCM into usable formats
- Manual `/gtts` output is converted to `mp3`
- Telegram auto voice replies are converted to `ogg/opus`, because that is the most reliable format for Telegram voice bubbles
What `ogg/opus` is doing here:
- `ogg` is the container file format
- `opus` is the speech codec inside it
- Telegram voice notes work best with `ogg/opus`, so the plugin uses it to send a real voice bubble instead of a generic audio file
- This conversion happens only on local audio files generated by the plugin; it is not used to run arbitrary shell tasks
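The two conversions described above can be sketched as argument lists for `ffmpeg`. This is only an illustration, not the plugin's actual code: `buildFfmpegArgs` is a hypothetical helper, the bitrate and quality values are arbitrary, and the input format (16-bit little-endian PCM, mono, 24 kHz) is an assumption about what Gemini TTS returns.

```javascript
// Assumption: Gemini TTS returns 16-bit signed little-endian PCM,
// mono, at 24 kHz. Adjust these flags if the API reports another format.
const PCM_INPUT = ["-f", "s16le", "-ar", "24000", "-ac", "1"];

// Hypothetical helper: build the ffmpeg arguments for one conversion.
function buildFfmpegArgs(inputPath, outputPath, target) {
  const encoders = {
    // Telegram voice bubbles: Opus audio inside an Ogg container.
    "ogg/opus": ["-c:a", "libopus", "-b:a", "32k"],
    // Manual /gtts output: MP3.
    mp3: ["-c:a", "libmp3lame", "-q:a", "4"],
  };
  const encoder = encoders[target];
  if (!encoder) {
    throw new Error(`Unsupported target format: ${target}`);
  }
  // -y overwrites an existing output file; the raw-input flags must come before -i.
  return ["-y", ...PCM_INPUT, "-i", inputPath, ...encoder, outputPath];
}

// Print the equivalent command line for a Telegram voice bubble.
console.log(["ffmpeg", ...buildFfmpegArgs("reply.pcm", "reply.ogg", "ogg/opus")].join(" "));
```

Passing the array to `child_process.execFile("ffmpeg", args)` rather than building a shell string keeps the call limited to `ffmpeg` and the given file paths.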
## Install
From the plugin directory:
```bash
openclaw plugins install -l .
openclaw plugins doctor
openclaw gateway --force
```
If you prefer a copied install instead of a linked one:
```bash
openclaw plugins install .
openclaw plugins doctor
openclaw gateway --force
```
## Configure
Copy the example config and edit it for your bot:
```bash
cp voice-config.example.json voice-config.json
```
Example:
```json
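{
  "autoVoiceAccounts": ["your-bot-account-id"],
  "autoVoiceModel": "gemini-2.5-flash-preview-tts",
  "accounts": {
    "your-bot-account-id": {
      "defaultVoice": "Leda",
      "defaultStyle": "Speak in a natural, gentle, professional tone. Be non-judgmental, with a touch of natural responsiveness and natural pacing, more like a real conversation than reading from a script."
    }
  }
}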
```
What the config does:
- `autoVoiceAccounts` limits `/autovoice` to specific bot accounts
- `autoVoiceModel` can be `gemini-2.5-flash-preview-tts` or `gemini-2.5-pro-preview-tts`
- `accounts.<accountId>.defaultVoice` and `defaultStyle` set per-bot defaults
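To make the roles of these keys concrete, here is a minimal sketch of how a config with this shape could be interpreted. Both helpers and the `chatOverrides` object are hypothetical; the plugin's real resolution logic may differ.

```javascript
// Example config with the same shape as voice-config.json.
const config = {
  autoVoiceAccounts: ["your-bot-account-id"],
  autoVoiceModel: "gemini-2.5-flash-preview-tts",
  accounts: {
    "your-bot-account-id": { defaultVoice: "Leda", defaultStyle: "Warm and natural." },
  },
};

// Hypothetical helper: is /autovoice allowed for this bot account?
function isAutoVoiceAllowed(cfg, accountId) {
  return Array.isArray(cfg.autoVoiceAccounts) && cfg.autoVoiceAccounts.includes(accountId);
}

// Hypothetical helper: apply per-chat overrides on top of the account defaults.
// `chatOverrides` stands in for whatever /voice_style and /voice_voice store.
function resolveVoiceSettings(cfg, accountId, chatOverrides = {}) {
  const account = (cfg.accounts && cfg.accounts[accountId]) || {};
  return {
    model: cfg.autoVoiceModel,
    voice: chatOverrides.voice ?? account.defaultVoice ?? null,
    style: chatOverrides.style ?? account.defaultStyle ?? null,
  };
}

console.log(isAutoVoiceAllowed(config, "your-bot-account-id"));
console.log(resolveVoiceSettings(config, "your-bot-account-id", { voice: "Kore" }));
```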
## Auth
### Gemini TTS
Set `GOOGLE_API_KEY` in the environment used by OpenClaw.
The plugin can also read `gemini_api_key` from a local `google-tts-tokens.json`, but that file should stay local and should not be committed.
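As an illustration of how the key might be used, the sketch below builds a Gemini `generateContent` request with the key in the `x-goog-api-key` header rather than in the URL. The request body follows the public Gemini API shape for speech output. The helper names, the precedence of the environment variable over the tokens file, and the way the style is prepended to the text are all assumptions, not the plugin's confirmed behavior.

```javascript
// Assumption: the environment variable takes precedence over the tokens file.
function resolveGeminiKey(env, tokens = {}) {
  return env.GOOGLE_API_KEY || tokens.gemini_api_key || null;
}

// Hypothetical helper: build a Gemini TTS request without touching the network.
function buildGeminiTtsRequest({ apiKey, model, voice, text, style }) {
  if (!apiKey) throw new Error("Missing Gemini API key");
  // Assumption: the style is sent as a natural-language prefix to the text.
  const prompt = style ? `${style}\n\n${text}` : text;
  return {
    url: `https://generativelanguage.googleapis.com/v1beta/models/${model}:generateContent`,
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      // The key travels in a header, so it never appears in the URL.
      "x-goog-api-key": apiKey,
    },
    body: JSON.stringify({
      contents: [{ parts: [{ text: prompt }] }],
      generationConfig: {
        responseModalities: ["AUDIO"],
        speechConfig: { voiceConfig: { prebuiltVoiceConfig: { voiceName: voice } } },
      },
    }),
  };
}
```

A caller could then run `const { url, ...init } = buildGeminiTtsRequest({...})` followed by `await fetch(url, init)`. In the public API, the audio comes back base64-encoded inside the first candidate's `inlineData` part.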
### Google Cloud TTS
Run the OAuth setup once:
```bash
node ./src/oauth-setup.mjs /path/to/client_secret_*.json
```
You can also set `GOOGLE_OAUTH_CLIENT_SECRET_PATH` instead of passing the path on the command line.
This creates `google-tts-tokens.json` in the plugin directory. Keep that file private.
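Once an OAuth access token is available, a Cloud Text-to-Speech call looks roughly like the sketch below. It uses the public `text:synthesize` REST endpoint. `buildCloudTtsRequest` is a hypothetical helper, and the layout of `google-tts-tokens.json` is not documented here, so the token is passed in as a plain string.

```javascript
// Hypothetical helper: build a Cloud Text-to-Speech request for fetch().
function buildCloudTtsRequest({ accessToken, text, languageCode = "en-US", voiceName }) {
  if (!accessToken) throw new Error("Missing OAuth access token");
  const voice = { languageCode };
  // The voice name is optional; without it the API chooses a voice for the language.
  if (voiceName) voice.name = voiceName;
  return {
    url: "https://texttospeech.googleapis.com/v1/text:synthesize",
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${accessToken}`,
    },
    body: JSON.stringify({
      input: { text },
      voice,
      audioConfig: { audioEncoding: "MP3" },
    }),
  };
}
```

The response is JSON with a base64-encoded `audioContent` field, which can be decoded with `Buffer.from(audioContent, "base64")`.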
## Commands
- `/autovoice` toggles auto voice for the current Telegram chat
- `/autovoice on|off|status` turns auto voice on or off explicitly, or reports its current state
- `/voice_style <text>` overrides style for the current chat
- `/voice_style 默认` resets style to the account default (`默认` means "default")
- `/voice_voice <voice>` overrides voice for the current chat
- `/voice_voice 默认` resets voice to the account default
- `/gtts status`
- `/gtts defaults`
- `/gtts voices [langCode]`
- `/gtts say <text>`
- `/gtts say --pro <text>`
- `/gtts say --flash <text>`
- `/gtts say --style 'a bit more natural, like a conversation' <text>`
- `/gtts say -e cloud <text>`
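The `/gtts say` variants above share one pattern: optional flags first, then the text to speak. A minimal parsing sketch, assuming flags may appear in any order before the text, could look like this. `parseSayArgs` and its return shape are hypothetical and are not taken from the plugin source.

```javascript
// Hypothetical parser for the argument string that follows "/gtts say".
function parseSayArgs(input) {
  const opts = { model: null, style: null, engine: null, text: "" };
  let rest = input.trim();
  // Each pattern consumes one leading option plus the whitespace after it.
  const patterns = [
    [/^--pro(?:\s+|$)/, () => { opts.model = "pro"; }],
    [/^--flash(?:\s+|$)/, () => { opts.model = "flash"; }],
    [/^--style\s+(?:'([^']*)'|"([^"]*)"|(\S+))(?:\s+|$)/, (m) => { opts.style = m[1] ?? m[2] ?? m[3]; }],
    [/^-e\s+(\S+)(?:\s+|$)/, (m) => { opts.engine = m[1]; }],
  ];
  let matched = true;
  while (matched) {
    matched = false;
    for (const [re, apply] of patterns) {
      const m = rest.match(re);
      if (m) {
        apply(m);
        rest = rest.slice(m[0].length);
        matched = true;
        break;
      }
    }
  }
  // Whatever remains after the leading options is the text to synthesize.
  opts.text = rest;
  return opts;
}

console.log(parseSayArgs("--pro --style 'like a chat' hello world"));
console.log(parseSayArgs("-e cloud good morning"));
```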
## Notes
- Auto voice only runs on Telegram and only for accounts listed in `autoVoiceAccounts`
- If a reply is JSON, the plugin only speaks the `response` field
- Replies to slash commands such as `/new` and `/autovoice` are not spoken
- Telegram voice bubbles are sent directly by the plugin after text delivery
- Gemini audio is converted from PCM to `ogg/opus` for Telegram voice bubbles and to `mp3` for manual file output
- Gemini requests use the `x-goog-api-key` header instead of putting the API key in the URL
- Do not commit `google-tts-tokens.json`, `voice-state.json`, `voice-config.json`, or `out/`
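
The first three notes amount to a small decision about what, if anything, gets spoken. The sketch below shows one way to express it. `selectSpokenText` and the field names `channel`, `accountId`, `userMessage`, and `replyText` are hypothetical, chosen only for illustration.

```javascript
// Hypothetical helper: decide what, if anything, should be spoken.
// Returns the text to synthesize, or null to skip the voice bubble.
function selectSpokenText({ channel, accountId, userMessage, replyText }, cfg) {
  // Auto voice only runs on Telegram, and only for listed accounts.
  if (channel !== "telegram") return null;
  if (!(cfg.autoVoiceAccounts || []).includes(accountId)) return null;
  // Replies to slash commands such as /new or /autovoice stay silent.
  if (typeof userMessage === "string" && userMessage.trim().startsWith("/")) return null;

  const trimmed = (replyText || "").trim();
  if (trimmed.startsWith("{")) {
    try {
      const parsed = JSON.parse(trimmed);
      // For JSON replies, only the `response` field is spoken.
      return typeof parsed.response === "string" && parsed.response.trim()
        ? parsed.response
        : null;
    } catch {
      // Not valid JSON after all: fall through and speak the raw text.
    }
  }
  return trimmed || null;
}
```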