Google Tts

By mealcard

OpenClaw plugin for Gemini and Google Cloud TTS with Telegram auto voice replies

# google-tts

This README is in English. For the Chinese version, see [README.zh-CN.md](./README.zh-CN.md).

`google-tts` is an OpenClaw plugin that adds:

- Gemini TTS
- Google Cloud Text-to-Speech
- Telegram auto voice-bubble replies

It supports manual `/gtts` synthesis and a chat-level `/autovoice` mode. When `/autovoice` is enabled, the bot sends its normal text reply first, then sends a Telegram voice bubble for the same reply.

The plugin is designed to be scoped to specific bot accounts. You decide which bots can use auto voice in `voice-config.json`.

This public version uses `/autovoice` instead of `/voice`. Some OpenClaw installs already reserve `/voice` for the built-in `talk-voice` plugin, so `/autovoice` is the conflict-free command that works across installs.

License: MIT. See [LICENSE](./LICENSE).

## Requirements

- OpenClaw `2026.3.x` or later
- `ffmpeg` on `PATH`
- A Telegram bot configured in OpenClaw if you want auto voice bubbles
- One of:
  - `GOOGLE_API_KEY` for Gemini TTS
  - OAuth credentials for Google Cloud TTS

Why `ffmpeg` is required:

- Gemini TTS returns raw PCM audio, not a ready-to-send voice file
- The plugin uses `ffmpeg` locally to convert that PCM into usable formats
- Manual `/gtts` output is converted to `mp3`
- Telegram auto voice replies are converted to `ogg/opus`, because that is the most reliable format for Telegram voice bubbles

What `ogg/opus` is doing here:

- `ogg` is the container file format
- `opus` is the speech codec inside it
- Telegram voice notes work best with `ogg/opus`, so the plugin uses it to send a real voice bubble instead of a generic audio file
- This conversion happens only on local audio files generated by the plugin; it is not used to run arbitrary shell tasks
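
The two conversions described above can be sketched as ffmpeg argument lists. This is a minimal illustration, not the plugin's own code. It assumes the raw PCM is 16-bit signed little-endian, mono, 24 kHz (check the Gemini TTS documentation for the model you use). The function name `ffmpeg_command`, the file names, and the 32k Opus bitrate are hypothetical.

```python
import shlex

# Assumed PCM layout: 16-bit signed little-endian, mono, 24 kHz.
PCM_INPUT_ARGS = ["-f", "s16le", "-ar", "24000", "-ac", "1"]


def ffmpeg_command(pcm_path: str, out_path: str, target: str) -> list[str]:
    """Build (but do not run) an ffmpeg command that converts raw PCM.

    target is "voice" for a Telegram voice bubble (ogg/opus)
    or "file" for manual /gtts output (mp3).
    """
    if target == "voice":
        # Opus audio; the .ogg extension selects the Ogg container.
        codec_args = ["-c:a", "libopus", "-b:a", "32k"]  # bitrate is an arbitrary choice
    elif target == "file":
        codec_args = ["-c:a", "libmp3lame"]
    else:
        raise ValueError(f"unknown target: {target!r}")
    # Raw PCM has no header, so the input format flags must precede -i.
    # -y overwrites an existing output file.
    return ["ffmpeg", "-y", *PCM_INPUT_ARGS, "-i", pcm_path, *codec_args, out_path]


if __name__ == "__main__":
    print(shlex.join(ffmpeg_command("reply.pcm", "reply.ogg", "voice")))
    print(shlex.join(ffmpeg_command("reply.pcm", "reply.mp3", "file")))
```

Passing the list to `subprocess.run(cmd, check=True)` runs ffmpeg without a shell, which keeps the conversion limited to the named input and output files.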

## Install

From the plugin directory:

```bash
openclaw plugins install -l .
openclaw plugins doctor
openclaw gateway --force
```

If you prefer a copied install instead of a linked one:

```bash
openclaw plugins install .
openclaw plugins doctor
openclaw gateway --force
```

## Configure

Copy the example config and edit it for your bot:

```bash
cp voice-config.example.json voice-config.json
```

Example:

```json
{
  "autoVoiceAccounts": ["your-bot-account-id"],
  "autoVoiceModel": "gemini-2.5-flash-preview-tts",
  "accounts": {
    "your-bot-account-id": {
      "defaultVoice": "Leda",
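      "defaultStyle": "Speak in a natural, warm, professional tone. Be non-judgmental, sound gently responsive, keep a natural rhythm, and make it feel like a real conversation rather than reading from a script."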
    }
  }
}
```

What the config does:

- `autoVoiceAccounts` limits `/autovoice` to specific bot accounts
- `autoVoiceModel` can be `gemini-2.5-flash-preview-tts` or `gemini-2.5-pro-preview-tts`
- `accounts.<accountId>.defaultVoice` and `defaultStyle` set per-bot defaults
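
To make the roles of these keys concrete, here is a minimal sketch of how a reader of `voice-config.json` might apply them. It is illustrative only and does not reproduce the plugin's implementation. The helper names `auto_voice_allowed` and `account_defaults` and the sample style string are hypothetical.

```python
import json

EXAMPLE_CONFIG = """
{
  "autoVoiceAccounts": ["your-bot-account-id"],
  "autoVoiceModel": "gemini-2.5-flash-preview-tts",
  "accounts": {
    "your-bot-account-id": {"defaultVoice": "Leda", "defaultStyle": "Warm and natural."}
  }
}
"""


def auto_voice_allowed(config: dict, account_id: str) -> bool:
    """Return True only for accounts listed in autoVoiceAccounts."""
    return account_id in config.get("autoVoiceAccounts", [])


def account_defaults(config: dict, account_id: str) -> dict:
    """Return the per-bot default voice and style, or None where unset."""
    account = config.get("accounts", {}).get(account_id, {})
    return {
        "voice": account.get("defaultVoice"),
        "style": account.get("defaultStyle"),
    }


if __name__ == "__main__":
    config = json.loads(EXAMPLE_CONFIG)
    print(auto_voice_allowed(config, "your-bot-account-id"))
    print(account_defaults(config, "your-bot-account-id"))
```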

## Auth

### Gemini TTS

Set `GOOGLE_API_KEY` in the environment used by OpenClaw.

The plugin can also read `gemini_api_key` from a local `google-tts-tokens.json`, but that file should stay local and should not be committed.
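
A rough sketch of that lookup follows. The precedence (environment variable first, then the local tokens file) is an assumption, not something this README states, and the function names are hypothetical. The header helper reflects the note in this README that Gemini requests send the key in `x-goog-api-key` rather than in the URL.

```python
import json
import os
from pathlib import Path
from typing import Optional


def resolve_gemini_key(tokens_path: str = "google-tts-tokens.json",
                       env: Optional[dict] = None) -> Optional[str]:
    """Return the Gemini API key, or None if neither source provides one.

    Assumption: GOOGLE_API_KEY wins over the tokens file.
    """
    env = os.environ if env is None else env
    key = env.get("GOOGLE_API_KEY")
    if key:
        return key
    path = Path(tokens_path)
    if path.is_file():
        tokens = json.loads(path.read_text(encoding="utf-8"))
        return tokens.get("gemini_api_key") or None
    return None


def gemini_headers(api_key: str) -> dict:
    """Carry the key in a header so it never appears in the request URL."""
    return {"x-goog-api-key": api_key, "Content-Type": "application/json"}
```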

### Google Cloud TTS

Run the OAuth setup once:

```bash
node ./src/oauth-setup.mjs /path/to/client_secret_*.json
```

You can also set `GOOGLE_OAUTH_CLIENT_SECRET_PATH` instead of passing the path on the command line.

This creates `google-tts-tokens.json` in the plugin directory. Keep that file private.

## Commands

- `/autovoice` toggles auto voice for the current Telegram chat
- `/autovoice on|off|status`
- `/voice_style <text>` overrides style for the current chat
- `/voice_style 默认` resets style to the account default (`默认` is Chinese for "default")
- `/voice_voice <voice>` overrides voice for the current chat
- `/voice_voice 默认` resets voice to the account default
- `/gtts status`
- `/gtts defaults`
- `/gtts voices [langCode]`
- `/gtts say <text>`
- `/gtts say --pro <text>`
- `/gtts say --flash <text>`
- `/gtts say --style 'a bit more natural, like a conversation' <text>`
- `/gtts say -e cloud <text>`
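
The `/gtts say` variants above can be read as options followed by free text. The parser below is a hypothetical sketch of that reading, not the plugin's parser. It assumes `--pro` and `--flash` select between the two documented Gemini models, that `-e` names the engine, and that the engine defaults to `gemini`.

```python
import shlex

# Assumption: --pro / --flash pick between the two documented Gemini models.
MODELS = {
    "--pro": "gemini-2.5-pro-preview-tts",
    "--flash": "gemini-2.5-flash-preview-tts",
}


def parse_gtts_say(command: str) -> dict:
    """Parse '/gtts say [options] <text>' into a plain dict."""
    # shlex honors quotes, so --style 'two words' stays one token.
    # Unbalanced quotes in the text make shlex.split raise ValueError.
    tokens = shlex.split(command)
    if tokens[:2] != ["/gtts", "say"]:
        raise ValueError("not a /gtts say command")
    parsed = {"engine": "gemini", "model": None, "style": None}  # assumed defaults
    rest = tokens[2:]
    i = 0
    while i < len(rest):
        token = rest[i]
        if token in MODELS:
            parsed["model"] = MODELS[token]
            i += 1
        elif token in ("--style", "-e"):
            if i + 1 >= len(rest):
                raise ValueError(f"{token} needs a value")
            key = "style" if token == "--style" else "engine"
            parsed[key] = rest[i + 1]
            i += 2
        else:
            break  # the first non-option token starts the text
    parsed["text"] = " ".join(rest[i:])
    return parsed
```

For example, `parse_gtts_say("/gtts say -e cloud Hello")` yields an `engine` of `cloud` and a `text` of `Hello`.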

## Notes

- Auto voice only runs on Telegram and only for accounts listed in `autoVoiceAccounts`
- If a reply is JSON, the plugin speaks only the `response` field
- Replies to slash commands such as `/new` and `/autovoice` are not spoken
- Telegram voice bubbles are sent directly by the plugin after text delivery
- Gemini audio is converted from PCM to `ogg/opus` for Telegram voice bubbles and to `mp3` for manual file output
- Gemini requests use the `x-goog-api-key` header instead of putting the API key in the URL
- Do not commit `google-tts-tokens.json`, `voice-state.json`, `voice-config.json`, or `out/`
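
The rules about which replies are spoken can be sketched as a small filter. This is an illustration under stated assumptions, not the plugin's code: the function name `text_to_speak` is hypothetical, and the handling of JSON objects that lack a string `response` field (nothing is spoken) is a guess.

```python
import json
from typing import Optional


def text_to_speak(user_message: str, reply: str) -> Optional[str]:
    """Return the text to synthesize, or None when nothing should be spoken."""
    # Replies to slash commands (for example /new or /autovoice) stay text-only.
    if user_message.lstrip().startswith("/"):
        return None
    try:
        payload = json.loads(reply)
    except json.JSONDecodeError:
        return reply  # plain text: speak it as-is
    if isinstance(payload, dict):
        response = payload.get("response")
        # Assumption: a JSON object without a string "response" is not spoken.
        return response if isinstance(response, str) else None
    return reply  # JSON scalars and arrays are treated as plain text here


if __name__ == "__main__":
    print(text_to_speak("hi", '{"response": "Sure.", "meta": 1}'))
```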