Media
chichi-speech
A RESTful service for high-quality text-to-speech using Qwen3
---
name: chichi-speech
description: A RESTful service for high-quality text-to-speech using Qwen3 and specialized voice cloning. Optimized for reusing a specific voice prompt to avoid re-computation.
---
# Chichi Speech Service
This skill provides a FastAPI-based REST service for Qwen3 TTS, specifically configured for reusing a high-quality reference audio prompt for efficient and consistent voice cloning. This service is packaged as an installable CLI.
## Installation
Prerequisites: `python >= 3.10`.
```bash
pip install -e .
```
## Usage
### 1. Start the Service
The service runs on port **9090** by default.
```bash
# Start the server (runs in foreground, use & for background or a separate terminal)
# Optional: Uudate to your own reference audio and text for voice cloning
chichi-speech --port 9090 --host 127.0.0.1 --ref-audio "https://qianwen-res.oss-cn-beijing.aliyuncs.com/Qwen3-TTS-Repo/clone_2.wav" --ref-text "Okay. Yeah. I resent you. I love you. I respect you. But you know what? You blew it! And thanks to you."
```
### 2. Verify Service is Running
Check the health/docs:
```bash
curl http://localhost:9090/docs
```
### 3. Generate Speech
Use cURL:
```bash
curl -X POST "http://localhost:9090/synthesize" \
-H "Content-Type: application/json" \
-d '{
"text": "Nice to meet you",
"language": "English"
}' \
--output output/nice_to_meet.wav
```
## Functionality
- **Endpoint**: `POST /synthesize`
- **Default Port**: 9090
- **Voice Cloning**: Uses a pre-computed voice prompt from reference files to ensure the cloned voice is consistent and generation is fast.
## Requirements
- Python 3.10+
- `qwen-tts` (Qwen3 model library)
- Access to a reference audio file for voice cloning.
- By default, it uses public sample audio from Qwen3.
- **CRITICAL**: You can provide your own reference audio using the `--ref-audio` and `--ref-text` flags.
media
By
Comments
Sign in to leave a comment