kameo

Name: kameo
Rating: 3.5 (1 reviews)
Author: veya2ztn

By veya2ztn 👁 17 views ▲ 0 votes

Generate expressive talking-head videos from static images using Kameo AI.

GitHub

---
name: kameo
description: Generate expressive talking-head videos from static images using Kameo AI. Converts static avatars/portraits into dynamic 5-second videos with realistic facial expressions, lip-sync, and motion. Use when you need to bring static images to life, create AI character videos, demonstrate visual communication, or generate talking avatars from photos.
---

# Kameo AI - Talking Head Video Generation

Transform static images into expressive talking-head videos with realistic motion and lip-sync.

## Quick Start

```bash
scripts/generate_video.sh <image_path> <prompt> [output_file]
```

**Example:**
```bash
scripts/generate_video.sh avatar.jpg "Hello, I am an AI assistant" output.mp4
```

## What It Does

- Takes a static image (portrait/avatar)
- Adds realistic facial motion, expressions, and lip-sync based on your prompt
- Generates 5-second video in 9:16, 16:9, or 1:1 aspect ratio
- Returns CDN URL instantly (processing ~10-30 seconds)

## Authentication

Set your Kameo API key:
```bash
export KAMEO_API_KEY="kam_I3rdx43IymFNbfBw1c0ZbSc7o3aUfQgz8cljZA6T7fs"
```

Or store in `~/.config/kameo/credentials.json`:
```json
{
  "api_key": "kam_I3rdx43IymFNbfBw1c0ZbSc7o3aUfQgz8cljZA6T7fs"
}
```

**Getting an API Key:**

1. Register at kameo.chat (requires email verification)
2. Login to get JWT token
3. Create API key via `/api/public/keys` endpoint
4. Or use the registration helper: `scripts/register.sh`

## Prompt Engineering

### Basic Prompts (Simple)

Just the dialogue:
```
"Hello, I'm here to help you today"
"こんにちは、私はガッキーです。愛してます。"
```

Works but results are generic.

### Enhanced Prompts (Recommended)

**Format:**
```
[Detailed scene/environment], [person's complete appearance and expression], speaking in [tone], "[DIALOGUE]". [Camera and lighting details].
```

**Example:**
```
In a bright outdoor winter setting with soft, overcast daylight, a young woman with long dark hair wearing a white knitted winter hat with ear flaps and a colorful patterned sweater stands centered in frame. She looks directly into the camera with a warm, genuine smile, her eyes crinkling with joy, speaking in a cheerful, affectionate tone, "こんにちは、私はガッキーです。愛してます。" The scene is captured in a medium close-up shot, framed at eye level. The lighting is natural and diffused from above, creating soft, even illumination.
```

**Why Enhanced Prompts Matter:**
- Better facial expressions matching the scene context
- More natural motion and gestures
- Improved lip-sync quality
- Contextual emotional delivery

### Prompt Enhancement Workflow

For best results, use vision AI to analyze the image first:

1. Feed the image to a vision model (Gemini, GPT-4V, Claude)
2. Ask it to describe the scene in cinematic detail
3. Insert your dialogue into the description
4. Use the enhanced prompt for Kameo

**See:** `scripts/enhance_prompt.sh` for automated enhancement.

## API Details

**Base URL:** `https://api.kameo.chat/api/public`

### Generate Video

```bash
curl -X POST https://api.kameo.chat/api/public/generate \
  -H "X-API-Key: kam_I3rdx43IymFNbfBw1c0ZbSc7o3aUfQgz8cljZA6T7fs" \
  -H "Content-Type: application/json" \
  -d '{
    "image_base64": "<base64_encoded_image>",
    "prompt": "Your detailed prompt here",
    "seconds": 5,
    "aspect_ratio": "9:16"
  }'
```

**Parameters:**
- `image_base64` (required): Base64-encoded JPEG/PNG
- `prompt` (required): Dialogue and/or scene description
- `seconds` (optional): 5 (default) or 10
- `aspect_ratio` (optional): "9:16" (default), "16:9", or "1:1"

**Response:**
```json
{
  "job_id": "uuid",
  "status": "completed",
  "video_url": "https://cdn.kameo.chat/videos/{uuid}.mp4",
  "duration_seconds": 5,
  "processing_time_ms": 15000
}
```

### Check Credits

```bash
curl -H "X-API-Key: kam_..." \
  https://api.kameo.chat/api/public/credits
```

**Response:**
```json
{
  "permanent_credits": 294,
  "subscription_credits": 0,
  "total_available": 294
}
```

### Pricing

```bash
curl https://api.kameo.chat/api/public/pricing
```

**Cost:** 3 credits per video

## Performance

- **Processing time:** 8-35 seconds (depends on aspect ratio and queue)
- **9:16 (portrait):** ~30-35s
- **16:9 (landscape):** ~15-20s  
- **1:1 (square):** ~10-15s

## Best Practices

1. **Optimize image size** - Resize large images before encoding (saves bandwidth, faster upload)
   ```bash
   ffmpeg -i large.jpg -vf scale=720:-1 optimized.jpg
   ```

2. **Use descriptive prompts** - Enhanced prompts = better results

3. **Choose aspect ratio wisely**
   - 9:16: Mobile/social media (TikTok, Instagram Stories)
   - 16:9: Desktop/YouTube
   - 1:1: Profile pictures, square posts

4. **Monitor credits** - Check balance with `scripts/check_credits.sh`

## Limitations

- **CDN access:** Video URLs may have time-limited access or require authentication
- **Download:** Videos may return 403 when downloaded via curl (use browser or authenticated session)
- **Rate limits:** 10 generations per minute

## Troubleshooting

**"401 Unauthorized"**
- Check your API key is set correctly
- Verify key hasn't been revoked

**"402 Insufficient credits"**
- Check credit balance: `scripts/check_credits.sh`
- Need to add credits at kameo.chat

**"Timeout errors"**
- 9:16 videos take longer (~30s)
- Increase timeout in scripts
- Retry if server is busy

**"403 when downloading video"**
- CDN URLs may be time-limited
- Try accessing in browser immediately after generation
- Or save the base64 response if available

## Use Cases

- **AI character videos** - Bring bot avatars to life
- **Social media content** - Dynamic profile videos
- **Demos and presentations** - Talking product demos
- **Educational content** - Video tutorials with AI presenters
- **Multilingual content** - Same avatar speaking different languages

media

Comments

Loading comments...

More by veya2ztn

kameo-free

Generate expressive talking-head videos from static images using Kameo

Similar Skills in Media

x-voice-match

Analyze a Twitter/X account's posting style and generate

whisper-mlx-local

Free local speech-to-text for Telegram and WhatsApp

whatsapp-voice-chat-integration-open-source

Real-time WhatsApp

webchat-audio-notifications

Add browser audio notifications

voice-ui

Self-evolving voice assistant UI.

voice-transcribe

Transcribe audio files using OpenAI's