Alfred Talk

Name: Alfred Talk
Rating: 3.5 (1 review)
Author: ssdavidai



# alfred-talk

OpenClaw plugin for ElevenLabs voice agent integration — make and receive AI phone calls, automatically process transcripts, and route actions to your agent.

## What It Does

alfred-talk gives your OpenClaw/Alfred instance a complete voice pipeline:

- **Outbound calls** — Initiate phone calls via ElevenLabs Conversational AI + Twilio
- **Inbound calls** — Receive calls on your ElevenLabs phone number
- **Transcript processing** — Automatically capture, summarize, and store call transcripts
- **Notifications** — Post call summaries to Slack, Telegram, or any configured channel
- **Vault integration** — Save transcripts as structured markdown to your agent's inbox

## Architecture

```
┌──────────────────────────────────────────────────────────────────┐
│                        OpenClaw Gateway                         │
│                                                                  │
│  ┌──────────────┐  ┌──────────────┐  ┌────────────────────────┐ │
│  │  alfred-talk  │  │    Skills     │  │   Transcript Watcher   │ │
│  │   plugin.ts   │  │  SKILL.md x2 │  │      (hook)            │ │
│  │              │  │              │  │                        │ │
│  │ • agent tool  │  │ • alfred-talk│  │ Watches transcript dir │ │
│  │ • CLI cmd     │  │ • call skill │  │ for new .json files    │ │
│  │ • webhook rx  │  │              │  │                        │ │
│  │ • service mgr │  └──────────────┘  └────────────────────────┘ │
│  └──────┬───────┘                                                │
│         │ manages                                                │
└─────────┼────────────────────────────────────────────────────────┘
          │
          ▼
┌──────────────────┐     ┌──────────────────┐
│  Voice Server    │     │   ElevenLabs     │
│  (FastAPI/Python)│◄───►│   Conversational │
│                  │     │   AI Platform    │
│ • /v1/chat/comp  │     │                  │
│ • /elevenlabs-   │     │ • Voice agent    │
│   webhook        │     │ • Phone numbers  │
│ • /health        │     │ • Post-call      │
│                  │     │   webhooks       │
└──────────────────┘     └────────┬─────────┘
                                  │
                                  ▼
                         ┌──────────────────┐
                         │      Twilio      │
                         │ (Phone Numbers)  │
                         └──────────────────┘
```

### Call Flow

```
1. User says "Call Mom"
2. Agent uses alfred_talk tool → ElevenLabs API
3. ElevenLabs places call via Twilio
4. During call: voice server proxies LLM responses
5. Call ends → ElevenLabs sends webhook
6. Plugin saves transcript, summarizes, notifies user
7. Transcript saved to vault inbox
```

## Installation

### From npm (when published)

```bash
openclaw plugins install @ssdavidai/alfred-talk
```

### From local directory

```bash
# Clone the repo
git clone https://github.com/ssdavidai/alfred-talk.git
cd alfred-talk

# Install Python dependencies for the voice server
pip install -r voice-server/requirements.txt

# Install as OpenClaw plugin (link mode for development)
openclaw plugins install -l .
```

Restart the Gateway after installation.

## Configuration

Add to your `~/.openclaw/openclaw.json`:

```json5
{
  plugins: {
    entries: {
      "alfred-talk": {
        enabled: true,
        config: {
          // ElevenLabs (required)
          elevenlabs: {
            apiKey: "your-elevenlabs-api-key",
            agentId: "agent_xxxxxxxxxxxx",
            phoneNumberId: "phnum_xxxxxxxxxxxx",  // for outbound calls
            webhookSecret: "your-webhook-secret",  // optional, for signature verification
          },

          // Twilio (if using Twilio for phone numbers)
          twilio: {
            accountSid: "ACxxxxxxxx",
            authToken: "your-auth-token",
            fromNumber: "+15551234567",
          },

          // Gemini API key (for voice server LLM backend)
          geminiApiKey: "your-gemini-api-key",

          // Voice server settings
          voiceServer: {
            enabled: true,       // set false to disable the managed server
            port: 8770,          // default: 8770
            pythonBin: "python3", // path to Python binary
          },

          // Phone contacts (phone → name mapping)
          contacts: {
            "+15551234567": "Alice",
            "+15559876543": "Bob",
          },

          // Notification settings
          notifications: {
            channel: "slack",          // slack, telegram, discord, etc.
            target: "C0ACVH414JC",     // channel/user ID
          },

          // Transcript processing
          transcripts: {
            autoProcess: true,                          // auto-process on webhook receipt
            summaryModel: "anthropic/claude-haiku-4-5", // model for summaries
            inboxDir: "~/vault/inbox",                  // vault inbox for transcript files
          },
        },
      },
    },
  },
}
```

### Environment Variables

The following settings can also be supplied as environment variables, either instead of or in addition to the config file:

| Variable | Description | Required |
|---|---|---|
| `ELEVENLABS_API_KEY` | ElevenLabs API key | Yes |
| `ELEVENLABS_AGENT_ID` | ElevenLabs Conversational AI agent ID | Yes |
| `ELEVENLABS_PHONE_NUMBER_ID` | ElevenLabs phone number ID (outbound) | For outbound calls |
| `ELEVENLABS_WEBHOOK_SECRET` | Webhook signature verification secret | Recommended |
| `TWILIO_ACCOUNT_SID` | Twilio account SID | If using Twilio |
| `TWILIO_AUTH_TOKEN` | Twilio auth token | If using Twilio |
| `GEMINI_API_KEY` | Google Gemini API key (voice server LLM) | For voice server |
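
As a rough illustration of how a setting could be resolved when it may come from either place, the sketch below checks the config first and falls back to the environment. The precedence order, the `ENV_NAMES` table, and the `resolve_setting` helper are assumptions made for this example; the plugin's actual lookup order is not stated in this README.

```python
import os
from collections.abc import Mapping
from typing import Optional

# Hypothetical mapping from config paths to the variable names in the table above.
ENV_NAMES = {
    ("elevenlabs", "apiKey"): "ELEVENLABS_API_KEY",
    ("elevenlabs", "agentId"): "ELEVENLABS_AGENT_ID",
    ("elevenlabs", "phoneNumberId"): "ELEVENLABS_PHONE_NUMBER_ID",
    ("elevenlabs", "webhookSecret"): "ELEVENLABS_WEBHOOK_SECRET",
    ("twilio", "accountSid"): "TWILIO_ACCOUNT_SID",
    ("twilio", "authToken"): "TWILIO_AUTH_TOKEN",
    ("geminiApiKey",): "GEMINI_API_KEY",
}


def resolve_setting(config: Mapping, path: tuple,
                    env: Optional[Mapping] = None) -> Optional[str]:
    """Return the value at `path` in the config, else the matching env variable."""
    env = os.environ if env is None else env
    node = config
    for key in path:
        # Walk the nested config; stop as soon as a level is missing.
        if not isinstance(node, Mapping) or key not in node:
            node = None
            break
        node = node[key]
    if node:
        return str(node)
    # Assumed fallback: config wins, environment fills the gaps.
    return env.get(ENV_NAMES[path])
```

With this ordering, a key present in both places resolves to the config value, and a key missing from both resolves to `None`, which a caller could treat as "required setting not provided".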

### ElevenLabs Setup

1. Create a **Conversational AI agent** at [elevenlabs.io](https://elevenlabs.io)
2. Configure the agent's voice, personality, and tools on the ElevenLabs dashboard
3. Set the agent's **LLM webhook URL** to your voice server: `https://your-domain.com/v1/chat/completions`
4. Set the **post-call webhook URL** to: `https://your-domain.com/elevenlabs-webhook`
5. Copy the Agent ID and API key into your config

### Webhook Exposure

ElevenLabs must be able to reach the voice server over the public internet to deliver webhooks. Some options:

```bash
# ngrok (development)
ngrok http 8770

# Tailscale Funnel (production)
tailscale funnel 8770

# Reverse proxy (production)
# Point your domain at localhost:8770
```

## Usage

### Agent Tool

The `alfred_talk` tool is available to your agent when the plugin is enabled:

```
# Make an outbound call
alfred_talk({ action: "call", to: "+15551234567", firstMessage: "Good evening." })

# List recent transcripts
alfred_talk({ action: "list_transcripts", limit: 5 })

# Get a specific transcript
alfred_talk({ action: "get_transcript", conversationId: "conv_abc123" })

# Check voice server status
alfred_talk({ action: "server_status" })
```

### CLI

```bash
# Check status
openclaw alfred-talk status

# List recent transcripts
openclaw alfred-talk transcripts
openclaw alfred-talk transcripts -n 20
```
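
For scripting outside the CLI, a listing like the one above can be approximated directly from the transcript directory. This is a minimal sketch that assumes transcripts are `.json` files inside per-day folders under `~/.openclaw/alfred-talk/transcripts/` (the layout described under Transcript Flow) and that "recent" means newest modification time; `recent_transcripts` is a hypothetical helper, not part of the plugin.

```python
from pathlib import Path


def recent_transcripts(root: Path, limit: int = 5) -> list[Path]:
    """Return up to `limit` transcript files, newest first by modification time."""
    # One level of date folders, then the .json files inside them.
    files = [p for p in root.glob("*/*.json") if p.is_file()]
    files.sort(key=lambda p: p.stat().st_mtime, reverse=True)
    return files[:limit]
```

Calling `recent_transcripts(Path.home() / ".openclaw" / "alfred-talk" / "transcripts", limit=20)` would roughly mirror `openclaw alfred-talk transcripts -n 20`, although the CLI may order or filter results differently.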

### Skills

Two skills are included:

- **alfred-talk** — Transcript management (list, view, status)
- **alfred-talk-call** — Outbound call initiation

These teach the agent when and how to use the `alfred_talk` tool.

### Receiving Calls

1. Configure a phone number on ElevenLabs (linked to your Twilio account)
2. Set the inbound call handler to your ElevenLabs agent
3. Calls are handled automatically by ElevenLabs Conversational AI
4. Post-call transcripts are sent to your webhook and processed

### Transcript Flow

```
Call ends
  → ElevenLabs sends POST /elevenlabs-webhook
  → Plugin saves JSON to ~/.openclaw/alfred-talk/transcripts/YYYY-MM-DD/
  → Plugin processes transcript:
    1. Extracts caller info (matched against contacts config)
    2. Spawns a summarization subagent
    3. Posts full transcript to notification channel
    4. Posts summary to notification channel
    5. Saves structured markdown to vault inbox
  → Marks transcript as processed
```
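
The save and mark-as-processed steps can be sketched as follows. The file name (`<conversation id>.json`) and the format of the `.processed` file (one conversation ID per line) are assumptions for illustration; the plugin itself may store this state differently.

```python
import json
from datetime import date
from pathlib import Path
from typing import Optional


def save_transcript(root: Path, conversation_id: str, payload: dict,
                    day: Optional[date] = None) -> Path:
    """Write a webhook payload to <root>/YYYY-MM-DD/<conversation_id>.json."""
    day = day or date.today()
    target_dir = root / day.isoformat()
    target_dir.mkdir(parents=True, exist_ok=True)
    path = target_dir / f"{conversation_id}.json"
    path.write_text(json.dumps(payload, indent=2), encoding="utf-8")
    return path


def mark_processed(root: Path, conversation_id: str) -> bool:
    """Record an ID in <root>/.processed; return False if it was already listed."""
    state = root / ".processed"  # assumed format: one ID per line
    seen = set(state.read_text(encoding="utf-8").split()) if state.exists() else set()
    if conversation_id in seen:
        return False
    with state.open("a", encoding="utf-8") as fh:
        fh.write(conversation_id + "\n")
    return True
```

Returning `False` for an ID that is already recorded gives the caller a simple way to skip a transcript that ElevenLabs delivers more than once.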

## Customization

### System Prompt

Edit `voice-server/system-prompt.md` to customize the voice agent's personality.
The default prompt is an "Alfred Pennyworth" butler persona — calm, witty, concise.

### Contacts

Map phone numbers to names in the plugin config:

```json5
{
  contacts: {
    "+15551234567": "Alice",
    "+15559876543": "Bob",
  },
}
```

Callers who are not in the contacts map are shown by phone number in transcripts and notifications.
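
The lookup behavior described above amounts to a dictionary lookup with the number itself as the fallback. The sketch below adds a small normalization step so that differently formatted numbers still match; that step and both function names are assumptions for illustration, and the plugin may require an exact match.

```python
import re


def normalize(phone: str) -> str:
    """Strip spaces, dashes, dots, and parentheses, keeping any leading '+'."""
    return re.sub(r"[\s\-().]", "", phone)


def caller_label(phone: str, contacts: dict[str, str]) -> str:
    """Return the contact name for a number, or the number itself if unknown."""
    number = normalize(phone)
    return contacts.get(number, number)
```

With the contacts shown above, `caller_label("+1 (555) 123-4567", contacts)` gives `"Alice"`, while a number that is not in the map comes back unchanged apart from normalization.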

## Troubleshooting

### Voice server won't start

```bash
# Check Python is available
python3 --version

# Check dependencies
pip install -r voice-server/requirements.txt

# Run manually to see errors
python3 voice-server/server.py
```
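
Once the server starts, a quick probe of the `/health` endpoint shown in the architecture diagram confirms it is listening. This is a minimal sketch using only the standard library; it assumes `/health` answers a plain GET with HTTP 200, which this README does not spell out, and `health_url` and `is_healthy` are hypothetical helper names.

```python
import urllib.error
import urllib.request


def health_url(port: int = 8770, host: str = "localhost") -> str:
    """Build the health-check URL for the voice server (default port 8770)."""
    return f"http://{host}:{port}/health"


def is_healthy(url: str, timeout: float = 3.0) -> bool:
    """Return True if the URL answers with HTTP 200 within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or a non-2xx response.
        return False
```

Pointing `is_healthy` at the public URL instead of `localhost` also helps distinguish a server that is down from a tunnel or reverse proxy that is misconfigured.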

### Webhooks not arriving

1. Verify your webhook URL is publicly accessible
2. Check the ElevenLabs dashboard for webhook delivery status
3. Check gateway logs: `tail -f ~/.openclaw/gateway.log | grep alfred-talk`
4. If using signature verification, ensure `webhookSecret` in the plugin config matches the webhook secret configured in ElevenLabs
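
When debugging a signature mismatch, it can help to recompute the signature by hand. The sketch below assumes the ElevenLabs scheme works as follows: a header of the form `t=<unix timestamp>,v0=<hex digest>`, where the digest is an HMAC-SHA256 over `<timestamp>.<raw body>` keyed with the webhook secret. Treat the header format, the 30-minute tolerance, and the `verify_signature` helper as assumptions to confirm against the ElevenLabs documentation; this is not the plugin's own code.

```python
import hashlib
import hmac
import time
from typing import Optional


def verify_signature(header: str, body: bytes, secret: str,
                     tolerance_s: int = 1800, now: Optional[float] = None) -> bool:
    """Check a 't=<unix ts>,v0=<hex hmac>' header against the raw request body."""
    parts = dict(item.strip().split("=", 1) for item in header.split(",") if "=" in item)
    timestamp, received = parts.get("t"), parts.get("v0")
    if not timestamp or not received or not timestamp.isdigit():
        return False
    now = time.time() if now is None else now
    if abs(now - int(timestamp)) > tolerance_s:
        return False  # stale or future-dated request
    # Assumed signed payload: "<timestamp>.<raw body bytes>"
    signed = timestamp.encode() + b"." + body
    expected = hmac.new(secret.encode(), signed, hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode(), received.encode())
```

The check must run on the raw request bytes. Re-serializing parsed JSON can change whitespace or key order, which changes the digest and produces a mismatch even when the secret is correct.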

### Transcripts not processing

```bash
# Check transcript directory
ls ~/.openclaw/alfred-talk/transcripts/

# Check processed state
cat ~/.openclaw/alfred-talk/transcripts/.processed

# Check plugin status
openclaw alfred-talk status
```

### Call quality issues

- The voice server uses Gemini Flash by default for fast responses
- Ensure your Gemini API key is valid and has quota
- For lower latency, run the voice server geographically close to your users
- ElevenLabs voice quality is configured on their dashboard, not in this plugin

## Development

```bash
# Clone and install
git clone https://github.com/ssdavidai/alfred-talk.git
cd alfred-talk
pip install -r voice-server/requirements.txt

# Link as plugin (no copy, changes take effect on restart)
openclaw plugins install -l .

# Run voice server standalone
python3 voice-server/server.py

# Test webhook
curl -X POST http://localhost:8770/elevenlabs-webhook \
  -H 'Content-Type: application/json' \
  -d '{"type":"post_call_transcription","data":{"conversation

... (truncated)
