Waha

Name: Waha
Rating: 3.5 (1 review)
Author: omernesh


OpenClaw channel plugin for WhatsApp via WAHA — includes DM keyword filter and web admin panel


Configuration Example

"dmFilter": {
  "enabled": true,
  "mentionPatterns": ["sammie", "help", "hello", "bot", "ai"],
  "godModeBypass": true,
  "godModeSuperUsers": [
    { "identifier": "972544329000", "platform": "whatsapp", "passwordRequired": false }
  ],
  "tokenEstimate": 2500
}

README

# OpenClaw WAHA Plugin — Developer Reference

**Plugin ID:** `waha`
**Platform:** WhatsApp (via WAHA HTTP API)
**Last updated:** 2026-03-07

---

## 1. Overview

This plugin bridges OpenClaw AI agents to WhatsApp through the WAHA (WhatsApp HTTP API) server. It enables the "Sammie" bot to receive WhatsApp messages via webhook, route them through OpenClaw's AI agent pipeline, and deliver replies back through WAHA — including text responses and TTS-generated voice notes.

The plugin operates as a channel adapter within the OpenClaw plugin-sdk framework. It:

- Runs an HTTP webhook server to receive inbound WAHA events
- Applies access control (DM policy, group allowlists with both `@c.us` and `@lid` JID formats)
- Simulates human-like presence (read receipts, typing indicators with random pauses) before replying
- Delivers AI-generated text and voice replies through WAHA's REST API
- Enforces session guardrails (only the "logan" session can send outbound messages)

---

## 2. File Listing

| File | Lines | Description |
|------|-------|-------------|
| `channel.ts` | ~340 | Channel plugin registration and lifecycle. Exports the `ChannelPlugin` definition with metadata, capabilities (reactions, media, markdown), account resolution, and outbound delivery adapter. Wires up the webhook monitor, inbound handler, and send functions. |
| `inbound.ts` | ~380 | Inbound message handler. Receives parsed `WahaInboundMessage` from the monitor, applies DM/group access control via `resolveDmGroupAccessWithCommandGate`, runs the DM keyword filter, starts the human presence simulation, dispatches the message to the AI agent, and delivers the reply. |
| `dm-filter.ts` | ~145 | DM keyword filter. `DmFilter` class with regex caching, god mode bypass for super-users, and stats tracking (dropped/allowed/tokensEstimatedSaved). Fail-open: any error allows messages through. |
| `send.ts` | ~250 | WAHA REST API wrappers. Provides `sendWahaText()`, `sendWahaMediaBatch()`, `sendWahaReaction()`, `sendWahaPresence()`, `sendWahaSeen()`, and the internal `callWahaApi()` HTTP client. Includes `assertAllowedSession()` guardrail, `buildFilePayload()` for base64 encoding of local TTS files, and `resolveMime()` for MIME type detection with file-extension fallback. |
| `presence.ts` | ~170 | Human mimicry presence system. Implements the 4-phase presence simulation: seen, read delay, typing with random pauses (flicker), and reply-length padding. Exports `startHumanPresence()` which returns a `PresenceController` with `finishTyping()` and `cancelTyping()` methods. |
| `types.ts` | ~130 | TypeScript type definitions. Defines `CoreConfig`, `WahaChannelConfig`, `WahaAccountConfig`, `PresenceConfig`, `DmFilterConfig`, `WahaWebhookEnvelope`, `WahaInboundMessage`, `WahaReactionEvent`, and `WahaWebhookConfig`. |
| `config-schema.ts` | ~86 | Zod validation schema for the `channels.waha` config section. Validates all account-level and channel-level settings including secret inputs, policies, presence parameters, DM filter config, and markdown options. |
| `accounts.ts` | ~140 | Multi-account resolution. Resolves which WAHA account (baseUrl, apiKey, session) to use for a given operation. Supports a default account plus named sub-accounts under `channels.waha.accounts`. Handles API key resolution from env vars, files, or direct strings. |
| `normalize.ts` | ~30 | JID normalization utilities. `normalizeWahaMessagingTarget()` strips `waha:`, `whatsapp:`, `chat:` prefixes. `normalizeWahaAllowEntry()` lowercases for allowlist comparison. `resolveWahaAllowlistMatch()` checks if a sender JID is in the allowlist (supports `*` wildcard). |
| `monitor.ts` | ~506 | Webhook HTTP server, health monitoring, and admin panel. Starts an HTTP server on the configured port (default 8050). Handles `/healthz`, `/admin` (HTML dashboard), `/api/admin/stats` (JSON stats), and the main webhook path. Validates HMAC signatures and dispatches inbound events. |
| `runtime.ts` | ~15 | Runtime singleton access. `setWahaRuntime()` / `getWahaRuntime()` store and retrieve the OpenClaw `PluginRuntime` instance for use across modules. |
| `signature.ts` | ~30 | HMAC webhook verification. `verifyWahaWebhookHmac()` validates the `X-Webhook-Hmac` header using SHA-512, accepting hex or base64 signature formats. Uses `crypto.timingSafeEqual()` for constant-time comparison. |
| `secret-input.ts` | ~15 | Secret field schema. Re-exports OpenClaw SDK secret input utilities and provides `buildSecretInputSchema()` which accepts either a plain string or a `{ source, provider, id }` object for env/file/exec-based secret resolution. |
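
The `signature.ts` row above describes HMAC verification with SHA-512, hex or base64 input, and a constant-time comparison. A minimal sketch of that idea follows. The function name `verifyHmacSha512` and its parameters are hypothetical, and the real `verifyWahaWebhookHmac()` may differ in signature and in how it tells the two encodings apart.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Hypothetical helper, not the plugin's actual code.
// Checks an HMAC-SHA512 signature that may be hex- or base64-encoded.
function verifyHmacSha512(body: string, secret: string, signature: string): boolean {
  const expected = createHmac("sha512", secret).update(body, "utf8").digest();
  const trimmed = signature.trim();

  // A SHA-512 digest is 64 bytes: 128 hex characters or 88 base64 characters,
  // so a 128-character hex string cannot be confused with a base64 digest.
  const provided = /^[0-9a-f]{128}$/i.test(trimmed)
    ? Buffer.from(trimmed, "hex")
    : Buffer.from(trimmed, "base64");

  // timingSafeEqual throws on length mismatch, so check the length first.
  if (provided.length !== expected.length) return false;
  return timingSafeEqual(provided, expected);
}
```

Comparing raw digest bytes rather than encoded strings lets one comparison serve both encodings.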

---

## 3. DM Keyword Filter

The DM keyword filter (`dm-filter.ts`) gates inbound DMs by keyword BEFORE they reach the AI agent. Only messages matching at least one pattern are processed; others are silently dropped. This prevents the AI from consuming tokens on irrelevant or unsolicited messages.

### Config (under `channels.waha`)

```json
"dmFilter": {
  "enabled": true,
  "mentionPatterns": ["sammie", "help", "hello", "bot", "ai"],
  "godModeBypass": true,
  "godModeSuperUsers": [
    { "identifier": "972544329000", "platform": "whatsapp", "passwordRequired": false }
  ],
  "tokenEstimate": 2500
}
```

| Field | Type | Default | Description |
|-------|------|---------|-------------|
| `enabled` | `boolean` | `false` | Enable/disable the filter |
| `mentionPatterns` | `string[]` | `[]` | Regex patterns (case-insensitive). Message must match at least one. Empty list means no restriction. |
| `godModeBypass` | `boolean` | `true` | Super-users bypass the filter entirely |
| `godModeSuperUsers` | `array` | `[]` | List of users who bypass the filter (phone in E.164 or JID format) |
| `tokenEstimate` | `number` | `2500` | Estimated tokens saved per dropped message (used for stats display) |
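
Because `godModeSuperUsers` identifiers can be written as phone numbers or JIDs, a sender has to be normalized before comparison. The sketch below shows one way this could work. `normalizeIdentifier` and `isSuperUser` are hypothetical names, and the rules are assumptions based only on the formats this README mentions (05X, 972X, +972X, and JID suffixes).

```typescript
// Hypothetical helper: reduce a phone number or JID to bare international digits.
function normalizeIdentifier(raw: string): string {
  // Drop a JID suffix such as @c.us, @lid or @s.whatsapp.net.
  const local = raw.trim().split("@")[0] ?? "";
  // Keep digits only (removes "+", spaces and dashes).
  const digits = local.replace(/\D/g, "");
  // Assumption: Israeli local format 05X... maps to 9725X...
  return digits.startsWith("05") ? "972" + digits.slice(1) : digits;
}

// Hypothetical helper: true if the sender matches any configured super-user.
function isSuperUser(sender: string, superUsers: { identifier: string }[]): boolean {
  const id = normalizeIdentifier(sender);
  return id.length > 0 && superUsers.some((u) => normalizeIdentifier(u.identifier) === id);
}
```

With the config above, `isSuperUser("0544329000", [{ identifier: "972544329000" }])` and `isSuperUser("972544329000@c.us", ...)` would both match under these assumed rules.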

### Behavior

- **Filter disabled**: All messages pass through (stats count as allowed)
- **No patterns**: All messages pass through (no restriction configured)
- **God mode**: Super-users bypass pattern matching entirely. Israeli phone normalization handles 05X/972X/+972X and JID suffixes (`@c.us`, `@lid`, `@s.whatsapp.net`)
- **Pattern match**: Message is allowed if ANY pattern matches (case-insensitive regex)
- **No match**: Message is silently dropped — no reply, no error, no pairing message
- **Fail-open**: Any error in the filter allows the message through (avoids outages from filter bugs)
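
The rules above can be read as a single decision function. The following is a sketch under assumptions, not the `DmFilter` class itself: `checkDm`, `DmFilterSketchConfig`, the order of the checks, and every reason string except `no_keyword_match` (which appears in the stats example in this README) are hypothetical.

```typescript
type FilterResult = { pass: boolean; reason: string };

// Subset of the dmFilter config that the decision needs.
interface DmFilterSketchConfig {
  enabled: boolean;
  mentionPatterns: string[];
  godModeBypass: boolean;
}

// Hypothetical decision function mirroring the documented behavior list.
function checkDm(text: string, senderIsSuperUser: boolean, cfg: DmFilterSketchConfig): FilterResult {
  try {
    if (!cfg.enabled) return { pass: true, reason: "filter_disabled" };
    if (cfg.mentionPatterns.length === 0) return { pass: true, reason: "no_patterns" };
    if (cfg.godModeBypass && senderIsSuperUser) return { pass: true, reason: "god_mode" };

    // Allowed if ANY pattern matches, case-insensitively.
    const matched = cfg.mentionPatterns.some((p) => new RegExp(p, "i").test(text));
    return matched
      ? { pass: true, reason: "keyword_match" }
      : { pass: false, reason: "no_keyword_match" };
  } catch {
    // Fail-open: an invalid pattern or any other error lets the message through.
    return { pass: true, reason: "filter_error" };
  }
}
```

Wrapping the whole body in `try`/`catch` is what makes the sketch fail-open: a malformed pattern such as `"("` makes `new RegExp` throw, and the message is still allowed.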

### Regex caching

Patterns are compiled to `RegExp` objects once and cached. The cache key is the joined pattern array. When the config is updated (e.g. via `updateConfig()`), the cache is invalidated and rebuilt on the next check.
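
A minimal sketch of this caching scheme is shown below. The class name `PatternCache`, the `compileCount` counter (added only to make recompilation observable), and the NUL separator used to join the patterns are assumptions; the README does not say which separator the plugin uses.

```typescript
// Hypothetical cache: recompiles only when the pattern list changes.
class PatternCache {
  private key: string | null = null;
  private compiled: RegExp[] = [];
  compileCount = 0; // illustration only: how many times patterns were compiled

  get(patterns: string[]): RegExp[] {
    // Assumption: join with a NUL character, which is unlikely to appear in a pattern.
    const key = patterns.join("\u0000");
    if (key !== this.key) {
      this.compiled = patterns.map((p) => new RegExp(p, "i"));
      this.key = key; // set only after a successful compile
      this.compileCount++;
    }
    return this.compiled;
  }

  // Call after a config update to force a rebuild on the next get().
  invalidate(): void {
    this.key = null;
  }
}
```

Because the key is derived from the pattern contents, a config update that leaves the patterns unchanged would not need a rebuild even without an explicit invalidation.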

### Stats tracking

The filter maintains runtime counters per account:
- `dropped`: messages silently dropped
- `allowed`: messages passed through
- `tokensEstimatedSaved`: `dropped * tokenEstimate` — rough estimate of AI tokens saved

Recent events (last 50) are stored in memory with timestamp, pass/fail, reason, and text preview.
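
The counters and the bounded event log could be kept as follows. This is a sketch: `FilterStats`, `record()`, and the 80-character preview length are hypothetical, while the field names, the default `tokenEstimate` of 2500, and the 50-event limit come from this README.

```typescript
interface FilterEvent {
  ts: number;
  pass: boolean;
  reason: string;
  preview: string;
}

// Hypothetical per-account stats holder.
class FilterStats {
  dropped = 0;
  allowed = 0;
  readonly recent: FilterEvent[] = [];
  private readonly tokenEstimate: number;
  private readonly maxEvents: number;

  constructor(tokenEstimate = 2500, maxEvents = 50) {
    this.tokenEstimate = tokenEstimate;
    this.maxEvents = maxEvents;
  }

  // Derived value: dropped * tokenEstimate.
  get tokensEstimatedSaved(): number {
    return this.dropped * this.tokenEstimate;
  }

  record(pass: boolean, reason: string, text: string): void {
    if (pass) this.allowed++;
    else this.dropped++;
    // Assumption: previews are truncated to 80 characters.
    this.recent.push({ ts: Date.now(), pass, reason, preview: text.slice(0, 80) });
    // Keep only the most recent maxEvents entries.
    if (this.recent.length > this.maxEvents) this.recent.shift();
  }
}
```

Five dropped messages at the default estimate give `tokensEstimatedSaved` of 12500, which matches the figures in the stats example in section 4.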

---

## 4. Admin Panel

A browser-based admin panel is served at `http://<host>:<webhookPort>/admin` (default port 8050).

### Access

```
http://100.114.126.43:8050/admin
```

### Features

- **DM Filter card**: Shows enabled status, keyword patterns, stats (dropped/allowed/tokens saved), and a live event log (last 20 events with timestamp, reason, and message preview)
- **Presence System card**: Displays current presence config (wpm, read delays, typing durations, jitter)
- **Access Control card**: Shows dmPolicy, groupPolicy, allowFrom, groupAllowFrom, and allowedGroups
- **Session Info card**: Shows session name, baseUrl, webhookPort, and server time
- **Auto-refresh**: Reloads stats every 30 seconds. Manual refresh via button.

### Stats API

```bash
curl http://100.114.126.43:8050/api/admin/stats
```

Returns JSON:
```json
{
  "dmFilter": {
    "enabled": true,
    "patterns": ["sammie", "help"],
    "stats": { "dropped": 5, "allowed": 12, "tokensEstimatedSaved": 12500 },
    "recentEvents": [
      { "ts": 1772902231754, "pass": false, "reason": "no_keyword_match", "preview": "hello world" }
    ]
  },
  "presence": { "enabled": true, "wpm": 42, ... },
  "access": { "dmPolicy": "pairing", "allowFrom": [...], ... },
  "session": "3cf11776_logan",
  "webhookPort": 8050,
  "serverTime": "2026-03-07T18:50:00.000Z"
}
```
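
For scripted monitoring, the same endpoint can be read from TypeScript. The sketch below types only the fields shown in full above; `presence` and `access` are left out because their contents are elided in this README. `AdminStats`, `fetchAdminStats`, and `dropRate` are hypothetical names, and the global `fetch` assumes Node 18 or later.

```typescript
// Partial shape of the stats response, limited to the fields documented above.
interface AdminStats {
  dmFilter: {
    enabled: boolean;
    patterns: string[];
    stats: { dropped: number; allowed: number; tokensEstimatedSaved: number };
    recentEvents: { ts: number; pass: boolean; reason: string; preview: string }[];
  };
  session: string;
  webhookPort: number;
  serverTime: string;
}

// Hypothetical client: baseUrl is e.g. "http://<host>:8050".
async function fetchAdminStats(baseUrl: string): Promise<AdminStats> {
  const res = await fetch(`${baseUrl}/api/admin/stats`);
  if (!res.ok) throw new Error(`stats request failed: HTTP ${res.status}`);
  return (await res.json()) as AdminStats;
}

// Fraction of inbound DMs that the filter dropped (0 when nothing was seen).
function dropRate(stats: AdminStats): number {
  const { dropped, allowed } = stats.dmFilter.stats;
  const total = dropped + allowed;
  return total === 0 ? 0 : dropped / total;
}
```

For the sample response above, `dropRate` would be 5 / 17, roughly 29% of DMs filtered out.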

### Implementation notes

- Zero build tooling: the entire admin dashboard is an embedded HTML/CSS/JS template string in `monitor.ts`
- Admin routes are added BEFORE the POST-only webhook guard in the HTTP server handler
- No authentication on admin routes. With `webhookHost: 0.0.0.0` the server binds to all interfaces, so the panel is reachable from any host that can reach the port, not only from localhost; restrict access via firewall if needed

---

## 5. Human Mimicry Presence System

### Problem

A bot that instantly shows "typing..." and replies in 200ms is obviously non-human. WhatsApp users notice deterministic timing patterns, which degrades the conversational experience.

### Solution

The presence system simulates a 4-phase human interaction pattern with randomized timing at every step:

```
Phase 1: SEEN         Phase 2: READ         Phase 3: TYPING         Phase 4: REPLY
                                             (with pauses)
  [msg arrives]  -->  [blue ticks]  -->  [typing... ···]  -->  [send message]
       |                   |                    |
       v                   v                    v
   sendSeen()        sleep(readDelay)    typing ON/OFF flicker
                                         (random pauses)
                                         + padding if AI was fast
```

### Flow Detail

1. **Seen** (`sendSeen`): If enabled, immediately marks the message as read (blue ticks).
2. **Read Delay** (`readDelayMs`): Pauses to simulate the time a human takes to read the incoming message. Duration scales with message length (`msPerReadChar * charCount`), clamped to `readDelayMs` bounds, then jittered.
3. **Typing with Flicker**: Sets typing indicator ON, then enters a loop 

... (truncated)
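
The read-delay step in the flow above (scale with message length, clamp, then jitter) could be computed as in the sketch below. `computeReadDelayMs` and its parameter names are hypothetical, and the jitter form, a random factor in `[1 - jitter, 1 + jitter]`, is an assumption; the README states only that the clamped value is jittered.

```typescript
// Hypothetical read-delay calculation. rand is injectable so the result can be tested.
function computeReadDelayMs(
  charCount: number,
  msPerReadChar: number,
  minMs: number,
  maxMs: number,
  jitter: number,
  rand: () => number = Math.random,
): number {
  // Scale with message length.
  const scaled = msPerReadChar * charCount;
  // Clamp to the configured bounds.
  const clamped = Math.min(maxMs, Math.max(minMs, scaled));
  // Assumption: multiply by a random factor in [1 - jitter, 1 + jitter].
  const factor = 1 + (rand() * 2 - 1) * jitter;
  return Math.round(clamped * factor);
}
```

Applying jitter after clamping means the final delay can fall slightly outside the configured bounds, which keeps long messages from always producing the same maximum delay.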
