Tools

Smart Router

By joshuaswarren

ARCHIVED: See openclaw-tactician for the active version

GitHub

Install

npm install && npm run build

Configuration Example

{
  "plugins": {
    "openclaw-smart-router": {
      "mode": "dry-run",
      "providers": {
        "anthropic": {
          "quotaSource": "self-tracked",
          "quotaType": "tokens",
          "tier": "premium",
          "resetSchedule": { "type": "weekly", "dayOfWeek": 3, "hour": 7 }
        },
        "openai-codex": {
          "quotaSource": "self-tracked",
          "quotaType": "messages",
          "tier": "premium",
          "resetSchedule": { "type": "fixed", "fixedDate": "2026-02-09T14:36:00Z" }
        },
        "openrouter": {
          "quotaSource": "api",
          "quotaType": "budget",
          "tier": "budget"
        }
      }
    }
  }
}

README

# openclaw-smart-router

[![npm version](https://badge.fury.io/js/openclaw-smart-router.svg)](https://badge.fury.io/js/openclaw-smart-router)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)

Intelligent model routing for OpenClaw with quota prediction, task classification, and automatic optimization.

## What It Does

**Smart Router** helps you get the most out of your LLM quotas by:

- **Predicting exhaustion** - Know when you'll run out of tokens before it happens
- **Analyzing workloads** - Identify which cron jobs and agents can use cheaper models
- **Optimizing automatically** - Shift workloads to appropriate models based on task complexity
- **Supporting local models** - Route simple tasks to MLX, Ollama, or other local servers
- **Tracking budgets** - Monitor spend on pay-per-token providers like OpenRouter

## Quick Start

### 1. Install

```bash
cd ~/.openclaw/extensions
git clone https://github.com/joshuaswarren/openclaw-smart-router.git
cd openclaw-smart-router
npm install && npm run build
```

### 2. Enable in openclaw.json

```json
{
  "plugins": {
    "openclaw-smart-router": {
      "mode": "dry-run",
      "providers": {
        "anthropic": {
          "quotaSource": "self-tracked",
          "quotaType": "tokens",
          "tier": "premium",
          "resetSchedule": { "type": "weekly", "dayOfWeek": 3, "hour": 7 }
        },
        "openai-codex": {
          "quotaSource": "self-tracked",
          "quotaType": "messages",
          "tier": "premium",
          "resetSchedule": { "type": "fixed", "fixedDate": "2026-02-09T14:36:00Z" }
        },
        "openrouter": {
          "quotaSource": "api",
          "quotaType": "budget",
          "tier": "budget"
        }
      }
    }
  }
}
```

### 3. Restart Gateway

```bash
kill -USR1 $(pgrep openclaw-gateway)
```

### 4. Check Status

```bash
openclaw router status
```

## Usage

### CLI Commands

```bash
# Show provider status and usage
openclaw router status [provider]

# Predict quota exhaustion
openclaw router predict [--hours=24]

# List configured providers
openclaw router providers

# Manually set usage (e.g., after checking your account)
openclaw router set-usage <provider> <percent|tokens>
# Examples:
openclaw router set-usage anthropic 79%
openclaw router set-usage openai-codex 91%

# Reset quota counter after provider reset
openclaw router reset <provider>

# Analyze crons/agents for optimization opportunities
openclaw router analyze [--type=all|crons|agents]

# Generate and optionally apply optimizations
openclaw router optimize [--apply] [--safe-only]

# Detect local model servers
openclaw router detect-local

# Get or set operation mode
openclaw router mode [manual|dry-run|auto]
```

### Conversational Interface

Chat with OpenClaw using these capabilities:

```
"What's my token usage looking like?"
โ†’ Calls router_status tool

"When will I run out of Codex tokens?"
โ†’ Calls router_predict tool

"Which of my cron jobs could use cheaper models?"
โ†’ Calls router_analyze tool

"Optimize my model usage"
โ†’ Calls router_optimize tool (with confirmation)

"Move everything off Anthropic"
โ†’ Calls router_shift tool
```

## Operation Modes

| Mode | Behavior |
|------|----------|
| `manual` | CLI only. No automatic changes. |
| `dry-run` | Preview optimizations. Ask before applying. (Default) |
| `auto` | Automatically apply safe (reversible) optimizations. |

## Quota Tracking

### How Usage is Tracked

Most LLM providers don't expose usage APIs, so this plugin uses multiple tracking strategies:

| Provider | Method | Notes |
|----------|--------|-------|
| **OpenRouter** | API | ✅ Real-time usage via `/api/v1/auth/key` |
| **Anthropic** | Self-tracked | No usage API. We count tokens from responses. |
| **OpenAI Codex** | Self-tracked | Uses message-based quotas (Pro: 300-1500 messages per 5 hours). We count messages. |
| **Google** | Self-tracked | Free tier: 1000 RPD. We count requests. |
| **Kimi, Z.ai, etc.** | Self-tracked | No usage APIs. We count tokens from responses. |

### Quota Types

| Type | Unit | Example Providers |
|------|------|-------------------|
| `tokens` | Input + output tokens | Anthropic, OpenAI API |
| `messages` | Conversations/completions | OpenAI Codex (tier-based) |
| `requests` | API calls per day | Google free tier |
| `budget` | USD spend | OpenRouter |
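
Whatever the unit, exhaustion prediction boils down to extrapolating the recent burn rate against the limit. The sketch below illustrates the idea behind `openclaw router predict`; the plugin's actual algorithm is not documented here, and the types are invented for illustration.

```typescript
// Hypothetical sketch of quota-exhaustion prediction by linear extrapolation.
interface UsageSample {
  timestamp: number; // ms since epoch
  used: number;      // tokens/messages/requests consumed so far
}

// Estimate when usage reaches `limit`, assuming the recent burn rate holds.
// Returns null if there is too little data or usage is flat/declining.
function predictExhaustion(samples: UsageSample[], limit: number): number | null {
  if (samples.length < 2) return null;
  const first = samples[0];
  const last = samples[samples.length - 1];
  const rate = (last.used - first.used) / (last.timestamp - first.timestamp); // units per ms
  if (rate <= 0) return null;
  const remaining = limit - last.used;
  if (remaining <= 0) return last.timestamp; // already exhausted
  return last.timestamp + remaining / rate;
}

// Example: 100k tokens burned in the last hour against a 1M limit
const now = Date.now();
const eta = predictExhaustion(
  [
    { timestamp: now - 3_600_000, used: 400_000 },
    { timestamp: now, used: 500_000 },
  ],
  1_000_000,
);
// eta is roughly now + 5 hours at the current burn rate
```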

### Self-Tracked Providers

For providers without usage APIs, we track usage ourselves via the `llm_end` hook. Keep these caveats in mind:

1. **We don't know your actual limits** - Set them manually with `router set-usage`
2. **Tracking starts from zero** - Historical usage before plugin install is unknown
3. **Reset timing may drift** - We reset when configured, not when your provider does
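
Conceptually, self-tracking is just a per-provider counter incremented on every completed call. The hook payload shape below is an assumption for illustration; the real OpenClaw `llm_end` event may differ.

```typescript
// Sketch of self-tracked usage counting via a (hypothetical) llm_end payload.
interface LlmEndEvent {
  provider: string;
  inputTokens: number;
  outputTokens: number;
}

const usage = new Map<string, number>(); // provider -> units since last reset

function onLlmEnd(event: LlmEndEvent): void {
  const prior = usage.get(event.provider) ?? 0;
  // For token-quota providers, input + output both count against the limit.
  usage.set(event.provider, prior + event.inputTokens + event.outputTokens);
}

// `openclaw router reset <provider>` effectively clears this counter:
function resetProvider(provider: string): void {
  usage.set(provider, 0);
}

onLlmEnd({ provider: "anthropic", inputTokens: 1_200, outputTokens: 800 });
// usage.get("anthropic") is now 2000
```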

**Recommended workflow:**

```bash
# Check your provider's dashboard for current usage
# Then sync the plugin:
openclaw router set-usage anthropic 79%
openclaw router set-usage openai-codex 91%
```

## Configuration Reference

```json
{
  "plugins": {
    "openclaw-smart-router": {
      // Operation mode: manual, dry-run, auto
      "mode": "dry-run",

      // Enable debug logging
      "debug": false,

      // Provider-specific configuration
      "providers": {
        "anthropic": {
          // How to track: api, manual, unlimited, self-tracked
          "quotaSource": "self-tracked",
          // What the limit measures: tokens, requests, messages, budget
          "quotaType": "tokens",
          // Optional: set a limit for warnings (omit if unknown)
          // "limit": 10000000,
          // When quota resets
          "resetSchedule": {
            "type": "weekly",    // daily, weekly, monthly, fixed
            "dayOfWeek": 3,      // 0=Sunday (for weekly)
            "hour": 7,           // Hour of reset (0-23)
            "timezone": "America/Chicago"
          },
          // Cost tier: premium, standard, budget, free, local
          "tier": "premium",
          // Priority within tier (higher = preferred)
          "priority": 100
        },
        "openrouter": {
          "quotaSource": "api",
          "budget": {
            "monthlyLimit": 10.00,
            "alertThreshold": 0.8
          },
          "tier": "budget"
        },
        "local-mlx": {
          "quotaSource": "unlimited",
          "tier": "local",
          "local": {
            "type": "mlx",
            "endpoint": "http://localhost:8080",
            "models": ["mlx-community/Llama-3.2-3B-Instruct-4bit"]
          }
        }
      },

      // Minimum quality scores by task type
      "qualityThresholds": {
        "coding": 0.8,
        "reasoning": 0.75,
        "creative": 0.6,
        "simple": 0.4
      },

      // How far ahead to predict (hours)
      "predictionHorizonHours": 24,

      // Alert thresholds (0-1)
      "warningThreshold": 0.8,
      "criticalThreshold": 0.95,

      // Auto-optimization interval (minutes)
      "optimizationIntervalMinutes": 60,

      // When to use local models: never, simple-only, when-available, prefer
      "localModelPreference": "simple-only"
    }
  }
}
```
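
The `resetSchedule` fields drive when self-tracked counters roll over. As a rough sketch of how a `weekly` schedule resolves to a concrete time, the helper below works in UTC for brevity; the real plugin also honors the `timezone` field, so its results may differ across DST boundaries.

```typescript
// Compute the next weekly reset after `from`, given dayOfWeek (0=Sunday)
// and hour (0-23), all in UTC.
function nextWeeklyReset(from: Date, dayOfWeek: number, hour: number): Date {
  const next = new Date(from);
  next.setUTCHours(hour, 0, 0, 0);
  let daysAhead = (dayOfWeek - from.getUTCDay() + 7) % 7;
  if (daysAhead === 0 && next <= from) daysAhead = 7; // today's reset already passed
  next.setUTCDate(next.getUTCDate() + daysAhead);
  return next;
}

// From a Monday at noon UTC, a Wednesday 07:00 reset is about 43 hours away:
nextWeeklyReset(new Date("2026-02-09T12:00:00Z"), 3, 7).toISOString();
// → "2026-02-11T07:00:00.000Z"
```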

## How It Works

### Architecture

```
┌───────────────────────────────────────────────────────────────────┐
│                     openclaw-smart-router                         │
├───────────────────────────────────────────────────────────────────┤
│                                                                   │
│  ┌─────────────┐   ┌─────────────┐   ┌─────────────────────────┐  │
│  │   Quota     │   │ Capability  │   │      Optimization       │  │
│  │  Tracker    │   │   Scorer    │   │        Engine           │  │
│  └──────┬──────┘   └──────┬──────┘   └────────────┬────────────┘  │
│         │                 │                       │               │
│         ▼                 ▼                       ▼               │
│  ┌─────────────────────────────────────────────────────────────┐  │
│  │                   Provider Registry                         │  │
│  │  ┌─────────┐ ┌─────────┐ ┌─────────┐ ┌──────────┐ ┌───────┐ │  │
│  │  │Anthropic│ │ OpenAI  │ │ Google  │ │OpenRouter│ │ Local │ │  │
│  │  └─────────┘ └─────────┘ └─────────┘ └──────────┘ └───────┘ │  │
│  └─────────────────────────────────────────────────────────────┘  │
│                                                                   │
│  ┌─────────────────────────────────────────────────────────────┐  │
│  │                      Interface Layer                        │  │
│  │  ┌──────────────────────┐  ┌───────────────────────────────┐│  │
│  │  │     CLI Commands     │  │     Agent Tools (Chat)        ││  │
│  │  └──────────────────────┘  └───────────────────────────────┘│  │
│  └─────────────────────────────────────────────────────────────┘  │
└───────────────────────────────────────────────────────────────────┘
```

### Task Classification

The plugin analyzes prompts to determine task complexity:

| Signal | Classification | Quality Threshold |
|--------|----------------|-------------------|
| Code keywords, ``` blocks | Coding | 0.8 |
| "analyze", "design", "strategy" | Reasoning | 0.75 |
| "write", "story", "creative" | Creative | 0.6 |
| "summarize", "list", "check" | Simple | 0.4 |

### Model Capability Scoring

Each model is scored on capability dimensions (0-1):

- **coding** - Code generation and debugging
- **reasoning** - Logic, math, analysis
- **creative** - Writing, brainstorming
- **instruction** - Following complex instructions
- **context** - Long context handling
- **speed** - Response latency

Default scores are provided for common models. Override with manual scores in config.
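Putting classification and scoring together: the router can pick the cheapest tier whose score on the relevant dimension still clears the quality threshold. The sketch below illustrates that matching step; the model names, scores, and numeric tier costs are invented for the example.

```typescript
// Hypothetical sketch: choose the cheapest model that meets a quality threshold.
interface ModelInfo {
  name: string;
  tierCost: number; // lower = cheaper (e.g. local=0, budget=1, premium=2)
  scores: Record<string, number>; // capability dimension -> 0..1
}

function pickModel(
  models: ModelInfo[],
  dimension: string,
  threshold: number,
): ModelInfo | null {
  const adequate = models.filter((m) => (m.scores[dimension] ?? 0) >= threshold);
  if (adequate.length === 0) return null;
  // Among adequate models, prefer the cheapest tier.
  return adequate.sort((a, b) => a.tierCost - b.tierCost)[0];
}

const models: ModelInfo[] = [
  { name: "claude-premium", tierCost: 2, scores: { coding: 0.95, simple: 0.99 } },
  { name: "local-llama", tierCost: 0, scores: { coding: 0.35, simple: 0.7 } },
];

pickModel(models, "simple", 0.4)?.name; // → "local-llama"
pickModel(models, "coding", 0.8)?.name; // → "claude-premium"
```

This is why the `qualityThresholds` matter: a lower threshold for `simple` tasks lets more of the fleet (including local models) qualify, while `coding` at 0.8 keeps code work on stronger models.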

### Optimization Flow

1. **Analyze** - Scan cron jobs and agents for optimization opportunities
2. **Score** - Match task requirements to model capabilities
3. **Plan** - Generate actions (change model, add fallback, split job)
4. **Apply** - Execute changes (dry-run or live based on mode)

## Local Model Support

The plugin auto-detects these local servers:

| Server | Default Port | Detection |

... (truncated)
