ReasoniXlaw

By Dendim0n

Inspired by DeepSeek-Reasonix: a DeepSeek context management plugin for OpenClaw.

# ReasoniXlaw

<p align="center">
  <img src="./docs/reasonixlaw-logo.png" alt="ReasoniXlaw Logo" width="600">
</p>

**Prefix-cache-stable context engine for DeepSeek models, packaged as an OpenClaw plugin.**

> Inspired by [Reasonix](https://github.com/esengine/DeepSeek-Reasonix). Thanks to the project author for the prefix-stable context management concept.

English | [中文](./README_CN.md)

## Problem

DeepSeek's API offers **prefix caching**: if the beginning of your request is identical to a previous one, cached tokens cost ~10% of the normal input price, a saving of up to 90%.

But OpenClaw's default context management compresses/modifies messages wherever it sees fit when context fills up. This **destroys the prefix**, breaking DeepSeek's cache.
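
To make the failure mode concrete, here is a minimal sketch of why an in-place edit hurts more than an append. It compares two requests message by message, which is a simplification (real prefix caches work on tokens, not whole messages), and the `Message` shape and `sharedPrefixLength` helper are hypothetical names used only for this illustration.

```typescript
// Hypothetical message shape used only for this illustration.
interface Message {
  role: "system" | "user" | "assistant";
  content: string;
}

// Count how many leading messages are identical between two requests.
// A prefix cache can only reuse what lies before the first difference.
function sharedPrefixLength(previous: Message[], next: Message[]): number {
  const limit = Math.min(previous.length, next.length);
  let i = 0;
  while (
    i < limit &&
    previous[i].role === next[i].role &&
    previous[i].content === next[i].content
  ) {
    i++;
  }
  return i;
}

const turn1: Message[] = [
  { role: "system", content: "You are a helpful assistant." },
  { role: "user", content: "Summarise this repo." },
  { role: "assistant", content: "It is a context engine plugin." },
];

// Append-only growth: everything from turn 1 is still a shared prefix.
const appended: Message[] = [...turn1, { role: "user", content: "Show the config." }];

// In-place rewrite of an early message: the shared prefix ends right there.
const rewritten: Message[] = [
  turn1[0],
  { role: "user", content: "[summary of earlier request]" },
  turn1[2],
  { role: "user", content: "Show the config." },
];

const appendedShared = sharedPrefixLength(turn1, appended); // 3
const rewrittenShared = sharedPrefixLength(turn1, rewritten); // 1
```

Rewriting the second message leaves only the system prompt reusable, even though the later messages are unchanged.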

## Solution

A **ContextEngine plugin** that manages context in three layers:

```
┌─────────────────────────────────────┐
│  Layer 1: LOCKED PREFIX             │  ← Never modified. Cache hits here.
│  System prompt + early history      │
├─────────────────────────────────────┤
│  Layer 2: ACTIVE TAIL               │  ← Append-only between compactions.
│  Recent messages                    │
├─────────────────────────────────────┤
│  Layer 3: COMPRESSED MIDDLE         │  ← Only layer that ever changes.
│  Summarised older messages          │
└─────────────────────────────────────┘
```

Compaction only touches the middle layer. The prefix is **never modified**.
DeepSeek sees the same prefix every turn → cache hit → up to 90% cost reduction on the cached tokens.
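
As a rough illustration of the three layers, the sketch below splits a flat message list into prefix, middle, and tail. The `partition` function and `Layers` type are hypothetical and are not taken from the plugin's source; the counts 2 and 8 mirror the `prefixLockCount` and `recentKeepCount` defaults listed under Optional tuning.

```typescript
// Hypothetical message shape; not OpenClaw's actual type.
interface Message {
  role: string;
  content: string;
}

interface Layers {
  prefix: Message[]; // Layer 1: locked, never modified
  middle: Message[]; // Layer 3: the only candidates for summarisation
  tail: Message[];   // Layer 2: recent messages kept verbatim
}

// Split a conversation into the three layers without letting them overlap.
function partition(
  messages: Message[],
  prefixLockCount = 2,
  recentKeepCount = 8,
): Layers {
  const prefixEnd = Math.min(prefixLockCount, messages.length);
  // The tail never reaches back into the locked prefix.
  const tailStart = Math.max(prefixEnd, messages.length - recentKeepCount);
  return {
    prefix: messages.slice(0, prefixEnd),
    middle: messages.slice(prefixEnd, tailStart),
    tail: messages.slice(tailStart),
  };
}
```

With 20 messages and the default counts, this yields 2 prefix messages, 10 middle messages, and 8 tail messages. A short conversation has an empty middle, so there is nothing to compact.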

## Architecture

This is a **ContextEngine plugin**, not an AgentHarness. That means:

| Component | Who owns it |
|-----------|-------------|
| Context assembly & compaction | **This plugin** (prefix-stable) |
| Tool execution | OpenClaw PI |
| Auth & API keys | OpenClaw PI |
| Streaming | OpenClaw PI |
| Retries & fallback | OpenClaw PI |
| Transcript persistence | OpenClaw PI |
| Reasoning effort | PI passes through to DeepSeek (your config has `supportsReasoningEffort: true`) |
| Cache stats | Available via `ContextEngineRuntimeContext.promptCache` |

We plug into PI's execution loop at the right points (`assemble`, `compact`) and
let PI handle everything else. No custom HTTP client, no tool bridge, no auth handling.

## Installation

**Via ClawHub (recommended):**

```bash
openclaw plugins install clawhub:@dendim0n/reasonixlaw
```

**From source:**

```bash
cd ~/.openclaw/plugins
git clone https://github.com/Dendim0n/ReasoniXlaw.git
cd ReasoniXlaw
npm install
npm run build
```

## Configuration

```json5
// ~/.openclaw/openclaw.json
{
  plugins: {
    entries: {
      "deepseek-harness": { enabled: true }
    },
    slots: {
      contextEngine: "deepseek-prefix-stable"
    }
  }
}
```

That's it. When a DeepSeek model is in use, PI will automatically use our prefix-stable context engine.

### Custom model targets

By default, `deepseek-v4-flash` and `deepseek-v4-pro` trigger prefix-stable mode. To add your own models:

```json5
// ~/.openclaw/openclaw.json
{
  plugins: {
    entries: {
      "deepseek-harness": {
        enabled: true,
        config: {
          targetModels: ["deepseek-v4-flash", "deepseek-v4-pro", "deepseek-v3", "my-custom-deepseek"]
        }
      }
    },
    slots: {
      contextEngine: "deepseek-prefix-stable"
    }
  }
}
```

`targetModels` **replaces** the default list, so include the defaults if you want to keep them.

### Model Detection

The engine activates **only** for DeepSeek models, plus requests where the model is unknown. It checks the model string:

| Model | Prefix-stable mode |
|-------|-------------------|
| `deepseek-v4-flash` | ✅ Active |
| `deepseek-v4-pro` | ✅ Active |
| `deepseek/deepseek-v4-pro` | ✅ Active (provider/model format) |
| `deepseek-v3`, `deepseek-r1` | ✅ Active |
| `gemini-2.5-flash` | ❌ Passthrough (PI default) |
| `gpt-4o` | ❌ Passthrough (PI default) |
| unknown/undefined | ✅ Active (safe default) |

Non-DeepSeek models get PI's default context management: no overhead, no interference.
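
One way such a check could look is sketched below. It is an assumption, not the plugin's code: `usesPrefixStableMode` is a hypothetical helper that strips an optional `provider/` prefix, compares the remainder against `targetModels`, and treats a missing model as active. It covers only the target-list, provider-prefixed, non-DeepSeek, and unknown rows of the table; how `deepseek-v3` and `deepseek-r1` are matched is not spelled out here, so the sketch leaves them to the configured list.

```typescript
// Default targets as listed in this README.
const DEFAULT_TARGETS: string[] = ["deepseek-v4-flash", "deepseek-v4-pro"];

// Hypothetical helper: decide whether prefix-stable mode applies to a model string.
function usesPrefixStableMode(
  model: string | undefined,
  targetModels: string[] = DEFAULT_TARGETS,
): boolean {
  // Unknown or undefined model: the table above treats this as active.
  if (!model) return true;
  // Accept "provider/model" by keeping only the part after the last slash.
  const bare = model.slice(model.lastIndexOf("/") + 1).toLowerCase();
  return targetModels.some((target) => target.toLowerCase() === bare);
}
```

Under these assumptions, `usesPrefixStableMode("deepseek/deepseek-v4-pro")` is `true`, `usesPrefixStableMode("gpt-4o")` is `false`, and `usesPrefixStableMode(undefined)` is `true`.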

### Optional tuning

The context engine reads config from the plugin entry:

| Option | Default | Description |
|--------|---------|-------------|
| `prefixLockCount` | 2 | Messages to lock into prefix layer |
| `recentKeepCount` | 8 | Recent messages to keep verbatim in tail |
| `compactRatio` | 0.8 | Context ratio that triggers compaction |
| `archiveDropped` | true | Archive dropped messages to disk |
| `targetModels` | `["deepseek-v4-flash", "deepseek-v4-pro"]` | Model name patterns that activate prefix-stable mode |

#### `prefixLockCount` (default: 2)

Number of messages locked into the prefix layer. With the default, these two messages (typically the system prompt + first user message) **never change**; they are the core of DeepSeek's cache hit. Too large wastes context space; too small doesn't lock enough prefix. 2 is usually sufficient.

#### `recentKeepCount` (default: 8)

Number of recent messages kept verbatim in the tail layer. These messages are **not summarized** during compaction. 8 means your last 4 conversation turns retain full detail. Too large → context bloats quickly; too small → recent memory gets fuzzy.

#### `compactRatio` (default: 0.8)

Triggers compaction when token usage reaches 80% of the context window. 0.8 is a sweet spot: too early (0.5) wastes space; too late (0.95) risks PI force-truncating before compaction finishes.
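
The threshold test itself is simple arithmetic. The sketch below shows it under the assumption that the engine knows the current token count and the window size; `shouldCompact` is a hypothetical name, not a function from the plugin.

```typescript
// Hypothetical helper: true once usage reaches the configured share of the window.
function shouldCompact(
  usedTokens: number,
  contextWindow: number,
  compactRatio = 0.8,
): boolean {
  if (contextWindow <= 0) {
    throw new RangeError("contextWindow must be positive");
  }
  return usedTokens / contextWindow >= compactRatio;
}
```

For a 1,000,000-token window and the default ratio, compaction would start at 800,000 tokens; lowering the ratio to 0.5 moves that point to 500,000.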

#### `archiveDropped` (default: true)

Whether messages dropped during compaction are archived to `~/.openclaw/deepseek-harness/archive/` (JSONL format). Useful for post-hoc auditing; disable to save disk I/O.
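
JSONL means one JSON object per line, which keeps the archive appendable and easy to grep. The sketch below only shows the serialisation step; the record fields (`compactedAt`, `role`, `content`) and the `toJsonl` helper are assumptions for illustration, and the plugin's actual archive schema may differ.

```typescript
// Hypothetical message shape; not OpenClaw's actual type.
interface Message {
  role: string;
  content: string;
}

// Serialise dropped messages as JSONL: one JSON object per line.
// JSON.stringify escapes embedded newlines, so each record stays on one line.
function toJsonl(dropped: Message[], compactedAt: string): string {
  if (dropped.length === 0) return "";
  return (
    dropped
      .map((m) => JSON.stringify({ compactedAt, role: m.role, content: m.content }))
      .join("\n") + "\n"
  );
}
```

The returned string can be appended to an archive file as-is, since every record ends with a newline.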

#### `targetModels`

Which model names trigger prefix-stable mode. This **replaces** (not appends to) the default list, so if you customize it, include the defaults too; otherwise `deepseek-v4-flash` etc. won't activate.

> **Generally the defaults work fine.** Only tune if: context is tight (lower `prefixLockCount`), you need more precise recent memory (raise `recentKeepCount`), or context overflows frequently (lower `compactRatio`).
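
The replace-not-append behaviour of `targetModels` is what a plain shallow merge of user config over defaults produces. The sketch below assumes that is how options are resolved; `EngineConfig` and `resolveConfig` are hypothetical names, while the option names and default values come from the table above.

```typescript
interface EngineConfig {
  prefixLockCount: number;
  recentKeepCount: number;
  compactRatio: number;
  archiveDropped: boolean;
  targetModels: string[];
}

const DEFAULTS: EngineConfig = {
  prefixLockCount: 2,
  recentKeepCount: 8,
  compactRatio: 0.8,
  archiveDropped: true,
  targetModels: ["deepseek-v4-flash", "deepseek-v4-pro"],
};

// Shallow merge: each user-supplied key overwrites the default wholesale,
// so a custom targetModels array replaces the default list instead of extending it.
function resolveConfig(user: Partial<EngineConfig> = {}): EngineConfig {
  return { ...DEFAULTS, ...user };
}

const custom = resolveConfig({ targetModels: ["my-custom-deepseek"] });
// custom.targetModels is ["my-custom-deepseek"]; the two defaults are gone.
```

Options that are not supplied keep their defaults, so `resolveConfig({ recentKeepCount: 12 })` still targets both default models.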

## How It Works

1. **First `assemble()` call**: System prompt + first N messages → locked into Layer 1 (prefix)
2. **Subsequent calls**: New messages → appended to Layer 2 (tail) only
3. **Compaction**: When context fills → only Layer 3 changes → prefix stays stable
4. **DeepSeek sees identical prefix** → cached tokens at 10% price
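
The steps above can be sketched as a small stateful class. This is an illustration under stated assumptions, not the plugin's implementation and not OpenClaw's `ContextEngine` interface: `PrefixStableSketch`, its method signatures, and the caller-supplied `summarise` callback are all hypothetical, and the prefix is locked on the first call even if fewer than N messages are available.

```typescript
// Hypothetical message shape; not OpenClaw's actual type.
interface Message {
  role: string;
  content: string;
}

class PrefixStableSketch {
  private prefix: Message[] | null = null; // Layer 1: frozen after the first assemble()
  private summary: Message | null = null;  // Layer 3: the only part compact() rewrites
  private tail: Message[] = [];            // Layer 2: append-only between compactions
  private readonly prefixLockCount: number;
  private readonly recentKeepCount: number;

  constructor(prefixLockCount = 2, recentKeepCount = 8) {
    this.prefixLockCount = prefixLockCount;
    this.recentKeepCount = recentKeepCount;
  }

  // Steps 1 and 2: lock the prefix once, afterwards only append to the tail.
  assemble(newMessages: Message[]): Message[] {
    if (this.prefix === null) {
      this.prefix = newMessages.slice(0, this.prefixLockCount);
      this.tail = newMessages.slice(this.prefixLockCount);
    } else {
      this.tail.push(...newMessages);
    }
    return this.prompt();
  }

  // Step 3: fold older tail messages into the summary; the prefix is untouched.
  compact(summarise: (older: Message[]) => string): void {
    const cut = Math.max(0, this.tail.length - this.recentKeepCount);
    if (cut === 0) return;
    const carried = this.summary ? [this.summary] : [];
    const older = this.tail.slice(0, cut);
    this.summary = { role: "system", content: summarise([...carried, ...older]) };
    this.tail = this.tail.slice(cut);
  }

  // Prompt order: locked prefix, then the summary (if any), then the verbatim tail.
  prompt(): Message[] {
    const prefix = this.prefix ?? [];
    const middle = this.summary ? [this.summary] : [];
    return [...prefix, ...middle, ...this.tail];
  }
}
```

Because `compact()` never writes to `prefix`, the first `prefixLockCount` messages of every prompt are identical before and after compaction, which is the property step 4 relies on.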

## Key Design Decisions

### Why ContextEngine, not AgentHarness?

The AgentHarness interface replaces the **entire** execution loop. That's what Codex needs (it has its own app-server). But we don't want to replace tool execution, auth, streaming, or retries; we only want to manage context differently.

The ContextEngine interface is **exactly** the right abstraction:
- `assemble()` → we control what goes into the prompt
- `compact()` → we control how context is reduced
- PI handles everything else

### Why not a custom API client?

PI's model transport (`params.model`) already:
- Handles auth (API key resolution)
- Supports streaming
- Passes through `reasoning_effort` (DeepSeek's config has `supportsReasoningEffort: true`)
- Reports `promptCache` info (cache hit/miss stats)

Building our own HTTP client would duplicate all of this and lose auth integration.

## Token Economics

| Scenario | Default PI | This Engine |
|----------|-----------|-------------|
| 100K conversation, cache hit rate | ~10% | ~80%+ |
| Cost per turn (V4 Pro, 1M ctx) | ~$0.15 | ~$0.03 |
| Long session (50 turns) | ~$7.50 | ~$1.50 |

*Based on DeepSeek pricing: $0.14/M input, $0.028/M cached.*
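
The per-turn figures can be reproduced from the two listed prices. The sketch below blends them by cache hit rate; `inputCostUsd` is a hypothetical helper, and it counts input tokens only (output tokens are ignored).

```typescript
// Prices per million input tokens, as listed above.
const INPUT_PRICE_PER_M = 0.14;
const CACHED_PRICE_PER_M = 0.028;

// Hypothetical helper: input cost of one request for a given cache hit rate (0..1).
function inputCostUsd(promptTokens: number, cacheHitRate: number): number {
  if (cacheHitRate < 0 || cacheHitRate > 1) {
    throw new RangeError("cacheHitRate must be between 0 and 1");
  }
  const cachedTokens = promptTokens * cacheHitRate;
  const uncachedTokens = promptTokens - cachedTokens;
  return (
    (cachedTokens * CACHED_PRICE_PER_M + uncachedTokens * INPUT_PRICE_PER_M) /
    1000000
  );
}
```

For a 1M-token prompt this gives about $0.14 with no cache hits and about $0.028 when fully cached, which brackets the table's per-turn figures; an 80% hit rate lands at roughly $0.05.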

## Tests

```bash
npm test
```

## Files

| File | Purpose |
|------|---------|
| `src/types.ts` | Configuration, token estimation |
| `src/context-engine.ts` | ContextEngine implementation (core logic) |
| `src/index.ts` | Plugin entry: registers context engine |
| `tests/context-engine.test.ts` | Unit tests |
| `openclaw.plugin.json` | Plugin manifest |

## License

MIT