Tools
ReasoniXlaw
inspired by Deepseek-Reasonix, deepseek context management plugin for OpenClaw
Install
npm install
npm
README
# ReasoniXlaw
<p align="center">
<img src="./docs/reasonixlaw-logo.png" alt="ReasoniXlaw Logo" width="600">
</p>
**Prefix-cache stable context engine for DeepSeek models โ an OpenClaw plugin.**
> Inspired by [Reasonix](https://github.com/esengine/DeepSeek-Reasonix) โ thanks to the project author for the prefix-stable context management concept.
English | [ไธญๆ](./README_CN.md)
## Problem
DeepSeek's API offers **prefix caching**: if the beginning of your request is identical to a previous one, cached tokens cost ~10% of normal input price โ up to 90% savings.
But OpenClaw's default context management compresses/modifies messages wherever it sees fit when context fills up. This **destroys the prefix**, breaking DeepSeek's cache.
## Solution
A **ContextEngine plugin** that manages context in three layers:
```
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ Layer 1: LOCKED PREFIX โ โ Never modified. Cache hits here.
โ System prompt + early history โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโค
โ Layer 2: ACTIVE TAIL โ โ Append-only between compactions.
โ Recent messages โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโค
โ Layer 3: COMPRESSED MIDDLE โ โ Only layer that ever changes.
โ Summarised older messages โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
```
Compaction only touches the middle layer. The prefix is **never modified**.
DeepSeek sees the same prefix every turn โ cache hit โ 90% cost reduction.
## Architecture
This is a **ContextEngine plugin**, not an AgentHarness. That means:
| Component | Who owns it |
|-----------|-------------|
| Context assembly & compaction | **This plugin** (prefix-stable) |
| Tool execution | OpenClaw PI |
| Auth & API keys | OpenClaw PI |
| Streaming | OpenClaw PI |
| Retries & fallback | OpenClaw PI |
| Transcript persistence | OpenClaw PI |
| Reasoning effort | PI passes through to DeepSeek (your config has `supportsReasoningEffort: true`) |
| Cache stats | Available via `ContextEngineRuntimeContext.promptCache` |
We plug into PI's execution loop at the right points (`assemble`, `compact`) and
let PI handle everything else. No custom HTTP client, no tool bridge, no auth handling.
## Installation
**Via ClawHub (recommended):**
```bash
openclaw plugins install clawhub:@dendim0n/reasonixlaw
```
**From source:**
```bash
cd ~/.openclaw/plugins
git clone https://github.com/Dendim0n/ReasoniXlaw.git
cd ReasoniXlaw
npm install
npm run build
```
## Configuration
```json5
// ~/.openclaw/openclaw.json
{
plugins: {
entries: {
"deepseek-harness": { enabled: true }
},
slots: {
contextEngine: "deepseek-prefix-stable"
}
}
}
```
That's it. When a DeepSeek model is in use, PI will automatically use our prefix-stable context engine.
### Custom model targets
By default, `deepseek-v4-flash` and `deepseek-v4-pro` trigger prefix-stable mode. To add your own models:
```json5
// ~/.openclaw/openclaw.json
{
plugins: {
entries: {
"deepseek-harness": {
enabled: true,
config: {
targetModels: ["deepseek-v4-flash", "deepseek-v4-pro", "deepseek-v3", "my-custom-deepseek"]
}
}
},
slots: {
contextEngine: "deepseek-prefix-stable"
}
}
}
```
`targetModels` **replaces** the default โ include the defaults if you want to keep them.
### Model Detection
The engine activates **only** for DeepSeek models. It checks the model string:
| Model | Prefix-stable mode |
|-------|-------------------|
| `deepseek-v4-flash` | โ
Active |
| `deepseek-v4-pro` | โ
Active |
| `deepseek/deepseek-v4-pro` | โ
Active (provider/model format) |
| `deepseek-v3`, `deepseek-r1` | โ
Active |
| `gemini-2.5-flash` | โ Passthrough (PI default) |
| `gpt-4o` | โ Passthrough (PI default) |
| unknown/undefined | โ
Active (safe default) |
Non-DeepSeek models get PI's default context management โ no overhead, no interference.
### Optional tuning
The context engine reads config from the plugin entry:
| Option | Default | Description |
|--------|---------|-------------|
| `prefixLockCount` | 2 | Messages to lock into prefix layer |
| `recentKeepCount` | 8 | Recent messages to keep verbatim in tail |
| `compactRatio` | 0.8 | Context ratio that triggers compaction |
| `archiveDropped` | true | Archive dropped messages to disk |
| `targetModels` | `["deepseek-v4-flash", "deepseek-v4-pro"]` | Model name patterns that activate prefix-stable mode |
#### `prefixLockCount` (default: 2)
Number of messages locked into the prefix layer. These 2 messages (typically the system prompt + first user message) **never change** โ they're the core of DeepSeek's cache hit. Too large wastes context space; too small doesn't lock enough prefix. 2 is usually sufficient.
#### `recentKeepCount` (default: 8)
Number of recent messages kept verbatim in the tail layer. These messages are **not summarized** during compaction. 8 means your last 4 conversation turns retain full detail. Too large โ context bloats quickly; too small โ recent memory gets fuzzy.
#### `compactRatio` (default: 0.8)
Triggers compaction when token usage reaches 80% of the context window. 0.8 is a sweet spot โ too early (0.5) wastes space; too late (0.95) risks PI force-truncating before compaction finishes.
#### `archiveDropped` (default: true)
Whether messages dropped during compaction are archived to `~/.openclaw/deepseek-harness/archive/` (JSONL format). Useful for post-hoc auditing; disable to save disk I/O.
#### `targetModels`
Which model names trigger prefix-stable mode. This **replaces** (not appends to) the default list โ so if you customize it, include the defaults too, otherwise `deepseek-v4-flash` etc. won't activate.
> **Generally the defaults work fine.** Only tune if: context is tight (lower `prefixLockCount`), you need more precise recent memory (raise `recentKeepCount`), or context overflows frequently (lower `compactRatio`).
## How It Works
1. **First `assemble()` call**: System prompt + first N messages โ locked into Layer 1 (prefix)
2. **Subsequent calls**: New messages โ appended to Layer 2 (tail) only
3. **Compaction**: When context fills โ only Layer 3 changes โ prefix stays stable
4. **DeepSeek sees identical prefix** โ cached tokens at 10% price
## Key Design Decisions
### Why ContextEngine, not AgentHarness?
The AgentHarness interface replaces the **entire** execution loop. That's what Codex needs (it has its own app-server). But we don't want to replace tool execution, auth, streaming, or retries โ we only want to manage context differently.
The ContextEngine interface is **exactly** the right abstraction:
- `assemble()` โ we control what goes into the prompt
- `compact()` โ we control how context is reduced
- PI handles everything else
### Why not a custom API client?
PI's model transport (`params.model`) already:
- Handles auth (API key resolution)
- Supports streaming
- Passes through `reasoning_effort` (DeepSeek's config has `supportsReasoningEffort: true`)
- Reports `promptCache` info (cache hit/miss stats)
Building our own HTTP client would duplicate all of this and lose auth integration.
## Token Economics
| Scenario | Default PI | This Engine |
|----------|-----------|-------------|
| 100K conversation, cache hit rate | ~10% | ~80%+ |
| Cost per turn (V4 Pro, 1M ctx) | ~$0.15 | ~$0.03 |
| Long session (50 turns) | ~$7.50 | ~$1.50 |
*Based on DeepSeek pricing: $0.14/M input, $0.028/M cached.*
## Tests
```bash
npm test
```
## Files
| File | Purpose |
|------|---------|
| `src/types.ts` | Configuration, token estimation |
| `src/context-engine.ts` | ContextEngine implementation (core logic) |
| `src/index.ts` | Plugin entry โ registers context engine |
| `tests/context-engine.test.ts` | Unit tests |
| `openclaw.plugin.json` | Plugin manifest |
## License
MIT
tools
Comments
Sign in to leave a comment