Tools
Hookclaw
OpenClaw Memory RAG Plugin โ auto-hydrates prompts with relevant memories via before_agent_start hook
Install
openclaw plugins install --link
Configuration Example
{
"plugins": {
"entries": {
"hookclaw": {
"enabled": true,
"config": {
"maxResults": 3,
"minScore": 0.5,
"maxContextChars": 2000
}
}
}
}
}
README
# HookClaw โ OpenClaw Memory RAG Plugin
Automatically injects relevant memories into every prompt via the `before_agent_start` hook. No more context loss after compaction.
## Why?
OpenClaw agents lose context after compaction. The built-in `memory-core` plugin indexes your conversations, but the agent only searches memory when it *decides* to โ which means important context silently disappears when the context window fills up.
HookClaw fixes this by intercepting every prompt *before* the model starts reasoning. It embeds the prompt, searches the memory vector index, and prepends the top-k relevant chunks as context. The model sees relevant memories alongside the user's message without any agent-side tool calls.
**Result:** Your agent remembers what you discussed yesterday, last week, or last month โ automatically, on every message.
## How It Works
1. User sends a message (Telegram, WhatsApp, etc.)
2. OpenClaw fires `before_agent_start` before the model processes the prompt
3. HookClaw embeds the prompt via the configured embedding provider (e.g. Gemini)
4. The embedding is searched against the memory vector index (SQLite + sqlite-vec)
5. Top-k results above the similarity threshold are formatted as XML context
6. Context is returned via `{ prependContext }` โ OpenClaw prepends it to the prompt
7. The model sees: `[relevant memories] + [user's message]`
Typical latency: **150-350ms** (embedding API is the bottleneck; SQLite vector search is <20ms).
## Prerequisites
- **OpenClaw** v2026.2.9 or later (requires plugin SDK with `before_agent_start` hook support)
- **Node.js** 20+ (uses ES modules, `node:test` runner)
- **memory-core** plugin enabled and configured โ HookClaw searches the index that memory-core builds. Without memory-core, there's nothing to search.
Verify memory-core is active and has indexed content:
```bash
openclaw memory status --json
# Look for: "files": N, "chunks": N (both should be > 0)
```
If `chunks: 0`, you need to build the index first. Memory-core indexes files in your workspace's `memory/` directory. See [OpenClaw memory docs](https://docs.openclaw.ai/cli/memory) for setup.
## Installation
```bash
# Clone the plugin
git clone https://github.com/jduar005/hookclaw.git ~/hookclaw
# Install as a linked plugin (symlink โ enables live editing)
openclaw plugins install --link ~/hookclaw
# Restart the gateway to load the plugin
systemctl --user restart openclaw-gateway # Linux (systemd)
# โ or โ
launchctl kickstart -k gui/$(id -u)/openclaw-gateway # macOS (launchd)
# โ or โ
openclaw gateway restart # If running manually / Docker
```
### Verify installation
After restart, check the gateway logs for the registration message:
```bash
journalctl --user -u openclaw-gateway --since "1 min ago" | grep hookclaw
```
Expected output:
```
hookclaw: registered before_agent_start hook (maxResults=3, minScore=0.5, timeout=2000ms, format=xml)
```
You can also verify via the CLI:
```bash
openclaw plugins list
# Should show: HookClaw Memory RAG | hookclaw | loaded | ~/hookclaw/index.js | 1.1.0
```
## Configuration
All settings are optional. HookClaw works out of the box with sensible defaults. To override, add a `config` block to the plugin entry in `~/.openclaw/openclaw.json`:
```json
{
"plugins": {
"entries": {
"hookclaw": {
"enabled": true,
"config": {
"maxResults": 3,
"minScore": 0.5,
"maxContextChars": 2000
}
}
}
}
}
```
This merges with your existing `openclaw.json` โ you only need to add the `hookclaw` key inside `plugins.entries`. Any keys you omit use the defaults below.
| Option | Default | Description |
|--------|---------|-------------|
| `maxResults` | 3 | Max memory chunks to inject per prompt |
| `minScore` | 0.5 | Minimum similarity score threshold (0-1) |
| `maxContextChars` | 2000 | Max total characters of injected context |
| `timeoutMs` | 2000 | Memory search timeout (ms) |
| `logInjections` | true | Log injection/skip events to gateway logs |
| `formatTemplate` | `"xml"` | Context format: `"xml"` or `"markdown"` |
| `skipShortPrompts` | 20 | Skip prompts shorter than N chars (saves embedding calls) |
| `cacheSize` | 20 | Max entries in the prompt dedup LRU cache |
| `cacheTtlMs` | 300000 | Cache TTL in ms (default 5 min) |
| `adaptiveResults` | true | Vary result count based on score quality |
### Glossary
- **Memory index** โ The SQLite database where memory-core stores embedded chunks of your conversation history and memory files. Located at `~/.openclaw/memory/main.sqlite`.
- **Chunk** โ A section of a memory file (typically 15-40 lines) that has been embedded as a vector. Each chunk is independently searchable.
- **Similarity score** โ A 0-1 value indicating how semantically similar a chunk is to the current prompt. Higher = more relevant. Produced by comparing embedding vectors.
- **Embedding provider** โ The API used to convert text into vectors (e.g. Gemini `embedding-001`, OpenAI `text-embedding-3-small`). Configured in memory-core, not HookClaw.
## Tuning Guide
The defaults are tuned for precision over recall. Here's how to adjust for your setup.
### minScore โ the most important knob
This controls what counts as "relevant." Gemini embedding similarity scores typically range from 0.35 (noise) to 0.75+ (strong match). Setting this too low floods the model with irrelevant context; too high and useful memories get filtered out.
| minScore | Behavior | Use when... |
|----------|----------|-------------|
| 0.30 | Firehose โ almost everything matches | Never recommended; even "hello" scores 0.40+ |
| 0.45 | Loose โ some noise gets through | Large diverse memory index, want broad recall |
| **0.50** | **Balanced โ default** | **Most setups; good precision/recall tradeoff** |
| 0.55 | Tight โ only strong matches | Small focused memory index, want surgical precision |
| 0.65+ | Very strict โ few injections | Only want near-exact topic matches |
**How to calibrate:** Enable `logInjections`, use your agent normally for a day, then check logs. If you see frequent `no relevant memories found` on prompts that *should* have matched, lower the threshold. If you see injections on generic prompts, raise it.
### maxResults โ less is more
Each injected chunk consumes model context. More chunks = more distraction potential. In practice, 2-3 highly relevant chunks outperform 5 mediocre ones.
| maxResults | Context cost | Best for... |
|------------|-------------|-------------|
| 1-2 | ~500-1000 chars | Agents with tight context budgets or small memory indexes |
| **3** | **~1500-2000 chars** | **Default; good balance of breadth and focus** |
| 5 | ~3000-4000 chars | Large memory indexes with diverse topics |
### skipShortPrompts โ save embedding API calls
Short prompts ("hi", "ok", "thanks") produce meaningless embeddings. Skip them to save latency and API costs.
| skipShortPrompts | Filters out... |
|------------------|---------------|
| 10 | Single words only ("hi", "ok") |
| **20** | **Short phrases ("hello how are you", "sounds good thanks")** |
| 40 | Most conversational messages |
### maxContextChars โ budget your tokens
Controls total character limit across all injected chunks. Chunks are included in order of relevance score until this limit is reached.
| maxContextChars | Roughly... | Good for... |
|-----------------|-----------|-------------|
| 1000 | ~250 tokens | Very constrained contexts |
| **2000** | **~500 tokens** | **Default; enough for 2-3 meaningful chunks** |
| 4000 | ~1000 tokens | When you need full paragraphs of context |
### Recommended starting configs
**Surgical (small memory, focused agent):**
```json
{ "maxResults": 2, "minScore": 0.55, "maxContextChars": 1500, "skipShortPrompts": 20 }
```
**Balanced (default):**
```json
{ "maxResults": 3, "minScore": 0.50, "maxContextChars": 2000, "skipShortPrompts": 20 }
```
**Broad recall (large memory, general assistant):**
```json
{ "maxResults": 5, "minScore": 0.45, "maxContextChars": 4000, "skipShortPrompts": 15 }
```
### Watching the logs
With `logInjections: true`, every prompt produces a log line:
```
hookclaw: #1 injecting 3 memories (189ms, top score: 0.529) โ context injected
hookclaw: #2 no relevant memories found (193ms) โ searched but nothing passed minScore
hookclaw: #3 skip โ prompt too short (5 chars) โ skipped entirely, no API call
hookclaw: #4 cache hit (0ms) โ same prompt seen recently, reused result
```
OpenClaw's own `agent/embedded` subsystem independently confirms each injection:
```
hooks: prepended context to prompt (1847 chars)
```
If you see the first line but not the second, the hook returned context but OpenClaw didn't apply it โ check your OpenClaw version supports `prependContext` in hook results.
## Context Format
### XML (default)
```xml
<relevant_memories>
<memory source="memory" path="memory/2026-02-12.md" lines="236-258" score="0.749">
Chunk text here...
</memory>
</relevant_memories>
```
### Markdown
```markdown
---
**Relevant Memories:**
> *memory* | `memory/2026-02-12.md` | lines 236-258 | (score: 0.749)
Chunk text here...
---
```
## Testing
```bash
node --test test/*.test.js
```
49 tests covering handler logic, context formatting, prompt cache, and adaptive filtering.
## Error Handling
Every failure mode is non-fatal โ the prompt passes through unmodified:
- **Memory search tool unavailable:** Logged once, all future searches skipped for the session
- **Embedding API timeout:** Caught by `Promise.race` with configurable `timeoutMs`
- **SQLite errors:** Graceful fallback, returns empty results
- **Handler throws:** Caught by OpenClaw hook runner (`catchErrors: true`)
If HookClaw fails, the user's prompt still reaches the model โ just without memory context.
## Troubleshooting
**Plugin doesn't appear in `openclaw plugins list`:**
- Verify `package.json`
... (truncated)
tools
Comments
Sign in to leave a comment