# openclaw-engram
A local-first memory plugin for [OpenClaw](https://github.com/openclaw/openclaw) that gives AI agents persistent, searchable long-term memory across conversations.
Engram uses **LLM-powered extraction** (OpenAI Responses API) to intelligently identify what's worth remembering from each conversation, stores memories as plain **markdown files** on disk, and retrieves relevant context via **[QMD](https://github.com/tobi/qmd)** hybrid search (BM25 + vector + reranking).
## Why Engram?
Most AI memory systems are either too noisy (store everything) or too lossy (store nothing useful). Engram takes a different approach:
- **Signal detection first** -- A fast local regex scan classifies each turn before any API call happens. High-signal turns (corrections, preferences, identity statements) trigger immediate extraction; low-signal turns are batched (see the sketch after this list).
- **Structured extraction** -- An LLM analyzes buffered turns and extracts typed memories (facts, preferences, corrections, entities, decisions, relationships, principles, commitments, moments, skills) with confidence scores.
- **Automatic consolidation** -- Periodic consolidation passes merge duplicates, update entity profiles, refresh the behavioral profile, and expire stale memories.
- **Local-first storage** -- All memories are plain markdown files with YAML frontmatter. No database, no vendor lock-in. Grep them, version them, back them up however you like.
- **Privacy by default** -- Memories never leave your machine unless you choose to sync them. The LLM extraction call is the only external API call.
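To make the first point concrete, here is a minimal sketch of what a local signal scan can look like. The patterns, categories, and matching logic below are hypothetical stand-ins for illustration, not Engram's actual rules:

```typescript
// Hypothetical signal classifier -- Engram's real patterns are internal,
// so treat these regexes as illustrative placeholders.
type Signal = "high" | "low";

const HIGH_SIGNAL_PATTERNS: RegExp[] = [
  /\b(actually|that's wrong|i meant)\b/i,       // corrections
  /\b(i prefer|i like|i hate|always|never)\b/i, // preferences
  /\b(my name is|i am a|i work (at|on))\b/i,    // identity statements
];

function scanTurn(text: string): Signal {
  // Purely local check: no API call, runs in well under 10ms.
  return HIGH_SIGNAL_PATTERNS.some((p) => p.test(text)) ? "high" : "low";
}

scanTurn("Actually, I meant the staging server"); // "high" -> extract now
scanTurn("sounds good, thanks");                  // "low"  -> keep buffering
```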
## Features
### Core Features
- **10 memory categories**: fact, preference, correction, entity, decision, relationship, principle, commitment, moment, skill
- **Confidence tiers**: explicit (0.95-1.0), implied (0.70-0.94), inferred (0.40-0.69), speculative (0.00-0.39) -- see the sketch after this list
- **TTL on speculative memories**: Auto-expire after 30 days if unconfirmed
- **Lineage tracking**: Memories track their parent IDs through consolidation merges and updates
- **Entity profiles**: Accumulates facts about people, projects, tools, and companies into per-entity files, with automatic name normalization and periodic deduplication
- **Behavioral profile**: A living `profile.md` that evolves as the system learns about the user, with automatic cap and pruning to control token usage
- **Identity reflection**: Optional self-reflection that helps the agent improve over sessions
- **Question generation**: Generates 1-3 curiosity questions per extraction to drive deeper engagement
- **Commitment lifecycle**: Tracks promises and deadlines with configurable decay (default 90 days)
- **Auto-consolidation**: IDENTITY.md reflections are automatically summarized when they exceed 8KB
- **Smart buffer**: Configurable trigger logic (signal-based, turn count, or time-based)
- **QMD integration**: Hybrid search with BM25, vector embeddings, and reranking
- **Graceful degradation**: Works without QMD (falls back to direct file reads) and without an API key (retrieval-only mode)
- **Migration tools**: Import memories from Honcho, Supermemory, and context files
- **CLI**: Search, inspect, and manage memories from the command line
- **Agent tools**: `memory_search`, `memory_store`, `memory_profile`, `memory_entities`
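As a worked example of the confidence tiers and the speculative TTL above, the boundaries map to a simple lookup. The helper names here are hypothetical; the ranges and the 30-day rule come from the list:

```typescript
// Hypothetical helpers mirroring the tier ranges and TTL rule listed above.
type Tier = "explicit" | "implied" | "inferred" | "speculative";

function tierFor(confidence: number): Tier {
  if (confidence >= 0.95) return "explicit"; // 0.95-1.0
  if (confidence >= 0.7) return "implied";   // 0.70-0.94
  if (confidence >= 0.4) return "inferred";  // 0.40-0.69
  return "speculative";                      // 0.00-0.39
}

const SPECULATIVE_TTL_MS = 30 * 24 * 60 * 60 * 1000; // 30 days

function isExpired(tier: Tier, created: Date, confirmed: boolean): boolean {
  // Only unconfirmed speculative memories auto-expire.
  if (tier !== "speculative" || confirmed) return false;
  return Date.now() - created.getTime() > SPECULATIVE_TTL_MS;
}
```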
### v1.2.0 Advanced Features
All advanced features are **disabled by default** for gradual adoption. Enable them in your config as needed.
#### Importance Scoring (Zero-LLM)
- **Local heuristic scoring** at extraction time -- no API calls (see the sketch below)
- Five tiers: `critical` (0.9-1.0), `high` (0.7-0.9), `normal` (0.4-0.7), `low` (0.2-0.4), `trivial` (0.0-0.2)
- Scores based on: explicit importance markers, personal info, instructions, emotional content, factual density
- Extracts salient keywords for improved search relevance
- Used for **ranking** (not exclusion) -- all memories are still stored and searchable
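A zero-LLM heuristic of this shape might sum weighted signal matches and clamp the result; the specific markers and weights below are invented for illustration:

```typescript
// Illustrative scoring signals only -- the real weights are internal to Engram.
const SIGNALS: Array<[RegExp, number]> = [
  [/\b(important|remember this|don't forget)\b/i, 0.4],  // explicit markers
  [/\b(my|our) (birthday|address|phone|email)\b/i, 0.3], // personal info
  [/\b(always|never|must|make sure)\b/i, 0.2],           // instructions
  [/!{2,}|\b(love|hate|furious|thrilled)\b/i, 0.1],      // emotional content
];

function importanceScore(text: string): number {
  const raw = SIGNALS.reduce((sum, [p, w]) => sum + (p.test(text) ? w : 0), 0);
  return Math.min(1, raw); // clamp to [0, 1]; tiers are cut from this score
}
```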
#### Access Tracking
- Tracks `accessCount` and `lastAccessed` for each memory
- Batched updates during consolidation (zero retrieval latency impact)
- Enables "working set" prioritization โ frequently accessed memories surface higher
- CLI: `openclaw engram access` to view most accessed memories
#### Recency Boosting
- Recent memories ranked higher in search results
- Configurable weight (0-1, default 0.2)
- Exponential decay with 7-day half-life (sketched below)
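With a 7-day half-life the boost is a plain exponential. Here is a sketch of how it might blend with the base relevance score at the configured weight (the blend formula itself is an assumption):

```typescript
// Recency boost with a 7-day half-life, blended at a configurable weight.
const HALF_LIFE_DAYS = 7;

function recencyBoost(updated: Date, now = new Date()): number {
  const ageDays = (now.getTime() - updated.getTime()) / 86_400_000;
  return Math.pow(0.5, ageDays / HALF_LIFE_DAYS); // 1.0 today, 0.5 after a week
}

function blendedScore(relevance: number, updated: Date, weight = 0.2): number {
  return (1 - weight) * relevance + weight * recencyBoost(updated);
}
```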
#### Automatic Chunking
- Sentence-boundary splitting for long memories (>150 tokens)
- Target ~200 tokens per chunk with 2-sentence overlap
- Each chunk maintains `parentId` and `chunkIndex` for context reconstruction
- Preserves coherent thoughts -- never splits mid-sentence (see the sketch below)
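A naive version of sentence-boundary chunking with a two-sentence overlap looks roughly like this (the sentence splitter and the 4-characters-per-token estimate are both simplifying assumptions):

```typescript
// Sketch: split on sentence boundaries, pack ~200 "tokens" per chunk,
// and carry the last two sentences into the next chunk as overlap.
const TARGET_TOKENS = 200;
const OVERLAP_SENTENCES = 2;
const approxTokens = (s: string) => Math.ceil(s.length / 4); // rough estimate

function chunkMemory(text: string): string[] {
  const sentences = text.match(/[^.!?]+[.!?]+(\s+|$)/g) ?? [text];
  const chunks: string[] = [];
  let current: string[] = [];
  let tokens = 0;
  for (const sentence of sentences) {
    if (tokens + approxTokens(sentence) > TARGET_TOKENS && current.length > 0) {
      chunks.push(current.join("").trim());
      current = current.slice(-OVERLAP_SENTENCES); // 2-sentence overlap
      tokens = current.reduce((n, s) => n + approxTokens(s), 0);
    }
    current.push(sentence);
    tokens += approxTokens(sentence);
  }
  if (current.length > 0) chunks.push(current.join("").trim());
  return chunks; // each chunk would then get parentId + chunkIndex frontmatter
}
```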
#### Contradiction Detection
- QMD similarity search finds candidate conflicts (fast, cheap)
- LLM verification confirms actual contradictions (prevents false positives)
- Auto-resolve when confidence > 0.9
- Full audit trail: old memory marked `status: superseded` with `supersededBy` link
- Nothing is deleted -- superseded memories remain searchable when queried explicitly (see the sketch below)
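The audit trail might be applied like this (field names are taken from the list above; the resolve helper itself is hypothetical):

```typescript
// Hypothetical shape of the supersession audit trail described above.
interface MemoryRecord {
  id: string;
  status: "active" | "superseded";
  supersededBy?: string;
}

function resolveContradiction(
  oldMem: MemoryRecord,
  newMem: MemoryRecord,
  llmConfidence: number,
): boolean {
  if (llmConfidence <= 0.9) return false; // below threshold: leave both active
  oldMem.status = "superseded";           // kept on disk, still searchable
  oldMem.supersededBy = newMem.id;        // audit trail back to the winner
  return true;
}
```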
#### Memory Linking (Knowledge Graph)
- Typed relationships: `follows`, `references`, `contradicts`, `supports`, `related`
- LLM suggests links during extraction based on semantic connections
- Links stored in frontmatter with strength scores (0-1)
- Enables graph traversal between related memories
#### Conversation Threading
- Auto-detect thread boundaries (session change or 30-minute gap; sketched below)
- Auto-generate thread titles from top TF-IDF keywords
- Group memories into conversation threads for context reconstruction
- CLI: `openclaw engram threads` to view threads
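Boundary detection is simple enough to sketch directly; the `Turn` shape here is assumed:

```typescript
// A new thread starts on a session change or after 30 minutes of silence.
const GAP_MS = 30 * 60 * 1000;

interface Turn {
  sessionId: string;
  timestamp: number; // Unix ms
}

function startsNewThread(prev: Turn | undefined, next: Turn): boolean {
  if (!prev) return true;                             // first turn ever
  if (prev.sessionId !== next.sessionId) return true; // session change
  return next.timestamp - prev.timestamp > GAP_MS;    // 30-minute gap
}
```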
#### Memory Summarization
- Triggered when memory count exceeds threshold (default 1000)
- Compresses old, low-importance, unprotected memories into summaries
- **Archive, not delete** -- source memories marked `status: archived`, still searchable
- Protected: recent memories, high-importance memories, entities, and commitments/preferences/decisions (see the sketch below)
- Summaries stored in `summaries/` directory
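A protection predicate along these lines would guard the pass; the 30-day window and 0.7 importance floor are illustrative guesses, not Engram's real thresholds:

```typescript
// Anything matching these rules is never compressed into a summary.
interface Memory {
  category: string;
  importance: number; // 0-1 heuristic score
  updated: Date;
}

const PROTECTED_CATEGORIES = new Set([
  "entity", "commitment", "preference", "decision",
]);
const RECENT_DAYS = 30;      // assumed "recent" window
const HIGH_IMPORTANCE = 0.7; // assumed "high importance" floor

function isProtected(m: Memory, now = new Date()): boolean {
  const ageDays = (now.getTime() - m.updated.getTime()) / 86_400_000;
  return (
    ageDays < RECENT_DAYS ||
    m.importance >= HIGH_IMPORTANCE ||
    PROTECTED_CATEGORIES.has(m.category)
  );
}
```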
#### Topic Extraction
- TF-IDF analysis of the entire memory corpus (sketched below)
- Extracts top N topics (default 50) during consolidation
- Stored in `state/topics.json`
- CLI: `openclaw engram topics` to view extracted topics
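A minimal TF-IDF pass over the corpus could look like this; real tokenization and stop-word handling in Engram may well differ:

```typescript
// Sum tf*idf per term across the whole corpus and keep the top N as topics.
function topTopics(docs: string[], n = 50): string[] {
  const tokenize = (d: string) => d.toLowerCase().match(/[a-z]{3,}/g) ?? [];

  // Document frequency: how many memories mention each term at least once.
  const df = new Map<string, number>();
  for (const doc of docs) {
    for (const term of new Set(tokenize(doc))) {
      df.set(term, (df.get(term) ?? 0) + 1);
    }
  }

  // Each occurrence adds the term's idf, so scores aggregate tf * idf.
  const score = new Map<string, number>();
  for (const doc of docs) {
    for (const term of tokenize(doc)) {
      const idf = Math.log(docs.length / (df.get(term) ?? 1));
      score.set(term, (score.get(term) ?? 0) + idf);
    }
  }

  return [...score.entries()]
    .sort((a, b) => b[1] - a[1])
    .slice(0, n)
    .map(([term]) => term);
}
```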
## Architecture
```
Conversation turn arrives
|
v
Signal scan (local regex, <10ms, free)
|
v
Append to smart buffer
|
v
Trigger check:
HIGH signal? --> Extract NOW (single LLM call)
Buffer >= N? --> Extract BATCH
Time > T? --> Extract BATCH
else --> Keep buffering
|
v
If extracted: write markdown files to disk
|
v
Every Nth extraction: Consolidation pass
- Merge/dedup memories
- Merge fragmented entity files
- Update entity profiles
- Update behavioral profile (with cap enforcement)
- Clean expired commitments and TTL memories
- Auto-consolidate identity reflections
|
v
Background: qmd update (re-index new files)
```
### Retrieval Flow
```
Agent session starts
|
v
Read profile.md directly (free, instant)
|
v
QMD search memory collection (relevant memories)
|
v
QMD search global collections (workspace context)
|
v
Optionally inject highest-priority open question
|
v
Combine and inject into system prompt
```
## Storage Layout
All memories are stored as markdown files with YAML frontmatter:
```
~/.openclaw/workspace/memory/local/
├── profile.md                      # Living behavioral profile (auto-updated)
├── entities/                       # One markdown file per tracked entity
│   ├── person-jane-doe.md
│   ├── project-my-app.md
│   └── tool-qmd.md
├── facts/                          # Memory entries organized by date
│   └── YYYY-MM-DD/
│       ├── fact-1738789200000-a1b2.md
│       └── preference-1738789200000-c3d4.md
├── corrections/                    # High-weight correction memories
│   └── correction-1738789200000-e5f6.md
├── questions/                      # Generated curiosity questions
│   └── q-m1abc-xy.md
├── threads/                        # Conversation threads (v1.2.0)
│   └── thread-1738789200000-a1b2.json
├── summaries/                      # Memory summaries (v1.2.0)
│   └── summary-1738789200000-a1b2.json
├── config/
│   └── aliases.json                # Entity name aliases
└── state/
    ├── buffer.json                 # Current unbatched turns (survives restarts)
    ├── meta.json                   # Extraction count, timestamps, totals
    └── topics.json                 # Extracted topics (v1.2.0)
```
### Memory File Format
Each memory file uses YAML frontmatter:
```yaml
---
id: fact-1738789200000-a1b2
category: fact
created: 2026-02-05T12:00:00.000Z
updated: 2026-02-05T12:00:00.000Z
source: extraction
confidence: 0.85
confidenceTier: implied
tags: ["tools", "preferences"]
entityRef: tool-qmd
---
QMD supports hybrid search combining BM25 and vector embeddings with reranking.
```
## Installation
### Prerequisites
- [OpenClaw](https://github.com/openclaw/openclaw) gateway
- Node.js 20+
- An OpenAI API key (for extraction; retrieval works without one)
- [QMD](https://github.com/tobi/qmd) (optional, for hybrid search)
### Install
```bash
# Clone into the OpenClaw extensions directory
git clone https://github.com/joshuaswarren/openclaw-engram.git \
~/.openclaw/extensions/openclaw-engram
# Install dependencies and build
cd ~/.openclaw/extensions/openclaw-engram
npm install
npm run build
```
### Enable in OpenClaw
Add to your `openclaw.json`:
```jsonc
{
"plugins": {
"allow": ["openclaw-engram"],
"slots": {
"memory": "openclaw-engram"
},
"entries": {
"openclaw-engram": {
"enabled": true,
"config": {
"openaiApiKey": "${OPENAI_API_KEY}"
}
}
}
}
}
```
### Set Up QMD Collection (Optional)
If you have QMD installed, point a collection at the memory directory by adding this to `~/.config/qmd/index.yml`:
```yaml
openclaw-engram:
path: ~/.openclaw/workspace/memory/local
extensions: [.md]
```
Then index:
```bash
qmd update && qmd embed
```
### Restart the Gateway
... (truncated)