Productivity
agent-memory-ultimate
Production-ready memory system β daily logs, sleep consolidation, SQLite + FTS5, WhatsApp/ChatGPT/VCF importers. Human-inspired architecture.
---
name: agent-memory-ultimate
version: 3.0.1
description: "Cut your agent's token usage by 60-80% β stop dumping entire files into context. Cognitive memory for OpenClaw agents with vector search, knowledge graphs, RAPTOR hierarchy, and automatic consolidation. Store once, recall precisely. All local, zero API costs."
homepage: https://github.com/globalcaos/clawdbot-moltbot-openclaw
repository: https://github.com/globalcaos/clawdbot-moltbot-openclaw
metadata:
openclaw:
emoji: "π§ "
tags:
- memory
- cognitive
- sqlite
- vector-search
- knowledge-graph
- raptor
- embeddings
- token-savings
- consolidation
- recall
- offline
- long-term-memory
- sleep-cycle
---
# Agent Memory Ultimate
**Cut your agent's token usage by 60-80%.** Stop reading entire files into context β store memories once, recall only what's relevant. Vector search, knowledge graphs, automatic decay, and RAPTOR hierarchies give your agent human-like memory that actually scales.
**Zero cloud APIs. Zero costs. Everything runs locally.**
---
## Why Token Savings Matter
Every time your agent reads `MEMORY.md` or a daily log, that's thousands of tokens burned on context. Most of it is irrelevant to the current question.
| Approach | Tokens per recall | Precision |
|----------|-------------------|-----------|
| Read entire MEMORY.md | 3,000-10,000 | ~5% relevant |
| Read daily log + MEMORY.md | 5,000-20,000 | ~10% relevant |
| `mem.py recall "query"` | 200-800 | ~80% relevant |
| `mem.py primed-recall` (context-aware) | 300-1,000 | ~90% relevant |
**On a typical agent doing 50 recalls/day:**
- Old way: ~500K tokens/day on memory reads
- With v3: ~30K tokens/day
- **Savings: ~470K tokens/day** (~$7-14/day on API, or significant Claude Max headroom)
The agent gets *better* answers with *less* context pollution.
---
## What's New in v3
v2 was markdown files + FTS5 keyword search. v3 adds a **full cognitive architecture:**
| Feature | v2 | v3 |
|---------|----|----|
| Storage | Markdown files | SQLite + sqlite-vec embeddings |
| Search | FTS5 keyword only | Hybrid (vector + keyword + BM25) |
| Recall | Read entire files | Precise snippets via 6 strategies |
| Associations | None | Knowledge graph with multi-hop traversal |
| Hierarchy | Flat files | RAPTOR tree (L0βL1βL2βL3 abstraction) |
| Decay | Manual cleanup | Automatic strength decay + half-life |
| Consolidation | Cron text prompt | Clustering + merging + hierarchy rebuild |
| Context priming | None | Spreading activation from conversation |
| Sharing | None | Cross-agent with sensitivity gates |
| Token cost | High (full file reads) | Low (targeted retrieval) |
**v2 still works.** Markdown files, daily logs, MEMORY.md β all unchanged. v3 adds a parallel cognitive layer that dramatically reduces token waste.
---
## Architecture
```
βββββββββββββββββββββββββββββββββββββββ
β Agent Session β
β (reads MEMORY.md, daily logs) β β v2 (unchanged)
βββββββββββββββββββββββββββββββββββββββ€
β mem.py CLI β β v3 (NEW)
β store / recall / consolidate β
ββββββββββββ¬βββββββββββ¬ββββββββββββββββ€
β sqlite-vecβ FTS5 β Association β
β (vectors) β(keyword)β Graph β
ββββββββββββ΄βββββββββββ΄ββββββββββββββββ€
β memory.db β
β memories, associations, hierarchy β
β shares, embeddings (384-dim) β
βββββββββββββββββββββββββββββββββββββββ
```
- **Embedding model:** `all-MiniLM-L6-v2` (384 dimensions, ~80MB, ONNX)
- **Vector search:** `sqlite-vec` β no Pinecone, no Weaviate, no cloud
- **Database:** `db/memory.db` (one file, portable, backupable)
---
## Quick Start
```bash
# 1. Initialize database
python3 scripts/mem.py init
# 2. Start the local embedding server (port 9999)
bash scripts/start-memory.sh
# 3. Store a memory
python3 scripts/mem.py store "Oscar prefers wired home automation" --type semantic --importance 0.8
# 4. Recall it (200 tokens instead of 10,000)
python3 scripts/mem.py recall "home automation preferences"
```
That's it. Your agent now has precise memory retrieval.
---
## 6 Recall Strategies
Not all queries are the same. Use the right strategy:
| Strategy | Best For | Command | When |
|----------|----------|---------|------|
| **Hybrid** (default) | General recall | `mem.py recall "query"` | Most queries |
| **Vector** | Semantic similarity | `recall --strategy vector` | Fuzzy, conceptual |
| **Keyword** | Exact terms, IDs | `recall --strategy keyword` | File names, codes |
| **Adaptive** | Auto-selects detail | `mem.py recall-adaptive "query"` | Exploratory |
| **Graph** | Follow connections | `mem.py recall-assoc "query" --hops 2` | Related concepts |
| **Primed** | Context-aware | `mem.py primed-recall "q" --context "..."` | Mid-conversation |
**Primed recall** is the killer feature β pass the last 2-3 user messages as context and results are biased toward what's conversationally relevant. This is how human memory works.
---
## CLI Reference
### Core Commands
| Command | Description |
|---------|-------------|
| `mem.py init` | Create database schema |
| `mem.py migrate` | Import existing documents from jarvis.db |
| `mem.py store <content> [--type TYPE] [--source SRC] [--importance N]` | Store a memory |
| `mem.py recall <query> [--strategy hybrid\|vector\|keyword] [--limit N]` | Search memories |
| `mem.py forget <id\|--query QUERY>` | Soft-delete (strength β 0) |
| `mem.py hard-delete <id>` | Permanently remove |
| `mem.py stats` | Database statistics |
**Memory types:** `episodic` (events), `semantic` (facts), `procedural` (how-to), `preference` (user preferences)
### Knowledge Graph
| Command | Description |
|---------|-------------|
| `mem.py associate <src_id> <tgt_id> [--type TYPE] [--weight N]` | Link two memories |
| `mem.py links <memory_id>` | Show all associations |
| `mem.py recall-assoc <query> [--hops N] [--limit N]` | Multi-hop traversal |
| `mem.py graph-stats` | Graph statistics |
**Edge types:** `related`, `caused_by`, `part_of`, `contradicts`, `temporal`, `supports`
### RAPTOR Hierarchy
| Command | Description |
|---------|-------------|
| `mem.py build-hierarchy [--levels N]` | Build abstraction tree |
| `mem.py recall-adaptive <query> [--detail auto\|broad\|specific\|0-3]` | Recall at right abstraction level |
| `mem.py hierarchy-stats` | Show hierarchy structure |
Broad queries ("what do I know about AI?") match high-level summaries. Specific queries ("sqlite-vec version") match leaf nodes. **Adaptive auto-selects.**
### Spreading Activation
| Command | Description |
|---------|-------------|
| `mem.py primed-recall <query> [--context 'text1' 'text2'] [--limit N]` | Context-primed recall |
### Cross-Agent Sharing
| Command | Description |
|---------|-------------|
| `mem.py share <memory_id> --with <agent> [--sensitivity N]` | Share a memory |
| `mem.py shared [--from AGENT] [--to AGENT]` | List shares |
| `mem.py revoke <share_id> \| --memory <id>` | Revoke access |
**Sensitivity levels:** 0 (public) β 5 (top secret). Default gate: β€ 3. Both agents must consent.
### Maintenance
| Command | Description |
|---------|-------------|
| `mem.py consolidate [--days N] [--decay-only]` | Full consolidation cycle |
Runs: decay β cluster β merge duplicates β rebuild hierarchy.
---
## Daily Cycle
### Wake Up
```markdown
1. Read SOUL.md, USER.md (identity β small, always load)
2. Read today's daily log (recent context)
3. Use `mem.py primed-recall "session start"` for relevant memories
β Gets 200-800 tokens of precisely relevant context
β Instead of 10,000+ tokens from reading MEMORY.md cover-to-cover
```
### During Day
```bash
# Store important facts as they come up
mem.py store "Client meeting moved to March 1" --type episodic --importance 0.7
# Before answering from memory β recall, don't read files
mem.py recall "client meeting schedule"
# Link related memories when you notice connections
mem.py associate 42 87 --type related
```
### Sleep Cycle (Nightly Consolidation)
```json
{
"schedule": { "kind": "cron", "expr": "0 3 * * *", "tz": "America/Los_Angeles" },
"payload": {
"kind": "agentTurn",
"message": "Run memory consolidation: python3 scripts/mem.py consolidate --days 7"
},
"sessionTarget": "isolated"
}
```
**What consolidation does:**
1. **Decay** β Unaccessed memories lose strength (half-life: 30 days)
2. **Cluster** β Groups similar memories
3. **Merge** β Combines near-duplicates (saves storage + tokens)
4. **Rebuild hierarchy** β Updates RAPTOR tree for better adaptive recall
---
## File Structure
```
workspace/
βββ MEMORY.md # Long-term curated memory (v2, still works)
βββ SOUL.md # Identity & personality
βββ USER.md # Human profile
βββ db/
β βββ agent.db # Contacts/history (v2)
β βββ memory.db # Cognitive memory (v3)
βββ bank/
β βββ entities/ # People profiles
β βββ contacts.md # Quick contact reference
β βββ opinions.md # Preferences, beliefs
βββ memory/
βββ YYYY-MM-DD.md # Daily logs
βββ projects/ # Project notes
βββ knowledge/ # Topic docs
```
---
## Comparison: Why This Over Alternatives
| Feature | Read files (default) | Basic RAG | Agent Memory Ultimate v3 |
|---------|---------------------|-----------|--------------------------|
| Token cost per recall | 3K-20K | 500-2K | 200-800 |
| Precision | ~5-10% | ~50% | ~80-90% |
| Associations | β | β | β
Knowledge graph |
| Abstraction levels | β | β | β
RAPTOR hierarchy |
| Context priming | β | β | β
Spreading activation |
| Automatic decay | β | β | β
Configurable half-life |
| Cross-agent sharing | β | β | β
With sensitivity gates |
| Cloud dependency | None | Usually | **None β fully local** |
| Setup complexity | Zero | High | `mem.py init` + one script |
---
## Data Sources (v2 Importers)
| Source | Command |
|--------|-
... (truncated)
productivity
By
Comments
Sign in to leave a comment