# Agentic Memory
**Production-grade personalized memory system for AI agents.**
Give your AI agent persistent, searchable, evolving memory — powered by PostgreSQL + pgvector + bge-m3. Built as an [OpenClaw](https://github.com/openclaw/openclaw) plugin that fully replaces the built-in SQLite memory with a unified PostgreSQL backend.
```
User: "Do you remember what we discussed about the deployment last week?"
Agent: *searches 44 conversation memories + 118 workspace file chunks*
Agent: "Yes — last Tuesday you decided to move from Vercel to Cloudflare Workers..."
```
## Why This Exists
AI agents forget everything between sessions. Most "memory" solutions are either:
- **Too simple** — just append to a text file
- **Too complex** — require a PhD in vector databases
- **Not unified** — workspace files and conversation history live in separate silos
Agentic Memory solves this with a single PostgreSQL backend that handles:
| Layer | What It Stores | How It's Searched |
|-------|---------------|-------------------|
| **Vector memories** | Conversations, decisions, insights | pgvector HNSW (semantic) |
| **Structured facts** | User preferences, contacts, config | Key-value lookup |
| **Workspace index** | SKILL.md, AGENTS.md, docs/*.md | Hybrid vector + FTS |
| **File attachments** | Images, PDFs, documents | MinIO + metadata search |
| **Access log** | What was recalled and when | Time-decay scoring |
One `memory_search` call queries everything. No need to know which backend holds the answer.
## Architecture
```
                      AI Agent (OpenClaw)

   memory_search tool            exec curl tool
          │                            │
          ▼                            ▼
    ┌─────────────┐          ┌──────────────────────┐
    │ memory-api  │─────────▶│  Memory Service      │
    │ plugin      │   curl   │  FastAPI :18800      │
    │             │          │                      │
    │ Registers:  │          │ /search ── Hybrid    │
    │ • memory_   │          │ /store    vector+FTS │
    │   search    │          │ /facts               │
    │ • memory_   │          │ /recall              │
    │   get       │          │ /workspace/*         │
    │ • runtime   │          │ /compact             │
    └─────────────┘          │ /extract             │
                             │ /v1/embeddings       │
                             └──────────┬───────────┘
                                        │
                 ┌──────────────────────┼──────────────────────┐
                 ▼                      ▼                      ▼
           ┌──────────┐           ┌──────────┐           ┌──────────┐
           │PostgreSQL│           │  Redis   │           │  MinIO   │
           │+ pgvector│           │  cache   │           │  files   │
           │+ HNSW    │           │  dedup   │           │          │
           │+ tsvector│           │          │           │          │
           └──────────┘           └──────────┘           └──────────┘

           Embedding: BAAI/bge-m3 (1024-dim, local)
```
## Quick Start
### 1. Start infrastructure
```bash
docker compose up -d
```
This starts PostgreSQL (with pgvector), Redis, and MinIO.
### 2. Install Memory Service
```bash
cd memory-service
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
```
### 3. Run the service
```bash
uvicorn main:app --host 127.0.0.1 --port 18800
```
On first start, bge-m3 (568MB) downloads automatically. After that, startup takes ~3 seconds.
### 4. Verify
```bash
# Health check
curl http://localhost:18800/health
# → {"status":"ok","pg":true,"redis":true,"minio":true,"embedding_model_loaded":true}
# Store a memory
curl -X POST http://localhost:18800/store \
-H "Content-Type: application/json" \
-d '{"content": "User prefers dark mode and vim keybindings", "category": "preference"}'
# Search memories
curl -X POST http://localhost:18800/search \
-H "Content-Type: application/json" \
-d '{"query": "editor preferences", "max_results": 5}'
```
### 5. Install OpenClaw plugin
```bash
# Copy plugin to OpenClaw plugins directory
cp -r openclaw-plugin/memory-api ~/.openclaw/plugins/
# Symlink OpenClaw SDK (required for plugin imports)
ln -s $(npm root -g)/openclaw ~/.openclaw/plugins/memory-api/node_modules/openclaw
# Copy example config
cp config/openclaw.json.example ~/.openclaw/openclaw.json
# Edit with your settings
# Restart gateway
openclaw gateway restart
# Verify
openclaw memory status --deep
```
## How It Works — Step by Step
### Memory Storage (5 tiers)
```
User says something → Agent decides what's worth remembering
                           │
                    ┌──────┴──────┐
                    │ Tier Check  │
                    └──────┬──────┘
                           │
          ┌────────────────┼────────────────┐
          ▼                ▼                ▼
  ┌──────────────┐   ┌──────────┐   ┌──────────────┐
  │    vector    │   │   fact   │   │    cache     │
  │              │   │          │   │              │
  │ Decisions,   │   │ Contacts,│   │ Conversation │
  │ insights,    │   │ prefs,   │   │ highlights   │
  │ project ctx  │   │ schedule │   │ (4hr TTL)    │
  │              │   │          │   │              │
  │ → pgvector   │   │ → facts  │   │ → Redis      │
  │   HNSW index │   │   table  │   │              │
  └──────────────┘   └──────────┘   └──────────────┘
```
**Vector memories** get embedded with bge-m3 (1024-dim) and stored in PostgreSQL with an HNSW index for fast approximate nearest-neighbor search.
**Facts** are structured key-value pairs (e.g., `domain=preference, key=timezone, value="America/Vancouver"`). Exact lookup, no embedding needed.
**Cache** is for short-lived conversation context. Stored in Redis with a 4-hour TTL. Deduplication via content hash prevents storing the same thing twice within 5 minutes.
### Memory Retrieval
When `memory_search("deployment strategy")` is called:
1. **Embed the query** with bge-m3 (query mode)
2. **Vector search** — pgvector HNSW finds top-N semantically similar memories
3. **FTS search** — PostgreSQL tsvector finds keyword matches
4. **Workspace search** — same hybrid search across indexed workspace files
5. **Merge** — Reciprocal Rank Fusion (RRF) combines results from all sources
6. **Time decay** — older memories are scored lower (configurable half-life per category)
7. **Return** — top results with path, line numbers, score, snippet
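The RRF merge in step 5 can be sketched as follows. This assumes the conventional constant k = 60 from the original RRF paper; the service's actual ranking code may differ.

```python
def rrf_merge(result_lists: list[list[str]], k: int = 60) -> list[str]:
    """Reciprocal Rank Fusion: score(d) = sum over lists of 1 / (k + rank(d))."""
    scores: dict[str, float] = {}
    for results in result_lists:
        for rank, doc_id in enumerate(results, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first
    return sorted(scores, key=lambda d: scores[d], reverse=True)
```

A document that appears in both the vector and FTS result lists accumulates score from each, so agreement between the two searches pushes it to the top even if neither ranked it first.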
### Memory Decay
Memories fade over time using exponential decay:
```
score = raw_score × 2^(-age_days / half_life)
```
| Category | Half-life | Meaning |
|----------|-----------|---------|
| conversation_highlight | 14 days | Recent chat context fades fast |
| general | 30 days | General knowledge |
| project | 45 days | Project-specific context |
| insight | 60 days | Learned patterns |
| decision | 90 days | Important decisions persist |
| skill | 180 days | Skills/abilities last longest |
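Plugging the table into the formula, the decay step might look like this (a sketch; `HALF_LIFE_DAYS` mirrors the table above, and the 30-day fallback for unknown categories is an assumption):

```python
HALF_LIFE_DAYS = {
    "conversation_highlight": 14,
    "general": 30,
    "project": 45,
    "insight": 60,
    "decision": 90,
    "skill": 180,
}

def decayed_score(raw_score: float, age_days: float, category: str) -> float:
    """score = raw_score * 2^(-age_days / half_life)"""
    half_life = HALF_LIFE_DAYS.get(category, 30)  # assumed fallback
    return raw_score * 2.0 ** (-age_days / half_life)
```

At exactly one half-life of age, a memory's score is halved; a 30-day-old `skill` memory still retains ~89% of its raw score, while a 30-day-old `general` memory is down to 50%.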
### Workspace File Indexing
The memory service indexes your OpenClaw workspace markdown files into PostgreSQL:
```
~/.openclaw/workspace/
├── AGENTS.md ─────────┐
├── IDENTITY.md        │  Scanned, chunked,
├── SOUL.md            │  embedded with bge-m3,
├── memory/            │  stored in pgvector
│   ├── 2026-03-27.md  │
│   └── 2026-03-28.md ─┘
└── skills/
    └── web-search/
        └── SKILL.md ◀── Also indexed
```
**Chunking strategy**: Markdown files are split into chunks of ~512 tokens with 50-token overlap. Each chunk gets a bge-m3 embedding and is stored with its file path + line range for precise citation.
**Incremental sync**: Only changed files are re-indexed (hash comparison). Call `POST /workspace/sync` after editing workspace files, or `POST /workspace/sync {"force": true}` for a full re-index.
### Auto-Extraction (Hooks)
The `auto-extract` hook runs at session boundaries (new session, reset, pre-compaction) to automatically extract durable memories from the conversation:
```
Session about to compact
          │
          ▼
auto-extract hook fires
          │
          ├─ Reads last 30 messages from session transcript
          ├─ Calls POST /extract with {messages, auto_store: true}
          │         │
          │         ▼
          │    Claude Haiku analyzes messages
          │    Extracts: [
          │      {tier: "fact",   category: "preference", content: "User prefers tabs over spaces"},
          │      {tier: "vector", category: "decision",   content: "Decided to use PostgreSQL over MongoDB"},
          │    ]
          │         │
          │         ▼
          │    Each extracted memory is stored via /store or /facts
          │
          └─ Session compacts normally
```
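Based on the flow above, the hook's `POST /extract` request body has roughly this shape (illustrative only; the exact schema is not documented here):

```json
{
  "messages": [
    {"role": "user", "content": "Let's use PostgreSQL instead of MongoDB."},
    {"role": "assistant", "content": "Agreed; pgvector gives us vector search in the same database."}
  ],
  "auto_store": true
}
```

With `auto_store` set to true, each extracted memory is written immediately rather than returned for the caller to store.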
### Auto-Compaction
When conversations get long, the `/compact` endpoint summarizes them:
```bash
curl -X POST http://localhost:18800/compact \
-H "Content-Type: application/json" \
-d '{"messages": [...], "force": true}'
```
Returns a structured summary with sections:
- **Context** — what was being discussed
- **Key Decisions** — what was decided
- **Action Items** — what needs to be done
- **Important Details** — technical specifics
- **Current State** — where things stand
## API Reference
### Core Endpoints
| Method | Path | Description |
|--------|------|-------------|
| GET | `/health` | Service health + component status |
| POST | `/store` | Store a memory (auto-classifies tier) |
| POST | `/search` | Semantic + FTS hybrid search |
| POST | `/recall` | Context recall (returns relevant memories for a topic) |
| POST | `/facts` | Store/query structured facts |
| GET | `/facts` | List all active facts |
| POST | `/compact` | Summarize a long conversation into structured sections |
... (truncated)