Agentic Memory

By YaoS-Code

Production-grade personalized memory system for AI agents. PostgreSQL + pgvector + bge-m3. OpenClaw plugin.
# Agentic Memory

**Production-grade personalized memory system for AI agents.**

Give your AI agent persistent, searchable, evolving memory, powered by PostgreSQL + pgvector + bge-m3. Built as an [OpenClaw](https://github.com/openclaw/openclaw) plugin that fully replaces the built-in SQLite memory with a unified PostgreSQL backend.

```
User: "Do you remember what we discussed about the deployment last week?"

Agent: *searches 44 conversation memories + 118 workspace file chunks*
Agent: "Yes: last Tuesday you decided to move from Vercel to Cloudflare Workers..."
```

## Why This Exists

AI agents forget everything between sessions. Most "memory" solutions are either:
- **Too simple**: just append to a text file
- **Too complex**: require a PhD in vector databases
- **Not unified**: workspace files and conversation history live in separate silos

Agentic Memory solves this with a single PostgreSQL backend that handles:

| Layer | What It Stores | How It's Searched |
|-------|---------------|-------------------|
| **Vector memories** | Conversations, decisions, insights | pgvector HNSW (semantic) |
| **Structured facts** | User preferences, contacts, config | Key-value lookup |
| **Workspace index** | SKILL.md, AGENTS.md, docs/*.md | Hybrid vector + FTS |
| **File attachments** | Images, PDFs, documents | MinIO + metadata search |
| **Access log** | What was recalled and when | Time-decay scoring |

One `memory_search` call queries everything. No need to know which backend holds the answer.

## Architecture

```
┌─────────────────────────────────────────────────────────────┐
│                     AI Agent (OpenClaw)                     │
│                                                             │
│   memory_search tool          exec curl tool                │
│        │                           │                        │
│        ▼                           ▼                        │
│  ┌─────────────┐         ┌──────────────────┐               │
│  │ memory-api  │────────▶│  Memory Service  │               │
│  │  plugin     │  curl   │  FastAPI :18800  │               │
│  │             │         │                  │               │
│  │ Registers:  │         │  /search         │  Hybrid       │
│  │ • memory_   │         │  /store          │  vector+FTS   │
│  │   search    │         │  /facts          │               │
│  │ • memory_   │         │  /recall         │               │
│  │   get       │         │  /workspace/*    │               │
│  │ • runtime   │         │  /compact        │               │
│  └─────────────┘         │  /extract        │               │
│                          │  /v1/embeddings  │               │
│                          └────────┬─────────┘               │
│                                   │                         │
│                    ┌──────────────┼──────────────┐          │
│                    ▼              ▼              ▼          │
│              ┌──────────┐  ┌──────────┐  ┌──────────┐       │
│              │PostgreSQL│  │  Redis   │  │  MinIO   │       │
│              │+ pgvector│  │  cache   │  │  files   │       │
│              │+ HNSW    │  │  dedup   │  │          │       │
│              │+ tsvector│  │          │  │          │       │
│              └──────────┘  └──────────┘  └──────────┘       │
│                                                             │
│              Embedding: BAAI/bge-m3 (1024-dim, local)       │
└─────────────────────────────────────────────────────────────┘
```

## Quick Start

### 1. Start infrastructure

```bash
docker compose up -d
```

This starts PostgreSQL (with pgvector), Redis, and MinIO.

### 2. Install Memory Service

```bash
cd memory-service
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
```

### 3. Run the service

```bash
uvicorn main:app --host 127.0.0.1 --port 18800
```

On first start, bge-m3 (568MB) downloads automatically. After that, startup takes ~3 seconds.

### 4. Verify

```bash
# Health check
curl http://localhost:18800/health
# → {"status":"ok","pg":true,"redis":true,"minio":true,"embedding_model_loaded":true}

# Store a memory
curl -X POST http://localhost:18800/store \
  -H "Content-Type: application/json" \
  -d '{"content": "User prefers dark mode and vim keybindings", "category": "preference"}'

# Search memories
curl -X POST http://localhost:18800/search \
  -H "Content-Type: application/json" \
  -d '{"query": "editor preferences", "max_results": 5}'
```

### 5. Install OpenClaw plugin

```bash
# Copy plugin to OpenClaw plugins directory
cp -r openclaw-plugin/memory-api ~/.openclaw/plugins/

# Symlink OpenClaw SDK (required for plugin imports)
ln -s $(npm root -g)/openclaw ~/.openclaw/plugins/memory-api/node_modules/openclaw

# Copy example config
cp config/openclaw.json.example ~/.openclaw/openclaw.json
# Edit with your settings

# Restart gateway
openclaw gateway restart

# Verify
openclaw memory status --deep
```

## How It Works: Step by Step

### Memory Storage (5 tiers)

```
User says something → Agent decides what's worth remembering
                           │
                    ┌──────┴──────┐
                    │  Tier Check │
                    └──────┬──────┘
                           │
            ┌──────────────┼──────────────┐
            ▼              ▼              ▼
    ┌──────────────┐ ┌──────────┐ ┌──────────────┐
    │    vector    │ │   fact   │ │    cache     │
    │              │ │          │ │              │
    │ Decisions,   │ │ Contacts,│ │ Conversation │
    │ insights,    │ │ prefs,   │ │ highlights   │
    │ project ctx  │ │ schedule │ │ (4hr TTL)    │
    │              │ │          │ │              │
    │ → pgvector   │ │ → facts  │ │ → Redis      │
    │   HNSW index │ │   table  │ │              │
    └──────────────┘ └──────────┘ └──────────────┘
```

**Vector memories** get embedded with bge-m3 (1024-dim) and stored in PostgreSQL with an HNSW index for fast approximate nearest-neighbor search.

**Facts** are structured key-value pairs (e.g., `domain=preference, key=timezone, value="America/Vancouver"`). Exact lookup, no embedding needed.

**Cache** is for short-lived conversation context. Stored in Redis with a 4-hour TTL. Deduplication via content hash prevents storing the same thing twice within 5 minutes.

### Memory Retrieval

When `memory_search("deployment strategy")` is called:

1. **Embed the query** with bge-m3 (query mode)
2. **Vector search**: pgvector HNSW finds the top-N semantically similar memories
3. **FTS search**: PostgreSQL tsvector finds keyword matches
4. **Workspace search**: the same hybrid search across indexed workspace files
5. **Merge**: Reciprocal Rank Fusion (RRF) combines results from all sources
6. **Time decay**: older memories are scored lower (configurable half-life per category)
7. **Return**: top results with path, line numbers, score, and snippet
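The RRF merge step can be sketched in a few lines. The constant `k = 60` is the conventional choice from the RRF literature; whether the service weights its sources differently is not stated, so treat this as a sketch:

```python
def rrf_merge(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Reciprocal Rank Fusion: score each doc by the sum of
    1 / (k + rank) over every ranked list it appears in,
    then sort by total score, best first."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# A memory ranked well by both the vector and FTS backends beats
# one that only a single backend surfaced:
merged = rrf_merge([
    ["mem_7", "mem_2", "mem_9"],   # vector ranking
    ["mem_2", "mem_4", "mem_7"],   # FTS ranking
])
```

RRF needs only ranks, not raw scores, which is what makes it a good fit for fusing cosine distances with FTS relevance scores that live on different scales.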

### Memory Decay

Memories fade over time using exponential decay:

```
score = raw_score × 2^(-age_days / half_life)
```

| Category | Half-life | Meaning |
|----------|-----------|---------|
| conversation_highlight | 14 days | Recent chat context fades fast |
| general | 30 days | General knowledge |
| project | 45 days | Project-specific context |
| insight | 60 days | Learned patterns |
| decision | 90 days | Important decisions persist |
| skill | 180 days | Skills/abilities last longest |
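Applied in code, the decay is a one-liner; the half-life values below mirror the table, though the dict and function names here are illustrative, not the service's actual config keys:

```python
# Half-life per category, in days (values from the table above).
HALF_LIFE_DAYS = {
    "conversation_highlight": 14,
    "general": 30,
    "project": 45,
    "insight": 60,
    "decision": 90,
    "skill": 180,
}

def decayed_score(raw_score: float, age_days: float, category: str) -> float:
    """Exponential time decay: a memory's score halves once per half-life."""
    half_life = HALF_LIFE_DAYS.get(category, 30)
    return raw_score * 2 ** (-age_days / half_life)
```

For example, a 30-day-old "general" memory keeps exactly half its raw score, while a 30-day-old "decision" still keeps about 79% of it.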

### Workspace File Indexing

The memory service indexes your OpenClaw workspace markdown files into PostgreSQL:

```
~/.openclaw/workspace/
├── AGENTS.md          ─┐
├── IDENTITY.md         │  Scanned, chunked,
├── SOUL.md             │  embedded with bge-m3,
├── memory/             │  stored in pgvector
│   ├── 2026-03-27.md   │
│   └── 2026-03-28.md  ─┘
└── skills/
    └── web-search/
        └── SKILL.md   ─── Also indexed
```

**Chunking strategy**: Markdown files are split into chunks of ~512 tokens with 50-token overlap. Each chunk gets a bge-m3 embedding and is stored with its file path + line range for precise citation.
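A minimal sketch of that split-with-overlap, approximating tokens as whitespace-separated words (the service presumably counts real bge-m3 tokens instead):

```python
def chunk_words(words: list[str], size: int = 512,
                overlap: int = 50) -> list[list[str]]:
    """Split a token list into windows of `size`, each sharing
    `overlap` tokens with the previous window so context at a
    chunk boundary is never lost entirely."""
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(words[start:start + size])
        if start + size >= len(words):
            break  # last window already reached the end
    return chunks
```

Storing each chunk with its file path and line range is what lets search results cite an exact location instead of a whole file.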

**Incremental sync**: Only changed files are re-indexed (hash comparison). Call `POST /workspace/sync` after editing workspace files, or `POST /workspace/sync {"force": true}` for a full re-index.
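The hash comparison behind incremental sync can be sketched like this; the function names and the in-memory `index` dict are illustrative (the service keeps its hashes in PostgreSQL):

```python
import hashlib
import tempfile
from pathlib import Path

def file_digest(path: Path) -> str:
    """Content hash used to detect edits between sync runs."""
    return hashlib.sha256(path.read_bytes()).hexdigest()

def files_to_reindex(paths: list[Path], stored: dict[str, str],
                     force: bool = False) -> list[Path]:
    """Return only files whose hash differs from the stored index
    (or every file when force=True), updating `stored` in place."""
    changed = []
    for path in paths:
        digest = file_digest(path)
        if force or stored.get(str(path)) != digest:
            changed.append(path)
            stored[str(path)] = digest
    return changed

# Demo: the first sync indexes everything, a second sync with no
# edits indexes nothing, and editing one file re-indexes just it.
workspace = Path(tempfile.mkdtemp())
a, b = workspace / "AGENTS.md", workspace / "SOUL.md"
a.write_text("# Agents\n")
b.write_text("# Soul\n")
index: dict[str, str] = {}
first = files_to_reindex([a, b], index)
second = files_to_reindex([a, b], index)
a.write_text("# Agents (edited)\n")
third = files_to_reindex([a, b], index)
```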

### Auto-Extraction (Hooks)

The `auto-extract` hook runs at session boundaries (new session, reset, pre-compaction) to automatically extract durable memories from the conversation:

```
Session about to compact
    │
    ▼
auto-extract hook fires
    │
    ├─ Reads last 30 messages from session transcript
    ├─ Calls POST /extract with {messages, auto_store: true}
    │       │
    │       ▼
    │   Claude Haiku analyzes messages
    │   Extracts: [
    │     {tier: "fact", category: "preference", content: "User prefers tabs over spaces"},
    │     {tier: "vector", category: "decision", content: "Decided to use PostgreSQL over MongoDB"},
    │   ]
    │       │
    │       ▼
    │   Each extracted memory is stored via /store or /facts
    │
    └─ Session compacts normally
```
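The last step of that flow, routing each extracted item to the right endpoint, reduces to a tier switch. A sketch, with the endpoint mapping inferred from the flow above (`storage_endpoint` is an illustrative name, not the plugin's API):

```python
def storage_endpoint(item: dict) -> str:
    """Route an extracted memory: structured facts go to /facts,
    everything else gets embedded via /store."""
    return "/facts" if item.get("tier") == "fact" else "/store"

extracted = [
    {"tier": "fact", "category": "preference",
     "content": "User prefers tabs over spaces"},
    {"tier": "vector", "category": "decision",
     "content": "Decided to use PostgreSQL over MongoDB"},
]
targets = [storage_endpoint(item) for item in extracted]
```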

### Auto-Compaction

When conversations get long, the `/compact` endpoint summarizes them:

```bash
curl -X POST http://localhost:18800/compact \
  -H "Content-Type: application/json" \
  -d '{"messages": [...], "force": true}'
```

Returns a structured summary with sections:
- **Context**: what was being discussed
- **Key Decisions**: what was decided
- **Action Items**: what needs to be done
- **Important Details**: technical specifics
- **Current State**: where things stand
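Consuming that summary downstream might look like the sketch below, which splits the markdown into a `{section title: body}` dict. The assumption that each section starts with a `## ` heading is mine; the endpoint's exact output format may differ:

```python
def split_summary(markdown: str) -> dict[str, str]:
    """Split a compaction summary into {section title: body},
    assuming each section opens with a '## ' heading."""
    sections: dict[str, str] = {}
    title = None
    for line in markdown.splitlines():
        if line.startswith("## "):
            title = line[3:].strip()
            sections[title] = ""
        elif title is not None:
            sections[title] += line + "\n"
    return {k: v.strip() for k, v in sections.items()}

summary = """## Context
Migrating the deploy target.

## Key Decisions
Move from Vercel to Cloudflare Workers.
"""
parsed = split_summary(summary)
```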

## API Reference

### Core Endpoints

| Method | Path | Description |
|--------|------|-------------|
| GET | `/health` | Service health + component status |
| POST | `/store` | Store a memory (auto-classifies tier) |
| POST | `/search` | Semantic + FTS hybrid search |
| POST | `/recall` | Context recall (returns relevant memories for a topic) |
| POST | `/facts` | Store/query structured facts |
| GET | `/facts` | List all active facts |
| POST | `/compact` | Summarize a long conversation into a structured summary |

... (truncated)