Memolo

Author: thdeptrai

Memolo: a long-term shared memory system for multi-agent AI. Self-hosted with PostgreSQL, Qdrant, and Ollama/MiniMax. Includes an OpenClaw plugin, a real-time dashboard, and a Node.js SDK.

README

# 🧠 Memolo — Intelligent Memory System for Multi-Agent AI

> Persistent, self-organizing memory for AI agents — atomic fact extraction, knowledge graph, LLM deduplication, and semantic recall across your LAN

## Quick Start

### Prerequisites

- **[Docker Desktop](https://www.docker.com/products/docker-desktop/)** — runs PostgreSQL, Qdrant, Server, Dashboard
- **[Ollama](https://ollama.com)** — runs locally on host for embedding generation (needs GPU)
- **[MiniMax API Key](https://platform.minimax.io)** — cloud LLM for fact extraction & deduplication

### Installation

```bash
# 1. Clone
git clone https://github.com/thdeptrai/memolo.git
cd memolo

# 2. Pull embedding model
ollama pull qwen3-embedding:8b

# 3. Configure
cp .env.example .env
# Edit .env → set these 2 required values:
#   MEMOLO_MASTER_KEY=<random string for API security>
#   MINIMAX_API_KEY=<your MiniMax API key>

# 4. Start everything
docker compose up -d

# ✅ Done!
# API:       http://localhost:7437
# Dashboard: http://localhost:3001
```

### Update to Latest Version

```bash
bash update.sh
# Or: git pull && docker compose up -d --build memolo
```

---

## How Memolo Works

Memolo is a **local-first AI memory system** that turns raw agent conversations into structured, searchable, and self-maintaining knowledge. Here's what happens when an agent sends a message:

### 📥 Store Flow — What happens when `POST /api/memory/store` is called

```
USER MESSAGE + AGENT RESPONSE
            │
            ▼
    ┌───────────────────┐
    │ 1. Save Raw Data  │  Exchange saved to PostgreSQL with sequence number
    └───────┬───────────┘
            │
            ▼
    ┌───────────────────┐
    │ 2. Batch Queue    │  Exchange enqueued into in-memory buffer
    │ (if batch enabled)│  (reduces LLM API calls for cloud providers)
    └───────┬───────────┘
            │
            ├──── (flush every 10s or when batch full) ────┐
            │                                               │
            ▼                                               ▼
    ┌───────────────────┐                           ┌──────────────────┐
    │ 3. Extract Facts  │                           │ 5. Update Graph  │
    │ + Dedup (combined)│  1 LLM call per batch:    │                  │
    │                   │  • Extract atomic facts   │ Entities → DB    │
    │                   │  • ADD/UPDATE/DELETE/NONE │ Co-occurring →   │
    │                   │  • User + Agent facts     │ relationships    │
    └───────┬───────────┘                           └──────────────────┘
            │
            ▼
    ┌───────────────────┐
    │ 4. Embed & Store  │  Generate 4096-dim embedding via Ollama,
    │                   │  store in Qdrant for semantic search
    └───────────────────┘
```
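
For agents that call the HTTP API directly instead of using the SDK, a store request might be assembled roughly as follows. This is a minimal sketch: the `POST /api/memory/store` path and the `X-API-Key` header come from this README, while the JSON body field names (`agentId`, `userMessage`, `agentResponse`) are assumptions mirrored from the SDK example, and `buildStoreRequest` is a hypothetical helper.

```javascript
// Hypothetical helper: builds the arguments for fetch() without sending anything.
// Body field names are assumed from the SDK example, not taken from the API reference.
function buildStoreRequest(serverUrl, apiKey, exchange) {
  return {
    url: new URL('/api/memory/store', serverUrl).toString(),
    options: {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        'X-API-Key': apiKey, // per-agent key issued at registration
      },
      body: JSON.stringify(exchange),
    },
  };
}

const { url, options } = buildStoreRequest('http://localhost:7437', 'agent-api-key', {
  agentId: 'my-agent',
  userMessage: 'I prefer PostgreSQL.',
  agentResponse: 'Noted.',
});
// With the server running, the request would be sent with: fetch(url, options)
```

Keeping request construction separate from the network call makes the payload easy to inspect before anything reaches the server.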

### 📤 Recall Flow — What happens when `POST /api/memory/recall` is called

```
QUERY: "What tools does the user prefer?"
            │
            ▼
    ┌───────────────────┐
    │ 1. Query Enrich   │  Enrich query using conversation profile
    └───────┬───────────┘  (active topic, preferences, context)
            │
            ├───────────────────┬───────────────────┐
            ▼                   ▼                   ▼
    ┌──────────────┐    ┌──────────────┐    ┌──────────────┐
    │ 2. Semantic  │    │ 3. Cross-    │    │ 4. Knowledge │
    │    Search    │    │    Agent     │    │    Graph     │
    │ (Qdrant,     │    │    Search    │    │    Query     │
    │ topic-aware) │    │ (shared KB)  │    │ (neighbors)  │
    └───────┬──────┘    └───────┬──────┘    └───────┬──────┘
            │                   │                   │
            ├───────────────────┴───────────────────┘
            ▼
    ┌───────────────────┐
    │ 5. LLM Rerank     │  Score each result 0.0-1.0 by
    │ (configurable)    │  semantic relevance to query
    └───────┬───────────┘
            │
            ▼
        RANKED RESULTS
```
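
The fan-out in steps 2-4 can be sketched as three concurrent lookups whose results are merged before the optional rerank. This is an illustration only: the `semantic`, `crossAgent`, `graph`, and `rerank` functions are injected stand-ins (hypothetical names), the `{ id, content }` result shape is assumed, and query enrichment (step 1) is left out.

```javascript
// Sketch of the recall fan-out. The search functions and the reranker are
// injected stand-ins, not Memolo internals.
async function recall(query, { semantic, crossAgent, graph, rerank }) {
  // Steps 2-4 run concurrently against the same query.
  const groups = await Promise.all([semantic(query), crossAgent(query), graph(query)]);

  // Merge, dropping duplicates that more than one source returned.
  const byId = new Map();
  for (const hit of groups.flat()) {
    if (!byId.has(hit.id)) byId.set(hit.id, hit);
  }

  // Step 5: rerank if a reranker was supplied, otherwise keep merge order.
  const merged = [...byId.values()];
  return rerank ? rerank(query, merged) : merged;
}

// Usage with stubbed sources:
const stub = (hits) => async () => hits;
const pending = recall('What tools does the user prefer?', {
  semantic: stub([{ id: 'm1', content: 'User prefers Next.js' }]),
  crossAgent: stub([
    { id: 'm1', content: 'User prefers Next.js' },
    { id: 'm2', content: 'User deploys to Vercel' },
  ]),
  graph: stub([{ id: 'e1', content: 'Next.js related_to TypeScript' }]),
});
pending.then((hits) => console.log(hits.map((h) => h.id))); // logs the ids m1, m2, e1
```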

---

## Core Mechanisms

### 🔍 Atomic Fact Extraction
Rather than summarizing groups of messages, Memolo extracts atomic facts from **every individual exchange** in near real time (with batching enabled, at the next flush). Each fact is a standalone, searchable unit of knowledge. Facts are extracted from both user messages and agent responses; which sides are included is configurable.

**Example:**
```
User:  "I'm building an ecommerce project with Next.js and Prisma ORM, and the database is PostgreSQL"
Agent: "Nice! That's a very strong stack."

Extracted Facts:
 1. "User is working on an ecommerce project"
 2. "Project uses the Next.js framework"
 3. "Database is PostgreSQL"
```

### 🧠 Combined Extract + Dedup (1 LLM call)
When a new exchange arrives, Memolo performs extraction AND deduplication in a **single LLM call** (optimized for cloud providers like MiniMax to minimize API costs):

1. **Extract** atomic facts from user + agent messages
2. **Compare** against existing memories (fetched via vector search)
3. **Decide** action for each fact: ADD / UPDATE / DELETE / NONE

This prevents memory bloat and handles contradictions automatically:

```
Existing:  "Database is PostgreSQL"
New fact:  "Switched to MySQL instead of PostgreSQL"

LLM Decision: UPDATE
Result:    "Switched to using MySQL instead of PostgreSQL" (old superseded)
```
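
Once the LLM has returned a decision per fact, applying the decisions is mechanical. The sketch below assumes a decision shape of `{ action, id, text }` and a caller-supplied `newId` generator (both hypothetical). It also overwrites the text in place on UPDATE, which is simpler than keeping the old row and marking it as superseded.

```javascript
// Applies ADD / UPDATE / DELETE / NONE decisions to an in-memory Map of id -> text.
// The decision shape is an assumption for illustration.
function applyDecisions(memories, decisions, newId) {
  const next = new Map(memories); // copy so the caller's map is left untouched
  for (const { action, id, text } of decisions) {
    if (action === 'ADD') next.set(newId(), text);
    else if (action === 'UPDATE') next.set(id, text); // old content is replaced
    else if (action === 'DELETE') next.delete(id);
    else if (action !== 'NONE') throw new Error(`Unknown action: ${action}`);
  }
  return next;
}

const before = new Map([['mem-1', 'Database is PostgreSQL']]);
const after = applyDecisions(
  before,
  [
    { action: 'UPDATE', id: 'mem-1', text: 'Switched to MySQL instead of PostgreSQL' },
    { action: 'ADD', text: 'Project uses Prisma ORM' },
  ],
  () => 'mem-2',
);
// after holds the updated mem-1 plus the new mem-2; before is unchanged.
```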

**Key technique:** UUIDs are mapped to integers `[0], [1], [2]...` before being sent to the LLM, so the model only has to echo a short index instead of reproducing a long ID that it could hallucinate (inspired by mem0).
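
A minimal sketch of that index mapping, assuming existing memories arrive as `{ id, text }` objects; `toIndexedPrompt` and `resolveId` are hypothetical helper names, and the UUIDs are made up for the example.

```javascript
// Replace long UUIDs with list positions before prompting, then translate back.
function toIndexedPrompt(memories) {
  const ids = memories.map((m) => m.id); // position -> real UUID
  const lines = memories.map((m, i) => `[${i}] ${m.text}`);
  return { ids, promptText: lines.join('\n') };
}

function resolveId(ids, index) {
  // Reject anything the LLM returned that is not a valid position.
  if (!Number.isInteger(index) || index < 0 || index >= ids.length) {
    throw new RangeError(`LLM returned unknown memory index: ${index}`);
  }
  return ids[index];
}

const { ids, promptText } = toIndexedPrompt([
  { id: '3f2b8c1e-7a4d-4e9b-9c56-1d2e3f4a5b6c', text: 'Database is PostgreSQL' },
  { id: '9a7c6e5d-4b3a-4f21-8e0d-c1b2a3948576', text: 'Project uses Next.js' },
]);
// promptText contains "[0] Database is PostgreSQL" and "[1] Project uses Next.js";
// resolveId(ids, 1) maps the LLM's answer back to the second UUID.
```

Because an out-of-range index is rejected, an invented reference fails loudly instead of silently touching the wrong memory.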

### 📦 Batch Processing
Exchanges are queued and processed in batches (configurable interval and max size) to **reduce LLM API calls** for cloud providers: a batch of 10 exchanges costs 1 LLM call instead of 10.
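
The queueing behaviour can be sketched as a small in-memory buffer that flushes when it is full or when a timer fires. `BatchQueue` and its option names are hypothetical, and this version starts the timer when the first item of a batch arrives rather than running a fixed periodic flush.

```javascript
// Minimal in-memory batch queue: flushes when full or when the timer fires.
class BatchQueue {
  constructor({ maxSize, intervalMs, onFlush }) {
    this.maxSize = maxSize;
    this.intervalMs = intervalMs;
    this.onFlush = onFlush; // receives the whole batch; stands in for 1 LLM call
    this.items = [];
    this.timer = null;
  }

  enqueue(exchange) {
    this.items.push(exchange);
    if (this.items.length >= this.maxSize) {
      this.flush();
    } else if (!this.timer) {
      // Start the countdown when the first item of a batch arrives.
      this.timer = setTimeout(() => this.flush(), this.intervalMs);
    }
  }

  flush() {
    clearTimeout(this.timer);
    this.timer = null;
    if (this.items.length === 0) return;
    const batch = this.items;
    this.items = [];
    this.onFlush(batch);
  }
}

let llmCalls = 0;
const queue = new BatchQueue({
  maxSize: 10,
  intervalMs: 10_000,
  onFlush: () => { llmCalls += 1; },
});
for (let i = 0; i < 10; i++) {
  queue.enqueue({ userMessage: `msg ${i}`, agentResponse: 'ok' });
}
// llmCalls is now 1: ten exchanges were handed over as a single batch.
```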

### 🕸️ Knowledge Graph
Entities mentioned in conversations are automatically extracted and linked:

```
Entities:  [Next.js, TypeScript, PostgreSQL, Vercel]
Relations: Next.js --related_to--> TypeScript
           PostgreSQL --related_to--> Next.js
```

During recall, the graph provides **contextual connections** — asking about "Next.js" also surfaces related technologies, preferences, and decisions.
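
As a rough illustration of co-occurrence linking and neighbor lookup, the sketch below keeps the graph in memory and treats `related_to` as undirected. Both are simplifying assumptions, and `KnowledgeGraph` is a hypothetical class rather than Memolo's graph engine.

```javascript
// Entities that co-occur in one exchange are linked pairwise with "related_to".
class KnowledgeGraph {
  constructor() {
    this.neighborsOf = new Map(); // entity -> Set of related entities
  }

  addCooccurring(entities) {
    for (const a of entities) {
      if (!this.neighborsOf.has(a)) this.neighborsOf.set(a, new Set());
      for (const b of entities) {
        if (a !== b) this.neighborsOf.get(a).add(b);
      }
    }
  }

  neighbors(entity) {
    // Unknown entities simply have no neighbors.
    return [...(this.neighborsOf.get(entity) ?? [])].sort();
  }
}

const graph = new KnowledgeGraph();
graph.addCooccurring(['Next.js', 'TypeScript']);
graph.addCooccurring(['PostgreSQL', 'Next.js']);
const related = graph.neighbors('Next.js'); // ['PostgreSQL', 'TypeScript']
```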

### 📊 Search Reranking
After the initial vector similarity search, an LLM scores each result's relevance to the actual query on a 0.0-1.0 scale. This catches cases where vector similarity alone misses semantic nuance. Reranking can be disabled via config.
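
A sketch of the rerank step, with the LLM replaced by an injected `scoreFn` that resolves to a number between 0 and 1. The function signature, the `enabled` option, and scoring one result per call are assumptions for illustration. The real service may score all candidates in a single prompt.

```javascript
// scoreFn stands in for the LLM and must resolve to a number in [0, 1].
async function rerank(query, results, scoreFn, { enabled = true } = {}) {
  if (!enabled) return results; // reranking switched off: keep vector-search order
  const scored = await Promise.all(
    results.map(async (r) => ({ ...r, relevance: await scoreFn(query, r.content) })),
  );
  // Highest relevance first.
  return scored.sort((a, b) => b.relevance - a.relevance);
}

// Toy scorer used only to make the example deterministic.
const toyScore = async (query, text) => (text.includes('prefers') ? 0.9 : 0.2);
const ranked = rerank(
  'What tools does the user prefer?',
  [
    { id: 'm2', content: 'User deployed to Vercel' },
    { id: 'm1', content: 'User prefers Next.js' },
  ],
  toyScore,
);
ranked.then((hits) => console.log(hits[0].id)); // logs m1
```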

### 📜 Memory History (Audit Trail)
Every ADD, UPDATE, and DELETE operation is logged with:
- Before/after content
- Who changed it (agent ID or system)
- Why (dedup_merge, contradiction, manual, etc.)
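
An audit record covering those three points could look like the sketch below. The field names are illustrative and are not the actual `memory_history` columns. `historyEntry` is a hypothetical helper.

```javascript
// Builds one immutable audit record; field names are illustrative only.
function historyEntry({ memoryId, action, before = null, after = null, changedBy, reason }) {
  const audited = ['ADD', 'UPDATE', 'DELETE'];
  if (!audited.includes(action)) throw new Error(`Not an audited action: ${action}`);
  return Object.freeze({
    memoryId,
    action,
    before,    // content prior to the change (null for ADD)
    after,     // content after the change (null for DELETE)
    changedBy, // agent ID or "system"
    reason,    // e.g. dedup_merge, contradiction, manual
    at: new Date().toISOString(),
  });
}

const entry = historyEntry({
  memoryId: 'mem-1',
  action: 'UPDATE',
  before: 'Database is PostgreSQL',
  after: 'Switched to MySQL instead of PostgreSQL',
  changedBy: 'my-agent',
  reason: 'contradiction',
});
```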

---

## Architecture

```
┌──────────────┐   ┌──────────────┐   ┌──────────────┐
│   Agent 1    │   │   Agent 2    │   │   Agent N    │
│ (Device A)   │   │ (Device B)   │   │ (Device C)   │
└──────┬───────┘   └──────┬───────┘   └──────┬───────┘
       │ X-API-Key        │ X-API-Key        │ X-API-Key
       └──────────────────┼──────────────────┘
                          │ LAN (WiFi)
                   ┌──────▼──────┐
                   │   Memolo    │ ← 0.0.0.0:7437
                   │   Server    │   API + Dashboard
                   ├─────────────┤
                   │ Batch Queue │ ← queue + flush periodically
                   │ Fact Extract│ ← per-exchange atomic facts
                   │ Deduplicator│ ← LLM ADD/UPDATE/DELETE
                   │ Graph Engine│ ← entities + relationships
                   │ Reranker    │ ← LLM relevance scoring
                   │ Contradict. │ ← detect & supersede conflicts
                   │ Intelligence│ ← decay, dedup
                   │ LLM Service │ ← MiniMax M2.5
                   │ Auth Layer  │ ← master + per-agent keys
                   │ Config API  │ ← live runtime settings
                   └──────┬──────┘
           ┌──────────────┼──────────────┐
     ┌─────▼──────┐ ┌─────▼──────┐ ┌─────▼──────┐
     │ PostgreSQL │ │   Qdrant   │ │   Ollama   │
     │  9 tables  │ │  vectors   │ │   (host)   │
     │  + graph   │ │  4096 dim  │ │ embed only │
     └────────────┘ └────────────┘ └────────────┘
```

## Technology Stack

| Component | Technology |
|-----------|-----------|
| Server | Node.js + Express 4.18 |
| Database | PostgreSQL 16 (9 tables, 11 migrations) |
| Vector Store | Qdrant (Cosine similarity, 4096 dim) |
| Embeddings | Ollama `qwen3-embedding:8b` (local) |
| LLM (Facts/Dedup/Rerank) | MiniMax M2.5 (cloud, Anthropic-compatible API) |
| Auth | API key (master + per-agent) via `X-API-Key` header |
| Dashboard | Next.js 16 + TailwindCSS v4 + shadcn/ui (dark theme) |
| Deployment | Docker Compose (single command) |

## Database Schema

| Table | Purpose |
|-------|---------|
| `agents` | Registered AI agents with API keys |
| `conversations` | Conversation sessions per agent |
| `exchanges` | Raw user/agent message pairs |
| `memories` | Extracted facts, decisions, insights (with content_hash, actor_id, superseded_by) |
| `knowledge_base` | Curated knowledge entries |
| `memory_history` | Audit trail for all memory mutations |
| `entities` | Named entities from the knowledge graph |
| `relationships` | Typed links between entities |
| `memory_conversations` | Many-to-many memory ↔ conversation linking |

---

## Authentication

### 2-Tier API Keys

| Key Type | Set In | Purpose |
|----------|--------|---------|
| **Master key** | `.env` (`MEMOLO_MASTER_KEY`) | Admin: register agents, full access |
| **Per-agent key** | Auto-generated on register | Agent: store, recall, search |

- `GET /api/health` — **no auth required**
- If `MEMOLO_MASTER_KEY` is not set, auth is disabled (dev mode)
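
The two-tier check can be sketched as a pure function that resolves a role from the `X-API-Key` header. The `authenticate` name, the role labels, and the in-memory `agentKeys` map are assumptions. A real server would look keys up in the `agents` table and would ideally compare secrets in constant time (for example with `crypto.timingSafeEqual`).

```javascript
// Resolves the caller's role from the X-API-Key header.
// agentKeys maps per-agent API keys to agent IDs (an assumed in-memory lookup).
function authenticate(headers, { masterKey, agentKeys }) {
  if (!masterKey) return { role: 'dev', agentId: null }; // no master key: auth disabled
  const key = headers['x-api-key']; // Node lower-cases incoming header names
  if (key === masterKey) return { role: 'admin', agentId: null };
  if (agentKeys.has(key)) return { role: 'agent', agentId: agentKeys.get(key) };
  return null; // caller should respond with 401
}

const config = {
  masterKey: 'master-secret',
  agentKeys: new Map([['agent-key-1', 'my-agent']]),
};
const asAgent = authenticate({ 'x-api-key': 'agent-key-1' }, config); // role "agent"
const asAdmin = authenticate({ 'x-api-key': 'master-secret' }, config); // role "admin"
const rejected = authenticate({}, config); // null
```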

## Agent Integration

### SDK (Node.js)
```javascript
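const { MemoryClient } = require('memolo');

const memory = new MemoryClient({
  agentId: 'my-agent',
  apiKey: 'agent-api-key-from-register', // per-agent key returned on registration
  serverUrl: 'http://192.168.1.100:7437', // LAN address of the Memolo server
});

// CommonJS has no top-level await, so the calls run inside an async function.
async function main() {
  // Recall memories relevant to the user's question
  const ctx = await memory.recall('user question', { format: 'context' });
  // Store the latest exchange so facts can be extracted from it
  const result = await memory.store({ userMessage: '...', agentResponse: '...' });
}

main().catch(console.error);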
```

### OpenClaw Plugin (Example Integration)
```json
{
  "memolo": {
    "agentId": "my-agent",
    "apiKey": "${MEMOLO_API_KEY}",
    "serverUrl": "http://192.168.1.100:7437"
  }
}
```

---

## API Reference

### Memory
| Method | Path | Descri

... (truncated)