Memory Lancedb Ultra

Name: Memory Lancedb Ultra
Rating: 3.5 (1 review)
Author: zhangbo198324

Enhanced LanceDB memory plugin for OpenClaw with hybrid retrieval, cross-encoder reranking, and multi-scope isolation. Based on win4r/memory-lancedb-pro.


<div align="center">

# 🧠 memory-lancedb-ultra · OpenClaw Plugin

**Enhanced Long-Term Memory Plugin for [OpenClaw](https://github.com/openclaw/openclaw)**

Hybrid Retrieval (Vector + BM25) · Cross-Encoder Rerank · Multi-Scope Isolation · Management CLI

[![OpenClaw Plugin](https://img.shields.io/badge/OpenClaw-Plugin-blue)](https://github.com/openclaw/openclaw)
[![LanceDB](https://img.shields.io/badge/LanceDB-Vectorstore-orange)](https://lancedb.com)
[![License: MIT](https://img.shields.io/badge/License-MIT-green.svg)](LICENSE)

</div>

---

> **Based on [win4r/memory-lancedb-pro](https://github.com/win4r/memory-lancedb-pro) (MIT License).**
> This fork includes security hardening, bug fixes, and architectural improvements.

---

## Included asset (kept as requested)

- `铁律-密码在视频里.zip` is intentionally retained in this repository for compatibility with the original upstream materials.

---

## Changes from upstream (memory-lancedb-pro)

### 🔴 Security & Correctness Fixes

| Issue | Fix |
|-------|-----|
| **Scope isolation bypassed** — `agentId` was always `undefined` at tool registration, so `getAccessibleScopes(undefined)` returned all scopes | Tools now receive runtime `toolCtx.agentId` via OpenClaw's tool factory context; `getAccessibleScopes(undefined)` defaults to **least-privilege** (default scope only) instead of all scopes |
| **SQL injection risk** — scope values were only single-quote escaped | Added `SAFE_SCOPE_PATTERN` validation (`/^[a-zA-Z0-9._:-]{1,64}$/`) at the store layer; all scope inputs sanitized before SQL interpolation |
| **`vectorWeight`/`bm25Weight` config had no effect** — fusion was hardcoded as "vector + 15% BM25 bonus" | Replaced with proper weighted fusion: `normV * vectorScore + normB * bm25Score` with configurable weights |
| **Dedup threshold inconsistent** — 0.95 in auto-capture, 0.98 in tools, 0.95 in CLI | Unified to single `DEDUP_SIMILARITY_THRESHOLD = 0.95` constant |
| **Backup capped at 10,000 entries** — silently lost data beyond that | Replaced with paginated full export loop |
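
The scope validation described in the table can be sketched roughly as follows. Only the `SAFE_SCOPE_PATTERN` regular expression is taken from the table; the helper names `assertSafeScope` and `buildScopeFilter`, and the `scope` column name, are assumptions made for illustration and may not match the plugin's store layer.

```typescript
// Pattern quoted in the table above; everything else here is illustrative.
const SAFE_SCOPE_PATTERN = /^[a-zA-Z0-9._:-]{1,64}$/;

// Hypothetical helper: reject any scope outside the allow-list before it is
// interpolated into a filter string.
function assertSafeScope(scope: string): string {
  if (!SAFE_SCOPE_PATTERN.test(scope)) {
    throw new Error(`Invalid scope: ${JSON.stringify(scope)}`);
  }
  return scope;
}

// Hypothetical helper: build a SQL-style IN filter from validated scopes.
// The column name "scope" is an assumption.
function buildScopeFilter(scopes: string[]): string {
  if (scopes.length === 0) {
    throw new Error("At least one scope is required");
  }
  const quoted = scopes.map((s) => `'${assertSafeScope(s)}'`);
  return `scope IN (${quoted.join(", ")})`;
}
```

Because quotes and whitespace are not in the allowed character set, a value such as `x' OR '1'='1` is rejected outright rather than escaped.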

### 🟡 Quality Improvements

| Area | Change |
|------|--------|
| **CLI `clampInt` missing** — `reembed` command would crash | Added missing function definition |
| **README accuracy** — documented "RRF fusion" but implementation wasn't RRF | Updated docs to say "weighted fusion" matching the actual algorithm |
| **Missing LICENSE file** — README claimed MIT but no file existed | Added proper MIT LICENSE with original author attribution |
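
For readers unfamiliar with this kind of helper, here is a minimal sketch of what a `clampInt` function commonly looks like. The signature (value, minimum, maximum, fallback) is an assumption; the actual definition in `cli.ts` may differ.

```typescript
// Hypothetical signature; the plugin's own helper may take different arguments.
// Parses a CLI-style value as an integer and bounds it to [min, max],
// returning `fallback` when the value cannot be parsed.
function clampInt(value: unknown, min: number, max: number, fallback: number): number {
  const n = typeof value === "number" ? value : parseInt(String(value), 10);
  if (!Number.isFinite(n)) return fallback;
  return Math.min(max, Math.max(min, Math.trunc(n)));
}
```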

---

## Architecture

```
┌─────────────────────────────────────────────────────────┐
│                   index.ts (Entry Point)                │
│  Plugin Registration · Config Parsing · Lifecycle Hooks │
└────────┬──────────┬──────────┬──────────┬───────────────┘
         │          │          │          │
    ┌────▼───┐ ┌────▼───┐ ┌────▼────┐ ┌───▼─────────┐
    │ store  │ │embedder│ │retriever│ │   scopes    │
    │ .ts    │ │ .ts    │ │ .ts     │ │    .ts      │
    └────────┘ └────────┘ └─────────┘ └─────────────┘
         │                     │
    ┌────▼───┐           ┌─────▼──────────┐
    │migrate │           │noise-filter.ts │
    │ .ts    │           │adaptive-       │
    └────────┘           │retrieval.ts    │
                         └────────────────┘
    ┌─────────────┐   ┌──────────┐   ┌────────────┐
    │  tools.ts   │   │  cli.ts  │   │constants.ts│
    │ (Agent API) │   │ (CLI)    │   │ (shared)   │
    └─────────────┘   └──────────┘   └────────────┘
```

## Core Features

- **Hybrid Retrieval**: Vector (ANN) + BM25 (FTS) with configurable weighted fusion
- **Cross-Encoder Reranking**: Jina / SiliconFlow / Pinecone (5s timeout, cosine fallback)
- **Multi-Stage Scoring**: Recency boost → Importance weight → Length normalization → Time decay → Hard min score → MMR diversity
- **Multi-Scope Isolation**: `global`, `agent:<id>`, `custom:<name>`, `project:<id>`, `user:<id>` with per-agent access control
- **Adaptive Retrieval**: Skips greetings, commands, emoji (CJK-aware thresholds)
- **Noise Filtering**: Blocks agent denials, meta-questions, boilerplate at capture + retrieval
- **Session Memory**: Optional session summary storage on `/new`
- **Auto-Capture / Auto-Recall**: Lifecycle hooks for transparent memory management
- **Management CLI**: `list`, `search`, `stats`, `delete`, `export`, `import`, `reembed`, `migrate`
- **Task-Aware Embeddings**: Separate `taskQuery` / `taskPassage` for providers like Jina
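
The least-privilege behavior of multi-scope isolation can be illustrated with a short sketch. It assumes that the default scope is `global`, that each agent can always see its own `agent:<id>` scope, and that extra grants come from a simple per-agent map; the function name `accessibleScopes` and the `AgentAccessMap` type are hypothetical and not the plugin's API.

```typescript
// Hypothetical shape: extra scopes granted to each agent id.
type AgentAccessMap = Record<string, string[]>;

// Assumption: "global" is the default scope.
const DEFAULT_SCOPE = "global";

// Sketch of least-privilege scope resolution: an unknown agent sees only the
// default scope; a known agent sees the default scope, its own agent scope,
// and any explicitly granted scopes.
function accessibleScopes(agentId: string | undefined, access: AgentAccessMap = {}): string[] {
  if (!agentId) return [DEFAULT_SCOPE];
  const extra = access[agentId] ?? [];
  const all = [DEFAULT_SCOPE, `agent:${agentId}`, ...extra];
  // Drop duplicates while preserving order.
  return all.filter((s, i) => all.indexOf(s) === i);
}
```

The important design point is the first branch: a missing `agentId` narrows access instead of widening it.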

## Installation

```bash
cd /path/to/your/openclaw/workspace
git clone https://github.com/zhangbo198324/memory-lancedb-ultra.git plugins/memory-lancedb-ultra
cd plugins/memory-lancedb-ultra && npm install
```

Then register the plugin in your OpenClaw config (`~/.openclaw/openclaw.json`):

```json
{
  "plugins": {
    "load": { "paths": ["plugins/memory-lancedb-ultra"] },
    "entries": {
      "memory-lancedb-ultra": {
        "enabled": true,
        "config": {
          "embedding": {
            "apiKey": "${OPENAI_API_KEY}",
            "model": "text-embedding-3-small"
          }
        }
      }
    },
    "slots": { "memory": "memory-lancedb-ultra" }
  }
}
```

Finally, restart the gateway to load the plugin:

```bash
openclaw gateway restart
```

## Configuration

See the full config schema in `openclaw.plugin.json`. Key options:

| Option | Default | Description |
|--------|---------|-------------|
| `embedding.apiKey` | *required* | API key for embedding provider |
| `embedding.model` | `text-embedding-3-small` | Embedding model |
| `retrieval.mode` | `hybrid` | `hybrid` or `vector` |
| `retrieval.vectorWeight` | `0.7` | Weight for vector similarity |
| `retrieval.bm25Weight` | `0.3` | Weight for BM25 keyword match |
| `retrieval.rerank` | `cross-encoder` | `cross-encoder`, `lightweight`, or `none` |
| `retrieval.hardMinScore` | `0.35` | Discard results below this score |
| `autoCapture` | `true` | Auto-capture from conversations |
| `autoRecall` | `true` | Auto-inject memories into context |
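
To show how `retrieval.vectorWeight`, `retrieval.bm25Weight`, and `retrieval.hardMinScore` interact, here is a minimal sketch of weighted fusion using the formula `normV * vectorScore + normB * bm25Score`. It assumes that both scores are already scaled to the range 0 to 1 and that `normV` and `normB` are the configured weights divided by their sum. The types and the `fuse` function are hypothetical, and the sketch applies the minimum-score cut directly after fusion, skipping the reranking and other scoring stages.

```typescript
interface Candidate {
  id: string;
  vectorScore: number; // assumed to be in [0, 1]
  bm25Score: number;   // assumed to be in [0, 1]
}

interface FusionConfig {
  vectorWeight: number;
  bm25Weight: number;
  hardMinScore: number;
}

// Defaults taken from the configuration table above.
const DEFAULTS: FusionConfig = { vectorWeight: 0.7, bm25Weight: 0.3, hardMinScore: 0.35 };

// Sketch: normalize the weights, combine the two scores, drop weak results,
// and return the remainder sorted by descending score.
function fuse(candidates: Candidate[], cfg: FusionConfig = DEFAULTS): { id: string; score: number }[] {
  const total = cfg.vectorWeight + cfg.bm25Weight;
  if (total <= 0) throw new Error("weights must sum to a positive number");
  const normV = cfg.vectorWeight / total;
  const normB = cfg.bm25Weight / total;
  return candidates
    .map((c) => ({ id: c.id, score: normV * c.vectorScore + normB * c.bm25Score }))
    .filter((r) => r.score >= cfg.hardMinScore)
    .sort((a, b) => b.score - a.score);
}
```

With the default weights, a candidate with a strong vector match but a weak keyword match can still outrank one with the opposite profile, and anything whose fused score falls below `hardMinScore` is discarded.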

## License

MIT — see [LICENSE](LICENSE) for details.

Original work: [win4r/memory-lancedb-pro](https://github.com/win4r/memory-lancedb-pro)