
# BEE — Belief Extraction Engine

**Your AI assistant should know who you are without being told twice.**

BEE is an [OpenClaw](https://github.com/skysphere-labs/openclaw) plugin that gives your AI assistant persistent, local-first cognitive memory. It automatically extracts beliefs, preferences, facts, goals, and decisions from your conversations, stores them in a local SQLite database, and recalls relevant context before every response — so your assistant genuinely learns who you are over time.

---

## The Problem

Every AI assistant today suffers from the same fundamental limitation: **amnesia**.

- You tell your assistant you prefer concise answers — it forgets next session.
- You explain your tech stack, your role, your goals — gone after the conversation ends.
- You make a decision ("we're going with PostgreSQL") — your assistant won't remember tomorrow.

Commercial "memory" features are either cloud-dependent, shallow (keyword matching), or both. Your personal context — the things that make interactions feel natural and productive — shouldn't live on someone else's server, and it shouldn't require you to repeat yourself.

**BEE solves this.** It creates a local, structured memory layer that learns what matters about you and automatically brings it back when it's relevant.

## How It Works

BEE hooks into two points of the OpenClaw agent lifecycle:

```
┌──────────────────────────────────────────────────────┐
│                  AGENT LIFECYCLE                     │
│                                                      │
│  ┌────────────────────┐                              │
│  │ before_agent_start │──► RECALL                    │
│  └─────────┬──────────┘    Query beliefs DB          │
│            │               Build <bee-recall> context│
│            ▼               Prepend to agent context  │
│  ┌────────────────────┐                              │
│  │   Agent Response   │                              │
│  └─────────┬──────────┘                              │
│            │                                         │
│  ┌─────────▼──────────┐                              │
│  │     agent_end      │──► EXTRACT                   │
│  └────────────────────┘    Prefilter messages        │
│                            Classify & score beliefs  │
│                            Store in SQLite           │
└──────────────────────────────────────────────────────┘
```
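
The two-hook wiring can be sketched roughly as follows. This is a minimal sketch only: the hook names come from the diagram, but the `PluginApi` shape, the `register` entry point, and the in-memory store are hypothetical stand-ins for the OpenClaw plugin SDK, not its real types.

```typescript
// Sketch only: hook names follow the diagram; PluginApi and register() are
// hypothetical stand-ins for the OpenClaw plugin SDK, not its real API.
type HookName = "before_agent_start" | "agent_end";

interface HookContext {
  messages: string[];                 // user messages from the turn
  prepend: (block: string) => void;   // prepend text to the agent context
}

interface PluginApi {
  on(hook: HookName, handler: (ctx: HookContext) => void): void;
}

// Minimal in-memory belief store for the sketch (real BEE uses SQLite).
const beliefs: string[] = [];

export function register(api: PluginApi): void {
  // RECALL: before the agent runs, inject remembered beliefs as <bee-recall>.
  api.on("before_agent_start", (ctx) => {
    if (beliefs.length > 0) {
      ctx.prepend(`<bee-recall>\n${beliefs.join("\n")}\n</bee-recall>`);
    }
  });

  // EXTRACT: after the turn, keep messages with first-person belief signals.
  api.on("agent_end", (ctx) => {
    for (const m of ctx.messages) {
      if (/\b(i|my|we)\b/i.test(m)) beliefs.push(m);
    }
  });
}
```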

### 1. Extract — Learning from conversation

After each agent turn, BEE processes the last few user messages through a multi-stage pipeline:

- **Prefilter**: Discards noise — acknowledgments ("ok", "thanks"), emoji-only messages, system messages, URLs without commentary, and short filler. Only messages with genuine belief signals pass through.
- **Classification**: Categorizes each belief as `identity`, `goal`, `preference`, `decision`, `fact`, or `implicit_inference`.
- **Confidence scoring**: Assigns a confidence score (0–1) based on linguistic signals — explicit first-person declarations score higher than ambiguous statements.
- **Decay sensitivity**: Tags each belief with how quickly it should fade (`low` for identity, `high` for transient facts).
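
As a rough illustration of the classify/score/decay steps above, here is a regex-based sketch. BEE's real pipeline drives classification through the extraction model; the patterns, the 0.85/0.5 scores, and the decay mapping below are simplified assumptions.

```typescript
// Illustrative only: regex stand-ins for BEE's model-driven classification.
type Category = "identity" | "goal" | "preference" | "decision" | "fact" | "implicit_inference";

interface ScoredBelief {
  content: string;
  category: Category;
  confidence: number;                        // 0-1
  decaySensitivity: "low" | "medium" | "high";
}

export function scoreBelief(content: string): ScoredBelief {
  const lower = content.toLowerCase();

  let category: Category = "fact";
  if (/\bi am\b|\bi'm\b/.test(lower)) category = "identity";
  else if (/\bi want\b|\bmy goal\b/.test(lower)) category = "goal";
  else if (/\bi prefer\b|\bi like\b/.test(lower)) category = "preference";
  else if (/\bwe're going with\b|\bdecided\b/.test(lower)) category = "decision";

  // Explicit first-person declarations score higher than ambiguous statements.
  const firstPerson = /\b(i|my|we)\b/.test(lower);
  const confidence = firstPerson ? 0.85 : 0.5;

  // Identity fades slowly; transient facts fade quickly.
  const decaySensitivity =
    category === "identity" ? "low" : category === "fact" ? "high" : "medium";

  return { content, category, confidence, decaySensitivity };
}
```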

### 2. Store — Local SQLite with structure

Beliefs are stored in a local SQLite database with WAL mode for performance:

```sql
beliefs (
  id, content, confidence, category, status,
  activation_score, decay_sensitivity,
  source, reasoning, context_summary,
  created_at, updated_at, last_relevant
)
```

Statuses include `active`, `provisional`, `archived`, and `flagged_for_review`. Beliefs with low confidence start as `provisional`.
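
A sketch of how the schema and status lifecycle above might look in code. The column list mirrors the block above; the SQL column types and the 0.6 promotion threshold are illustrative assumptions, not BEE's actual values.

```typescript
// Sketch only: SQL types and the 0.6 cutoff are assumptions for illustration.
export const BELIEFS_SCHEMA = `
CREATE TABLE IF NOT EXISTS beliefs (
  id                INTEGER PRIMARY KEY,
  content           TEXT NOT NULL,
  confidence        REAL NOT NULL,
  category          TEXT NOT NULL,
  status            TEXT NOT NULL DEFAULT 'provisional',
  activation_score  REAL DEFAULT 0,
  decay_sensitivity TEXT DEFAULT 'medium',
  source            TEXT,
  reasoning         TEXT,
  context_summary   TEXT,
  created_at        TEXT,
  updated_at        TEXT,
  last_relevant     TEXT
);`;

export type BeliefStatus = "active" | "provisional" | "archived" | "flagged_for_review";

// Low-confidence beliefs start as provisional (assumed cutoff: 0.6).
export function initialStatus(confidence: number): BeliefStatus {
  return confidence < 0.6 ? "provisional" : "active";
}
```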

### 3. Recall — Tiered retrieval before every response

Before the agent generates a response, BEE queries the belief database and constructs a context block with three tiers:

| Tier | What it contains | Selection criteria |
|---|---|---|
| **Core** | Durable identity, goals, strong preferences | High confidence, identity/goal/preference categories, never archived |
| **Active** | Recently relevant beliefs | Accessed or updated within last 7 days, sorted by activation score |
| **Recalled** | Beliefs relevant to the current prompt | Token-matched against the user's current message |

The tiers are deduplicated and size-capped (max 2200 chars) to avoid overwhelming the agent context. The result is injected as a `<bee-recall>` block prepended to the agent's context.
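
The merge described above (dedupe across tiers, then cap at 2200 characters) can be sketched as follows; the helper and field names are ours, not BEE's.

```typescript
// Sketch of the tier merge: dedupe by id across tiers, cap the rendered block.
interface Belief { id: number; content: string; }

const MAX_RECALL_CHARS = 2200;   // size cap stated in the README

export function buildRecallBlock(core: Belief[], active: Belief[], recalled: Belief[]): string {
  const seen = new Set<number>();
  const lines: string[] = [];
  let size = 0;

  // Core first, then active, then prompt-relevant: earlier tiers win ties.
  for (const b of [...core, ...active, ...recalled]) {
    if (seen.has(b.id)) continue;                            // dedupe across tiers
    if (size + b.content.length > MAX_RECALL_CHARS) break;   // size cap
    seen.add(b.id);
    lines.push(`- ${b.content}`);
    size += b.content.length;
  }
  return `<bee-recall>\n${lines.join("\n")}\n</bee-recall>`;
}
```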

### 4. Profile Distillation — A synthesized "who you are"

On top of per-turn recall, BEE periodically distills your high-confidence beliefs (identity, goals, strong preferences) into a natural-language **user profile** — a 2–3 sentence summary of who you are and what you care about.

- Profile is regenerated asynchronously every N turns (default 50) or when the underlying belief set changes.
- Uses a lightweight LLM call (`claude-haiku`) to synthesize beliefs into prose.
- The profile is cached in SQLite (`bee_profile` table) and injected as a `<bee-profile>` block alongside the recall context.
- If no runtime is available, profile synthesis is gracefully skipped.

This gives the agent a stable, high-level understanding of the user that doesn't fluctuate turn-to-turn — think of it as long-term memory vs. the working memory of per-turn recall.
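
The refresh policy above can be sketched as a simple predicate. The sorted-id fingerprint is our assumption for "belief set changed"; BEE's actual change detection may differ.

```typescript
// Sketch only: fingerprint-based change detection is an illustrative assumption.
interface ProfileState {
  turnsSinceRefresh: number;
  beliefFingerprint: string;   // fingerprint at last profile synthesis
}

export function shouldRefreshProfile(
  state: ProfileState,
  coreBeliefIds: number[],
  refreshTurns = 50,           // mirrors the profileRefreshTurns default
): boolean {
  const fingerprint = coreBeliefIds.slice().sort((a, b) => a - b).join(",");
  // Refresh on the turn budget, or early when the belief set has changed.
  return state.turnsSinceRefresh >= refreshTurns || fingerprint !== state.beliefFingerprint;
}
```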

## Install

```bash
openclaw plugin install skysphere-labs/openclaw-bee
```

Zero config required. BEE will create its SQLite database at `~/.openclaw/workspace/state/bee.db` automatically.

## Configuration

All configuration is optional. Override via your OpenClaw plugin config:

| Key | Type | Default | Description |
|---|---|---|---|
| `dbPath` | string | `~/.openclaw/workspace/state/bee.db` | Path to the SQLite database file |
| `extractionModel` | string | `anthropic/claude-haiku` | Model used for belief extraction (Opus is blocked to prevent cost overruns) |
| `maxCoreBeliefs` | number | `10` | Maximum core beliefs recalled per turn |
| `maxActiveBeliefs` | number | `5` | Maximum active beliefs recalled per turn |
| `maxRecalledBeliefs` | number | `5` | Maximum prompt-relevant beliefs recalled per turn |
| `autoExtract` | boolean | `true` | Automatically extract beliefs after each agent turn |
| `autoRecall` | boolean | `true` | Automatically recall beliefs before each agent turn |
| `profileEnabled` | boolean | `true` | Enable profile distillation (synthesized user summary) |
| `profileRefreshTurns` | number | `50` | Regenerate profile every N turns (when beliefs change) |
| `debug` | boolean | `false` | Enable verbose logging |
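
Expressed in code, a config resolver for the table above might look like this. The key names and defaults mirror the table exactly; the `BeeConfig` type and `resolveConfig` helper are illustrative.

```typescript
// Key names and defaults mirror the config table; the helper is a sketch.
export interface BeeConfig {
  dbPath: string;
  extractionModel: string;
  maxCoreBeliefs: number;
  maxActiveBeliefs: number;
  maxRecalledBeliefs: number;
  autoExtract: boolean;
  autoRecall: boolean;
  profileEnabled: boolean;
  profileRefreshTurns: number;
  debug: boolean;
}

export const DEFAULTS: BeeConfig = {
  dbPath: "~/.openclaw/workspace/state/bee.db",
  extractionModel: "anthropic/claude-haiku",
  maxCoreBeliefs: 10,
  maxActiveBeliefs: 5,
  maxRecalledBeliefs: 5,
  autoExtract: true,
  autoRecall: true,
  profileEnabled: true,
  profileRefreshTurns: 50,
  debug: false,
};

// All configuration is optional: user overrides win, defaults fill the rest.
export function resolveConfig(overrides: Partial<BeeConfig> = {}): BeeConfig {
  return { ...DEFAULTS, ...overrides };
}
```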

## Architecture

```
src/
├── index.ts        # Plugin entry — registers hooks, parses config
├── types.ts        # TypeScript types (BeeConfig, BeliefRow, etc.)
├── prefilter.ts    # Message prefilter — discards noise, passes signal
├── extract.ts      # Belief extraction — classification, scoring, storage
├── recall.ts       # Tiered recall — core/active/recalled + profile beliefs query
├── profile.ts      # Profile distillation — LLM-synthesized user summary
├── db.ts           # SQLite operations — schema, CRUD, queries
└── openclaw-plugin-sdk.d.ts  # OpenClaw plugin SDK type declarations
```

### Prefilter intelligence

The prefilter is the first line of defense against noisy extractions. It uses pattern matching to classify messages:

- **DISCARD**: "ok", "thanks", "nice", emoji-only, URLs, assistant messages, system messages
- **CONTEXT_ONLY**: Questions without personal info, imperative directives, pure task commands
- **PASS**: First-person declarations, messages with named entities/specifics, decision language, anything 4+ words that doesn't match a discard pattern
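
The three-way classification above can be sketched as follows, with simplified patterns standing in for BEE's actual rules.

```typescript
// Sketch only: simplified patterns, not BEE's real prefilter rules.
export type PrefilterVerdict = "DISCARD" | "CONTEXT_ONLY" | "PASS";

export function prefilter(message: string): PrefilterVerdict {
  const text = message.trim();

  const ACK = /^(ok(ay)?|thanks?|thank you|nice|cool|got it)[.!]?$/i;
  const NO_TEXT = /^[^a-z0-9]*$/i;          // emoji- or punctuation-only
  const URL_ONLY = /^https?:\/\/\S+$/i;     // URL without commentary
  if (text.length === 0 || ACK.test(text) || NO_TEXT.test(text) || URL_ONLY.test(text)) {
    return "DISCARD";
  }

  // First-person declarations and decision language always pass.
  const FIRST_PERSON = /\b(i|i'm|my|we|our)\b/i;
  const DECISION = /\b(decided|we're going with|let's use)\b/i;
  if (FIRST_PERSON.test(text) || DECISION.test(text)) return "PASS";

  // Questions carry context but no durable personal beliefs.
  if (text.endsWith("?")) return "CONTEXT_ONLY";

  // Anything 4+ words that didn't match a discard pattern still passes.
  return text.split(/\s+/).length >= 4 ? "PASS" : "CONTEXT_ONLY";
}
```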

## Privacy

- **100% local** — all belief data stays on your machine in a SQLite file
- **No cloud storage** — nothing leaves your device unless you explicitly configure an extraction model that makes API calls
- **No telemetry** — BEE collects zero analytics or usage data
- **You own your data** — the SQLite database is a standard file you can inspect, export, or delete at any time

## Comparison

| Feature | BEE | Cloud-based memory (e.g. Supermemory) |
|---|---|---|
| Storage | Local SQLite | Cloud servers |
| Cost | Free & open source | ~$20/mo subscription |
| Privacy | Data never leaves your machine | Data stored on third-party servers |
| Extraction | LLM-powered semantic classification | Mostly regex/rules |
| Recall | Tiered (core/active/recalled) | Flat retrieval |
| Customization | Full config + open source | Limited |
| Offline | Works completely offline | Requires internet |

## Development

```bash
# Install dependencies
npm install

# Run prefilter tests
npm test

# Type check
npm run typecheck
```

## Requirements

- [OpenClaw](https://github.com/skysphere-labs/openclaw) >= 2026.2.0
- Node.js (for better-sqlite3)

## License

MIT — see [LICENSE](LICENSE)

---

Built by [Skysphere Labs](https://skyslabs.ai)