LycheeMem


Cognitive memory system for long-horizon AI agents. Persistent, structured, temporally-aware memory with three-tier architecture.


Install

```shell
npm install
```

Configuration Example

```
// Request
{
  "query": "what tools do I use for database backups",
  "top_k": 5,
  "include_graph": true,
  "include_skills": true
}

// Response
{
  "query": "...",
  "graph_results": [ { "fact_id": "...", "summary": "...", "relevance": 0.91, ... } ],
  "skill_results": [ { "id": "...", "intent": "pg_dump backup to S3", "score": 0.87, ... } ],
  "total": 6
}
```

README

<div align="center">
  <img src="assert/logo.png" alt="LycheeMem Logo" width="200">
  <h1>LycheeMem</h1>
  <p>
    <img src="https://img.shields.io/badge/License-Apache_2.0-blue.svg" alt="License">
    <img src="https://img.shields.io/badge/python-3.11+-blue.svg" alt="Python Version">
    <img src="https://img.shields.io/badge/LangGraph-000?style=flat&logo=langchain" alt="LangGraph">
    <img src="https://img.shields.io/badge/litellm-000?style=flat&logo=python" alt="litellm">
  </p>
  <p>
    <a href="README_zh.md">中文</a> | English
  </p>
</div>


LycheeMem is a cognitive memory system for long-horizon AI agents, providing persistent, structured, and temporally-aware memory. It models memory the way humans use it — distinguishing what you remember *happening* from what you have come to *know* — and makes those memories available at inference time through a multi-stage reasoning pipeline.

---

<div align="center" style="margin: 20px 0; font-size: 14px; color: #586069;">
  <a href="#memory-architecture" style="text-decoration: none; color: #0366d6; margin: 0 8px;">Memory Architecture</a>
  •
  <a href="#pipeline" style="text-decoration: none; color: #0366d6; margin: 0 8px;">Pipeline</a>
  •
  <a href="#quick-start" style="text-decoration: none; color: #0366d6; margin: 0 8px;">Quick Start</a>
  •
  <a href="#api-reference" style="text-decoration: none; color: #0366d6; margin: 0 8px;">API Reference</a>
  •
  <a href="#web-demo" style="text-decoration: none; color: #0366d6; margin: 0 8px;">Web Demo</a>
</div>

---

## 🔥 News

• [03/23/2026] 🎉 **LycheeMem is now Open Source!** [GitHub Repository →](https://github.com/LycheeMem/LycheeMem)

---

## 🚀 Coming Soon

📢 **OpenClaw Plugin is Coming!** — Save your tokens and optimize memory efficiency! Stay tuned!

---

## 📚 Memory Architecture

LycheeMem organizes memory into three complementary stores:

<table style="border-collapse: collapse; width: 100%; margin: 20px auto; border: 1px solid #e1e4e8; border-radius: 8px; overflow: hidden;">
  <thead>
    <tr style="background-color: #f6f8fa;">
      <th style="border: 1px solid #e1e4e8; padding: 15px; text-align: center; color: #0366d6; font-weight: 600;">Working Memory</th>
      <th style="border: 1px solid #e1e4e8; padding: 15px; text-align: center; color: #0366d6; font-weight: 600;">Semantic Memory</th>
      <th style="border: 1px solid #e1e4e8; padding: 15px; text-align: center; color: #0366d6; font-weight: 600;">Procedural Memory</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td style="border: 1px solid #e1e4e8; padding: 15px; vertical-align: top; background: white;">
        <p style="margin: 0 0 10px 0; font-size: 13px; color: #586069; font-weight: 600;">(Episodic)</p>
        <ul style="margin: 0; padding-left: 18px; font-size: 13px; line-height: 1.6; color: #24292e;">
          <li>Session turns</li>
          <li>Summaries</li>
          <li>Token budget management</li>
        </ul>
      </td>
      <td style="border: 1px solid #e1e4e8; padding: 15px; vertical-align: top; background: white;">
        <p style="margin: 0 0 10px 0; font-size: 13px; color: #586069; font-weight: 600;">(Knowledge Graph)</p>
        <ul style="margin: 0; padding-left: 18px; font-size: 13px; line-height: 1.6; color: #24292e;">
          <li>Entity nodes</li>
          <li>Bi-temporal facts</li>
          <li>Communities</li>
          <li>Episode anchors</li>
        </ul>
      </td>
      <td style="border: 1px solid #e1e4e8; padding: 15px; vertical-align: top; background: white;">
        <p style="margin: 0 0 10px 0; font-size: 13px; color: #586069; font-weight: 600;">(Skills)</p>
        <ul style="margin: 0; padding-left: 18px; font-size: 13px; line-height: 1.6; color: #24292e;">
          <li>Skill entries</li>
          <li>HyDE retrieval</li>
        </ul>
      </td>
    </tr>
  </tbody>
</table>

### 💾 Working Memory

The working memory window holds the active conversation context for a session. It operates under a **dual-threshold token budget**:

- **Warn threshold (70%)** — triggers asynchronous background pre-compression; the current request is not blocked.
- **Block threshold (90%)** — the pipeline pauses and flushes older turns to a compressed summary before proceeding.

Compression produces *summary anchors* (distilled past context) plus the *raw recent turns* (the last N turns, verbatim). Both are passed downstream as the conversation history.
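The dual-threshold budget can be sketched as follows. This is an illustrative toy, not LycheeMem's actual API: the class and method names are invented, token counting is approximated by word count, and the warn-path background scheduling is reduced to a status string.

```python
# Hypothetical sketch of the dual-threshold token budget described above.
WARN_RATIO = 0.70   # start async pre-compression (non-blocking)
BLOCK_RATIO = 0.90  # flush older turns before proceeding

class WorkingMemory:
    def __init__(self, max_tokens: int, keep_recent: int = 4):
        self.max_tokens = max_tokens
        self.keep_recent = keep_recent   # raw turns kept verbatim
        self.turns: list[str] = []
        self.summary: str | None = None  # summary anchor

    def _used(self) -> int:
        # toy token count: whitespace-split words
        parts = [self.summary or ""] + self.turns
        return sum(len(p.split()) for p in parts)

    def check_budget(self) -> str:
        ratio = self._used() / self.max_tokens
        if ratio >= BLOCK_RATIO:
            self.compress()  # synchronous flush
            return "blocked-then-flushed"
        if ratio >= WARN_RATIO:
            # the real system would schedule compression in the background
            return "warn"
        return "ok"

    def compress(self) -> None:
        old = self.turns[:-self.keep_recent]
        recent = self.turns[-self.keep_recent:]
        if old:
            # stand-in for an LLM-generated summary of the older turns
            self.summary = f"[summary of {len(old)} earlier turns]"
        self.turns = recent
```

The key property is that only the 90% threshold ever blocks the request path; the 70% threshold merely kicks off work early so the hard limit is rarely hit.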

### 🗺️ Semantic Memory — Graphiti Knowledge Graph

The knowledge graph is implemented as a **Graphiti-style bi-temporal graph** backed by Neo4j. It stores the world in four node types:

| Node | Purpose |
|------|---------|
| `Episode` | A single conversation turn; all facts are traceable to source episodes |
| `Entity` | A named entity (person, project, place, concept, etc.) |
| `Fact` | A typed relation between two entities with temporal validity and transaction metadata |
| `Community` | A cluster of strongly related entities, carrying a periodically refreshed summary |

#### Bi-temporal Model

Every `Fact` carries four timestamps that separate *when something was true* from *when the system learned it*:

```
t_valid_from / t_valid_to   →  Valid time   (real-world truth interval)
t_tx_created / t_tx_expired →  Transaction time (system-side record interval)
```

This allows the graph to answer queries like *"what was the user's home address last month?"* correctly even if the address has since changed, and to distinguish genuinely contradictory facts from facts that were true at different times.

#### Graph Retrieval Pipeline

Retrieval combines three complementary signals:

1. **BM25 full-text search** — keyword-level recall against `Entity.name` and `Fact.fact_text` via a Neo4j full-text index.
2. **BFS graph traversal** — starts from the most recent episode nodes for the session and expands outward, up to a configurable depth, surfacing semantically linked facts even when they do not match keyword terms.
3. **Vector ANN search** — approximate nearest-neighbour over the `Entity.embedding` vector index (configurable dimensionality and similarity function).

After retrieval, candidates are re-ranked using **Reciprocal Rank Fusion (RRF)** across all three lists. Optionally, a **cross-encoder reranker** (driven by the same LLM adapter already in the pipeline, no extra vendor SDK) refines the top-N results, followed by **Maximal Marginal Relevance (MMR)** diversification to avoid near-duplicate context in the final prompt.
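The RRF step can be sketched in a few lines. This is a generic implementation under the common default constant `k = 60` from the RRF literature; LycheeMem's actual constant and tie-breaking are not shown in this README.

```python
# Reciprocal Rank Fusion: each list votes 1 / (k + rank) for its candidates.
def rrf_fuse(ranked_lists: list[list[str]], k: int = 60) -> list[str]:
    scores: dict[str, float] = {}
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# hypothetical result lists from the three retrieval signals
bm25   = ["f3", "f1", "f7"]
bfs    = ["f1", "f9", "f3"]
vector = ["f1", "f3", "f2"]

fused = rrf_fuse([bm25, bfs, vector])
```

RRF needs no score calibration across the three signals, which is why it works well for fusing BM25, graph traversal, and ANN results that live on incomparable scales; the reranker and MMR steps then operate on the fused top-N.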

#### Community Detection

A background sweep runs `refresh_all_communities()` every *N* episodes globally (default: 50). Community summaries are included in graph search results to provide broad contextual framing even when no specific fact directly matches a query.
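The trigger condition is simple modular counting; a hypothetical sketch (the callable stands in for the real `refresh_all_communities()`):

```python
REFRESH_EVERY = 50  # default sweep interval, in episodes

def maybe_refresh(total_episodes: int, refresh_all_communities) -> bool:
    """Run the community sweep once every REFRESH_EVERY ingested episodes."""
    if total_episodes > 0 and total_episodes % REFRESH_EVERY == 0:
        refresh_all_communities()
        return True
    return False
```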

### 🛠️ Procedural Memory — Skill Store

The skill store preserves reusable *how-to* knowledge as structured skill entries, each carrying:

- **Intent** — a short description of what the skill does.
- **`doc_markdown`** — a full Markdown document describing the procedure, commands, parameters, and caveats.
- **Embedding** — a dense vector of the intent text, used for similarity search.
- **Metadata** — usage counters, last-used timestamp, preconditions.

Skill retrieval uses **HyDE (Hypothetical Document Embeddings)**: the query is first expanded into a *hypothetical ideal answer* by the LLM, then that draft text is embedded to produce a query vector that matches well against stored procedure descriptions, even when the user's original phrasing is vague.
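HyDE in miniature: embed an LLM-drafted hypothetical answer instead of the raw query, then rank skills by similarity. Everything here is a stand-in for illustration: `draft_answer` replaces the LLM adapter, and the bag-of-words `embed` replaces a real dense embedding model.

```python
import math

def embed(text: str) -> dict[str, float]:
    # toy bag-of-words "embedding", for illustration only
    vec: dict[str, float] = {}
    for tok in text.lower().split():
        vec[tok] = vec.get(tok, 0.0) + 1.0
    return vec

def cosine(a: dict[str, float], b: dict[str, float]) -> float:
    dot = sum(a[t] * b.get(t, 0.0) for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def hyde_search(query: str, draft_answer, skills: dict[str, str]) -> str:
    hypothetical = draft_answer(query)   # LLM expands the vague query
    qvec = embed(hypothetical)           # embed the draft, not the query
    return max(skills, key=lambda sid: cosine(qvec, embed(skills[sid])))

skills = {
    "s1": "run pg_dump nightly and upload the archive to S3",
    "s2": "rotate kubernetes TLS certificates with cert-manager",
}
# canned hypothetical answer standing in for the LLM
draft = lambda q: "use pg_dump to back up the database and upload to S3"
best = hyde_search("how do I save my db somewhere safe", draft, skills)
```

The vague query shares almost no vocabulary with the stored skill intents, but the hypothetical answer does, which is exactly the gap HyDE closes.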

---

## ⚙️ Pipeline

Every request passes through a fixed sequence of five agents. Four are synchronous stages in the LangGraph pipeline; one is a background post-processing task.

<div align="center">
  <div style="display: flex; flex-direction: column; align-items: center; font-family: -apple-system,BlinkMacSystemFont,Segoe UI,Helvetica,Arial,sans-serif; gap: 8px;">
    <div style="font-weight: bold; color: #586069; font-size: 14px;">START</div>
    <div style="font-size: 18px; color: #d1d5da; line-height: 1;">▼</div>
    <div style="border: 1px solid #e1e4e8; border-radius: 8px; padding: 15px; background-color: #f6f8fa; width: 100%; max-width: 600px; box-shadow: inset 0 1px 3px rgba(27,31,35,0.02);">
      <div style="display: flex; flex-direction: column; gap: 8px; text-align: left;">
        <div style="padding: 12px; border-left: 5px solid #0366d6; background: white; border-radius: 4px; box-shadow: 0 1px 2px rgba(0,0,0,0.06); color: #24292e;">
          <strong style="color: #0366d6;">1. WMManager</strong> — Token budget check + compress/render
        </div>
        <div style="text-align: center; color: #d1d5da; font-size: 16px; margin: -4px 0;">↓</div>
        <div style="padding: 12px; border-left: 5px solid #0366d6; background: white; border-radius: 4px; box-shadow: 0 1px 2px rgba(0,0,0,0.06); color: #24292e;">
          <strong style="color: #0366d6;">2. SearchCoordinator</strong> — Multi-query → Graph + Skill retrieval
        </div>
        <div style="text-align: center; color: #d1d5da; font-size: 16px; margin: -4px 0;">↓</div>
        <div style="padding: 12px; border-left: 5px solid #0366d6; background: white; border-radius: 4px; box-shadow: 0 1px 2px rgba(0,0,0,0.06); color: #24292e;">
          <strong style="color: #0366d6;">3. SynthesizerAgent</strong> — LLM-as-Judge scoring + context fusion
        </div>
        <div style="text-align: center; color: #d1d5da; font-size: 16px; margin: -4px 0;">↓</div>
        <div style="padding: 12px; border-left: 5px solid #28a745; background: white; border-radius: 4px; box-shadow: 0 1px 2px rgba(0,0,0,0.06); color: #24292e;">
          <strong style="color: #28a745;">4. ReasoningAgent</strong> — Final response generation
        </div>
      </div>
    </div>
    <div style="font-size: 18px; color: #d1d5da; line-height: 1;">▼</div>
    <div style="font-weight: bold; color: #586069; font-size: 14px;">END</div>
  </div>
</div>

... (truncated)
