Mem Arch

By st007097-coder

🧠 OpenClaw Memory Architect Plugin: 10-Layer Memory Defense System for AI Agents (P1-P10). Inspired by Claude Code's memory architecture.

# OpenClaw Memory Architect Plugin: From Inspiration to Implementation

![cover](./cover.jpg)

---

## 1. Introduction 🌅

The user is a Hong Kong-based QS (Quantity Surveyor) who built an AI assistant called "Ah Sing" running on the OpenClaw platform. Ah Sing can search for information, manage files, and connect to various services (Notion, Obsidian, GitHub, etc.), making it a capable personal assistant.

But in daily use, five recurring pain points kept undermining productivity:

### 🔴 Pain Point #1: Sudden Amnesia

Every session restart, Ah Sing completely forgot what it had done and decided before. Worse, sometimes a mid-session disconnect would wipe the context, forcing the user to re-explain the entire situation. Details confirmed in the previous session became a blank slate the moment a new one started.

### 🔴 Pain Point #2: Token Consumption Runaway

`tool results` routinely ran into thousands of lines: a single `read` on a large file or an `exec` running a script would burn through tokens rapidly. Sessions quickly hit the context window ceiling, unable to sustain even one complete workflow.

### 🔴 Pain Point #3: Context Accumulation Stalling

As sessions grew longer, old `tool results` piled up. Relying on prompt instructions for the LLM to self-compress was fundamentally unreliable. Sessions became painfully slow or crashed outright.

### 🔴 Pain Point #4: Memory Maintenance by Hand

All memory depended on manually maintaining `MEMORY.md`. Miss one update and context was lost. Meanwhile, `MEMORY.md` kept growing, and eventually even reading it consumed significant tokens, creating a vicious cycle.

### 🔴 Pain Point #5: Unreliable Behavior

Even with rules written in the system prompt, the LLM sometimes followed them and sometimes didn't, with zero guarantees. "Extract memories at the end of every conversation" might work three or four times out of ten.

---

**The goal was clear: build a Plugin that systematically solves all five pain points at the code level.**

---

## 2. Inspiration: AI Agent Memory Discussions on X/Twitter 💡

### 2.1 From Social Media to Deep Research

The project's inspiration came from numerous articles and discussions about AI Agent memory systems on X (Twitter). In early 2026, AI Agent memory management became a hot topic in the community, with developers sharing various approaches to memory architecture:

- **Claude Code's 7-Layer Defense**: Revealing how Anthropic internally handles Context management
- **Cross-session memory solutions**: Various attempts at memory persistence across sessions
- **Topic files architecture**: Categorizing memory by subject to prevent single-file bloat
- **Memory-as-code concept**: Elevating memory management from the prompt layer to the code layer

These articles surfaced a core insight:

> **The same pattern, implemented at prompt-level vs code-level, produces dramatically different results.**

### 2.2 CCB: The Most Valuable Reference

Among the many references, **CCB (Claude Code Best)** stood out as the most valuable. CCB is a community-maintained, reverse-engineered reconstruction of the official Claude Code CLI:

> 🔗 GitHub: [https://github.com/claude-code-best/claude-code](https://github.com/claude-code-best/claude-code)

CCB's value lies in how fully it exposes a memory defense system, covering memory extraction, session management, context compression, and more.

### 2.3 Six Core Patterns

| # | Pattern | File | Description |
|---|---------|------|-------------|
| 1 | Auto Memory Extraction | `extractMemories.ts` | Automatically extracts worth-remembering info at conversation end |
| 2 | Session Memory | `sessionMemory.ts` | Cross-session memory summary system |
| 3 | Tool Parallel Execution | `toolOrchestration.ts` | Issue multiple side-effect-free tool calls simultaneously |
| 4 | Three-Layer Compression | n/a | Auto Compact / Micro Compact / Snip Compact |
| 5 | Granular Permission System | n/a | Plan / Auto / Manual modes |
| 6 | Tool Use Summary | n/a | Auto-summarize results after each tool call |

### 2.4 Broader References

Beyond CCB, we also studied other memory system designs:

- **MemGPT / Letta**: Treating the LLM as an OS, managing Context with paging concepts
- **LangGraph's checkpointing**: State graph persistence solutions
- **AutoGen's multi-agent memory sharing**: Memory synchronization between agents
- **Various open-source AI memory libraries**: Vector stores, knowledge graphs, and other experiments

From all of this, we distilled a simple but profound design framework:

```
Naturally occurs in conversation → Prompt-Level (in AGENTS.md)
Requires automatic triggering → Code-Level (Plugin / Hook / Cron)
```

This framework would be repeatedly validated throughout development and ultimately become the core principle of the entire system.

---

## 3. The Exploration Phase: v1 to v3 Evolution 🔬

### 3.1 v1: Initial Attempt (All Prompt-Based)

**Approach:** Wrote all 6 core CCB patterns into the `AGENTS.md` system prompt, expecting the LLM to self-execute these behaviors during conversation.

**Results:**

| Feature | Stability | Notes |
|---------|-----------|-------|
| Auto Memory Extraction | ❌ Unstable | Sometimes works, sometimes forgotten |
| Session Memory | ❌ Unstable | Relies on LLM self-discipline |
| Tool Parallel Execution | ✅ Stable | The only reliable feature |
| Three-Layer Compression | ❌ Unstable | Trigger conditions too vague |
| Granular Permission System | ✅ Stable | Naturally occurs in conversation |
| Tool Use Summary | ❌ Unstable | Easily missed |

**Conclusion:** Only 2 out of 6 features worked reliably. The other 4 depended entirely on "self-discipline."

**Lesson:** More instructions in the prompt ≠ more capabilities. An LLM isn't a program; it won't strictly follow instructions.

---

### 3.2 v2: Feature Expansion (Problems Persist)

**Approach:** Added more detailed compression rules, clearer permission system descriptions, and more specific Tool Summary formats on top of v1.

**Results:** No improvement. The underlying implementation approach had not changed at all, so the extra prompt text only added instability.

**Lesson:** Prompt-level auto-triggering is inherently flawed. No matter how detailed the prompt, the LLM will occasionally "forget" to execute.

---

### 3.3 v3: Architecture Overhaul (Bold Deletion + Code-Level Replacement) 🔄

This was the most critical turning point in the entire development process. We made a bold decision: **delete most prompt-level features and replace them with code-level solutions.**

**Core Decisions:**

| Decision | Content |
|----------|---------|
| 🗑️ Delete | 4 unreliable prompt-level features (Auto Memory Extraction, Session Memory, Three-Layer Compression, Tool Use Summary) |
| ✅ Keep | 2 reliable prompt-level features (Tool Parallel Execution, Granular Permission System) |
| 🔧 Replace | Implement deleted features via Plugin / Hook / Cron code-level solutions |

**Decision Framework:**

```
Naturally occurs in conversation → Prompt-Level (in AGENTS.md)
Requires automatic triggering → Code-Level (Plugin / Hook / Cron)
```

Simple yet profound. It answered a key question: **What belongs in prompts, and what must be in code?**
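
The framework can be read as a one-line classification rule. The sketch below is illustrative only: the `Feature` shape, the `needsAutoTrigger` flag, and the `placement` function are assumed names, and the flag values are inferred from which features v3 kept or deleted.

```typescript
// Illustrative sketch of the decision framework; all names here are assumptions.
type Placement = "prompt-level" | "code-level";

interface Feature {
  name: string;
  // Must this fire on its own, without the LLM deciding to do it?
  needsAutoTrigger: boolean;
}

// Anything that needs automatic triggering moves out of the prompt and into code.
function placement(feature: Feature): Placement {
  return feature.needsAutoTrigger ? "code-level" : "prompt-level";
}

// Flag values inferred from the v3 keep/delete decisions.
const features: Feature[] = [
  { name: "Auto Memory Extraction", needsAutoTrigger: true },
  { name: "Session Memory", needsAutoTrigger: true },
  { name: "Tool Parallel Execution", needsAutoTrigger: false },
  { name: "Three-Layer Compression", needsAutoTrigger: true },
  { name: "Granular Permission System", needsAutoTrigger: false },
  { name: "Tool Use Summary", needsAutoTrigger: true },
];

// Names of the features that must be reimplemented via Plugin / Hook / Cron.
const codeLevel = features
  .filter((f) => placement(f) === "code-level")
  .map((f) => f.name);
```

Applied to the six patterns, the rule reproduces the v3 split: four features move to code and two stay in `AGENTS.md`.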

---

## 4. Architecture Design: OpenClaw's Plugin System 🏗️

### 4.1 Plugin System Overview

OpenClaw's Plugin system is a fine-grained interception system that allows developers to inject custom logic at specific points in the tool call lifecycle. This is exactly the code-level infrastructure we needed.

### 4.2 Supported Hook Points

| Hook Point | Trigger Timing | Typical Use |
|------------|----------------|-------------|
| `tool_result_persist` | After tool result returns | Result compression, summarization |
| `session:compact:before` | Before context compression | Inject summaries, check circuit breaker |
| `session:compact:after` | After context compression | Reset counters |
| `agent_end` | When session ends | Memory writes, lock release |
| `after_tool_call` | After tool call completes | Counting, statistics |
| `before_prompt_build` | Before prompt construction | Inject reminders, micro-compression |
| `session_end` | When session terminates | Cleanup, trigger dreaming |
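
To make the idea of hook points concrete, here is a minimal, self-contained sketch of a hook registry. It is not OpenClaw's actual plugin API: the `HookRegistry` class, its `on` and `emit` methods, and the payload shape are hypothetical, and only the hook names come from the table above.

```typescript
// Hypothetical sketch: NOT OpenClaw's real plugin API.
// Only the hook names are taken from the table; everything else is assumed.
type HookName =
  | "tool_result_persist"
  | "session:compact:before"
  | "session:compact:after"
  | "agent_end"
  | "after_tool_call"
  | "before_prompt_build"
  | "session_end";

type HookHandler = (payload: Record<string, unknown>) => void;

class HookRegistry {
  private handlers = new Map<HookName, HookHandler[]>();

  // Register a handler; several layers may share one hook point.
  on(hook: HookName, handler: HookHandler): void {
    const list = this.handlers.get(hook) ?? [];
    list.push(handler);
    this.handlers.set(hook, list);
  }

  // Run every handler for the hook in registration order.
  // Returns how many handlers ran.
  emit(hook: HookName, payload: Record<string, unknown> = {}): number {
    const list = this.handlers.get(hook) ?? [];
    for (const handler of list) handler(payload);
    return list.length;
  }
}

// Usage: count tool calls in code instead of asking the LLM to remember.
const registry = new HookRegistry();
let toolCalls = 0;
registry.on("after_tool_call", () => {
  toolCalls += 1;
});
registry.emit("after_tool_call", { tool: "read" });
registry.emit("after_tool_call", { tool: "exec" });
```

The point of the sketch is that the counter increments on every emitted event, regardless of what the prompt says.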

### 4.3 Plugin Architecture Diagram

```
OpenClaw Plugin Architecture
│
├── tool_result_persist (P2)
│   └── Per-Tool result compression
│
├── session:compact:before (P1 + P7)
│   ├── P1: Circuit breaker check
│   └── P7: Session Memory injection
│
├── session:compact:after (P1)
│   └── Reset circuit breaker on success
│
├── agent_end (P5 + P10)
│   ├── P5: Memory write mutex check
│   └── P10: Memory Lock release
│
├── after_tool_call (P6 + P9)
│   ├── P6: Tool call counting
│   └── P9: Topic File migration suggestion
│
├── before_prompt_build (P4 + P6 + P8 + P10)
│   ├── P4: Micro-compression
│   ├── P6: Summary injection
│   ├── P8: Skeptical Memory reminder
│   └── P10: Lock cleanup
│
└── session_end (P3)
    └── Dreaming trigger
```
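
As an illustration of one branch of the diagram, the sketch below shows what per-tool result compression on `tool_result_persist` might look like. The line limits, the head/tail strategy, and the function name `compressToolResult` are all assumptions made for this example; they are not the plugin's documented rules.

```typescript
// Hypothetical sketch of per-tool result compression (P2).
// Limits and the head/tail strategy are assumptions, not the plugin's actual rules.
const MAX_LINES_BY_TOOL: Record<string, number> = { read: 200, exec: 100 };
const DEFAULT_MAX_LINES = 150;

function compressToolResult(tool: string, output: string): string {
  const limit = MAX_LINES_BY_TOOL[tool] ?? DEFAULT_MAX_LINES;
  const lines = output.split("\n");
  if (lines.length <= limit) return output; // short results pass through untouched

  // Keep the first and last parts of the output and mark what was dropped.
  const head = Math.ceil(limit / 2);
  const tail = limit - head;
  const omitted = lines.length - limit;
  return [
    ...lines.slice(0, head),
    `[... ${omitted} lines omitted ...]`,
    ...lines.slice(lines.length - tail),
  ].join("\n");
}

// Usage: a 1000-line `exec` result shrinks to 100 content lines plus one marker line.
const bigOutput = Array.from({ length: 1000 }, (_, i) => `line ${i}`).join("\n");
const compressed = compressToolResult("exec", bigOutput);
```

Because the truncation runs in a hook rather than in the prompt, every oversized result is bounded before it reaches the context window.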

### 4.4 Why Plugin Over Other Approaches?

| Approach | Pros | Cons |
|----------|------|------|
| Prompt (AGENTS.md) | Simple, flexible | Unreliable, can't auto-trigger |
| Cron Job | Precise timing | Can't intercept tool call lifecycle |
| **Plugin** | **Auto-trigger, fine-grained interception** | **Requires understanding Hook Points** |
| External Service | Powerful | Depends on additional infrastructure |

The Plugin approach was the best fit: it combines auto-triggering and fine-grained interception without additional infrastructure.

---

## 5. Development: The 10-Layer Defense System 🛡️

### P1: Circuit Breaker ⚡

| Item | Details |
|------|---------|
| **Purpose** | Prevent infinite retries from wasting API budget |
| **Hook Points** | `session:compact:before`, `session:compact:after`, `compaction_failure` |
| **Implementation** | Tracks consecutive failures; pauses compression for 30 minutes after 3 failures |
| **Tests** | 8 test cases, all passing ✅ |
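
The behavior in the table (pause compression for 30 minutes after 3 consecutive failures) can be sketched as a small state machine. This is a minimal illustration under assumed names: `CompactionCircuitBreaker` and its methods are hypothetical and do not reproduce the plugin's source.

```typescript
// Minimal sketch of the P1 behavior described in the table.
// Class and method names are hypothetical; the thresholds come from the table.
class CompactionCircuitBreaker {
  private consecutiveFailures = 0;
  private pausedUntil = 0; // epoch milliseconds; 0 means "not paused"
  private readonly maxFailures: number;
  private readonly pauseMs: number;

  constructor(maxFailures = 3, pauseMs = 30 * 60 * 1000) {
    this.maxFailures = maxFailures;
    this.pauseMs = pauseMs;
  }

  // Intended for a `session:compact:before` handler: may compaction run now?
  canCompact(now: number = Date.now()): boolean {
    return now >= this.pausedUntil;
  }

  // Called when compaction fails; opens the breaker once the threshold is hit.
  recordFailure(now: number = Date.now()): void {
    this.consecutiveFailures += 1;
    if (this.consecutiveFailures >= this.maxFailures) {
      this.pausedUntil = now + this.pauseMs;
      this.consecutiveFailures = 0;
    }
  }

  // Intended for a `session:compact:after` handler on success.
  recordSuccess(): void {
    this.consecutiveFailures = 0;
    this.pausedUntil = 0;
  }
}

// Usage: three failures in a row block compaction for the next 30 minutes.
const breaker = new CompactionCircuitBreaker();
const start = Date.now();
for (let i = 0; i < 3; i++) breaker.recordFailure(start);
const blockedAfterOneMinute = !breaker.canCompact(start + 60_000);
const allowedAfterPause = breaker.canCompact(start + 30 * 60 * 1000);
```

Passing `now` explicitly keeps the class easy to test without waiting on a real clock.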

The circuit breaker borrows from a classic distributed systems pattern. When co

... (truncated)