
Gralkor

By elimydlarz ⭐ 1 star 👁 8 views ▲ 0 votes

OpenClaw plugin that gives your agents great long-term, temporally-aware memory.

Homepage GitHub

Install

openclaw plugins install @susu-eng/gralkor@latest

Configuration Example

{
  "dataDir": "/path/to/gralkor-data",
  "workspaceDir": "~/.openclaw/workspace",
  "googleApiKey": "your-gemini-key",
  "llm": { "provider": "gemini", "model": "gemini-3.1-flash-lite-preview" },
  "embedder": { "provider": "gemini", "model": "gemini-embedding-2-preview" },
  "autoCapture": { "enabled": true },
  "autoRecall": { "enabled": true, "maxResults": 10 },
  "search": { "maxResults": 20, "maxEntityResults": 10 },
  "idleTimeoutMs": 300000,
  "ontology": {
    "entities": {},
    "edges": {},
    "edgeMap": {}
  },
  "test": false
}

README

# Gralkor

**The best memory plugin for OpenClaw agents**

Gralkor is an OpenClaw plugin that gives your agents long-term, temporally-aware memory. It uses [Graphiti](https://github.com/getzep/graphiti) (by Zep) for knowledge graph construction and [FalkorDB](https://www.falkordb.com/) as the graph database backend. Both run automatically as managed subprocesses - no separate server for you to manage, and no SaaS company to connect to.

Gralkor automatically remembers and recalls everything your agent says, _thinks_, and _does_ — no prompt engineering required by the operator, no conscious (haha) effort required by the agent.

## Why Gralkor

Let's look in detail at the decisions behind Gralkor and why they make it the best memory plugin for OpenClaw.

**Graphs, not Markdown or pure vector.** Graphs are the right data structure for representing knowledge. Your code is a graph; the _world_ is a deeply interrelated graph, and trying to flatten it into Markdown files or pure vector embeddings is fighting reality. Gralkor doesn't use MD files (other than indexing yours), and this is not another chunking strategy or embedding experiment. Graphiti has already solved this layer, and Gralkor leverages it optimally for this use case.

[HippoRAG](https://arxiv.org/abs/2405.14831) (NeurIPS 2024) found graph-based retrieval reaches 89.1% recall@5 on 2WikiMultiHopQA versus 68.2% for flat vector retrieval — a 20.9-point gap. [AriGraph](https://arxiv.org/abs/2407.04363) (IJCAI 2025) independently found KG-augmented agents markedly outperform RAG, summarization, and full-conversation-history baselines across interactive environments.

**Remembering behaviour, not just dialog.** Agents make mistakes, weigh options, reject approaches - they _learn_ as they complete tasks. Gralkor distills the agent's behaviour - not just its dialog - into first-person behavioural reports woven into episode transcripts before ingestion.

For almost all other memory plugins, your agent is inherently dishonest with you: it frequently claims to remember what it has done when it only remembers what it _already claimed_ to have done, or claims to have thought what it is _only now imagining_.

With Gralkor your agent actually remembers its thoughts and actions.

[Reflexion](https://arxiv.org/abs/2303.11366) (NeurIPS 2023) showed agents storing self-reflective reasoning traces outperform GPT-4 output-only baselines by 11 points on HumanEval. [ExpeL](https://arxiv.org/abs/2308.10144) (AAAI 2024) directly ablated reasoning-trace storage versus output-only: +11–19 points across benchmarks from storing the reasoning process alone.

**Maximum context at ingestion.** Gralkor captures all messages in each session of work, distills behaviour, and feeds results to Graphiti *as whole episodes*. Extraction works _way_ better when Graphiti has full context.

Most memory plugins save isolated question-answer pairs or summarized snippets: some store only the first user message and the last assistant reply, others store only the last turn.

Gralkor captures the entire series of questions, thoughts, actions, and responses that _solved the problem_ together, with all their interrelationships. Richer semantics, better understanding, better recall.

[SeCom](https://arxiv.org/abs/2502.05589) (ICLR 2025) found coherent multi-turn episode storage scores 5.99 GPT4Score points higher than isolated turn-level storage on LOCOMO. [LongMemEval](https://arxiv.org/abs/2410.10813) (ICLR 2025) confirms: fact-level QA-pair extraction drops accuracy from 0.692 to 0.615 versus full-round episode storage.
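The whole-episode idea can be sketched in a few lines. This is illustrative only - the function, turn format, and role labels are assumptions for the sketch, not Gralkor's actual code:

```python
# Illustrative sketch of whole-episode capture: instead of storing isolated
# Q/A pairs, every turn in a session - including the agent's thoughts - is
# kept and joined into a single transcript before ingestion.

def build_episode(turns):
    """Flatten a session's (role, text) turns into one episode transcript."""
    return "\n".join(f"{role}: {text}" for role, text in turns)

turns = [
    ("user", "Why did we drop MySQL?"),
    ("assistant-thought", "Recall: MySQL lacked the JSONB support we needed."),
    ("assistant", "We rejected MySQL because it lacked JSONB column support."),
]

episode = build_episode(turns)
```

Ingesting `episode` as one unit preserves the interrelationships between question, reasoning, and answer that per-turn storage throws away.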

**Built for the long term.** Graphiti (and therefore Gralkor) is deeply temporal. On every ingestion, it doesn't just append; it resolves new information against the existing graph, amending, expiring, and invalidating so that your agent knows _what happened over time_.

Graphiti does the heavy temporal lifting on ingestion. That's bad for throughput, and useless for short-lived agents - which means serving a single, long-lived user agent is _the perfect use case_.

[LongMemEval](https://arxiv.org/abs/2410.10813) (ICLR 2025) established that temporal reasoning is the hardest memory sub-task for commercial LLMs; time-aware indexing recovers 7–11% of that loss. [MemoTime](https://arxiv.org/abs/2510.13614) (WWW 2026) found temporal knowledge graphs enable a 4B model to match GPT-4-Turbo on temporal reasoning, with up to 24% improvement over static memory baselines.

**Recursion through reflection.** Point your agent back at its own memory — let it reflect on what it knows, identify contradictions, synthesize higher-order insights, and do with them whatever you believe to be _good cognitive architecture_. Gralkor doesn't limit you to one approach, but the research is quite clear - you should do _something_.

My way is to use cron and [Thinker CLI](https://github.com/elimydlarz/thinker-cli) together, directing the agent to use the search and add memory tools in a sequential reflective process. Share yours, and ask to see mine.

[Reflexion](https://arxiv.org/abs/2303.11366) (NeurIPS 2023) demonstrated that agents storing verbal reflections in an episodic buffer gain 11 points with no weight updates. [Generative Agents](https://arxiv.org/abs/2304.03442) (UIST 2023) showed empirically that a reflection layer synthesizing raw memories into higher-order insights is essential for coherent long-term behavior.

**Custom ontology: model your agent's world _your way_.** Gralkor lets you define your own entity types, attributes, and relationships so that information is parsed into entities and relationships you define. Your graph doesn't have to be a black box - you can keep track of what matters to you.

You can use a domain model codified by experts in your field, or encode _your_ model of the world so that your agent shares it.

[Apple's ODKE+](https://arxiv.org/abs/2509.04696) (2025) showed ontology-guided extraction hits 98.8% precision vs 91% raw LLM; [GoLLIE](https://arxiv.org/abs/2310.03668) (ICLR 2024) directly ablated schema-constrained versus unconstrained generation on the same model, finding +13 F1 points average across NER, relation, and event extraction in zero-shot settings.
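As a sketch, an ontology fragment in the plugin config might look like this. The `entities`, `edges`, and `edgeMap` keys come from the configuration example above; the specific types, attributes, and edge names here are illustrative assumptions, not a documented schema:

```json
{
  "ontology": {
    "entities": {
      "Tool": { "description": "A CLI tool or library the agent uses" },
      "Decision": { "description": "A choice made during a task, with its rationale" }
    },
    "edges": {
      "REJECTED": { "description": "An option considered and ruled out" }
    },
    "edgeMap": {
      "Decision,Tool": ["REJECTED"]
    }
  }
}
```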

**Interpretation.** Gralkor interprets information in memory for relevance to the task at hand. This step radically improves output with minimal impact on cost and latency.

**On cost.** Gralkor costs more to run than a Markdown file in the short term. In the longer term, Gralkor provides more efficient context management, reducing token burn. Instead of paying to pollute your context window with junk every read, you pay more on ingestion in exchange for cheap, high-relevance reads forever.

An agent that remembers behaviour, decisions, your preferences, and reasoning across sessions changes the _character_ of your work. You stop spending turns re-establishing context and focus more on what you care about. A single recalled behavioural fact — "we rejected mysql because it lacked jsonb column support needed for X" — prevents re-litigating that decision in a new session - it might save 10 subagents repeating a parallel investigation of database options.

Gralkor is _good_ memory, not cheap memory. You can push the LLM choice and perhaps get better extraction, but otherwise I've made it as good as possible while staying reasonable about latency.

## Tools

- **`memory_search`** — searches the knowledge graph and returns relevant facts and entity summaries
- **`memory_add`** — stores information in the knowledge graph; Graphiti extracts entities and relationships
- **`memory_build_indices`** — rebuilds search indices and constraints (maintenance)
- **`memory_build_communities`** — detects and builds entity communities/clusters to improve search quality (maintenance)
- Hooks: auto-capture (stores full multi-turn conversations after each agent run), auto-recall (injects relevant facts before the agent responds)
- Set up: `plugins.slots.memory = "gralkor"` in `openclaw.json`
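The setup line above corresponds to this slice of `openclaw.json` (a sketch; the nesting is inferred from the CLI key paths used elsewhere in this README):

```json
{
  "plugins": {
    "slots": {
      "memory": "gralkor"
    }
  }
}
```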

## Quick Start

### 1. Prerequisites

- OpenClaw 2026.4.2
- Python 3.12+ on the system PATH
- `uv` on PATH ([install](https://docs.astral.sh/uv/getting-started/installation/))
- An API key for a supported LLM provider (see below)

### 2. Configure before installing

Config must be set **before** `plugins.allow`, because OpenClaw validates all listed plugins' config on every write.

```bash
# Required: data directory for persistent state (venv, FalkorDB database).
# Choose a path YOU control — Gralkor has no default.
# This directory survives plugin reinstalls; the plugin dir does not.
openclaw config set plugins.entries.gralkor.config.dataDir /path/to/gralkor-data

# Required: LLM API key for knowledge extraction.
# Gemini is the default provider (LLM + embeddings + reranking, one key).
openclaw config set plugins.entries.gralkor.config.googleApiKey 'your-key-here'

# Optional
openclaw config set plugins.entries.gralkor.config.test true
```
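After those commands, the relevant slice of your config should look roughly like this (a sketch assuming the key paths used above; your `dataDir` and key values will differ):

```json
{
  "plugins": {
    "entries": {
      "gralkor": {
        "config": {
          "dataDir": "/path/to/gralkor-data",
          "googleApiKey": "your-key-here"
        }
      }
    }
  }
}
```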

### 3. Install the plugin

```bash
openclaw plugins install @susu-eng/gralkor@latest --dangerously-force-unsafe-install
```

> **Why `--dangerously-force-unsafe-install`?** OpenClaw's install-time security scanner flags Gralkor as critical because of the embedded Python server. Inspect the source if you'd like to verify there's nothing weird going on.

### 4. Enable and assign the memory slot

OpenClaw has a single `memory` slot that determines which plugin provides memory to your agents. You must explicitly assign Gralkor to the `memory` slot; otherwise, installing the plugin does nothing - the auto-capture and auto-recall hooks will never fire.

```bash
# If you use an allowlist, add gralkor to it
openclaw config set --json plugins.allow '["gralkor"]'

# Enable the plugin entry
openclaw config set plugins.entries.gralkor.enabled true

# Assign Gralkor to the memory slot (replaces the built-in memory-core)
openclaw config set plugins.slots.memory gralkor

# Expose Gralkor's tools to the agent. Auto-capture an

... (truncated)
