TIP
MemPalace is a local AI memory system co-developed by actress Milla Jovovich (of The Fifth Element) and her technical partner Ben Sigman. On the LongMemEval benchmark it achieves 96.6% R@5 in raw verbatim mode, with no external API calls at any point—completely free. GitHub: milla-jovovich/mempalace
Intermediate · About 20 minutes · You’ll master the full MemPalace deployment workflow, understand the Palace layer structure, and learn how to connect the memory system to any AI using MCP.
Target Audience
Developers with 1–3 years of experience who have used AI coding tools such as Claude Code, Cursor, or Copilot, and who want their AI to retain contextual memory across long-running projects—so it doesn't start from scratch every time.
Core Dependencies & Environment
- Python: 3.9+
- Core dependencies: chromadb>=0.5.0,<0.7 and pyyaml>=6.0
- Install: pip install mempalace or uv pip install mempalace
- Operating System: macOS / Linux / Windows (WSL works too)
- Storage: ChromaDB (vector store) + SQLite (knowledge graph), all stored locally—no network required
WARNING
MemPalace is designed to run purely locally; your data never leaves your machine. The only network access is the initial pip install, which pulls in ChromaDB as a dependency. Once installed, ChromaDB itself needs no internet access—just confirm it imports cleanly.
Full Project Structure
mempalace/
├── mempalace/ # Core Python package
│ ├── cli.py # CLI entry point, routes to mine/search/init, etc.
│ ├── mcp_server.py # MCP server, exposes 19 tools
│ ├── knowledge_graph.py # Temporal knowledge graph (SQLite)
│ ├── palace_graph.py # Palace navigation graph (BFS traversal, tunnel discovery)
│ ├── convo_miner.py # Conversation mining, split by Q+A
│ ├── miner.py # Project file mining, split by paragraphs
│ ├── searcher.py # Semantic search (ChromaDB)
│ ├── normalize.py # Normalize 5 dialogue formats
│ ├── dialect.py # AAAK compressed dialect
│ ├── layers.py # Four-layer memory stack (L0–L3)
│ ├── onboarding.py # Initialization onboarding
│ ├── entity_detector.py # Automatically detects person/project names
│ └── split_mega_files.py # Split/merge large session files
├── hooks/ # Claude Code auto-save hooks
│ ├── mempal_save_hook.sh # Auto-save every 15 messages
│ └── mempal_precompact_hook.sh # Emergency save before context compression
├── benchmarks/ # Reproducible benchmarks (LongMemEval / LoCoMo)
│ ├── longmemeval_bench.py
│ ├── locomo_bench.py
│ └── BENCHMARKS.md
└── examples/
├── basic_mining.py
└── mcp_setup.md
Step-by-Step
Step 1: Install
pip install mempalace
The minimum Python version is 3.9. After installation, confirm it works:
mempalace --version
TIP
If you use uv, run uv pip install mempalace instead—the result is identical.
Step 2: Initialize the Memory Palace
mempalace init ~/projects/myapp
The init command starts the guided flow and asks you, in order:
- The people you often collaborate with (add them to the wing config)
- The project you’re working on (one wing per project)
- Your AI identity (writes to the L0 layer)
After onboarding, it generates two configuration files:
- ~/.mempalace/config.json — global config (palace path, etc.)
- ~/.mempalace/wing_config.json — wing and keyword mapping
The generated wing_config.json looks roughly like this:
{
"default_wing": "wing_general",
"wings": {
"wing_kai": { "type": "person", "keywords": ["kai", "kai's"] },
"wing_driftwood": { "type": "project", "keywords": ["driftwood", "analytics", "saas"] }
}
}
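To make the keyword-to-wing mapping concrete, here is a minimal sketch of how such a config could be consumed. The route_to_wing helper is hypothetical—real routing happens inside MemPalace—but it shows the matching logic the config implies:

```python
def route_to_wing(text, wing_config):
    """Pick the first wing whose keywords appear in the text, else the default."""
    low = text.lower()
    for wing, spec in wing_config["wings"].items():
        if any(kw in low for kw in spec["keywords"]):
            return wing
    return wing_config["default_wing"]

cfg = {
    "default_wing": "wing_general",
    "wings": {
        "wing_kai": {"type": "person", "keywords": ["kai", "kai's"]},
        "wing_driftwood": {"type": "project",
                           "keywords": ["driftwood", "analytics", "saas"]},
    },
}

print(route_to_wing("notes on the driftwood pricing page", cfg))  # → wing_driftwood
print(route_to_wing("weekend reading list", cfg))                 # → wing_general
```

Anything that matches no keyword falls through to default_wing, which is why the onboarding flow asks for your people and projects up front.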
Every time the AI starts, it only loads L0 + L1 (about 170 tokens), and it already knows what your world looks like.
Step 3: Mine Data
MemPalace supports two mining modes—choose based on your data source.
Mode A: Mine Project Files (code, documents, notes)
mempalace mine ~/projects/myapp
The miner recursively scans the directory, splits content by paragraphs, stores it in ChromaDB, and keeps the original content in the Drawer.
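The paragraph-splitting step can be sketched as follows. This is an illustration of the chunking idea, not miner.py's actual code; the function name and the min_len threshold are invented for the example:

```python
import re

def split_paragraphs(text, min_len=20):
    """Split on blank lines and drop fragments too short to be worth indexing."""
    chunks = [p.strip() for p in re.split(r"\n\s*\n", text)]
    return [p for p in chunks if len(p) >= min_len]

doc = """First paragraph about the auth migration plan.

Second paragraph: we chose GraphQL over REST for the new API.

ok"""

chunks = split_paragraphs(doc)
print(len(chunks))  # → 2 (the trailing "ok" fragment is dropped)
```

Each surviving chunk becomes one embedded entry in ChromaDB, with the verbatim text preserved in the Drawer.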
Mode B: Mine Conversation Exports (Claude/ChatGPT/Slack)
# Basic usage
mempalace mine ~/chats/ --mode convos
# Specify a wing to make later project filtering easier
mempalace mine ~/chats/ --mode convos --wing myapp
# Enable auto-classification (extract decision-making, preferences, milestones, questions, and emotional context)
mempalace mine ~/chats/ --mode convos --extract general
The convo_miner splits conversations into Q+A pairs and automatically detects which room each pair belongs to (via 70+ matching patterns in room_detector_local.py; no API needed).
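The core Q+A pairing idea can be sketched like this. The real convo_miner handles multiple export formats and edge cases; split_qa below is an illustrative stand-in:

```python
def split_qa(messages):
    """Pair each user message with the assistant reply that follows it."""
    pairs, question = [], None
    for m in messages:
        if m["role"] == "user":
            question = m["content"]
        elif m["role"] == "assistant" and question is not None:
            pairs.append({"q": question, "a": m["content"]})
            question = None
    return pairs

convo = [
    {"role": "user", "content": "Why did we switch to GraphQL?"},
    {"role": "assistant", "content": "REST needed too many round trips."},
    {"role": "user", "content": "Who owns the migration?"},
    {"role": "assistant", "content": "Maya, as of January."},
]

print(len(split_qa(convo)))  # → 2
```

Each pair is then routed to a room and stored as one retrievable memory, which is why well-separated sessions mine better than merged mega-files.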
TIP
If your ChatGPT/Claude export file contains multiple merged sessions, split it into single-session files first using mempalace split ~/chats/; mining results will be better.
Step 4: Semantic Search Validation
After mining, try searching:
mempalace search "why did we switch to GraphQL"
Add wing filtering to search only within a specific project:
mempalace search "auth decision" --wing driftwood
Make it even more precise by adding room filtering:
mempalace search "auth decision" --wing driftwood --room auth-migration
The results are the original text from the Drawer (verbatim)—no summaries, no information loss. ChromaDB performs vector search, while Closet provides structured summaries.
Step 5: Connect the MCP Server
MCP (Model Context Protocol) exposes MemPalace as a set of tools to any AI. Configure it once and it stays in effect.
Connect Claude Code:
claude mcp add mempalace -- python -m mempalace.mcp_server
After configuration, Claude Code automatically gets 19 tools. The AI will call mempalace_search itself when needed—you don’t have to search manually.
Connect the Gemini CLI:
# See examples/gemini_cli_setup.md — the Gemini CLI reads MCP servers from ~/.gemini/settings.json:
{
  "mcpServers": {
    "mempalace": {
      "command": "python",
      "args": ["-m", "mempalace.mcp_server"]
    }
  }
}
Gemini CLI has more complete support for MCP, and save hooks can also be configured automatically.
MCP Tool List (19 tools):
| Tool | Purpose |
|---|---|
| mempalace_status | Return a full view of the Palace + the AAAK protocol |
| mempalace_list_wings | List all wings and memory counts |
| mempalace_list_rooms | List rooms inside a given wing |
| mempalace_search | Semantic search with wing/room filtering |
| mempalace_kg_query | Query temporal relations of entities |
| mempalace_kg_add | Add fact triples |
| mempalace_kg_invalidate | Invalidate a fact |
| mempalace_kg_timeline | Generate a temporal story for an entity |
| mempalace_diary_write | Agent writes an AAAK diary |
| mempalace_diary_read | Agent reads an AAAK diary |
| mempalace_traverse | BFS traverse a wing |
| mempalace_find_tunnels | Discover tunnels across wings |
| ... | ... |
From the mempalace_status output, the AI automatically learns the AAAK syntax and memory protocol—no prompt configuration required.
Step 6: Configure Claude Code Auto-Save Hooks
Claude Code’s Hooks let MemPalace automatically save memories during every conversation.
Edit ~/.claude/settings.json (Claude Code global config) and add:
{
"hooks": {
"Stop": [
{
"matcher": "",
"hooks": [
{
"type": "command",
"command": "/path/to/mempalace/hooks/mempal_save_hook.sh"
}
]
}
],
"PreCompact": [
{
"matcher": "",
"hooks": [
{
"type": "command",
"command": "/path/to/mempalace/hooks/mempal_precompact_hook.sh"
}
]
}
]
}
}
Difference between the two hooks:
- Stop: Triggers once every 15 messages. Performs a structured save—recording the topic, decisions, references, and code changes—and rebuilds the L1 layer (key facts).
- PreCompact: Triggers before context compression. It urgently rescues any memory that hasn’t been saved yet, preventing important context from being lost during compression.
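The save cadence the two hooks imply can be sketched in a few lines. SAVE_EVERY and should_save are illustrative names, not taken from the actual hook scripts:

```python
SAVE_EVERY = 15

def should_save(message_count, compacting=False, save_every=SAVE_EVERY):
    """Stop hook: save on every 15th message. PreCompact: always save first."""
    if compacting:  # PreCompact fires regardless of the counter
        return True
    return message_count > 0 and message_count % save_every == 0

print([n for n in range(1, 46) if should_save(n)])  # → [15, 30, 45]
print(should_save(7, compacting=True))              # → True
```

The PreCompact path is the safety net: even mid-batch (message 7 of 15), a compaction event forces a save so nothing is lost.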
WARNING
The hook scripts are invoked via shell paths. After cloning, place them in a fixed location and write those absolute paths into the config. The scripts themselves perform no dangerous operations—they only write structured memories into ChromaDB.
Step 7: Understand the Palace Structure
MemPalace’s core abstraction is the “memory palace”—borrowing memory techniques from ancient Greek orators, using spatial structure instead of a flat search index.
WING: kai (person)
┌──────────┐ ──hall── ┌──────────┐
│ auth-mig │ │ security │
└────┬─────┘ └──────────┘
│
▼
┌──────────┐ ┌──────────┐
│ Closet │ ───▶ │ Drawer │ ← original text exists here
└──────────┘ └──────────┘
TUNNEL (cross-wing connections):
kai/auth-mig ←→ driftwood/auth-mig ←→ priya/auth-mig
Wings: Either a person or a project; this is the main category for memories. Each wing can contain multiple rooms.
Rooms: Specific topics within a wing, such as auth-migration, ci-pipeline, pricing. If a room with the same name appears in different wings, tunnels are generated automatically.
Halls: Corridors for different memory types. Every wing has the same hall set:
- hall_facts — locked-in decisions
- hall_events — sessions, milestones, and debugging processes
- hall_discoveries — breakthroughs and new insights
- hall_preferences — habits, preferences, opinions
- hall_advice — recommendations and solutions
Closets: The summary layer. It points to where the original content lives (the Drawer). Original text is never lost—this just adds a navigable structure.
Drawers: Where the original text is stored. The raw verbatim mode in MemPalace reads original content from here for vector search, achieving 96.6% R@5.
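The structure above can be sketched with plain dataclasses. This mirrors the concepts (wings, rooms, tunnels) but is not MemPalace's internal object model; find_tunnels is an invented helper showing how same-named rooms across wings form tunnels:

```python
from dataclasses import dataclass, field

@dataclass
class Room:
    name: str
    closets: list = field(default_factory=list)  # summaries pointing at drawers

@dataclass
class Wing:
    name: str
    kind: str  # "person" or "project"
    rooms: dict = field(default_factory=dict)

def find_tunnels(wings):
    """Rooms sharing a name across wings form a tunnel."""
    seen = {}
    for w in wings:
        for room in w.rooms:
            seen.setdefault(room, []).append(w.name)
    return {room: names for room, names in seen.items() if len(names) > 1}

kai = Wing("kai", "person", {"auth-migration": Room("auth-migration")})
drift = Wing("driftwood", "project",
             {"auth-migration": Room("auth-migration"), "pricing": Room("pricing")})

print(find_tunnels([kai, drift]))  # → {'auth-migration': ['kai', 'driftwood']}
```

This is why naming rooms consistently across wings matters: the tunnel between kai/auth-mig and driftwood/auth-mig exists only because both rooms share the exact same name.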
Step 8: Use Knowledge Graph Temporal Relations
ChromaDB stores vectors of the original text. The Knowledge Graph (SQLite) stores structured fact triples. They complement each other.
from mempalace.knowledge_graph import KnowledgeGraph
kg = KnowledgeGraph()
# Add facts with validity windows
kg.add_triple("Kai", "works_on", "Orion", valid_from="2025-06-01")
kg.add_triple("Maya", "assigned_to", "auth-migration", valid_from="2026-01-15")
kg.add_triple("Maya", "completed", "auth-migration", valid_from="2026-02-01")
# Query what Kai is doing now
print(kg.query_entity("Kai"))
# → [Kai → works_on → Orion (current)]
# Query the state on 2026-01-20 (when Maya has not yet completed auth-migration)
print(kg.query_entity("Maya", as_of="2026-01-20"))
# → [Maya → assigned_to → auth-migration]
# View the timeline of the Orion project
print(kg.timeline("Orion"))
# → a chain of facts ordered by time
# Maya switches projects, invalidating the old fact
kg.invalidate("Maya", "assigned_to", "auth-migration", ended="2026-02-01")
# Now query_entity("Maya") no longer returns auth-migration
Fact validity windows (valid_from / ended) are a key MemPalace capability. When you query a historical state, it tells you what was true then, not what is true now.
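The validity-window mechanics can be re-implemented in miniature to show how as_of queries resolve. TinyTemporalKG below is an illustrative sketch, not the real knowledge_graph.KnowledgeGraph class:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Fact:
    subj: str
    rel: str
    obj: str
    valid_from: str
    ended: Optional[str] = None

class TinyTemporalKG:
    def __init__(self):
        self.facts = []

    def add(self, subj, rel, obj, valid_from):
        self.facts.append(Fact(subj, rel, obj, valid_from))

    def invalidate(self, subj, rel, obj, ended):
        for f in self.facts:
            if (f.subj, f.rel, f.obj) == (subj, rel, obj) and f.ended is None:
                f.ended = ended

    def query(self, entity, as_of=None):
        # ISO dates (YYYY-MM-DD) compare correctly as plain strings
        return [
            f for f in self.facts
            if f.subj == entity and (
                (as_of is None and f.ended is None)
                or (as_of is not None and f.valid_from <= as_of
                    and (f.ended is None or as_of < f.ended))
            )
        ]

kg = TinyTemporalKG()
kg.add("Maya", "assigned_to", "auth-migration", "2026-01-15")
kg.invalidate("Maya", "assigned_to", "auth-migration", "2026-02-01")

print([f.obj for f in kg.query("Maya", as_of="2026-01-20")])  # → ['auth-migration']
print([f.obj for f in kg.query("Maya")])                      # → []
```

Note the string comparison: YYYY-MM-DD dates sort lexicographically in the same order as chronologically, which is also why the troubleshooting section insists on that format for as_of.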
Step 9: The Four-Layer Memory Stack Architecture
MemPalace’s retrieval strategy has four layers: the higher the layer, the lighter the load; the lower the layer, the more precise:
| Layer | Content | Size | When loaded |
|---|---|---|---|
| L0 | AI identity (who you are) | ~50 tokens | Every session |
| L1 | Key facts (team, project, preferences) | ~120 tokens | Every session |
| L2 | Room recall (recent sessions for the current project) | On demand | When a topic hits L2 |
| L3 | Deep search (full semantic retrieval) | On demand | When explicitly asked |
Each time the AI starts, it loads L0 + L1 (mempalace wake-up). With just ~170 tokens, it has full background context. Only when a topic hits a specific room does it load L2, and only an explicit question triggers L3's full ChromaDB search.
This is also why MemPalace has extremely low cost—$10/year for search vs. $507/year for the summarization approach.
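The lazy layering can be sketched as a small function. The palace dict and wake_up helper here are illustrative, not the real layers.py interface:

```python
def wake_up(palace, topic=None, deep_query=None):
    """Always load L0+L1; add L2 on a topic hit; add L3 only when explicitly asked."""
    context = [palace["L0"], palace["L1"]]
    if topic and topic in palace["L2"]:
        context.append(palace["L2"][topic])
    if deep_query is not None:
        context.append("L3:" + deep_query)  # stand-in for a full ChromaDB search
    return context

palace = {
    "L0": "identity",
    "L1": "key facts",
    "L2": {"auth-migration": "recent auth sessions"},
}

print(len(wake_up(palace)))                          # → 2
print(len(wake_up(palace, topic="auth-migration")))  # → 3
```

The cost advantage falls out of this shape: most sessions never go past the two cheap layers, so expensive retrieval only runs when it is actually needed.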
Common Troubleshooting
Q1: Search results are empty, but you’re sure the content exists
Troubleshoot in three steps:
# 1. Confirm wing and room names are correct
mempalace list-wings
mempalace list-rooms --wing myapp
# 2. Broaden the scope—search without specifying wing/room
mempalace search "keyword" # without --wing
# 3. Check whether ChromaDB was actually written to
mempalace status # see whether drawer total count is 0
If mempalace status shows 0 drawers, mining didn’t succeed. The conversation file format may not be supported. Currently supported formats include: Claude Code JSONL, Claude.ai JSON, ChatGPT JSON, Slack JSON, and plain text.
Q2: ChromaDB collection name conflicts
The default collection name is mempalace_drawers. If you run init multiple times or in different directories, conflicts may occur. Explicitly set the path in ~/.mempalace/config.json:
{
"palace_path": "/custom/path/to/palace",
"collection_name": "mempalace_drawers"
}
Then override with --palace <path>:
mempalace search "query" --palace /custom/path/to/palace
Q3: MCP connection fails
First manually verify the MCP service starts correctly:
python -m mempalace.mcp_server
# Normally it outputs nothing and keeps running in the foreground
# Ctrl+C to exit
If you see ModuleNotFoundError, check whether MemPalace is installed correctly:
pip show mempalace
If you’re using a virtual environment, confirm that the Python path in Claude Code’s MCP config is correct:
which python # get the correct python path
claude mcp add mempalace -- /path/to/python -m mempalace.mcp_server
Q4: MCP tool calls work, but results aren’t as expected
When the AI calls mempalace_search, the wing/room parameters must match precisely to get the maximum benefit from the Palace structure. In your prompt, guide the AI to use the correct filters:
When searching for project-specific memories, always pass --wing <project>.
When searching for a specific topic, always pass --room <room-name>.
Q5: Hook scripts aren’t triggering
# Check whether Claude Code hooks are enabled
claude doctor
Make sure the hooks paths in settings.json are absolute paths. Relative paths may fail to resolve in Claude Code due to different working directories.
Q6: Knowledge graph temporal queries return unexpected results
Temporal queries depend on the as_of parameter format, which must be YYYY-MM-DD:
# Wrong format
kg.query_entity("Kai", as_of="2026/03/01")
# Correct format
kg.query_entity("Kai", as_of="2026-03-01")
Also confirm that when adding facts with add_triple, you used the correct format for valid_from; otherwise the temporal window won’t take effect.
Further Reading / Advanced Directions
AAAK Experimental Compression Layer
AAAK is a lossy compressed dialect that uses regex replacements to compress repeated entities into code. In large-scale scenarios (the same project mentioned hundreds of times), it can save token costs. However, the current raw verbatim mode (96.6%) still outperforms AAAK mode (84.2%). It’s best suited for long-running projects across many sessions with frequent repeated entities.
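The regex-dialect idea can be illustrated with an invented codebook. The real AAAK syntax lives in dialect.py and differs from this sketch; the entity codes here are made up:

```python
import re

CODEBOOK = {"auth-migration": "@AM", "driftwood": "@DW"}

def compress(text, codebook=CODEBOOK):
    """Replace long repeated entities with short codes (case-insensitive)."""
    for entity, code in codebook.items():
        text = re.sub(re.escape(entity), code, text, flags=re.IGNORECASE)
    return text

def expand(text, codebook=CODEBOOK):
    """Reverse the substitution; original casing is not recoverable."""
    for entity, code in codebook.items():
        text = text.replace(code, entity)
    return text

s = "Driftwood is blocked on auth-migration; revisit auth-migration Friday."
short = compress(s)
print(short)  # → @DW is blocked on @AM; revisit @AM Friday.
```

The round trip drops the original capitalization of "Driftwood"—a concrete instance of the lossiness that explains why raw verbatim mode still beats AAAK mode on recall.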
Specialist Agents: Isolated Multi-Agent Memory
Each agent has its own independent wing and AAAK diary:
~/.mempalace/agents/
├── reviewer.json # code quality, patterns, bugs
├── architect.json # architecture decisions, trade-offs
└── ops.json # deployment, incidents, infrastructure
At runtime, the AI dynamically discovers agents from the palace—no need to write any configuration in CLAUDE.md.
Reproduce Benchmarks
The benchmarks/ directory contains full reproduction scripts for LongMemEval and LoCoMo:
python benchmarks/longmemeval_bench.py
No API key is needed throughout. On an M2 Ultra, it completes 500 questions within 5 minutes, verifying the reproducibility of the 96.6% result.
Horizontal comparison with other systems
| System | LongMemEval R@5 | API requirement | Cost |
|---|---|---|---|
| MemPalace (raw) | 96.6% | None | Free |
| MemPalace (hybrid + rerank) | 100% | Optional | Free |
| Mem0 | ~85% | Required | $19–249/month |
| Zep | ~85% | Required | $25/month+ |
| Mastra | 94.87% | Required (GPT) | API costs |
MemPalace is the only system that reaches the highest score with zero API calls.