MemPalace Setup Guide

Get MemPalace running in your AI agent in under five minutes, from installation through agent integration.

Prerequisites

  • Python 3.10+ (for the core library)
  • pip or uv package manager
  • Optional: Docker for containerized deployments

Step 1: Install MemPalace

Using pip

pip install mempalace

Using uv (recommended for speed)

uv pip install mempalace

From source

git clone https://github.com/milla-jovovich/mempalace.git
cd mempalace
pip install -e .

Step 2: Initialize Your Palace

from mempalace import Palace

# Create a new palace with default settings
palace = Palace(
    storage_path="~/.mempalace",  # Where memories are stored
    max_tokens=170,                # Startup token budget
)

# The palace is ready to use immediately
print(f"Palace initialized: {palace.wings} wings")

Step 3: Store Memories

# Simple memory storage
palace.store("User prefers TypeScript over JavaScript")

# With metadata
palace.store(
    "The project uses Next.js 15 with App Router",
    wing="work",
    hall="web-projects",
    tags=["tech-stack", "frontend"]
)

# Batch storage
palace.store_batch([
    "User is a senior engineer",
    "User works at a startup",
    "User prefers dark mode",
])

Step 4: Recall Memories

# Basic recall
memories = palace.recall("What framework does the user use?")
# Returns: ["The project uses Next.js 15 with App Router"]

# Scoped recall (search within a specific wing)
work_memories = palace.recall(
    "tech stack preferences",
    wing="work"
)

# With reranking for maximum precision
precise = palace.recall(
    "deployment preferences",
    rerank=True  # Uses Haiku reranker
)
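To build intuition for what recall is doing, here is a deliberately simplified toy stand-in that ranks stored memories by word overlap with the query. This is not MemPalace's implementation (which uses embeddings for similarity search, plus the optional Haiku reranker); it only illustrates the retrieve-and-rank shape:

```python
# Toy illustration of recall-style retrieval -- NOT MemPalace internals.
# Each memory is scored against the query and the best matches returned.
def toy_recall(memories, query, limit=1):
    """Rank memories by how many words they share with the query."""
    query_words = set(query.lower().split())
    scored = sorted(
        memories,
        key=lambda m: len(query_words & set(m.lower().split())),
        reverse=True,  # highest overlap first
    )
    return scored[:limit]

memories = [
    "The project uses Next.js 15 with App Router",
    "User prefers dark mode",
]
print(toy_recall(memories, "What framework does the project use?"))
```

Real similarity search replaces the word-overlap score with embedding distance, which is why semantically related queries ("tech stack preferences") can match memories that share no literal words.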

Step 5: Integrate with Your Agent

from mempalace import Palace

palace = Palace()

def agent_response(user_message: str) -> str:
    # Recall relevant memories
    context = palace.recall(user_message, limit=5)

    # Build prompt with memory context
    system_prompt = f"""You are a helpful assistant.

Relevant memories about this user:
{chr(10).join(f'- {m}' for m in context)}
"""

    # Get LLM response (your existing code)
    response = call_llm(system_prompt, user_message)

    # Store new memories from the conversation
    palace.store(f"User asked about: {user_message}")

    return response

Configuration Options

Option            Default        Description
storage_path      ~/.mempalace   Directory for memory storage
max_tokens        170            Token budget for startup context
rerank            False          Enable Haiku reranking
auto_organize     True           Auto-sort memories into wings/halls
embedding_model   default        Embedding model for similarity search
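Putting the table together, a palace with every option set explicitly might look like the following sketch (keyword names are assumed to match the option names in the table above):

```python
from mempalace import Palace

palace = Palace(
    storage_path="~/.mempalace",  # Directory for memory storage
    max_tokens=170,               # Token budget for startup context
    rerank=False,                 # Enable Haiku reranking
    auto_organize=True,           # Auto-sort memories into wings/halls
    embedding_model="default",    # Embedding model for similarity search
)
```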

Best Practices

  • Be specific with memory storage — “User prefers dark mode in VS Code” is better than “user likes dark mode”
  • Use wings for organization — Separate work/personal/project memories to improve recall relevance
  • Enable reranking for critical queries — The Haiku reranker bumps recall from 96.6% to 100% at minimal cost
  • Prune periodically — Use palace.prune(older_than="90d") to remove stale memories

Next Steps

Now that MemPalace is running, explore the advanced features:

  • Multi-agent memory sharing
  • Memory importance scoring
  • Conversation summarization pipelines
  • Custom embedding models