🧠Agent Memory & State Management
Build agents that remember context across sessions
AI agents become dramatically more useful when they can remember things. Without memory, every conversation starts from zero — the agent has no idea who you are, what you prefer, or what happened in previous sessions. Memory transforms a stateless chatbot into a true assistant that learns and adapts over time.
This lesson covers the full memory stack: from short-term working memory within a single session, to long-term persistent memory across sessions, to episodic memory that captures specific events and experiences.
Why Memory Matters
Consider a personal assistant agent. Without memory, every session looks like this:
- User: "Book me a table at my favorite restaurant."
- Agent: "I don't know your favorite restaurant. Which one?"
- User: "The Italian place on 5th street, same as always."
- Agent: "What time? How many people? Indoor or outdoor?"
Every single time. With memory, the agent already knows:
- Your favorite restaurant is "Trattoria Milano" on 5th Street
- You prefer outdoor seating
- You usually book for 2 people
- You like 7:30 PM on weekdays, 8:00 PM on weekends
Memory is not optional for production agents — it is the difference between a tool and an assistant.
Types of Agent Memory
1. Short-Term Memory (Working Memory)
Short-term memory lives within a single session. It is the conversation context that the model uses to maintain coherence during a single interaction. This includes:
- The current conversation messages
- Tool call results accumulated during the session
- Internal reasoning and intermediate conclusions
- Temporary variables and scratchpad data
Short-term memory is inherently limited by the model's context window. When the context fills up, older information gets pushed out or truncated.
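To make the truncation concrete, here is a minimal sketch of a sliding-window trim that keeps only the newest messages that fit a token budget. The names `ChatMessage`, `roughTokenCount`, and `trimToBudget` are hypothetical, and the figure of about 4 characters per token is a rough heuristic rather than a real tokenizer.

```typescript
// Hypothetical message shape, used only for this illustration.
interface ChatMessage {
  role: "user" | "assistant";
  content: string;
}

// Rough heuristic: about 4 characters per token. Real tokenizers differ.
function roughTokenCount(text: string): number {
  return Math.ceil(text.length / 4);
}

// Keep the most recent messages that fit within maxTokens,
// dropping the oldest ones first.
function trimToBudget(messages: ChatMessage[], maxTokens: number): ChatMessage[] {
  const kept: ChatMessage[] = [];
  let used = 0;
  // Walk from newest to oldest so recent context is preferred.
  for (const message of messages.slice().reverse()) {
    const cost = roughTokenCount(message.content);
    if (used + cost > maxTokens) break;
    kept.unshift(message); // restore chronological order
    used += cost;
  }
  return kept;
}
```

A window like this is cheap, but whatever falls outside it is lost for the rest of the session, which is one reason durable facts belong in long-term memory instead.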
interface ShortTermMemory {
conversationHistory: Message[];
toolResults: ToolResult[];
scratchpad: Record<string, unknown>;
tokenCount: number;
maxTokens: number;
}
class WorkingMemory {
private memory: ShortTermMemory;
constructor(maxTokens: number = 100000) {
this.memory = {
conversationHistory: [],
toolResults: [],
scratchpad: {},
tokenCount: 0,
maxTokens,
};
}
addMessage(message: Message): void {
this.memory.conversationHistory.push(message);
this.memory.tokenCount += this.estimateTokens(message);
this.pruneIfNeeded();
}
addToolResult(result: ToolResult): void {
this.memory.toolResults.push(result);
this.memory.tokenCount += this.estimateTokens(result);
this.pruneIfNeeded();
}
setScratchpad(key: string, value: unknown): void {
this.memory.scratchpad[key] = value;
}
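private pruneIfNeeded(): void {
// Drop the oldest messages until usage falls below 90% of the token budget.
// Only conversationHistory is pruned; tool results are never removed here.
while (this.memory.tokenCount > this.memory.maxTokens * 0.9) {
const oldest = this.memory.conversationHistory.shift();
if (!oldest) break;
this.memory.tokenCount -= this.estimateTokens(oldest);
}
}
private estimateTokens(data: unknown): number {
// Rough heuristic (about 4 characters per token), not a real tokenizer.
const text = JSON.stringify(data);
return Math.ceil(text.length / 4);
}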
getContext(): ShortTermMemory {
return { ...this.memory };
}
}
2. Long-Term Memory (Persistent Memory)
Long-term memory survives across sessions. It stores facts, preferences, and learned information that the agent can retrieve later. This is the most important type of memory for building useful assistants.
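As a rough illustration of what "survives across sessions" means, the sketch below serializes a small keyed store to a string and restores it into a fresh instance, standing in for a process restart. `FactStore`, `StoredFact`, and their methods are hypothetical names for this example only and are deliberately simpler than a full memory schema.

```typescript
// Hypothetical minimal record for illustration.
interface StoredFact {
  key: string;
  value: string;
  updatedAt: string; // ISO timestamp; strings survive a JSON round-trip unchanged
}

class FactStore {
  private facts = new Map<string, StoredFact>();

  // Insert or overwrite by key, so a changed preference replaces the old one.
  remember(key: string, value: string, now: Date = new Date()): void {
    this.facts.set(key, { key, value, updatedAt: now.toISOString() });
  }

  recall(key: string): string | undefined {
    return this.facts.get(key)?.value;
  }

  // Serialize to a string that could be written to a file or database.
  serialize(): string {
    return JSON.stringify(Array.from(this.facts.values()));
  }

  // Rebuild a store from a previously serialized string.
  static deserialize(raw: string): FactStore {
    const store = new FactStore();
    for (const fact of JSON.parse(raw) as StoredFact[]) {
      store.facts.set(fact.key, fact);
    }
    return store;
  }
}

// Session 1: learn something, then persist it.
const session1 = new FactStore();
session1.remember("favorite_restaurant", "Trattoria Milano");
const saved = session1.serialize();

// Session 2: a fresh instance restores what session 1 learned.
const session2 = FactStore.deserialize(saved);
```

Keying by name means a later `remember("favorite_restaurant", ...)` overwrites the earlier value instead of accumulating duplicates. Whether to overwrite or keep a history of past values is a design choice that depends on the agent.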
interface MemoryEntry {
id: string;
key: string;
value: string;
category: "preference" | "fact" | "instruction" | "relationship";
confidence: number;
createdAt: Date;
updatedAt: Date;
accessCount: number;
source: string;
}
interface LongTermMemoryStore {
store(entry: Omit<MemoryEntry, "id" | "createdAt" | "updatedAt" | "accessCount">): Promise<string>;
retrieve(query: string, limit?: number): Promise<MemoryEntry[]>;
update(id: string, updates: Partial<MemoryEntry>): Promise<void>;
delete(id: string): Promise<void>;
search(filters: MemoryFilter): Promise<MemoryEntry[]>;
}
interface MemoryFilter {
category?: MemoryEntry["category"];
minConfidence?: number;
afterDate?: Date;
keyword?: string;
}
3. Episodic Memory
Episodic memory captures specific events and experiences — what happened, when it happened, and what the outcome was. This allows agents to learn from past interactions.
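One way an agent might "learn" from recorded outcomes is to prefer whichever approach has worked most often before. The sketch below assumes a simplified, hypothetical `PastEpisode` record with a `strategy` label; `successRates` and `bestStrategy` are likewise illustrative names, not part of any library.

```typescript
// Hypothetical, simplified episode record for illustration.
interface PastEpisode {
  strategy: string;               // what the agent tried
  timestamp: Date;                // when it happened
  outcome: "success" | "failure"; // how it turned out
}

// Compute the fraction of successful episodes for each strategy.
function successRates(episodes: PastEpisode[]): Map<string, number> {
  const totals = new Map<string, { wins: number; runs: number }>();
  for (const ep of episodes) {
    const t = totals.get(ep.strategy) ?? { wins: 0, runs: 0 };
    t.runs += 1;
    if (ep.outcome === "success") t.wins += 1;
    totals.set(ep.strategy, t);
  }
  const rates = new Map<string, number>();
  for (const [strategy, t] of Array.from(totals.entries())) {
    rates.set(strategy, t.wins / t.runs);
  }
  return rates;
}

// Pick the strategy with the highest observed success rate.
// Returns undefined when there is no history; ties go to the first seen.
function bestStrategy(episodes: PastEpisode[]): string | undefined {
  let best: string | undefined = undefined;
  let bestRate = -1;
  for (const [strategy, rate] of Array.from(successRates(episodes).entries())) {
    if (rate > bestRate) {
      best = strategy;
      bestRate = rate;
    }
  }
  return best;
}
```

A raw success rate is easily skewed by small samples (one lucky run scores 100%), so a real agent would likely also weigh how many episodes back each rate and how recent they are.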
interface Episode {
id: string;
timestamp: Date;
summary: string;
trigger: string;
actions: ActionRecord[];
outcome: "success" | "failure" | "partial";
lessonsLearned: string[];
tags: string[];
}
interface ActionRecord {
tool: string;
input: Record<string, unknown>;
output: string;
durationMs: number;
success: boolean;
}
class EpisodicMemory {
private episodes: Episode[] = [];
recordEpisode(episode: Omit<Episode, "id">): string {
const id = crypto.randomUUID();
this.episodes.push({ ...episode, id });
return id;
}
findSimilarEpisodes(situation: string, limit: number = 5): Episode[] {
return this.episodes
.filter((ep) => this.calculateSimilarity(situation, ep.summary) > 0.3)
.sort((a, b) =>
this.calculateSimilarity(situation, b.summary) -
this.calculateSimilarity(situation, a.summary)
)
.slice(0, limit);
}
getSuccessfulStrategies(tag: string): Episode[] {
return this.episodes.filter(
(ep) => ep.tags.includes(tag) && ep.outcome === "success"
);
}
getFailurePatterns(tag: string): Episode[] {
return this.episodes.filter(
(ep) => ep.tags.includes(tag) && ep.outcome === "failure"
);
}
private calculateSimilarity(a: string, b: string): number {
const wordsA = new Set(a.toLowerCase().split(/\s+/));
const wordsB = new Set(b.toLowerCase().split(/\s+/));
const intersection = new Set([...wordsA].filter((w) => wordsB.has(w)));
const union = new Set([...wordsA, ...wordsB]);
return intersection.size / union.size;
}
}
JSON-Based Memory Storage
The simplest approach to persistent memory is a JSON file. This works well for single-user agents or prototypes:
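import { promises as fs } from "node:fs";
import * as crypto from "node:crypto"; // randomUUID(); also available as a global in recent Node versions
interface MemoryStore {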
version: number;
memories: MemoryEntry[];
metadata: {
lastAccess: string;
totalEntries: number;
};
}
class JsonMemoryBackend implements LongTermMemoryStore {
private filePath: string;
private cache: MemoryStore | null = null;
constructor(filePath: string) {
this.filePath = filePath;
}
private async load(): Promise<MemoryStore> {
if (this.cache) return this.cache;
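try {
const raw = await fs.readFile(this.filePath, "utf-8");
const parsed = JSON.parse(raw) as MemoryStore;
// JSON stores dates as ISO strings; revive them so date comparisons in search() work.
parsed.memories = parsed.memories.map((m) => ({
...m,
createdAt: new Date(m.createdAt),
updatedAt: new Date(m.updatedAt),
}));
this.cache = parsed;
} catch {
// Missing or unreadable file: start with an empty store.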
this.cache = {
version: 1,
memories: [],
metadata: {
lastAccess: new Date().toISOString(),
totalEntries: 0,
},
};
}
return this.cache;
}
private async save(): Promise<void> {
if (!this.cache) return;
this.cache.metadata.lastAccess = new Date().toISOString();
this.cache.metadata.totalEntries = this.cache.memories.length;
await fs.writeFile(this.filePath, JSON.stringify(this.cache, null, 2));
}
async store(entry: Omit<MemoryEntry, "id" | "createdAt" | "updatedAt" | "accessCount">): Promise<string> {
const store = await this.load();
const id = crypto.randomUUID();
const now = new Date();
store.memories.push({
...entry,
id,
createdAt: now,
updatedAt: now,
accessCount: 0,
});
await this.save();
return id;
}
async retrieve(query: string, limit: number = 10): Promise<MemoryEntry[]> {
const store = await this.load();
const queryWords = query.toLowerCase().split(/\s+/);
const scored = store.memories.map((memory) => {
const text = `${memory.key} ${memory.value}`.toLowerCase();
const matchCount = queryWords.filter((w) => text.includes(w)).length;
const score = matchCount / queryWords.length;
return { memory, score };
});
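const results = scored
.filter((s) => s.score > 0.2)
.sort((a, b) => b.score - a.score)
.slice(0, limit)
.map((s) => s.memory);
// Persist the updated access counts; otherwise they are lost when the process exits.
for (const memory of results) memory.accessCount++;
if (results.length > 0) await this.save();
return results;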
}
async update(id: string, updates: Partial<MemoryEntry>): Promise<void> {
const store = await this.load();
const index = store.memories.findIndex((m) => m.id === id);
if (index === -1) throw new Error(`Memory ${id} not found`);
store.memories[index] = {
...store.memories[index],
...updates,
updatedAt: new Date(),
};
await this.save();
}
async delete(id: string): Promise<void> {
const store = await this.load();
store.memories = store.memories.filter((m) => m.id !== id);
await this.save();
}
async search(filters: MemoryFilter): Promise<MemoryEntry[]> {
const store = await this.load();
return store.memories.filter((m) => {
if (filters.category && m.category !== filters.category) return false;
if (filters.minConfidence && m.confidence < filters.minConfidence) return false;
if (filters.afterDate && m.createdAt < filters.afterDate) return false;
if (filters.keyword) {
const text = `${m.key} ${m.value}`.toLowerCase();
if (!text.includes(filters.keyword.toLowerCase())) return false;
}
return true;
});
}
}
Database-Backed Memory
For production agents serving multiple users, you need a proper database. Here is a SQLite-based implementation:
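// Assumes the better-sqlite3 package, whose synchronous API matches the calls below.
import Database from "better-sqlite3";
class SqliteMemoryBackend implements LongTermMemoryStore {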
private db: Database.Database;
constructor(dbPath: string) {
this.db = new Database(dbPath);
this.initialize();
}
private initialize(): void {
this.db.exec(`
CREATE TABLE IF NOT EXISTS memories (
id TEXT PRIMARY KEY,
user_id TEXT NOT NULL,
key TEXT NOT NULL,
value TEXT NOT NULL,
category TEXT NOT NULL,
confidence REAL DEFAULT 1.0,
created_at TEXT NOT NULL,
updated_at TEXT NOT NULL,
access_count INTEGER DEFAULT 0,
source TEXT NOT NULL,
embedding BLOB
);
CREATE INDEX IF NOT EXISTS idx_memories_user ON memories(user_id);
CREATE INDEX IF NOT EXISTS idx_memories_category ON memories(category);
CREATE INDEX IF NOT EXISTS idx_memories_key ON memories(key);
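CREATE VIRTUAL TABLE IF NOT EXISTS memories_fts USING fts5(
key, value, content=memories, content_rowid=rowid
);
-- An external-content FTS5 table is not updated automatically;
-- these triggers keep the index in sync with the memories table.
CREATE TRIGGER IF NOT EXISTS memories_ai AFTER INSERT ON memories BEGIN
INSERT INTO memories_fts(rowid, key, value) VALUES (new.rowid, new.key, new.value);
END;
CREATE TRIGGER IF NOT EXISTS memories_ad AFTER DELETE ON memories BEGIN
INSERT INTO memories_fts(memories_fts, rowid, key, value) VALUES ('delete', old.rowid, old.key, old.value);
END;
CREATE TRIGGER IF NOT EXISTS memories_au AFTER UPDATE ON memories BEGIN
INSERT INTO memories_fts(memories_fts, rowid, key, value) VALUES ('delete', old.rowid, old.key, old.value);
INSERT INTO memories_fts(rowid, key, value) VALUES (new.rowid, new.key, new.value);
END;
`);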
}
async store(entry: Omit<MemoryEntry, "id" | "createdAt" | "updatedAt" | "accessCount">): Promise<string> {
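// Note: user_id is hardcoded to "default" below; a multi-user deployment would pass the real user ID.
const id = crypto.randomUUID();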
const now = new Date().toISOString();
this.db.prepare(`
INSERT INTO memories (id, user_id, key, value, category, confidence, created_at, updated_at, access_count, source)
VALUES (?, ?, ?, ?, ?, ?, ?, ?, 0, ?)
`).run(id, "default", entry.key, entry.value, entry.category, entry.confidence, now, now, entry.source);
return id;
}
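async retrieve(query: string, limit: number = 10): Promise<MemoryEntry[]> {
// Quote each word so punctuation in user text (e.g. "?") is not parsed as FTS5 query syntax.
// Terms are joined with OR so partial matches are returned, as in the JSON backend.
const terms = query
.split(/\s+/)
.filter((t) => /[\p{L}\p{N}]/u.test(t))
.map((t) => `"${t.replace(/"/g, '""')}"`);
if (terms.length === 0) return [];
const rows = this.db.prepare(`
SELECT m.* FROM memories m
JOIN memories_fts fts ON m.rowid = fts.rowid
WHERE memories_fts MATCH ?
ORDER BY rank
LIMIT ?
`).all(terms.join(" OR "), limit) as Record<string, unknown>[];
return rows.map(this.rowToEntry);
}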
async update(id: string, updates: Partial<MemoryEntry>): Promise<void> {
const setClauses: string[] = [];
const values: unknown[] = [];
if (updates.value !== undefined) {
setClauses.push("value = ?");
values.push(updates.value);
}
if (updates.confidence !== undefined) {
setClauses.push("confidence = ?");
values.push(updates.confidence);
}
setClauses.push("updated_at = ?");
values.push(new Date().toISOString());
values.push(id);
this.db.prepare(
`UPDATE memories SET ${setClauses.join(", ")} WHERE id = ?`
).run(...values);
}
async delete(id: string): Promise<void> {
this.db.prepare("DELETE FROM memories WHERE id = ?").run(id);
}
async search(filters: MemoryFilter): Promise<MemoryEntry[]> {
let sql = "SELECT * FROM memories WHERE 1=1";
const params: unknown[] = [];
if (filters.category) {
sql += " AND category = ?";
params.push(filters.category);
}
if (filters.minConfidence) {
sql += " AND confidence >= ?";
params.push(filters.minConfidence);
}
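if (filters.keyword) {
sql += " AND (key LIKE ? OR value LIKE ?)";
params.push(`%${filters.keyword}%`, `%${filters.keyword}%`);
}
// Apply the afterDate filter too; ISO 8601 strings sort chronologically.
if (filters.afterDate) {
sql += " AND created_at >= ?";
params.push(filters.afterDate.toISOString());
}
const rows = this.db.prepare(sql).all(...params) as Record<string, unknown>[];
return rows.map(this.rowToEntry);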
}
private rowToEntry(row: Record<string, unknown>): MemoryEntry {
return {
id: row.id as string,
key: row.key as string,
value: row.value as string,
category: row.category as MemoryEntry["category"],
confidence: row.confidence as number,
createdAt: new Date(row.created_at as string),
updatedAt: new Date(row.updated_at as string),
accessCount: row.access_count as number,
source: row.source as string,
};
}
}
Memory Search & Retrieval Strategies
Effective memory search goes beyond simple keyword matching. Here are proven strategies:
Semantic Search with Embeddings
class SemanticMemorySearch {
private embeddings: Map<string, number[]> = new Map();
async indexMemory(id: string, text: string): Promise<void> {
const embedding = await this.getEmbedding(text);
this.embeddings.set(id, embedding);
}
async search(query: string, memories: MemoryEntry[], topK: number = 5): Promise<MemoryEntry[]> {
const queryEmbedding = await this.getEmbedding(query);
const scored = memories.map((memory) => {
const memEmbedding = this.embeddings.get(memory.id);
if (!memEmbedding) return { memory, score: 0 };
const score = this.cosineSimilarity(queryEmbedding, memEmbedding);
return { memory, score };
});
return scored
.sort((a, b) => b.score - a.score)
.slice(0, topK)
.map((s) => s.memory);
}
private cosineSimilarity(a: number[], b: number[]): number {
let dotProduct = 0;
let normA = 0;
let normB = 0;
for (let i = 0; i < a.length; i++) {
dotProduct += a[i] * b[i];
normA += a[i] * a[i];
normB += b[i] * b[i];
}
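const denominator = Math.sqrt(normA) * Math.sqrt(normB);
// Guard against zero vectors, which would otherwise produce NaN.
return denominator === 0 ? 0 : dotProduct / denominator;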
}
private async getEmbedding(text: string): Promise<number[]> {
// Placeholder: replace with a call to an embedding API (e.g., Voyage AI, OpenAI).
// Random vectors make the similarity scores meaningless, so this only exercises the plumbing.
return new Array(1024).fill(0).map(() => Math.random());
}
}
Contextual Memory Retrieval
class ContextualRetriever {
async retrieveRelevantMemories(
currentMessage: string,
conversationHistory: Message[],
memoryStore: LongTermMemoryStore
): Promise<MemoryEntry[]> {
// Strategy 1: Direct query from current message
const directResults = await memoryStore.retrieve(currentMessage, 5);
// Strategy 2: Extract entities and search for each
const entities = this.extractEntities(currentMessage);
const entityResults: MemoryEntry[] = [];
for (const entity of entities) {
const results = await memoryStore.retrieve(entity, 3);
entityResults.push(...results);
}
// Strategy 3: Search based on recent conversation context
const recentContext = conversationHistory
.slice(-3)
.map((m) => m.content)
.join(" ");
const contextResults = await memoryStore.retrieve(recentContext, 3);
// Deduplicate by ID, keeping the first occurrence (direct matches come first; no re-ranking is done)
const allResults = [...directResults, ...entityResults, ...contextResults];
const seen = new Set<string>();
return allResults.filter((m) => {
if (seen.has(m.id)) return false;
seen.add(m.id);
return true;
});
}
private extractEntities(text: string): string[] {
const patterns = [
/(?:my |the )([A-Z][a-z]+(?: [A-Z][a-z]+)*)/g,
/(?:called |named )["']?([^"',]+)["']?/g,
/(@\w+)/g,
];
const entities: string[] = [];
for (const pattern of patterns) {
let match;
while ((match = pattern.exec(text)) !== null) {
entities.push(match[1]);
}
}
return entities;
}
}
Knowledge Base Integration
A knowledge base is structured long-term memory that serves as the agent's reference library:
interface KnowledgeBase {
documents: KBDocument[];
index: Map<string, string[]>;
}
interface KBDocument {
id: string;
title: string;
content: string;
tags: string[];
lastUpdated: Date;
}
class AgentKnowledgeBase {
private kb: KnowledgeBase = {
documents: [],
index: new Map(),
};
addDocument(doc: Omit<KBDocument, "id">): string {
const id = crypto.randomUUID();
const document = { ...doc, id };
this.kb.documents.push(document);
// Index by tags only (query() below scans documents directly and does not use this index)
for (const tag of doc.tags) {
const existing = this.kb.index.get(tag) || [];
existing.push(id);
this.kb.index.set(tag, existing);
}
return id;
}
query(question: string): KBDocument[] {
const words = question.toLowerCase().split(/\s+/);
const scores = new Map<string, number>();
for (const doc of this.kb.documents) {
const text = `${doc.title} ${doc.content} ${doc.tags.join(" ")}`.toLowerCase();
let score = 0;
for (const word of words) {
if (text.includes(word)) score++;
}
if (score > 0) scores.set(doc.id, score / words.length);
}
return this.kb.documents
.filter((d) => (scores.get(d.id) || 0) > 0.2)
.sort((a, b) => (scores.get(b.id) || 0) - (scores.get(a.id) || 0));
}
}
Complete Example: Personal Assistant with Memory
Here is a full implementation of a personal assistant agent that remembers user preferences across sessions:
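import Anthropic from "@anthropic-ai/sdk";
// Reads the API key from the ANTHROPIC_API_KEY environment variable by default.
const client = new Anthropic();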
class PersonalAssistant {
private memory: JsonMemoryBackend;
private episodic: EpisodicMemory;
private workingMemory: WorkingMemory;
constructor(memoryPath: string) {
this.memory = new JsonMemoryBackend(memoryPath);
this.episodic = new EpisodicMemory();
this.workingMemory = new WorkingMemory();
}
async chat(userMessage: string): Promise<string> {
// Step 1: Retrieve relevant memories
const memories = await this.memory.retrieve(userMessage, 10);
const episodes = this.episodic.findSimilarEpisodes(userMessage, 3);
// Step 2: Build memory context
const memoryContext = this.buildMemoryContext(memories, episodes);
// Step 3: Add to working memory
this.workingMemory.addMessage({ role: "user", content: userMessage });
// Step 4: Call Claude with memory context
const response = await client.messages.create({
model: "claude-sonnet-4-20250514",
max_tokens: 1024,
system: `You are a personal assistant with memory.
Here are things you remember about this user:
${memoryContext}
IMPORTANT: When you learn new information about the user (preferences,
facts, names, habits), output a <memory> tag with the information to store.
Format: <memory category="preference">key: value</memory>`,
messages: this.workingMemory.getContext().conversationHistory,
});
const assistantMessage = response.content[0].type === "text"
? response.content[0].text
: "";
// Step 5: Extract and store new memories
await this.extractAndStoreMemories(assistantMessage, userMessage);
// Step 6: Record episode
this.episodic.recordEpisode({
timestamp: new Date(),
summary: `User asked: ${userMessage.slice(0, 100)}`,
trigger: userMessage,
actions: [],
outcome: "success",
lessonsLearned: [],
tags: this.extractTags(userMessage),
});
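// Step 7: Update working memory, then return the reply without the internal <memory> tags
this.workingMemory.addMessage({ role: "assistant", content: assistantMessage });
return assistantMessage.replace(/<memory category="\w+">[^<]+<\/memory>/g, "").trim();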
}
private buildMemoryContext(
memories: MemoryEntry[],
episodes: Episode[]
): string {
let context = "";
if (memories.length > 0) {
context += "## Known Facts & Preferences\n";
for (const m of memories) {
context += `- [${m.category}] ${m.key}: ${m.value}\n`;
}
}
if (episodes.length > 0) {
context += "\n## Recent Interactions\n";
for (const ep of episodes) {
context += `- ${ep.timestamp.toLocaleDateString()}: ${ep.summary}\n`;
}
}
return context || "No memories stored yet.";
}
private async extractAndStoreMemories(
response: string,
userMessage: string
): Promise<void> {
const memoryPattern = /<memory category="(\w+)">([^<]+)<\/memory>/g;
let match;
while ((match = memoryPattern.exec(response)) !== null) {
const category = match[1] as MemoryEntry["category"];
const content = match[2];
const [key, ...valueParts] = content.split(":");
const value = valueParts.join(":").trim();
await this.memory.store({
key: key.trim(),
value,
category,
confidence: 0.9,
source: `conversation: ${userMessage.slice(0, 50)}`,
});
}
}
private extractTags(text: string): string[] {
const tags: string[] = [];
const tagPatterns: Record<string, RegExp> = {
food: /\b(restaurant|food|eat|dinner|lunch|breakfast|cook)\b/i,
travel: /\b(travel|flight|hotel|trip|vacation|book)\b/i,
work: /\b(meeting|deadline|project|report|email|schedule)\b/i,
health: /\b(doctor|gym|exercise|health|medicine|appointment)\b/i,
};
for (const [tag, pattern] of Object.entries(tagPatterns)) {
if (pattern.test(text)) tags.push(tag);
}
return tags;
}
}
// Usage
async function main() {
const assistant = new PersonalAssistant("./user-memory.json");
// First session
await assistant.chat("I prefer vegetarian food and my name is Sarah.");
await assistant.chat("Book me a table tonight — I like quiet places.");
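// A later turn in this run. Stored memories are also written to user-memory.json, so a new process pointed at the same file can load them.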
const response = await assistant.chat("Can you suggest a restaurant for tonight?");
console.log(response);
// If the model emitted <memory> tags earlier and retrieval matches them, the reply can draw on: name is Sarah, prefers vegetarian, likes quiet places
}
main();
Best Practices
| Practice | Why It Matters |
|---|---|
| Set confidence scores on memories | Not all information is equally reliable — hearsay vs. explicit statements |
| Implement memory decay | Old, unused memories should lose relevance over time |
| Deduplicate before storing | Avoid storing "likes Italian food" five times |
| Separate user facts from preferences | Facts rarely change; preferences evolve |
| Version your memory schema | You will need to migrate memory data as your agent evolves |
| Add memory size limits | Unbounded memory leads to slow search and high costs |
| Encrypt sensitive memories | User data must be protected at rest and in transit |
| Let users view and delete memories | Transparency builds trust — users should control their data |
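Two of the practices in the table, memory decay and deduplication, can be sketched in a few lines. This is one possible approach under stated assumptions: the `ScoredMemory` shape, the 30-day default half-life, and the function names `decayedRelevance` and `dedupeMemories` are all hypothetical choices for illustration.

```typescript
// Hypothetical record shape for this example.
interface ScoredMemory {
  key: string;
  value: string;
  confidence: number; // 0..1
  lastAccessed: Date;
}

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Exponential decay: relevance halves for every halfLifeDays without access.
function decayedRelevance(
  memory: ScoredMemory,
  now: Date,
  halfLifeDays: number = 30
): number {
  const ageDays = Math.max(
    0,
    (now.getTime() - memory.lastAccessed.getTime()) / MS_PER_DAY
  );
  return memory.confidence * Math.pow(0.5, ageDays / halfLifeDays);
}

// Deduplicate on a normalized "key:value" fingerprint (case and whitespace
// insensitive), keeping the entry with the highest confidence.
function dedupeMemories(memories: ScoredMemory[]): ScoredMemory[] {
  const byFingerprint = new Map<string, ScoredMemory>();
  for (const m of memories) {
    const fingerprint = `${m.key}:${m.value}`
      .toLowerCase()
      .replace(/\s+/g, " ")
      .trim();
    const existing = byFingerprint.get(fingerprint);
    if (!existing || m.confidence > existing.confidence) {
      byFingerprint.set(fingerprint, m);
    }
  }
  return Array.from(byFingerprint.values());
}
```

Exact-match fingerprints only catch trivially repeated entries. Paraphrases such as "likes Italian food" and "enjoys Italian cuisine" would need a similarity measure, for example comparing embeddings, to be merged.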
Summary
Agent memory transforms stateless interactions into persistent, learning relationships. Start with JSON-based storage for prototypes, graduate to database-backed storage for production, and layer in semantic search for intelligent retrieval. The key insight: memory is not just storage — it is the foundation of personalization, context, and trust.