How is short-term memory different from long-term memory in an agent?

Short-term memory is the active working set the model sees right now: the system prompt, recent messages, and tool results sitting inside the context window. It is fast but volatile and capped in size. Long-term memory is persistent storage outside the model: documents, summaries, and prior interactions, often saved as embeddings and pulled back in only when relevant. As it runs, the agent promotes the important parts of short-term memory into long-term storage and retrieves them again when a later step needs them.

Why do AI agents need long-term memory at all?

Because the context window is a hard ceiling, an agent cannot keep months of history or a whole knowledge base in front of the model at once. Long-term memory solves this by writing facts down externally and fetching just the few pieces that matter for the current step. It is what lets a support agent remember a customer's prior tickets, a coding agent recall earlier design decisions, and a personal assistant keep your preferences without re-reading every past message.
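To make the ceiling concrete, here is a minimal sketch of how a short-term window might be trimmed to a token budget. The `count_tokens` and `fit_to_window` helpers are hypothetical, and word counting is only a stand-in for a real tokenizer; the point is that older turns fall out once the budget is spent.

```python
def count_tokens(text: str) -> int:
    # Crude stand-in for a real tokenizer: one token per whitespace-separated word.
    return len(text.split())


def fit_to_window(system_prompt: str, messages: list[str], budget: int) -> list[str]:
    """Keep the system prompt plus the newest messages that fit in `budget` tokens."""
    remaining = budget - count_tokens(system_prompt)
    kept: list[str] = []
    # Walk from newest to oldest so the most recent turns are kept first.
    for message in reversed(messages):
        cost = count_tokens(message)
        if cost > remaining:
            break  # everything older than this no longer fits
        kept.append(message)
        remaining -= cost
    # Restore chronological order behind the system prompt.
    return [system_prompt] + kept[::-1]


history = [
    "user: my order 1001 is late",
    "agent: sorry, checking on order 1001 now",
    "user: also please update my email",
    "agent: done, anything else?",
]
window = fit_to_window("You are a support agent.", history, budget=15)
```

With this small budget only the last two turns survive, so the earlier mention of the late order is gone from the window. That lost detail is exactly what long-term memory is meant to preserve.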

Glossary

Agent memory

Agent memory is how an AI agent retains information to stay coherent over time. It combines short-term memory held in the context window with long-term memory kept in external stores like vector databases.

Updated 2026


Agent memory is the set of mechanisms an AI agent uses to carry information forward — from one reasoning step to the next, and from one session to the next — so its behavior stays consistent instead of resetting each time it acts. It is usually described in two tiers. Short-term memory is the live working set the model can see right now: the instructions, the recent conversation, and the latest tool output, all sitting inside the context window. Long-term memory is everything written down outside the model so it outlives that window.

The two tiers work as a loop. As an agent runs, it accumulates a transcript in short-term memory, but that space is finite and every token in it is reprocessed on each turn. So the agent distills what matters (key facts, decisions, summaries) and persists it externally. Long-term memory typically lives in a vector database, where each piece of knowledge is stored as an embedding and later recalled by meaning rather than exact wording. When a new step begins, the agent retrieves only the handful of memories relevant to the task and slots them back into the prompt. This keeps the limited window full of the right context instead of stale clutter.
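The persist-and-retrieve loop can be sketched in a few lines. This is an illustration under stated assumptions, not a production design: `MemoryStore`, `embed`, and `cosine` are hypothetical names, the store is an in-memory list rather than a vector database, and the toy bag-of-words embedding only captures word overlap, whereas a real embedding model would match by meaning.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    # Toy embedding: bag-of-words counts. A real system would call an embedding model.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse word-count vectors.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class MemoryStore:
    """In-memory stand-in for a vector database."""

    def __init__(self) -> None:
        self._items: list[tuple[str, Counter]] = []

    def persist(self, fact: str) -> None:
        # Store the fact alongside its embedding.
        self._items.append((fact, embed(fact)))

    def retrieve(self, query: str, k: int = 2) -> list[str]:
        # Rank stored facts by similarity to the query and return the top k.
        q = embed(query)
        ranked = sorted(self._items, key=lambda item: cosine(q, item[1]), reverse=True)
        return [fact for fact, _ in ranked[:k]]


store = MemoryStore()
store.persist("Customer reported a delayed order last month")
store.persist("Customer prefers email over phone calls")
store.persist("Team decided to use PostgreSQL for billing")

recalled = store.retrieve("Where is my delayed order?", k=1)
```

Here `recalled` holds only the delayed-order fact, since it is the one stored memory that shares terms with the query. The other two memories stay in the store without taking up space in the prompt.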

Picture a customer-support agent handling a returning user. Mid-chat, short-term memory holds the current question and the last few replies. But the agent also queries its long-term store and finds that this customer opened a ticket about a delayed order last month — a fact that was never in the current conversation. By blending the live thread with that recalled history, the agent answers as if it genuinely remembers the relationship, picking up exactly where things left off rather than asking the user to explain everything again.
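The blending step in this scenario might look like the sketch below. The `build_prompt` function and its section labels are assumptions made for illustration; the recalled ticket is hard-coded here, where a real agent would fetch it from its long-term store.

```python
def build_prompt(instructions: str, recalled: list[str], recent: list[str], question: str) -> str:
    """Blend recalled long-term memories with the live thread into one prompt."""
    sections = [instructions]
    if recalled:
        # Long-term memories retrieved for this step.
        sections.append("Relevant history:\n" + "\n".join(f"- {m}" for m in recalled))
    if recent:
        # Short-term memory: the last few turns of the current chat.
        sections.append("Recent conversation:\n" + "\n".join(recent))
    sections.append(f"Current question: {question}")
    return "\n\n".join(sections)


prompt = build_prompt(
    instructions="You are a support agent.",
    recalled=["Ticket opened last month about a delayed order"],
    recent=["user: hi, I'm back", "agent: welcome back, how can I help?"],
    question="Has my order issue been resolved?",
)
print(prompt)
```

Keeping recalled history in its own labeled section is one possible design choice: it lets the model tell remembered facts apart from what was said in the current conversation.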

Related terms

Concepts that build agent memory

Context window: The token budget that holds an agent's short-term memory for the current step. See /glossary/context-window.
Vector database: The store where long-term memories live and are retrieved by similarity. See /glossary/vector-database.
Embeddings: The numeric representations that let memories be saved and recalled by meaning. See /glossary/embeddings.

FAQ

Agent memory FAQ

What is agent memory?

Agent memory is everything an AI agent retains so it can stay coherent across steps and sessions. It splits into two layers: short-term memory, which lives in the context window and holds the running transcript of the current task, and long-term memory, which lives in external stores like vector databases so facts and past conversations survive after the window is cleared. Together they let an agent build on what it already knows instead of starting cold every turn.

Keep reading

Learn more

AI agent memory, in depth: Patterns for short-term and long-term recall.
Vector database: Where long-term memories are stored.
Embeddings: How memories are encoded for retrieval.

