Glossary

Agent memory

Agent memory is how an AI agent retains information to stay coherent over time. It combines short-term memory held in the context window with long-term memory kept in external stores like vector databases.

  • Updated 2026

Agent memory is the set of mechanisms an AI agent uses to carry information forward — from one reasoning step to the next, and from one session to the next — so its behavior stays consistent instead of resetting each time it acts. It is usually described in two tiers. Short-term memory is the live working set the model can see right now: the instructions, the recent conversation, and the latest tool output, all sitting inside the context window. Long-term memory is everything written down outside the model so it outlives that window.
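As a rough illustration of the short-term tier, the sketch below models the context window as a token-budgeted buffer that pins the instructions and keeps only the most recent messages. The names `ShortTermMemory` and `count_tokens` are hypothetical, and the word-count tokenizer is a stand-in assumption for a real model tokenizer.

```python
from collections import deque


def count_tokens(text: str) -> int:
    # Assumption: one token per whitespace-separated word.
    # A real agent would use the model's own tokenizer here.
    return len(text.split())


class ShortTermMemory:
    """Pinned instructions plus the most recent messages, kept within a token budget."""

    def __init__(self, instructions: str, budget: int) -> None:
        self.instructions = instructions
        self.budget = budget
        self.messages = deque()  # oldest message on the left

    def used(self) -> int:
        # Tokens currently occupied by the instructions and all kept messages.
        return count_tokens(self.instructions) + sum(count_tokens(m) for m in self.messages)

    def add(self, message: str) -> list[str]:
        """Append a message and return any older messages evicted to stay in budget."""
        self.messages.append(message)
        evicted = []
        # Drop the oldest messages first, but always keep the newest one.
        while self.used() > self.budget and len(self.messages) > 1:
            evicted.append(self.messages.popleft())
        return evicted

    def window(self) -> list[str]:
        # What the model would see on the next turn.
        return [self.instructions, *self.messages]
```

Returning the evicted messages, rather than silently discarding them, gives the caller a chance to summarize or store them before they leave the window.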

The two tiers work as a loop. As an agent runs, it accumulates a transcript in short-term memory, but the context window is finite and its full contents are reprocessed (and paid for) on every turn. The agent therefore distills what matters (key facts, decisions, summaries) and persists it externally. Long-term memory typically lives in a vector database, where each piece of knowledge is stored as an embedding and later recalled by meaning rather than exact wording. When a new step begins, the agent retrieves only the handful of memories relevant to the task and slots them back into the prompt. This keeps the limited window full of the right context instead of stale clutter.
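The persist-and-recall half of that loop can be sketched in a few lines. This is a minimal in-memory stand-in, not a real vector database: `LongTermMemory`, `embed`, and `cosine` are hypothetical names, and the bag-of-words `embed` function is an assumption that matches on shared words only. A production system would replace it with an embedding model so that recall works by meaning rather than wording.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    # Toy stand-in for an embedding model: a bag-of-words count vector.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class LongTermMemory:
    """In-memory stand-in for an external vector store."""

    def __init__(self) -> None:
        self.records = []  # list of (text, vector) pairs

    def remember(self, text: str) -> None:
        # Persist a distilled fact together with its vector.
        self.records.append((text, embed(text)))

    def recall(self, query: str, k: int = 2) -> list[str]:
        # Return the k stored memories most similar to the query.
        q = embed(query)
        ranked = sorted(self.records, key=lambda r: cosine(q, r[1]), reverse=True)
        return [text for text, _ in ranked[:k]]
```

For example, after storing a few facts with `remember`, calling `recall("Is my delayed order shipping yet?", k=1)` returns the single stored memory that shares the most vocabulary with the question.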

Picture a customer-support agent handling a returning user. Mid-chat, short-term memory holds the current question and the last few replies. But the agent also queries its long-term store and finds that this customer opened a ticket about a delayed order last month — a fact that was never in the current conversation. By blending the live thread with that recalled history, the agent answers as if it genuinely remembers the relationship, picking up exactly where things left off rather than asking the user to explain everything again.
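The blending step in this scenario might look like the sketch below, which merges recalled history with the live thread into a single prompt. The function name `build_prompt` and the section labels are illustrative assumptions, not a fixed format.

```python
def build_prompt(instructions: str, recalled: list[str], recent_turns: list[str], question: str) -> str:
    """Blend recalled long-term memories with the live thread into one prompt."""
    sections = [instructions]
    if recalled:
        # Long-term memory: facts that were never part of the current conversation.
        sections.append("Relevant history:\n" + "\n".join(f"- {m}" for m in recalled))
    if recent_turns:
        # Short-term memory: the last few replies in the live thread.
        sections.append("Recent conversation:\n" + "\n".join(recent_turns))
    sections.append(f"Customer: {question}")
    return "\n\n".join(sections)


prompt = build_prompt(
    instructions="You are a support agent.",
    recalled=["Ticket opened last month about a delayed order"],
    recent_turns=["Customer: Hi again", "Agent: Welcome back!"],
    question="Any update on my order?",
)
print(prompt)
```

Omitting the history section when nothing is recalled keeps the prompt free of empty headings, so a first-time customer simply gets the instructions and the live thread.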

FAQ

Agent memory FAQ

Agent memory is everything an AI agent retains so it can stay coherent across steps and sessions. It splits into two layers: short-term memory, which lives in the context window and holds the running transcript of the current task, and long-term memory, which lives in external stores like vector databases so facts and past conversations survive after the window is cleared. Together they let an agent build on what it already knows instead of starting cold every turn.
