Which agent pattern should I start with?

Start with the tool-use loop or plain ReAct. They are the simplest patterns that still do real work: the model reasons, decides whether to call a tool, reads the result, and repeats until done. Most production agents are a tool-use loop at their core. Only add planning, reflection, routing, or multiple agents once you can point at a concrete failure the simpler version cannot fix — a task too long to hold in one chain of thought, answers that need a second pass for quality, or a workload that splits cleanly into specialist lanes.

What is the difference between routing and orchestrator-worker?

Routing classifies an incoming request and dispatches it to exactly one downstream handler — a triage step that picks a lane and gets out of the way. Orchestrator-worker is heavier: a coordinating agent decomposes a goal into several sub-tasks, fans them out to multiple workers (sometimes in parallel), and then synthesizes their outputs into one result. Routing is one-in-one-out selection; orchestrator-worker is one-in-many-out decomposition and recombination. Many systems use routing first, then hand the chosen lane to an orchestrator.

When is reflection or self-critique worth the extra cost?

Reflection pays off when quality matters more than latency and when mistakes are catchable by reading the output — code that must compile, prose that must follow a rubric, plans that must satisfy constraints. The pattern adds at least one extra model call to critique and revise, so it roughly doubles cost per improved attempt. Skip it for simple lookups or chit-chat where the first answer is almost always fine. Use it where a wrong answer is expensive and a second look reliably catches the error.

Can I combine multiple agent design patterns?

Yes — real systems almost always compose them. A common stack routes a request to a lane, runs an orchestrator that decomposes the work, gives each worker a ReAct tool-use loop, and wraps the final output in an evaluator-optimizer pass before returning it. The patterns are layers, not rivals: routing chooses, planning sequences, orchestration parallelizes, ReAct executes each step, and reflection or evaluation guards quality. The skill is keeping each layer as thin as the task allows so the system stays debuggable.

Blog · Engineering

7 AI Agent Design Patterns Every Builder Should Know

Most agent code is a handful of recurring shapes wearing different framework costumes. Learn the seven that matter — what each one is, when it earns its keep, and how to stack them without turning your agent into a haunted house.

11 min read
Engineering
Updated 2026

Build with these patterns Agentic workflows guide

You do not invent a new control flow every time you build an agent. You reach for one of a small set of patterns — the same way you reach for a queue, a cache, or a state machine in ordinary software.

The word agent hides a lot of variety. A one-shot tool call and a fleet of coordinating specialists are both called agents, but they share almost nothing structurally. What they do share is a handful of reusable shapes for arranging reasoning, tool calls, and collaboration. Learn those shapes and most agent codebases suddenly read like variations on a theme rather than a pile of bespoke spaghetti.

This post walks through seven patterns every builder should keep in their head: ReAct, plan-and-execute, reflection, routing, orchestrator-worker, evaluator-optimizer, and the tool-use loop. For each, you get a plain answer to two questions — what is it and when do you use it — and we close with the part nobody writes down: how to compose them without drowning in model calls. If you want the foundational mechanics first, the LLM agents guide and the agentic workflows guide are the right warm-up.

The map

The seven patterns at a glance

Skim the whole toolbox first. Each card is a pattern, its one-line job, and the situation it was made for — the rest of the post zooms into each.

1 · ReAct

Interleave reasoning and acting: think a step, take a tool action, read the result, think again. The default loop for open-ended tasks that need live information.

2 · Plan-and-execute

Draft a full multi-step plan first, then carry it out step by step. Best when the task has a long horizon and wandering reasoning gets lost.

3 · Reflection

Generate, then critique your own output and revise. Earns its keep when quality matters and mistakes are catchable by re-reading the draft.

4 · Routing

Classify the request, then dispatch it to exactly one specialized handler. A cheap triage step that keeps each lane focused and prompts short.

5 · Orchestrator-worker

A coordinator decomposes a goal, fans sub-tasks out to specialist workers, and synthesizes their results. For work that splits into parallel parts.

6 · Evaluator-optimizer

One model produces, another scores against criteria, and the loop optimizes until the bar is met. A measurable quality gate around any generator.

7 · Tool-use loop

The function-calling primitive under all of the above: the model emits a tool call, the runtime executes it, the result returns, repeat until done.

Pattern 1

ReAct — reason and act, in lockstep

The pattern that made agents feel like agents: don't plan everything up front, just think one step, act, observe, and let the next thought be informed by what actually happened.

ThoughtReason about the next step

ActionCall a tool or search

ObservationRead the tool result

LoopRe-reason or finish

The ReAct cycle: a thought motivates an action, the environment returns an observation, and that observation shapes the next thought — until the agent decides it is finished.

What it is. ReAct (short for reason + act) interleaves chains of thought with tool actions in a single loop. The model writes a short reasoning trace, picks one action, executes it, reads the observation, and folds that evidence into its next thought. It does not commit to a full plan up front — it steers in real time based on what the world returns.

When to use it. Reach for ReAct on open-ended tasks where the next move depends on information you don't have yet: question answering over tools, web research, debugging, anything where a fixed plan would be guesswork. It is the most common default for an LLM agent because it degrades gracefully — even a two-tool agent benefits from thinking between calls. The trade-off is drift: long ReAct loops can wander, repeat themselves, or lose the thread, which is exactly the gap the next two patterns fill.

Pattern 2

Plan-and-execute — decide the route before you drive

When a task is too long to improvise, separate the thinking from the doing: write the whole plan once, then execute each step with a cheaper, more focused worker.

What it is. A planner model breaks the goal into an explicit, ordered list of steps. An executor then runs those steps one by one — often re-planning when reality diverges from the plan. The key move is the separation: a strong model does the expensive reasoning once, and lighter calls handle the mechanical execution.

When to use it. Plan-and-execute shines on long-horizon tasks where pure ReAct loses its way — multi-stage data pipelines, "research this topic and write a report," anything with five-plus dependent steps. An upfront plan is auditable (you can read it before anything runs), parallelizable (independent steps can fan out), and cheaper at execution time. The cost is rigidity: a plan written before the first observation can be wrong, so good implementations let the planner revise mid-flight rather than marching off a cliff.

1 · Plan
A capable model turns the goal into an ordered list of concrete, checkable steps with their dependencies made explicit.
2 · Execute
Each step runs in turn — often as its own small ReAct or tool-use loop — producing an intermediate result the next step consumes.
3 · Re-plan
When a step's result contradicts the plan, the planner revises the remaining steps instead of blindly continuing.

Pattern 3

Reflection — let the agent grade its own work

The cheapest reliable quality boost: have the model read its first draft, critique it against the goal, and produce a better second version.

Generate, critique, revise

A second pass beats a smarter prompt

Reflection (also called self-critique) splits work into a producer turn and a critic turn. The model writes a first answer, then a second prompt asks it to find flaws against the requirements — missing cases, broken logic, rubric violations — and rewrite accordingly. The critique can come from the same model wearing a different hat, or from a dedicated critic with its own instructions.

Use it when quality outranks latency and errors are visible on re-reading: code that must compile and pass tests, structured output that must match a schema, prose that must hit a checklist. Skip it for trivial lookups where the first answer is essentially always right — every reflection pass is at least one extra model call, so you trade cost and latency for correctness.

Catches mistakes the first pass confidently shipped.
Works best with concrete criteria, not vague 'make it better'.
Pairs naturally with tools — let the critic run the tests.
Cap the loop: two or three revisions, then stop.

How agents self-improve

DraftProduce a first answer

CritiqueScore against criteria

ReviseFix the flaws found

AcceptPass or stop at cap

Reflection loop: a draft is critiqued against the goal, then revised, until the critic is satisfied or the iteration cap is hit.

Pattern 4

Routing — classify first, then dispatch

Not every request belongs in the same prompt. A lightweight classifier sends each input to the one handler built for it, keeping every lane sharp and short.

RequestRaw user input

ClassifyWhich lane is this?

DispatchPick one handler

SpecialistFocused prompt + tools

A router classifies the incoming request, then dispatches it to exactly one specialized handler — billing, technical, or refunds — each with its own focused prompt and tools.

What it is. Routing puts a classification step at the front door. A small, fast model (or even a rules layer) reads the input, decides which category it belongs to, and dispatches it to a single downstream handler tuned for that category. Each handler gets a tighter prompt, a smaller toolset, and fewer ways to go wrong than one do-everything mega-agent.

When to use it. Routing fits anytime your traffic is genuinely heterogeneous — a support bot fielding billing, technical, and account questions; a coding agent splitting "explain this" from "edit this"; a model picker sending easy queries to a cheap model and hard ones to a strong one. It cuts cost and raises accuracy because no single prompt has to be good at everything. The risk is misroutes, so keep an "unsure" fallback and log the classifier's decisions. Routing is one-in, one-out — when a request needs many handlers working together, you want the next pattern.

Pattern 5

Orchestrator-worker — decompose, delegate, recombine

When a goal naturally splits into parts, a coordinating agent breaks it down, hands each piece to a specialist worker, and stitches the results back into one answer.

Orchestrator

Decomposes & synthesizes

Researcher

Gathers sources

Coder

Writes the changes

Tester

Verifies output

Writer

Drafts the summary

The orchestrator owns the goal and the final synthesis; each worker is a specialist with a narrow brief, free to run in parallel.

What it is. A lead agent — the orchestrator — receives the goal, decomposes it into sub-tasks, and dispatches each to a worker agent with its own role, prompt, and tools. Workers can run in parallel and don't need to know about one another. When they finish, the orchestrator collects and synthesizes their outputs into a single coherent result.

When to use it. This is the workhorse of multi-agent systems: tasks that decompose into independent, specializable parts — research across many sources at once, a codebase change spanning several files, a report whose sections can be drafted concurrently. The payoff is parallelism and focus; the cost is coordination overhead, more tokens, and the hard problem of recombination. For the deeper mechanics of wiring orchestrators to workers, see AI agent orchestration, and for the call on whether you even need more than one agent, read single-agent vs multi-agent.

Pattern 6

Evaluator-optimizer — a scoreboard in the loop

Reflection's rigorous cousin: pair a generator with a separate evaluator that scores against explicit criteria, and keep optimizing until the output clears the bar.

What it is. Two roles run in a tight loop. A generator produces a candidate; an evaluator scores it against defined criteria and returns specific, actionable feedback; the generator tries again using that feedback. The loop continues until the evaluator passes the output or you hit an iteration limit. The difference from plain reflection is rigor — the evaluator is a distinct role with measurable criteria, not just the producer second-guessing itself.

When to use it. Use evaluator-optimizer when "good" has a clear, checkable definition: translations graded against a rubric, generated code measured by passing tests, search results scored for relevance, copy held to brand and length rules. It works precisely because the evaluator can give pointed feedback — "this case is unhandled," "tone is too formal" — that the optimizer can act on. If you cannot articulate the criteria, you cannot build the evaluator, and you are better off with simple reflection or a human in the loop.

GenerateProduce a candidate

EvaluateScore vs. criteria

FeedbackPinpoint what to fix

PassMeets bar or cap

Evaluator-optimizer: the generator proposes, the evaluator scores against criteria and returns feedback, and the loop repeats until the bar is cleared.

Pattern 7

The tool-use loop — the primitive under everything

Strip every pattern above down and the same engine is humming underneath: the model asks for a tool, the runtime runs it, the result comes back, and the loop turns again.

ModelEmit a tool call

RuntimeExecute the function

ResultReturn structured data

ContinueCall again or answer

The function-calling loop: the model emits a structured tool call, your runtime executes it against the real world, the result returns to the model, and the cycle repeats until a final answer.

What it is. The tool-use (or function-calling) loop is the lowest-level pattern: you describe a set of tools with their inputs, the model decides when to call one and emits a structured request, your runtime executes it and returns the result, and the model continues with that result in context. It repeats until the model produces a final answer instead of another call. Everything else in this post is a policy layered on top of this loop — ReAct adds explicit reasoning between calls, planning sequences them, orchestration distributes them.

When to use it. Always, in some form — it is less a choice than the substrate. You reach for the bare loop when the task is well-defined and bounded: look something up, transform it, write it back. The craft is in the tools, not the loop: clear names, tight schemas, helpful error messages, and guardrails on anything that writes or spends. Get the workflow plumbing right and most agents are a good tool-use loop plus one or two patterns from above. Watch the usual failure modes: infinite loops, runaway cost, and tools that fail silently.

Putting it together

Composing patterns without the haunted house

The patterns are layers, not rivals. The trick is adding only the layers a real failure demands — and keeping each one thin enough to debug.

No serious system uses exactly one pattern. A mature agent might route an incoming request to the right lane, run an orchestrator that plans and decomposes the work, give each worker its own ReAct tool-use loop, and wrap the final output in an evaluator-optimizer pass before returning it. Five patterns, one pipeline — and each is doing a job the others can't.

The order is the insight. Routing chooses, planning sequences, orchestration parallelizes, ReAct and the tool-use loop execute each step, and reflection or evaluation guards the result. Stack them in that grain and the system reads top-to-bottom; fight the grain and you get the haunted house — agents calling agents calling agents with no one able to say why a given answer came out the way it did.

So compose deliberately. Begin with the thinnest thing that could work — usually a tool-use loop or plain ReAct — ship it, and add a layer only when you can name the failure it fixes. "Tasks longer than eight steps lose the thread" earns you planning. "Output quality is inconsistent" earns you reflection or an evaluator. "One prompt can't serve three audiences" earns you routing. Every layer you add is more latency, more tokens, and more surface area for bugs, so each one should pay for itself.

The composition rule of thumb

Add patterns in this order as needs appear: tool-use loop → ReAct → reflection → routing → planning → orchestrator-worker → evaluator-optimizer. Stop at the first level that meets your quality bar. Most agents never need the last two — and the ones that do should reach them because a metric forced the move, not because multi-agent diagrams look impressive.

FAQ

Agent design patterns, answered

AI agent design patterns are reusable shapes for organizing how a language model reasons, calls tools, and coordinates with other agents to finish a task. They are the agent equivalent of software design patterns: named, battle-tested structures like ReAct, plan-and-execute, reflection, routing, and orchestrator-worker that you reach for instead of reinventing control flow each time. Picking the right pattern is mostly about how predictable the task is, how many steps it takes, and how much you can afford to spend on extra model calls.

Keep reading

Go deeper on building real agents

Agentic workflowsHow these patterns wire into real pipelines LLM agentsThe reason-act loop these patterns extend Multi-agent systemsWhere orchestrator-worker lives AI agent orchestrationCoordinating planners and workers Single-agent vs multi-agentWhen one agent beats many

AI agent design patternsagent patternsReAct patternplan and executereflection patternorchestrator workeragentic patterns

Get started

Build agents on patterns that hold up

Start from templates that bake in ReAct, routing, and orchestration — then compose your way up only as far as the task demands. Free to start, no credit card required.

Start building free Browse agent templates