The road ahead for autonomous agents
A transparent view of what we're building. This is the AI Agentics product roadmap: the capabilities shipping now, the work landing next, and the long-horizon bets we're exploring for the future of agentic AI.
- Updated quarterly
- Now / Next / Later
- Community-shaped
This roadmap is a living document of where the platform is headed. We build in the open because the teams shipping production agents deserve to plan around what's coming, not guess at it. Everything below is grouped into three honest horizons: Now (in active development or beta), Next (committed and scoped), and Later (directional bets we're still researching).
Our north star is the same one that guides every team building AI agents: close the gap between an agent that demos well and one you can trust in production. That means investing in evaluation, observability, memory, and orchestration just as heavily as in raw new capabilities like voice and long-running autonomy.
Timelines shift as we learn — a roadmap is a statement of intent, not a contract. For features that have already landed, see the changelog. To weigh in on priorities, tell us what you need.
Now: shipping in the current cycle
In active development or already in beta. These land in the next few weeks and define our current focus on trust, quality, and new agent modalities.
Agent evals
Shipping: A built-in evaluation suite that measures agent quality the way you measure code, with regression sets, scoring rubrics, LLM-as-judge graders, and CI gates that block a deploy when task success or tool-call accuracy drops.
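A CI gate of this kind can be sketched in a few lines. The example below is illustrative only and is not the AI Agentics API: `EvalResult`, `gate`, and the 0.02 tolerance are hypothetical names and values chosen for the sketch.

```python
from dataclasses import dataclass


@dataclass
class EvalResult:
    """Aggregate scores from one run of a regression set (hypothetical shape)."""
    task_success: float        # fraction of tasks completed correctly, 0..1
    tool_call_accuracy: float  # fraction of tool calls matching expectations, 0..1


def gate(baseline: EvalResult, candidate: EvalResult, max_drop: float = 0.02) -> list[str]:
    """Return the reasons to block a deploy; an empty list means it may proceed."""
    failures = []
    for metric in ("task_success", "tool_call_accuracy"):
        before = getattr(baseline, metric)
        after = getattr(candidate, metric)
        # Block only when the metric regresses by more than the tolerance.
        if before - after > max_drop:
            failures.append(f"{metric} dropped from {before:.2f} to {after:.2f}")
    return failures


# Placeholder numbers, not measured results.
baseline = EvalResult(task_success=0.91, tool_call_accuracy=0.95)
candidate = EvalResult(task_success=0.86, tool_call_accuracy=0.95)
failures = gate(baseline, candidate)
```

In a pipeline, a non-empty `failures` list would typically translate into a non-zero exit code so the deploy step never runs.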
Voice agents (beta)
Shipping: Low-latency speech-to-speech agents with streaming transcription, barge-in handling, and the same tool-calling loop as text agents, so a phone or in-app voice agent shares one codebase.
Trace-level observability
Shipping: Step-by-step traces of every reasoning turn, tool call, and token spend, with replay and diff views so you can debug a failed run and reproduce it deterministically.
Guardrail policies
Shipping: Declarative input/output guardrails (PII redaction, allowed-tool scopes, and content filters) enforced at the orchestration layer rather than baked into prompts.
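To make "enforced at the orchestration layer" concrete, here is a minimal sketch of a policy object that the runtime consults before a tool call and before text leaves the agent. `Policy`, `check_tool_call`, and `redact` are hypothetical names, and the email regex is a deliberately simple stand-in for real PII detection.

```python
import re
from dataclasses import dataclass, field

# Simplistic email matcher used only for illustration.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


@dataclass
class Policy:
    """Declarative guardrail settings, kept outside of any prompt text."""
    allowed_tools: set[str] = field(default_factory=set)
    redact_patterns: list[re.Pattern] = field(default_factory=lambda: [EMAIL])


def check_tool_call(policy: Policy, tool_name: str) -> bool:
    """True only if the tool is inside the agent's allowed scope."""
    return tool_name in policy.allowed_tools


def redact(policy: Policy, text: str) -> str:
    """Replace every match of every configured pattern with a placeholder."""
    for pattern in policy.redact_patterns:
        text = pattern.sub("[REDACTED]", text)
    return text


policy = Policy(allowed_tools={"search_docs", "create_ticket"})
safe_text = redact(policy, "Contact jane.doe@example.com today")
```

Because the policy is data rather than prompt wording, it can be reviewed, versioned, and applied uniformly to every agent that runs under it.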
Next: committed and scoped
Work we've designed and prioritized for the coming quarters. The shape is settled; we're heads-down on the build.
On-prem & VPC deployment
Planned: Run the full agent runtime inside your own VPC or air-gapped data center. Bring your own models and vector store, keep every byte of agent data in your boundary, and stay aligned with SOC 2 and HIPAA controls.
Agent marketplace
Planned: A curated catalog of installable agents, tools, and orchestration templates. Fork a support agent or research agent into your workspace in one click, with versioning and trust signals built in.
Shared long-term memory
Planned: A managed memory service with hybrid semantic plus keyword retrieval, per-user namespaces, and recency decay, so fleets of agents share context without you operating a vector store yourself.
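One way to picture how hybrid retrieval, namespaces, and recency decay fit together is the sketch below. It is an assumption-laden illustration, not the planned service: the `Memory` shape, the 0.7 blend weight `alpha`, the 30-day half-life, and the word-overlap keyword score are all hypothetical choices.

```python
import math
from dataclasses import dataclass


@dataclass
class Memory:
    text: str
    embedding: list[float]  # would come from an embedding model in practice
    age_days: float


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def keyword_overlap(query: str, text: str) -> float:
    """Fraction of query words that also appear in the memory text."""
    q, t = set(query.lower().split()), set(text.lower().split())
    return len(q & t) / len(q) if q else 0.0


def score(query, query_embedding, memory, alpha=0.7, half_life_days=30.0):
    # Blend semantic and keyword relevance, then halve it every half-life.
    relevance = (alpha * cosine(query_embedding, memory.embedding)
                 + (1 - alpha) * keyword_overlap(query, memory.text))
    return relevance * 0.5 ** (memory.age_days / half_life_days)


def retrieve(store, user_id, query, query_embedding, k=3):
    """Rank only the memories inside one user's namespace."""
    candidates = store.get(user_id, [])
    ranked = sorted(candidates,
                    key=lambda m: score(query, query_embedding, m),
                    reverse=True)
    return ranked[:k]


store = {
    "user-1": [
        Memory("prefers email updates", [1.0, 0.0], age_days=30.0),
        Memory("prefers email updates weekly", [1.0, 0.0], age_days=0.0),
    ],
    "user-2": [Memory("prefers phone calls", [0.0, 1.0], age_days=0.0)],
}
top = retrieve(store, "user-1", "email updates", [1.0, 0.0], k=1)
```

Both of the first user's memories are equally relevant to the query, so the fresher one ranks first; the second user's memories are never considered because the lookup is scoped to a single namespace.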
MCP-native tool registry
Planned: First-class Model Context Protocol (MCP) support. Discover, install, and govern external tool servers from one registry, with scoped credentials and audit logging for every tool an agent can reach.
On-prem deployment and the marketplace are the two requests we hear most from enterprise teams. Both build on the foundations described in our AI agent frameworks guide and the patterns behind multi-agent systems.
Later: bets we're exploring
Directional research where the problem matters but the design is still open. We share these to invite your input — not to promise a date.
Autonomous multi-day agents
Exploring: Long-horizon agents that durably checkpoint state, survive restarts, and run a task across hours or days, pausing for human approval at the right moments instead of holding a single session open.
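Since this item is still exploratory, the following is only a sketch of the general checkpoint-and-resume pattern, not a design the team has committed to. The JSON file format, the `run` loop, and the step names are hypothetical; the atomic write uses a temporary file followed by `os.replace`.

```python
import json
import os
import tempfile


def save_checkpoint(path: str, state: dict) -> None:
    """Write state to a temp file, then atomically swap it into place."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
    os.replace(tmp_path, path)  # a crash never leaves a half-written checkpoint


def load_checkpoint(path: str) -> dict:
    if not os.path.exists(path):
        return {"next_step": 0, "results": []}
    with open(path) as f:
        return json.load(f)


def run(path: str, steps: list, approved: bool = False) -> str:
    """Resume from the last checkpoint; stop when a step needs approval."""
    state = load_checkpoint(path)
    while state["next_step"] < len(steps):
        name, needs_approval = steps[state["next_step"]]
        if needs_approval and not approved:
            return "waiting_for_approval"  # process can exit; state is on disk
        if needs_approval:
            approved = False  # one approval covers one gated step
        state["results"].append(f"done:{name}")  # stand-in for real work
        state["next_step"] += 1
        save_checkpoint(path, state)
    return "complete"


# Hypothetical task: (step name, requires human approval?)
STEPS = [("fetch_data", False), ("send_email", True), ("write_report", False)]
```

Calling `run` once executes `fetch_data` and returns `"waiting_for_approval"`. A later call with `approved=True`, possibly from a different process after a restart, picks up at `send_email` and finishes the task.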
Fine-tune routing
Exploring: Automatically route each sub-task to the cheapest model that can handle it, distilling a fine-tuned small model for hot paths while reserving frontier models for the hard reasoning steps.
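The core routing rule, "cheapest model that can handle it", can be illustrated with a small sketch. Everything here is hypothetical: the model names, the prices, and the idea of an integer difficulty score are placeholders, and how difficulty would actually be estimated is exactly the open research question.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    cost_per_1k_tokens: float  # placeholder prices, not real quotes
    max_difficulty: int        # hardest sub-task this model is trusted with


MODELS = [
    Model("small-finetuned", cost_per_1k_tokens=0.10, max_difficulty=2),
    Model("mid-tier", cost_per_1k_tokens=1.00, max_difficulty=4),
    Model("frontier", cost_per_1k_tokens=10.00, max_difficulty=5),
]


def route(difficulty: int, models: list[Model] = MODELS) -> Model:
    """Pick the cheapest model whose capability covers the sub-task."""
    capable = [m for m in models if m.max_difficulty >= difficulty]
    if not capable:
        raise ValueError(f"no model can handle difficulty {difficulty}")
    return min(capable, key=lambda m: m.cost_per_1k_tokens)


easy = route(1)  # hot-path sub-task
hard = route(5)  # hard reasoning step
```

With these placeholder numbers, easy sub-tasks go to the small fine-tuned model and only the hardest ones reach the frontier model.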
Self-organizing agent teams
Exploring: Orchestrators that spawn, supervise, and retire specialist worker agents on the fly based on the task graph, with built-in review loops so agents check each other's work.
Agent-authored tools
Exploring: Agents that recognize a missing capability, write and sandbox-test a new tool, and register it for reuse, turning one-off scripts into a growing, governed tool library.
Upcoming milestones by quarter
A quarter-by-quarter view of when the headline capabilities are expected to reach general availability or public beta.
- Q3 2026 (Now)
Agent evals GA + voice agents beta
The evaluation suite reaches general availability with CI gating, and voice agents open in public beta with streaming speech and shared tool-calling.
- Q4 2026 (Next)
On-prem deployment & shared memory
Self-hosted VPC deployment lands for enterprise plans alongside the managed long-term memory service and MCP-native tool registry.
- Q1 2027 (Next)
Agent marketplace launch
The curated marketplace opens with installable agents, tools, and templates — plus versioning, trust signals, and one-click forking into your workspace.
Your feedback shapes what we build
Tell us what to build next
This roadmap moves when we hear from the people building real agents. If a capability would unblock your team, or you'd reorder a horizon, send us the details through our contact page. We read every request, and recurring themes are exactly how items graduate from Later to Next to Now. To see what we've already shipped against this plan, browse the changelog.
Build on a platform that ships
Start today and grow into every capability on this roadmap as it lands. Free to begin — no credit card required.