What Is the ReAct Pattern?
Every time a voice agent checks your calendar, looks up an order, or books an appointment, it's running a loop that most developers never see: Think → Act → Observe → Repeat.
This is the ReAct pattern (short for Reasoning and Acting), and it's the foundational design pattern behind modern AI agents. First introduced by Yao et al. (2022; published at ICLR 2023), ReAct describes how a large language model (LLM) alternates between explicit reasoning steps and tool-using actions in an iterative loop until it arrives at a final answer.
Instead of generating a response in one shot, a ReAct agent:
- Thinks about what it knows and what's missing
- Acts by calling an external tool (an API, database, or service)
- Observes the result
- Thinks again with the new information
- Repeats until it has enough context to respond
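The loop above can be sketched in a few lines of framework-free Python. Here `fake_llm` and `lookup_calendar` are hypothetical stand-ins for a real model and a real calendar tool; the control flow is the part that matters:

```python
def lookup_calendar(user_id: str, date: str) -> dict:
    """Stub tool standing in for a real calendar service."""
    return {"appointment": "Dr. Smith, 2:30 PM"}

TOOLS = {"lookup_calendar": lookup_calendar}

def fake_llm(history: list) -> dict:
    """Stub LLM: proposes a tool call first, answers once it has data."""
    if not any(msg["role"] == "observation" for msg in history):
        return {"action": "lookup_calendar",
                "args": {"user_id": "123", "date": "2026-03-10"}}
    return {"answer": "Your dentist appointment is at 2:30 PM."}

def react_loop(question: str, max_steps: int = 5) -> str:
    history = [{"role": "user", "content": question}]
    for _ in range(max_steps):
        step = fake_llm(history)                          # Think
        if "answer" in step:                              # enough context
            return step["answer"]                         # Respond
        result = TOOLS[step["action"]](**step["args"])    # Act
        history.append({"role": "observation", "content": result})  # Observe
    return "I wasn't able to find that. Let me connect you with someone."
```

Calling `react_loop("What time is my dentist appointment tomorrow?")` runs one full think→act→observe cycle and then responds, mirroring the walkthrough below.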
For voice agents, this matters because it's the pattern that determines how your agent decides to call tools, look up data, and respond, all in real time, while a human is waiting on the other end of the line.
Why Developers Should Care About ReAct
If you've ever built a voice agent that calls a function (checking availability, looking up a customer record, placing an order), you've already implemented ReAct, whether you knew it or not.
Understanding the pattern explicitly helps you:
- Design better tool schemas that minimize unnecessary reasoning loops
- Debug failures by tracing exactly where the agent's logic broke down
- Optimize latency by reducing the number of think→act cycles
- Scale capabilities by adding tools without breaking reliability
ReAct is also the reasoning engine inside every other agent pattern. The Supervisor pattern uses ReAct for routing decisions. The Handoff pattern uses ReAct to determine when to transfer. Understanding ReAct is a prerequisite to understanding the rest.
How the ReAct Loop Works in a Voice Agent
Let's walk through a concrete example. A user calls in and says:
"What time is my dentist appointment tomorrow?"
Here's the ReAct loop in action:
| Step | Phase | What Happens |
|---|---|---|
| 1 | 🧠 Think | "The user wants appointment info. I need to check their calendar. I'll call the calendar lookup tool." |
| 2 | ⚡ Act | Calls `lookup_calendar(user_id="123", date="2026-03-10")` |
| 3 | 👁️ Observe | Tool returns `{appointment: "Dr. Smith, 2:30 PM"}` |
| 4 | 🧠 Think | "I have the info. I can respond now." |
| 5 | 💬 Respond | "Your dentist appointment with Dr. Smith is tomorrow at 2:30 PM." |
For simple lookups, the loop runs once. For complex, multi-step requests (like booking an appointment that requires checking availability, confirming a time, and sending a confirmation), the loop may iterate several times before the agent has enough context to give a final answer.
The key insight is that the LLM never executes tools directly. It proposes tool calls, and the surrounding system executes them and feeds results back. This separation is what makes ReAct safe, loggable, and controllable.
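That separation is easy to see in code. In a minimal sketch of the host side (the tool and function names here are illustrative, not from any framework), the model only emits a proposal; the surrounding system validates it, logs it, and executes it:

```python
import json

def lookup_calendar(user_id: str, date: str) -> dict:
    """Stub tool standing in for a real calendar service."""
    return {"appointment": "Dr. Smith, 2:30 PM"}

TOOLS = {"lookup_calendar": lookup_calendar}

def execute_proposed_call(proposal: dict) -> dict:
    """Validate, log, then run the tool call the LLM merely proposed."""
    name = proposal.get("name")
    if name not in TOOLS:
        # Never blindly execute model output: reject unknown tools
        return {"error": f"unknown tool: {name}"}
    print("AUDIT:", json.dumps(proposal))  # every proposed call is loggable
    return {"result": TOOLS[name](**proposal.get("args", {}))}
```

Because every call flows through one choke point, you get the audit trail and safety controls for free.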
ReAct vs. Chain of Thought vs. Function Calling
These three concepts are frequently confused. Here's how they differ:
| Approach | How It Works | Strengths | Limitations |
|---|---|---|---|
| Chain of Thought (CoT) | LLM reasons step-by-step but never takes actions. Purely internal reasoning. | Improves accuracy on reasoning-heavy tasks | Can't access live data. Prone to hallucination when facts are needed. |
| Function Calling | LLM outputs structured tool calls (JSON) without explicit reasoning text | Reliable, fast, structured output | Reasoning is implicit, making it harder to debug and prone to losing track in long sequences |
| ReAct | Combines CoT reasoning with tool actions in an interleaved loop | Visible reasoning, grounded in real data, auditable | More tokens consumed, potential latency from reasoning steps |
In practice, modern voice agents blend these approaches. Most frameworks use function calling mechanics (structured JSON tool calls) with ReAct-style reasoning traces for observability. You get the reliability of function calling with the debuggability of explicit reasoning.
When to Use the ReAct Pattern
ReAct is the right choice when:
- The solution path isn't known in advance. The agent needs to figure out what to do, not just do a fixed sequence.
- Your agent calls tools. Appointment booking, order lookups, CRM queries, knowledge base searches. Any time an agent needs external data to answer.
- You need an audit trail. Every decision the agent makes is visible. You can trace what it thought, what it tried, and what it observed. When things go wrong, you can trace exactly where logic broke down.
- You're prototyping. ReAct is the simplest pattern to get a tool-using agent working.
When to consider alternatives
- Simple, deterministic flows where steps are always the same → use the Sequential Pipeline instead
- Multi-domain routing where the main job is directing to specialists → use the Handoff pattern
- Too many tools (20+) → the LLM may pick the wrong tool or get lost. Consider a Supervisor to route to specialized sub-agents, each with a focused toolset.
- High-volume, simple queries like FAQ lookups or status checks → these don't need iterative reasoning
The Latency Challenge: ReAct in Real-Time Voice
Here's the hard part. Every think→act→observe cycle adds latency. In text chat, a user can wait. In voice, every extra cycle is dead air.
A single ReAct loop iteration can add 500ms or more if the LLM generates verbose reasoning before acting. Multiply that by 2-3 iterations for a complex request, and you're looking at seconds of silence.
How to minimize latency in voice ReAct agents
1. Design tools that return complete, ready-to-speak answers
Don't return raw database rows that require further LLM processing. Return pre-formatted results like "3 slots available at 1 PM, 2:30 PM, and 4 PM" instead of `[{time: "13:00"}, {time: "14:30"}, {time: "16:00"}]`.
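A hypothetical formatting helper makes the idea concrete: do the row-to-speech conversion in the tool, so the LLM can read the result aloud without another reasoning pass.

```python
def speak_time(hhmm: str) -> str:
    """Convert 24-hour "HH:MM" into a speech-friendly form like "2:30 PM"."""
    hour, minute = map(int, hhmm.split(":"))
    suffix = "AM" if hour < 12 else "PM"
    hour12 = hour % 12 or 12
    # Drop ":00" so "13:00" reads as "1 PM" rather than "1:00 PM"
    return f"{hour12} {suffix}" if minute == 0 else f"{hour12}:{minute:02d} {suffix}"

def format_slots_for_speech(raw_slots: list) -> str:
    """Turn raw slot rows into one ready-to-speak sentence."""
    spoken = [speak_time(s["time"]) for s in raw_slots]
    if len(spoken) == 1:
        return f"1 slot available at {spoken[0]}"
    return (f"{len(spoken)} slots available at "
            + ", ".join(spoken[:-1]) + f", and {spoken[-1]}")
```

`format_slots_for_speech([{"time": "13:00"}, {"time": "14:30"}, {"time": "16:00"}])` returns exactly the ready-to-speak string above.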
2. Pre-classify intent to skip unnecessary reasoning
For common use cases (billing, support, booking), a lightweight classifier can bypass the full reasoning loop. The agent goes straight to the right tool instead of "thinking" about which tool to use.
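As a toy illustration (a real system would use a small classifier model, and these tool names are made up), even a keyword router can short-circuit the reasoning loop for the common cases:

```python
INTENT_TOOLS = {
    "billing": "lookup_invoice",
    "booking": "check_availability",
    "support": "search_kb",
}

def classify_intent(utterance: str):
    """Crude keyword classifier standing in for a lightweight model."""
    text = utterance.lower()
    if "bill" in text or "invoice" in text:
        return "billing"
    if "appointment" in text or "book" in text:
        return "booking"
    if "help" in text or "broken" in text:
        return "support"
    return None  # unrecognized: fall back to the full ReAct loop

def route(utterance: str) -> str:
    """Go straight to a tool when intent is clear; otherwise reason."""
    intent = classify_intent(utterance)
    return INTENT_TOOLS[intent] if intent else "full_react_loop"
```

Recognized intents skip the "which tool should I use?" thinking step entirely; everything else still gets full ReAct treatment.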
3. Use filler speech during tool calls
While the agent is waiting for an API response, have it say "Let me check that for you" or "One moment while I look that up." This eliminates dead air and makes the interaction feel natural.
4. Stream partial responses
Start TTS on the first sentence of the response while the LLM is still generating the rest. Don't wait for the full answer.
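One simple way to do this (a sketch; production systems use smarter sentence segmentation that handles abbreviations like "Dr.") is to buffer the token stream and emit each sentence to TTS the moment it completes:

```python
import re

def sentences_from_stream(token_stream):
    """Yield complete sentences as soon as they appear in an LLM token
    stream, so TTS can start before generation finishes."""
    buffer = ""
    for token in token_stream:
        buffer += token
        while True:
            # A sentence ends at . ! or ? followed by whitespace
            m = re.search(r"[.!?]\s", buffer)
            if not m:
                break
            yield buffer[:m.end()].strip()
            buffer = buffer[m.end():]
    if buffer.strip():
        yield buffer.strip()  # flush whatever remains at end of stream
```

Feeding each yielded sentence to TTS immediately means the user hears the first sentence while the model is still writing the second.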
5. Limit tool set size
ReAct agents struggle with large tool sets. Keep each agent's toolbox focused (max 5-10 tools). If you need more capabilities, use a Supervisor pattern to route to specialized sub-agents.
Building a ReAct Voice Agent with LiveKit
LiveKit's Agents framework implements the ReAct pattern natively through the `@function_tool` decorator. When you give an agent tools, LiveKit handles the full think→act→observe loop automatically. The LLM decides whether to call a tool, LiveKit executes it, and the return value is fed back to the LLM for continued reasoning.
Here's a practical example, an appointment booking agent that demonstrates multi-step ReAct reasoning:
```python
from livekit.agents import Agent, function_tool, RunContext

class AppointmentAgent(Agent):
    def __init__(self):
        super().__init__(
            instructions="""You help users book appointments.
            Use the available tools to check availability and book slots.
            Always confirm details with the user before booking."""
        )

    @function_tool()
    async def check_availability(self, context: RunContext, date: str):
        """Check available appointment slots for a given date.

        Args:
            date: The date to check in YYYY-MM-DD format.
        """
        raw_slots = await calendar_api.get_slots(date)
        # Pre-format as speech-ready strings so the LLM can read them directly
        formatted = [s["time"].strftime("%I:%M %p") for s in raw_slots]
        return {"available_slots": formatted}

    @function_tool()
    async def book_appointment(self, context: RunContext, date: str, time: str):
        """Book an appointment at the specified date and time.

        Args:
            date: The date in YYYY-MM-DD format.
            time: The time in HH:MM format.
        """
        confirmation = await calendar_api.book(date, time)
        return {"confirmation_id": confirmation.id}
```
What happens at runtime
- User says: "Can I get an appointment tomorrow afternoon?"
- 🧠 Think: LLM reasons it needs to check availability
- ⚡ Act: Calls `check_availability(date="2026-03-10")`
- 👁️ Observe: Returns `{"available_slots": ["1:00 PM", "2:30 PM", "4:00 PM"]}`
- 💬 Speak: "I have three slots available: 1 PM, 2:30 PM, and 4 PM. Which works best?"
- User says: "2:30 works"
- 🧠 Think: LLM reasons it should book the confirmed time
- ⚡ Act: Calls `book_appointment(date="2026-03-10", time="14:30")`
- 👁️ Observe: Returns confirmation
- 💬 Speak: "You're all set! Appointment confirmed for tomorrow at 2:30 PM."
This is two full ReAct loops (one to gather data, one to take action), orchestrated automatically by LiveKit's agent framework.
LiveKit Features That Supercharge ReAct
LiveKit's framework provides several capabilities specifically designed to make ReAct work well in real-time voice:
Eliminate dead air with `session.say()`
Tools can generate speech during execution. While waiting for an async API call, the agent can say "Let me check that for you", eliminating the silence that makes ReAct feel slow:
```python
@function_tool()
async def lookup_order(self, context: RunContext, order_id: str):
    """Look up an order by ID."""
    context.session.say("Let me pull up that order for you.")
    result = await orders_api.get(order_id)
    return result
```
Prevent interruptions during critical actions
In voice, users sometimes speak while the agent is mid-action. `context.disallow_interruptions()` prevents barge-in during critical tool execution, such as when writing to a database:
```python
@function_tool()
async def process_payment(self, context: RunContext, amount: float):
    """Process a payment."""
    context.disallow_interruptions()
    result = await payment_api.charge(amount)
    return result
```
Graceful error handling with ToolError
When a tool call fails, raising a `ToolError` returns a structured error message to the LLM. The agent then reasons about the failure and tries a different approach. The observe→think loop handles recovery gracefully instead of crashing:
```python
from livekit.agents.llm import ToolError

@function_tool()
async def check_balance(self, context: RunContext, account_id: str):
    """Check account balance."""
    try:
        return await accounts_api.balance(account_id)
    except AccountNotFound:
        raise ToolError("Account not found. Ask the user to verify their account number.")
```
Dynamic tool management
`agent.update_tools()` lets you add or remove tools at runtime. Start with a small toolset for fast routing, then expand capabilities as the conversation evolves, keeping the ReAct loop tight when it matters most.
MCP support for extensible toolboxes
LiveKit agents can load tools from external MCP (Model Context Protocol) servers, extending the ReAct toolbox without code changes. Connect to any MCP-compatible service, and the tools appear automatically in the agent's reasoning loop.
Common ReAct Failure Modes (and How to Fix Them)
ReAct agents fail in predictable ways. Knowing the patterns helps you build more reliable voice agents.
1. Wrong tool selection
Symptom. The agent calls the billing API when the user asks about shipping.
Fix. Write descriptive, unambiguous tool descriptions. The LLM chooses tools based on the docstring. Vague descriptions lead to wrong choices. Be specific about when each tool should be used.
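A side-by-side sketch shows the difference (both functions are made-up examples, and the bodies are stubbed). The LLM only ever sees the docstring, so specificity directly drives tool selection:

```python
# Vague: the model can't tell this apart from a shipping or order tool.
def get_info(customer_id: str) -> dict:
    """Get customer info."""
    ...

# Specific: states what it returns and when (not) to use it.
def get_billing_summary(customer_id: str) -> dict:
    """Return the customer's current balance, last invoice date, and
    default payment method. Use only for billing questions; do not use
    for shipping or order-status questions."""
    ...
```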
2. Infinite reasoning loops
Symptom. The agent keeps calling tools without ever producing a final answer.
Fix. Set a maximum iteration limit (3-5 tool calls are typical for voice). If the agent can't resolve the query in that budget, have it gracefully escalate: "I wasn't able to find that information. Let me connect you with someone who can help."
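The budget guard itself is a few lines. In this framework-free sketch, `llm_step` is a hypothetical callable that returns either an answer or a tool call each time the loop thinks:

```python
ESCALATION = ("I wasn't able to find that information. "
              "Let me connect you with someone who can help.")

def run_with_budget(llm_step, max_tool_calls: int = 4) -> str:
    """Run ReAct steps until the model answers or the budget is spent."""
    for _ in range(max_tool_calls):
        outcome = llm_step()
        if outcome.get("answer") is not None:
            return outcome["answer"]
        # Otherwise the step was a tool call: spend one unit of budget
    return ESCALATION  # graceful escalation instead of an infinite loop
```

An agent that would otherwise spin forever now degrades into a polite handoff after a bounded number of tool calls.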
3. Tool overload
Symptom. Agent accuracy drops as you add more tools (especially beyond 15-20).
Fix. Decompose into specialized sub-agents using the Supervisor pattern. Each sub-agent gets a focused toolset of 5-10 tools, and the supervisor routes to the right one. As one AI engineering practitioner noted: "ReAct shines when the action space is small and the feedback from tools is crisp. Once the tool list grows, the reasoning loop degrades fast."
4. Verbose reasoning causing latency
Symptom. The agent "thinks" for 2+ seconds before acting, even on simple requests.
Fix. Use a fast, lightweight model (like GPT-4o-mini) for initial triage and tool routing, then switch to a larger model only for complex reasoning. LiveKit's Agent class supports per-agent model overrides. All models are accessed through LiveKit Inference, a unified API for STT, LLM, and TTS with no separate account setup required.
5. Hallucinated tool arguments
Symptom. The agent invents parameter values instead of asking the user.
Fix. Make required parameters explicit in tool schemas and add validation. If a required field is missing, raise a `ToolError` prompting the agent to ask the user.
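A minimal validation sketch, framework-free for illustration (in LiveKit you would raise `ToolError` where this uses the stand-in `MissingArgument`):

```python
class MissingArgument(Exception):
    """Stand-in for LiveKit's ToolError in this framework-free sketch."""

# Hypothetical schema: which arguments each tool cannot do without
REQUIRED_PARAMS = {"book_appointment": ("date", "time")}

def validate_args(tool_name: str, args: dict) -> bool:
    """Reject tool calls with missing or empty required arguments, so
    the agent asks the user instead of inventing values."""
    missing = [p for p in REQUIRED_PARAMS.get(tool_name, ()) if not args.get(p)]
    if missing:
        raise MissingArgument(
            f"Missing required argument(s): {', '.join(missing)}. "
            "Ask the user to provide them; do not guess.")
    return True
```

The error text doubles as an instruction to the model: the observe→think loop reads it and turns to the user for the missing value rather than fabricating one.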
ReAct as the Foundation for Advanced Patterns
ReAct is the reasoning engine inside every other pattern:
- Supervisor / Coordinator, where the supervisor agent uses ReAct to reason about which specialist to delegate to
- Handoff / Routing, where the triage agent uses ReAct to determine when and where to transfer the call
- Human-in-the-Loop, where the agent uses ReAct to assess risk and decide when to escalate to a human
- Sequential Pipeline, where each stage can use ReAct internally for its specialized processing
Understanding ReAct gives you a mental model for debugging any agent behavior, because at the lowest level, every agent decision follows the same think→act→observe loop.
Key Takeaways
- ReAct (Think → Act → Observe → Repeat) is the reasoning engine inside every tool-calling agent. Understanding it helps you design better tools, debug failures, and optimize latency
- The pattern separates agents that talk from agents that solve problems. It's how your agent decides to check a calendar, look up an order, or book an appointment
- Latency is the main tradeoff in voice. Minimize it with complete tool responses, intent pre-classification, filler speech, and focused toolsets (5–10 tools max)
- ReAct is the foundation for every other pattern. The Supervisor uses it for routing, the Handoff uses it to decide when to transfer, HITL uses it to assess risk
- LiveKit's `@function_tool` decorator implements the full loop automatically, with built-in support for filler speech, error recovery, and dynamic tool management
Getting Started
Here's the fastest path to a ReAct-powered voice agent:
- Start with the LiveKit Agents quickstart to get a working voice agent in minutes
- Add your first tool by decorating a method with `@function_tool()` and watch the ReAct loop come alive
- Test in the Agent Playground to interact with your agent without any frontend setup, or prototype without code using Agent Builder
- Iterate on tool schemas. The #1 lever for ReAct quality is well-written tool descriptions
- Monitor the reasoning loop using Agent Observability to trace every think→act→observe cycle and identify optimization opportunities
The ReAct pattern is the foundation. Once you understand how your agent thinks, you can make it think faster, smarter, and more reliably. That's what separates a demo from a production voice agent.
Give it a try and let us know what you're building.