
The ReAct Pattern for Voice Agents: How AI Agents Think, Act, and Respond

What Is the ReAct Pattern?

Every time a voice agent checks your calendar, looks up an order, or books an appointment, it's running a loop that most developers never see: Think → Act → Observe → Repeat.


This is the ReAct pattern (short for Reasoning and Acting), and it's the foundational design pattern behind modern AI agents. First introduced by Yao et al. (2022; published at ICLR 2023), ReAct describes how a large language model (LLM) alternates between explicit reasoning steps and tool-using actions in an iterative loop until it arrives at a final answer.

Instead of generating a response in one shot, a ReAct agent:

  1. Thinks about what it knows and what's missing
  2. Acts by calling an external tool (an API, database, or service)
  3. Observes the result
  4. Thinks again with the new information
  5. Repeats until it has enough context to respond
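The loop above can be sketched in plain Python. This is a minimal illustration of the control flow, not any framework's implementation; `llm_step` and `run_tool` are hypothetical stand-ins for the model call and the tool executor:

```python
def react_loop(user_message, llm_step, run_tool, max_iterations=5):
    """Minimal ReAct loop: think, act, observe, repeat until a final answer."""
    history = [{"role": "user", "content": user_message}]
    for _ in range(max_iterations):
        step = llm_step(history)  # Think: the model proposes an action or an answer
        if step["type"] == "final_answer":
            return step["content"]  # Enough context gathered: respond
        result = run_tool(step["tool"], step["args"])  # Act: execute the proposed call
        history.append({"role": "tool", "content": result})  # Observe: feed it back
    # Budget exhausted: escalate gracefully instead of looping forever
    return "I couldn't resolve that; let me connect you with someone who can."
```

Note the iteration budget: capping the loop and escalating when it's exhausted matters especially in voice, where every extra cycle is dead air.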

For voice agents, this matters because it's the pattern that determines how your agent decides to call tools, look up data, and respond, all in real time, while a human is waiting on the other end of the line.


Why Developers Should Care About ReAct

If you've ever built a voice agent that calls a function (checking availability, looking up a customer record, placing an order), you've already implemented ReAct, whether you knew it or not.

Understanding the pattern explicitly helps you:

  • Design better tool schemas that minimize unnecessary reasoning loops
  • Debug failures by tracing exactly where the agent's logic broke down
  • Optimize latency by reducing the number of think→act cycles
  • Scale capabilities by adding tools without breaking reliability

ReAct is also the reasoning engine inside every other agent pattern. The Supervisor pattern uses ReAct for routing decisions. The Handoff pattern uses ReAct to determine when to transfer. Understanding ReAct is a prerequisite to understanding the rest.


How the ReAct Loop Works in a Voice Agent

Let's walk through a concrete example. A user calls in and says:

"What time is my dentist appointment tomorrow?"

Here's the ReAct loop in action:

  1. 🧠 Think: "The user wants appointment info. I need to check their calendar. I'll call the calendar lookup tool."
  2. Act: Calls lookup_calendar(user_id="123", date="2026-03-10")
  3. 👁️ Observe: Tool returns {appointment: "Dr. Smith, 2:30 PM"}
  4. 🧠 Think: "I have the info. I can respond now."
  5. 💬 Respond: "Your dentist appointment with Dr. Smith is tomorrow at 2:30 PM."

For simple lookups, the loop runs once. For complex, multi-step requests (like booking an appointment that requires checking availability, confirming a time, and sending a confirmation), the loop may iterate several times before the agent has enough context to give a final answer.

The key insight is that the LLM never executes tools directly. It proposes tool calls, and the surrounding system executes them and feeds results back. This separation is what makes ReAct safe, loggable, and controllable.
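To make that separation concrete, here's a hypothetical dispatcher sketch. The registry, `register_tool`, and the stubbed `lookup_calendar` are assumptions for illustration, not LiveKit internals:

```python
import json

TOOL_REGISTRY = {}  # name -> callable; the system, not the LLM, owns this table

def register_tool(name):
    def wrap(fn):
        TOOL_REGISTRY[name] = fn
        return fn
    return wrap

def execute_proposal(proposal_json):
    """The LLM only emits a JSON proposal; this function validates and runs it."""
    proposal = json.loads(proposal_json)
    tool = TOOL_REGISTRY.get(proposal["tool"])
    if tool is None:
        # An unknown tool becomes an observable, loggable result, not a crash
        return {"error": f"unknown tool: {proposal['tool']}"}
    return tool(**proposal["args"])

@register_tool("lookup_calendar")
def lookup_calendar(user_id, date):
    # Stubbed backend call for illustration
    return {"appointment": "Dr. Smith, 2:30 PM"}
```

Because every action passes through one choke point, you can log, rate-limit, or veto any tool call before it touches a real system.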


ReAct vs. Chain of Thought vs. Function Calling

These three concepts are frequently confused. Here's how they differ:

  • Chain of Thought (CoT): The LLM reasons step-by-step but never takes actions; the reasoning is purely internal. Strength: improves accuracy on reasoning-heavy tasks. Limitations: it can't access live data and is prone to hallucination when facts are needed.
  • Function Calling: The LLM outputs structured tool calls (JSON) without explicit reasoning text. Strengths: reliable, fast, structured output. Limitation: the reasoning is implicit, which makes failures harder to debug and long sequences easy to lose track of.
  • ReAct: Combines CoT reasoning with tool actions in an interleaved loop. Strengths: visible reasoning, grounded in real data, auditable. Limitations: consumes more tokens and adds latency from explicit reasoning steps.

In practice, modern voice agents blend these approaches. Most frameworks use function calling mechanics (structured JSON tool calls) with ReAct-style reasoning traces for observability. You get the reliability of function calling with the debuggability of explicit reasoning.


When to Use the ReAct Pattern

ReAct is the right choice when:

  • The solution path isn't known in advance. The agent needs to figure out what to do, not just do a fixed sequence.
  • Your agent calls tools. Appointment booking, order lookups, CRM queries, knowledge base searches. Any time an agent needs external data to answer.
  • You need an audit trail. Every decision the agent makes is visible. You can trace what it thought, what it tried, and what it observed. When things go wrong, you can trace exactly where logic broke down.
  • You're prototyping. ReAct is the simplest pattern to get a tool-using agent working.

When to consider alternatives

  • Simple, deterministic flows where steps are always the same → use the Sequential Pipeline instead
  • Multi-domain routing where the main job is directing to specialists → use the Handoff pattern
  • Too many tools (20+) → the LLM may pick the wrong tool or get lost. Consider a Supervisor to route to specialized sub-agents, each with a focused toolset.
  • High-volume, simple queries like FAQ lookups or status checks → these don't need iterative reasoning

The Latency Challenge: ReAct in Real-Time Voice

Here's the hard part. Every think→act→observe cycle adds latency. In text chat, a user can wait. In voice, every extra cycle is dead air.

A single ReAct loop iteration can add 500ms or more if the LLM generates verbose reasoning before acting. Multiply that by 2-3 iterations for a complex request, and you're looking at seconds of silence.

How to minimize latency in voice ReAct agents

1. Design tools that return complete, ready-to-speak answers

Don't return raw database rows that require further LLM processing. Return pre-formatted results like "3 slots available at 1 PM, 2:30 PM, and 4 PM" instead of [{time: "13:00"}, {time: "14:30"}, {time: "16:00"}].
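A tool can do this formatting itself before returning. A small sketch, assuming slots arrive as dicts with a `time` string:

```python
def format_slots_for_speech(slots):
    """Turn raw slot rows into a sentence the agent can speak verbatim."""
    times = [s["time"] for s in slots]
    if not times:
        return "No slots are available that day."
    if len(times) == 1:
        return f"1 slot available at {times[0]}"
    # Join with an Oxford-style "and" so TTS reads it naturally
    listed = ", ".join(times[:-1]) + f", and {times[-1]}"
    return f"{len(times)} slots available at {listed}"
```

The LLM can pass this string straight to the response with no extra reasoning step, which saves a cycle in the loop.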

2. Pre-classify intent to skip unnecessary reasoning

For common use cases (billing, support, booking), a lightweight classifier can bypass the full reasoning loop. The agent goes straight to the right tool instead of "thinking" about which tool to use.
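The cheapest version of such a classifier is keyword matching before the LLM is ever invoked. A hypothetical sketch; a small ML classifier slots into the same shape:

```python
INTENT_KEYWORDS = {
    "billing": ["invoice", "charge", "refund", "bill"],
    "booking": ["appointment", "book", "schedule", "reschedule"],
    "support": ["broken", "not working", "issue", "help"],
}

def classify_intent(utterance):
    """Route common requests straight to a tool, skipping the reasoning loop."""
    text = utterance.lower()
    for intent, keywords in INTENT_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return intent
    return None  # no match: fall back to the full ReAct loop
```

Anything the classifier can't place still goes through the full loop, so you trade nothing in coverage for a fast path on common requests.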

3. Use filler speech during tool calls

While the agent is waiting for an API response, have it say "Let me check that for you" or "One moment while I look that up." This eliminates dead air and makes the interaction feel natural.

4. Stream partial responses

Start TTS on the first sentence of the response while the LLM is still generating the rest. Don't wait for the full answer.
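Sentence-level chunking is the usual mechanism: hand each complete sentence to TTS as soon as it arrives. A minimal sketch, independent of any particular TTS API:

```python
import re

def sentences_from_stream(token_stream):
    """Yield complete sentences as soon as they appear, so TTS can start early."""
    buffer = ""
    for token in token_stream:
        buffer += token
        while True:
            # A sentence is complete once punctuation is followed by whitespace
            match = re.search(r"(.+?[.!?])\s+", buffer)
            if not match:
                break
            yield match.group(1)           # hand this sentence to TTS immediately
            buffer = buffer[match.end():]
    if buffer.strip():
        yield buffer.strip()               # flush whatever remains at end of stream
```

The first sentence starts playing while the model is still generating the rest, which hides most of the generation latency from the caller.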

5. Limit tool set size

ReAct agents struggle with large tool sets. Keep each agent's toolbox focused (max 5-10 tools). If you need more capabilities, use a Supervisor pattern to route to specialized sub-agents.


Building a ReAct Voice Agent with LiveKit

LiveKit's Agents framework implements the ReAct pattern natively through the @function_tool decorator. When you give an agent tools, LiveKit handles the full think→act→observe loop automatically. The LLM decides whether to call a tool, LiveKit executes it, and the return value is fed back to the LLM for continued reasoning.

Here's a practical example, an appointment booking agent that demonstrates multi-step ReAct reasoning:

```python
from livekit.agents import Agent, function_tool, RunContext


class AppointmentAgent(Agent):
    def __init__(self):
        super().__init__(
            instructions="""You help users book appointments.
            Use the available tools to check availability and book slots.
            Always confirm details with the user before booking."""
        )

    @function_tool()
    async def check_availability(self, context: RunContext, date: str):
        """Check available appointment slots for a given date.

        Args:
            date: The date to check in YYYY-MM-DD format.
        """
        raw_slots = await calendar_api.get_slots(date)
        # Pre-format as speech-ready strings so the LLM can read them directly
        formatted = [s["time"].strftime("%I:%M %p") for s in raw_slots]
        return {"available_slots": formatted}

    @function_tool()
    async def book_appointment(self, context: RunContext, date: str, time: str):
        """Book an appointment at the specified date and time.

        Args:
            date: The date in YYYY-MM-DD format.
            time: The time in HH:MM format.
        """
        confirmation = await calendar_api.book(date, time)
        return {"confirmation_id": confirmation.id}
```

What happens at runtime

  1. User says: "Can I get an appointment tomorrow afternoon?"
  2. 🧠 Think: LLM reasons it needs to check availability
  3. Act: Calls check_availability(date="2026-03-10")
  4. 👁️ Observe: Returns {"available_slots": ["1:00 PM", "2:30 PM", "4:00 PM"]}
  5. 💬 Speak: "I have three slots available: 1 PM, 2:30 PM, and 4 PM. Which works best?"
  6. User says: "2:30 works"
  7. 🧠 Think: LLM reasons it should book the confirmed time
  8. Act: Calls book_appointment(date="2026-03-10", time="14:30")
  9. 👁️ Observe: Returns confirmation
  10. 💬 Speak: "You're all set! Appointment confirmed for tomorrow at 2:30 PM."

This is two full ReAct loops (one to gather data, one to take action), orchestrated automatically by LiveKit's agent framework.


LiveKit Features That Supercharge ReAct

LiveKit's framework provides several capabilities specifically designed to make ReAct work well in real-time voice:

Eliminate dead air with session.say()

Tools can generate speech during execution. While waiting for an async API call, the agent can say "Let me check that for you", eliminating the silence that makes ReAct feel slow:

```python
@function_tool()
async def lookup_order(self, context: RunContext, order_id: str):
    """Look up an order by ID."""
    context.session.say("Let me pull up that order for you.")
    result = await orders_api.get(order_id)
    return result
```

Prevent interruptions during critical actions

In voice, users sometimes speak while the agent is mid-action. context.disallow_interruptions() prevents barge-in during critical tool execution, such as when writing to a database:

```python
@function_tool()
async def process_payment(self, context: RunContext, amount: float):
    """Process a payment."""
    context.disallow_interruptions()
    result = await payment_api.charge(amount)
    return result
```

Graceful error handling with ToolError

When a tool call fails, ToolError returns a structured error to the LLM. The agent then reasons about the failure and tries a different approach. The observe→think loop handles recovery gracefully instead of crashing:

```python
from livekit.agents.llm import ToolError

@function_tool()
async def check_balance(self, context: RunContext, account_id: str):
    """Check account balance."""
    try:
        return await accounts_api.balance(account_id)
    except AccountNotFound:
        raise ToolError("Account not found. Ask the user to verify their account number.")
```

Dynamic tool management

agent.update_tools() lets you add or remove tools at runtime. Start with a small toolset for fast routing, then expand capabilities as the conversation evolves, keeping the ReAct loop tight when it matters most.

MCP support for extensible toolboxes

LiveKit agents can load tools from external MCP (Model Context Protocol) servers, extending the ReAct toolbox without code changes. Connect to any MCP-compatible service, and the tools appear automatically in the agent's reasoning loop.


Common ReAct Failure Modes (and How to Fix Them)

ReAct agents fail in predictable ways. Knowing the patterns helps you build more reliable voice agents.

1. Wrong tool selection

Symptom. The agent calls the billing API when the user asks about shipping.

Fix. Write descriptive, unambiguous tool descriptions. The LLM chooses tools based on the docstring. Vague descriptions lead to wrong choices. Be specific about when each tool should be used.

2. Infinite reasoning loops

Symptom. The agent keeps calling tools without ever producing a final answer.

Fix. Set a maximum iteration limit (3-5 tool calls are typical for voice). If the agent can't resolve the query in that budget, have it gracefully escalate: "I wasn't able to find that information. Let me connect you with someone who can help."

3. Tool overload

Symptom. Agent accuracy drops as you add more tools (especially beyond 15-20).

Fix. Decompose into specialized sub-agents using the Supervisor pattern. Each sub-agent gets a focused toolset of 5-10 tools, and the supervisor routes to the right one. As one AI engineering practitioner noted: "ReAct shines when the action space is small and the feedback from tools is crisp. Once the tool list grows, the reasoning loop degrades fast."

4. Verbose reasoning causing latency

Symptom. The agent "thinks" for 2+ seconds before acting, even on simple requests.

Fix. Use a fast, lightweight model (like GPT-4o-mini) for initial triage and tool routing, then switch to a larger model only for complex reasoning. LiveKit's Agent class supports per-agent model overrides. All models are accessed through LiveKit Inference, a unified API for STT, LLM, and TTS with no separate account setup required.

5. Hallucinated tool arguments

Symptom. The agent invents parameter values instead of asking the user.

Fix. Make required parameters explicit in tool schemas and add validation. If a required field is missing, return a ToolError prompting the agent to ask the user.
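Validation can be as simple as rejecting malformed values before the backend is ever called. A hypothetical sketch using a plain exception; in LiveKit, ToolError plays this role:

```python
import re

class ToolValidationError(Exception):
    """Raised back to the LLM so it asks the user instead of guessing."""

def validate_booking_args(date, time):
    """Reject arguments the LLM may have invented, like date='tomorrow'."""
    if not re.fullmatch(r"\d{4}-\d{2}-\d{2}", date):
        raise ToolValidationError("Invalid date. Ask the user for the date in YYYY-MM-DD format.")
    if not re.fullmatch(r"\d{2}:\d{2}", time):
        raise ToolValidationError("Invalid time. Ask the user for the time in HH:MM format.")
```

The error message doubles as an instruction: the observe→think step reads it and prompts the user for the missing value instead of retrying with another guess.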


ReAct as the Foundation for Advanced Patterns

ReAct is the reasoning engine inside every other pattern:

  • Supervisor / Coordinator, where the supervisor agent uses ReAct to reason about which specialist to delegate to
  • Handoff / Routing, where the triage agent uses ReAct to determine when and where to transfer the call
  • Human-in-the-Loop, where the agent uses ReAct to assess risk and decide when to escalate to a human
  • Sequential Pipeline, where each stage can use ReAct internally for its specialized processing

Understanding ReAct gives you a mental model for debugging any agent behavior, because at the lowest level, every agent decision follows the same think→act→observe loop.


Key Takeaways

  • ReAct (Think → Act → Observe → Repeat) is the reasoning engine inside every tool-calling agent. Understanding it helps you design better tools, debug failures, and optimize latency
  • The pattern separates agents that talk from agents that solve problems. It's how your agent decides to check a calendar, look up an order, or book an appointment
  • Latency is the main tradeoff in voice. Minimize it with complete tool responses, intent pre-classification, filler speech, and focused toolsets (5–10 tools max)
  • ReAct is the foundation for every other pattern. The Supervisor uses it for routing, the Handoff uses it to decide when to transfer, HITL uses it to assess risk
  • LiveKit's @function_tool decorator implements the full loop automatically, with built-in support for filler speech, error recovery, and dynamic tool management

Getting Started

Here's the fastest path to a ReAct-powered voice agent:

  1. Start with the LiveKit Agents quickstart to get a working voice agent in minutes
  2. Add your first tool by decorating a method with @function_tool() and watch the ReAct loop come alive
  3. Test in the Agent Playground to interact with your agent without any frontend setup, or prototype without code using Agent Builder
  4. Iterate on tool schemas. The #1 lever for ReAct quality is well-written tool descriptions
  5. Monitor the reasoning loop using Agent Observability to trace every think→act→observe cycle and identify optimization opportunities

The ReAct pattern is the foundation. Once you understand how your agent thinks, you can make it think faster, smarter, and more reliably. That's what separates a demo from a production voice agent.

Give it a try and let us know what you're building.