This tutorial walks you through building a voice AI healthcare intake assistant that uses a talking avatar powered by Anam. A patient opens the app, clicks Start, and speaks with Liv — a lip-synced Anam avatar who guides them through a medical intake form one field at a time. As the patient answers each question, the agent calls function tools that push updates to the on-screen form using RPC (remote procedure call), a mechanism that lets participants in a LiveKit room invoke methods on each other in real time. This keeps the avatar and the form in sync as the conversation progresses.
For the full source code, see the anam example.
What you'll build
By the end of this tutorial, you'll have a working app that:
- Renders a talking Anam avatar in the browser, lip-synced to the agent's voice.
- Guides a patient through a full intake form using voice, one field at a time.
- Updates each form field in real time as the patient speaks, using LiveKit RPC.
- Handles a final confirmation loop and lets the patient submit the form by voice.
- Works with either a Python or TypeScript backend — both share the same frontend.
Prerequisites
Before you start, make sure you have:
- Python 3.10 or later and uv installed (for the Python backend)
- Node.js and pnpm installed
- A LiveKit Cloud account (the free tier works)
- An Anam API key
The agent uses LiveKit Inference for STT, LLM, and TTS, so you don't need separate API keys from Deepgram, OpenAI, or ElevenLabs. You do need an Anam API key for the avatar.
Step 1: Get the code
Use pnpm dlx degit to download just the anam folder without cloning the entire repository:
```shell
pnpm dlx degit livekit-examples/python-agents-examples/complex-agents/avatars/anam anam
cd anam
```
The directory has three parts:
- `agent-py/` — the Python LiveKit agent
- `agent-ts/` — the TypeScript LiveKit agent (same functionality, different language)
- `frontend/` — a Next.js frontend that works with either backend
You only need to run one backend. Pick the one you're comfortable with.
Step 2: Set up the agent
Python backend
Install dependencies from the existing pyproject.toml:
```shell
cd agent-py
uv sync
```
Copy .env.example to .env.local and fill in your values:
```shell
cp .env.example .env.local
```
```
LIVEKIT_URL=wss://<project-subdomain>.livekit.cloud
LIVEKIT_API_KEY=<your_api_key>
LIVEKIT_API_SECRET=<your_api_secret>
ANAM_API_KEY=<your_anam_api_key>
```
Before the first run, download the VAD and turn detector models:
```shell
uv run python src/agent.py download-files
```
TypeScript backend
Install dependencies:
```shell
cd agent-ts
pnpm install
```
Copy .env.example to .env.local and fill in the same values:
```shell
cp .env.example .env.local
```
Before the first run, download the VAD and turn detector models:
```shell
pnpm run download-files
```
Get your LiveKit credentials from the LiveKit Cloud dashboard under Settings > API Keys. Get your Anam API key from the Anam Lab.
Step 3: Understand the agent
This section walks through the key parts of agent-py/src/agent.py. The TypeScript version in agent-ts/src/agent.ts follows the same design.
Voice pipeline
The agent uses LiveKit Inference to wire up Deepgram for STT, OpenAI for the LLM, and ElevenLabs for TTS:
```python
session = AgentSession(
    stt=inference.STT(model="deepgram/nova-3", language="multi"),
    llm=inference.LLM(model="openai/gpt-4o-mini"),
    tts=inference.TTS(
        model="elevenlabs/eleven_turbo_v2_5",
        voice="cgSgspJ2msm6clMCkdW9",  # Jessica (ElevenLabs)
        sample_rate=16000,  # Required for Anam avatar compatibility
    ),
    turn_detection=MultilingualModel(),
    vad=ctx.proc.userdata["vad"],
    preemptive_generation=True,
)
```
The sample_rate=16000 on the TTS is a hard requirement for Anam. Anam's lip-sync engine expects audio at 16kHz. If you change the TTS model or provider, make sure it outputs at that sample rate.
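If you swap in a different TTS provider, a quick back-of-the-envelope check helps confirm the output format: 16 kHz mono 16-bit PCM works out to 320 samples (640 bytes) per 20 ms audio frame. The snippet below is an illustrative sanity check, not part of the example code, and it assumes 20 ms frames:

```python
SAMPLE_RATE = 16_000   # the rate Anam's lip-sync engine expects
BYTES_PER_SAMPLE = 2   # 16-bit PCM
CHANNELS = 1           # mono
FRAME_MS = 20          # assumed frame duration

# Samples and bytes per frame at 16 kHz mono 16-bit PCM
samples_per_frame = SAMPLE_RATE * FRAME_MS // 1000
bytes_per_frame = samples_per_frame * BYTES_PER_SAMPLE * CHANNELS

def looks_like_16k_mono_pcm(frame: bytes) -> bool:
    """Cheap check that a raw audio frame matches the expected size."""
    return len(frame) == bytes_per_frame
```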
Function tools as an RPC bridge
The agent has three function tools that act as a bridge to the frontend form. When the patient answers a question, the LLM calls the appropriate tool, which performs an RPC call to the frontend to update the form in real time.
Python (agent-py/src/agent.py):
```python
@function_tool
async def update_field(
    self,
    context: RunContext,
    field_name: Annotated[str, Field(description="The field ID to update: fullName, dob, address, phone, emergencyName, emergencyRelationship, emergencyPhone, medications, allergies, reasonForVisit")],
    value: Annotated[str, Field(description="The value to set for the field.")],
):
    """Update a form field on the patient intake form."""
    if field_name not in VALID_FIELD_NAMES:
        raise llm.LLMToolException(f"Invalid field name: {field_name}")
    payload = json.dumps({"fieldName": field_name, "value": value})
    response = await perform_rpc_to_frontend(self._ctx, "updateField", payload)
    return response

@function_tool
async def get_form_state(self, context: RunContext):
    """Get the current state of all form fields."""
    response = await perform_rpc_to_frontend(self._ctx, "getFormState", "{}")
    return response

@function_tool
async def submit_form(self, context: RunContext):
    """Submit the completed intake form."""
    response = await perform_rpc_to_frontend(self._ctx, "submitForm", "{}")
    context.session.say("Your form has been submitted. You will be contacted soon. Thank you.")
    return response
```
Each tool calls perform_rpc_to_frontend, which uses local_participant.perform_rpc() to call a named method on the frontend participant:
```python
async def perform_rpc_to_frontend(ctx: JobContext, method: str, payload: str) -> str:
    destination_identity = get_remote_participant_identity(ctx)
    response = await ctx.room.local_participant.perform_rpc(
        destination_identity=destination_identity,
        method=method,
        payload=payload,
        response_timeout=5.0,
    )
    return response
```
The get_remote_participant_identity function skips any participant whose identity starts with anam-, because Anam joins the room as a separate participant and you only want to target the actual user:
```python
def get_remote_participant_identity(ctx: JobContext) -> str:
    for participant in ctx.room.remote_participants.values():
        if not participant.identity.startswith("anam-"):
            return participant.identity
    raise llm.LLMToolException("No remote participant found")
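Note that `perform_rpc` can fail if the frontend takes longer than `response_timeout` to respond (for example, a backgrounded or reconnecting browser tab). If that matters for your app, you can wrap RPC calls in a small retry helper. This is a generic sketch, not part of the example code; `rpc_call` stands in for any awaitable RPC invocation, and in practice you would catch the SDK's specific timeout error rather than bare `Exception`:

```python
import asyncio
from typing import Awaitable, Callable

async def rpc_with_retry(
    rpc_call: Callable[[], Awaitable[str]],
    attempts: int = 3,
    backoff: float = 0.5,
) -> str:
    """Retry a flaky RPC call with exponential backoff between attempts."""
    last_error: Exception | None = None
    for attempt in range(attempts):
        try:
            return await rpc_call()
        except Exception as e:  # in practice, catch the SDK's RPC timeout error
            last_error = e
            await asyncio.sleep(backoff * (2 ** attempt))
    raise last_error

# Example: a call that times out once, then succeeds on the retry.
calls = {"n": 0}

async def flaky() -> str:
    calls["n"] += 1
    if calls["n"] < 2:
        raise TimeoutError("rpc timed out")
    return '{"success": true}'

result = asyncio.run(rpc_with_retry(flaky, backoff=0.01))
```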
TypeScript (agent-ts/src/agent.ts) follows the same pattern using llm.tool():
```typescript
const updateField = llm.tool({
  description: 'Update a form field on the patient intake form.',
  parameters: z.object({
    fieldName: z.string().describe('The field ID to update: fullName, dob, ...'),
    value: z.string().describe('The value to set for the field'),
  }),
  execute: async ({ fieldName, value }) => {
    const response = await performRpcToFrontend(
      'updateField',
      JSON.stringify({ fieldName, value })
    );
    return response;
  },
});
```
Both backends use the exact same RPC method names (updateField, getFormState, submitForm) and the same JSON payload format, so the single Next.js frontend works with either.
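That shared contract is small enough to pin down in a few lines. Here is an illustrative Python sketch of the `updateField` payload round trip; the `VALID_FIELD_NAMES` set mirrors the field list in the tool description above, but this helper itself is for illustration and is not part of either backend:

```python
import json

# Mirror of the field list both backends validate against (illustrative).
VALID_FIELD_NAMES = {
    "fullName", "dob", "address", "phone", "emergencyName",
    "emergencyRelationship", "emergencyPhone", "medications",
    "allergies", "reasonForVisit",
}

def build_update_payload(field_name: str, value: str) -> str:
    """Serialize an updateField payload in the shared JSON format."""
    if field_name not in VALID_FIELD_NAMES:
        raise ValueError(f"Invalid field name: {field_name}")
    return json.dumps({"fieldName": field_name, "value": value})

# The frontend handler parses exactly these two keys back out.
payload = build_update_payload("fullName", "Ada Lovelace")
parsed = json.loads(payload)
```

Keeping the method names and payload shape identical in both backends is what lets one frontend serve either one without a compatibility layer.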
Attaching the Anam avatar
After the session starts, create an AvatarSession and start it:
```python
# Start the session first
await session.start(
    agent=IntakeAssistant(ctx),
    room=ctx.room,
    room_options=room_io.RoomOptions(...),
)

# Create Anam avatar — must happen AFTER session.start()
avatar = anam.AvatarSession(
    persona_config=anam.PersonaConfig(
        name="Liv",
        avatarId="071b0286-4cce-4808-bee2-e642f1062de3",
    ),
)

await avatar.start(session, room=ctx.room)
```
Note that session.start() comes first here. Anam needs to connect to an active audio stream for lip-sync, so the session must be running before you attach the avatar. This is different from some other avatar plugins where the avatar starts first. Once avatar.start() resolves, Anam has joined the room as a separate participant and is publishing a video track.
You can use any stock avatar from the Anam Avatar Gallery or create a custom one in Anam Lab. The avatarId in PersonaConfig is the ID you copy from the gallery or lab.
Agent name
Both backends register the same agent name with LiveKit so the frontend can dispatch to either one.
Python (agent-py/src/agent.py) — set on the entrypoint decorator:
```python
@server.rtc_session(agent_name="Anam-Demo")
async def intake_agent(ctx: JobContext):
    ...
```
TypeScript (agent-ts/src/main.ts) — set in ServerOptions:
```typescript
cli.runApp(
  new ServerOptions({
    agent: fileURLToPath(import.meta.url),
    agentName: 'Anam-Demo',
  }),
);
```
If you rename the agent, update both the backend and the AGENT_NAME variable in the frontend .env.local.
Instruction design
The system prompt is structured to enforce strict one-field-at-a-time behavior. Each turn asks for exactly one piece of information, confirms it, then moves on. This keeps the conversation predictable and ensures update_field is called with clean, confirmed values rather than unverified user input. The confirmation step also handles corrections cleanly — if the patient says "actually, it's spelled differently," the agent updates the same field before moving forward.
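There is no single correct wording for such a prompt, but its shape is roughly as follows. This is an illustrative sketch, not the prompt shipped in the example:

```python
# Illustrative system prompt enforcing one-field-at-a-time intake.
INTAKE_INSTRUCTIONS = """\
You are Liv, a friendly medical intake assistant.
Collect the intake form one field at a time, in this order:
fullName, dob, address, phone, emergencyName, emergencyRelationship,
emergencyPhone, medications, allergies, reasonForVisit.

For each field:
1. Ask for exactly one piece of information.
2. Repeat the answer back and ask the patient to confirm it.
3. Only after confirmation, call update_field with the confirmed value.
4. If the patient corrects a value, call update_field again for the
   same field before moving on.

Before submitting, call get_form_state, read every field back to the
patient, and call submit_form only after they confirm the whole form.
"""
```

The important properties are the explicit field order, the confirm-before-tool-call rule, and the final read-back step; without them, the LLM tends to batch several answers into one turn or call `update_field` with unconfirmed values.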
Step 4: Set up the frontend
Install dependencies and configure the environment:
```shell
cd frontend
cp .env.example .env.local
pnpm install
```
Fill in .env.local:
```
LIVEKIT_API_KEY=<your_api_key>
LIVEKIT_API_SECRET=<your_api_secret>
LIVEKIT_URL=wss://<project-subdomain>.livekit.cloud
AGENT_NAME=Anam-Demo
```
AGENT_NAME must match the agent name registered in the backend — both backends use Anam-Demo as shown in Step 3.
How RPC handlers work
The frontend registers three RPC methods using the useRpcHandlers hook (frontend/hooks/useRpcHandlers.ts). These are the methods the agent calls:
```typescript
room.registerRpcMethod('updateField', async (data: RpcInvocationData) => {
  const { fieldName, value } = JSON.parse(data.payload) as {
    fieldName: string;
    value: string;
  };
  if (!isValidFieldName(fieldName)) {
    throw new RpcError(1500, 'Invalid field name', JSON.stringify({ fieldName }));
  }
  setFormData((prev) => ({ ...prev, [fieldName]: value }));
  return JSON.stringify({ success: true, fieldName, value });
});

room.registerRpcMethod('getFormState', async () => {
  return JSON.stringify(formDataRef.current);
});

room.registerRpcMethod('submitForm', async () => {
  setIsSubmitted(true);
  return JSON.stringify({ success: true });
});
```
When updateField is called, it updates the local React state, which immediately re-renders the corresponding input on screen. getFormState reads from a ref rather than state directly to avoid stale closure issues. submitForm flips the isSubmitted flag, which replaces the form with a confirmation message.
The hook registers on connect and cleans up on disconnect:
```typescript
return () => {
  room.unregisterRpcMethod('updateField');
  room.unregisterRpcMethod('getFormState');
  room.unregisterRpcMethod('submitForm');
};
```
Rendering the avatar
The AvatarPanel component (frontend/components/app/avatar-panel.tsx) locates the Anam video track and renders it with <VideoTrack>:
```typescript
const remoteParticipants = useRemoteParticipants();
const worker = remoteParticipants.find(
  (p) =>
    p.kind === ParticipantKind.AGENT &&
    p.attributes['lk.publish_on_behalf'] === agent?.identity
);

const workerTracks = useParticipantTracks(
  [Track.Source.Camera, Track.Source.ScreenShare],
  worker?.identity
);

const trackRef =
  workerTracks.find((t) => t.source === Track.Source.Camera) ?? ...;
```
Anam joins the room as a worker participant publishing video on behalf of the agent. The component finds this worker by checking the lk.publish_on_behalf attribute, then gets its camera track. Once trackRef resolves, the avatar video renders inside a <VideoTrack>.
Step 5: Run the app
Start the agent in one terminal (choose one backend):
```shell
# Python
cd agent-py
uv run python src/agent.py dev

# TypeScript
cd agent-ts
pnpm run dev
```
Start the frontend in another terminal:
```shell
cd frontend
pnpm dev
```
Open http://localhost:3000 and click Start intake. Liv will introduce herself and begin walking you through the form.
How it works
Here's the full session flow:
- The user opens the app and clicks Start intake. The frontend calls `/api/connection-details`, which mints a LiveKit JWT and dispatches the `Anam-Demo` agent.
- The agent starts an `AgentSession` with Deepgram STT, GPT-4o-mini LLM, and ElevenLabs TTS at 16kHz.
- `session.start()` runs first, activating the voice pipeline. Then `avatar.start()` connects Anam to the room as a separate participant and begins publishing a video track. The frontend's `AvatarPanel` detects the new track and renders the avatar video.
- `session.generate_reply()` triggers the agent's opening message. Liv greets the user and asks for their full name.
- The user speaks. LiveKit routes the audio through Deepgram Nova-3 (STT), GPT-4o-mini (LLM), and ElevenLabs Jessica (TTS). Anam receives the TTS audio and lip-syncs the avatar.
- When the user answers a question, the LLM calls `update_field`. The agent performs an RPC call to `updateField` on the frontend, which updates the form in React state and re-renders the input.
- This continues for all ten fields. Before submission, the agent calls `get_form_state` to verify what's been captured, then reads it back to the user for confirmation.
- When the user confirms, the LLM calls `submit_form`. The agent performs an RPC call to `submitForm`, which marks the form as submitted in the frontend. The avatar delivers a closing message, and the session ends.
Summary
The key techniques in this app:
- Starting `session.start()` before `avatar.start()` so Anam can connect to the live audio stream for lip-sync.
- Setting `sample_rate=16000` on the ElevenLabs TTS, which is required for Anam avatar compatibility.
- Using function tools that call `local_participant.perform_rpc()` to push form updates from the agent to the frontend in real time.
- Registering RPC handlers on the frontend with `room.registerRpcMethod()` to receive those updates and update React state.
- Filtering out `anam-`-prefixed participants when targeting the user with RPC calls, since Anam joins as a separate room participant.
For more information, see:
- Anam plugin docs for the full parameter reference.
- LiveKit RPC docs for more on remote procedure calls between room participants.
- Full source code including both agent backends and the complete Next.js frontend.