LiveKit voice agent with AssemblyAI Universal-3 Pro Streaming
Build a production-ready LiveKit voice agent using AssemblyAI Universal-3 Pro Streaming. 307ms P50 latency, neural turn detection, and anti-hallucination — in Python.



Build a production-ready real-time voice agent using LiveKit Agents and the AssemblyAI Universal-3 Pro Streaming model (`u3-rt-pro`). This is the fastest path from zero to a deployed Voice AI agent — and the combination that gives you the best speech-to-text latency available today.
Why Universal-3 Pro Streaming?
307ms P50 latency. That's what separates a voice agent that feels natural from one that feels broken.
*Benchmarks from Hamming.ai across 4M+ production calls.*
The turn detection difference is significant. Instead of silence-based VAD, Universal-3 Pro uses acoustic and linguistic signals together — so it knows the difference between a pause mid-sentence and an actual end-of-turn. Fewer false triggers, snappier response.
Architecture
│ WebRTC (LiveKit room)
▼
LiveKit Cloud ──► AssemblyAI Universal-3 Pro Streaming (speech-to-text)
│ transcript + neural turn signal
▼
OpenAI GPT-4o (LLM)
│ text response
▼
Cartesia Sonic (TTS)
│ audio
▼
Back to LiveKit room
Prerequisites
- Python 3.11+
- AssemblyAI API key — free tier available
- LiveKit Cloud account — free tier available
- OpenAI API key
- Cartesia API key
Quick start
1. Clone and install
git clone https://github.com/kelseyefoster/voice-agent-livekit-universal-3-pro
cd voice-agent-livekit-universal-3-pro
python -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt
2. Configure environment
cp .env.example .env
# Edit .env with your API keys
3. Download plugin models
python agent.py download-files
4. Run locally
# Console mode — speak directly from your terminal
python agent.py console
# Dev mode — connects to LiveKit Cloud, open agents-playground.livekit.io
python agent.py dev
Open agents-playground.livekit.io, enter your LiveKit URL and API key, and start talking.
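A missing key is a common reason for a first run to fail, so a quick check before starting the agent can save a debugging round. The sketch below assumes the six variable names used in the deploy section of this guide; the helper `missing_env_vars` is hypothetical and not part of the repository.

```python
import os

# Variable names as they appear in the deploy section of this guide.
# The helper itself is an illustration, not code from the repository.
REQUIRED_ENV_VARS = (
    "ASSEMBLYAI_API_KEY",
    "OPENAI_API_KEY",
    "CARTESIA_API_KEY",
    "LIVEKIT_URL",
    "LIVEKIT_API_KEY",
    "LIVEKIT_API_SECRET",
)


def missing_env_vars(env=None):
    """Return the required variable names that are unset or blank."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_ENV_VARS if not env.get(name, "").strip()]
```

Calling `missing_env_vars()` after your `.env` values have been loaded into the process environment returns an empty list when everything is configured, or the names that still need a value.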
Tuning Universal-3 Pro Streaming
The three turn detection parameters give you a lot of control over how responsive vs. patient the agent feels:
stt=assemblyai.STT(
    model="u3-rt-pro",
    # How confident the model needs to be before declaring turn end (0.0–1.0)
    # Lower = faster response; higher = fewer false triggers on noisy lines
    end_of_turn_confidence_threshold=0.4,
    # Silence (ms) before the speculative end-of-turn check fires
    min_turn_silence=300,
    # Hard ceiling: force turn end after this much silence regardless
    max_turn_silence=1200,
)
**For noisy environments** (call centers, mobile): raise `end_of_turn_confidence_threshold` to `0.6`
**For fast-paced conversation**: lower `min_turn_silence` to `200`
**For healthcare or deliberate speech**: raise `max_turn_silence` to `2000`
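To build intuition for how the three parameters interact, here is a toy model of the end-of-turn decision based only on the parameter descriptions above. It is not AssemblyAI's implementation, which runs inside the streaming model. The profile names and both helper functions are made up for illustration; the numeric values are the ones given in this section.

```python
# Baseline values copied from the configuration shown in this section.
BASELINE = {
    "end_of_turn_confidence_threshold": 0.4,
    "min_turn_silence": 300,
    "max_turn_silence": 1200,
}

# Overrides copied from the three recommendations; profile names are hypothetical.
PROFILES = {
    "noisy": {"end_of_turn_confidence_threshold": 0.6},
    "fast": {"min_turn_silence": 200},
    "deliberate": {"max_turn_silence": 2000},
}


def turn_settings(profile=None):
    """Return the baseline settings with one profile's override applied."""
    settings = dict(BASELINE)
    if profile is not None:
        settings.update(PROFILES[profile])
    return settings


def should_end_turn(silence_ms, confidence, settings):
    """Toy decision rule: an assumption, not the real model's logic."""
    if silence_ms >= settings["max_turn_silence"]:
        return True  # hard ceiling: end the turn regardless of confidence
    if silence_ms >= settings["min_turn_silence"]:
        # speculative check: end only if confidence clears the threshold
        return confidence >= settings["end_of_turn_confidence_threshold"]
    return False  # too little silence to run the check yet
```

Under this model, a 500 ms pause at confidence 0.5 ends the turn with the baseline settings but not with the `noisy` profile, which is the trade-off the recommendations describe. Because the dictionary keys match the keyword arguments used above, the result can be unpacked as `assemblyai.STT(model="u3-rt-pro", **turn_settings("noisy"))`.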
Enabling keyterm prompting
Boost recognition accuracy for domain-specific vocabulary mid-session — no restart required:
# After session.start():
await session.stt.update_options(
    keyterms_prompt=["YourBrandName", "SpecialProduct", "TechnicalTerm"]
)
Up to 1,000 terms, each up to 50 characters. This is especially useful for medical terminology, product names, and financial jargon.
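If the term list comes from a database or user input, it may be worth trimming it to the stated limits before sending it. The helper below is a hypothetical sketch: `clean_keyterms` is not part of any plugin, and treating terms that differ only by case as duplicates is an assumption you may want to change.

```python
MAX_TERMS = 1000      # limit stated in this guide
MAX_TERM_LENGTH = 50  # limit stated in this guide


def clean_keyterms(terms):
    """Strip whitespace, drop blank or over-long terms, and de-duplicate."""
    cleaned = []
    seen = set()
    for term in terms:
        term = term.strip()
        if not term or len(term) > MAX_TERM_LENGTH:
            continue  # skip blanks and terms over the length limit
        key = term.lower()
        if key in seen:
            continue  # skip case-insensitive duplicates (an assumption)
        seen.add(key)
        cleaned.append(term)
    return cleaned[:MAX_TERMS]  # keep the first 1,000 in input order
```

The cleaned list can then be passed as the `keyterms_prompt` value in the call above.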
Enabling real-time speaker diarization
stt=assemblyai.STT(
    model="u3-rt-pro",
    speaker_labels=True,
    max_speakers=2,  # e.g., interviewer + candidate, agent + customer
)
Swapping components
The LiveKit Agents plugin system makes it straightforward to swap any component:
# Different LLM
from livekit.plugins import anthropic
llm=anthropic.LLM(model="claude-opus-4-6")
# Different TTS
from livekit.plugins import elevenlabs
tts=elevenlabs.TTS(voice_id="your_voice_id")
# Groq for ultra-low-latency LLM inference
llm=openai.LLM.with_groq(model="llama-3.3-70b-versatile")
Deploy to Fly.io
fly launch --no-deploy
fly secrets set \
ASSEMBLYAI_API_KEY=your_key \
OPENAI_API_KEY=your_key \
CARTESIA_API_KEY=your_key \
LIVEKIT_URL=wss://... \
LIVEKIT_API_KEY=your_key \
LIVEKIT_API_SECRET=your_secret
fly deploy
Resources
- AssemblyAI Universal Streaming docs
- LiveKit Agents docs
- AssemblyAI LiveKit integration guide



