Pipecat voice agent with AssemblyAI Universal-3 Pro Streaming
Build a real-time voice agent using Pipecat — Daily.co's open-source Voice AI framework — and the AssemblyAI Universal-3 Pro Streaming model as the speech-to-text engine.



Pipecat's modular pipeline design means you can swap any component without touching the rest. AssemblyAI has a first-party Pipecat plugin with full Universal-3 Pro Streaming support — no manual WebSocket wiring required.
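To make the swap-a-component idea concrete, here is a minimal sketch in plain Python. It does not use Pipecat's actual classes: `Stage`, `run_pipeline`, and the stand-in stages are hypothetical names chosen purely for illustration.

```python
from typing import Callable, List

# A stage is any callable that takes the previous stage's output
# and returns its own. (Hypothetical type, not a Pipecat class.)
Stage = Callable[[str], str]


def run_pipeline(stages: List[Stage], payload: str) -> str:
    """Pass the payload through each stage in order."""
    for stage in stages:
        payload = stage(payload)
    return payload


# Stand-ins for the speech-to-text, LLM, and text-to-speech stages.
def fake_stt(audio: str) -> str:
    return f"transcript({audio})"


def fake_llm(text: str) -> str:
    return f"reply({text})"


def fake_tts(text: str) -> str:
    return f"audio({text})"


def other_stt(audio: str) -> str:
    return f"other_transcript({audio})"


pipeline = [fake_stt, fake_llm, fake_tts]
# Swapping the STT stage leaves the LLM and TTS stages untouched.
swapped = [other_stt] + pipeline[1:]
```

Because each stage only depends on the output of the one before it, replacing the first entry changes the transcription step and nothing else.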
Why AssemblyAI in Pipecat?
The 41% latency advantage is noticeable in live conversation — and the neural turn detection means fewer awkward double-responses when users pause mid-thought.
Prerequisites
- Python 3.11+
- AssemblyAI API key
- Daily.co API key
- OpenAI API key
- Cartesia API key
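Before starting the bot, it can help to confirm that all four keys are actually set. The sketch below assumes environment variable names: `ASSEMBLYAI_API_KEY` appears in the snippets in this guide, while `DAILY_API_KEY`, `OPENAI_API_KEY`, and `CARTESIA_API_KEY` are guesses that may differ from the names in your `.env.example`.

```python
import os
from typing import List, Mapping

# ASSEMBLYAI_API_KEY is used elsewhere in this guide; the other three
# names are assumptions and may not match your .env.example.
REQUIRED_KEYS = [
    "ASSEMBLYAI_API_KEY",
    "DAILY_API_KEY",
    "OPENAI_API_KEY",
    "CARTESIA_API_KEY",
]


def missing_keys(env: Mapping[str, str], required: List[str] = REQUIRED_KEYS) -> List[str]:
    """Return the required variable names that are unset or blank."""
    return [name for name in required if not env.get(name, "").strip()]


# Example usage against the real environment:
#   missing = missing_keys(os.environ)
#   if missing:
#       raise SystemExit("Missing API keys: " + ", ".join(missing))
```

Passing the environment in as an argument keeps the check easy to exercise with a plain dictionary.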
Quick start
git clone https://github.com/kelseyefoster/voice-agent-pipecat-universal-3-pro
cd voice-agent-pipecat-universal-3-pro
python -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt
cp .env.example .env
# Edit .env with your API keys
# Create a Daily.co room
python create_room.py
# Start the bot (paste the room URL from above)
python bot.py --url https://your-name.daily.co/your-room
Open the room URL in your browser and start talking.
Universal-3 Pro Streaming features
Keyterm prompting
Boost accuracy on domain-specific vocabulary without restarting the session:
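import os

from pipecat.services.assemblyai.models import AssemblyAIConnectionParams
from pipecat.services.assemblyai.stt import AssemblyAISTTService

# Some Pipecat releases expect api_key on AssemblyAISTTService itself; check your installed version.
stt = AssemblyAISTTService(
    connection_params=AssemblyAIConnectionParams(
        api_key=os.environ["ASSEMBLYAI_API_KEY"],
        speech_model="u3-rt-pro",
        keyterms_prompt=["AssemblyAI", "Universal-3", "Pipecat", "YourBrandName"],
    )
)
You can pass up to 1,000 terms per session. This is especially useful for medical, legal, and financial applications.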
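If the term list comes from a glossary or a database, it may contain blanks, duplicates, or more entries than the limit. The helper below is one possible way to tidy it before passing it as `keyterms_prompt`. `prepare_keyterms` is a hypothetical name, and whether the service treats terms case-insensitively is an assumption.

```python
from typing import Iterable, List

MAX_KEYTERMS = 1000  # per-session limit stated in this guide


def prepare_keyterms(terms: Iterable[str], limit: int = MAX_KEYTERMS) -> List[str]:
    """Strip whitespace, drop blanks and case-insensitive duplicates, then cap the list."""
    seen = set()
    cleaned: List[str] = []
    for term in terms:
        term = term.strip()
        key = term.casefold()
        if not term or key in seen:
            continue
        seen.add(key)
        cleaned.append(term)
    # Keep the first `limit` terms, so put the most important ones first.
    return cleaned[:limit]
```

Order is preserved, so placing the highest-priority vocabulary at the front of the input keeps it inside the cap.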
Real-time speaker diarization
connection_params=AssemblyAIConnectionParams(
    api_key=os.environ["ASSEMBLYAI_API_KEY"],
    speech_model="u3-rt-pro",
    speaker_labels=True,
    max_speakers=2,
)
Multilingual support
connection_params=AssemblyAIConnectionParams(
    api_key=os.environ["ASSEMBLYAI_API_KEY"],
    speech_model="u3-rt-pro",
    language_detection=True,
)
Supported languages: English, Spanish, French, German, Italian, Portuguese.
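If your application needs to check a language before relying on the transcript, a lookup along these lines could work. The two-letter codes are standard ISO 639-1 codes for the six languages listed. Whether the service reports detected languages in this form is an assumption to verify against its actual output, and `is_supported` is a hypothetical helper.

```python
# ISO 639-1 codes for the six languages listed in this guide.
SUPPORTED_LANGUAGES = {
    "en": "English",
    "es": "Spanish",
    "fr": "French",
    "de": "German",
    "it": "Italian",
    "pt": "Portuguese",
}


def is_supported(code: str) -> bool:
    """Accept codes such as 'en', 'EN', or 'pt-BR' by comparing the primary subtag."""
    primary = code.strip().lower().replace("_", "-").split("-")[0]
    return primary in SUPPORTED_LANGUAGES
```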
Tuning turn detection
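One way to read the three settings used in this section is as a two-path decision: a confident prediction plus a short pause ends the turn, and a long enough pause ends it regardless. The sketch below is a simplified model under that assumption, not the service's actual logic. `TurnSettings` and `turn_has_ended` are hypothetical names, and the silence values are assumed to be in milliseconds.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TurnSettings:
    # Defaults mirror the values used in this section; silences assumed to be milliseconds.
    end_of_turn_confidence_threshold: float = 0.7
    min_end_of_turn_silence_when_confident: int = 300
    max_turn_silence: int = 1000


def turn_has_ended(
    confidence: float,
    silence_ms: int,
    settings: TurnSettings = TurnSettings(),
) -> bool:
    """Simplified end-of-turn decision (an illustrative assumption)."""
    # Fallback path: enough silence ends the turn regardless of confidence.
    if silence_ms >= settings.max_turn_silence:
        return True
    # Confident path: a short pause is enough when the model is sure.
    confident = confidence >= settings.end_of_turn_confidence_threshold
    return confident and silence_ms >= settings.min_end_of_turn_silence_when_confident
```

Under this reading, lowering the threshold or the confident-silence value makes the agent respond sooner, while raising `max_turn_silence` gives speakers more room to pause mid-thought.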
connection_params=AssemblyAIConnectionParams(
    api_key=os.environ["ASSEMBLYAI_API_KEY"],
    speech_model="u3-rt-pro",
    end_of_turn_confidence_threshold=0.7,
    min_end_of_turn_silence_when_confident=300,
    max_turn_silence=1000,
)
Deploy to PipecatCloud
pip install pipecatcloud
pcc auth login
pcc init
pcc secrets set my-agent-secrets --file .env
pcc deploy
Resources


