Twilio phone agent with AssemblyAI Universal-3 Pro Streaming
Build an AI phone agent that handles real calls using Twilio Voice + Media Streams and the AssemblyAI Universal-3 Pro Streaming model for real-time speech-to-text.



The key detail: Twilio Media Streams sends 8 kHz μ-law (mulaw) audio. AssemblyAI Universal-3 Pro accepts pcm_mulaw at sample_rate=8000 natively, so no resampling or format conversion is needed.
Architecture
Incoming call
      │
Twilio Voice
      │  TwiML → open WebSocket
      ▼
Your server (/media-stream WebSocket)
      │                           │
      │ mulaw 8kHz audio          │ synthesized mulaw audio
      ▼                           ▲
AssemblyAI Universal-3 Pro    ElevenLabs TTS
      │  transcript + turn signal
      ▼
OpenAI GPT-4o

Prerequisites
- Python 3.11+
- AssemblyAI API key
- Twilio account with a phone number
- OpenAI API key
- ElevenLabs API key
- ngrok (for local development)
Quick start
git clone https://github.com/kelseyefoster/voice-agent-twilio-universal-3-pro
cd voice-agent-twilio-universal-3-pro
python -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt
cp .env.example .env
# Edit .env with your API keys
uvicorn server:app --host 0.0.0.0 --port 8000
# In a second terminal, expose the local server to Twilio
ngrok http 8000
Configure Twilio
- Go to Twilio Console > Phone Numbers
- Select your number > Voice & Fax
- Set "A Call Comes In" to Webhook: https://your-ngrok-url.ngrok.io/incoming-call
- Call your Twilio number
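For context, the /incoming-call webhook has to answer with TwiML that tells Twilio to open the media WebSocket. The following is a minimal sketch of building that response, assuming the `<Connect><Stream>` TwiML verbs and the /media-stream path from the architecture diagram; `build_twiml` is a hypothetical helper, and the repository's own handler may differ.

```python
from xml.sax.saxutils import quoteattr


def build_twiml(host: str, path: str = "/media-stream") -> str:
    """Return TwiML that connects the call to a WebSocket media stream."""
    stream_url = f"wss://{host}{path}"
    return (
        '<?xml version="1.0" encoding="UTF-8"?>'
        "<Response>"
        "<Connect>"
        # quoteattr escapes the URL and adds the surrounding quotes
        f"<Stream url={quoteattr(stream_url)} />"
        "</Connect>"
        "</Response>"
    )
```

A webhook handler would return this string with an XML content type, passing the public ngrok hostname as `host`.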
AssemblyAI WebSocket parameters for Twilio
ASSEMBLYAI_WS_URL = (
    "wss://streaming.assemblyai.com/v3/ws"
    "?speech_model=u3-rt-pro"
    "&encoding=pcm_mulaw"   # must match Twilio's audio format
    "&sample_rate=8000"     # must match Twilio's 8kHz stream
    "&end_of_turn_confidence_threshold=0.5"
    "&min_turn_silence=400"
)
Phone calls have more background noise than browser audio, so the slightly higher confidence threshold and longer min_turn_silence reduce false end-of-turn triggers.
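If these values need tuning per deployment, one option is to build the URL from a dictionary instead of concatenating strings. This is only a sketch: `build_ws_url` and `BASE_URL` are hypothetical names, and the defaults simply mirror the parameters shown above.

```python
from urllib.parse import urlencode

BASE_URL = "wss://streaming.assemblyai.com/v3/ws"


def build_ws_url(**overrides: object) -> str:
    """Build the streaming URL from the telephony settings shown above."""
    params = {
        "speech_model": "u3-rt-pro",
        "encoding": "pcm_mulaw",   # must match Twilio's audio format
        "sample_rate": 8000,       # must match Twilio's 8 kHz stream
        "end_of_turn_confidence_threshold": 0.5,
        "min_turn_silence": 400,
    }
    params.update(overrides)
    # doseq=True expands list values into repeated query parameters
    return f"{BASE_URL}?{urlencode(params, doseq=True)}"
```

For example, `build_ws_url(min_turn_silence=600)` changes only the silence setting, and a list value such as `keyterms_prompt=["YourBrand", "SpecialTerm"]` is emitted as repeated `keyterms_prompt` parameters.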
Extending the agent
Add post-call transcription
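import os
import assemblyai as aai
# Assumes the key is stored in the ASSEMBLYAI_API_KEY environment variable
aai.settings.api_key = os.environ["ASSEMBLYAI_API_KEY"]
transcriber = aai.Transcriber()
# recording_url is the URL of the call recording
transcript = transcriber.transcribe(recording_url)
print(transcript.text)
Add keyterm prompting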
ASSEMBLYAI_WS_URL += "&keyterms_prompt=YourBrand&keyterms_prompt=SpecialTerm"
Deploy to Railway or Render
# Railway
railway login && railway init && railway up
# Render — create a Web Service pointing to this repo
# Start: uvicorn server:app --host 0.0.0.0 --port $PORT



