April 2, 2026

Twilio phone agent with AssemblyAI Universal-3 Pro Streaming

Build an AI phone agent that handles real calls using Twilio Voice + Media Streams and the AssemblyAI Universal-3 Pro Streaming model for real-time speech-to-text.

Kelsey Foster

Growth

The key detail: Twilio Media Streams delivers 8 kHz μ-law (mulaw) audio, and AssemblyAI Universal-3 Pro Streaming accepts encoding=pcm_mulaw at sample_rate=8000 natively, so no resampling or format conversion is needed.
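To illustrate why no conversion step is needed, here is a minimal sketch of unpacking one Twilio Media Streams message. Twilio wraps each audio chunk as a base64 payload inside a JSON "media" event, so the server only has to decode the base64 and can forward the raw mulaw bytes unchanged. The function name extract_mulaw_audio and the placeholder streamSid are illustrative and not taken from the repository.

```python
import base64
import json
from typing import Optional


def extract_mulaw_audio(message: str) -> Optional[bytes]:
    """Return raw mulaw bytes from a Twilio 'media' event, or None for other events."""
    data = json.loads(message)
    if data.get("event") != "media":
        return None  # e.g. "connected", "start", "stop"
    # Twilio base64-encodes the 8 kHz mulaw payload inside the JSON message.
    return base64.b64decode(data["media"]["payload"])


# Example: 160 bytes of mulaw is 20 ms of audio at 8000 one-byte samples per second.
sample = json.dumps({
    "event": "media",
    "streamSid": "MZ-example",  # placeholder value for illustration
    "media": {"payload": base64.b64encode(b"\xff" * 160).decode("ascii")},
})
audio = extract_mulaw_audio(sample)
print(len(audio))
```

The decoded bytes can be sent to the AssemblyAI WebSocket as binary frames without touching the samples.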

Architecture

Incoming call
     │
  Twilio Voice
     │ TwiML → open WebSocket
     ▼
Your server (/media-stream WebSocket)
     │                        │
     │ mulaw 8kHz audio       │ synthesized mulaw audio
     ▼                        ▲
AssemblyAI Universal-3 Pro    ElevenLabs TTS
     │ transcript + turn signal
     ▼
  OpenAI GPT-4o
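The return path in the diagram, where synthesized mulaw audio travels back to the caller, could look roughly like the sketch below. It assumes the TTS step already produced 8 kHz mulaw bytes and that the stream is bidirectional. The helper names and the 20 ms frame size are illustrative choices, not details confirmed by the repository.

```python
import base64
import json
from typing import Iterator

FRAME_BYTES = 160  # 20 ms of 8 kHz mulaw (one byte per sample); an illustrative choice


def twilio_media_messages(stream_sid: str, mulaw_audio: bytes) -> Iterator[str]:
    """Yield JSON 'media' messages that play mulaw audio back to the caller."""
    for start in range(0, len(mulaw_audio), FRAME_BYTES):
        frame = mulaw_audio[start:start + FRAME_BYTES]
        yield json.dumps({
            "event": "media",
            "streamSid": stream_sid,
            "media": {"payload": base64.b64encode(frame).decode("ascii")},
        })


def twilio_clear_message(stream_sid: str) -> str:
    """Build a 'clear' message, which asks Twilio to drop audio it has buffered."""
    return json.dumps({"event": "clear", "streamSid": stream_sid})


messages = list(twilio_media_messages("MZ-example", b"\xff" * 400))
print(len(messages))
```

Sending the "clear" message when the caller starts speaking is one way to support interruptions, since it stops playback of audio that Twilio has queued but not yet played.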

Prerequisites

Python 3.11+
AssemblyAI API key
Twilio account with a phone number
OpenAI API key
ElevenLabs API key
ngrok (for local development)
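Before starting the server, it can help to confirm that the keys listed above are actually present in the environment. The sketch below is one way to do that. The variable names are hypothetical, so substitute whatever names .env.example in the repository defines.

```python
import os
from typing import List, Mapping, Optional

# Hypothetical variable names; use the ones from the repository's .env.example.
REQUIRED_KEYS = ["ASSEMBLYAI_API_KEY", "OPENAI_API_KEY", "ELEVENLABS_API_KEY"]


def missing_keys(env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return the required keys that are unset or empty in the given environment."""
    env = os.environ if env is None else env
    return [key for key in REQUIRED_KEYS if not env.get(key)]


# Example with an explicit mapping instead of the real environment.
print(missing_keys({"ASSEMBLYAI_API_KEY": "example"}))
```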

Quick start

git clone https://github.com/kelsey-aai/voice-agent-twilio-universal-3-pro
cd voice-agent-twilio-universal-3-pro

python -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt

cp .env.example .env
# Edit .env with your API keys

# Start the server
uvicorn server:app --host 0.0.0.0 --port 8000

# In a second terminal, expose the server through a public URL
ngrok http 8000

Configure Twilio

Go to Twilio Console > Phone Numbers
Select your number > Voice & Fax
Set "A Call Comes In" to Webhook and enter https://your-ngrok-url.ngrok.io/incoming-call (use the HTTPS forwarding URL that ngrok prints)
Call your Twilio number
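When Twilio hits the /incoming-call webhook, the server has to answer with TwiML that tells Twilio to open the WebSocket. A minimal sketch of that response, assuming the /media-stream path from the architecture diagram and a bidirectional stream via Connect and Stream, might look like this. The function name build_stream_twiml is illustrative, and the repository may build its response differently.

```python
from xml.sax.saxutils import quoteattr


def build_stream_twiml(public_host: str, path: str = "/media-stream") -> str:
    """Return TwiML that connects the call to a bidirectional Media Stream."""
    ws_url = f"wss://{public_host}{path}"
    return (
        '<?xml version="1.0" encoding="UTF-8"?>'
        "<Response><Connect>"
        f"<Stream url={quoteattr(ws_url)} />"  # quoteattr adds the quotes and escapes the value
        "</Connect></Response>"
    )


twiml = build_stream_twiml("your-ngrok-url.ngrok.io")
print(twiml)
```

The webhook should return this string with an XML content type. Note that the stream URL uses the wss scheme, while the webhook itself uses https.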

AssemblyAI WebSocket parameters for Twilio

ASSEMBLYAI_WS_URL = (
    "wss://streaming.assemblyai.com/v3/ws"
    "?speech_model=u3-rt-pro"
    "&encoding=pcm_mulaw"      # must match Twilio's audio format
    "&sample_rate=8000"        # must match Twilio's 8kHz stream
    "&end_of_turn_confidence_threshold=0.5"
    "&min_turn_silence=400"
)

Phone calls carry more background noise than browser audio, so this configuration uses a slightly higher confidence threshold (0.5) and a longer min_turn_silence (400 ms) to reduce false end-of-turn triggers.
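These turn-detection settings decide when the agent treats the caller as finished, which is the moment to hand the transcript to the LLM. The sketch below shows one way to filter incoming messages for that signal. The message shape (a "Turn" type with transcript and end_of_turn fields) is an assumption about the v3 streaming API and should be checked against the AssemblyAI documentation. The function name is illustrative.

```python
import json
from typing import Optional


def completed_turn_text(message: str) -> Optional[str]:
    """Return the transcript when a message marks the end of the caller's turn."""
    data = json.loads(message)
    # Assumed message shape: {"type": "Turn", "transcript": "...", "end_of_turn": true}
    if data.get("type") == "Turn" and data.get("end_of_turn"):
        text = data.get("transcript", "").strip()
        return text or None  # ignore empty turns, e.g. noise with no words
    return None


partial = json.dumps({"type": "Turn", "transcript": "I need to", "end_of_turn": False})
final = json.dumps({"type": "Turn", "transcript": "I need to reschedule.", "end_of_turn": True})
print(completed_turn_text(partial), completed_turn_text(final))
```

Only the second message yields text here, so partial transcripts never trigger an LLM call.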

Extending the agent

Add post-call transcription

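import os
import assemblyai as aai

# Assumes the key is in ASSEMBLYAI_API_KEY and recording_url is the URL of the call recording
aai.settings.api_key = os.environ["ASSEMBLYAI_API_KEY"]
transcriber = aai.Transcriber()
transcript = transcriber.transcribe(recording_url)
print(transcript.text)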

Add keyterm prompting

ASSEMBLYAI_WS_URL += "&keyterms_prompt=YourBrand&keyterms_prompt=SpecialTerm"

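Key terms that contain spaces or punctuation need URL encoding before they go into the query string. The sketch below rebuilds the same URL with urllib.parse.urlencode, following the repeated keyterms_prompt parameter format shown above. The function name build_ws_url is illustrative, and the repeated-parameter format is taken from this post rather than verified independently.

```python
from typing import Iterable
from urllib.parse import urlencode

BASE_URL = "wss://streaming.assemblyai.com/v3/ws"


def build_ws_url(keyterms: Iterable[str] = ()) -> str:
    """Build the streaming URL with the Twilio settings plus URL-encoded key terms."""
    params = [
        ("speech_model", "u3-rt-pro"),
        ("encoding", "pcm_mulaw"),      # must match Twilio's audio format
        ("sample_rate", 8000),          # must match Twilio's 8 kHz stream
        ("end_of_turn_confidence_threshold", 0.5),
        ("min_turn_silence", 400),
    ]
    # One keyterms_prompt parameter per term, as in the snippet above.
    params += [("keyterms_prompt", term) for term in keyterms]
    return f"{BASE_URL}?{urlencode(params)}"


url = build_ws_url(["YourBrand", "Special Term"])
print(url)
```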
Deploy to Railway or Render

# Railway
railway login && railway init && railway up

# Render — create a Web Service pointing to this repo
# Start: uvicorn server:app --host 0.0.0.0 --port $PORT

Build your Twilio phone agent today

Sign up for a free AssemblyAI account and start transcribing Twilio calls with Universal-3 Pro Streaming in under 30 minutes.
