Vapi voice agent with AssemblyAI Universal-3 Pro Streaming
Use AssemblyAI Universal-3 Pro Streaming as the speech-to-text engine in your Vapi voice agent to get punctuation-based turn detection, keyterm prompting, and 307 ms P50 latency on Vapi's managed voice platform.



What is Vapi?
Vapi is a managed voice platform that handles telephony, turn-taking, and orchestration so you don't have to. It supports 14+ speech-to-text providers. You bring your AssemblyAI key, and Vapi handles the rest.
Setup: add AssemblyAI to Vapi
Step 1 — Add your API key
- Go to dashboard.vapi.ai
- Navigate to Settings > Transcriber Providers
- Add your AssemblyAI API key
Step 2 — Create an assistant (dashboard)
- Click Create Assistant
- Under Transcriber, select Assembly AI
- Under Model, select u3-rt-pro (Universal-3 Pro Streaming)
- Save and test via the web call button
Step 3 — Create an assistant (API)
python create_assistant.py create
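The create_assistant.py script comes from the repository cloned in Quick start. It creates an assistant whose transcriber is configured like this: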
{
  "transcriber": {
    "provider": "assembly-ai",
    "model": "u3-rt-pro",
    "language": "en",
    "keytermsPrompt": ["YourBrand", "SpecialTerm"],
    "confidenceThreshold": 0.4
  }
}
Quick start
git clone https://github.com/kelseyefoster/voice-agent-vapi-assemblyai
cd voice-agent-vapi-assemblyai
pip install -r requirements.txt
cp .env.example .env
# Edit .env with your keys
# Create an assistant
python create_assistant.py create
# Make an outbound test call (requires Twilio number in .env)
python create_assistant.py call --assistant-id <id> --phone +1XXXXXXXXXX
# Start the webhook server
uvicorn webhook_server:app --port 8000
Keyterm prompting
Keyterm prompting is one of the biggest accuracy levers available in Vapi. It boosts recognition for domain-specific vocabulary that a general speech model would otherwise miss:
"keytermsPrompt": [
"hemoglobin A1c", # medical
"HIPAA", # compliance
"Jardiance", # drug name
"deductible", # insurance
]
You can supply up to 100 keyterms, each up to 50 characters long. Changes take effect on the next call, with no assistant restart needed.
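The limits above are easy to exceed when a keyterm list is generated from a product catalog or a glossary. The following is a minimal sketch of a local pre-flight check, not part of Vapi or AssemblyAI: the function name `validate_keyterms` and the choices to trim whitespace and drop exact duplicates are assumptions made for illustration. Only the two limits (100 terms, 50 characters each) come from the text.

```python
# Hypothetical helper: checks a keyterm list against the limits stated above
# before it is placed in the "keytermsPrompt" field of the transcriber config.
MAX_KEYTERMS = 100       # limit stated in this article
MAX_KEYTERM_CHARS = 50   # limit stated in this article


def validate_keyterms(keyterms):
    """Return a cleaned keyterm list, or raise ValueError if a limit is exceeded."""
    cleaned = []
    seen = set()
    for term in keyterms:
        term = term.strip()
        if not term:
            continue  # skip empty entries (a convenience, not a platform rule)
        if len(term) > MAX_KEYTERM_CHARS:
            raise ValueError(
                f"Keyterm {term!r} has {len(term)} characters; "
                f"the limit is {MAX_KEYTERM_CHARS}."
            )
        if term in seen:
            continue  # drop exact duplicates, keeping the first occurrence
        seen.add(term)
        cleaned.append(term)
    if len(cleaned) > MAX_KEYTERMS:
        raise ValueError(
            f"{len(cleaned)} keyterms supplied; the limit is {MAX_KEYTERMS}."
        )
    return cleaned


if __name__ == "__main__":
    terms = ["hemoglobin A1c", "HIPAA", "Jardiance", "deductible", "HIPAA"]
    print(validate_keyterms(terms))
```

Running the check before you send the config means an oversized list fails in your own code with a clear message, rather than depending on how the platform handles it.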
Supported languages
Supported languages are English, Spanish, French, German, Italian, and Portuguese, with native multilingual code-switching. Set the language in the transcriber config:
{
  "transcriber": {
    "provider": "assembly-ai",
    "model": "u3-rt-pro",
    "language": "es"
  }
}
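If you generate assistant configs for several locales, a small builder keeps the transcriber block consistent. This is a sketch under stated assumptions: the helper name `build_transcriber` is hypothetical, and only the codes "en" and "es" appear in this article. The other four entries are the standard ISO 639-1 codes for the listed languages and should be checked against the Vapi dashboard before use.

```python
import json

# "en" and "es" appear in the configs above. The remaining codes are assumed
# ISO 639-1 codes for the languages listed in this section.
SUPPORTED_LANGUAGES = {
    "en": "English",
    "es": "Spanish",
    "fr": "French",
    "de": "German",
    "it": "Italian",
    "pt": "Portuguese",
}


def build_transcriber(language="en"):
    """Build the transcriber block shown above for one supported language."""
    if language not in SUPPORTED_LANGUAGES:
        supported = ", ".join(sorted(SUPPORTED_LANGUAGES))
        raise ValueError(
            f"Unsupported language {language!r}; expected one of: {supported}"
        )
    return {
        "transcriber": {
            "provider": "assembly-ai",
            "model": "u3-rt-pro",
            "language": language,
        }
    }


if __name__ == "__main__":
    # Print the Spanish config as JSON, ready to paste into an assistant payload.
    print(json.dumps(build_transcriber("es"), indent=2))
```

Rejecting unknown codes locally is a design choice for this sketch; it catches typos such as "sp" before a call is ever placed.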
When to choose AssemblyAI over Deepgram in Vapi
Resources

