Insights & Use Cases
April 2, 2026

Vapi voice agent with AssemblyAI Universal-3 Pro Streaming

Use AssemblyAI Universal-3 Pro Streaming as the speech-to-text engine inside your Vapi voice agent — and get punctuation-based turn detection, keyterm prompting, and 307ms P50 latency inside Vapi's managed voice platform.

Reviewed by
No items found.
Table of contents

Use AssemblyAI Universal-3 Pro Streaming as the speech-to-text engine inside your Vapi voice agent — and get punctuation-based turn detection, keyterm prompting, and 307ms P50 latency inside Vapi's managed voice platform.

What is Vapi?

Vapi handles telephony, turn-taking, and orchestration so you don't have to. It supports 14+ speech-to-text providers. You bring your AssemblyAI key, Vapi handles the rest.

Setup: add AssemblyAI to Vapi

Step 1 — Add your API key

  1. Go to dashboard.vapi.ai
  2. Navigate to Settings > Transcriber Providers
  3. Add your AssemblyAI API key

Step 2 — Create an assistant (dashboard)

  1. Click Create Assistant
  2. Under Transcriber, select Assembly AI
  3. Under Model, select u3-rt-pro (Universal-3 Pro Streaming)
  4. Save and test via the web call button

Step 3 — Create an assistant (API)

python create_assistant.py create

This creates a fully configured assistant:

{
  "transcriber": {
    "provider": "assembly-ai",
    "model": "u3-rt-pro",
    "language": "en",
    "keytermsPrompt": ["YourBrand", "SpecialTerm"],
    "confidenceThreshold": 0.4
  }
}

Quick start

git clone https://github.com/kelseyefoster/voice-agent-vapi-assemblyai
cd voice-agent-vapi-assemblyai

pip install -r requirements.txt
cp .env.example .env
# Edit .env with your keys

# Create an assistant
python create_assistant.py create

# Make an outbound test call (requires Twilio number in .env)
python create_assistant.py call --assistant-id <id> --phone +1XXXXXXXXXX

# Start the webhook server
uvicorn webhook_server:app --port 8000

Keyterm prompting

This is one of the biggest accuracy levers available in Vapi. Boost recognition for domain-specific vocabulary that a general speech model would otherwise miss:

"keytermsPrompt": [
    "hemoglobin A1c",     # medical
    "HIPAA",              # compliance
    "Jardiance",          # drug name
    "deductible",         # insurance
]

Up to 100 keyterms, each up to 50 characters. Takes effect immediately on the next call — no assistant restart needed.

Supported languages

English, Spanish, French, German, Italian, and Portuguese — with native multilingual code switching. Set the language in the transcriber config:

{ "transcriber": { "provider": "assembly-ai", "model": "u3-rt-pro",
"language": "es" } }

When to choose AssemblyAI over Deepgram in Vapi

Use case

Recommended

Fastest possible streaming latency

AssemblyAI Universal-3 Pro (307ms P50)

Account numbers, serial codes

AssemblyAI (+21% fewer alphanumeric errors)

Medical or clinical terminology

AssemblyAI (keyterm prompting)

Interruption handling

AssemblyAI (punctuation-based turn detection)

Multilingual callers

AssemblyAI (native code switching)

Resources

Switch your Vapi agent to AssemblyAI

Sign up for a free AssemblyAI account, add your key to Vapi's dashboard, and enable Universal-3 Pro Streaming in minutes.

Start building
Title goes here

Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur.

Button Text
AI voice agents
Universal-3 Pro Streaming