Streaming Migration Guide: Universal Streaming to Universal-3 Pro Streaming
This guide walks through the process of upgrading from Universal Streaming to Universal-3 Pro Streaming for real-time audio transcription.
Get Started
Before we begin, make sure you have an AssemblyAI account and an API key. You can sign up for a free account and get your API key from your dashboard.
Quick upgrade
If you’re already using Universal Streaming, you can quickly test Universal-3 Pro Streaming by switching the speech_model parameter to "u3-rt-pro" and removing format_turns (formatting is always on in U3 Pro). Just update the connection params and start streaming.
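A minimal sketch of that parameter change, assuming your existing settings live in a dict and are sent as URL query parameters. The endpoint URL and the sample_rate value are assumptions for illustration; confirm the exact connection URL and supported parameters in the API Reference.

```python
from urllib.parse import urlencode

# Assumed endpoint for illustration only; confirm it in the API Reference.
BASE_URL = "wss://streaming.assemblyai.com/v3/ws"


def upgrade_params(params: dict) -> dict:
    """Return a copy of the connection params switched to Universal-3 Pro."""
    # format_turns is dropped because formatting is always on in U3 Pro.
    upgraded = {k: v for k, v in params.items() if k != "format_turns"}
    upgraded["speech_model"] = "u3-rt-pro"
    return upgraded


# Hypothetical existing Universal Streaming settings.
old_params = {"sample_rate": 16000, "format_turns": "true"}
new_params = upgrade_params(old_params)
connection_url = f"{BASE_URL}?{urlencode(new_params)}"
print(connection_url)
```

The helper returns a new dict rather than mutating the old one, so both configurations can be kept around while you compare the two models.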
That’s it for a quick test. But there are important behavioral differences in turn detection, partials, and formatting that may require updates to your message handling logic. Read on for the full migration details.
Why upgrade
Universal-3 Pro Streaming delivers:
- Exceptional entity accuracy: credit card numbers, phone numbers, email addresses, physical addresses, and names captured correctly at streaming speed
- Promptable model: custom transcription instructions via prompt, plus domain-term boosting via keyterms_prompt (up to 100 terms)
- Better turn detection: punctuation-based system that waits when speakers pause mid-thought and responds when they’re done
- Native multilingual code-switching: English, Spanish, German, French, Portuguese, Italian in a single model
- Sub-300ms latency: fast time to complete transcript
- Mid-stream configuration: update keyterms, prompts, and silence parameters without dropping the connection
For full details, see Universal-3 Pro Streaming.
What changes
This table covers the key parameter, behavior, and response field differences. Use it as a migration checklist.
Sources: U3 Pro docs, Universal docs, Turn detection docs, API Reference
Side-by-side code
Full working Python examples side by side using raw websocket-client.
Universal Streaming
Universal-3 Pro Streaming
Turn detection
This is the most significant behavioral difference between the two models.
Universal Streaming uses a confidence-based system combining semantic and acoustic detection (source):
The model evaluates end_of_turn_confidence during silence. If the score exceeds end_of_turn_confidence_threshold after min_turn_silence, the turn ends. Otherwise, the turn is forced to end after max_turn_silence.
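The rule above can be modeled as a small function. This is only an illustration of the described behavior, not the service's actual implementation; the millisecond units, the use of >= at the boundaries, and the sample values are assumptions.

```python
def universal_turn_ends(
    silence_ms: int,
    end_of_turn_confidence: float,
    end_of_turn_confidence_threshold: float,
    min_turn_silence: int,
    max_turn_silence: int,
) -> bool:
    """Model of the confidence-based end-of-turn rule (illustrative only)."""
    # Once silence reaches the maximum, the turn is forced to end.
    if silence_ms >= max_turn_silence:
        return True
    # Before the minimum silence has elapsed, the turn cannot end.
    if silence_ms < min_turn_silence:
        return False
    # In between, the turn ends only if confidence exceeds the threshold.
    return end_of_turn_confidence > end_of_turn_confidence_threshold


# Hypothetical values: threshold 0.5, min 400 ms, max 1200 ms.
print(universal_turn_ends(500, 0.8, 0.5, 400, 1200))
print(universal_turn_ends(500, 0.3, 0.5, 400, 1200))
print(universal_turn_ends(1200, 0.3, 0.5, 400, 1200))
```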
Universal-3 Pro uses a punctuation-based system (source):
When silence reaches min_turn_silence, the model transcribes the audio and checks for terminal punctuation (. ? !):
- Terminal punctuation found: the turn ends (end_of_turn: true)
- No terminal punctuation: a partial is emitted (end_of_turn: false) and the turn continues
- Silence reaches max_turn_silence: the turn is forced to end regardless of punctuation
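The three outcomes above can be sketched as a pure function. This models the documented behavior for illustration and is not the service's code; the function name, the returned labels, the millisecond units, and the sample values are all hypothetical.

```python
TERMINAL_PUNCTUATION = (".", "?", "!")


def u3_pro_turn_step(
    silence_ms: int,
    transcript: str,
    min_turn_silence: int,
    max_turn_silence: int,
) -> str:
    """Return "waiting", "partial", or "end_of_turn" (illustrative labels)."""
    # Maximum silence forces the turn to end regardless of punctuation.
    if silence_ms >= max_turn_silence:
        return "end_of_turn"
    # Nothing is checked until the minimum silence has elapsed.
    if silence_ms < min_turn_silence:
        return "waiting"
    # At or past the minimum: the punctuation check decides.
    if transcript.rstrip().endswith(TERMINAL_PUNCTUATION):
        return "end_of_turn"
    return "partial"


# Hypothetical values: min 400 ms, max 1200 ms.
print(u3_pro_turn_step(500, "I was thinking that", 400, 1200))
print(u3_pro_turn_step(500, "Book a table for two.", 400, 1200))
print(u3_pro_turn_step(1200, "I was thinking that", 400, 1200))
```

A mid-thought pause therefore yields a partial and keeps the turn open, which is the main behavior your message handler needs to account for.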
end_of_turn_confidence_threshold is not part of the Universal-3 Pro API. It was never supported there, so it is absent rather than deprecated. On Universal Streaming it is officially deprecated. Remove this parameter and configure min_turn_silence and max_turn_silence instead. For configuration guidance, see Configuring Turn Detection.
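One way to apply this to an existing parameter dict is sketched below. The helper name, the sanity check on ordering, and the silence values are hypothetical; pick real values using the turn detection configuration guidance.

```python
def migrate_turn_params(
    params: dict, min_turn_silence: int, max_turn_silence: int
) -> dict:
    """Drop the confidence threshold and set explicit silence bounds."""
    # Sanity check added for this sketch; not a documented API rule.
    if min_turn_silence > max_turn_silence:
        raise ValueError("min_turn_silence must not exceed max_turn_silence")
    migrated = {
        key: value
        for key, value in params.items()
        if key != "end_of_turn_confidence_threshold"
    }
    migrated["min_turn_silence"] = min_turn_silence
    migrated["max_turn_silence"] = max_turn_silence
    return migrated


# Hypothetical starting point and silence values.
legacy = {"speech_model": "u3-rt-pro", "end_of_turn_confidence_threshold": 0.5}
turn_params = migrate_turn_params(legacy, min_turn_silence=400, max_turn_silence=1200)
print(turn_params)
```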
New capabilities
These features are new or enhanced in Universal-3 Pro. For full details, see Universal-3 Pro Streaming.
Prompting
Universal-3 Pro supports a prompt parameter for custom transcription instructions. When omitted, a default prompt optimized for turn detection (88% accuracy) is applied automatically. See the Prompting Guide for details.
Start with no prompt. Only customize if you have specific requirements, and build on the default prompt rather than starting from scratch.
Keyterms prompting
Boost recognition of specific names, brands, or domain terms. Maximum 100 keyterms, each 50 characters or less. See Keyterms Prompting for details.
prompt and keyterms_prompt can be used together. When you use keyterms_prompt, your boosted words are appended to the default prompt (or your custom prompt if provided) automatically.
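Checking the limits client-side before connecting can catch mistakes early. A minimal sketch, assuming keyterms are held as a list of strings; the helper name and example terms are hypothetical, and how the list is encoded on the wire is left to the Keyterms Prompting docs.

```python
MAX_KEYTERMS = 100       # maximum number of keyterms
MAX_KEYTERM_CHARS = 50   # maximum length of each keyterm


def validate_keyterms(keyterms: list[str]) -> list[str]:
    """Raise ValueError if the list breaks the documented limits."""
    if len(keyterms) > MAX_KEYTERMS:
        raise ValueError(f"{len(keyterms)} keyterms given, limit is {MAX_KEYTERMS}")
    too_long = [term for term in keyterms if len(term) > MAX_KEYTERM_CHARS]
    if too_long:
        raise ValueError(f"keyterms over {MAX_KEYTERM_CHARS} characters: {too_long}")
    return keyterms


# Hypothetical domain terms.
checked_terms = validate_keyterms(["AssemblyAI", "Universal-3 Pro"])
print(checked_terms)
```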
Mid-stream configuration updates
Update prompt, keyterms_prompt, min_turn_silence, and max_turn_silence during an active session without reconnecting. See Updating configuration mid-stream for details.
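A sketch of building such an update as a JSON message. The "UpdateConfiguration" message type and the flat payload shape are assumptions made for this example; confirm both against the mid-stream configuration docs before relying on them.

```python
import json

# The four fields named above as updatable during a session.
UPDATABLE_FIELDS = {"prompt", "keyterms_prompt", "min_turn_silence", "max_turn_silence"}


def build_config_update(**changes) -> str:
    """Serialize a mid-stream update, rejecting fields that cannot change."""
    unknown = set(changes) - UPDATABLE_FIELDS
    if unknown:
        raise ValueError(f"not updatable mid-stream: {sorted(unknown)}")
    # Assumed message type name and payload layout.
    return json.dumps({"type": "UpdateConfiguration", **changes})


# Hypothetical value: lengthen the forced-end timeout during the session.
update_message = build_config_update(max_turn_silence=2000)
print(update_message)
```

The resulting string would be sent over the already-open WebSocket, for example with the send method of a websocket-client connection.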
Force turn end
ForceEndpoint is supported on both Universal Streaming and Universal-3 Pro — no migration changes needed. Force the current turn to end immediately based on external signals. See Forcing a turn endpoint for details.
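As an illustration of reacting to an external signal, the sketch below sends a ForceEndpoint message when a hypothetical push-to-talk button is released. The JSON payload shape and the send callable are assumptions; in a real client, send would be your WebSocket connection's send method.

```python
import json
from typing import Callable


def force_endpoint_message() -> str:
    # Assumed payload shape: a JSON object whose type is "ForceEndpoint".
    return json.dumps({"type": "ForceEndpoint"})


def on_button_state(send: Callable[[str], None], released: bool) -> bool:
    """Send ForceEndpoint when the (hypothetical) talk button is released."""
    if released:
        send(force_endpoint_message())
        return True
    return False


# Stand-in for a WebSocket: collect outgoing messages in a list.
outgoing: list[str] = []
on_button_state(outgoing.append, released=False)
on_button_state(outgoing.append, released=True)
print(outgoing)
```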
Language support
Universal Streaming transcribes English by default. For multilingual support, use speech_model: "universal-streaming-multilingual". (Source)
Universal-3 Pro natively code-switches between 6 languages in a single model — no separate multilingual model needed: English, Spanish, German, French, Portuguese, Italian. It also supports automatic language detection, returning language_code and language_confidence fields in Turn messages. To guide toward a specific language, prepend Transcribe <language>. to the default prompt. See Supported languages for the full list.
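The prompt change can be sketched as simple string composition. DEFAULT_PROMPT below is a placeholder, not the real default prompt text; copy the actual default from the Prompting Guide.

```python
# Placeholder only: substitute the real default prompt from the Prompting Guide.
DEFAULT_PROMPT = "<default prompt text>"


def language_prompt(default_prompt: str, language: str) -> str:
    """Prepend the "Transcribe <language>." instruction to the default prompt."""
    return f"Transcribe {language}. {default_prompt}"


spanish_prompt = language_prompt(DEFAULT_PROMPT, "Spanish")
print(spanish_prompt)
```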
Language Detection: Universal Streaming supports the language_detection connection parameter (true/false, default false) with the multilingual model. When enabled, Turn messages include language_code and language_confidence fields. Universal-3 Pro also supports language detection with code-switching — see Supported languages for details.
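A minimal sketch of reading those fields from an incoming message, assuming messages arrive as JSON text with a type field set to "Turn". The sample message, including its transcript field and values, is made up for illustration.

```python
import json
from typing import Optional, Tuple


def extract_language(raw_message: str) -> Optional[Tuple[str, Optional[float]]]:
    """Return (language_code, language_confidence) from a Turn message, else None."""
    message = json.loads(raw_message)
    if message.get("type") != "Turn":
        return None
    code = message.get("language_code")
    if code is None:
        # Language detection is off or the field was not included.
        return None
    return code, message.get("language_confidence")


# Hypothetical Turn message for demonstration.
sample = json.dumps(
    {"type": "Turn", "transcript": "Hola.", "language_code": "es", "language_confidence": 0.97}
)
detected = extract_language(sample)
print(detected)
```

Using .get for both fields keeps the handler safe when language detection is disabled and the fields are missing.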
Need more than 6 languages? Use the Whisper Streaming model (speech_model: "whisper-rt") for 99+ languages with automatic language detection. See Whisper Streaming for details.
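The choice between the two models can be sketched as a set check. The two-letter language codes and the helper name are assumptions for this example; only the six languages and the model names come from the text above.

```python
# The six languages Universal-3 Pro code-switches between (codes assumed):
# English, Spanish, German, French, Portuguese, Italian.
U3_PRO_LANGUAGES = {"en", "es", "de", "fr", "pt", "it"}


def choose_speech_model(language_codes: set[str]) -> str:
    """Pick u3-rt-pro when every expected language is covered, else whisper-rt."""
    if language_codes <= U3_PRO_LANGUAGES:
        return "u3-rt-pro"
    return "whisper-rt"


print(choose_speech_model({"en", "es"}))
print(choose_speech_model({"en", "ja"}))
```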