agent_context parameter — either as a connection-time query parameter (to seed the model with your agent’s opening greeting) or mid-stream via UpdateConfiguration after each agent reply. Context helps the model disambiguate words that sound similar and improves entity recognition and consistency.
For example, after your agent asks "What's your email address?" the model might transcribe the reply as "user at assemblyai dot com". With agent_context, the model knows an email is coming and produces "user@assemblyai.com".
How it works
During a streaming session, Universal-3 Pro Streaming keeps a short memory of recent finalized turns and uses them as additional context when transcribing the next turn. This means:
- Context is per-session. Closing the WebSocket clears the context, and a new session starts fresh.
- Only agent_context values and finalized turns (end_of_turn: true) are carried forward, not partials.
Defaults
| Behavior | Default |
|---|---|
| Context carryover | Enabled |
| Number of prior entries carried | 3 |
| Maximum context size | ~1500 characters |
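As a mental model only, the carryover rules and defaults above can be sketched as a small bounded buffer. This is a toy illustration, not the service's implementation: the class name, the idea that agent_context values and finalized turns share one buffer, and the rule for trimming when the character budget is exceeded are all assumptions made for the example.

```python
from collections import deque

MAX_ENTRIES = 3    # documented default: number of prior entries carried
MAX_CHARS = 1500   # documented default: approximate maximum context size


class ContextWindow:
    """Toy model of per-session context carryover (hypothetical)."""

    def __init__(self, max_entries=MAX_ENTRIES, max_chars=MAX_CHARS):
        # A bounded deque drops the oldest entry once it is full.
        self.entries = deque(maxlen=max_entries)
        self.max_chars = max_chars

    def add_turn(self, text, end_of_turn):
        # Partials (end_of_turn is False) are never carried forward.
        if end_of_turn:
            self.entries.append(text)

    def add_agent_context(self, text):
        # Assumption: agent_context shares the buffer with finalized turns.
        self.entries.append(text)

    def render(self):
        # Assumption: when over budget, keep the most recent characters.
        joined = "\n".join(self.entries)
        return joined[-self.max_chars:]
```

Creating a new ContextWindow corresponds to opening a new WebSocket session: nothing from the previous instance is visible to it.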
Passing your agent’s reply as context
Universal-3 Pro Streaming automatically carries prior STT-finalized turns (what the user said) back into the model, with no configuration required. You can also pass your voice agent’s spoken reply (what your TTS just said) via the agent_context parameter. There are two ways to set it:
- At connection time: pass agent_context as a query parameter on the WebSocket URL. Use this to seed the model with your agent’s opening greeting before the user has said anything.
- Mid-stream: send an UpdateConfiguration message with the agent_context field after each subsequent agent reply.
"yes", "7pm", "that's all").
Setting an opening greeting at connection time
When you open the WebSocket, pass agent_context alongside your other connection parameters. The first user turn will be transcribed with the greeting already in the model’s context.
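A minimal sketch of building such a connection URL in Python follows. The base endpoint, the sample_rate parameter, and the function name are assumptions for illustration; only agent_context and speech_model: "u3-rt-pro" come from this page, so check the remaining values against the streaming API reference.

```python
from urllib.parse import quote, urlencode

# Assumption: base streaming endpoint; confirm against the API reference.
BASE_URL = "wss://streaming.assemblyai.com/v3/ws"


def build_streaming_url(greeting, base_url=BASE_URL, sample_rate=16000):
    """Return a WebSocket URL that seeds the session with the agent's greeting."""
    params = {
        "speech_model": "u3-rt-pro",   # agent_context is only supported here
        "sample_rate": sample_rate,    # assumed connection parameter
        "agent_context": greeting,
    }
    # quote_via=quote encodes spaces as %20 rather than "+".
    return f"{base_url}?{urlencode(params, quote_via=quote)}"


url = build_streaming_url("Hi, thanks for calling. What's your email address?")
```

Pass the resulting URL to whichever WebSocket client you use, along with your usual authentication.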
Updating agent context mid-stream
A typical voice agent loop looks like this:
- User speaks → Universal-3 Pro Streaming emits a final turn.
- Your agent runs an LLM step and generates a reply.
- Your TTS speaks the reply to the user.
- User responds → next turn.
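One way to wire the mid-stream update into this loop is to send an UpdateConfiguration message each time the agent produces a reply, before the user's next turn arrives. The sketch below only builds and sends the message. The "type" key used to identify the message, the function names, and the send callable are assumptions for illustration; the UpdateConfiguration name and the agent_context field come from this page.

```python
import json


def build_agent_context_update(agent_reply: str) -> str:
    """Serialize an UpdateConfiguration message carrying agent_context."""
    message = {
        # Assumption: messages are JSON objects identified by a "type" key.
        "type": "UpdateConfiguration",
        "agent_context": agent_reply,
    }
    return json.dumps(message)


def on_agent_reply(send, agent_reply: str) -> None:
    """Call once per agent reply, before the next user turn is transcribed.

    `send` is any callable that writes one text frame to the open WebSocket
    (hypothetical; for example the send method of your WebSocket client).
    """
    send(build_agent_context_update(agent_reply))
```

Taking send as a plain callable keeps the helper independent of any particular WebSocket library, so the same function works in synchronous clients and in tests.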
Limits
- Universal-3 Pro only. agent_context is supported on speech_model: "u3-rt-pro". If you set it at connection time on any other model, the session is rejected; if you send it mid-stream on another model, it’s stripped with a warning.
- Per-value cap: ~1500 characters. Trim long agent replies down to the substantive question before sending.
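A hedged sketch of one trimming strategy for the per-value cap: if a reply exceeds the limit, keep its last question, and otherwise fall back to the tail of the reply. The function name and the heuristic are assumptions for this example, and the 1500 figure is the approximate cap stated above, not an exact server-side limit.

```python
import re

MAX_AGENT_CONTEXT_CHARS = 1500  # approximate per-value cap from this page


def trim_agent_reply(reply: str, limit: int = MAX_AGENT_CONTEXT_CHARS) -> str:
    """Shrink a long agent reply to fit within the agent_context cap."""
    reply = " ".join(reply.split())  # collapse runs of whitespace
    if len(reply) <= limit:
        return reply
    # Split on sentence-ending punctuation followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", reply)
    questions = [s for s in sentences if s.endswith("?")]
    # Heuristic: the last question is what the user is about to answer.
    candidate = questions[-1] if questions else reply
    # Still too long? Keep the most recent characters.
    return candidate[-limit:]
```

Replies that already fit are returned unchanged apart from whitespace normalization, so the helper is safe to call on every agent reply.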
When context carryover helps most
Context carryover has the largest impact on:
- Voice agents: short user responses to agent questions ("yes", "no", "that's all", dates, times, single names).
- Spelled-out entities: emails, account IDs, addresses, and similar inputs read aloud after the agent has just asked for them. Setting agent_context to the agent’s prompt (e.g. "What's your email address?") primes the model for what’s coming.
- Disambiguation: words that sound similar where only one fits the conversation ("fleas" vs "please", "to" vs "two" vs "too").
- Entity recall: names, products, or terms that were established earlier in the conversation.
Interactions with other parameters
- prompt: Context carryover is layered on top of the default prompt and any custom prompt you provide. You don’t need to manage it yourself.
- keyterms_prompt: You can use keyterms_prompt alongside context carryover. If you provide a prompt, we recommend dropping keyterms_prompt for that turn and folding domain terms into your prompt instead.
- Multilingual sessions: Carrying prior turns biases the model toward the languages already seen in the conversation. For sessions that mix three or more languages, this can occasionally push the model toward translating rather than transcribing. If you see drift, set a single transcription language in your prompt (see Specifying the transcription language).
FAQ
Do I need to enable context carryover?
No. It’s on by default for every Universal-3 Pro Streaming session. Just keep using the same WebSocket connection across the conversation.
Is context carryover billed separately?
No. Streaming is billed on WebSocket session duration, not on the size of the prompt or the carried context.
Does context carry over between WebSocket connections?
No. Context is scoped to a single WebSocket session. If you reconnect, the new session starts with no prior context.
Can I pass my voice agent's spoken reply as context?
Yes. Set agent_context as a connection-time query parameter to seed the agent’s opening greeting, and/or send it via UpdateConfiguration mid-stream after each subsequent agent reply. The model uses it as context for the next user turn. See Passing your agent’s reply as context.