Terminate Streaming Session After Inactivity
An often-overlooked aspect of implementing AssemblyAI’s Streaming Speech-to-Text (STT) service is efficiently terminating transcription sessions. In this cookbook, you will learn how to terminate a Streaming session after any fixed duration of silence.
For the full code, refer to this GitHub gist.
Quickstart
Get Started
Before we begin, make sure you have an AssemblyAI account and an API key. You can sign up for an AssemblyAI account and get your API key from your dashboard.
Step-by-step instructions
First, install AssemblyAI’s Python SDK (the assemblyai package on PyPI, for example with pip install assemblyai).
Implementing Speech Activity Checks
Our Streaming API emits a Turn event (TurnEvent) each time speech is processed. During periods of silence, no Turn events are sent. You can use this behavior to detect inactivity and automatically terminate the session.
We can track the timestamp of the most recent non-empty transcript using a datetime. On every Turn event, we:
- Update the timestamp if meaningful speech is received
- Check how many seconds have passed since the last valid transcript
- If that exceeds your timeout (e.g. 5 seconds), terminate the session
Key Variables
These are updated on every turn event.
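A minimal sketch of the state this approach needs, assuming two module-level variables. The names last_transcript_received and terminated are illustrative choices, not identifiers required by the SDK.

```python
from datetime import datetime

# Timestamp of the most recent non-empty transcript. Initialising it at
# start-up means the silence clock begins when the session does.
last_transcript_received = datetime.now()

# Set to True once the session has been terminated, so the handler
# never tries to terminate the same session twice.
terminated = False
```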
Turn event logic
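The steps described earlier can be sketched as a plain function, independent of the SDK. Everything here is an assumption made for illustration: handle_turn is a hypothetical helper you would call from your own Turn event handler with the event's transcript text, and terminate_session stands in for whatever call your client uses to end the session. The optional now argument exists only so the logic can be exercised with fixed timestamps.

```python
from datetime import datetime

last_transcript_received = datetime.now()
terminated = False


def handle_turn(transcript, terminate_session, now=None):
    """Apply the inactivity check to one Turn event.

    Returns True once the session has been terminated.
    """
    global last_transcript_received, terminated
    if now is None:
        now = datetime.now()

    # Ignore events that arrive after termination.
    if terminated:
        return True

    # Meaningful speech resets the silence clock.
    if transcript.strip():
        last_transcript_received = now

    # Seconds elapsed since the last non-empty transcript.
    silence_duration = (now - last_transcript_received).total_seconds()
    if silence_duration > 5:
        terminate_session()  # hypothetical callback supplied by the caller
        terminated = True

    return terminated
```

Because the check runs inside the event handler, it is only evaluated when an event actually arrives.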
This pattern ensures sessions are cleanly terminated after inactivity.
What You’ll Observe
- Live transcription continues as long as there’s speech
- After 5 seconds of silence, the session ends automatically
You can change the timeout value to suit your needs by modifying the silence_duration > 5 check.
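If you expect to tune the timeout often, one option is to keep the threshold next to the timestamp in a small object rather than hard-coding it in the comparison. The SilenceTimeout class below is a hypothetical helper written for this illustration, not part of the AssemblyAI SDK.

```python
from datetime import datetime, timedelta


class SilenceTimeout:
    """Tracks time since the last non-empty transcript against a configurable limit."""

    def __init__(self, timeout_seconds=5.0, now=None):
        self.timeout = timedelta(seconds=timeout_seconds)
        self.last_speech = now if now is not None else datetime.now()

    def expired(self, transcript, now=None):
        """Record the transcript and report whether the silence limit was exceeded."""
        now = now if now is not None else datetime.now()
        if transcript.strip():
            # Non-empty transcript: reset the silence clock.
            self.last_speech = now
        return now - self.last_speech > self.timeout
```

For example, SilenceTimeout(timeout_seconds=10) would allow ten seconds of silence; your Turn event handler would call expired(...) with each transcript and terminate the session when it returns True.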