Migrating from Streaming v2 to Streaming v3
This cookbook guides you through migrating from AssemblyAI’s legacy Streaming STT model (v2) to our latest Universal Streaming STT model (v3), which provides ultra-low latency for faster transcription, intelligent endpointing for more natural speech detection, and improved accuracy across various audio conditions.
Check out this blog post to learn more about this new model!
Overview of changes
The migration involves several key improvements:
- API Version: Upgrade from v2 (/v2/realtime/ws) to v3 (/v3/ws)
- Enhanced Error Handling: Robust cleanup and resource management
- Improved Threading: Better control over audio streaming threads
- Modern Message Format: Updated message types and structure
- Configuration Options: More flexible connection parameters
- Graceful Shutdown: Proper termination handling
You can follow the step-by-step guide below to make changes to your existing code, but here is what your code should look like in the end:
For more information on our Universal Streaming feature, see this section of our official documentation.
Step-by-step migration guide
1. Update API endpoint and configuration
Before (v2):
After (v3):
Key Changes:
- New base URL: streaming.assemblyai.com instead of api.assemblyai.com
- Version upgrade: /v3/ws instead of /v2/realtime/ws
- Configuration via URL parameters using urlencode()
- Added format_turns option for better transcript formatting
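The endpoint changes listed above can be sketched roughly as follows. This is a minimal illustration, not the definitive configuration: the host, the /v3/ws path, urlencode(), and format_turns come from the list above, while the wss:// scheme, the sample_rate parameter name, and the 16000 value are assumptions to check against the API reference.

```python
from urllib.parse import urlencode

# Assumed values: 16 kHz audio with formatted turns enabled.
CONNECTION_PARAMS = {
    "sample_rate": 16000,   # assumed parameter name and value
    "format_turns": True,   # option named in the list above
}

# v3 host and path from the list above; the wss:// scheme is an assumption.
API_ENDPOINT_BASE_URL = "wss://streaming.assemblyai.com/v3/ws"

# urlencode() serializes the dict into "key=value&key=value" form,
# so the configuration travels in the query string of the URL.
API_ENDPOINT = f"{API_ENDPOINT_BASE_URL}?{urlencode(CONNECTION_PARAMS)}"

print(API_ENDPOINT)
```

Keeping the parameters in a dict means other settings, such as the audio buffer size, can reference the same sample rate instead of repeating the number.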
2. Improve audio configuration
Before (v2):
After (v3):
Key Changes:
- Reduced buffer size from 200ms to 50ms for lower latency
- Sample rate now references the configuration parameter
- Added detailed comments explaining the calculations
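The buffer-size arithmetic behind this change can be illustrated with a short sketch. The 16 kHz sample rate, 16-bit mono format, and the helper name frames_per_buffer are assumptions for illustration; only the 200ms and 50ms durations come from the list above.

```python
SAMPLE_RATE = 16000     # Hz; assumed, should match the connection's sample rate
BUFFER_MS = 50          # v3 buffer duration (v2 used 200 ms)
BYTES_PER_SAMPLE = 2    # assumes 16-bit PCM
CHANNELS = 1            # assumes mono audio


def frames_per_buffer(sample_rate, buffer_ms):
    """Number of audio frames captured in buffer_ms milliseconds."""
    return sample_rate * buffer_ms // 1000


# 16000 frames/s * 0.05 s = 800 frames per chunk
FRAMES_PER_BUFFER = frames_per_buffer(SAMPLE_RATE, BUFFER_MS)

# Each frame is CHANNELS samples of BYTES_PER_SAMPLE bytes.
BYTES_PER_BUFFER = FRAMES_PER_BUFFER * BYTES_PER_SAMPLE * CHANNELS

print(FRAMES_PER_BUFFER, BYTES_PER_BUFFER)
```

At the same sample rate, the v2 buffer of 200ms corresponds to 3200 frames, so each v3 chunk is a quarter of the size and is sent four times as often.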
3. Enhance thread management
Before (v2):
After (v3):
Key Changes:
- Added threading.Event() for controlled thread termination
- Global audio_thread variable for better lifecycle management
- Condition-based loop (while not stop_event.is_set()) instead of infinite loop
- Improved error handling and logging
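The threading pattern described above might look like the following sketch. To keep it runnable without a microphone or a live connection, the audio source and the sender are passed in as plain callables; read_chunk, send_chunk, stream_audio, and start_streaming are hypothetical names, not part of any library.

```python
import threading

stop_event = threading.Event()   # set() asks the audio loop to exit
audio_thread = None              # kept global so shutdown code can join it


def stream_audio(read_chunk, send_chunk):
    """Forward audio chunks until stop_event is set or an error occurs."""
    while not stop_event.is_set():
        try:
            send_chunk(read_chunk())
        except Exception as exc:
            # Leave the loop on any failure instead of spinning forever.
            print(f"Error streaming audio: {exc}")
            break
    print("Audio streaming stopped.")


def start_streaming(read_chunk, send_chunk):
    """Start the audio loop in a background thread and return the thread."""
    global audio_thread
    stop_event.clear()
    audio_thread = threading.Thread(
        target=stream_audio, args=(read_chunk, send_chunk), daemon=True
    )
    audio_thread.start()
    return audio_thread
```

In a real client, read_chunk would read from the audio input stream and send_chunk would write to the WebSocket. Because the loop re-checks stop_event on every iteration, calling stop_event.set() from any thread ends it after the current chunk.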
4. Update message handling
Before (v2):
After (v3):
Key Changes:
- Message types renamed: SessionBegins → Begin, PartialTranscript/FinalTranscript → Turn
- Field names updated: message_type → type, session_id → id, text → transcript
- Added session expiration timestamp handling
- Improved transcript formatting with turn_is_formatted flag
- Added Termination message handling with session statistics
- Enhanced error handling with specific JSONDecodeError catch
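The renamed message types and fields above can be handled along these lines. This is a sketch under assumptions: the function name describe_message is hypothetical, and the field names expires_at (treated as a Unix timestamp in seconds), audio_duration_seconds, and session_duration_seconds are assumed rather than taken from the list above, so verify them against the v3 message reference.

```python
import json
from datetime import datetime, timezone


def describe_message(raw_message):
    """Return a one-line summary of a v3 message, or an error description."""
    try:
        data = json.loads(raw_message)
    except json.JSONDecodeError as exc:
        return f"Error decoding message: {exc}"
    if not isinstance(data, dict):
        return "Unexpected message: not a JSON object"

    msg_type = data.get("type")                  # v2: message_type
    if msg_type == "Begin":                      # v2: SessionBegins
        session_id = data.get("id")              # v2: session_id
        expires_at = data.get("expires_at")      # assumed field, Unix seconds
        if expires_at is None:
            return f"Session began: id={session_id}"
        expiry = datetime.fromtimestamp(expires_at, tz=timezone.utc)
        return f"Session began: id={session_id}, expires at {expiry.isoformat()}"
    if msg_type == "Turn":                       # v2: Partial/FinalTranscript
        transcript = data.get("transcript", "")  # v2: text
        if data.get("turn_is_formatted", False):
            return f"Formatted turn: {transcript}"
        return f"Unformatted turn: {transcript}"
    if msg_type == "Termination":
        audio_s = data.get("audio_duration_seconds", 0)      # assumed field
        session_s = data.get("session_duration_seconds", 0)  # assumed field
        return f"Session terminated: audio={audio_s}s, session={session_s}s"
    return f"Unhandled message type: {msg_type}"
```

Returning a string instead of printing keeps the handler easy to test; an on_message callback could simply print the result.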
5. Implement robust resource management
Before (v2):
After (v3):
Key Changes:
- Added thread stop signaling via stop_event.set()
- Conditional resource cleanup with null checks
- Proper thread joining with timeout
- Resource nullification to prevent reuse
- Enhanced error handling in on_error
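One way to structure the cleanup described above is sketched below. It groups the resources in a hypothetical StreamingResources class instead of module-level globals so the example is self-contained. The stream and audio objects are assumed to expose PyAudio-style methods (is_active(), stop_stream(), close(), terminate()); joining the thread before closing the stream is a design choice of this sketch.

```python
import threading


class StreamingResources:
    """Hypothetical holder for the objects a streaming client must release."""

    def __init__(self):
        self.stop_event = threading.Event()
        self.stream = None         # e.g. a PyAudio input stream
        self.audio = None          # e.g. a PyAudio instance
        self.audio_thread = None   # thread running the audio loop

    def cleanup(self):
        """Release everything; safe to call more than once."""
        # Signal the audio loop to exit.
        self.stop_event.set()

        # Wait briefly for the loop, unless cleanup runs on that very thread.
        thread = self.audio_thread
        if (thread is not None and thread.is_alive()
                and thread is not threading.current_thread()):
            thread.join(timeout=1.0)
        self.audio_thread = None

        # Close the stream only if it exists, then drop the reference.
        if self.stream is not None:
            try:
                if self.stream.is_active():
                    self.stream.stop_stream()
                self.stream.close()
            except Exception as exc:
                print(f"Error closing audio stream: {exc}")
            self.stream = None

        if self.audio is not None:
            try:
                self.audio.terminate()
            except Exception as exc:
                print(f"Error terminating audio backend: {exc}")
            self.audio = None
```

Calling cleanup() from both on_error and on_close is then harmless, because every step checks for None first and resets the reference afterwards.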
6. Add graceful shutdown handling
Before (v2):
After (v3):
Key Changes:
- WebSocket runs in separate thread for better control
- Proper KeyboardInterrupt handling
- Graceful termination message sending
- Thread joining with timeouts
- Comprehensive cleanup in finally block
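The shutdown flow above can be sketched as follows. The ws_app argument is assumed to behave like a websocket-client WebSocketApp (run_forever(), send(), close()). The function name, the idle and join_timeout parameters, and the {"type": "Terminate"} payload are assumptions of this sketch, so check the termination message format against the v3 API reference.

```python
import json
import threading
import time


def run_with_graceful_shutdown(ws_app, stop_event, idle=time.sleep,
                               join_timeout=2.0):
    """Run ws_app in a background thread and shut it down cleanly on Ctrl+C."""
    # A separate thread keeps the main thread free to receive KeyboardInterrupt.
    ws_thread = threading.Thread(target=ws_app.run_forever, daemon=True)
    ws_thread.start()
    try:
        while ws_thread.is_alive():
            idle(0.1)
    except KeyboardInterrupt:
        print("Ctrl+C received, stopping...")
        stop_event.set()  # stop the audio loop before ending the session
        try:
            # Assumed termination payload for v3.
            ws_app.send(json.dumps({"type": "Terminate"}))
            # Give the server a moment to reply and close the connection.
            ws_thread.join(timeout=join_timeout)
        except Exception as exc:
            print(f"Error sending termination message: {exc}")
    finally:
        # Runs on every exit path: normal close, Ctrl+C, or an error.
        stop_event.set()
        ws_app.close()
        ws_thread.join(timeout=join_timeout)
```

The idle parameter exists only so the wait can be replaced in tests; in normal use the default time.sleep is enough.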
7. Improve error handling and logging
Before (v2):
- Basic error printing
- Limited context in error messages
- No resource cleanup on errors
After (v3):
- Detailed error context and timestamps
- Proper exception type handling
- Resource cleanup on all error paths
- Connection status checking before operations
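As an illustration of the v3-style points above, the sketch below combines timestamped logging with a connection check before sending. It assumes the WebSocket object exposes a sock attribute with a connected flag, as websocket-client's WebSocketApp does; is_connected and safe_send are hypothetical helper names.

```python
import logging

# asctime adds a timestamp to every log line.
logging.basicConfig(
    format="%(asctime)s %(levelname)s %(message)s", level=logging.INFO
)
logger = logging.getLogger("streaming")


def is_connected(ws_app):
    """True if ws_app has an open socket (assumes a .sock.connected flag)."""
    sock = getattr(ws_app, "sock", None)
    return sock is not None and bool(getattr(sock, "connected", False))


def safe_send(ws_app, payload):
    """Send payload only when connected; log failures with context."""
    if not is_connected(ws_app):
        logger.warning("Skipped send: WebSocket is not connected")
        return False
    try:
        ws_app.send(payload)
        return True
    except Exception:
        # logger.exception records the exception type and traceback.
        logger.exception("Send failed for payload of %d bytes", len(payload))
        return False
```

The boolean return value lets the caller decide whether to stop the audio loop or retry after a failed send.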
Migration checklist
- Update API endpoint from v2 to v3
- Change base URL to streaming.assemblyai.com
- Update message type handling (Begin, Turn, Termination)
- Implement threading.Event() for thread control
for thread control - Add proper resource cleanup in all code paths
- Update field names in message parsing
- Add graceful shutdown with termination messages
- Implement timeout-based thread joining
- Add detailed error logging with context
- Test KeyboardInterrupt handling
- Verify audio resource cleanup
- Test connection failure scenarios
Testing your migration
- Basic Functionality: Verify transcription works with simple speech
- Error Handling: Test with invalid API keys or network issues
- Graceful Shutdown: Test Ctrl+C interruption
- Resource Cleanup: Monitor for memory leaks during extended use
- Thread Management: Verify proper thread termination
- Message Formatting: Test with format_turns enabled/disabled
Common migration issues
Issue: “WebSocket connection failed”
Solution: Verify you’re using the new v3 endpoint URL and proper authentication header format.
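For reference, a connection setup that addresses both points might be assembled like this. It is a sketch under assumptions: the API key is assumed to be sent as the value of an Authorization header, the wss:// scheme is assumed, and build_connection_kwargs is a hypothetical helper whose result could be passed to a WebSocket client that accepts url and header arguments.

```python
def build_connection_kwargs(api_key):
    """Return the URL and headers for a v3 connection (assumed format)."""
    if not api_key:
        raise ValueError("An API key is required to open the connection")
    return {
        # v3 host and path; the wss:// scheme is an assumption.
        "url": "wss://streaming.assemblyai.com/v3/ws",
        # Assumed header format: the key itself as the Authorization value.
        "header": {"Authorization": api_key},
    }


kwargs = build_connection_kwargs("YOUR_API_KEY")
print(kwargs["url"])
```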
Issue: “Message type not recognized”
Solution: Update message type handling from old names (SessionBegins, PartialTranscript) to new ones (Begin, Turn).
Issue: “Audio thread won’t stop”
Solution: Ensure you’re using threading.Event() and calling stop_event.set() in error handlers.
Issue: “Resource leak warnings”
Solution: Verify all audio resources are properly cleaned up in on_close and finally blocks.
Benefits of migration
- Improved Reliability: Better error handling and recovery
- Lower Latency: Reduced buffer sizes for faster response
- Enhanced Features: Formatted transcripts and session statistics
- Better Resource Management: Proper cleanup prevents memory leaks
- Graceful Shutdown: Clean termination with proper cleanup
- Modern Architecture: Improved threading and event handling
Conclusion
This migration provides a more robust, maintainable, and feature-rich streaming transcription implementation. The enhanced error handling, resource management, and modern API features make it suitable for production use cases where reliability and performance are critical.