Migrating from Streaming v2 to Streaming v3 (JavaScript)
This cookbook guides you through migrating from AssemblyAI’s legacy Streaming STT model (v2) to our latest Universal Streaming STT model (v3), which provides ultra-low latency for faster transcription, intelligent endpointing for more natural speech detection, and improved accuracy across various audio conditions.
Check out this blog post to learn more about this new model!
Overview of changes
The migration involves several key improvements:
- API Version: Upgrade from v2 (/v2/realtime/ws) to v3 (/v3/ws)
- Enhanced Error Handling: Robust cleanup and resource management
- Modern Message Format: Updated message types and structure
- Configuration Options: More flexible connection parameters
- Graceful Shutdown: Proper termination handling
You can follow the step-by-step guide below to make changes to your existing code, but here is what your code should look like in the end:
For more information on our Universal Streaming feature, see this section of our official documentation.
Step-by-step migration guide
1. Update API endpoint and configuration
Before (v2):
After (v3):
Key Changes:
- New base URL: streaming.assemblyai.com instead of api.assemblyai.com
- Version upgrade: /v3/ws instead of /v2/realtime/ws
- Configuration via URL parameters using querystring
- Added format_turns option for better transcript formatting
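As a minimal sketch of the changes above, the v3 URL can be assembled from a parameter object with Node's built-in querystring module. The names CONNECTION_PARAMS, buildStreamingUrl, and API_ENDPOINT are illustrative, and the sample rate of 16000 is an assumed value rather than a requirement.

```javascript
const querystring = require("querystring");

// Illustrative connection parameters; 16000 Hz is an assumed sample rate.
const CONNECTION_PARAMS = {
  sample_rate: 16000,
  format_turns: true, // request formatted transcripts
};

// Build the v3 WebSocket URL: new host, new path, config in the query string.
function buildStreamingUrl(params) {
  return `wss://streaming.assemblyai.com/v3/ws?${querystring.stringify(params)}`;
}

const API_ENDPOINT = buildStreamingUrl(CONNECTION_PARAMS);
console.log(API_ENDPOINT);
```

If you connect with the ws package, the resulting URL would typically be passed to its WebSocket constructor together with a headers option carrying your API key in the Authorization header. Treat that wiring as an assumption and check it against your existing v2 connection code.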
2. Audio configuration
Before (v2):
After (v3):
Key Changes:
- Sample rate now references the configuration parameter
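One way to read this change: the audio settings take their sample rate from the same object that configures the connection, so the two cannot drift apart. The sketch below is hypothetical. AUDIO_CONFIG, chunkSizeBytes, and the 16-bit mono PCM format are assumptions, not names or values taken from the original sample.

```javascript
// Same illustrative parameter object used to build the connection URL.
const CONNECTION_PARAMS = {
  sample_rate: 16000,
  format_turns: true,
};

// Hypothetical audio settings; the sample rate is read from the
// connection parameters instead of being hard-coded a second time.
const AUDIO_CONFIG = {
  sampleRate: CONNECTION_PARAMS.sample_rate,
  channels: 1, // assumed mono input
  bytesPerSample: 2, // assumed 16-bit PCM
};

// Number of bytes in an audio chunk of the given duration in milliseconds.
function chunkSizeBytes(config, chunkMs) {
  const samples = Math.round((config.sampleRate * chunkMs) / 1000);
  return samples * config.channels * config.bytesPerSample;
}

console.log(chunkSizeBytes(AUDIO_CONFIG, 50)); // 1600 bytes for 50 ms
```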
3. Update message handling schema
Before (v2):
After (v3):
Key Changes:
- Message types renamed: SessionBegins → Begin, PartialTranscript/FinalTranscript → Turn
- Field names updated: message_type → type, session_id → id, text → transcript
- Added session expiration timestamp handling (expires_at)
- New transcript formatting with turn_is_formatted flag
- Added turn tracking with turn_order and end_of_turn fields
- New confidence scoring with end_of_turn_confidence
- Added Termination message with session statistics
- Error handling moved from message-based to WebSocket events
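The schema changes listed above could be handled along the following lines. This is a sketch, not the original sample. handleMessage and the shape of its return value are illustrative. Treating expires_at as Unix seconds is an assumption, as are the statistics field names audio_duration_seconds and session_duration_seconds.

```javascript
// Parse one raw v3 message and return a small summary object.
function handleMessage(raw) {
  const data = JSON.parse(raw);

  switch (data.type) {
    case "Begin":
      return {
        event: "begin",
        sessionId: data.id,
        // Assumption: expires_at is a Unix timestamp in seconds.
        expiresAt: new Date(data.expires_at * 1000),
      };
    case "Turn":
      return {
        event: "turn",
        order: data.turn_order,
        text: data.transcript || "",
        endOfTurn: Boolean(data.end_of_turn),
        formatted: Boolean(data.turn_is_formatted),
        confidence: data.end_of_turn_confidence,
      };
    case "Termination":
      return {
        event: "termination",
        // Assumed names for the session statistics fields.
        audioSeconds: data.audio_duration_seconds,
        sessionSeconds: data.session_duration_seconds,
      };
    default:
      // Unknown types are reported instead of throwing.
      return { event: "unknown", type: data.type };
  }
}

const sample = JSON.stringify({
  type: "Turn",
  turn_order: 0,
  transcript: "hello world",
  end_of_turn: true,
  turn_is_formatted: false,
  end_of_turn_confidence: 0.9,
});
console.log(handleMessage(sample).text); // hello world
```

In a real client this function would be called from the WebSocket message event, while connection failures arrive through the separate error and close events rather than as messages.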
4. Add graceful shutdown handling and improve error handling and logging
Before (v2):
After (v3):
Key Changes:
- Proper SIGINT (Ctrl+C) handling
- Graceful termination message sending
- Detailed error context and timestamps
- Proper exception type handling
- Resource cleanup on all error paths
- Connection status checking before operations
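A possible shape for the shutdown path described above is sketched below. requestTermination and registerShutdown are hypothetical helpers, stopAudio stands in for whatever releases your audio source, and the Terminate message body and 5-second fallback are assumptions to verify against the API reference.

```javascript
const WS_OPEN = 1; // value of WebSocket.OPEN

// Stop audio capture, then ask the server to end the session if the
// socket is still open. Returns true when a Terminate message was sent.
function requestTermination(ws, stopAudio) {
  try {
    stopAudio(); // release the audio source even if the socket is gone
  } catch (err) {
    console.error(`[${new Date().toISOString()}] audio cleanup failed:`, err);
  }

  if (ws && ws.readyState === WS_OPEN) {
    // Assumed v3 termination request body.
    ws.send(JSON.stringify({ type: "Terminate" }));
    return true;
  }
  return false;
}

// Run the cleanup once when the user presses Ctrl+C.
function registerShutdown(ws, stopAudio) {
  process.once("SIGINT", () => {
    requestTermination(ws, stopAudio);
    // Fallback exit if the server never closes the connection.
    setTimeout(() => process.exit(0), 5000).unref();
  });
}
```

Calling registerShutdown(ws, stopAudio) once after the connection opens would cover the Ctrl+C case. The same requestTermination helper can be reused from error and close handlers so every path releases the audio source.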
Migration checklist
- Update API endpoint from v2 to v3
- Update message type handling (Begin, Turn, Termination)
- Add proper resource cleanup in all code paths
- Update field names in message parsing
- Add graceful shutdown with termination messages
- Add detailed error logging with context
- Test SIGINT (Ctrl+C) handling
- Verify audio resource cleanup
- Test connection failure scenarios
Testing your migration
- Basic Functionality: Verify transcription works with simple speech
- Error Handling: Test with invalid API keys or network issues
- Graceful Shutdown: Test Ctrl+C interruption
- Resource Cleanup: Monitor for memory leaks during extended use
- Message Formatting: Test with format_turns enabled and disabled
Common migration issues
Issue: “WebSocket connection failed”
Solution: Verify you’re using the new v3 endpoint URL and proper authentication header format.
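A quick sanity check for the endpoint part of this fix might look like the sketch below. checkEndpoint is a hypothetical helper. It only compares the host and path against the v3 values named in this guide and does not validate authentication.

```javascript
// Return a list of problems found in a streaming URL (empty if none).
function checkEndpoint(url) {
  const problems = [];
  const parsed = new URL(url);

  if (parsed.hostname !== "streaming.assemblyai.com") {
    problems.push(`unexpected host: ${parsed.hostname}`);
  }
  if (parsed.pathname !== "/v3/ws") {
    problems.push(`unexpected path: ${parsed.pathname}`);
  }
  return problems;
}

// A leftover v2 URL is flagged on both host and path.
console.log(checkEndpoint("wss://api.assemblyai.com/v2/realtime/ws?sample_rate=16000"));
```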
Issue: “Message type not recognized”
Solution: Update message type handling from old names (SessionBegins, PartialTranscript) to new ones (Begin, Turn).
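While tracking down handlers that still branch on v2 names, a small lookup such as the following can help. LEGACY_TO_V3 and describeMessageType are illustrative names, and the mapping simply restates the renames listed in this guide.

```javascript
// v2 message type names and their v3 replacements.
const LEGACY_TO_V3 = {
  SessionBegins: "Begin",
  PartialTranscript: "Turn",
  FinalTranscript: "Turn",
};

// Report which schema a parsed message uses and its v3 type name.
function describeMessageType(message) {
  if (typeof message.type === "string") {
    return { schema: "v3", type: message.type };
  }
  if (typeof message.message_type === "string") {
    // v2 messages carry the type in message_type instead of type.
    return { schema: "v2", type: LEGACY_TO_V3[message.message_type] || null };
  }
  return { schema: "unknown", type: null };
}

console.log(describeMessageType({ message_type: "FinalTranscript" }));
```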
Benefits of migration
- Improved Reliability: Better error handling and recovery
- Lower Latency: Reduced buffer sizes for faster response
- Enhanced Features: Formatted transcripts and session statistics
- Better Resource Management: Proper cleanup prevents memory leaks
- Graceful Shutdown: Clean termination with proper cleanup
Conclusion
This migration provides a more robust, maintainable, and feature-rich streaming transcription implementation. The enhanced error handling, resource management, and modern API features make it suitable for production use cases where reliability and performance are critical.