Why are my Streaming Transcription results repeating?
Streaming Transcription is designed for low latency: audio and transcription data are continuously streamed back and forth over a WebSocket. Our Streaming Transcription pipeline uses a two-phase transcription strategy, broken into partial and final results. Partial transcripts are sent repeatedly as the utterance is built out, and the Final Transcript is sent once the end of the utterance has been detected. In the Final Transcript, the model finalizes the results with higher accuracy and adds punctuation and casing to the transcription text.
Because each partial transcript contains the utterance as built out so far, this behavior can sometimes be perceived as repeated text.
Depending on your use case, you can set your message_type parameter to PartialTranscript, FinalTranscript, or both.
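As an illustration of why partial results can look like repeats, here is a minimal Python sketch that processes a stream of messages on the client side. It assumes each message is a JSON object with a message_type field and a text field; the collect_transcript helper and the sample messages are hypothetical and are not part of any SDK. The idea is that each partial transcript replaces the previous one rather than being appended, and only final transcripts are kept.

```python
import json


def collect_transcript(raw_messages):
    """Return (finalized_text, pending_partial) after reading messages in order.

    Assumes every message is a JSON string with "message_type" and "text" keys.
    """
    finals = []
    partial = ""
    for raw in raw_messages:
        msg = json.loads(raw)
        kind = msg.get("message_type")
        text = msg.get("text", "")
        if kind == "PartialTranscript":
            # A partial supersedes the previous partial for the same utterance.
            # Appending it instead would show the same words several times.
            partial = text
        elif kind == "FinalTranscript":
            # The final result closes the utterance, so the partial is cleared.
            finals.append(text)
            partial = ""
    return " ".join(finals), partial


# Hypothetical messages, shaped like the two-phase output described above.
sample = [
    json.dumps({"message_type": "PartialTranscript", "text": "hello"}),
    json.dumps({"message_type": "PartialTranscript", "text": "hello world"}),
    json.dumps({"message_type": "FinalTranscript", "text": "Hello world."}),
    json.dumps({"message_type": "PartialTranscript", "text": "how are"}),
]

final_text, pending = collect_transcript(sample)
print(final_text)
print(pending)
```

In a live application the same branching would sit inside your WebSocket message handler: show the pending partial as a temporary line that gets overwritten, and commit text only when a final transcript arrives.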