Improved Real-Time Transcription Speed and Accuracy


Accuracy Update

Over the past few months, we've seen a huge uptick in developers looking to build Real-Time Transcription into their applications and products. Real-Time Transcription is powering innovative accessibility features like closed captioning of online events, live coaching on customer support and sales calls to help improve the customer's experience, and a slew of other interesting use cases and applications.

At AssemblyAI, we're focused on building the easiest-to-implement and most accurate API for automatic speech recognition. That's why we have a great team of Speech Scientists and Deep Learning Engineers focused on rapidly improving the performance and accuracy of our models. Today we're excited to release an improvement to our Real-Time WebSocket Transcription API that returns faster and more accurate results for developers.

New Helper Libraries

As part of this release, we're also working on launching more sample code and helper libraries for implementing our WebSocket API. A great example is our new demo, which shows how to stream audio from the browser to our WebSocket API using JavaScript and WebRTC. All the code for this demo is open source, so you can easily use it as inspiration for your own applications.

The above GIF shows a glimpse of just how fast and accurate our real-time API can stream transcription results back!
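
To give a feel for what browser-side streaming involves, here is a minimal sketch (not the demo's actual code). It assumes the audio arrives as Float32 samples, that the API expects 16-bit PCM, and that each chunk is sent as a JSON message with a base64 `audio_data` field. That field name and the helper names below are assumptions for illustration, so check the API Docs for the exact message format and sample rate.

```javascript
// Convert Float32 samples in [-1, 1] into 16-bit signed little-endian PCM bytes.
function floatTo16BitPCM(samples) {
  const view = new DataView(new ArrayBuffer(samples.length * 2));
  for (let i = 0; i < samples.length; i++) {
    // Clamp first so out-of-range samples do not wrap around.
    const s = Math.max(-1, Math.min(1, samples[i]));
    view.setInt16(i * 2, s < 0 ? s * 0x8000 : s * 0x7fff, true);
  }
  return new Uint8Array(view.buffer);
}

// Base64-encode raw bytes (btoa is available in browsers and recent Node.js).
function toBase64(bytes) {
  let binary = "";
  for (const b of bytes) binary += String.fromCharCode(b);
  return btoa(binary);
}

// Build one JSON message for a chunk of audio.
// ASSUMPTION: the "audio_data" field name is illustrative; verify it in the API Docs.
function buildAudioMessage(samples) {
  return JSON.stringify({ audio_data: toBase64(floatTo16BitPCM(samples)) });
}

// Send a chunk only if the socket is open (readyState 1 means OPEN for a WebSocket).
function streamChunk(socket, samples) {
  if (socket.readyState === 1) {
    socket.send(buildAudioMessage(samples));
  }
}
```

In a real page, `socket` would be a `WebSocket` connected to the real-time endpoint, and `samples` would come from the microphone capture pipeline. Keeping the encoding in small pure functions makes it easy to test without a live connection.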

Wrapping Up

For more information about our Real-Time WebSocket API, you can check out the public API Docs. You can also write to us any time at support@assemblyai.com, or at @AssemblyAI on Twitter!

