2021 at AssemblyAI - A Year in Review

The end of 2021 is almost upon us - and it’s been a good one here at AssemblyAI! Read on for a summary of all that’s happened in 2021 at AssemblyAI.

We’ve worked hard and delivered some of our best updates and new features yet.

v8 Transcription Model with up to 18.72% Higher Accuracy

In October, we released our most accurate Speech Recognition model to date. v8 delivers up to 18.72% better accuracy across all types of audio and video data, and proper noun accuracy increased by 24.47%.

Read more about v8 here.

Our 2021 Accuracy Benchmark Report showcases these accuracy gains by comparing our transcription accuracy to Google Cloud Speech-to-Text and AWS Transcribe, as well as providing demonstrations of our features in action.

Check out the full Benchmark Report here.

New Features

We released 9 major new features in 2021, as well as countless updates to others.

1. Real-time Transcription

If you're working with live audio, we can stream your transcripts to you within a few hundred milliseconds and then revise them with greater accuracy as more context arrives. Learn more about our real-time transcription API here.

2. Entity Detection

Our Entity Detection feature automatically detects a wide range of entities found in your transcription text such as names, addresses, phone numbers, social security numbers, locations, and more. Learn more about our Entity Detection feature here.

3. Auto Chapters (Summarization)

Our Auto Chapters feature provides a "summary over time" for audio content by first breaking audio/video files into logical "chapters" as the topic of conversation changes, and then providing an automatically generated summary for each "chapter" of content. Read more about Auto Chapters here.

4. Sentiment Analysis

Our Sentiment Analysis feature detects the sentiment of each sentence spoken in your audio files as either “positive,” “negative,” or “neutral”. Learn more about Sentiment Analysis here.

5. Filler Words

Filler words like "um" and "uh" can now be included in your transcription text with high accuracy. Read more about how to use our Disfluencies feature here.

6. Severity Scores for Content Safety

Severity Scores works with our Content Safety feature to measure how intense a detected Content Safety label is on a scale of 0 to 1. Read more about both Content Safety and Severity Scores here.
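As a rough illustration of how a 0-to-1 severity score might be used downstream, here is a minimal sketch that keeps only the labels at or above a chosen threshold. The record layout and the `filter_by_severity` helper are hypothetical and are not the actual API response format.

```python
# Hypothetical label records. The field names are illustrative only and
# are NOT the real Content Safety response schema.
labels = [
    {"label": "profanity", "severity": 0.82},
    {"label": "alcohol", "severity": 0.15},
    {"label": "weapons", "severity": 0.47},
]


def filter_by_severity(labels, threshold=0.5):
    """Return the labels whose severity (0 to 1) is at or above the threshold."""
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0 and 1")
    return [item for item in labels if item["severity"] >= threshold]


# Keep only the labels that are at least moderately severe.
flagged = filter_by_severity(labels, threshold=0.4)
print([item["label"] for item in flagged])
```

A lower threshold surfaces more content for review, while a higher one flags only the most intense matches.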

7. Word Search

Search completed transcripts for a set of specific keywords. Learn how to use the Word Search feature here.
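To make the idea of keyword search over a finished transcript concrete, here is a minimal local sketch. It assumes you already have the transcript text as a string. The `word_search` function is a hypothetical helper written for this example, not the API endpoint itself.

```python
import re
from collections import defaultdict


def word_search(transcript_text, keywords):
    """Map each keyword found to the 0-based word positions where it occurs."""
    wanted = {keyword.lower() for keyword in keywords}
    hits = defaultdict(list)
    # Split into lowercase word tokens, dropping punctuation.
    words = re.findall(r"[A-Za-z0-9']+", transcript_text.lower())
    for index, word in enumerate(words):
        if word in wanted:
            hits[word].append(index)
    return dict(hits)


text = "Thanks for calling. Your refund was issued today, and the refund should arrive soon."
print(word_search(text, ["refund", "invoice"]))
```

Keywords that never appear are simply left out of the result, so an empty dictionary means no matches.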

8. Paragraph Detection

Break your transcription into automatically generated paragraphs for easier reading and comprehension. Learn how to use Paragraph Detection here.

9. Usage Alerts

Our Usage Alert feature now lets customers set a monthly usage threshold on their account, along with a list of email addresses to be notified when that monthly threshold has been exceeded. This feature can be enabled by clicking “Set up alerts” on the “Developers” tab in the Dashboard.

Make sure you subscribe to our weekly updates via our Changelog to stay up to date on what's new, including soon-to-be-released features like Emotion Detection and Translation.

More Updates

2021 also saw additional product updates, such as a much-improved public changelog.

We also launched a new developer dashboard that offers real-time usage and spend data, as well as a web interface for transcribing a video or audio file from your browser, so that you can quickly try AssemblyAI’s models without having to write any code.

Our social media accounts grew too! We launched a new YouTube channel featuring original Deep Learning content and tutorials. It quickly grew to over 300 subscribers in under 30 days!

You can also find us on Twitter, Instagram, and TikTok.


We kicked off the year by publishing an overview of End-to-End Architectures for Speech Recognition (2021). In this review, we outlined the major differences between popular, modern architectures for Automatic Speech Recognition, including LAS, CTC, and RNN-T.

Later in the year, we published a deep dive into how Transducer models (2021) can be used for Automatic Speech Recognition with high accuracy, and went into more detail about how they compare to the more common CTC model architecture.

As a startup building large scale Transformer models, we shared some of the lessons we’ve learned and tips for other startups in the AI space. We also started our popular Weekly Deep Learning Paper Review series, in which our research team provides commentary on exciting new research from the broader AI research community.


AssemblyAI was recognized as both a Fall and Winter 2021 High Performer and Momentum Leader on G2!

The High Performer award recognizes products with high customer satisfaction scores. The Momentum Leader award recognizes products ranked in the top 25% of their category. AssemblyAI has an average rating of 4.8 out of 5 stars on the G2 platform.

Top Blog Posts

We produced 81 pieces of blog content in 2021!

Here were the top 10 blog posts:

A Year Wrapped

We can’t wait to hit the ground running in 2022! Thank you for coming along with us on this exciting journey!