Newsletter #30: 🚀Universal-1 Model Launch

Hey 👋, this weekly update contains the latest info on our new product features, tutorials, and our community.

🚀Universal-1 Model Launch

This week, we’ve launched Universal-1, our most powerful and accurate Speech-to-Text model to date, trained on 12.5M hours of multilingual audio data. 

Key Highlights of Universal-1:

  • 71% better speaker count estimation and 14% better word timestamp estimation compared to our prior models.
  • Up to 30% fewer hallucinations compared to Whisper Large-v3, ensuring cleaner, more reliable transcriptions.
  • Over 22% more accurate than speech-to-text APIs from Azure, AWS, and Google.
  • Ability to code-switch, transcribing multiple languages within a single audio file.
  • Processes an hour of audio in just 38 seconds. ⚡️
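
To put the 38-second figure in perspective, here is a quick back-of-the-envelope sketch in Python. The function names are ours, for illustration only, and the estimate for other file lengths assumes processing time scales linearly with audio duration, which is our assumption and not a published guarantee.

```python
# Rough arithmetic on the "one hour of audio in 38 seconds" figure.
AUDIO_SECONDS = 60 * 60      # one hour of audio
PROCESSING_SECONDS = 38      # processing time quoted in the highlights


def speedup(audio_s: float, processing_s: float) -> float:
    """Seconds of audio processed per second of processing time."""
    return audio_s / processing_s


def estimate_processing_seconds(audio_s: float) -> float:
    """Estimated processing time for a file of the given length.

    Assumes linear scaling with duration (an illustrative assumption).
    """
    return audio_s * PROCESSING_SECONDS / AUDIO_SECONDS


if __name__ == "__main__":
    factor = speedup(AUDIO_SECONDS, PROCESSING_SECONDS)
    print(f"Roughly {factor:.0f}x faster than real time")
    print(f"10-minute file: about {estimate_processing_seconds(600):.1f} s")
```

In other words, the quoted figure works out to roughly 95 times faster than real time.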

Check out our docs to start building with Universal-1.

Fresh From Our Blog

Transcribe an audio file with Universal-1 using Go: Learn how to transcribe an audio file in your Go applications with industry-leading accuracy using Universal-1. Read more>>

Automatically redact PII from audio and video with Python: Learn how to automatically redact Personally Identifiable Information (PII) from audio and video files in 5 minutes using Python and AssemblyAI. Read more>>

How to do Speech-To-Text with Go: Integrate speech recognition into your Go application in only a few lines of code. Read more>>

This new model is transforming Speech AI: Accurate, Fast, Cost-Effective: AssemblyAI just launched Universal-1, our most capable and highly trained speech recognition model. 

Coding an AI Voice Bot from Scratch: Real-Time Conversation with Python: Learn how to build a real-time AI voice assistant using Python that transcribes real-time speech, generates AI responses, and provides a human-like conversational experience.

How to Build a RAG Application for Multi-Speaker Audio Data: Learn how to build a RAG application in 10 minutes that can take multiple speakers into account when answering a question.