Industry-leading transcription accuracy
Conformer-2: a state-of-the-art speech recognition model
Conformer-2 is our latest AI model for automatic speech recognition. Trained on 1.1M hours of English audio data, it extends Conformer-1 with improvements on proper nouns, alphanumerics, and robustness to noise.
See how Conformer-2 works
Is it going to be a first world championship for Verstappen? Is it going to be an 8th world championship for Lewis Hamilton? Where can Verstappen try and get past Hamilton? First overtaking zone is normally down into turn five. Is Verstappen far enough back? He's going to make the lunge down the inside.
Hamilton sees it coming. It's a late lunge by Verstappen who takes the lead of the race. Verstappen now snatches the championship trophy from Lewis Hamilton who's trying to fight back.
No DRS for two laps, so Lewis Hamilton will not get the rear wing open. Now he's going to go down the outside if Verstappen keeps it tight and neat. But he hasn't.
He's gone a little bit wide.
Every feature needed to transcribe audio
The AssemblyAI API transcribes pre-recorded audio and video files in seconds with human-level accuracy, and scales to tens of thousands of files in parallel.
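A minimal sketch of that flow using only Python's standard library: submit a job, then poll until it completes. The endpoint paths and field names follow AssemblyAI's v2 REST API; the API key and audio URL are placeholders.

```python
import json
import time
import urllib.request

API_BASE = "https://api.assemblyai.com/v2"
API_KEY = "YOUR_API_KEY"  # placeholder


def build_transcript_request(audio_url, **options):
    """Build the JSON body for POST /v2/transcript; options are optional
    feature flags such as speaker_labels or word_boost."""
    return {"audio_url": audio_url, **options}


def _call(method, path, body=None):
    """Minimal stdlib JSON helper (a library like requests works equally well)."""
    data = json.dumps(body).encode() if body is not None else None
    req = urllib.request.Request(
        API_BASE + path,
        data=data,
        method=method,
        headers={"authorization": API_KEY, "content-type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def transcribe(audio_url, **options):
    """Submit a pre-recorded file by URL and poll until the job finishes."""
    job = _call("POST", "/transcript", build_transcript_request(audio_url, **options))
    while True:
        result = _call("GET", "/transcript/" + job["id"])
        if result["status"] in ("completed", "error"):
            return result
        time.sleep(3)  # most files finish within seconds
```

Calling `transcribe("https://example.com/race.mp3")` returns the completed transcript object, whose `text` field holds the full transcription.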
Boost accuracy for vocabulary that is unique or custom to your specific use case or product.
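Custom vocabulary is requested through the `word_boost` and `boost_param` parameters of the v2 transcript endpoint; the audio URL below is a placeholder.

```python
# Request body for POST /v2/transcript with custom vocabulary boosted.
payload = {
    "audio_url": "https://example.com/race.mp3",  # placeholder
    "word_boost": ["Verstappen", "Hamilton", "DRS"],  # domain-specific terms
    "boost_param": "high",  # boost strength: "low", "default", or "high"
}
```

Boosting race-specific names like these helps the model prefer them over acoustically similar common words.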
Automatically detect the number of speakers in your audio file, and each word in the transcription text can be associated with its speaker.
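Speaker detection is enabled with the `speaker_labels` flag, and the completed transcript then includes an `utterances` list attributing each segment to a speaker. A sketch, with a hypothetical sample response fragment:

```python
# Request body enabling speaker detection.
payload = {
    "audio_url": "https://example.com/race.mp3",  # placeholder
    "speaker_labels": True,
}

# Shape of the "utterances" field in a completed response (sample data).
sample_utterances = [
    {"speaker": "A", "text": "Is Verstappen far enough back?"},
    {"speaker": "B", "text": "He's going to make the lunge down the inside."},
]


def lines_by_speaker(utterances):
    """Group utterance texts by their detected speaker label."""
    grouped = {}
    for u in utterances:
        grouped.setdefault(u["speaker"], []).append(u["text"])
    return grouped
```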
International language support
Transcribe audio in more than 16 languages and counting, including Global English (English and all of its accents).
Automatic punctuation and casing
Automatically add punctuation and casing, including capitalization of proper nouns, to the transcription text.
Get a confidence score for each word in the transcript.
Word-by-word timestamps across the entire transcript text.
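In a completed transcript, each entry in the `words` list carries its text, start/end timestamps in milliseconds, and a confidence score between 0 and 1. A sketch of filtering on those fields, using a hypothetical sample fragment:

```python
# Shape of the per-word results in a completed transcript (sample data).
sample_words = [
    {"text": "Hamilton", "start": 120, "end": 560, "confidence": 0.98},
    {"text": "sees", "start": 580, "end": 790, "confidence": 0.99},
]


def low_confidence_words(words, threshold=0.9):
    """Return word texts whose confidence falls below a review threshold."""
    return [w["text"] for w in words if w["confidence"] < threshold]
```

A pipeline might flag low-confidence words for human review, or use the timestamps to build subtitles.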
Optionally include disfluencies in the transcripts of your audio files.
Automatically detect and replace profanity in the transcription text.
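Both options above are simple boolean flags on the v2 transcript request; the audio URL is a placeholder.

```python
# Request body keeping filler words and masking profanity.
payload = {
    "audio_url": "https://example.com/interview.mp3",  # placeholder
    "disfluencies": True,       # keep filler words such as "um" and "uh"
    "filter_profanity": True,   # replace detected profanity in the text
}
```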
Automatic language detection
Automatically detect if the dominant language of the spoken audio is supported by our API and route it to the appropriate model for transcription.
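Detection is switched on with the `language_detection` flag; alternatively, when the language is known in advance, it can be pinned with `language_code`. Both bodies below use placeholder URLs.

```python
# Let the API detect the dominant spoken language and route accordingly.
auto_detect = {
    "audio_url": "https://example.com/audio.mp3",  # placeholder
    "language_detection": True,
}

# Or pin a known language explicitly, e.g. Spanish.
pinned = {
    "audio_url": "https://example.com/audio_es.mp3",  # placeholder
    "language_code": "es",
}
```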
Specify how you would like certain words to be spelled or formatted in the transcription text.
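Custom spelling is expressed as a list of `from`/`to` rules in the request body, each mapping one or more spoken forms to the desired written form; the rules below are illustrative.

```python
# Request body with custom spelling rules for POST /v2/transcript.
payload = {
    "audio_url": "https://example.com/audio.mp3",  # placeholder
    "custom_spelling": [
        # "from" lists spoken variants; "to" is the spelling to emit.
        {"from": ["assembly ai", "assemblyai"], "to": "AssemblyAI"},
        {"from": ["drs"], "to": "DRS"},
    ],
}
```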
Weekly product and accuracy improvements
- Pricing decreases
- Significant Summarization model speedups
- Introducing LeMUR, the easiest way to build LLM apps on spoken data