Deepgram vs. AssemblyAI
Learn why customers choose AssemblyAI to build powerful speech-to-text products that exceed industry standards.
• Best-in-class Speech AI models
• Trusted by developers
• Seamless API with no-code updates
Thank you for calling Acme Corporation, Sarah speaking. How may I assist you today?
Hi Sarah, this is John. I’m having trouble with my Acme Widget. It seems to be malfunctioning.
I’m sorry to hear that, John. Let’s get that sorted out for you. Could you please provide me with the serial number of your widget?
Thank you, John. Now, could you describe the issue you’re experiencing with your widget?
Well, it’s not turning on at all, even though I’ve replaced the batteries.
Let’s try a few troubleshooting steps. Have you checked if the batteries are inserted correctly?
Yes, I’ve double-checked that.
AssemblyAI is more than accurate—we’re preferred.
Universal-2 is the most preferred model to date. Before that? Universal-1 took the cake. We’ve made a habit out of making models people love.
At a glance: Deepgram vs. AssemblyAI
The most accurate speech-to-text models on the market with top performance rankings across major industry benchmarks.
Feature | AssemblyAI Universal-2 | Deepgram Nova-2 |
---|---|---|
Word Accuracy Rate | 93.3% | 90.76% |
Word Error Rate (English) | 6.7% | 9.24% |
Proper Nouns (PWER) | 13.87% | 21.14% |
Alphanumerics (WER) | 4.00% | 4.97% |
Text Formatting (U-WER) | 10.06% | 21.14% |
Accented Speech (WER) | 11.05% | 13.26% |
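For readers unfamiliar with the metric, the first two rows are two views of the same number: word accuracy is 100% minus word error rate (93.3% = 100% - 6.7%, and 90.76% = 100% - 9.24%). Below is a minimal sketch of the standard WER definition, word-level edit distance divided by the number of reference words. The `word_error_rate` helper and the sample sentences are illustrative assumptions, and the text normalization used for the benchmark figures above is not stated here, so this will not reproduce those numbers exactly.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing
                curr[j - 1] + 1,     # insertion: extra hypothesis word
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1] / len(ref)


# Illustrative sentences (not benchmark data): one dropped word, one wrong word.
reference = "could you please provide me with the serial number of your widget"
hypothesis = "could you please provide me the cereal number of your widget"

wer = word_error_rate(reference, hypothesis)
accuracy = 1 - wer
print(f"WER: {wer:.2%}, word accuracy: {accuracy:.2%}")
```

Here the hypothesis has two errors against twelve reference words, so lower WER and higher word accuracy always move together.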
The accuracy is better than any other tools in the market (and we have tried them all). Highly recommend!
Vedant Maheshwari, CEO at Vidyo
The most robust feature set on the market
We’re more than accurate. AssemblyAI offers advanced features you won’t find anywhere else.
Async Speech-to-Text
The AssemblyAI API can transcribe pre-recorded audio and/or video files in seconds, with human-level accuracy. Highly scalable to tens of thousands of files in parallel.
Speaker Diarization
Detect the number of speakers in your audio file, with each word in the text associated with its speaker.
Automatic Language Detection
Automatically detect if the dominant language of the spoken audio is supported by our API and route it to the appropriate model for transcription.
International Language Support
Transcribe audio in 99+ languages and counting, including Global English (English and all of its accents).
Word Timings
View word-by-word timestamps across the entire transcript text.
Auto Punctuation and Casing
Automatically add punctuation and casing, including capitalization of proper nouns, to the transcription text.
Custom Vocabulary
Boost accuracy for vocabulary that is unique or custom to your specific use case or product.
Profanity Filtering
Detect and replace profanity in the transcription text with ease.
Confidence Scores
Get a confidence score for each word in the transcript.
Filler Words
Optionally include disfluencies in the transcripts of your audio files.
Custom Spelling
Specify how you would like certain words to be spelled or formatted in the transcription text.
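As a rough illustration of how several of the features above might be switched on together, the sketch below builds (but does not send) a JSON request body. The parameter names (`speaker_labels`, `language_detection`, `punctuate`, `format_text`, `word_boost`, `filter_profanity`, `disfluencies`, `custom_spelling`) reflect our understanding of the transcript endpoint and should be treated as assumptions to verify against the current API reference. The `build_transcript_request` helper, the audio URL, and the vocabulary values are hypothetical.

```python
import json


def build_transcript_request(audio_url: str) -> dict:
    """Hypothetical helper: assemble a request body with common features enabled.

    Parameter names are assumptions; check them against the API reference.
    """
    return {
        "audio_url": audio_url,
        "speaker_labels": True,         # Speaker Diarization
        "language_detection": True,     # Automatic Language Detection
        "punctuate": True,              # Auto Punctuation
        "format_text": True,            # Casing and text formatting
        "word_boost": ["Acme Widget"],  # Custom Vocabulary (illustrative term)
        "filter_profanity": True,       # Profanity Filtering
        "disfluencies": True,           # Filler Words kept in the transcript
        "custom_spelling": [            # Custom Spelling (illustrative mapping)
            {"from": ["acme"], "to": "Acme"},
        ],
    }


# Placeholder URL; nothing is sent over the network in this sketch.
payload = build_transcript_request("https://example.com/call.mp3")
print(json.dumps(payload, indent=2))
```

Word timings and confidence scores need no flag in this sketch because they appear on each entry of the `words` array in the sample response.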
Join 200K+ developers building new experiences with voice data
Learn why they choose us.
Ryan Johnson
Chief Product Officer at CallRail
"Partnering with AssemblyAI has made it easy for us to deliver world-class voice intelligence powered by market-leading speech-to-text technology."
Vedant Maheshwari
CEO at Vidyo
"We have had a phenomenal experience so far. The integration was simple and easy for developers to get started. The accuracy is better than any other tools in the market (and we have tried them all). Highly recommend!"
Tom Lavery
Founder & CEO at Jiminny
"AssemblyAI has a real high-touch personal service. It’s a great partnership—we’re very collaborative and get to test new AI models early. AssemblyAI is really pushing boundaries, helping us create a well-rounded Conversation Intelligence platform."
Alexander Kvamme
Co-founder & CEO at EchoAI
"Works incredibly well out of the box. Allowed us to focus on product instead of infrastructure. As a result, we were able to bring a transformative new product to market in half the time."
I’ve tested many speech-to-text APIs (Google, AWS, IBM) and AssemblyAI consistently wins. Highly recommend for devs.
Developer & Co-founder
Nathan Webb
Product Manager at Aloware
"The accuracy was strong, but the great documentation and unique models like Auto Chapters and Sentiment Analysis is what really won us over."
Learn how Veed.io helps users produce high-quality videos.
Get started in seconds
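import assemblyai as aai
aai.settings.api_key = "YOUR_API_KEY"  # replace with your AssemblyAI API key
URL = "https://example.com/audio.mp3"  # placeholder: a publicly accessible audio file
config = aai.TranscriptionConfig(speaker_labels=True)  # label each word with its speaker
transcriber = aai.Transcriber()
transcript = transcriber.transcribe(URL, config)  # blocks until the transcript is ready
print(transcript.text)  # transcript text only; the full JSON response is shown below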
{
  "id": "6rlr37h8f4-e310-4e23-bbf3-ea5f347dc684",
  "language_code": "en_us",
  "status": "completed",
  "text": "Runner's knee is a condition characterized by pain behind or around the kneecap...",
  "confidence": 0.98122,
  "audio_duration": 3200,
  "words": [
    { "text": "Runner's", "start": 0, "end": 550, "speaker": "A", "confidence": 0.98113 },
    { "text": "knee", "start": 580, "end": 1130, "speaker": "A", "confidence": 0.95417 }
  ]
}
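Once a response like the one above comes back, the `words` array is usually the most useful part to work with. The sketch below shows one possible way to merge consecutive words into speaker turns and to flag words worth a manual review. It assumes `start` and `end` are in milliseconds; the `speaker_turns` and `low_confidence` helpers and the 0.96 threshold are hypothetical choices for illustration, not part of any SDK.

```python
from itertools import groupby

# Word entries copied from the sample response above.
words = [
    {"text": "Runner's", "start": 0, "end": 550, "speaker": "A", "confidence": 0.98113},
    {"text": "knee", "start": 580, "end": 1130, "speaker": "A", "confidence": 0.95417},
]


def speaker_turns(words):
    """Merge consecutive words from the same speaker into a single turn."""
    turns = []
    for speaker, group in groupby(words, key=lambda w: w["speaker"]):
        group = list(group)
        turns.append({
            "speaker": speaker,
            "start": group[0]["start"],  # start of the first word in the turn
            "end": group[-1]["end"],     # end of the last word in the turn
            "text": " ".join(w["text"] for w in group),
        })
    return turns


def low_confidence(words, threshold=0.96):
    """Return the text of words whose confidence is below the threshold."""
    return [w["text"] for w in words if w["confidence"] < threshold]


for turn in speaker_turns(words):
    # Assumes timestamps are milliseconds from the start of the audio.
    print(f'Speaker {turn["speaker"]} [{turn["start"]}-{turn["end"]} ms]: {turn["text"]}')
print("Needs review:", low_confidence(words))
```

With the two sample words, this yields a single turn for speaker A and flags "knee" for review, since its confidence falls under the illustrative threshold.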