Deepgram vs. AssemblyAI

Learn why customers choose AssemblyAI to build powerful speech-to-text products that exceed industry standards.


• Best-in-class Speech AI models
• Trusted by developers
• Seamless API with no-code updates

Call Transcript (04.03.2024)

Thank you for calling Acme Corporation, Sarah speaking. How may I assist you today?


Hi Sarah, this is John. I’m having trouble with my Acme Widget. It seems to be malfunctioning.


I’m sorry to hear that, John. Let’s get that sorted out for you. Could you please provide me with the serial number of your widget?


Thank you, John. Now, could you describe the issue you’re experiencing with your widget?


Well, it’s not turning on at all, even though I’ve replaced the batteries.


Let’s try a few troubleshooting steps. Have you checked if the batteries are inserted correctly?


Yes, I’ve double-checked that.

AssemblyAI is more than accurate—we’re preferred.

Universal-2 is the most preferred model to date. Before that? Universal-1 took the cake. We’ve made a habit out of making models people love.

At a glance: Deepgram vs. AssemblyAI

The most accurate speech-to-text models on the market with top performance rankings across major industry benchmarks.

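| Feature                   | AssemblyAI Universal-2 | Deepgram Nova-2 |
|---------------------------|------------------------|-----------------|
| Word Accuracy Rate        | 93.3%                  | 90.76%          |
| Word Error Rate (English) | 6.7%                   | 9.24%           |
| Proper Nouns (PWER)       | 13.87%                 | 21.14%          |
| Alphanumerics (WER)       | 4.00%                  | 4.97%           |
| Text Formatting (U-WER)   | 10.06%                 | 21.14%          |
| Accented Speech (WER)     | 11.05%                 | 13.26%          |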
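
The Word Accuracy Rate and Word Error Rate figures above fit the usual relationship accuracy = 1 - WER (93.3% + 6.7% = 100%, and 90.76% + 9.24% = 100%). As a rough illustration of what the metric measures, WER is commonly computed as the word-level edit distance between a reference transcript and the model output, divided by the number of reference words. The sketch below assumes plain lowercasing and whitespace tokenization; the normalization behind the benchmark figures above is not stated on this page, and the function name `word_error_rate` is made up for this example.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Illustrative only: tokenization is lowercase + whitespace split, which is
    an assumption and may differ from how published benchmarks normalize text.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] = edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


reference = "could you describe the issue with your widget"
hypothesis = "could you describe the issues with your widget"

wer = word_error_rate(reference, hypothesis)
print(f"WER: {wer:.1%}")                 # WER: 12.5%
print(f"Word accuracy: {1 - wer:.1%}")   # Word accuracy: 87.5%
```

One substituted word out of eight reference words gives a WER of 12.5%, so lower is better for WER and higher is better for word accuracy.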

The accuracy is better than any other tools in the market (and we have tried them all). Highly recommend!

Vedant Maheshwari, CEO at Vidyo

The most robust feature set on the market

We’re more than accurate. AssemblyAI offers advanced features you won’t find anywhere else.

Async Speech-to-Text

The AssemblyAI API can transcribe pre-recorded audio and/or video files in seconds, with human-level accuracy. Highly scalable to tens of thousands of files in parallel.

Speaker Diarization

Detect the number of speakers in your audio file, with each word in the text associated with its speaker.

Automatic Language Detection

Automatically detect if the dominant language of the spoken audio is supported by our API and route it to the appropriate model for transcription.

International Language Support

Transcribe audio in 99+ languages and counting, including Global English (English and all of its accents).

Word Timings

View word-by-word timestamps across the entire transcript text.

Auto Punctuation and Casing

Automatically add punctuation and casing, including capitalization of proper nouns, to the transcription text.

Custom Vocabulary

Boost accuracy for vocabulary that is unique or custom to your specific use case or product.

Profanity Filtering

Detect and replace profanity in the transcription text with ease.

Confidence Scores

Get a confidence score for each word in the transcript.

Filler Words

Optionally include disfluencies in the transcripts of your audio files.

Custom Spelling

Specify how you would like certain words to be spelled or formatted in the transcription text.

Sign up to test

Join 200K+ developers building new experiences with voice data

Learn why they choose us.

Ryan Johnson

Chief Product Officer at CallRail

"Partnering with AssemblyAI has made it easy for us to deliver world-class voice intelligence powered by market-leading speech-to-text technology."

Vedant Maheshwari

CEO at Vidyo

"We have had a phenomenal experience so far. The integration was simple and easy for developers to get started. The accuracy is better than any other tools in the market (and we have tried them all). Highly recommend!"

Tom Lavery

Founder & CEO at Jiminny

"AssemblyAI has a real high-touch personal service. It’s a great partnership—we’re very collaborative and get to test new AI models early. AssemblyAI is really pushing boundaries, helping us create a well-rounded Conversation Intelligence platform."

Alexander Kvamme

Co-founder & CEO at EchoAI

"Works incredibly well out of the box. Allowed us to focus on product instead of infrastructure. As a result, we were able to bring a transformative new product to market in half the time."

I’ve tested many speech-to-text APIs (Google, AWS, IBM) and AssemblyAI consistently wins. Highly recommend for devs.

Nico R.

Developer & Co-founder

Nathan Webb

Product Manager at Aloware

"The accuracy was strong, but the great documentation and unique models like Auto Chapters and Sentiment Analysis is what really won us over."

Learn how Veed.io helps users produce high-quality videos.

START BUILDING WITH AI

Get started in seconds

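import assemblyai as aai

# Placeholder values: replace with your own API key and audio file URL or path.
aai.settings.api_key = "YOUR_API_KEY"
URL = "https://example.com/audio.mp3"
config = aai.TranscriptionConfig(speaker_labels=True)  # tag each word with its speaker

transcriber = aai.Transcriber()
transcript = transcriber.transcribe(URL, config)

print(transcript.text)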
{
  "id": "6rlr37h8f4-e310-4e23-bbf3-ea5f347dc684",
  "language_code": "en_us",
  "status": "completed",
  "text": "Runner's knee is a condition characterized by pain behind or around the kneecap...",
  "confidence": 0.98122,
  "audio_duration": 3200,
  "words": [
    { "text": "Runner's", "start": 0, "end": 550, "speaker": "A", "confidence": 0.98113 },
    { "text": "knee", "start": 580, "end": 1130, "speaker": "A", "confidence": 0.95417 }
  ]
}
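
The `words` array in a response like the one above carries the word timings, speaker labels, and confidence scores described earlier. As a minimal sketch of how that data might be used, the snippet below groups consecutive words into speaker turns and flags low-confidence words for review. It works on a trimmed copy of the sample response rather than a live API call; the helper names, the 0.96 threshold, and the reading of `start`/`end` as milliseconds are assumptions made for this example.

```python
import json
from itertools import groupby

# Trimmed copy of the sample response (only the fields used below).
SAMPLE_RESPONSE = """
{
  "words": [
    { "text": "Runner's", "start": 0, "end": 550, "speaker": "A", "confidence": 0.98113 },
    { "text": "knee", "start": 580, "end": 1130, "speaker": "A", "confidence": 0.95417 }
  ]
}
"""


def speaker_turns(words):
    """Merge consecutive words from the same speaker into one turn (hypothetical helper)."""
    turns = []
    for speaker, group in groupby(words, key=lambda w: w["speaker"]):
        group = list(group)
        turns.append({
            "speaker": speaker,
            "start": group[0]["start"],  # assumed to be milliseconds
            "end": group[-1]["end"],
            "text": " ".join(w["text"] for w in group),
        })
    return turns


def low_confidence_words(words, threshold=0.96):
    """Return the text of words scored below the threshold (arbitrary cutoff)."""
    return [w["text"] for w in words if w["confidence"] < threshold]


words = json.loads(SAMPLE_RESPONSE)["words"]

turns = speaker_turns(words)
for turn in turns:
    print(f'{turn["speaker"]} [{turn["start"]}-{turn["end"]} ms]: {turn["text"]}')
# A [0-1130 ms]: Runner's knee

flagged = low_confidence_words(words)
print(flagged)  # ["knee"]
```

Grouping on consecutive speaker labels keeps the turn order of the conversation intact, which tends to matter more for call transcripts than a per-speaker total would.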