Speech-to-Text API

Get clean, customizable transcripts in 99 languages with industry-leading accuracy and natural language prompting.

Delphi
Happy Scribe
Glean
Granola
Supernormal
Runway
Ashby
Jiminny
JotPsych
Earmark
EdgeTier
Genio
Grain
Loop
Calabrio
Veed.io
Dovetail
WhatConverts
CallRail
Models

Pick the model that fits your workload

Accuracy that holds up on real-world audio, tunable with a single parameter.

Universal-3 Pro

The most accurate, controllable model on the market.

  • Complex, domain-specific audio
  • Natural language prompting
  • Precise entity handling
  • 6 languages with code-switching

Universal-2

High-accuracy transcription at scale across 99 languages.

  • Proven accuracy at scale
  • Keyterms prompting
  • Strong entity handling
  • 99 languages with code-switching
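
The choice between the two models can be sketched as a small helper. This is only an illustration of the trade-offs listed above, not part of any SDK: the function name `pick_model`, its parameters, and the returned labels are hypothetical, and it assumes any language outside the six Universal-3 Pro languages is among the 99 that Universal-2 covers.

```python
# Language codes listed for Universal-3 Pro on this page (EN, ES, FR, DE, IT, PT).
UNIVERSAL_3_PRO_LANGUAGES = {"en", "es", "fr", "de", "it", "pt"}


def pick_model(
    language_code: str,
    needs_natural_language_prompting: bool = False,
    prefer_lowest_cost: bool = False,
) -> str:
    """Return a model label for a workload (hypothetical helper, not an official API)."""
    code = language_code.strip().lower()

    if code not in UNIVERSAL_3_PRO_LANGUAGES:
        # Natural language prompting is listed only for Universal-3 Pro,
        # which covers six languages.
        if needs_natural_language_prompting:
            raise ValueError(
                f"Natural language prompting is not listed for language {code!r}"
            )
        # Assumption: the language is one of the 99 covered by Universal-2.
        return "Universal-2"

    # Both models cover this language; Universal-2 is the cheaper option,
    # but only Universal-3 Pro lists natural language prompting.
    if prefer_lowest_cost and not needs_natural_language_prompting:
        return "Universal-2"
    return "Universal-3 Pro"
```

For example, `pick_model("ja")` falls back to Universal-2, while `pick_model("fr", needs_natural_language_prompting=True)` selects Universal-3 Pro.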

Compare features

Model
  • Universal-3 Pro: Domain-specific, multilingual, complex audio
  • Universal-2: High-volume, cost-efficient, global languages
  • Medical Mode: Clinical settings, medical term recognition
Price
  • Universal-3 Pro: $0.21 /hr
  • Universal-2: $0.15 /hr
  • Medical Mode: +$0.07 /hr on any model
Languages
  • Universal-3 Pro: EN, ES, FR, DE, IT, PT
  • Universal-2: 99 languages
  • Medical Mode: Inherits base model languages
Natural language prompting
  • Universal-3 Pro: Up to ~1,500 words
Keyterm prompting
Medical vocabulary
Code-switching
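Speaker diarization
  • Universal-3 Pro: 10+ speakers
  • Universal-2: 10+ speakers
  • Medical Mode: Clinician / patient labels
Medical terminology
  • Medical Mode: Drugs, dosages, ICD codes
HIPAA BAA
  • Universal-3 Pro: On request
  • Universal-2: On request
  • Medical Mode: On request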
Unlimited concurrency
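
The listed prices lend themselves to a quick back-of-the-envelope estimate. The sketch below uses only the hourly rates shown in the comparison above and assumes cost scales linearly with audio duration; billing granularity, minimums, and volume discounts are not stated on this page, so treat the result as a rough figure. The function name `estimate_cost` is illustrative.

```python
from decimal import Decimal

# Hourly rates in USD, taken from the comparison above.
HOURLY_RATE_USD = {
    "Universal-3 Pro": Decimal("0.21"),
    "Universal-2": Decimal("0.15"),
}
# Medical Mode is listed as an add-on of $0.07 /hr on any model.
MEDICAL_MODE_ADDON_USD = Decimal("0.07")


def estimate_cost(model: str, audio_hours: float, medical_mode: bool = False) -> Decimal:
    """Estimate transcription cost in USD, assuming simple linear per-hour billing."""
    rate = HOURLY_RATE_USD[model]
    if medical_mode:
        rate += MEDICAL_MODE_ADDON_USD
    # Convert via str() so float inputs such as 0.5 do not carry binary noise.
    total = rate * Decimal(str(audio_hours))
    return total.quantize(Decimal("0.01"))


if __name__ == "__main__":
    print(estimate_cost("Universal-2", 1000))
    print(estimate_cost("Universal-3 Pro", 1000, medical_mode=True))
```

Under these assumptions, 1,000 hours of audio comes to $150.00 on Universal-2 and $280.00 on Universal-3 Pro with Medical Mode enabled.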
Use cases

Built for every voice workflow

Async transcription powers every application where you work with recorded audio.

Conversation intelligence

Transcribe sales calls, support tickets, and customer interviews. Feed clean transcripts into sentiment analysis and topic detection.

AI Scribes

Capture patient-provider conversations with clinical-grade accuracy. Generate SOAP notes, intake summaries, and EHR-ready documentation.

Podcast and media

Transcribe long-form audio for search indexing, automated chapters, and subtitle generation. 99 languages, no configuration needed.

Call analytics

Process thousands of call recordings per day with speaker labels, sentiment scores, and key phrase extraction. Automate QA at scale.

AI notetakers

Turn recorded meetings into structured summaries with speaker attribution, action items, and searchable timestamps.

Frequently asked questions