INTRODUCING medical mode

Clinical-grade accuracy on every drug name, dose, and diagnosis

20% fewer missed entities on the terminology that affects patient outcomes — across real-time and async workflows.

Try medication names (ibuprofen, metformin, amoxicillin), dosage instructions, procedure names, and anatomical terms. Take a few steps away from your device to mimic an ambient environment.

Medical Mode in Universal-3 Pro Streaming

Source

Clinical evaluation history:

"prompt": "Produce a transcript for a clinical history evaluation. It's important to capture medication and dosage accurately. Every disfluency is meaningful data. Include: fillers (um, uh, er, erm, ah, hmm, mhm, like, you know, I mean), repetitions (I I I, the the), restarts (I was- I went), stutters (th-that, b-but, no-not), and informal speech (gonna, wanna, gotta)"

Without prompting

"I just want to move you along a bit further. Do you take any prescribed medicines? I know you've got diabetes and high blood pressure. I do. I take Ramipril. Okay. And I take Metformin, and there's another one that begins with G for the diabetes. Glicoside."

With context-aware prompting

"I just wanna move you along a bit further. Do you take any prescribed medicines? I know you've got diabetes and high blood pressure. I, I do. I take, um, I take Ramipril. Okay, mhm. And I take Metformin, and there's another one that begins with G for the diabetes. So glycosi — glycosi— glycoside."

Source

Non-speech audio event:

"prompt": "Produce a transcript suitable for conversational analysis. Every disfluency is meaningful data. Include: Tag sounds: [beep]"

Without audio tagging

"Your call has been forwarded to an automatic voice message system. At the tone, please record your message. When you have finished recording, you may hang up or press 1 for more options."

With audio tagging

"Your call has been forwarded to an automatic voice message system. At the tone, please record your message. When you have finished recording, you may hang up or press 1 for more options. [beep]"

Source

Speech with disfluencies:

"prompt": "Produce a transcript suitable for conversational analysis. Every disfluency is meaningful data. Include: fillers (um, uh, er, ah, hmm, mhm, like, you know, I mean), repetitions (I I, the the), restarts (I was- I went), stutters (th-that, b-but, no-not), and informal speech (gonna, wanna, gotta)"

Without disfluency prompting

Do you and Quentin still socialize when you come to Los Angeles, or is it like he's so used to having you here? No, no, no, we're friends. What do you do with him?

With disfluency prompting

Do you and Quentin still socialize, uh, when you come to Los Angeles, or is it like he's so used to having you here? No, no, no, we, we, we're friends. What do you do with him?
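
If disfluencies are treated as data, a simple first step is counting them. The sketch below counts single-word fillers taken from the prompt above. The `count_fillers` helper and the choice to skip multi-word fillers such as "you know" are assumptions made to keep the example short.

```python
import re
from collections import Counter

# Single-word fillers listed in the prompt; multi-word fillers are left out here.
FILLERS = {"um", "uh", "er", "erm", "ah", "hmm", "mhm"}


def count_fillers(transcript):
    """Count filler words in a transcript, ignoring case and punctuation."""
    words = re.findall(r"[a-z']+", transcript.lower())
    return Counter(word for word in words if word in FILLERS)


plain = "Do you and Quentin still socialize when you come to Los Angeles?"
verbatim = "Do you and Quentin still socialize, uh, when you come to Los Angeles?"

print(count_fillers(plain))     # Counter()
print(count_fillers(verbatim))  # Counter({'uh': 1})
```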

Source

Proper noun spelling:

"keyterms_prompt": ["Kelly Byrne-Donoghue"]

Without keyterms prompting

"Hi, this is Kelly Byrne Donahue"

With keyterms prompting

"Hi, this is Kelly Byrne-Donahue"

Source

Capturing speaker roles:

"prompt": "Produce a transcript with every disfluency data. Additionally, label speakers with their respective roles. 1. Place [Speaker:role] at the start of each speaker turn. Example format: [Speaker:NURSE] Hello there. How can I help you today? [Speaker:PATIENT] I'm feeling unwell. I have a headache."}

With traditional speaker labels

Speaker A: 5Mg. And do you take it regularly?
Speaker B: Oh yeah, yeah.
Speaker A: Good.
Speaker B: Every evening.
Speaker A: And no side effects with it?

With speaker labels prompting

Speaker [Nurse]: 5Mg. And do you take it regularly?
Speaker [Patient]: Oh yeah, yeah.
Speaker [Nurse]: Good.
Speaker [Patient]: Every evening.
Speaker [Nurse]: And no side effects with it?
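
A transcript that follows the `[Speaker:role]` format requested in the prompt can be split into turns with a regular expression. The sketch below assumes the tags appear inline exactly as in the prompt's example format; `parse_turns` is a hypothetical helper.

```python
import re

# Captures the role inside [Speaker:...] and the text up to the next tag or the end.
TURN_PATTERN = re.compile(r"\[Speaker:([^\]]+)\]\s*(.*?)(?=\[Speaker:|$)", re.DOTALL)


def parse_turns(transcript):
    """Split a role-tagged transcript into (role, text) tuples."""
    return [
        (role.strip(), text.strip())
        for role, text in TURN_PATTERN.findall(transcript)
    ]


sample = (
    "[Speaker:NURSE] Hello there. How can I help you today? "
    "[Speaker:PATIENT] I'm feeling unwell. I have a headache."
)

for role, text in parse_turns(sample):
    print(f"{role}: {text}")
# NURSE: Hello there. How can I help you today?
# PATIENT: I'm feeling unwell. I have a headache.
```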

Source

Spanish and English audio:

"language_detection": True
"prompt": Preserve natural code-switching between English and Spanish. Retain spokenlanguage as-is (correct "I was hablando con mi manager").

Without codeswitching

Would definitely think I spoke Spanish if you heard me speak Spanish. But I still make mistakes. Soy wines. Paltro Soy. La fundadora de goop. Thank you. Thank you for doing that.

With codeswitching

You would definitely think I spoke Spanish if you heard me speak Spanish, but I still make mistakes. Soy Gwyneth Paltrow, soy la fundadora de Goop. Thank you. Thank you for doing that.

Industry-leading accuracy, now with medical-grade precision

Medical Mode reduces missed medical entities by over 20% compared to Universal-3 Pro alone.

Missed Entity Rate: Universal-3 Pro vs. Universal-3 Pro with Medical Mode

Lower is better · % of entities not correctly transcribed

Pre-recorded English: 3.24% with Medical Mode vs. 3.95% for Universal-3 Pro (18% improvement)

Real-time English: 4.22% with Medical Mode vs. 4.93% for Universal-3 Pro (14% improvement)

Pre-recorded Non-English: 9.28% with Medical Mode vs. 10.98% for Universal-3 Pro (15% improvement)
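
The improvement figures in this chart are consistent with a relative reduction in missed entity rate. The sketch below recomputes them from the chart values. The pairing of each scenario label with its two rates is inferred from the chart layout, and `relative_reduction` is a hypothetical helper.

```python
def relative_reduction(baseline, improved):
    """Fractional reduction of `improved` relative to `baseline`."""
    return (baseline - improved) / baseline


# (Universal-3 Pro, Universal-3 Pro with Medical Mode) missed entity rates in percent.
# The label-to-value pairing is an assumption based on the chart layout.
rates = {
    "Pre-recorded English": (3.95, 3.24),
    "Real-time English": (4.93, 4.22),
    "Pre-recorded Non-English": (10.98, 9.28),
}

for scenario, (baseline, with_medical_mode) in rates.items():
    print(f"{scenario}: {relative_reduction(baseline, with_medical_mode):.0%}")
# Pre-recorded English: 18%
# Real-time English: 14%
# Pre-recorded Non-English: 15%
```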

More accurate on medical terms than every other provider

The terms that determine patient outcomes — medication names, dosages, and diagnoses — transcribed more accurately than ever.

MER & WER across medical transcription models

Lower is better · % of entities not correctly transcribed

AssemblyAI Universal-3 Pro w/ Medical Mode: 3.2% MER (Missed Entity Rate), 5.3% WER (Word Error Rate)

Speechmatics Enhanced Medical: 3.6% MER, 5.5% WER

Deepgram Nova-3 Medical: 4.7% MER, 6.1% WER

AWS Transcribe Medical: 8.7% MER, 5.9% WER

Google Medical Conversation: 24.4% MER, 12.9% WER
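
The page describes MER only as the percentage of entities not correctly transcribed and does not give the matching rules. As a simplified sketch, the snippet below assumes a reference list of entities and treats an entity as missed when it does not appear verbatim (case-insensitive) in the transcript. `missed_entity_rate` is a hypothetical helper, not the benchmark's actual scoring code.

```python
def missed_entity_rate(reference_entities, hypothesis):
    """Fraction of reference entities absent from the hypothesis transcript."""
    if not reference_entities:
        return 0.0
    lowered = hypothesis.lower()
    missed = [e for e in reference_entities if e.lower() not in lowered]
    return len(missed) / len(reference_entities)


entities = ["ramipril", "metformin", "amoxicillin"]
hypothesis = "I take Ramipril. And I take Metformin."

print(f"{missed_entity_rate(entities, hypothesis):.1%}")  # 33.3%
```

A real evaluation would likely need normalization for spelling variants and dosage formats, which this sketch leaves out.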

See the performance on your own files

Reach out to our Applied AI team to run latency and accuracy benchmarks on your own data.

Contact Applied AI

Built for the nuances of patient encounters

Every capability engineered for real conversations in ambient, far-field, and multi-speaker healthcare settings.

Far-field accuracy, without the tradeoffs

Drug names, procedures, dosages — transcribed correctly the first time, even in noisy rooms.

Capture every medication, procedure, and dosage correctly — 87% fewer medical entity errors than other medical models
Handle the noise of real care settings — equipment, overlapping voices, and multi-speaker encounters without accuracy tradeoffs
Perform across every specialty without retraining — oncology, cardiology, primary care, and everything in between, out of the box

Compliant, affordable, and built to scale

HIPAA-eligible infrastructure, BAA included, and $0.15/hr. No compliance tax, no surprises.

Go live for $0.15/hr: transparent add-on pricing with no compliance upcharges
Ship with compliance already handled: HIPAA-eligible infrastructure and BAA included, data training opted out by default
Scale without contracts or hidden overages: no lock-in, no concurrency limits, and predictable, usage-based pricing

The full Voice AI stack, with medical accuracy built in

Speaker diarization, real-time streaming, PHI redaction, all with medical domain accuracy.

Separate every voice in the encounter — provider, patient, and staff accurately identified across the full visit
Generate EHR-ready output automatically — PHI stripped, SOAP structured, and ready for your downstream systems
Stream medical-grade accuracy live — ambient scribes and clinical copilots get terms right as they're spoken

Unlock the value of voice data

Build what’s next on the platform powering thousands of the industry’s leading Voice AI apps.

Try our API for free

Contact sales
