Can I transcribe telehealth video calls in real time?

Yes. Universal-3.5 Pro Realtime with Medical Mode runs live at ~150ms P50 latency, suitable for in-visit captions and ambient notes.

How does it handle poor telehealth video-call audio?

Universal-3 Pro is built for real-world audio and has a hallucination rate about 30% lower than Whisper, which matters when network compression degrades the signal.

Is it suitable for HIPAA-regulated telehealth?

AssemblyAI acts as a business associate under HIPAA and signs a standard BAA. It offers PHI redaction across audio and transcripts and maintains SOC 2 Type 2 compliance.

Does it support non-English telehealth visits?

Medical Mode supports English, Spanish, German, and French, including intra-utterance code-switching.

How much does real-time telehealth transcription cost?

Medical Mode adds $0.15/hr on top of the realtime or pre-recorded model rate. Billing is pay-as-you-go with no minimums.
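As a rough illustration of that add-on, the sketch below estimates the Medical Mode surcharge for a batch of visits. It assumes the charge scales linearly with audio duration, and it leaves out the base model rate, which is not stated here. The function name and the example workload are hypothetical.

```python
def medical_mode_addon_cost(visits: int, minutes_per_visit: float,
                            addon_rate_per_hour: float = 0.15) -> float:
    """Return the Medical Mode add-on in dollars (base model cost excluded).

    Assumes the add-on is billed in proportion to audio duration.
    """
    audio_hours = visits * minutes_per_visit / 60
    return round(audio_hours * addon_rate_per_hour, 2)


# Hypothetical workload: 1,000 video visits of 20 minutes each
print(medical_mode_addon_cost(1000, 20))
```

Add the base realtime or pre-recorded rate for the same audio hours to estimate the full bill.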

How do I enable Medical Mode for telehealth transcription?

Add domain: "medical-v1" to your realtime session parameters.
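For integrations that open the WebSocket directly rather than through an SDK, the sketch below shows one way the session parameters might be encoded. It assumes the parameters travel as query-string values on the streaming URL and that the endpoint is the one shown; both are assumptions to confirm against the docs. The helper name build_session_url is hypothetical.

```python
import json
from typing import List, Optional
from urllib.parse import urlencode

# Assumed streaming endpoint; confirm against the current API reference.
STREAMING_WS_URL = "wss://streaming.assemblyai.com/v3/ws"


def build_session_url(sample_rate: int = 16000,
                      keyterms: Optional[List[str]] = None) -> str:
    """Build a WebSocket URL carrying the Medical Mode session parameters."""
    params = {
        "sample_rate": sample_rate,
        "speech_model": "universal-3-5-pro",
        "domain": "medical-v1",  # the one parameter that enables Medical Mode
    }
    if keyterms:
        # Assumption: list-valued parameters are sent as a JSON-encoded string
        params["keyterms_prompt"] = json.dumps(keyterms)
    return f"{STREAMING_WS_URL}?{urlencode(params)}"


print(build_session_url(keyterms=["hypertension", "lisinopril"]))
```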

Telehealth

Clinical-grade transcription for telehealth, live or recorded

Video visits add variable audio, headsets, and network jitter on top of clinical vocabulary. Universal-3 Pro with Medical Mode holds a 3.2% Missed Entity Rate on that audio and runs in real time during the call — so your scribe, captions, or triage layer keeps up. One parameter, on async or realtime.

Try Medical Mode free on your audio

Read the docs

Accuracy that survives real telehealth audio — accents, headsets, compression

Missed Entity Rate 3.2%

Missed Entity Rate on clinical conversations with Medical Mode.

vs. Universal-3 Pro ~20%

Fewer missed medical entities than Universal-3 Pro alone.

vs. base model 87%

Fewer medical entity errors than the base model.

Benchmarked providers #1

Lowest Missed Entity Rate vs. Deepgram, Speechmatics, AWS, and Google.

See the medical benchmarks

Activation

Live captions and notes during the visit

Stream the call's audio into one WebSocket and get medical-grade transcripts back in real time — no model swap, no re-integration.

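from assemblyai.streaming.v3 import (
    StreamingClient,
    StreamingClientOptions,
    StreamingParameters,
)

# The client takes its credentials through an options object
client = StreamingClient(StreamingClientOptions(api_key="YOUR_API_KEY"))

# Telehealth: real-time transcription during the video visit
client.connect(StreamingParameters(
    sample_rate=16000,  # assumed 16 kHz; must match the audio you stream
    speech_model="universal-3-5-pro",
    domain="medical-v1",  # turns on Medical Mode
    keyterms_prompt=["telehealth", "hypertension", "lisinopril", "follow-up"],
))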
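Once the session is open, the call audio has to arrive as a steady sequence of small frames. The sketch below is one minimal way to slice raw PCM into fixed-length chunks before sending. The 16 kHz sample rate, 16-bit mono encoding, and 50 ms frame length are assumptions for illustration, and pcm_chunks is a hypothetical helper, not part of the SDK.

```python
from typing import Iterator

SAMPLE_RATE = 16_000   # Hz, assumed; must match the session's sample rate
BYTES_PER_SAMPLE = 2   # 16-bit mono PCM, assumed
CHUNK_MS = 50          # frame length in milliseconds, assumed


def pcm_chunks(pcm: bytes, chunk_ms: int = CHUNK_MS) -> Iterator[bytes]:
    """Yield consecutive fixed-length frames of raw PCM audio.

    The final frame may be shorter if the audio does not divide evenly.
    """
    chunk_bytes = SAMPLE_RATE * BYTES_PER_SAMPLE * chunk_ms // 1000
    for start in range(0, len(pcm), chunk_bytes):
        yield pcm[start:start + chunk_bytes]


# One second of silence stands in for captured call audio
one_second = bytes(SAMPLE_RATE * BYTES_PER_SAMPLE)
frames = list(pcm_chunks(one_second))
print(len(frames), len(frames[0]))  # 20 1600
```

In a live visit the bytes would come from the video platform's audio track rather than a buffer, and each frame would be sent over the open connection as it is produced.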

Real-time, low latency

~150ms P50 on Universal-3.5 Pro Realtime for live captions and in-visit documentation.

Universal-3.5 Pro Realtime

Robust on messy audio

Hallucination rate ~30% lower than Whisper; holds accuracy through accents, headsets, and compressed call audio.

Universal-3 Pro

Multilingual visits

English, Spanish, German, French with native code-switching.

Medical Mode

Try Medical Mode free on your audio

Run your own telehealth audio through Medical Mode in the playground — one parameter, real time. Free to start, no credit card required.

Try Medical Mode free