Streaming Speech-to-Text

Convert live audio streams into text synchronously with nearly 90% accuracy and <600ms latency.

Hey, adventurers! Welcome to today's exciting livestream event, where we're embarking on an expedition to uncover the secrets of the Lost Temple hidden deep within this mysterious jungle! I'm your host, Emily, and I'm thrilled to have you all joining me on this epic adventure. Just look at this incredible jungle landscape, teeming with life and brimming with secrets waiting to be discovered! Who knows what ancient mysteries lie within this dense foliage? [Camera zooms in on a vine-covered ruin peeking through the trees] Emily: And there it is, folks! Our destination, the Lost Temple, a relic of a long-forgotten civilization lost to time. Legend has it that this temple holds untold riches and powerful artifacts beyond imagination!

Automatically turn live audio into text

Transcribe conversations, meetings, and live events synchronously and elevate live interactions instantly.

Try in the Playground
[Illustration: the AssemblyAI realtime playground, showing a "Start talking" button above a timestamp and the output text "Hello today is".]

Industry-leading quality at low latency

Low latency
Automatically transcribe live audio nearly instantaneously, with customizable endpoint control.
Industry-leading quality
Retrieve highly accurate results.
High concurrency
Easily process a high volume of audio files at scale.
Advanced punctuation & casing
Automatically add punctuation and proper-noun casing to the transcription text.

Speech-to-Text

Build on top of the most accurate Speech-to-Text model on the market with >92.5% accuracy.

Speech Understanding

Extract maximum value from voice data with Audio Intelligence, and leverage Large Language Models with LeMUR.

START BUILDING WITH AI

Get started in seconds

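import assemblyai as aai
import json

# Placeholders: substitute your own API key and audio URL.
aai.settings.api_key = "YOUR_API_KEY"
URL = "https://example.com/audio.mp3"
config = aai.TranscriptionConfig(speaker_labels=True)

transcriber = aai.Transcriber()
transcript = transcriber.transcribe(URL, config)

# The Transcript object is not JSON-serializable, so dump the raw API response.
print(json.dumps(transcript.json_response, indent=2))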
{
  "id": "6rlr37h8f4-e310-4e23-bbf3-ea5f347dc684",
  "language_code": "en_us",
  "status": "completed",
  "text": "Runner's knee is a condition characterized by pain behind or around the kneecap...",
  "confidence": 0.98122,
  "audio_duration": 3200,
  "words": [
    { "text": "Runner's", "start": 0, "end": 550, "speaker": "A", "confidence": 0.98113 },
    { "text": "knee", "start": 580, "end": 1130, "speaker": "A", "confidence": 0.95417 }
  ]
}
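
A response shaped like the one above can be post-processed with the standard library alone. The following is a minimal sketch, not part of the AssemblyAI SDK: the helper names `format_words` and `low_confidence_words` are hypothetical, the sample data is trimmed from the response shown above, and the `start` and `end` values are assumed to be in milliseconds.

```python
import json

# Sample data trimmed from the response shown above.
RESPONSE_TEXT = """
{
  "status": "completed",
  "confidence": 0.98122,
  "words": [
    {"text": "Runner's", "start": 0, "end": 550, "speaker": "A", "confidence": 0.98113},
    {"text": "knee", "start": 580, "end": 1130, "speaker": "A", "confidence": 0.95417}
  ]
}
"""


def format_words(response):
    """Return one line per word with timings converted to seconds.

    Assumes "start" and "end" are expressed in milliseconds.
    """
    lines = []
    for word in response["words"]:
        start_s = word["start"] / 1000
        end_s = word["end"] / 1000
        lines.append(
            f"[{start_s:.2f}s-{end_s:.2f}s] {word['speaker']}: {word['text']}"
        )
    return lines


def low_confidence_words(response, threshold):
    """Return the text of every word whose confidence is below the threshold."""
    return [w["text"] for w in response["words"] if w["confidence"] < threshold]


response = json.loads(RESPONSE_TEXT)
for line in format_words(response):
    print(line)
print(low_confidence_words(response, 0.96))
```

Flagging words below a confidence threshold is one simple way to decide which parts of a transcript deserve human review; the 0.96 cutoff here is purely illustrative.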