Speaker Diarization

Assign a speaker label to each utterance and determine how many speakers take part in a conversation with our advanced Speech AI models, which provide industry-leading speech-to-text accuracy.

Get started with fewer than 10 lines of code

Enable Speaker Diarization in our API by setting speaker_labels=True, and receive a detailed transcript with a list of utterances, each tagged with its speaker.
import assemblyai as aai

aai.settings.api_key = "YOUR_API_KEY"

transcriber = aai.Transcriber()

audio_url = (
    "https://storage.googleapis.com/aai-web-samples/5_common_sports_injuries.mp3"
)

# speaker_labels=True turns on Speaker Diarization for this request.
config = aai.TranscriptionConfig(speaker_labels=True)

transcript = transcriber.transcribe(audio_url, config)

print(transcript.text)

for utterance in transcript.utterances:
    print(f"Speaker {utterance.speaker}: {utterance.text}")
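Once the utterances are available, the speaker count follows from the distinct labels. The snippet below is a minimal sketch of that step, not part of the SDK: the `Utterance` dataclass is a hypothetical stand-in that carries only the `speaker` and `text` fields read in the loop above, and the sample dialogue is made up so the example runs without an API key.

```python
from dataclasses import dataclass


@dataclass
class Utterance:
    # Hypothetical stand-in with only the two fields the loop above reads.
    speaker: str
    text: str


def count_speakers(utterances):
    """Return the number of distinct speaker labels in the transcript."""
    return len({u.speaker for u in utterances})


def format_dialogue(utterances):
    """Render one 'Speaker X: text' line per utterance."""
    return "\n".join(f"Speaker {u.speaker}: {u.text}" for u in utterances)


# Made-up sample data for illustration; not real API output.
sample = [
    Utterance("A", "What causes runner's knee?"),
    Utterance("B", "Usually overuse or a sudden increase in mileage."),
    Utterance("A", "And how is it treated?"),
]

print(count_speakers(sample))
print(format_dialogue(sample))
```

With a real transcript, the same helpers should work on `transcript.utterances` directly, assuming each item exposes `speaker` and `text` as in the loop above.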

Improve transcription quality and readability

Reduce speaker misattribution and transcription errors, enabling cleaner data for NLP tasks and enhancing user experience in speech-to-text applications.
An illustration showing AssemblyAI's DER and cpWER against its competitors.

Make every voice count

Improve the readability of your transcriptions

Unlock call center insights

Create a better search experience

Assess communication patterns

Optimize short-form content generation

Enhance automated dubbing precision

Implement intelligent camera focus

Determine talk time for sales teams
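Several of the use cases above, talk time in particular, reduce to summing utterance durations per speaker. The following is a rough sketch under stated assumptions: each utterance is represented as a plain dict with `speaker`, `start`, and `end` keys, with timestamps in milliseconds, and the sample values are invented for illustration. The helper names `talk_time_ms` and `talk_share` are hypothetical and not part of any SDK.

```python
from collections import defaultdict


def talk_time_ms(utterances):
    """Sum utterance durations (end - start, in milliseconds) per speaker."""
    totals = defaultdict(int)
    for u in utterances:
        totals[u["speaker"]] += u["end"] - u["start"]
    return dict(totals)


def talk_share(utterances):
    """Return each speaker's fraction of the total speaking time."""
    totals = talk_time_ms(utterances)
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {speaker: t / grand_total for speaker, t in totals.items()}


# Invented sample: a sales rep (A) and a prospect (B).
calls = [
    {"speaker": "A", "start": 0, "end": 4000},
    {"speaker": "B", "start": 4500, "end": 6500},
    {"speaker": "A", "start": 7000, "end": 9000},
]

for speaker, share in talk_share(calls).items():
    print(f"Speaker {speaker}: {share:.0%} of talk time")
```

Silence between utterances is excluded here because only the spans inside each utterance are counted; whether overlapping speech should be counted once or twice is a design choice this sketch does not address.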

Join over 200,000 developers building with AssemblyAI


Get started in seconds

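import assemblyai as aai

aai.settings.api_key = "YOUR_API_KEY"
URL = "https://storage.googleapis.com/aai-web-samples/5_common_sports_injuries.mp3"
config = aai.TranscriptionConfig(speaker_labels=True)  # enable Speaker Diarization

transcriber = aai.Transcriber()
transcript = transcriber.transcribe(URL, config)

print(transcript)

Example response: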
{
  "id": "6rlr37h8f4-e310-4e23-bbf3-ea5f347dc684",
  "language_code": "en_us",
  "status": "completed",
  "text": "Runner's knee is a condition characterized by pain behind or around the kneecap...",
  "confidence": 0.98122,
  "audio_duration": 3200,
  "words": [
    { "text": "Runner's", "start": 0, "end": 550, "speaker": "A", "confidence": 0.98113 },
    { "text": "knee", "start": 580, "end": 1130, "speaker": "A", "confidence": 0.95417 }
  ]
}
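Because every entry in `words` carries its own `speaker` label, speaker turns can be rebuilt from the word list alone. The sketch below shows one way to do that by merging consecutive words with the same label. It assumes the field names shown in the response above; the first two words are copied from that response, while the two words for speaker "B" are hypothetical additions so that the example contains a speaker change.

```python
from itertools import groupby


def words_to_turns(words):
    """Merge consecutive words with the same speaker label into one turn."""
    turns = []
    for speaker, group in groupby(words, key=lambda w: w["speaker"]):
        group = list(group)
        turns.append(
            {
                "speaker": speaker,
                "text": " ".join(w["text"] for w in group),
                "start": group[0]["start"],  # start of the first word
                "end": group[-1]["end"],     # end of the last word
            }
        )
    return turns


words = [
    {"text": "Runner's", "start": 0, "end": 550, "speaker": "A"},
    {"text": "knee", "start": 580, "end": 1130, "speaker": "A"},
    # Hypothetical words added for illustration; not in the response above.
    {"text": "Is", "start": 1500, "end": 1700, "speaker": "B"},
    {"text": "it", "start": 1750, "end": 2100, "speaker": "B"},
]

for turn in words_to_turns(words):
    print(f"[{turn['start']}-{turn['end']}] Speaker {turn['speaker']}: {turn['text']}")
```

`groupby` only merges adjacent items, so a speaker who talks, is interrupted, and talks again yields two separate turns, which is usually the behavior wanted for a dialogue view.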