Automatic Language Detection

Automatic language detection identifies the dominant language spoken in an audio file and uses it for the transcription. When enabled, it can detect any of the supported languages.

Additional language support

We plan to add support for all Universal tier languages by the end of Q3 2025. Until then, if you need automatic language detection for an unsupported language, contact our Support team at support@assemblyai.com to learn how that can be accomplished.

To reliably identify the dominant language, a file must contain at least 15 seconds of spoken audio. Results improve when the file contains between 15 and 90 seconds of spoken audio.

import assemblyai as aai

aai.settings.api_key = "<YOUR_API_KEY>"

# audio_file = "./local_file.mp3"
audio_file = "https://assembly.ai/wildfires.mp3"

config = aai.TranscriptionConfig(language_detection=True)

transcript = aai.Transcriber(config=config).transcribe(audio_file)

print(transcript.text)
print(transcript.json_response["language_code"])

Confidence score

If language detection is enabled, the API returns a confidence score for the detected language. The score ranges from 0.0 (low confidence) to 1.0 (high confidence).

import assemblyai as aai

aai.settings.api_key = "<YOUR_API_KEY>"

# audio_file = "./local_file.mp3"
audio_file = "https://assembly.ai/wildfires.mp3"

config = aai.TranscriptionConfig(language_detection=True)

transcript = aai.Transcriber(config=config).transcribe(audio_file)

print(transcript.text)
print(transcript.json_response["language_confidence"])
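
One way to act on the score is to gate downstream processing on it. The sketch below is a minimal illustration, not part of the SDK: the helper name `is_detection_reliable` and the 0.7 cutoff are assumptions chosen for the example. It reads only the `language_code` and `language_confidence` keys shown above from a response dictionary such as `transcript.json_response`.

```python
from typing import Any, Mapping


def is_detection_reliable(response: Mapping[str, Any], min_confidence: float = 0.7) -> bool:
    """Return True if the response reports a language with enough confidence.

    Hypothetical helper; the 0.7 default is an arbitrary example value.
    """
    confidence = response.get("language_confidence")
    # A missing or null language code or score is treated as unreliable.
    if confidence is None or response.get("language_code") is None:
        return False
    return float(confidence) >= min_confidence


# A hand-written dictionary standing in for transcript.json_response.
sample = {"language_code": "en", "language_confidence": 0.93}
print(is_detection_reliable(sample))
```

In practice the cutoff depends on how costly a wrong language guess is for your application, so treat the default here as a placeholder.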

Set a language confidence threshold

When language detection is enabled, you can set a minimum confidence threshold that the detected language must reach. If the language confidence falls below this threshold, the API returns an error. Valid values range from 0 to 1, inclusive.

import assemblyai as aai

aai.settings.api_key = "<YOUR_API_KEY>"

# audio_file = "./local_file.mp3"
audio_file = "https://assembly.ai/wildfires.mp3"

config = aai.TranscriptionConfig(language_detection=True, language_confidence_threshold=0.8)

transcript = aai.Transcriber(config=config).transcribe(audio_file)

if transcript.status == "error":
    raise RuntimeError(f"Transcription failed: {transcript.error}")
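
Because the threshold must lie within [0, 1], it can help to validate a user-supplied value before submitting a request. The following is a minimal sketch; `validate_confidence_threshold` is a hypothetical helper, not an SDK function, and it checks only the range stated above.

```python
def validate_confidence_threshold(threshold: float) -> float:
    """Return the threshold as a float if it lies in [0, 1]; raise otherwise.

    Hypothetical helper for illustration; not part of the assemblyai package.
    """
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(threshold, bool) or not isinstance(threshold, (int, float)):
        raise TypeError("threshold must be a number")
    # This comparison is also False for NaN, so NaN is rejected here.
    if not 0.0 <= threshold <= 1.0:
        raise ValueError(f"threshold must be between 0 and 1 inclusive, got {threshold}")
    return float(threshold)


print(validate_confidence_threshold(0.8))
```

The validated value can then be passed as `language_confidence_threshold` when building the `TranscriptionConfig`.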

Fallback to a default language

For a workflow that resubmits a transcription request using a default language if the threshold is not reached, see this cookbook.
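
The general shape of such a workflow can be sketched as follows. This is not the cookbook's code: the function `transcribe_with_fallback`, the `submit` callable, the `Result` class, and the dictionary-based configs are hypothetical stand-ins so that the example runs without network access. In real use, `submit` would wrap a call such as `aai.Transcriber(config=...).transcribe(audio_file)`, and the fallback config would set `language_code` to the default language. For simplicity the sketch retries on any error; a production version would likely inspect the error message first.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Result:
    """Minimal stand-in for a transcript: a status plus text or an error message."""
    status: str
    text: Optional[str] = None
    error: Optional[str] = None


def transcribe_with_fallback(
    submit: Callable[[dict], Result],
    threshold: float = 0.8,
    default_language: str = "en",
) -> Result:
    """Try language detection first; on error, resubmit with a fixed language."""
    # First attempt: automatic detection with a confidence threshold.
    first = submit({"language_detection": True, "language_confidence_threshold": threshold})
    if first.status != "error":
        return first
    # Second attempt: detection off, explicit default language.
    return submit({"language_detection": False, "language_code": default_language})


def fake_submit(config: dict) -> Result:
    """Fake backend that fails whenever detection is on, to exercise the fallback path."""
    if config.get("language_detection"):
        return Result(status="error", error="simulated low-confidence detection")
    return Result(status="completed", text=f"transcribed as {config['language_code']}")


print(transcribe_with_fallback(fake_submit).text)
```

Injecting `submit` keeps the retry logic separate from the API client, which makes the fallback path easy to exercise with a fake like `fake_submit`.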