# Automatic Language Detection

US & EU
Identify the dominant language spoken in an audio file and use it during transcription. Enable it to detect any of the [supported languages](/pre-recorded-audio/supported-languages).

When language detection is enabled, the system automatically routes your request to the best available model based on the detected language and the models you provide in the `speech_models` parameter. For example, with `speech_models: ["universal-3-pro", "universal-2"]`, the system uses Universal-3 Pro for the languages it supports and automatically falls back to Universal-2 for all other languages. You can check which model processed your request using the `speech_model_used` field in the response. See the [Model selection](/pre-recorded-audio/select-the-speech-model) page for more details.

If you enable a feature that isn't supported for the detected language, the request still completes successfully, but the unsupported feature is silently omitted from the response. This differs from using [`language_code`](/pre-recorded-audio/set-language-manually), which rejects the request with an error if the feature isn't supported. For details, see [Unsupported feature behavior](/pre-recorded-audio/supported-languages#unsupported-feature-behavior).

To reliably identify the dominant language, a file must contain **at least 15 seconds** of spoken audio. Results improve when the file contains 15 to 90 seconds of spoken audio.

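Once a transcript has completed, the routing described above can be inspected from the response. The following is a minimal sketch, not an official helper: `summarize_routing` is a hypothetical name, and the example dictionary holds made-up values rather than real API output. Only the `language_code` and `speech_model_used` field names come from this page.

```python
from typing import Any, Mapping


def summarize_routing(transcript: Mapping[str, Any]) -> str:
    """Describe the detected language and the model that processed the request."""
    # Both fields are read defensively in case the response omits them.
    language = transcript.get("language_code", "unknown")
    model = transcript.get("speech_model_used", "unknown")
    return f"language={language} model={model}"


# Hypothetical response fragment, for illustration only.
example = {"language_code": "es", "speech_model_used": "universal-2"}
print(summarize_routing(example))
```

With the raw API, the same function can be applied to the parsed JSON of a completed transcript.
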
```python title="Python SDK" for="python-sdk" highlight={16} theme={null}
import assemblyai as aai

aai.settings.api_key = "<YOUR_API_KEY>"

# audio_file = "./local_file.mp3"
audio_file = "https://assembly.ai/wildfires.mp3"

config = aai.TranscriptionConfig(
    speech_models=["universal-3-pro", "universal-2"],
    language_detection=True
)

transcript = aai.Transcriber(config=config).transcribe(audio_file)

print(transcript.text)
print(transcript.json_response["language_code"])
```

```python title="Python" for="python" highlight={21} expandable theme={null}
import requests
import time

base_url = "https://api.assemblyai.com"

headers = {
    "authorization": "<YOUR_API_KEY>"
}

with open("./my-audio.mp3", "rb") as f:
    response = requests.post(base_url + "/v2/upload",
                             headers=headers,
                             data=f)

upload_url = response.json()["upload_url"]

data = {
    "audio_url": upload_url, # You can also use a URL to an audio or video file on the web
    "speech_models": ["universal-3-pro", "universal-2"],
    "language_detection": True,
}

url = base_url + "/v2/transcript"
response = requests.post(url, json=data, headers=headers)

transcript_id = response.json()['id']
polling_endpoint = base_url + "/v2/transcript/" + transcript_id

while True:
    transcription_result = requests.get(polling_endpoint, headers=headers).json()

    if transcription_result['status'] == 'completed':
        print(f"Transcript ID: {transcript_id}")
        print(f"Language Code: {transcription_result['language_code']}")
        print(f"Text: {transcription_result['text']}")
        break
    elif transcription_result['status'] == 'error':
        raise RuntimeError(f"Transcription failed: {transcription_result['error']}")
    else:
        time.sleep(3)
```

```javascript title="JavaScript SDK" for="javascript-sdk" highlight={14} expandable theme={null}
import { AssemblyAI } from "assemblyai";

const client = new AssemblyAI({
  apiKey: "<YOUR_API_KEY>",
});

// const audioFile = './local_file.mp3'
const audioFile = "https://assembly.ai/wildfires.mp3";

const params = {
  audio: audioFile,
  speech_models: ["universal-3-pro", "universal-2"],
  language_detection: true,
};

const run = async () => {
  const transcript = await client.transcripts.transcribe(params);

  console.log(transcript.text);
  console.log(transcript.language_code);
};

run();
```

```javascript title="JavaScript" for="javascript" highlight={21} expandable theme={null}
import fs from "fs-extra";

const baseUrl = "https://api.assemblyai.com";

const headers = {
  authorization: "<YOUR_API_KEY>",
};

const path = "./my-audio.mp3";
const audioData = await fs.readFile(path);
let res = await fetch(`${baseUrl}/v2/upload`, {
  method: "POST",
  headers,
  body: audioData,
});
if (!res.ok) throw new Error(`Error: ${res.status}`);
const uploadResponse = await res.json();
const uploadUrl = uploadResponse.upload_url;

const data = {
  audio_url: uploadUrl, // You can also use a URL to an audio or video file on the web
  speech_models: ["universal-3-pro", "universal-2"],
  language_detection: true,
};

const url = `${baseUrl}/v2/transcript`;
res = await fetch(url, {
  method: "POST",
  headers: { ...headers, "Content-Type": "application/json" },
  body: JSON.stringify(data),
});
if (!res.ok) throw new Error(`Error: ${res.status}`);
const response = await res.json();

const transcriptId = response.id;
const pollingEndpoint = `${baseUrl}/v2/transcript/${transcriptId}`;

while (true) {
  res = await fetch(pollingEndpoint, { headers });
  if (!res.ok) throw new Error(`Error: ${res.status}`);
  const transcriptionResult = await res.json();

  if (transcriptionResult.status === "completed") {
    console.log(transcriptionResult.text);
    console.log(transcriptionResult.language_code);
    break;
  } else if (transcriptionResult.status === "error") {
    throw new Error(`Transcription failed: ${transcriptionResult.error}`);
  } else {
    await new Promise((resolve) => setTimeout(resolve, 3000));
  }
}
```

## Set a list of expected languages

If you're confident the audio is in one of a few languages, provide that list via `language_detection_options.expected_languages`.
Detection is restricted to these candidates and the model will choose the language with the highest confidence from this list. This can eliminate scenarios where Automatic Language Detection selects an unexpected language for transcription.

* Use our [language codes](/pre-recorded-audio/supported-languages) (e.g., `"en"`, `"es"`, `"fr"`).
* If `expected_languages` is not specified, it is set to `["all"]` by default.

```python title="Python SDK" for="python-sdk" highlight={15} expandable theme={null}
import assemblyai as aai

aai.settings.api_key = "<YOUR_API_KEY>"

# audio_file = "./local_file.mp3"
audio_file = "https://assembly.ai/wildfires.mp3"

options = aai.LanguageDetectionOptions(
    expected_languages=["en", "es", "fr", "de"],
    fallback_language="auto"
)

config = aai.TranscriptionConfig(
    speech_models=["universal-3-pro", "universal-2"],
    language_detection=True,
    language_detection_options=options
)

transcript = aai.Transcriber(config=config).transcribe(audio_file)

print(transcript.text)
print(transcript.json_response["language_code"])
```

```python title="Python" for="python" highlight={23} expandable theme={null}
import requests
import time

base_url = "https://api.assemblyai.com"

headers = {
    "authorization": "<YOUR_API_KEY>"
}

with open("./my-audio.mp3", "rb") as f:
    response = requests.post(base_url + "/v2/upload",
                             headers=headers,
                             data=f)

upload_url = response.json()["upload_url"]

data = {
    "audio_url": upload_url, # You can also use a URL to an audio or video file on the web
    "speech_models": ["universal-3-pro", "universal-2"],
    "language_detection": True,
    "language_detection_options": {
        "expected_languages": ["en", "es", "fr", "de"],
        "fallback_language": "auto"
    }
}

url = base_url + "/v2/transcript"
response = requests.post(url, json=data, headers=headers)

transcript_id = response.json()['id']
polling_endpoint = base_url + "/v2/transcript/" + transcript_id

while True:
    transcription_result = requests.get(polling_endpoint, headers=headers).json()

    if transcription_result['status'] == 'completed':
        print(f"Transcript ID: {transcript_id}")
        print(f"Language Code: {transcription_result['language_code']}")
        print(f"Text: {transcription_result['text']}")
        break
    elif transcription_result['status'] == 'error':
        raise RuntimeError(f"Transcription failed: {transcription_result['error']}")
    else:
        time.sleep(3)
```

```javascript title="JavaScript SDK" for="javascript-sdk" highlight={16} expandable theme={null}
import { AssemblyAI } from "assemblyai";

const client = new AssemblyAI({
  apiKey: "<YOUR_API_KEY>",
});

// const audioFile = './local_file.mp3'
const audioFile = "https://assembly.ai/wildfires.mp3";

const params = {
  audio: audioFile,
  speech_models: ["universal-3-pro", "universal-2"],
  language_detection: true,
  language_detection_options: {
    expected_languages: ["en", "es", "fr", "de"],
    fallback_language: "auto",
  },
};

const run = async () => {
  const transcript = await client.transcripts.transcribe(params);

  console.log(transcript.text);
  console.log(transcript.language_code);
};

run();
```

```javascript title="JavaScript" for="javascript" highlight={23} expandable theme={null}
import fs from "fs-extra";

const baseUrl = "https://api.assemblyai.com";

const headers = {
  authorization: "<YOUR_API_KEY>",
};

const path = "./my-audio.mp3";
const audioData = await fs.readFile(path);
let res = await fetch(`${baseUrl}/v2/upload`, {
  method: "POST",
  headers,
  body: audioData,
});
if (!res.ok) throw new Error(`Error: ${res.status}`);
const uploadResponse = await res.json();
const uploadUrl = uploadResponse.upload_url;

const data = {
  audio_url: uploadUrl, // You can also use a URL to an audio or video file on the web
  speech_models: ["universal-3-pro", "universal-2"],
  language_detection: true,
  language_detection_options: {
    expected_languages: ["en", "es", "fr", "de"],
    fallback_language: "auto",
  },
};

const url = `${baseUrl}/v2/transcript`;
res = await fetch(url, {
  method: "POST",
  headers: { ...headers, "Content-Type": "application/json" },
  body: JSON.stringify(data),
});
if (!res.ok) throw new Error(`Error: ${res.status}`);
const response = await res.json();

const transcriptId = response.id;
const pollingEndpoint = `${baseUrl}/v2/transcript/${transcriptId}`;

while (true) {
  res = await fetch(pollingEndpoint, { headers });
  if (!res.ok) throw new Error(`Error: ${res.status}`);
  const transcriptionResult = await res.json();

  if (transcriptionResult.status === "completed") {
    console.log(transcriptionResult.text);
    console.log(transcriptionResult.language_code);
    break;
  } else if (transcriptionResult.status === "error") {
    throw new Error(`Transcription failed: ${transcriptionResult.error}`);
  } else {
    await new Promise((resolve) => setTimeout(resolve, 3000));
  }
}
```

## Choose a fallback language

Control what language transcription should fall back to when detection cannot confidently select a language from the `expected_languages` list.

* Set `language_detection_options.fallback_language` to a specific language code (e.g., `"en"`).
* `fallback_language` must be one of the language codes in `expected_languages` or `"auto"`.
* When `fallback_language` is unspecified, it is set to `"auto"` by default. This tells our model to choose the fallback language from `expected_languages` with the highest confidence score.

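The defaults and the constraint listed above can be checked on the client before a request is sent. This is a minimal sketch under stated assumptions: `build_language_detection_options` is a hypothetical helper, not part of any SDK, and skipping the membership check when `expected_languages` is `["all"]` is an assumption, since this page does not say how a specific fallback interacts with the default list.

```python
from typing import Optional, Sequence


def build_language_detection_options(
    expected_languages: Optional[Sequence[str]] = None,
    fallback_language: Optional[str] = None,
) -> dict:
    """Build a `language_detection_options` payload with the documented defaults."""
    # Documented defaults: ["all"] for expected_languages, "auto" for fallback_language.
    expected = list(expected_languages) if expected_languages else ["all"]
    fallback = fallback_language or "auto"

    # Documented rule: the fallback must be "auto" or one of the expected languages.
    # Assumption: with the default ["all"], leave validation to the API.
    if fallback != "auto" and expected != ["all"] and fallback not in expected:
        raise ValueError(
            f"fallback_language {fallback!r} must be 'auto' or one of {expected}"
        )

    return {"expected_languages": expected, "fallback_language": fallback}


print(build_language_detection_options(["en", "es", "fr"], "en"))
```

The returned dictionary can be passed as the `language_detection_options` value in the request body shown in the raw API examples.
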
```python title="Python SDK" for="python-sdk" highlight={16} expandable theme={null}
import assemblyai as aai

aai.settings.api_key = "<YOUR_API_KEY>"

# audio_file = "./local_file.mp3"
audio_file = "https://assembly.ai/wildfires.mp3"

options = aai.LanguageDetectionOptions(
    expected_languages=["en", "es", "fr", "de"],
    fallback_language="auto"
)

config = aai.TranscriptionConfig(
    speech_models=["universal-3-pro", "universal-2"],
    language_detection=True,
    language_detection_options=options
)

transcript = aai.Transcriber(config=config).transcribe(audio_file)

print(transcript.text)
print(transcript.json_response["language_code"])
```

```python title="Python" for="python" highlight={23} expandable theme={null}
import requests
import time

base_url = "https://api.assemblyai.com"

headers = {
    "authorization": "<YOUR_API_KEY>"
}

with open("./my-audio.mp3", "rb") as f:
    response = requests.post(base_url + "/v2/upload",
                             headers=headers,
                             data=f)

upload_url = response.json()["upload_url"]

data = {
    "audio_url": upload_url, # You can also use a URL to an audio or video file on the web
    "speech_models": ["universal-3-pro", "universal-2"],
    "language_detection": True,
    "language_detection_options": {
        "expected_languages": ["en", "es", "fr", "de"],
        "fallback_language": "auto"
    }
}

url = base_url + "/v2/transcript"
response = requests.post(url, json=data, headers=headers)

transcript_id = response.json()['id']
polling_endpoint = base_url + "/v2/transcript/" + transcript_id

while True:
    transcription_result = requests.get(polling_endpoint, headers=headers).json()

    if transcription_result['status'] == 'completed':
        print(f"Transcript ID: {transcript_id}")
        print(f"Language Code: {transcription_result['language_code']}")
        print(f"Text: {transcription_result['text']}")
        break
    elif transcription_result['status'] == 'error':
        raise RuntimeError(f"Transcription failed: {transcription_result['error']}")
    else:
        time.sleep(3)
```

```javascript title="JavaScript SDK" for="javascript-sdk" highlight={16} expandable theme={null}
import { AssemblyAI } from "assemblyai";

const client = new AssemblyAI({
  apiKey: "<YOUR_API_KEY>",
});

// const audioFile = './local_file.mp3'
const audioFile = "https://assembly.ai/wildfires.mp3";

const params = {
  audio: audioFile,
  speech_models: ["universal-3-pro", "universal-2"],
  language_detection: true,
  language_detection_options: {
    expected_languages: ["en", "es", "fr", "de"],
    fallback_language: "auto",
  },
};

const run = async () => {
  const transcript = await client.transcripts.transcribe(params);

  console.log(transcript.text);
  console.log(transcript.language_code);
};

run();
```

```javascript title="JavaScript" for="javascript" highlight={23} expandable theme={null}
import fs from "fs-extra";

const baseUrl = "https://api.assemblyai.com";

const headers = {
  authorization: "<YOUR_API_KEY>",
};

const path = "./my-audio.mp3";
const audioData = await fs.readFile(path);
let res = await fetch(`${baseUrl}/v2/upload`, {
  method: "POST",
  headers,
  body: audioData,
});
if (!res.ok) throw new Error(`Error: ${res.status}`);
const uploadResponse = await res.json();
const uploadUrl = uploadResponse.upload_url;

const data = {
  audio_url: uploadUrl, // You can also use a URL to an audio or video file on the web
  speech_models: ["universal-3-pro", "universal-2"],
  language_detection: true,
  language_detection_options: {
    expected_languages: ["en", "es", "fr", "de"],
    fallback_language: "auto",
  },
};

const url = `${baseUrl}/v2/transcript`;
res = await fetch(url, {
  method: "POST",
  headers: { ...headers, "Content-Type": "application/json" },
  body: JSON.stringify(data),
});
if (!res.ok) throw new Error(`Error: ${res.status}`);
const response = await res.json();

const transcriptId = response.id;
const pollingEndpoint = `${baseUrl}/v2/transcript/${transcriptId}`;

while (true) {
  res = await fetch(pollingEndpoint, { headers });
  if (!res.ok) throw new Error(`Error: ${res.status}`);
  const transcriptionResult = await res.json();

  if (transcriptionResult.status === "completed") {
    console.log(transcriptionResult.text);
    console.log(transcriptionResult.language_code);
    break;
  } else if (transcriptionResult.status === "error") {
    throw new Error(`Transcription failed: ${transcriptionResult.error}`);
  } else {
    await new Promise((resolve) => setTimeout(resolve, 3000));
  }
}
```

## Confidence score

If language detection is enabled, the API returns a confidence score for the detected language. The score ranges from 0.0 (low confidence) to 1.0 (high confidence).

```python title="Python SDK" for="python-sdk" highlight={10,16} theme={null}
import assemblyai as aai

aai.settings.api_key = "<YOUR_API_KEY>"

# audio_file = "./local_file.mp3"
audio_file = "https://assembly.ai/wildfires.mp3"

config = aai.TranscriptionConfig(
    speech_models=["universal-3-pro", "universal-2"],
    language_detection=True
)

transcript = aai.Transcriber(config=config).transcribe(audio_file)

print(transcript.text)
print(transcript.json_response["language_confidence"])
```

```python title="Python" for="python" highlight={34} expandable theme={null}
import requests
import time

base_url = "https://api.assemblyai.com"

headers = {
    "authorization": "<YOUR_API_KEY>"
}

with open("./my-audio.mp3", "rb") as f:
    response = requests.post(base_url + "/v2/upload",
                             headers=headers,
                             data=f)

upload_url = response.json()["upload_url"]

data = {
    "audio_url": upload_url, # You can also use a URL to an audio or video file on the web
    "speech_models": ["universal-3-pro", "universal-2"],
    "language_detection": True
}

url = base_url + "/v2/transcript"
response = requests.post(url, json=data, headers=headers)

transcript_id = response.json()['id']
polling_endpoint = base_url + "/v2/transcript/" + transcript_id

while True:
    transcription_result = requests.get(polling_endpoint, headers=headers).json()

    if transcription_result['status'] == 'completed':
        print(f"Transcript ID: {transcript_id}")
        print(f"Language Confidence: {transcription_result['language_confidence']}")
        print(f"Text: {transcription_result['text']}")
        break
    elif transcription_result['status'] == 'error':
        raise RuntimeError(f"Transcription failed: {transcription_result['error']}")
    else:
        time.sleep(3)
```

```javascript title="JavaScript SDK" for="javascript-sdk" highlight={20} expandable theme={null}
import { AssemblyAI } from "assemblyai";

const client = new AssemblyAI({
  apiKey: "<YOUR_API_KEY>",
});

// const audioFile = './local_file.mp3'
const audioFile = "https://assembly.ai/wildfires.mp3";

const params = {
  audio: audioFile,
  speech_models: ["universal-3-pro", "universal-2"],
  language_detection: true,
};

const run = async () => {
  const transcript = await client.transcripts.transcribe(params);

  console.log(transcript.text);
  console.log(transcript.language_confidence);
};

run();
```

```javascript title="JavaScript" for="javascript" highlight={37} expandable theme={null}
import fs from "fs-extra";

const baseUrl = "https://api.assemblyai.com";

const headers = {
  authorization: "<YOUR_API_KEY>",
};

const path = "./my-audio.mp3";
const audioData = await fs.readFile(path);
let res = await fetch(`${baseUrl}/v2/upload`, {
  method: "POST",
  headers,
  body: audioData,
});
if (!res.ok) throw new Error(`Error: ${res.status}`);
const uploadResponse = await res.json();
const uploadUrl = uploadResponse.upload_url;

const data = {
  audio_url: uploadUrl, // You can also use a URL to an audio or video file on the web
  speech_models: ["universal-3-pro", "universal-2"],
  language_detection: true,
};

const url = `${baseUrl}/v2/transcript`;
res = await fetch(url, {
  method: "POST",
  headers: { ...headers, "Content-Type": "application/json" },
  body: JSON.stringify(data),
});
if (!res.ok) throw new Error(`Error: ${res.status}`);
const response = await res.json();

const transcriptId = response.id;
const pollingEndpoint = `${baseUrl}/v2/transcript/${transcriptId}`;

while (true) {
  res = await fetch(pollingEndpoint, { headers });
  if (!res.ok) throw new Error(`Error: ${res.status}`);
  const transcriptionResult = await res.json();

  if (transcriptionResult.status === "completed") {
    console.log(transcriptionResult.text);
    console.log(transcriptionResult.language_confidence);
    break;
  } else if (transcriptionResult.status === "error") {
    throw new Error(`Transcription failed: ${transcriptionResult.error}`);
  } else {
    await new Promise((resolve) => setTimeout(resolve, 3000));
  }
}
```

## Set a language confidence threshold

You can set the confidence threshold that must be reached when language detection is enabled. If the language confidence is below this threshold, the API returns an error. Valid values are in the range \[0,1] inclusive.

```python title="Python SDK" for="python-sdk" highlight={10,16} theme={null}
import assemblyai as aai

aai.settings.api_key = "<YOUR_API_KEY>"

# audio_file = "./local_file.mp3"
audio_file = "https://assembly.ai/wildfires.mp3"

config = aai.TranscriptionConfig(
    speech_models=["universal-3-pro", "universal-2"],
    language_detection=True,
    language_confidence_threshold=0.8
)

transcript = aai.Transcriber(config=config).transcribe(audio_file)

if transcript.status == "error":
    raise RuntimeError(f"Transcription failed: {transcript.error}")
else:
    print(transcript.json_response["language_confidence"])
    print(transcript.text)
```

```python title="Python" for="python" highlight={21} expandable theme={null}
import requests
import time

base_url = "https://api.assemblyai.com"

headers = {
    "authorization": "<YOUR_API_KEY>"
}

with open("./my-audio.mp3", "rb") as f:
    response = requests.post(base_url + "/v2/upload",
                             headers=headers,
                             data=f)

upload_url = response.json()["upload_url"]

data = {
    "audio_url": upload_url, # You can also use a URL to an audio or video file on the web
    "speech_models": ["universal-3-pro", "universal-2"],
    "language_detection": True,
    "language_confidence_threshold": 0.8
}

url = base_url + "/v2/transcript"
response = requests.post(url, json=data, headers=headers)

transcript_id = response.json()['id']
polling_endpoint = base_url + "/v2/transcript/" + transcript_id

while True:
    transcription_result = requests.get(polling_endpoint, headers=headers).json()

    if transcription_result['status'] == 'completed':
        print(f"Transcript ID: {transcript_id}")
        print(f"Text: {transcription_result['text']}")
        break
    elif transcription_result['status'] == 'error':
        raise RuntimeError(f"Transcription failed: {transcription_result['error']}")
    else:
        time.sleep(3)
```

```javascript title="JavaScript SDK" for="javascript-sdk" highlight={14} expandable theme={null}
import { AssemblyAI } from "assemblyai";

const client = new AssemblyAI({
  apiKey: "<YOUR_API_KEY>",
});

// const audioFile = './local_file.mp3'
const audioFile = "https://assembly.ai/wildfires.mp3";

const params = {
  audio: audioFile,
  speech_models: ["universal-3-pro", "universal-2"],
  language_detection: true,
  language_confidence_threshold: 0.8,
};

const run = async () => {
  const transcript = await client.transcripts.transcribe(params);

  if (transcript.status === "error") {
    throw new Error(`Transcription failed: ${transcript.error}`);
  }

  console.log(transcript.text);
  console.log(transcript.language_confidence);
};

run();
```

```javascript title="JavaScript" for="javascript" highlight={21} expandable theme={null}
import fs from "fs-extra";

const baseUrl = "https://api.assemblyai.com";

const headers = {
  authorization: "<YOUR_API_KEY>",
};

const path = "./my-audio.mp3";
const audioData = await fs.readFile(path);
let res = await fetch(`${baseUrl}/v2/upload`, {
  method: "POST",
  headers,
  body: audioData,
});
if (!res.ok) throw new Error(`Error: ${res.status}`);
const uploadResponse = await res.json();
const uploadUrl = uploadResponse.upload_url;

const data = {
  audio_url: uploadUrl, // You can also use a URL to an audio or video file on the web
  speech_models: ["universal-3-pro", "universal-2"],
  language_detection: true,
  language_confidence_threshold: 0.8,
};

const url = `${baseUrl}/v2/transcript`;
res = await fetch(url, {
  method: "POST",
  headers: { ...headers, "Content-Type": "application/json" },
  body: JSON.stringify(data),
});
if (!res.ok) throw new Error(`Error: ${res.status}`);
const response = await res.json();

const transcriptId = response.id;
const pollingEndpoint = `${baseUrl}/v2/transcript/${transcriptId}`;

while (true) {
  res = await fetch(pollingEndpoint, { headers });
  if (!res.ok) throw new Error(`Error: ${res.status}`);
  const transcriptionResult = await res.json();

  if (transcriptionResult.status === "completed") {
    console.log(transcriptionResult.text);
    console.log(transcriptionResult.language_confidence);
    break;
  } else if (transcriptionResult.status === "error") {
    throw new Error(`Transcription failed: ${transcriptionResult.error}`);
  } else {
    await new Promise((resolve) => setTimeout(resolve, 3000));
  }
}
```

If the `language_confidence_threshold` you specify is not met, you will receive an error message like `detected language 'bg', confidence 0.2949, is below the requested confidence threshold value of '0.4'`.

## Troubleshooting

### Accented speech detected as the wrong language

Automatic Language Detection uses Whisper-based language identification, which can sometimes misidentify heavily accented speech as a different language. For example, English spoken with a strong accent may be detected as Finnish, Latvian, Latin, or Arabic.

When this happens, the model might not only return a wrong language label; it might also **transcribe the audio in the incorrectly detected language**. This effectively translates the speech rather than transcribing it, producing output in a language the speaker wasn't using. The exact transcription behavior can vary depending on the detected language and the speech model used.

### Recommended mitigations

**Use `expected_languages` to constrain detection (most effective).** If you know which languages your audio may contain, set `expected_languages` to only those languages. This prevents the model from selecting an unexpected language entirely.
For example, if your application processes interviews in English, Spanish, and French:

```json theme={null}
{
  "language_detection": true,
  "language_detection_options": {
    "expected_languages": ["en", "es", "fr"],
    "fallback_language": "en"
  }
}
```

Setting `fallback_language` to your most common language (e.g., `"en"`) ensures that if the model can't confidently choose between the expected languages, it defaults to the language most likely to produce a useful transcript.

**Use `language_confidence_threshold` to reject low-confidence detections.** Setting a threshold (e.g., `0.7`) causes the API to return an error instead of a transcript when confidence is low. This helps catch some misdetections, but not cases where the model is confidently wrong.

**Monitor `language_confidence` in responses.** Log the `language_code` and `language_confidence` fields from your transcript responses. Unexpected language codes or unusual confidence patterns can help you identify misdetection issues early and decide whether to retry with `expected_languages` or flag the transcript for review.
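
The monitoring step can be sketched as a small check applied to each completed transcript. This is an illustrative sketch rather than a recommended implementation: `needs_review`, the logger name, and the `0.7` review cutoff are assumptions chosen for the example, and the sample dictionaries are made-up values. Only the `language_code` and `language_confidence` field names come from this page.

```python
import logging
from typing import Any, Mapping, Sequence

logger = logging.getLogger("language_detection")


def needs_review(
    transcript: Mapping[str, Any],
    expected_languages: Sequence[str],
    min_confidence: float = 0.7,  # hypothetical cutoff, tune for your data
) -> bool:
    """Log the detection result and report whether it looks suspicious."""
    code = transcript.get("language_code")
    confidence = transcript.get("language_confidence")
    logger.info("language_code=%s language_confidence=%s", code, confidence)

    # An unexpected language code is the clearest sign of a misdetection.
    if code not in expected_languages:
        return True
    # A missing or low confidence score is also worth a second look.
    return confidence is None or confidence < min_confidence


# Hypothetical response fragments, for illustration only.
print(needs_review({"language_code": "en", "language_confidence": 0.95}, ["en", "es", "fr"]))
print(needs_review({"language_code": "fi", "language_confidence": 0.91}, ["en", "es", "fr"]))
```

The second call returns `True` even though the confidence is high, which covers the confidently wrong case that a threshold alone does not catch.
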