Code Switching

Transcribe audio containing multiple languages with code switching detection. This feature enables accurate transcription of conversations in which speakers naturally switch between languages.

Code Switching and How It Works

Code switching occurs when speakers alternate between languages in dialogue.

Our code switching feature detects and transcribes these language transitions, maintaining accuracy across languages. When you submit a file with code-switching dialogue, our model can transcribe both languages, with optimal performance for English paired with another language.

Supported Language Pairs

Code switching performance varies by language pair.

Optimal performance:

  • English + Spanish
  • English + German

Other supported languages:

While additional languages are supported for code switching, optimal results typically require the non-English language to be dominant in the audio. For English-dominant content with other languages, standard single-language transcription may be more appropriate.

We highly recommend testing code switching on samples of your own audio to assess performance and evaluate the outputs. We also recommend using an LLM to correct and refine our model’s outputs.

Manually Setting Language Codes

To set the language codes manually, use the language_codes parameter. You can set at most two language codes, and one of them must be "en". For example, if your file contains both English and Spanish, set "language_codes": ["en", "es"].

import requests
import time

base_url = "https://api.assemblyai.com"

headers = {
    "authorization": "<YOUR_API_KEY>"
}

# Upload the local audio file
with open("./bilingual-audio.mp3", "rb") as f:
    response = requests.post(base_url + "/v2/upload",
                             headers=headers,
                             data=f)

upload_url = response.json()["upload_url"]

data = {
    "audio_url": upload_url,
    "language_codes": ["en", "es"]  # English-Spanish code switching
}

# Submit the transcription request
url = base_url + "/v2/transcript"
response = requests.post(url, json=data, headers=headers)

transcript_id = response.json()['id']
polling_endpoint = base_url + "/v2/transcript/" + transcript_id

# Poll until the transcript completes or fails
while True:
    transcription_result = requests.get(polling_endpoint, headers=headers).json()

    if transcription_result['status'] == 'completed':
        print("Transcript:", transcription_result['text'])
        break

    elif transcription_result['status'] == 'error':
        raise RuntimeError(f"Transcription failed: {transcription_result['error']}")

    else:
        time.sleep(3)
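
The constraints on language_codes described in this section (at most two codes, one of which must be "en") can be checked client-side before a request is submitted. The following is a minimal sketch of such a check. The build_code_switching_payload helper is hypothetical and is not part of the API or any SDK; it only encodes the rules stated above.

```python
def build_code_switching_payload(audio_url, language_codes):
    """Hypothetical helper: validate language_codes and build the request body."""
    codes = list(language_codes)
    # Rule from this section: at most two language codes may be set.
    if len(codes) > 2:
        raise ValueError("language_codes accepts at most two codes")
    # Rule from this section: one of the codes must be "en".
    if "en" not in codes:
        raise ValueError('one of the language codes must be "en"')
    return {"audio_url": audio_url, "language_codes": codes}


# Example: an English-Spanish request body (the URL is a placeholder).
data = build_code_switching_payload("https://example.com/audio.mp3", ["en", "es"])
```

The returned dictionary could then be passed as the json argument of the transcript request, in place of the hand-written data dictionary.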

Use Automatic Language Detection

To use automatic language detection for code switching, set language_detection to True and, within language_detection_options, set code_switching to True.

Optionally, you can set code_switching_confidence_threshold to a number between 0 and 1. Only languages identified with a confidence above this threshold are used. At most two languages can be identified.

Code Switching Confidence Threshold

By default, the code_switching_confidence_threshold parameter is set to 0.3. To disable the threshold, set this parameter to 0.

import requests
import time

base_url = "https://api.assemblyai.com"

headers = {
    "authorization": "<YOUR_API_KEY>"
}

# Upload the local audio file
with open("./bilingual-audio.mp3", "rb") as f:
    response = requests.post(base_url + "/v2/upload",
                             headers=headers,
                             data=f)

upload_url = response.json()["upload_url"]

data = {
    "audio_url": upload_url,
    "language_detection": True,
    "language_detection_options": {
        "code_switching": True,
        "code_switching_confidence_threshold": 0.5  # Optional parameter - this is set to 0.3 by default
    },
}

# Submit the transcription request
url = base_url + "/v2/transcript"
response = requests.post(url, json=data, headers=headers)

transcript_id = response.json()['id']
polling_endpoint = base_url + "/v2/transcript/" + transcript_id

# Poll until the transcript completes or fails
while True:
    transcription_result = requests.get(polling_endpoint, headers=headers).json()

    if transcription_result['status'] == 'completed':
        print("Transcript:", transcription_result['text'])
        break

    elif transcription_result['status'] == 'error':
        raise RuntimeError(f"Transcription failed: {transcription_result['error']}")

    else:
        time.sleep(3)
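
If the threshold comes from user input or configuration, it may help to validate it before building the request. The sketch below assumes the documented range of 0 to 1 is inclusive at both ends, which this page does not state explicitly for the upper bound. The build_detection_payload helper is hypothetical and is not part of the API or any SDK.

```python
def build_detection_payload(audio_url, confidence_threshold=None):
    """Hypothetical helper: request body for code switching with automatic language detection."""
    options = {"code_switching": True}
    if confidence_threshold is not None:
        # Assumption: both 0 and 1 are accepted; 0 disables the threshold.
        if not 0 <= confidence_threshold <= 1:
            raise ValueError("code_switching_confidence_threshold must be between 0 and 1")
        options["code_switching_confidence_threshold"] = confidence_threshold
    return {
        "audio_url": audio_url,
        "language_detection": True,
        "language_detection_options": options,
    }


# Omitting the threshold leaves the documented default (0.3) in effect.
default_payload = build_detection_payload("https://example.com/audio.mp3")
# Passing 0 disables the threshold.
no_threshold_payload = build_detection_payload("https://example.com/audio.mp3", 0)
```

Leaving the key out when no threshold is given, rather than sending a default value from the client, keeps the request aligned with whatever default the service applies.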

Example API Response

When code switching is enabled with automatic language detection, the transcript JSON includes the two detected language codes with the highest confidence, along with their confidence scores.

"language_detection_results": {
    "code_switching_languages": [
        {"language": "en", "confidence": 0.8},
        {"language": "es", "confidence": 0.7}
    ]
}
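
One way to read these values from a completed transcript is sketched below. It assumes language_detection_results sits at the top level of the transcript JSON, as the fragment above suggests. The detected_languages helper is hypothetical, and it returns an empty list when the field is absent, for example when code switching was not enabled.

```python
def detected_languages(transcript):
    """Hypothetical helper: (language, confidence) pairs, highest confidence first."""
    # Tolerate a missing or null field rather than raising KeyError.
    results = transcript.get("language_detection_results") or {}
    languages = results.get("code_switching_languages") or []
    pairs = [(item["language"], item["confidence"]) for item in languages]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


# Sample data copied from the example response above.
sample = {
    "language_detection_results": {
        "code_switching_languages": [
            {"language": "en", "confidence": 0.8},
            {"language": "es", "confidence": 0.7},
        ]
    }
}
print(detected_languages(sample))  # [('en', 0.8), ('es', 0.7)]
```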

Example Output

Here’s what you can expect from a properly configured code-switching transcription:

Input: Audio containing English and Spanish conversation

Original: "Hola, how are you today? Estoy muy bien, gracias. I wanted to discuss el proyecto that we talked about yesterday."
Transcribed: "Hola, how are you today? Estoy muy bien, gracias. I wanted to discuss el proyecto that we talked about yesterday."

The transcription maintains accuracy across both languages without attempting to force everything into a single language.

Troubleshooting

If your audio contains primarily one language with only occasional words from another language, standard single-language transcription may be more appropriate than code-switching mode.