Code Switching

Transcribe audio that contains multiple languages with code switching detection. This feature enables accurate transcription of conversations in which speakers naturally switch between languages.

What is code switching?

Code switching occurs when speakers alternate between languages in dialogue.

Our code switching feature detects and transcribes these language transitions, maintaining accuracy across languages.

How it works

When you submit a file with code-switching dialogue, our model can transcribe both languages, with optimal performance for English paired with another language. The key to effective code switching transcription is selecting the appropriate primary language parameter. For Spanish or German paired with English, the model performs equally well regardless of which language dominates the conversation.

Best practices

  1. Language Selection: For English paired with Spanish or German, specify the non-English language as the primary language. This approach works equally well whether English or the other language dominates the conversation. For other language pairs, this strategy works best when the non-English language is dominant in the audio.

  2. Avoid Automatic Detection: Do not use automatic language detection (language_detection=True) for code-switched content, as it may default to English and produce poor results for non-English segments.

  3. Testing: Test with sample audio to determine the best language parameter for your specific use case.

Language Selection Strategy

For best results with code-switched content, choose the primary language based on the language combination:

  • Spanish/German + English: Specify the non-English language as primary (e.g., es for Spanish). This works equally well regardless of language dominance.
  • Other languages + English: Specify the non-English language as primary only when that language is dominant in the audio. This keeps the model from favoring English and producing poor results for non-English segments. For English-dominant content with other languages, standard English transcription may be more appropriate.
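
The selection rules above can be sketched as a small helper. This is only an illustration: the function name choose_language_code, the DOMINANCE_INDEPENDENT set, and the fallback used when the dominant language is unknown are assumptions, not part of the SDK.

```python
from typing import Optional

# Non-English languages that the text above describes as working equally
# well with English regardless of which language dominates the audio.
DOMINANCE_INDEPENDENT = {"es", "de"}


def choose_language_code(other_language: str, dominant: Optional[str] = None) -> str:
    """Suggest a language_code for audio that mixes English with other_language.

    other_language: code of the non-English language, e.g. "es".
    dominant: code of the language that dominates the audio, if known.
    """
    other = other_language.lower()

    # Spanish/German + English: always use the non-English language.
    if other in DOMINANCE_INDEPENDENT:
        return other

    # Other pairs: fall back to English only when English clearly dominates.
    if dominant is not None and dominant.lower() == "en":
        return "en"

    # Assumption: when dominance is unknown, try the non-English language
    # first and confirm the choice by testing on sample audio.
    return other
```

The returned value would be passed as language_code in the transcription config. Treat it as a starting point and verify it against sample audio.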

Supported language pairs

Code switching performance varies by language pair.

Optimal performance (works equally well regardless of language dominance):

  • English + Spanish
  • English + German

Other supported languages:

While additional languages are supported for code switching, optimal results typically require the non-English language to be dominant in the audio. For English-dominant content with other languages, standard single-language transcription may be more appropriate.

We highly recommend testing code switching on samples of your own audio to assess performance and evaluate outputs. We also recommend using an LLM to correct and refine our model’s outputs.
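
One way to run such a test is to transcribe the same sample with several candidate language codes and read the results side by side. The sketch below assumes the Python SDK is installed and that aai.settings.api_key has already been set; the helper names compare_language_codes and assemblyai_transcribe are hypothetical.

```python
from typing import Callable, Dict, Iterable


def compare_language_codes(
    audio: str,
    codes: Iterable[str],
    transcribe: Callable[[str, str], str],
) -> Dict[str, str]:
    """Transcribe the same audio once per candidate language code.

    Returns a mapping of language code to transcript text so the
    outputs can be reviewed side by side.
    """
    return {code: transcribe(audio, code) for code in codes}


def assemblyai_transcribe(audio: str, code: str) -> str:
    """Transcribe audio with an explicit language_code via the SDK."""
    # Imported lazily so the comparison helper can be used with any backend.
    import assemblyai as aai

    config = aai.TranscriptionConfig(language_code=code)
    transcript = aai.Transcriber(config=config).transcribe(audio)
    if transcript.status == aai.TranscriptStatus.error:
        raise RuntimeError(transcript.error)
    return transcript.text
```

For example, compare_language_codes(audio_file, ["es", "en"], assemblyai_transcribe) would return one transcript per code for manual review. Each call submits a separate transcription request.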

Code switching implementation

import assemblyai as aai

aai.settings.api_key = "<YOUR_API_KEY>"

audio_file = "https://assembly.ai/spanglish.mp3"

# Set Spanish as the primary language for English-Spanish mixing
config = aai.TranscriptionConfig(language_code="es")

transcript = aai.Transcriber(config=config).transcribe(audio_file)

print(transcript.text)

Example output

Here’s what you can expect from a properly configured code-switching transcription:

Input: Audio containing English and Spanish conversation

Configuration: language_code="es"

Original: "Hola, how are you today? Estoy muy bien, gracias. I wanted to discuss el proyecto that we talked about yesterday."
Transcribed: "Hola, how are you today? Estoy muy bien, gracias. I wanted to discuss el proyecto that we talked about yesterday."

The transcription maintains accuracy across both languages without attempting to force everything into a single language.

Troubleshooting

Poor transcription quality?

If you’re getting gibberish or incorrect transcriptions:

  1. Ensure you’re NOT using language_code="en" for English-mixed content
  2. Try setting the non-English language as the primary language
  3. Avoid using automatic language detection for code-switched audio
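
The checks above can be automated before a request is submitted. This is a minimal sketch: check_code_switching_settings is a hypothetical helper, and treating every code that starts with "en" as English is an assumption.

```python
from typing import List, Optional


def check_code_switching_settings(
    language_code: Optional[str],
    language_detection: bool = False,
) -> List[str]:
    """Return warnings for settings that the checklist above advises against."""
    warnings: List[str] = []

    # Automatic detection may default to English on code-switched audio.
    if language_detection:
        warnings.append(
            "Automatic language detection is enabled; "
            "set an explicit language_code instead."
        )

    # Assumption: any code beginning with "en" is an English variant.
    if language_code is not None and language_code.lower().startswith("en"):
        warnings.append(
            "language_code is English; try the non-English language "
            "as the primary language."
        )

    return warnings
```

An empty list means none of the listed problems were found. It does not guarantee good transcription quality.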

If your audio contains primarily one language with only occasional words from another language, standard single-language transcription may be more appropriate than code-switching mode.