Models
AssemblyAI offers several state-of-the-art speech recognition models, each optimized for different use cases. Choose the model that best fits your needs based on accuracy, latency, cost, and language requirements.
- Slam-1: Highest accuracy for transcribing pre-recorded English audio, with fine-tuning support and customization via prompting
- Universal: Best for out-of-the-box transcription of pre-recorded audio, with multilingual support, excellent accuracy, and low latency
- Streaming: Streaming audio transcription optimized for voice agents and real-time applications
Choosing the right model
Slam-1
- Best for: English content requiring highest accuracy
- Key benefits:
  - Superior accuracy for English content
  - Fine-tuning support
  - Ideal for domain-specific terminology
Universal
- Best for: Production-ready transcription out of the box
- Key benefits:
  - Excellent accuracy-to-latency ratio
  - Multi-language support
  - No configuration needed
  - Ideal for conversational intelligence
Breakdown of Universal language support
High accuracy (≤10% WER)
English, Spanish, French, German, Indonesian, Italian, Japanese, Dutch, Polish, Portuguese, Russian, Turkish, Ukrainian, Catalan
Good accuracy (>10% to ≤25% WER)
Arabic, Azerbaijani, Bulgarian, Bosnian, Mandarin Chinese, Czech, Danish, Greek, Estonian, Finnish, Filipino, Galician, Hindi, Croatian, Hungarian, Korean, Macedonian, Malay, Norwegian Bokmål, Romanian, Slovak, Swedish, Thai, Urdu, Vietnamese, Cantonese
Moderate accuracy (>25% to ≤50% WER)
Afrikaans, Belarusian, Welsh, Persian (Farsi), Hebrew, Armenian, Icelandic, Kazakh, Lithuanian, Latvian, Māori, Marathi, Slovenian, Swahili, Tamil
Fair accuracy (>50% WER)
Amharic, Assamese, Bengali, Gujarati, Hausa, Javanese, Georgian, Khmer, Kannada, Luxembourgish, Lingala, Lao, Malayalam, Mongolian, Maltese, Burmese, Nepali, Occitan, Punjabi, Pashto, Sindhi, Shona, Somali, Serbian, Telugu, Tajik, Uzbek, Yoruba
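The tier boundaries above (10%, 25%, and 50% word error rate) can be expressed as a small lookup, which may be handy when comparing your own evaluation results against these tiers. This is a minimal sketch: `accuracy_tier` is a hypothetical helper name, not part of any AssemblyAI SDK, and it assumes WER is given as a percentage.

```python
def accuracy_tier(wer_percent: float) -> str:
    """Map a word error rate (in percent) to the accuracy tier names used on this page.

    Hypothetical helper for illustration only; not part of any SDK.
    """
    if wer_percent < 0:
        raise ValueError("WER cannot be negative")
    if wer_percent <= 10:
        return "High"      # at most 10% WER
    if wer_percent <= 25:
        return "Good"      # above 10%, up to 25% WER
    if wer_percent <= 50:
        return "Moderate"  # above 25%, up to 50% WER
    return "Fair"          # above 50% WER


if __name__ == "__main__":
    for wer in (8.0, 18.5, 42.0, 63.0):
        print(f"{wer}% WER -> {accuracy_tier(wer)} accuracy")
```

Each boundary value belongs to the better tier, matching the "≤" limits in the breakdown.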
Streaming
- Best for: Voice agents and real-time voice applications
- Key benefits:
  - ~300ms immutable transcripts
  - Continuous speech recognition
  - Intelligent endpointing
  - Ideal for voice agents and interactive applications
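The guidance above can be condensed into a simple decision helper. This is only a sketch under stated assumptions: `choose_model` and its parameters are hypothetical names chosen for illustration, and the returned strings are the model names as written on this page, not API identifiers.

```python
def choose_model(realtime: bool, english_only: bool, prioritize_accuracy: bool) -> str:
    """Suggest a model name based on the "Best for" guidance on this page.

    Hypothetical helper for illustration; the return values are display
    names from this page, not request parameters.
    """
    if realtime:
        # Voice agents and other real-time applications
        return "Streaming"
    if english_only and prioritize_accuracy:
        # Pre-recorded English content that needs the highest accuracy
        return "Slam-1"
    # Out-of-the-box transcription, including multilingual audio
    return "Universal"


if __name__ == "__main__":
    print(choose_model(realtime=False, english_only=True, prioritize_accuracy=True))
    print(choose_model(realtime=False, english_only=False, prioritize_accuracy=True))
    print(choose_model(realtime=True, english_only=True, prioritize_accuracy=False))
```

The real-time check comes first because Streaming is the only model described here for live audio. Slam-1 is suggested only for English content, since this page does not describe it as supporting other languages.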
Pricing
For detailed pricing information, visit our pricing page.
For volume discounts, please reach out to sales@assemblyai.com.
Next steps
- For pre-recorded audio, see how to select your model
- For real-time transcription, check out our streaming documentation