
Introducing Universal-3 Pro

Learn how to transcribe audio using Universal-3 Pro.


Overview

Universal-3 Pro is our most powerful Voice AI model, designed to capture the “hard stuff” that traditional ASR models struggle with. Out of the box, it delivers state-of-the-art accuracy for entities, rare words, and domain-specific terminology, handles code switching, and supports optional prompting when you need more control. It’s also our fastest model, so you get the best accuracy without sacrificing speed.

Universal-3 Pro is available for both pre-recorded (async) and streaming use cases. Configuration and settings differ between the two: streaming is optimized for real-time audio utterances, typically under 10 seconds, and the model includes special efficiencies for low-latency turn detection and voice agent workflows.

Based on your use case, navigate to the appropriate guide below:

Universal-3 Pro Async

For pre-recorded audio files. Supports long-form audio, prompting, keyterms prompting, and full language detection.

Universal-3 Pro Streaming

For real-time audio streams. Optimized for low-latency turn detection, voice agents, and live transcription.