Products

Core Transcription

Automatically convert your audio and video files, and live audio streams, into text with advanced AI models using our simple Speech-to-Text APIs.

Explore Key Features

Why Core Transcription

Quickly process large volumes of data
Our API processes millions of audio files every day for hundreds of customers, including dozens of Fortune 500 enterprises.
Easily access our cutting-edge AI models
With a simple API request, access years of research into state-of-the-art AI models for speech recognition.
Securely analyze your data at scale
Access our state-of-the-art AI models through a secure API that is SOC 2 and GDPR-certified.

All Core Transcription features

Async Transcription
Transcribe pre-recorded audio and/or video files in seconds, with human-level accuracy. Highly scalable to tens of thousands of files in parallel.
Real-Time Transcription
If you're working with live audio streams, you can stream your audio data to us in real time. We stream transcripts back to you within a few hundred milliseconds, then revise them for greater accuracy as more context arrives.
Custom Vocabulary
Boost accuracy for vocabulary that is unique or custom to your specific use case or product.
Speaker Labels
The AssemblyAI API can automatically detect the number of speakers in your audio file, and each word in the transcription text can be associated with its speaker.
International Language Support
We support over 15 languages and counting, including Global English (English and all of its accents).
All Audio and Video Formats Accepted
Don't worry about file formats or sampling rates: our API supports virtually all audio and video files without any transcoding required.
Automatic Punctuation and Casing
Punctuation and the casing of proper nouns are automatically added to the transcription text.
Confidence Scores
Get a confidence score for each word in the transcript.
Word Timings
Get word-by-word timestamps across the entire transcript text.
Paragraph Detection
Export your transcription broken down into automatically generated paragraphs.
Export as Captions
Easily export your transcription in SRT or VTT format, to be plugged into a video player for subtitles and closed captions.
Dual-Channel Transcription
The API can split your dual-channel audio files and provide a transcription for each unique channel.
Language Detection
Automatically detect whether the dominant language of the spoken audio is supported by our API, and route the audio to the appropriate model for transcription.
Filler Words
Optionally include disfluencies in the transcripts of your audio files.
Profanity Filtering
Automatically detect and replace profanity in the transcription text.
Word Search
Search over your completed transcripts for specific words and phrases.
Privacy Protection
Files sent to the API for transcription are never stored, and you can request the permanent deletion of transcription text from our database.
Custom Spelling
Specify how you would like certain words to be spelled or formatted in the transcription text.
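
To give a feel for how word-level output such as Word Timings, Confidence Scores, and Speaker Labels could feed a caption export, here is a minimal Python sketch. It does not call the API: the Word record, its field names, and the helper functions are hypothetical and chosen only for illustration, and the real response schema may differ. Only the SRT timestamp layout (HH:MM:SS,mmm) follows the standard caption format.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Word:
    # Hypothetical word record; field names are illustrative, not the API's schema.
    text: str
    start_ms: int      # word start time in milliseconds
    end_ms: int        # word end time in milliseconds
    confidence: float  # 0.0 to 1.0
    speaker: str       # speaker label, e.g. "A" or "B"


def format_srt_time(ms: int) -> str:
    """Format a millisecond offset as an SRT timestamp (HH:MM:SS,mmm)."""
    hours, rem = divmod(ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds, millis = divmod(rem, 1_000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def words_to_srt(words: List[Word], max_words: int = 7) -> str:
    """Group words into cues of at most max_words, starting a new cue on speaker change."""
    cues: List[List[Word]] = []
    for word in words:
        same_speaker = bool(cues) and cues[-1][-1].speaker == word.speaker
        if same_speaker and len(cues[-1]) < max_words:
            cues[-1].append(word)
        else:
            cues.append([word])

    blocks = []
    for index, cue in enumerate(cues, start=1):
        start = format_srt_time(cue[0].start_ms)
        end = format_srt_time(cue[-1].end_ms)
        text = " ".join(w.text for w in cue)
        blocks.append(f"{index}\n{start} --> {end}\n{text}\n")
    return "\n".join(blocks)


def low_confidence_words(words: List[Word], threshold: float = 0.8) -> List[str]:
    """Return the text of words whose confidence falls below the threshold."""
    return [w.text for w in words if w.confidence < threshold]


# Made-up sample data standing in for a transcript's word list.
SAMPLE_WORDS = [
    Word("Hello", 0, 400, 0.99, "A"),
    Word("there.", 450, 900, 0.95, "A"),
    Word("Hi!", 1200, 1500, 0.62, "B"),
]

if __name__ == "__main__":
    print(words_to_srt(SAMPLE_WORDS))
    print(low_confidence_words(SAMPLE_WORDS))
```

With the sample data, the two words from speaker A share one cue and the word from speaker B gets its own, while the confidence filter flags only the last word for review. In practice you would likely use the built-in SRT or VTT export rather than building captions by hand; the sketch only shows how the word-level fields relate to each other.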

Want to do more with your audio?

Explore our range of Audio Intelligence features to do more with your audio, from Sentiment Analysis to Content Moderation.

Learn more about Audio Intelligence
Sentiment Analysis
With Sentiment Analysis, AssemblyAI can detect the sentiment of each sentence of speech spoken in your audio files.
Entity Detection
Identify a wide range of entities that are spoken in your files, such as person and company names, email addresses, and locations.

All with one simple API

Built for developers

Powered by deep learning
