Batch Asynchronous Transcription. Transcribe thousands of audio and/or video files in minutes.
Speaker Labels (Speaker Diarization). The number of speakers in the audio is detected automatically, and each transcribed word is associated with its speaker. This helps answer the question "Who spoke when?"
Top-Rated Accuracy. Our API is built on the latest advances in deep learning, drawing on cutting-edge research from our in-house team of AI researchers.
Word Timings. Word-by-word timestamps across the entire transcript text.
Paragraph Detection. Export your transcription broken down into automatically generated paragraphs.
Real-Time Transcription. Transcribe audio streams in real time with high accuracy.
All Audio and Video Formats Accepted. Don't worry about file formats or sampling rates: our API supports virtually all audio and video files, with no transcoding required.
Automatic Punctuation and Casing. Punctuation and casing, including the capitalization of proper nouns, are automatically added to the transcription text to make transcripts produced by the API more readable.
Confidence Scores. Get a confidence score for each word in the transcript.
Export as Captions (SRT/VTT). Easily export your transcription in SRT or VTT format, to be plugged into a video player for subtitles and closed captions.
Topic Detection. Automatically determine the topics discussed in your audio or video files. This feature uses the IAB Taxonomy to predict over 698 different topic labels, such as "Automotive > Self Driving Cars".
PII Redaction. Automatically detect and replace sensitive data, like credit card numbers and social security numbers, in the transcription text and source audio.
Profanity Filtering. Automatically detect and replace profanity in the transcription text.
Sentiment Analysis. Automatically score segments of your transcript between -1 and 1 based on how positive the sentiment is.
Emotion Detection. Automatically detect the intensity of emotions (happy, angry, etc.) in each segment of your audio content.
Translation. Automatically translate transcription text from one language into multiple other languages.
Content Safety Detection. Automatically detect sensitive content in your transcriptions, such as content about drugs, weapons, NSFW content, and over 20 other types of content.
Automatic Transcript Highlights. The API can automatically detect key phrases and words in your transcription text, which is useful for tagging content and for summarizing the transcription text.
Chapter Detection. Automatically split content into chapters (similar to YouTube), to make large transcripts more structured and readable.
Entity Detection. Automatically detect a wide range of entities like people and company names, email addresses, dates, locations, events, and more.
Summarization. Automatically condense your content into a single summary to make sense of longer-form content.
Custom Vocabulary. Boost accuracy for a list of keywords or phrases when transcribing an audio or video file. This can include proper nouns such as company names, people's names, and product names, as well as industry-specific terms.
99.9%+ Uptime. Thousands of developers and startups trust AssemblyAI to power core features in production, with over 99.9% uptime and transcript completion rate.
Advanced Security & Privacy. We adhere to best-practice guidelines like encryption in transit and at rest.
24x7 Support. We're here to work with you as much, or as little, as you'd like. Talk to us over live chat, Slack, email, or phone.
Data Privacy. We are not in the business of monetizing your data. Files sent to the API for transcription are never stored, and you can request the deletion of transcription text permanently from our database.
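To make a few of the features above more concrete (word timings, confidence scores, speaker labels, and caption export), here is a minimal Python sketch of how word-level output could be turned into SRT captions on the client side. Everything in it is an assumption made for illustration: the Word class, its field names, the millisecond units, and the helper functions are hypothetical and are not the API's actual response schema or its built-in SRT/VTT export.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Word:
    """Hypothetical word record; field names and units are assumptions."""
    text: str
    start_ms: int      # word start time, in milliseconds
    end_ms: int        # word end time, in milliseconds
    confidence: float  # assumed to lie between 0.0 and 1.0
    speaker: str       # speaker label, e.g. "A" or "B"


def srt_timestamp(ms: int) -> str:
    """Format milliseconds as an SRT timestamp (HH:MM:SS,mmm)."""
    hours, rem = divmod(ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds, millis = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def words_to_srt(words: List[Word], max_words: int = 7) -> str:
    """Group consecutive words into cues and render them as SRT text.

    A new cue starts when the speaker changes or the current cue
    already holds max_words words.
    """
    cues: List[List[Word]] = []
    for word in words:
        same_speaker = bool(cues) and cues[-1][0].speaker == word.speaker
        if same_speaker and len(cues[-1]) < max_words:
            cues[-1].append(word)
        else:
            cues.append([word])

    blocks = []
    for index, cue in enumerate(cues, start=1):
        start = srt_timestamp(cue[0].start_ms)
        end = srt_timestamp(cue[-1].end_ms)
        text = " ".join(w.text for w in cue)
        blocks.append(f"{index}\n{start} --> {end}\nSpeaker {cue[0].speaker}: {text}")
    return "\n\n".join(blocks) + "\n"


def low_confidence(words: List[Word], threshold: float = 0.8) -> List[str]:
    """Return the words whose confidence falls below the threshold."""
    return [w.text for w in words if w.confidence < threshold]


# Made-up sample data, for illustration only.
sample_words = [
    Word("Hello", 0, 400, 0.98, "A"),
    Word("there.", 450, 900, 0.95, "A"),
    Word("Hi!", 1200, 1500, 0.62, "B"),
]

if __name__ == "__main__":
    print(words_to_srt(sample_words))
    print("Review these words:", low_confidence(sample_words))
```

Starting a new cue at each speaker change keeps every caption attributed to a single speaker, and the confidence filter shows one way per-word scores could be used to flag words for manual review.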