Automatically convert your audio and video files into text with an advanced AI model using our simple API.
AssemblyAI can generate a single abstractive summary of the entire source submitted for transcription.
AssemblyAI can label the topics discussed in your audio/video files, following the standardized IAB Taxonomy.
Generate summaries over time, with one automatically generated summary for each chapter of the content.
AssemblyAI can detect whether any sensitive content is spoken, and pinpoint exactly what was said and when.
Automatically detect important phrases and words in your source. Your submission may take up to 60 seconds longer to process.
Detect the sentiment of each sentence spoken in your source, returning a confidence score and exact timestamp for each.
Identify a wide range of entities mentioned in your audio files, such as person and company names, locations, and more.
Automatically remove Personally Identifiable Information, such as phone numbers and social security numbers.
Automatically detect the number of speakers in your audio file, and associate each piece of text with its speaker.
If your source is a dual-channel audio file, the API supports transcribing each channel separately.
By default, the API will return a verbatim transcription of the audio, meaning profanity will be present in the transcript.
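
As a rough illustration of how the features above might be switched on for a single submission, the sketch below builds a JSON request body in Python. Everything in it is an assumption rather than something stated in this text: the flag names (such as `sentiment_analysis` and `filter_profanity`), the `audio_url` field, and the helper `build_transcript_request` are illustrative, and the exact names should be checked against the current API reference. The sketch only builds the body and does not send it.

```python
import json

# Assumed request-body flag for each feature described above.
# These names do not appear in this text; verify them in the API reference.
FEATURE_FLAGS = {
    "summary": "summarization",
    "topics": "iab_categories",
    "chapters": "auto_chapters",
    "content_moderation": "content_safety",
    "key_phrases": "auto_highlights",
    "sentiment": "sentiment_analysis",
    "entities": "entity_detection",
    "speaker_labels": "speaker_labels",
    "dual_channel": "dual_channel",
    "profanity_filter": "filter_profanity",
}


def build_transcript_request(audio_url, features):
    """Return a JSON-serializable body with the chosen features enabled.

    Hypothetical helper: it only assembles a dict and makes no network call.
    """
    unknown = sorted(set(features) - set(FEATURE_FLAGS))
    if unknown:
        raise ValueError(f"Unknown feature(s): {', '.join(unknown)}")
    body = {"audio_url": audio_url}  # assumed field name for the source file
    for name in features:
        body[FEATURE_FLAGS[name]] = True
    return body


if __name__ == "__main__":
    request_body = build_transcript_request(
        "https://example.com/meeting.mp3",  # placeholder URL
        ["sentiment", "speaker_labels", "profanity_filter"],
    )
    print(json.dumps(request_body, indent=2))
```

PII removal is left out of the sketch because it likely needs extra settings describing which kinds of information to remove. Rejecting unknown feature names up front keeps a typo from silently submitting a request with a feature missing.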