Overview
AssemblyAI's API can be used to transcribe and understand audio/video files with our AI models. For a very quick introduction, check out our Quickstart below.
Our team of developers is online nearly 24x7 to answer any questions you might have about our API, Documentation, or any of our features.
The AssemblyAI CLI is the easiest way to test our API. It's simple to install on any operating system (macOS, Windows, Linux), and works on virtually any audio or video file - even YouTube links!
You can also pass flags to the CLI to enable various models, such as --auto_highlights. To view all of the flags, along with more thorough documentation about the CLI, head over to the CLI's GitHub repository.
Asynchronous transcription refers to transcription of pre-recorded audio/video files.
When you submit an audio file for transcription, it will typically complete in 15-30% of the audio file's duration. For example, a 10-minute file would complete in around 1.5 minutes, but could take up to 3 minutes.
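The arithmetic above can be sketched as a small helper. This is only an illustration of the 15-30% rule of thumb stated here; the function name `estimate_turnaround_seconds` is hypothetical and is not part of the AssemblyAI API or SDK.

```python
def estimate_turnaround_seconds(audio_seconds: float) -> tuple[float, float]:
    """Return (typical, upper) processing-time estimates in seconds.

    Assumes the 15-30% range quoted in the documentation; actual
    turnaround depends on the file and on current load.
    """
    if audio_seconds < 0:
        raise ValueError("audio_seconds must be non-negative")
    typical = audio_seconds * 15 / 100  # lower end of the quoted range
    upper = audio_seconds * 30 / 100    # upper end of the quoted range
    return typical, upper


# A 10-minute file, as in the example above.
typical, upper = estimate_turnaround_seconds(10 * 60)
print(f"typical: {typical / 60:.1f} min, upper bound: {upper / 60:.1f} min")
```

For the 10-minute file, this gives the same 1.5 and 3 minute figures quoted above.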
Our Real-Time Streaming WebSocket API streams text transcriptions back to clients within a few hundred milliseconds.
Your account has what's called a "Throttle Limit", which controls how many audio files or real-time audio streams you can process in parallel.
If you need a higher limit than what we list below, please reach out to us to have these limits increased.
Below are the limits for how many audio files you can have processing in parallel when submitting jobs via the /v2/transcript endpoint. If you go over your limit, your jobs will begin to queue.
Account Type | Limit |
---|---|
Free | 5 |
Paid | 32 |
If you need a higher limit than what is listed above, please reach out to us to have these limits increased.
Below are the limits for how many real-time audio streams you can have open in parallel.
Account Type | Limit |
---|---|
Free | 0 |
Paid | 32 |
More on Concurrency
Your Throttle Limit is the same for Asynchronous and Real-Time Transcription, but the number of jobs processing in parallel is counted separately for each. So if your Throttle Limit is 32, you can have up to 32 Asynchronous files processing and up to 32 Real-Time streams open at the same time without exceeding your Throttle Limit.
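The separate counting can be illustrated with a small client-side model. This is a sketch for reasoning about the rule, not how the API enforces it; the `ThrottleTracker` class and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ThrottleTracker:
    """Client-side model of one Throttle Limit applied per job type."""

    limit: int                 # the account's Throttle Limit, e.g. 32
    async_jobs: int = 0        # asynchronous files currently processing
    realtime_streams: int = 0  # real-time streams currently open

    def can_start_async(self) -> bool:
        # Asynchronous jobs are counted only against other asynchronous jobs.
        return self.async_jobs < self.limit

    def can_open_stream(self) -> bool:
        # Real-time streams are counted only against other streams.
        return self.realtime_streams < self.limit


# 32 files are already processing, but only 10 streams are open.
tracker = ThrottleTracker(limit=32, async_jobs=32, realtime_streams=10)
print(tracker.can_start_async())  # False: further files would queue
print(tracker.can_open_stream())  # True: streams have their own count
```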
The API will always return a JSON response when there is an error.
Invalid API Token
API requests made with an invalid API token will always return a 401 status code and a JSON response like:
{
"error": "Authentication error, API token missing/invalid"
}
Invalid API Request
When something is wrong with your API request, the API will return a 400 status code:
{
"error": "format_text must be a Boolean"
}
The error key will always contain more information about what was wrong with your request.
Server Errors
When something is wrong on our side, the API will return a 500 status code:
{
"error": "Server error, developers have been alerted."
}
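As a rough sketch, a client could map the three status codes described above to an error category and pull the message out of the error key. The function name `explain_error` and the category labels are hypothetical; only the status codes and the error key come from the documentation above.

```python
import json


def explain_error(status_code: int, body: str) -> tuple[str, str]:
    """Return (category, detail) for an error response.

    `body` is the raw response text. The documentation says error
    responses are always JSON, so the fallback below is only defensive.
    """
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        payload = {}
    detail = payload.get("error", "") if isinstance(payload, dict) else ""

    if status_code == 401:
        category = "invalid API token"
    elif status_code == 400:
        category = "invalid API request"
    elif status_code == 500:
        category = "server error"
    else:
        category = "other"  # status codes not covered in this section
    return category, detail


category, detail = explain_error(400, '{"error": "format_text must be a Boolean"}')
print(f"{category}: {detail}")
```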
A transcription job can fail because something was wrong with your audio file, or because of an error on our side.
Whenever a transcription job fails, the status of the transcription will go to error, and there will be an error key in the JSON response from the API when fetching the transcription with a GET request.
The error key will describe the error in more detail. For example:
{
// the status is shown as error here
"status": "error",
// the error is described in detail here
"error": "Download error to https://foo.bar, 403 Client Error: Forbidden for url: https://foo.bar",
...
}
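A minimal sketch of checking a fetched transcription for failure, assuming the JSON response has already been parsed into a dictionary. The helper name `failure_reason` is hypothetical; the status and error keys are the ones shown in the example above.

```python
from typing import Optional


def failure_reason(transcript: dict) -> Optional[str]:
    """Return the error description if the transcription failed, else None."""
    if transcript.get("status") == "error":
        # The error key describes the failure in more detail.
        return transcript.get("error", "no error detail provided")
    return None


# The failed transcription from the example above, without the elided fields.
failed = {
    "status": "error",
    "error": "Download error to https://foo.bar, 403 Client Error: Forbidden for url: https://foo.bar",
}
reason = failure_reason(failed)
if reason is not None:
    print(f"Transcription failed: {reason}")
```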
Transcripts usually fail because of one of the following reasons:
When a transcription job fails due to an error on our side (a server error), we always recommend resubmitting the file for transcription. When you resubmit the file, usually a different server in our cluster will be able to process your audio file successfully.
The table below shows which languages are supported by the AssemblyAI API, their language_code values, and the features available for that language.
See the Specifying a Language documentation for more information on using the language_code parameter to specify the language of the file you are submitting for transcription.
Pro tip
If you try to use a feature that is not supported for the language_code included in your POST request, you will receive a 400 status code and a "The following addons are not available in this language: <feature name(s)>" error.
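If you want to react to this error programmatically, one option is to pull the feature names out of the message. The sketch below assumes the names are comma-separated, which the message format above does not confirm. The function name `unsupported_features` and the second feature name in the example are hypothetical.

```python
PREFIX = "The following addons are not available in this language:"


def unsupported_features(error_message: str) -> list[str]:
    """Extract feature names from the unsupported-feature error message.

    Assumption: multiple names are separated by commas. Returns an empty
    list when the message is a different error.
    """
    if not error_message.startswith(PREFIX):
        return []
    names = error_message[len(PREFIX):]
    return [name.strip() for name in names.split(",") if name.strip()]


message = (
    "The following addons are not available in this language: "
    "auto_highlights, another_feature"
)
print(unsupported_features(message))
```

A client could use the returned names to drop those options from the request and resubmit it, or to report the problem to the user.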
The AssemblyAI API can transcribe a large number of audio and video file formats.
If you don't find your audio file format in the list below, please let us know and we can look into adding support for it.
The AssemblyAI API can also transcribe video files, automatically extracting the audio from the video file. If you don't find your video file format in the list below, please let us know and we can look into adding support for it.