Concurrency limit

Every AssemblyAI account has certain limits to ensure smooth and optimal performance for all users.

Concurrency limit refers to the number of transcripts or real-time audio sessions that a user can process at the same time. This is also sometimes referred to as "throttle". You can find it by signing into your account and checking your settings.

There is also a usage limit, which determines the number of hours of audio that a user can transcribe in a given month. This is specific to each account and can also be found in your account settings.

Additionally, there is a rate limit at the API level, which is the number of API calls that a user can make within a particular time frame. This is to ensure that a single user or bad actor doesn't affect the performance of the API for other users.

AssemblyAI concurrency limits

AssemblyAI has two types of accounts: Free and Paid. To upgrade to a Paid account, you need to add a credit card after creating your account.

The default concurrency limits for each account type are listed in the tables below.

Asynchronous limits

Below are the default limits for how many requests you can have processing in parallel when submitting jobs to our /v2/transcript endpoint.


Real-time transcription limits

Below are the default limits for how many real-time transcription sessions you can have open in parallel.


Note that real-time transcription is available only on Paid accounts. In addition to the concurrency limit, there is a rate limit for the API, which restricts users to a maximum of 20,000 requests per five minutes.
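If you want to stay under the rate limit on the client side, one option is to count your own requests before sending them. The sketch below is a minimal sliding-window counter, not part of any AssemblyAI SDK. The class name SlidingWindowLimiter and its parameters are hypothetical, and it assumes the five-minute window can be treated as a rolling window, which the text above does not specify.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Client-side guard: allow at most max_calls in any window_seconds span."""

    def __init__(self, max_calls=20_000, window_seconds=300.0, clock=time.monotonic):
        self.max_calls = max_calls            # 20,000 requests, from the limit above
        self.window_seconds = window_seconds  # five minutes
        self.clock = clock                    # injectable so the limiter can be tested
        self.calls = deque()                  # timestamps of recent calls, oldest first

    def try_acquire(self):
        """Record a call and return True if under the limit, else return False."""
        now = self.clock()
        # Drop timestamps that have aged out of the window.
        while self.calls and now - self.calls[0] >= self.window_seconds:
            self.calls.popleft()
        if len(self.calls) >= self.max_calls:
            return False
        self.calls.append(now)
        return True
```

Call try_acquire() before each API request and delay or skip the request when it returns False.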

Exceeding your concurrency limit

If you are using the /v2/transcript endpoint and exceed your concurrency limit, any additional jobs will be placed in a queue until currently processing jobs complete. While all transcripts will still be processed, the turnaround time may be longer than usual. As soon as a processing job completes, one of the queued jobs will begin processing in its place.
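To get a feel for how queueing affects turnaround time, the following sketch simulates the behavior described above. It is an illustration only: the function name completion_times is hypothetical, it assumes all jobs are submitted at the same moment, and it assumes queued jobs start in submission order, which the text above does not guarantee.

```python
import heapq


def completion_times(durations, limit):
    """Return when each job finishes if at most `limit` jobs process at once.

    Assumes every job is submitted at time 0 and that queued jobs start in
    submission order (an assumption of this sketch, not documented behavior).
    """
    running = []   # min-heap of finish times for jobs currently processing
    finished = []
    for duration in durations:
        if len(running) < limit:
            start = 0.0  # a slot is free, so the job starts immediately
        else:
            # Wait for the earliest running job to complete, then take its slot.
            start = heapq.heappop(running)
        end = start + duration
        heapq.heappush(running, end)
        finished.append(end)
    return finished
```

For example, three 10-minute jobs with a concurrency limit of 2 give completion_times([10, 10, 10], 2) == [10.0, 10.0, 20.0]: the third job waits in the queue until one of the first two completes.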

If you are using our real-time transcription feature and exceed your concurrency limit, you will receive a 402 error and a response that includes a "This account has exceeded the number of allowed streams" message.

If you exceed your concurrency limit, you will receive an email stating that your transcripts have been throttled. This email is sent at most once per day.

Common causes of exceeding your concurrency limits

The most common cause of a notification that you have exceeded your concurrency limit and been throttled is exceeding the number of requests or sessions that you can run in parallel, but there are other potential causes of throttling.

Your account has reached a negative balance

When your account first reaches a negative balance, you will still be able to use the API for a certain period, but your concurrency limit effectively becomes 1. If you unexpectedly receive an email that your account has been throttled, check your account balance, as this could be the cause.

You are not properly closing real-time sessions

When ending a real-time session, you should send a JSON message with a terminate_session key set to true. Failure to do this will result in your real-time session remaining open even after you close the WebSocket connection.

Failing to close your session with the terminate_session message can sometimes cause you to exceed your throttle limit. If you are using real-time and unexpectedly receive an email that your account has been throttled, check that you are properly closing your sessions.
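The shutdown order described above can be sketched as follows. This is a minimal example rather than an official client: the function names are hypothetical, and ws stands for any connection object that exposes send(str) and close() methods, which is an assumption about your WebSocket library.

```python
import json


def terminate_message():
    """Build the JSON text that asks the server to end a real-time session."""
    return json.dumps({"terminate_session": True})


def close_session(ws):
    """Send the terminate message, then close the connection.

    `ws` is any object with send(str) and close() methods; this interface is
    an assumption of the sketch, not a specific library's API.
    """
    try:
        ws.send(terminate_message())
    finally:
        # Close the socket even if sending the terminate message failed.
        ws.close()
```

Sending the message before closing the socket matters because, as noted above, closing the connection alone leaves the session open.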


Concurrency limits can be adjusted based on the needs of each individual customer. If you need a higher limit than your existing concurrency limit, please reach out to us to discuss having your limit increased.