What are my concurrency limits?
A concurrency limit is the number of requests that can be processed for an account at any given time.
Each account has an assigned concurrency limit (also referred to as a throttle limit). For free accounts, the default is 5 for both Asynchronous and Streaming transcriptions. For upgraded accounts, the default is 200 for Asynchronous transcriptions and 100 for Streaming transcriptions.
You can check your concurrency limits on your dashboard.
Asynchronous speech-to-text limits
By default, the number of requests you can have processing in parallel when submitting jobs to our /v2/transcript endpoint is 5 for free accounts and 200 for upgraded accounts.
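One way to stay within a concurrency limit on the client side is to cap the number of jobs in flight with a semaphore. The sketch below is a minimal illustration and makes no real API calls: `fake_transcribe`, `ConcurrencyTracker`, and `run_jobs` are hypothetical names, and the limit of 5 is simply the free-account default mentioned in this article.

```python
import asyncio

CONCURRENCY_LIMIT = 5  # assumption: the free-account default from this article


class ConcurrencyTracker:
    """Records how many jobs are in flight and the highest count seen."""

    def __init__(self) -> None:
        self.in_flight = 0
        self.peak = 0


async def fake_transcribe(job_id: int, tracker: ConcurrencyTracker) -> str:
    """Hypothetical stand-in for submitting a job and waiting for its result."""
    tracker.in_flight += 1
    tracker.peak = max(tracker.peak, tracker.in_flight)
    await asyncio.sleep(0.01)  # simulated processing time
    tracker.in_flight -= 1
    return f"job-{job_id}-done"


async def run_jobs(job_count: int, limit: int, tracker: ConcurrencyTracker) -> list:
    """Run job_count jobs while keeping at most `limit` of them in flight."""
    semaphore = asyncio.Semaphore(limit)

    async def guarded(job_id: int) -> str:
        # A job only starts once it holds one of the `limit` semaphore slots.
        async with semaphore:
            return await fake_transcribe(job_id, tracker)

    return await asyncio.gather(*(guarded(i) for i in range(job_count)))


tracker = ConcurrencyTracker()
results = asyncio.run(run_jobs(20, CONCURRENCY_LIMIT, tracker))
print(len(results), "jobs finished; peak in flight:", tracker.peak)
```

In a real client, the body of `fake_transcribe` would be replaced with the actual submit-and-wait logic, and `CONCURRENCY_LIMIT` would be set to the value shown on your dashboard.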
Streaming speech-to-text limits
AssemblyAI’s Universal-Streaming API features unlimited, automatically scaling concurrency for paid accounts, adjusted dynamically based on usage. We do not limit the total number of concurrent streaming sessions. Instead, the only limit is on the number of new streaming sessions that can be created per minute. This limit starts at a default value for your account (100 new sessions per minute in the example below).
Whenever you use 70% or more of your current limit, the number of new streams you can open over the next minute automatically increases by 10%.
If you max out your new-session rate limit every minute for 5 minutes (opening 100, then 110, then 121, then 133, then 146 new streams), you’d have 610 total concurrent streams, assuming none of them have closed. Over the next 60 seconds, a maximum of 161 new streams could be opened.
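The arithmetic in this example can be reproduced with a short simulation. This is a sketch under stated assumptions, not a description of how the service computes limits: the function names are hypothetical, and rounding each 10% increase to the nearest whole stream is an assumption chosen because it matches the numbers quoted above.

```python
def next_limit(limit: int, opened: int) -> int:
    """Return the new-session limit for the following minute.

    Assumption: the 10% increase is rounded to the nearest whole stream,
    which reproduces the 100, 110, 121, 133, 146, 161 sequence in the text.
    """
    # Integer form of "opened >= 70% of limit", avoiding float comparisons.
    if opened * 10 >= limit * 7:
        return (limit * 11 + 5) // 10  # limit * 1.1, rounded half up
    return limit


def simulate(start_limit: int, minutes: int) -> tuple[list[int], int]:
    """Open the maximum number of new streams every minute.

    Returns the per-minute counts and the limit for the minute after.
    """
    limit = start_limit
    opened_per_minute = []
    for _ in range(minutes):
        opened_per_minute.append(limit)  # max out this minute's limit
        limit = next_limit(limit, limit)
    return opened_per_minute, limit


opened, upcoming = simulate(start_limit=100, minutes=5)
print(opened)       # [100, 110, 121, 133, 146]
print(sum(opened))  # 610, if no stream opened so far has closed
print(upcoming)     # 161
```

Calling `next_limit(100, 69)` returns 100, since 69 new streams is below the 70% threshold and the limit stays unchanged for the next minute.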
You can find more information on concurrency limits in our documentation.
Need a higher concurrency?
We offer custom concurrency limits that scale to support any workload at no additional cost. If you need a higher concurrency limit, please contact our Sales team or reach out to us at support@assemblyai.com.
Rate Limits
In addition to the concurrency limit, there’s a rate limit for the API, which restricts users to a maximum of 20,000 requests per five minutes.
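A client that sends many requests can guard against exceeding this rate limit with a sliding-window counter. The sketch below is illustrative only: `SlidingWindowLimiter` and `try_acquire` are hypothetical names, and how the API itself measures the five-minute window is not stated here, so the sliding window is an assumption on the cautious side. The demo uses a small limit and a fake clock so that it runs instantly.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allows at most max_requests within any window_seconds span."""

    def __init__(self, max_requests=20_000, window_seconds=300.0,
                 clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock   # injectable so the demo can control time
        self.sent = deque()  # timestamps of requests still inside the window

    def try_acquire(self) -> bool:
        """Return True and record the request if it fits in the window."""
        now = self.clock()
        # Drop timestamps that have aged out of the window.
        while self.sent and now - self.sent[0] >= self.window_seconds:
            self.sent.popleft()
        if len(self.sent) < self.max_requests:
            self.sent.append(now)
            return True
        return False


# Demo with a tiny limit and a fake clock instead of real waiting.
fake_now = [0.0]
limiter = SlidingWindowLimiter(max_requests=3, window_seconds=300.0,
                               clock=lambda: fake_now[0])
first_burst = [limiter.try_acquire() for _ in range(4)]
fake_now[0] = 300.0  # five minutes later, the earlier requests have expired
after_window = limiter.try_acquire()
print(first_burst, after_window)  # [True, True, True, False] True
```

With the default arguments, the limiter uses the figures from this article (20,000 requests per 300 seconds); a caller would check `try_acquire()` before each request and wait or retry later when it returns False.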