Overview
This guide walks you through transcribing your first audio file with AssemblyAI. You will learn how to submit an audio file for transcription and retrieve the results using the AssemblyAI API. When transcribing an audio file, there are three main things you will want to specify:
- The speech models you would like to use (optional; a default applies if omitted).
- The region you would like to use (optional).
- Other models you would like to use like Speaker Diarization or PII Redaction (optional).
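As a rough sketch of where each of these settings ends up: the speech models and any additional models travel as fields in the JSON request body, while the region is selected by the base URL. The helper name `build_request` is hypothetical. The default base URL `https://api.assemblyai.com`, the `audio_url` field, and the `speaker_labels` flag for Speaker Diarization are assumptions not stated in this overview.

```python
# Hypothetical helper: shows where each setting lives in a request.
US_BASE_URL = "https://api.assemblyai.com"     # assumed default region
EU_BASE_URL = "https://api.eu.assemblyai.com"  # EU data residency


def build_request(audio_url, speech_models=None, region="us", **features):
    """Return (endpoint, json_body) for a transcription request."""
    base_url = EU_BASE_URL if region == "eu" else US_BASE_URL
    body = {"audio_url": audio_url}
    if speech_models is not None:
        # Leave the field out entirely to let the API apply its default.
        body["speech_models"] = list(speech_models)
    body.update(features)  # e.g. speaker_labels=True (assumed flag name)
    return base_url + "/v2/transcript", body


endpoint, body = build_request(
    "https://example.com/audio.mp3",  # placeholder URL
    speech_models=["universal-3-pro", "universal-2"],
    region="eu",
    speaker_labels=True,
)
print(endpoint)
print(body)
```

Nothing is sent over the network here; the function only assembles the endpoint and body so the three choices are visible side by side.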
speech_models is optional
The speech_models parameter is optional. If you omit it, the request defaults to ["universal-3-pro", "universal-2"]. See Model selection to learn about available models.
Prerequisites
Before you begin, make sure you have:
- An AssemblyAI API key (get one by signing up at assemblyai.com)
- Python 3.6 or later installed
- The requests library (pip install requests)
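If you want to confirm the Python-side prerequisites before continuing, a small check along these lines may help. The function name `check_prerequisites` is hypothetical, and it cannot verify that you have an API key.

```python
import importlib.util
import sys


def check_prerequisites():
    """Return a list of human-readable problems; an empty list means ready."""
    problems = []
    if sys.version_info < (3, 6):
        problems.append("Python 3.6 or later is required")
    if importlib.util.find_spec("requests") is None:
        problems.append("requests is missing (pip install requests)")
    return problems


for problem in check_prerequisites():
    print("Missing prerequisite:", problem)
```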
Step 1: Set up your API credentials
First, configure your API endpoint and authentication:
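A minimal sketch of this setup, assuming the default endpoint is `https://api.assemblyai.com` and that the key is sent in an `authorization` header (both are assumptions, since the original snippet is not reproduced here):

```python
# Base URL of the API; assumed here to be the default endpoint.
base_url = "https://api.assemblyai.com"

# The API key is assumed to be sent in the "authorization" header
# of every request.
headers = {"authorization": "YOUR_API_KEY"}
```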
Replace YOUR_API_KEY with your actual AssemblyAI API key.
Need EU data residency?
Use our EU endpoint by changing base_url to "https://api.eu.assemblyai.com".
Step 2: Specify your audio source
You can transcribe audio files in two ways:
Option A: Use a publicly accessible URL
Option B: Upload a local file
If your audio file is stored locally, upload it to AssemblyAI first:
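A minimal sketch covering both options. It assumes the upload endpoint is `POST /v2/upload`, that it accepts the raw file bytes as the request body, and that the response carries the resulting address in an `upload_url` field. The helper names `upload_file` and `resolve_audio_url` are hypothetical.

```python
import requests

base_url = "https://api.assemblyai.com"      # assumed default endpoint
headers = {"authorization": "YOUR_API_KEY"}  # replace with your key


def upload_file(path):
    """Option B: upload a local file and return the URL assigned to it."""
    with open(path, "rb") as f:
        # Stream the file as the request body (assumed upload format).
        response = requests.post(base_url + "/v2/upload", headers=headers, data=f)
    response.raise_for_status()
    return response.json()["upload_url"]


def resolve_audio_url(source):
    """Option A: pass URLs through unchanged. Option B: upload first."""
    if source.startswith(("http://", "https://")):
        return source
    return upload_file(source)
```

For example, `resolve_audio_url("https://example.com/audio.mp3")` returns the placeholder URL unchanged, while `resolve_audio_url("./my-audio.mp3")` uploads the file and returns the URL from the response.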
Step 3: Submit the transcription request
Create a request with your audio URL and desired configuration options:
- Uses both the universal-3-pro and universal-2 models for broad language coverage. See Model selection to learn more about the different speech recognition models.
- Uses our Automatic Language Detection model to detect the dominant language in the spoken audio.
- Uses our Speaker Diarization model to create turn-by-turn utterances.
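A request matching the configuration described above might look like the following sketch. The field names `audio_url`, `language_detection`, and `speaker_labels` are assumptions about the request schema, and `submit_transcription` is a hypothetical helper name.

```python
import requests

base_url = "https://api.assemblyai.com"      # assumed default endpoint
headers = {"authorization": "YOUR_API_KEY"}  # replace with your key


def submit_transcription(audio_url):
    """Submit the audio for transcription and return the transcript ID."""
    payload = {
        "audio_url": audio_url,
        # Both speech models, for broad language coverage.
        "speech_models": ["universal-3-pro", "universal-2"],
        # Automatic Language Detection (assumed flag name).
        "language_detection": True,
        # Speaker Diarization (assumed flag name).
        "speaker_labels": True,
    }
    response = requests.post(
        base_url + "/v2/transcript", headers=headers, json=payload
    )
    response.raise_for_status()
    return response.json()["id"]
```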
Log the transcript ID for every request
The id field returned from POST /v2/transcript is the transcript ID. Persist it (along with a timestamp and the API region) for every transcription request, not just when you hit an error. The transcript ID is required to fetch results, retry, or delete the transcript later, and it's the first thing support@assemblyai.com will ask for when troubleshooting a specific request. See Troubleshoot common errors for the full debugging flow.
Step 4: Poll for the transcription result
Transcription happens asynchronously. Poll the API until the transcription is complete:
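One way to write the polling loop is sketched below. It assumes the transcript can be fetched with `GET /v2/transcript/{id}`, that the response has a `status` field which ends up as `"completed"` or `"error"`, and that failures include an `error` message. The 3-second interval and the name `wait_for_transcript` are illustrative choices.

```python
import time

import requests

base_url = "https://api.assemblyai.com"      # assumed default endpoint
headers = {"authorization": "YOUR_API_KEY"}  # replace with your key


def wait_for_transcript(transcript_id, interval=3.0):
    """Poll until the transcript is completed; raise if it fails."""
    url = base_url + "/v2/transcript/" + transcript_id
    while True:
        response = requests.get(url, headers=headers)
        response.raise_for_status()
        transcript = response.json()
        status = transcript["status"]
        if status == "completed":
            return transcript
        if status == "error":
            raise RuntimeError(
                "Transcription failed: {}".format(transcript.get("error"))
            )
        # Any other status means the job is still in progress.
        time.sleep(interval)
```

The loop has no overall timeout, so a long-running job blocks the caller until it finishes; add a deadline if that matters for your application.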
Step 5: Access speaker diarization (optional)
If you enabled speaker labels, you can access the speaker-separated utterances:
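As an illustration, the sketch below assumes the completed transcript contains an `utterances` list whose items have `speaker` and `text` fields. The `sample` dictionary is hand-written to show that assumed shape and is not real API output; `format_utterances` is a hypothetical helper.

```python
def format_utterances(transcript):
    """Return one 'Speaker X: text' line per utterance."""
    # The field may be missing or null when speaker labels were not enabled.
    utterances = transcript.get("utterances") or []
    return ["Speaker {}: {}".format(u["speaker"], u["text"]) for u in utterances]


# Hand-written sample in the assumed response shape.
sample = {
    "utterances": [
        {"speaker": "A", "text": "Hello there."},
        {"speaker": "B", "text": "Hi, how are you?"},
    ]
}

for line in format_utterances(sample):
    print(line)
```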
Complete example
Here is the full working code:
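The original listing is not reproduced here, so the following is a reconstruction under assumptions and not the official sample. It assumes the default endpoint `https://api.assemblyai.com`, an `authorization` header, the `/v2/upload` endpoint with an `upload_url` response field, and the `audio_url`, `language_detection`, `speaker_labels`, `status`, `error`, `text`, and `utterances` field names. All function names are hypothetical.

```python
import logging
import time

import requests

logging.basicConfig(level=logging.INFO, format="%(asctime)s %(message)s")

# Step 1: credentials. For EU data residency, use https://api.eu.assemblyai.com.
base_url = "https://api.assemblyai.com"      # assumed default endpoint
headers = {"authorization": "YOUR_API_KEY"}  # replace with your key


def upload_file(path):
    """Step 2, option B: upload a local file and return its URL."""
    with open(path, "rb") as f:
        response = requests.post(base_url + "/v2/upload", headers=headers, data=f)
    response.raise_for_status()
    return response.json()["upload_url"]


def submit_transcription(audio_url):
    """Step 3: submit the request and return the transcript ID."""
    payload = {
        "audio_url": audio_url,
        "speech_models": ["universal-3-pro", "universal-2"],
        "language_detection": True,
        "speaker_labels": True,
    }
    response = requests.post(
        base_url + "/v2/transcript", headers=headers, json=payload
    )
    response.raise_for_status()
    return response.json()["id"]


def wait_for_transcript(transcript_id, interval=3.0):
    """Step 4: poll until the transcript completes or fails."""
    url = base_url + "/v2/transcript/" + transcript_id
    while True:
        response = requests.get(url, headers=headers)
        response.raise_for_status()
        transcript = response.json()
        if transcript["status"] == "completed":
            return transcript
        if transcript["status"] == "error":
            raise RuntimeError(
                "Transcription failed: {}".format(transcript.get("error"))
            )
        time.sleep(interval)


def main(source):
    """Run the whole flow for a URL or a local file path."""
    is_url = source.startswith(("http://", "https://"))
    audio_url = source if is_url else upload_file(source)

    transcript_id = submit_transcription(audio_url)
    # Keep the ID, a timestamp (added by the log format), and the region.
    logging.info("transcript_id=%s endpoint=%s", transcript_id, base_url)

    transcript = wait_for_transcript(transcript_id)
    print(transcript["text"])

    # Step 5: speaker-separated utterances, if speaker labels were enabled.
    for u in transcript.get("utterances") or []:
        print("Speaker {}: {}".format(u["speaker"], u["text"]))
    return transcript
```

Nothing runs on import. To execute the flow, call `main("https://example.com/audio.mp3")` with your own audio URL or a local file path, for instance from an `if __name__ == "__main__":` guard.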
Next steps
Now that you have transcribed your first audio file:
- Learn how to get even more out of Universal-3 Pro with prompting
- Explore our Speech Understanding features for more ways to analyze your audio data
- Learn more about searching, summarizing, or asking questions on your transcript with our LLM Gateway feature
- Find out how to use webhooks to get notified when your transcripts are ready