Overview
This guide walks you through transcribing your first audio file with AssemblyAI. You will learn how to submit an audio file for transcription and retrieve the results using the AssemblyAI API.
Building a medical scribe or clinical documentation app? Check out the Medical Scribe guides for post-visit and real-time transcription workflows with Medical Mode, HIPAA-compliant configuration, and SOAP note generation.
When transcribing an audio file, there are three main things you will want to specify:
The speech models you would like to use (required).
The region you would like to use (optional).
Any other models you would like to use, such as Speaker Diarization or PII Redaction (optional).
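As a rough illustration of the optional region choice, the sketch below maps a region label to one of the two endpoints that appear later in this guide and reads the API key from the environment. The helper name make_client_settings, the region labels "us" and "eu", and the environment variable name ASSEMBLYAI_API_KEY are assumptions made for this example, not part of the AssemblyAI API.

```python
import os

# Endpoints as given in this guide; the keys "us" and "eu" are hypothetical labels.
BASE_URLS = {
    "us": "https://api.assemblyai.com",
    "eu": "https://api.eu.assemblyai.com",
}


def make_client_settings(region="us", env_var="ASSEMBLYAI_API_KEY"):
    """Return (base_url, headers) for the chosen region (hypothetical helper)."""
    if region not in BASE_URLS:
        raise ValueError(f"Unknown region: {region!r}")
    # Reading the key from the environment keeps it out of source control.
    api_key = os.environ.get(env_var)
    if not api_key:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return BASE_URLS[region], {"authorization": api_key}


# Example usage (requires the environment variable to be set):
# base_url, headers = make_client_settings("eu")
```

The returned base_url and headers have the same shape as the variables of the same name in the Python samples in this guide, so they could be used as drop-in replacements.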
speech_models is required: You must include the speech_models parameter in every transcription request. There is no default model for pre-recorded transcription, so a request that omits speech_models will fail. See Model selection to learn about available models.
Recommended model: We recommend Universal-3 Pro for pre-recorded audio transcription. It delivers the highest accuracy and fastest transcription out of the box, with optional prompting for when you need more control. For the broadest language coverage (99 languages), set speech_models to ["universal-3-pro", "universal-2"] so that requests automatically fall back to Universal-2 for languages Universal-3 Pro does not support.
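Because a request without speech_models fails, it can be convenient to catch the omission locally before any network call is made. The following is a minimal sketch of that idea; build_transcript_request is a hypothetical helper written for this example, and it only assembles the JSON body that the Python sample later sends to /v2/transcript.

```python
from typing import Any, Dict, List


def build_transcript_request(
    audio_url: str, speech_models: List[str], **options: Any
) -> Dict[str, Any]:
    """Build the JSON body for a transcription request (hypothetical helper)."""
    if not speech_models:
        # The guide states there is no default model, so reject this early.
        raise ValueError("speech_models is required and must not be empty")
    payload: Dict[str, Any] = {
        "audio_url": audio_url,
        "speech_models": list(speech_models),
    }
    # Optional settings such as language_detection or speaker_labels.
    payload.update(options)
    return payload


payload = build_transcript_request(
    "https://assembly.ai/wildfires.mp3",
    ["universal-3-pro", "universal-2"],  # Universal-3 Pro with Universal-2 fallback
    speaker_labels=True,
)
print(payload)
```

The resulting dictionary can be passed as the json argument of requests.post in the same way as the data dictionary in Step 3.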
Prerequisites
Before you begin, make sure you have:
Python
An AssemblyAI API key (get one by signing up at assemblyai.com)
Python 3.6 or later installed
The requests library (pip install requests)

Python SDK
An AssemblyAI API key (get one by signing up at assemblyai.com)
Python 3.8 or later installed
The assemblyai package (pip install assemblyai)

JavaScript
An AssemblyAI API key (get one by signing up at assemblyai.com)
Node.js 18 or later installed
The fs-extra package (npm install fs-extra)

JavaScript SDK
An AssemblyAI API key (get one by signing up at assemblyai.com)
Node.js 18 or later installed
The assemblyai package (npm install assemblyai)
Step 1: Set up your API credentials
First, configure your API endpoint and authentication:
Python

import requests
import time

base_url = "https://api.assemblyai.com"
headers = {"authorization": "YOUR_API_KEY"}

Replace YOUR_API_KEY with your actual AssemblyAI API key. Need EU data residency? Use our EU endpoint by changing base_url to "https://api.eu.assemblyai.com".

Python SDK

import assemblyai as aai

aai.settings.base_url = "https://api.assemblyai.com"
aai.settings.api_key = "YOUR_API_KEY"

Replace YOUR_API_KEY with your actual AssemblyAI API key. Need EU data residency? Use our EU endpoint by changing base_url to "https://api.eu.assemblyai.com".

JavaScript

import fs from "fs-extra";

const baseUrl = "https://api.assemblyai.com";
const headers = {
  authorization: "YOUR_API_KEY",
};

Replace YOUR_API_KEY with your actual AssemblyAI API key. Need EU data residency? Use our EU endpoint by changing baseUrl to "https://api.eu.assemblyai.com".

JavaScript SDK

import { AssemblyAI } from "assemblyai";

const baseUrl = "https://api.assemblyai.com";
const client = new AssemblyAI({
  apiKey: "YOUR_API_KEY",
  baseUrl: baseUrl,
});

Replace YOUR_API_KEY with your actual AssemblyAI API key. Need EU data residency? Use our EU endpoint by changing baseUrl to "https://api.eu.assemblyai.com".
Step 2: Specify your audio source
You can transcribe audio files in two ways:
Python

Option A: Use a publicly accessible URL

audio_file = "https://assembly.ai/wildfires.mp3"

Option B: Upload a local file

If your audio file is stored locally, upload it to AssemblyAI first:

with open("./example.mp3", "rb") as f:
    response = requests.post(base_url + "/v2/upload", headers=headers, data=f)

if response.status_code != 200:
    print(f"Error: {response.status_code}, Response: {response.text}")
    response.raise_for_status()

upload_json = response.json()
audio_file = upload_json["upload_url"]

Python SDK

Option A: Use a publicly accessible URL

audio_file = "https://assembly.ai/wildfires.mp3"

Option B: Use a local file

audio_file = "./example.mp3"

The SDK handles local file uploads automatically.

JavaScript

Option A: Use a publicly accessible URL

const audioFile = "https://assembly.ai/wildfires.mp3";

Option B: Upload a local file

If your audio file is stored locally, upload it to AssemblyAI first:

const audioData = await fs.readFile("./example.mp3");

// Named uploadRes so it does not clash with the res variable declared in Step 3.
const uploadRes = await fetch(`${baseUrl}/v2/upload`, {
  method: "POST",
  headers,
  body: audioData,
});

if (!uploadRes.ok) throw new Error(`Error: ${uploadRes.status}`);

const uploadResponse = await uploadRes.json();
const audioFile = uploadResponse.upload_url;

JavaScript SDK

Option A: Use a publicly accessible URL

const audioFile = "https://assembly.ai/wildfires.mp3";

Option B: Use a local file

const audioFile = "./example.mp3";

The SDK handles local file uploads automatically.
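When the audio source comes from user input, code that calls the REST API directly needs to decide whether to use the value as is (Option A) or upload it first (Option B). A minimal sketch of that check, using only the standard library, might look like this; is_remote_audio is a hypothetical helper name, and treating only http and https as remote is an assumption of this example.

```python
from urllib.parse import urlparse


def is_remote_audio(source: str) -> bool:
    """Return True if source looks like an http(s) URL rather than a local path."""
    scheme = urlparse(source).scheme.lower()
    # Anything without an http/https scheme is treated as a local file path.
    return scheme in ("http", "https")


for candidate in ("https://assembly.ai/wildfires.mp3", "./example.mp3"):
    action = "use URL directly" if is_remote_audio(candidate) else "upload first"
    print(f"{candidate}: {action}")
```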
Step 3: Submit the transcription request
Create a request with your audio URL and desired configuration options:
Python

data = {
    "audio_url": audio_file,
    "speech_models": ["universal-3-pro", "universal-2"],
    "language_detection": True,
    "speaker_labels": True,
}

response = requests.post(base_url + "/v2/transcript", headers=headers, json=data)

if response.status_code != 200:
    print(f"Error: {response.status_code}, Response: {response.text}")
    response.raise_for_status()

transcript_json = response.json()
transcript_id = transcript_json["id"]

Python SDK

config = aai.TranscriptionConfig(
    speech_models=["universal-3-pro", "universal-2"],
    language_detection=True,
    speaker_labels=True,
)

transcript = aai.Transcriber().transcribe(audio_file, config=config)

JavaScript

const data = {
  audio_url: audioFile,
  speech_models: ["universal-3-pro", "universal-2"],
  language_detection: true,
  speaker_labels: true,
};

let res = await fetch(`${baseUrl}/v2/transcript`, {
  method: "POST",
  headers: { ...headers, "Content-Type": "application/json" },
  body: JSON.stringify(data),
});

if (!res.ok) throw new Error(`Error: ${res.status}`);

const transcriptResponse = await res.json();
const transcriptId = transcriptResponse.id;

JavaScript SDK

const params = {
  audio: audioFile,
  speech_models: ["universal-3-pro", "universal-2"],
  language_detection: true,
  speaker_labels: true,
};

const transcript = await client.transcripts.transcribe(params);
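This configuration requests Universal-3 Pro with Universal-2 as the fallback model, turns on automatic language detection, and enables speaker labels.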
Log the transcript ID for every request
The id field returned from POST /v2/transcript is the transcript ID. Persist it (along with a timestamp and the API region) for every transcription request, not just when you hit an error. The transcript ID is required to fetch results, retry, or delete the transcript later — and it’s the first thing support@assemblyai.com will ask for when troubleshooting a specific request. See Troubleshoot common errors for the full debugging flow.
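One simple way to persist the transcript ID together with a timestamp and the API region is to append a record to a local JSON Lines file. The sketch below assumes that approach; log_transcript_id, the file layout, and the field names are all hypothetical choices for this example, and a database or structured logger would work equally well.

```python
import json
from datetime import datetime, timezone


def log_transcript_id(log_path, transcript_id, region):
    """Append one JSON record per transcription request (hypothetical helper)."""
    record = {
        "transcript_id": transcript_id,
        "region": region,
        # UTC timestamp in ISO 8601 format, recorded when the ID is logged.
        "logged_at": datetime.now(timezone.utc).isoformat(),
    }
    with open(log_path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")
    return record


# Example usage, right after submitting the request in Step 3:
# log_transcript_id("transcripts.jsonl", transcript_id, "us")
```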
Model Pricing
Pricing can vary based on the speech model used in the request. If you already have an account with us, you can find your specific pricing on the Billing page of your dashboard. If you are a new customer, you can find general pricing information here.
Step 4: Poll for the transcription result
Transcription happens asynchronously. Poll the API until the transcription is complete:
Python

polling_endpoint = f"{base_url}/v2/transcript/{transcript_id}"

while True:
    transcript = requests.get(polling_endpoint, headers=headers).json()

    if transcript["status"] == "completed":
        print(f"\nFull Transcript:\n\n{transcript['text']}")
        break
    elif transcript["status"] == "error":
        raise RuntimeError(f"Transcription failed: {transcript['error']}")
    else:
        time.sleep(3)

The polling loop checks the transcription status every 3 seconds and prints the full transcript once processing is complete.

Python SDK

The SDK handles polling automatically. Check the result:

if transcript.status == aai.TranscriptStatus.error:
    raise RuntimeError(f"Transcription failed: {transcript.error}")

print(f"\nFull Transcript:\n\n{transcript.text}")

JavaScript

const pollingEndpoint = `${baseUrl}/v2/transcript/${transcriptId}`;
let transcript;

while (true) {
  let res = await fetch(pollingEndpoint, { headers });
  if (!res.ok) throw new Error(`Error: ${res.status}`);
  transcript = await res.json();

  if (transcript.status === "completed") {
    console.log(`\nFull Transcript:\n\n${transcript.text}`);
    break;
  } else if (transcript.status === "error") {
    throw new Error(`Transcription failed: ${transcript.error}`);
  } else {
    await new Promise((resolve) => setTimeout(resolve, 3000));
  }
}

The polling loop checks the transcription status every 3 seconds and prints the full transcript once processing is complete.

JavaScript SDK

The SDK handles polling automatically. Check the result:

if (transcript.status === "error") {
  throw new Error(`Transcription failed: ${transcript.error}`);
}

console.log(`\nFull Transcript:\n\n${transcript.text}`);
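The hand-written polling loops in this step run until the status becomes "completed" or "error", with no upper bound on how long they wait. A possible variation, sketched below, adds an overall timeout and takes the status lookup as a function so the loop can be exercised without network access. The name poll_transcript, its parameters, and the 600-second default are assumptions for this example; any status other than "completed" or "error" is treated as still in progress.

```python
import time
from typing import Any, Callable, Dict


def poll_transcript(
    fetch: Callable[[], Dict[str, Any]],
    interval: float = 3.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Call fetch() until the transcript completes, fails, or the timeout passes."""
    deadline = time.monotonic() + timeout
    while True:
        transcript = fetch()
        status = transcript["status"]
        if status == "completed":
            return transcript
        if status == "error":
            raise RuntimeError(f"Transcription failed: {transcript['error']}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Transcript not ready after {timeout} seconds")
        sleep(interval)


# Demo with canned responses instead of real HTTP calls.
_fake_responses = iter([
    {"status": "processing"},
    {"status": "completed", "text": "hello"},
])
result = poll_transcript(lambda: next(_fake_responses), interval=0.0)
print(result["text"])  # prints: hello
```

With the REST API, fetch could be a lambda wrapping the requests.get call on the polling endpoint shown in the Python sample above.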
Step 5: Access speaker diarization (optional)
If you enabled speaker labels, you can access the speaker-separated utterances:
Python

for utterance in transcript["utterances"]:
    print(f"Speaker {utterance['speaker']}: {utterance['text']}")

Python SDK

for utterance in transcript.utterances:
    print(f"Speaker {utterance.speaker}: {utterance.text}")

JavaScript

for (const utterance of transcript.utterances) {
  console.log(`Speaker ${utterance.speaker}: ${utterance.text}`);
}

JavaScript SDK

for (const utterance of transcript.utterances) {
  console.log(`Speaker ${utterance.speaker}: ${utterance.text}`);
}
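Beyond printing each utterance in order, it is sometimes useful to collect everything a given speaker said. The sketch below groups utterances by their speaker field, using the same "speaker" and "text" keys as the Python sample above. The helper name text_by_speaker and the sample data are made up for illustration, and treating a missing or empty utterances value as "no utterances" is an assumption of this example.

```python
from collections import defaultdict


def text_by_speaker(utterances):
    """Group utterance texts by speaker label, preserving spoken order."""
    grouped = defaultdict(list)
    # Tolerate None or an empty list, e.g. when speaker labels were not enabled.
    for utterance in utterances or []:
        grouped[utterance["speaker"]].append(utterance["text"])
    return dict(grouped)


# Made-up sample in the shape of transcript["utterances"].
sample_utterances = [
    {"speaker": "A", "text": "Hi."},
    {"speaker": "B", "text": "Hello."},
    {"speaker": "A", "text": "How are you?"},
]

for speaker, texts in text_by_speaker(sample_utterances).items():
    print(f"Speaker {speaker}: {' '.join(texts)}")
```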
Complete example
Here is the full working code:
Python

import requests
import time

base_url = "https://api.assemblyai.com"
headers = {"authorization": "YOUR_API_KEY"}

# Use a publicly-accessible URL
audio_file = "https://assembly.ai/wildfires.mp3"

# Or upload a local file:
# with open("./example.mp3", "rb") as f:
#     response = requests.post(base_url + "/v2/upload", headers=headers, data=f)
# if response.status_code != 200:
#     print(f"Error: {response.status_code}, Response: {response.text}")
#     response.raise_for_status()
# upload_json = response.json()
# audio_file = upload_json["upload_url"]

data = {
    "audio_url": audio_file,
    "speech_models": ["universal-3-pro", "universal-2"],
    "language_detection": True,
    "speaker_labels": True,
}

response = requests.post(base_url + "/v2/transcript", headers=headers, json=data)

if response.status_code != 200:
    print(f"Error: {response.status_code}, Response: {response.text}")
    response.raise_for_status()

transcript_json = response.json()
transcript_id = transcript_json["id"]
polling_endpoint = f"{base_url}/v2/transcript/{transcript_id}"

while True:
    transcript = requests.get(polling_endpoint, headers=headers).json()

    if transcript["status"] == "completed":
        print(f"\nFull Transcript:\n\n{transcript['text']}")

        # Optionally print speaker diarization results
        # for utterance in transcript['utterances']:
        #     print(f"Speaker {utterance['speaker']}: {utterance['text']}")
        break
    elif transcript["status"] == "error":
        raise RuntimeError(f"Transcription failed: {transcript['error']}")
    else:
        time.sleep(3)

Python SDK

import assemblyai as aai

aai.settings.base_url = "https://api.assemblyai.com"
aai.settings.api_key = "YOUR_API_KEY"

# Use a publicly-accessible URL
audio_file = "https://assembly.ai/wildfires.mp3"

# Or use a local file:
# audio_file = "./example.mp3"

config = aai.TranscriptionConfig(
    speech_models=["universal-3-pro", "universal-2"],
    language_detection=True,
    speaker_labels=True,
)

transcript = aai.Transcriber().transcribe(audio_file, config=config)

if transcript.status == aai.TranscriptStatus.error:
    raise RuntimeError(f"Transcription failed: {transcript.error}")

print(f"\nFull Transcript:\n\n{transcript.text}")

# Optionally print speaker diarization results
# for utterance in transcript.utterances:
#     print(f"Speaker {utterance.speaker}: {utterance.text}")

JavaScript

import fs from "fs-extra";

const baseUrl = "https://api.assemblyai.com";
const headers = {
  authorization: "YOUR_API_KEY",
};

async function transcribe() {
  // Use a publicly-accessible URL
  const audioFile = "https://assembly.ai/wildfires.mp3";

  // Or upload a local file:
  // const audioData = await fs.readFile("./example.mp3");
  // const uploadRes = await fetch(`${baseUrl}/v2/upload`, {
  //   method: "POST",
  //   headers,
  //   body: audioData,
  // });
  // if (!uploadRes.ok) throw new Error(`Error: ${uploadRes.status}`);
  // const uploadResponse = await uploadRes.json();
  // const audioFile = uploadResponse.upload_url;

  const data = {
    audio_url: audioFile,
    speech_models: ["universal-3-pro", "universal-2"],
    language_detection: true,
    speaker_labels: true,
  };

  let res = await fetch(`${baseUrl}/v2/transcript`, {
    method: "POST",
    headers: { ...headers, "Content-Type": "application/json" },
    body: JSON.stringify(data),
  });

  if (!res.ok) throw new Error(`Error: ${res.status}`);

  const transcriptResponse = await res.json();
  const transcriptId = transcriptResponse.id;
  const pollingEndpoint = `${baseUrl}/v2/transcript/${transcriptId}`;

  while (true) {
    res = await fetch(pollingEndpoint, { headers });
    if (!res.ok) throw new Error(`Error: ${res.status}`);
    const transcript = await res.json();

    if (transcript.status === "completed") {
      console.log(`\nFull Transcript:\n\n${transcript.text}`);

      // Optionally print speaker diarization results
      // for (const utterance of transcript.utterances) {
      //   console.log(`Speaker ${utterance.speaker}: ${utterance.text}`);
      // }
      break;
    } else if (transcript.status === "error") {
      throw new Error(`Transcription failed: ${transcript.error}`);
    } else {
      await new Promise((resolve) => setTimeout(resolve, 3000));
    }
  }
}

transcribe();

JavaScript SDK

import { AssemblyAI } from "assemblyai";

const baseUrl = "https://api.assemblyai.com";
const client = new AssemblyAI({
  apiKey: "YOUR_API_KEY",
  baseUrl: baseUrl,
});

// Use a publicly-accessible URL
const audioFile = "https://assembly.ai/wildfires.mp3";

// Or use a local file:
// const audioFile = "./example.mp3";

const params = {
  audio: audioFile,
  speech_models: ["universal-3-pro", "universal-2"],
  language_detection: true,
  speaker_labels: true,
};

const run = async () => {
  const transcript = await client.transcripts.transcribe(params);

  if (transcript.status === "error") {
    throw new Error(`Transcription failed: ${transcript.error}`);
  }

  console.log(`\nFull Transcript:\n\n${transcript.text}`);

  // Optionally print speaker diarization results
  // for (const utterance of transcript.utterances) {
  //   console.log(`Speaker ${utterance.speaker}: ${utterance.text}`);
  // }
};

run();
Next steps
Now that you have transcribed your first audio file, check out the full API reference documentation for more information.