Overview
By the end of this guide, you’ll have a working script that transcribes an audio file in a single SDK call. Build it with an AI coding agent, or write it yourself; both are below. Prefer to try it first? Transcribe audio without writing any code in the AssemblyAI Playground.
Before you begin
You’ll need:
- An API key, which you can grab from your dashboard. Every example below reads it from the ASSEMBLYAI_API_KEY environment variable, so set it once.
- Python 3.8+ or Node.js 18+, depending on which SDK you use.
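Since every example depends on the key being present in the environment, it can help to fail early with a clear message when it is missing. The following is a minimal sketch using only the standard library; the helper name load_api_key is hypothetical and not part of any SDK.

```python
import os


def load_api_key(env=None):
    """Return the API key from ASSEMBLYAI_API_KEY, or raise if it is unset.

    `env` defaults to the real process environment; passing a plain dict
    makes the helper easy to exercise without touching os.environ.
    """
    env = os.environ if env is None else env
    key = env.get("ASSEMBLYAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "ASSEMBLYAI_API_KEY is not set. Export it in your shell before running."
        )
    return key
```

Calling load_api_key() at the top of a script surfaces a missing key before any request is made.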
Transcribe your first file
Prefer to write it yourself? Follow these steps to transcribe our hosted sample file. The SDK uploads, submits, and polls for you in a single call.
Step 1: Install the SDK
- Python SDK
- JavaScript SDK
Step 2: Run your first transcription
Save this as transcribe.py (Python) or transcribe.js (JavaScript):
- Python SDK
- JavaScript SDK
Run it with python transcribe.py or node transcribe.js. The transcript is printed to your terminal.
Customize your request
The call above works with no extra configuration. Add capabilities by setting options on the same request; combine as many as you need (the complete example sets several at once).
Transcribe a local file
Pass a file path instead of a URL; the SDK uploads it for you.
- Python SDK
- JavaScript SDK
Identify speakers
Enable Speaker Diarization to split the transcript by speaker. Each labeled segment (an utterance) has a speaker ID and its text.
- Python SDK
- JavaScript SDK
Detect the language automatically
Use Automatic Language Detection to detect the dominant spoken language.
- Python SDK
- JavaScript SDK
Complete example
Here’s the complete, runnable script, combining the call above with options and error handling:
- Python SDK
- JavaScript SDK
What you get back
A completed transcript includes the full text plus metadata, and per-speaker utterances when you enable speaker_labels. The SDK exposes these as attributes (transcript.text, transcript.utterances[0].speaker); the raw API returns the same fields as JSON.
start and end are in milliseconds. Persist id to fetch, retry, or delete the transcript later. See the transcript API reference for the complete field list.
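To make the millisecond timestamps concrete, here is a small sketch that turns the utterances of a raw JSON response into readable lines. The sample dictionary is made up for illustration (its id, text, and timings are not real API output), and the helper names are hypothetical; only the field names text, utterances, speaker, start, end, and id come from this guide.

```python
def format_timestamp(ms):
    """Convert a millisecond offset to M:SS.mmm."""
    minutes, rest = divmod(ms, 60_000)
    seconds, millis = divmod(rest, 1_000)
    return f"{minutes}:{seconds:02d}.{millis:03d}"


def format_utterances(transcript):
    """Return one line per utterance; empty if speaker labels were not enabled."""
    lines = []
    # `or []` covers both a missing key and an explicit null value.
    for utterance in transcript.get("utterances") or []:
        start = format_timestamp(utterance["start"])
        end = format_timestamp(utterance["end"])
        lines.append(
            f"[{start} - {end}] Speaker {utterance['speaker']}: {utterance['text']}"
        )
    return lines


# Illustrative data only, shaped like the fields described above.
sample = {
    "id": "example-id",
    "status": "completed",
    "text": "Hello there. Hi.",
    "utterances": [
        {"speaker": "A", "start": 250, "end": 1480, "text": "Hello there."},
        {"speaker": "B", "start": 61_020, "end": 62_500, "text": "Hi."},
    ],
}

for line in format_utterances(sample):
    print(line)
```

Integer division keeps the conversion exact, which avoids the rounding drift that can appear when dividing milliseconds as floats.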
Using the HTTP API directly
Not using an SDK? The same flow works over plain HTTP: authenticate with your key in the authorization header (no Bearer prefix), submit to POST /v2/transcript, then poll (repeatedly call GET /v2/transcript/{id}) until the status is completed. The SDKs above do all of this for you, including uploading local files and polling.
All three examples read your key from the same ASSEMBLYAI_API_KEY environment variable you set in Before you begin. The cURL example also needs jq (brew install jq); the Python example needs the requests library (pip install requests); the JavaScript example needs Node.js 18+ (built-in fetch).
- cURL
- Python
- JavaScript
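As a rough illustration of the submit-then-poll flow, the sketch below uses only the Python standard library (urllib) rather than requests, so it runs without extra installs. Several details are assumptions not stated in this guide: the host https://api.assemblyai.com, the intermediate status values, the error status with its error field, and the 3-second polling interval. The function names are hypothetical.

```python
import json
import time
import urllib.request

BASE_URL = "https://api.assemblyai.com"  # assumed API host; not stated in this guide


def build_submit_request(api_key, audio_url, **options):
    """Build (but do not send) the POST /v2/transcript request.

    Extra keyword arguments such as speaker_labels=True are added to the JSON body.
    """
    body = json.dumps({"audio_url": audio_url, **options}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/v2/transcript",
        data=body,
        method="POST",
        headers={
            "authorization": api_key,  # the raw key, with no "Bearer " prefix
            "content-type": "application/json",
        },
    )


def send_json(request):
    """Send a prepared request and decode the JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def poll_until_done(fetch, interval_s=3.0, sleep=time.sleep):
    """Call fetch() until the transcript status is completed.

    Raises RuntimeError on an "error" status (assumed failure state).
    """
    while True:
        transcript = fetch()
        state = transcript["status"]
        if state == "completed":
            return transcript
        if state == "error":
            raise RuntimeError(transcript.get("error", "transcription failed"))
        sleep(interval_s)  # any other status is treated as still in progress


def transcribe(api_key, audio_url, **options):
    """Submit a URL, poll GET /v2/transcript/{id}, and return the final JSON."""
    submitted = send_json(build_submit_request(api_key, audio_url, **options))
    status_request = urllib.request.Request(
        f"{BASE_URL}/v2/transcript/{submitted['id']}",
        headers={"authorization": api_key},
    )
    return poll_until_done(lambda: send_json(status_request))
```

Passing fetch and sleep in as arguments keeps the polling loop independent of the network, so it can be exercised with canned responses before it is pointed at the real API.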
Submit the file, poll until the status is completed, then print the text. (The variable is named state because zsh reserves status.)
To transcribe a local file, upload it first and use the returned upload_url as the audio_url.
Limits
- File size: up to 5 GB per request (/v2/transcript); local files uploaded via /v2/upload up to 2.2 GB.
- Duration: 160 ms to 10 hours per file.
- Formats: most common audio and video formats — submit your file as-is, no transcoding needed.
- Concurrency: default 5 parallel jobs on free accounts, 200 on paid. Check yours on the rate limits page.
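The size and duration limits above can be checked locally before uploading. The following is a sketch under stated assumptions: it treats 1 GB as 10^9 bytes (this guide does not say whether the limits are decimal or binary), it covers only the /v2/upload size limit, and the function name check_upload_limits is hypothetical.

```python
MAX_UPLOAD_BYTES = 2_200_000_000        # 2.2 GB via /v2/upload; decimal GB is an assumption
MIN_DURATION_MS = 160                   # shortest accepted file
MAX_DURATION_MS = 10 * 60 * 60 * 1000   # 10 hours, in milliseconds


def check_upload_limits(size_bytes, duration_ms):
    """Return a list of human-readable problems; an empty list means within limits."""
    problems = []
    if size_bytes > MAX_UPLOAD_BYTES:
        problems.append(f"file is {size_bytes} bytes; upload limit is {MAX_UPLOAD_BYTES}")
    if duration_ms < MIN_DURATION_MS:
        problems.append(f"audio is {duration_ms} ms; minimum is {MIN_DURATION_MS} ms")
    elif duration_ms > MAX_DURATION_MS:
        problems.append(f"audio is {duration_ms} ms; maximum is {MAX_DURATION_MS} ms")
    return problems
```

In practice size_bytes could come from os.path.getsize(path); the duration has to be read with an audio tool of your choice, which is outside the scope of this sketch.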
Next steps
Now that you have transcribed your first audio file:
- Learn how you can do even more with Universal-3 Pro with prompting
- Explore our Speech Understanding features for more ways to analyze your audio data
- Learn more about searching, summarizing, or asking questions on your transcript with our LLM Gateway feature
- Find out how to use webhooks to get notified when your transcripts are ready