# Quickstart

> Learn how to transcribe and analyze an audio file.

## Overview

By the end of this guide, you'll have a working script that transcribes an audio file in a single SDK call. Build it with an AI coding agent, or write it yourself — both are below.

Prefer to try it first? Transcribe audio without writing any code in the [AssemblyAI Playground](https://www.assemblyai.com/playground).

## Before you begin

You'll need:

* **An API key** — grab one from [your dashboard](https://www.assemblyai.com/dashboard/api-keys). Every example below reads it from an environment variable, so set it once:

  ```bash theme={null}
  export ASSEMBLYAI_API_KEY=<your-key>
  ```

* **Python 3.8+ or Node.js 18+**, depending on which SDK you use.
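
If the variable is missing, `os.environ["ASSEMBLYAI_API_KEY"]` raises a bare `KeyError`. The sketch below is one way to fail with a clearer message in Python; `get_api_key` is a hypothetical helper name, not part of the AssemblyAI SDK.

```python
import os


def get_api_key(env_var: str = "ASSEMBLYAI_API_KEY") -> str:
    """Return the API key from the environment, or fail with a readable message.

    Hypothetical helper for illustration; it is not part of the SDK.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"{env_var} is not set. Run `export {env_var}=<your-key>` and try again."
        )
    return key
```

In the examples below you could then write `aai.settings.api_key = get_api_key()` in place of the direct `os.environ[...]` lookup.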

**Building with an AI coding agent?** Wire it up to AssemblyAI's live docs (MCP server) and the AssemblyAI skill so it writes correct, up-to-date code instead of relying on stale training data:

```bash theme={null}
claude mcp add --transport http --scope user assemblyai-docs https://assemblyai.com/docs/mcp
npx skills add AssemblyAI/assemblyai-skill --global
```

Then describe what you want to build. To get the same result as the steps below, paste:

```text theme={null}
Use the AssemblyAI Python SDK to transcribe https://assembly.ai/wildfires.mp3 and print the transcript text.
```

## Transcribe your first file

Prefer to write it yourself? Follow these steps to transcribe our hosted sample file. The SDK uploads, submits, and polls for you in a single call.

### Step 1: Install the SDK

<Tabs groupId="language">
  <Tab language="python-sdk" title="Python SDK" default>
    ```bash theme={null}
    pip install assemblyai
    ```
  </Tab>

  <Tab language="javascript-sdk" title="JavaScript SDK">
    ```bash theme={null}
    npm install assemblyai
    ```
  </Tab>
</Tabs>

### Step 2: Run your first transcription

Save this as `transcribe.py` (Python) or `transcribe.js` (JavaScript):

<Tabs groupId="language">
  <Tab language="python-sdk" title="Python SDK" default>
    ```python theme={null}
    import os
    import assemblyai as aai

    aai.settings.api_key = os.environ["ASSEMBLYAI_API_KEY"]

    transcript = aai.Transcriber().transcribe("https://assembly.ai/wildfires.mp3")
    print(transcript.text)
    ```
  </Tab>

  <Tab language="javascript-sdk" title="JavaScript SDK">
    ```javascript theme={null}
    import { AssemblyAI } from "assemblyai";

    const client = new AssemblyAI({ apiKey: process.env.ASSEMBLYAI_API_KEY });

    const transcript = await client.transcripts.transcribe({
      audio: "https://assembly.ai/wildfires.mp3",
    });
    console.log(transcript.text);
    ```
  </Tab>
</Tabs>

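Then run it with `python transcribe.py` or `node transcribe.js`. The JavaScript example uses `import` and top-level `await`, so if Node.js reports `Cannot use import statement outside a module`, set `"type": "module"` in your `package.json` or rename the file to `transcribe.mjs`. You'll see the transcript printed: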

```text theme={null}
Smoke from hundreds of wildfires in Canada is triggering air quality alerts throughout the US...
```

That's the whole first call. From here you can add options such as speaker labels, language detection, or a local file. See the [complete example](#complete-example) to combine them, or use the [HTTP API directly](#using-the-http-api-directly) if you're not using an SDK.

## Customize your request

The call above works with no extra configuration. Add capabilities by setting options on the same request — combine as many as you need (the [complete example](#complete-example) sets several at once).

### Transcribe a local file

Pass a file path instead of a URL; the SDK uploads it for you.

<Tabs groupId="language">
  <Tab language="python-sdk" title="Python SDK" default>
    ```python theme={null}
    transcript = aai.Transcriber().transcribe("./example.mp3")
    ```
  </Tab>

  <Tab language="javascript-sdk" title="JavaScript SDK">
    ```javascript theme={null}
    const transcript = await client.transcripts.transcribe({
      audio: "./example.mp3",
    });
    ```
  </Tab>
</Tabs>

### Identify speakers

Enable [Speaker Diarization](/pre-recorded-audio/label-speakers) to split the transcript by speaker. Each labeled segment (an *utterance*) has a speaker ID and its text.

<Tabs groupId="language">
  <Tab language="python-sdk" title="Python SDK" default>
    ```python theme={null}
    config = aai.TranscriptionConfig(speaker_labels=True)
    transcript = aai.Transcriber().transcribe("https://assembly.ai/wildfires.mp3", config=config)

    for utterance in transcript.utterances:
        print(f"Speaker {utterance.speaker}: {utterance.text}")
    ```
  </Tab>

  <Tab language="javascript-sdk" title="JavaScript SDK">
    ```javascript theme={null}
    const transcript = await client.transcripts.transcribe({
      audio: "https://assembly.ai/wildfires.mp3",
      speaker_labels: true,
    });

    for (const utterance of transcript.utterances) {
      console.log(`Speaker ${utterance.speaker}: ${utterance.text}`);
    }
    ```
  </Tab>
</Tabs>
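
Once you have utterances, a common follow-up is totalling how long each speaker talked. The sketch below assumes each utterance also exposes `start` and `end` in milliseconds, matching the fields shown in [What you get back](#what-you-get-back), and that `utterances` may be `None` when speaker labels are off. `talk_time_by_speaker` is a hypothetical helper, and the sample objects use made-up timings so the code runs without an API call.

```python
from collections import defaultdict
from types import SimpleNamespace


def talk_time_by_speaker(utterances):
    """Sum utterance durations (end - start, in milliseconds) per speaker label."""
    totals = defaultdict(int)
    for utterance in utterances or []:  # tolerate None when diarization is off
        totals[utterance.speaker] += utterance.end - utterance.start
    return dict(totals)


# Stand-in objects with made-up timings; in practice pass transcript.utterances.
sample = [
    SimpleNamespace(speaker="A", start=100, end=26560, text="..."),
    SimpleNamespace(speaker="B", start=27000, end=30000, text="..."),
    SimpleNamespace(speaker="A", start=30500, end=31500, text="..."),
]

for speaker, ms in sorted(talk_time_by_speaker(sample).items()):
    print(f"Speaker {speaker}: {ms / 1000:.1f} s")
```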

### Detect the language automatically

Use [Automatic Language Detection](/pre-recorded-audio/language-detection) to detect the dominant spoken language.

<Tabs groupId="language">
  <Tab language="python-sdk" title="Python SDK" default>
    ```python theme={null}
    config = aai.TranscriptionConfig(language_detection=True)
    transcript = aai.Transcriber().transcribe("https://assembly.ai/wildfires.mp3", config=config)
    ```
  </Tab>

  <Tab language="javascript-sdk" title="JavaScript SDK">
    ```javascript theme={null}
    const transcript = await client.transcripts.transcribe({
      audio: "https://assembly.ai/wildfires.mp3",
      language_detection: true,
    });
    ```
  </Tab>
</Tabs>

## Complete example

Here's the complete, runnable script — the call above plus options and error handling:

<Tabs groupId="language">
  <Tab language="python-sdk" title="Python SDK" default>
    ```python expandable theme={null}
    import os
    import assemblyai as aai

    aai.settings.base_url = "https://api.assemblyai.com"
    aai.settings.api_key = os.environ["ASSEMBLYAI_API_KEY"]

    # Use a publicly-accessible URL
    audio_file = "https://assembly.ai/wildfires.mp3"

    # Or use a local file:
    # audio_file = "./example.mp3"

    config = aai.TranscriptionConfig(
        speech_models=["universal-3-pro", "universal-2"],
        language_detection=True,
        speaker_labels=True,
    )

    transcript = aai.Transcriber().transcribe(audio_file, config=config)

    if transcript.status == aai.TranscriptStatus.error:
        raise RuntimeError(f"Transcription failed: {transcript.error}")

    # Log transcript.id for every request (not just errors), with a timestamp and API region.
    # It's required to fetch results, retry, or delete the transcript later, and it's the first
    # thing support@assemblyai.com asks for. Delete: /pre-recorded-audio/delete-transcripts
    # Troubleshooting: /pre-recorded-audio/guides/common_errors_and_solutions

    print(f"\nFull Transcript:\n\n{transcript.text}")

    # Optionally print speaker diarization results
    # for utterance in transcript.utterances:
    #     print(f"Speaker {utterance.speaker}: {utterance.text}")
    ```
  </Tab>

  <Tab language="javascript-sdk" title="JavaScript SDK">
    ```javascript expandable theme={null}
    import { AssemblyAI } from "assemblyai";

    const baseUrl = "https://api.assemblyai.com";

    const client = new AssemblyAI({
      apiKey: process.env.ASSEMBLYAI_API_KEY,
      baseUrl: baseUrl,
    });

    // Use a publicly-accessible URL
    const audioFile = "https://assembly.ai/wildfires.mp3";

    // Or use a local file:
    // const audioFile = "./example.mp3";

    const params = {
      audio: audioFile,
      speech_models: ["universal-3-pro", "universal-2"],
      language_detection: true,
      speaker_labels: true,
    };

    const run = async () => {
      const transcript = await client.transcripts.transcribe(params);

      if (transcript.status === "error") {
        throw new Error(`Transcription failed: ${transcript.error}`);
      }

      // Log transcript.id for every request (not just errors), with a timestamp and API region.
      // It's required to fetch results, retry, or delete the transcript later, and it's the first
      // thing support@assemblyai.com asks for. Delete: /pre-recorded-audio/delete-transcripts
      // Troubleshooting: /pre-recorded-audio/guides/common_errors_and_solutions

      console.log(`\nFull Transcript:\n\n${transcript.text}`);

      // Optionally print speaker diarization results
      // for (const utterance of transcript.utterances) {
      //   console.log(`Speaker ${utterance.speaker}: ${utterance.text}`);
      // }
    };

    run();
    ```
  </Tab>
</Tabs>
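
The comments in the script recommend logging `transcript.id` for every request, together with a timestamp and the API region. Here is a minimal Python sketch of one way to do that as a JSON log line. The function name, the logger name, and the choice to record the base URL as a stand-in for the region are all assumptions for illustration.

```python
import json
import logging
from datetime import datetime, timezone

logger = logging.getLogger("transcripts")  # hypothetical logger name


def transcript_log_record(transcript_id, status, base_url):
    """Build one JSON log line with the fields worth keeping for every request."""
    return json.dumps(
        {
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "transcript_id": transcript_id,
            "status": status,
            # Assumption: the base URL is enough to tell which API region was used.
            "base_url": base_url,
        }
    )


logging.basicConfig(level=logging.INFO, format="%(message)s")
logger.info(
    transcript_log_record(
        "106993b6-ac12-45d0-b74a-1bbd923e755d",  # sample ID from this guide
        "completed",
        "https://api.assemblyai.com",
    )
)
```

In the Python script above, you would call this right after `transcribe()` returns and before the status check, so failed requests are logged too. Pass `transcript.id` and a plain-string status; if the SDK's status is an enum, its `.value` should work.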

## What you get back

A completed transcript includes the full `text` plus metadata, and per-speaker `utterances` when you enable `speaker_labels`. The SDK exposes these as attributes (`transcript.text`, `transcript.utterances[0].speaker`); the raw API returns the same fields as JSON:

```json theme={null}
{
  "id": "106993b6-ac12-45d0-b74a-1bbd923e755d",
  "status": "completed",
  "text": "Smoke from hundreds of wildfires in Canada is triggering air quality alerts...",
  "language_code": "en",
  "audio_duration": 282,
  "confidence": 0.95,
  "utterances": [
    {
      "speaker": "A",
      "text": "Smoke from hundreds of wildfires in Canada is triggering air quality alerts...",
      "confidence": 0.97,
      "start": 100,
      "end": 26560,
      "words": [
        { "text": "Smoke", "start": 100, "end": 640, "confidence": 0.9, "speaker": "A" }
      ]
    }
  ]
}
```

`start` and `end` are in milliseconds. Persist `id` to fetch, retry, or delete the transcript later. See the [transcript API reference](/api-reference/transcripts/get) for the complete field list.
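
Because the timestamps are in milliseconds, you will usually convert them before showing them to people. The sketch below works on a response dict trimmed from the JSON above. `format_ms` and `utterance_lines` are hypothetical helper names, and the output layout is only one possible choice.

```python
def format_ms(ms: int) -> str:
    """Convert milliseconds to MM:SS.mmm (or HH:MM:SS.mmm past one hour)."""
    seconds, millis = divmod(ms, 1000)
    minutes, seconds = divmod(seconds, 60)
    hours, minutes = divmod(minutes, 60)
    if hours:
        return f"{hours:02d}:{minutes:02d}:{seconds:02d}.{millis:03d}"
    return f"{minutes:02d}:{seconds:02d}.{millis:03d}"


def utterance_lines(transcript: dict) -> list:
    """Render each utterance as '[start - end] Speaker X: text'."""
    lines = []
    for u in transcript.get("utterances") or []:  # missing or null when diarization is off
        span = f"{format_ms(u['start'])} - {format_ms(u['end'])}"
        lines.append(f"[{span}] Speaker {u['speaker']}: {u['text']}")
    return lines


# Trimmed copy of the sample response shown above.
response = {
    "id": "106993b6-ac12-45d0-b74a-1bbd923e755d",
    "status": "completed",
    "utterances": [
        {
            "speaker": "A",
            "text": "Smoke from hundreds of wildfires in Canada is triggering air quality alerts...",
            "start": 100,
            "end": 26560,
        }
    ],
}

print("\n".join(utterance_lines(response)))
```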

## Using the HTTP API directly

Not using an SDK? The same flow works over plain HTTP: authenticate with your key in the `authorization` header (no `Bearer` prefix), submit to `POST /v2/transcript`, then poll (repeatedly call `GET /v2/transcript/{id}`) until the status is `completed` or `error`. The SDKs above do all of this for you, including uploading local files and polling.

All three examples read your key from the same `ASSEMBLYAI_API_KEY` environment variable you set in [Before you begin](#before-you-begin). The cURL example also needs [`jq`](https://jqlang.github.io/jq/) (for example, `brew install jq` on macOS); the Python example needs the `requests` library (`pip install requests`); the JavaScript example needs Node.js 18+ (built-in `fetch`).

<Tabs groupId="language">
  <Tab language="curl" title="cURL" default>
    Submit the file, poll until the status is `completed`, then print the text. (The variable is named `state` because zsh reserves `status`.)

    ```bash expandable theme={null}
    id=$(curl -s -X POST https://api.assemblyai.com/v2/transcript \
      -H "authorization: $ASSEMBLYAI_API_KEY" \
      -H "content-type: application/json" \
      -d '{
        "audio_url": "https://assembly.ai/wildfires.mp3",
        "speech_models": ["universal-3-pro", "universal-2"],
        "language_detection": true,
        "speaker_labels": true
      }' | jq -r .id)

    while true; do
      state=$(curl -s https://api.assemblyai.com/v2/transcript/$id \
        -H "authorization: $ASSEMBLYAI_API_KEY" | jq -r .status)
      [ "$state" = "completed" ] && break
      [ "$state" = "error" ] && { echo "Transcription failed"; break; }
      sleep 3
    done

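    # Print the text only if the transcript completed (skipped after an error)
    [ "$state" = "completed" ] && curl -s https://api.assemblyai.com/v2/transcript/$id \
      -H "authorization: $ASSEMBLYAI_API_KEY" | jq -r .text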
    ```

    To transcribe a local file, upload it first and use the returned `upload_url` as the `audio_url`:

    ```bash theme={null}
    curl -s -X POST https://api.assemblyai.com/v2/upload \
      -H "authorization: $ASSEMBLYAI_API_KEY" \
      --data-binary @./example.mp3 | jq -r .upload_url
    ```

    <Warning>
      The file must be streamed as raw bytes with `curl --data-binary @<file>` (note the `@`). Using `-d`/`--data`, or passing a JSON body or a file-path string, will return a successful `upload_url` but then fail downstream at transcription with a `Transcoding failed. File type application/json` or `text/plain` error. See [Troubleshoot Common Errors](/pre-recorded-audio/guides/common_errors_and_solutions) for details.
    </Warning>
  </Tab>

  <Tab language="python" title="Python">
    ```python expandable theme={null}
    import os
    import requests
    import time

    base_url = "https://api.assemblyai.com"
    headers = {"authorization": os.environ["ASSEMBLYAI_API_KEY"]}

    # Use a publicly-accessible URL
    audio_file = "https://assembly.ai/wildfires.mp3"

    # Or upload a local file:
    # with open("./example.mp3", "rb") as f:
    #     response = requests.post(base_url + "/v2/upload", headers=headers, data=f)
    #     if response.status_code != 200:
    #         print(f"Error: {response.status_code}, Response: {response.text}")
    #         response.raise_for_status()
    #     upload_json = response.json()
    #     audio_file = upload_json["upload_url"]

    data = {
        "audio_url": audio_file,
        "speech_models": ["universal-3-pro", "universal-2"],
        "language_detection": True,
        "speaker_labels": True
    }

    response = requests.post(base_url + "/v2/transcript", headers=headers, json=data)

    if response.status_code != 200:
        print(f"Error: {response.status_code}, Response: {response.text}")
        response.raise_for_status()

    transcript_json = response.json()
    transcript_id = transcript_json["id"]
    polling_endpoint = f"{base_url}/v2/transcript/{transcript_id}"

    while True:
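        poll_response = requests.get(polling_endpoint, headers=headers)
        poll_response.raise_for_status()  # surface HTTP errors instead of parsing an error body
        transcript = poll_response.json()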
        if transcript["status"] == "completed":
            print(f"\nFull Transcript:\n\n{transcript['text']}")

            # Optionally print speaker diarization results
            # for utterance in transcript['utterances']:
            #     print(f"Speaker {utterance['speaker']}: {utterance['text']}")
            break
        elif transcript["status"] == "error":
            raise RuntimeError(f"Transcription failed: {transcript['error']}")
        else:
            time.sleep(3)
    ```
  </Tab>

  <Tab language="javascript" title="JavaScript">
    ```javascript expandable theme={null}
    const baseUrl = "https://api.assemblyai.com";

    const headers = {
      authorization: process.env.ASSEMBLYAI_API_KEY,
    };

    async function transcribe() {
      // Use a publicly-accessible URL
      const audioFile = "https://assembly.ai/wildfires.mp3";

      // Or upload a local file (use this block instead of the line above):
      // const { readFile } = await import("node:fs/promises");
      // const audioData = await readFile("./example.mp3");
      // const uploadRes = await fetch(`${baseUrl}/v2/upload`, {
      //   method: "POST",
      //   headers,
      //   body: audioData,
      // });
      // if (!uploadRes.ok) throw new Error(`Error: ${uploadRes.status}`);
      // const uploadResponse = await uploadRes.json();
      // const audioFile = uploadResponse.upload_url;

      const data = {
        audio_url: audioFile,
        speech_models: ["universal-3-pro", "universal-2"],
        language_detection: true,
        speaker_labels: true,
      };

      let res = await fetch(`${baseUrl}/v2/transcript`, {
        method: "POST",
        headers: { ...headers, "Content-Type": "application/json" },
        body: JSON.stringify(data),
      });
      if (!res.ok) throw new Error(`Error: ${res.status}`);
      const transcriptResponse = await res.json();
      const transcriptId = transcriptResponse.id;
      const pollingEndpoint = `${baseUrl}/v2/transcript/${transcriptId}`;

      while (true) {
        res = await fetch(pollingEndpoint, { headers });
        if (!res.ok) throw new Error(`Error: ${res.status}`);
        const transcript = await res.json();

        if (transcript.status === "completed") {
          console.log(`\nFull Transcript:\n\n${transcript.text}`);

          // Optionally print speaker diarization results
          // for (const utterance of transcript.utterances) {
          //   console.log(`Speaker ${utterance.speaker}: ${utterance.text}`);
          // }
          break;
        } else if (transcript.status === "error") {
          throw new Error(`Transcription failed: ${transcript.error}`);
        } else {
          await new Promise((resolve) => setTimeout(resolve, 3000));
        }
      }
    }

    transcribe();
    ```
  </Tab>
</Tabs>
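
The polling loops above run until the transcript finishes. In a longer-running service you may want an upper bound on the wait. The sketch below is one way to structure that in Python. `wait_for_transcript` is a hypothetical helper, the 600-second default is an arbitrary assumption, and the in-progress status names in the demo (`queued`, `processing`) are assumed rather than taken from this page. The fetch function is passed in so the logic can run against canned responses.

```python
import time


def wait_for_transcript(fetch_status, interval=3.0, timeout=600.0,
                        sleep=time.sleep, clock=time.monotonic):
    """Call fetch_status() until the transcript completes, fails, or times out.

    fetch_status must return the parsed JSON body of GET /v2/transcript/{id}.
    """
    deadline = clock() + timeout
    while True:
        transcript = fetch_status()
        status = transcript["status"]
        if status == "completed":
            return transcript
        if status == "error":
            raise RuntimeError(f"Transcription failed: {transcript.get('error')}")
        if clock() >= deadline:
            raise TimeoutError(f"Transcript still '{status}' after {timeout} seconds")
        sleep(interval)


# Demo with canned responses and a no-op sleep, so no network call is made.
canned = iter([
    {"status": "queued"},
    {"status": "processing"},
    {"status": "completed", "text": "Smoke from hundreds of wildfires..."},
])
result = wait_for_transcript(lambda: next(canned), sleep=lambda _: None)
print(result["text"])
```

With the Python example above, the fetch function could be `lambda: requests.get(polling_endpoint, headers=headers).json()`.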

## Limits

* **File size:** up to 5 GB per request (`/v2/transcript`); local files uploaded via `/v2/upload` up to 2.2 GB.
* **Duration:** 160 ms to 10 hours per file.
* **Formats:** most common audio and video formats — submit your file as-is, no transcoding needed.
* **Rate limit:** default 5 parallel jobs on free accounts, 200 on paid. Check yours on the [rate limits page](https://www.assemblyai.com/dashboard/rate-limits).
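
To catch an oversized local file before uploading it, you can check its size first. The sketch below is a minimal Python example. It assumes "2.2 GB" means decimal gigabytes (adjust the constant if the limit is defined differently), and `check_upload_size` is a hypothetical helper name.

```python
import os

# Assumption: the 2.2 GB upload limit is read as 2.2 * 10^9 bytes.
MAX_UPLOAD_BYTES = 2_200_000_000


def check_upload_size(path, limit=MAX_UPLOAD_BYTES):
    """Return the file size in bytes, or raise if it is empty or exceeds the limit."""
    size = os.path.getsize(path)
    if size == 0:
        raise ValueError(f"{path} is empty")
    if size > limit:
        raise ValueError(f"{path} is {size} bytes; the upload limit is {limit} bytes")
    return size
```

Call `check_upload_size("./example.mp3")` before passing the path to the SDK or posting it to `/v2/upload`.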

## Next steps

Now that you have transcribed your first audio file:

* Learn how you can do even more with Universal-3 Pro with [prompting](/pre-recorded-audio/universal-3-pro/prompting)
* Explore [our Speech Understanding features](https://www.assemblyai.com/products/speech-understanding) for more ways to analyze your audio data
* Learn more about searching, summarizing, or asking questions about your transcript with [our LLM Gateway feature](/llm-gateway/quickstart)
* Find out how to use [webhooks](/pre-recorded-audio/webhooks) to get notified when your transcripts are ready

For more information, check out the full [API reference documentation](/).

## Need some help?

If you get stuck, or have any other questions, we'd love to help you out. Contact our support team at [support@assemblyai.com](mailto:support@assemblyai.com) or create a [support ticket](https://www.assemblyai.com/contact/support).
