> ## Documentation Index
> Fetch the complete documentation index at: https://assemblyai.com/docs/llms.txt
> Use this file to discover all available pages before exploring further.

# Migration guide: AWS Transcribe to AssemblyAI

This guide walks through the process of migrating from AWS Transcribe to AssemblyAI for transcribing pre-recorded audio.

### Get Started

Before we begin, make sure you have an AssemblyAI account and an API key. You can [sign up](https://assemblyai.com/dashboard/signup) for a free account and get your API key from your dashboard.

## Side-by-side code comparison

Below is a side-by-side comparison of a basic snippet to transcribe a file by AWS Transcribe and AssemblyAI:

<Tabs groupId="language">
  <Tab language="aws" title="AWS Transcribe">
    ```python expandable theme={null}
    import time
    import boto3

    def transcribe_file(job_name, file_uri, transcribe_client):
        transcribe_client.start_transcription_job(
            TranscriptionJobName=job_name,
            Media={"MediaFileUri": file_uri},
            MediaFormat="wav",
            LanguageCode="en-US",
        )

        max_tries = 60
        while max_tries > 0:
            max_tries -= 1

            job = transcribe_client.get_transcription_job(
                TranscriptionJobName=job_name
            )

            job_status = job["TranscriptionJob"]["TranscriptionJobStatus"]

            if job_status in ["COMPLETED", "FAILED"]:
                print(f"Job {job_name} is {job_status}.")

            if job_status == "COMPLETED":
                print(
                    f"Download the transcript from\n"
                    f"\t{job['TranscriptionJob']['Transcript']['TranscriptFileUri']}."
                )
                break
            else:
                print(f"Waiting for {job_name}. Current status is {job_status}.")
                time.sleep(10)

    def main():
        transcribe_client = boto3.client("transcribe")
        file_uri = "s3://test-transcribe/answer2.wav"
        transcribe_file("Example-job", file_uri, transcribe_client)

    if name == "main":
        main()
    ```
  </Tab>

  <Tab language="aai" title="AssemblyAI">
    ```python expandable theme={null}
    import assemblyai as aai

    aai.settings.api_key = "YOUR-API-KEY"
    transcriber = aai.Transcriber()

    # You can use a local filepath:
    # audio_file = "./example.mp3"
    # Or use a publicly-accessible URL:
    audio_file = (
        "https://assembly.ai/sports_injuries.mp3"
    )

    config = aai.TranscriptionConfig(
        speech_models=["universal-3-5-pro", "universal-2"],
        language_detection=True,
        speaker_labels=True,
    )
    transcript = transcriber.transcribe(audio_file, config)

    if transcript.status == aai.TranscriptStatus.error:
        print(f"Transcription failed: {transcript.error}")
        exit(1)

    print(transcript.text)
    for utterance in transcript.utterances:
        print(f"Speaker {utterance.speaker}: {utterance.text}")
    ```
  </Tab>
</Tabs>

## Installation

<Tabs groupId="language">
  <Tab language="aws" title="AWS Transcribe">
    ```python theme={null}
    import boto3
    import time

    transcribe_client = boto3.client("transcribe")
    ```
  </Tab>

  <Tab language="aai" title="AssemblyAI">
    ```python theme={null}
    import assemblyai as aai

    aai.settings.api_key = "YOUR-API-KEY"
    transcriber = aai.Transcriber()
    ```
  </Tab>
</Tabs>

When migrating from AWS to AssemblyAI, you'll first need to handle authentication and SDK setup:

Get your API key from your [AssemblyAI dashboard](https://www.assemblyai.com/dashboard/home)

Things to know:

* Store your API key securely in an environment variable
* API key authentication works the same across all AssemblyAI SDKs

## Audio File Sources

<Tabs groupId="language">
  <Tab language="aws" title="AWS Transcribe">
    ```python theme={null}
    def transcribe_file(job_name, file_uri, transcribe_client):
        transcribe_client.start_transcription_job(
            TranscriptionJobName=job_name,
            Media={"MediaFileUri": file_uri},
            MediaFormat="wav",
            LanguageCode="en-US",
        )
    ```
  </Tab>

  <Tab language="aai" title="AssemblyAI">
    ```python theme={null}
    transcriber = aai.Transcriber()

    # Local files

    transcript = transcriber.transcribe("./audio.mp3")

    # Public URLs

    transcript = transcriber.transcribe("https://example.com/audio.mp3")

    # S3 files (using pre-signed URLs)

    s3_client = boto3.client('s3')
    presigned_url = s3_client.generate_presigned_url(
    'get_object',
    Params={'Bucket': 'my-bucket', 'Key': 'audio.mp3'},
    ExpiresIn=3600
    )
    transcript = transcriber.transcribe(presigned_url)
    ```
  </Tab>
</Tabs>

Here are helpful things to know when migrating your audio input handling:

* There's no need to specify the audio format to AssemblyAI - it's auto-detected. AssemblyAI accepts almost every audio/video file type: [here is a full list of all our supported file types](/faq/what-audio-and-video-file-types-are-supported-by-your-api)
* Our SDK handles file upload and transcription automatically in one step
* For S3 files, you'll need to generate pre-signed URLs ([see example in cookbook](/pre-recorded-audio/guides/transcribe_from_s3))

## Basic Transcription

<Tabs groupId="language">
  <Tab language="aws" title="AWS Transcribe">
    ```python theme={null}
    while max_tries > 0:
        max_tries -= 1
        job = transcribe_client.get_transcription_job(
            TranscriptionJobName=job_name
        )
        job_status = job["TranscriptionJob"]["TranscriptionJobStatus"]
        if job_status in ["COMPLETED", "FAILED"]:
            break
        time.sleep(10)
    ```
  </Tab>

  <Tab language="aai" title="AssemblyAI">
    ```python theme={null}
    transcriber = aai.Transcriber()

    # Local files

    transcript = transcriber.transcribe("./audio.mp3")

    # Public URLs

    transcript = transcriber.transcribe("https://example.com/audio.mp3")

    # S3 files (using pre-signed URLs)

    s3_client = boto3.client('s3')
    presigned_url = s3_client.generate_presigned_url(
        'get_object',
        Params={'Bucket': 'my-bucket', 'Key': 'audio.mp3'},
        ExpiresIn=3600
    )
    transcript = transcriber.transcribe(presigned_url)
    ```
  </Tab>
</Tabs>

Here are helpful things to know about our `transcribe` method:

* The SDK handles polling under the hood
* Transcript is directly accessible via `transcript.text`
* English is the default language. We recommend specifying `speech_models=["universal-3-5-pro", "universal-2"]` for the highest accuracy
* We have a [cookbook for error handling common errors](/pre-recorded-audio/guides/common_errors_and_solutions) when using our API.

## Adding Features

<Tabs groupId="language">
  <Tab language="aws" title="AWS Transcribe">
    ```python theme={null}
    transcribe_client.start_transcription_job(
        TranscriptionJobName=job_name,
        Media={"MediaFileUri": file_uri},
        Settings={
            "ShowSpeakerLabels": True,
            "MaxSpeakerLabels": 2
        }
    )
    ```
  </Tab>

  <Tab language="aai" title="AssemblyAI">
    ```python theme={null}
    config = aai.TranscriptionConfig(
        speech_models=["universal-3-5-pro", "universal-2"],
        language_detection=True,
        speaker_labels=True,           # Speaker diarization
        auto_chapters=True,           # Auto chapter detection
        entity_detection=True,        # Named entity detection
    )

    transcript = transcriber.transcribe(audio_file, config)

    # Access speaker labels

    for utterance in transcript.utterances:
        print(f"Speaker {utterance.speaker}: {utterance.text}")

    ```
  </Tab>
</Tabs>

Key differences:

* Use `aai.TranscriptionConfig` to specify any extra features that you wish to use
* The results for Speaker Diarization are stored in `transcript.utterances`. To see the full transcript response object, refer to our [API Reference](/api-reference).
* Check our [documentation](/speech-understanding/getting-started) for our full list of available features and their parameters
