# Stream a pre-recorded file in real time

When you stream a pre-recorded audio file to the Streaming API, you need to send audio at the same pace it was recorded. If you send audio faster than real time, the server receives more data than it can process in sequence, which can cause degraded transcription accuracy, unexpected session closures, or other errors.

This guide shows you how to pace audio correctly so that the server processes it as if a person were speaking into a live microphone.

## Why real-time pacing matters

The Streaming API is designed for live audio. It expects audio to arrive at roughly the same rate it was originally spoken. When you stream a pre-recorded file without any pacing, your code reads and sends the entire file in seconds, even if the recording is minutes long. This causes problems:

* **Unexpected session behavior** — Sending audio faster than real time can overwhelm the connection and cause the server to close the session or return errors.
* **Inaccurate results** — The speech model is optimized for real-time input. Audio that arrives too quickly may not be processed the same way as live speech, potentially affecting transcription quality.
* **Unreliable benchmarks** — If you're evaluating transcription quality, faster-than-real-time streaming produces results that don't reflect production conditions where audio arrives at normal speed.

<Note>
  If you only need a transcript and don't need real-time results, use the [pre-recorded transcription API](/pre-recorded-audio/getting-started/transcribe-an-audio-file) instead. It processes audio as fast as possible and is optimized for batch workloads.
</Note>

## Before you begin

To complete this guide, you need:

* An AssemblyAI API key. [Sign up](https://assemblyai.com/dashboard/signup) and get your key from the [dashboard](https://www.assemblyai.com/dashboard/home).
* [Python 3.8+](https://www.python.org/) or [Node.js 18+](https://nodejs.org/).
* A WAV audio file (mono, 16-bit PCM). If your file is in a different format, see [Prepare your audio file](#prepare-your-audio-file).

## Quickstart

<Tabs groupId="language">
  <Tab language="python" title="Python" default>
    ```python expandable theme={null}
    import websocket
    import json
    import threading
    import time
    import wave
    import os
    from urllib.parse import urlencode

    # --- Configuration ---
    ASSEMBLYAI_API_KEY = os.environ["ASSEMBLYAI_API_KEY"]
    AUDIO_FILE = "audio.wav"
    CHUNK_DURATION = 0.1  # Send 100ms of audio per chunk
    SAMPLE_RATE = 16000   # Must match your audio file's sample rate

    CONNECTION_PARAMS = {
        "speech_model": "u3-rt-pro",
        "sample_rate": SAMPLE_RATE,
    }
    API_ENDPOINT = f"wss://streaming.assemblyai.com/v3/ws?{urlencode(CONNECTION_PARAMS)}"

    ws_app = None
    audio_thread = None
    stop_event = threading.Event()


    def on_open(ws):
        print("Connected. Streaming audio at real-time speed...")

        def stream_file():
            with wave.open(AUDIO_FILE, "rb") as wf:
                frames_per_chunk = int(wf.getframerate() * CHUNK_DURATION)
                start_time = time.monotonic()
                chunks_sent = 0

                while not stop_event.is_set():
                    frames = wf.readframes(frames_per_chunk)
                    if not frames:
                        break

                    ws.send(frames, websocket.ABNF.OPCODE_BINARY)
                    chunks_sent += 1

                    # Wall-clock pacing: sleep until the next chunk is due
                    next_chunk_time = start_time + (chunks_sent * CHUNK_DURATION)
                    sleep_duration = next_chunk_time - time.monotonic()
                    if sleep_duration > 0:
                        time.sleep(sleep_duration)

            print("Finished sending audio. Waiting for final transcripts...")
            try:
                ws.send(json.dumps({"type": "Terminate"}))
            except Exception:
                pass

        global audio_thread
        audio_thread = threading.Thread(target=stream_file, daemon=True)
        audio_thread.start()


    def on_message(ws, message):
        data = json.loads(message)

        if data["type"] == "Begin":
            print(f"Session ID: {data['id']}")
        elif data["type"] == "Turn":
            transcript = data.get("transcript", "")
            if not transcript:
                return
            if data.get("end_of_turn"):
                print(f"[Final]: {transcript}")
            else:
                print(f"[Partial]: {transcript}")
        elif data["type"] == "Termination":
            print(f"Done. Processed {data.get('audio_duration_seconds', 0)}s of audio.")


    def on_error(ws, error):
        print(f"Error: {error}")
        stop_event.set()


    def on_close(ws, status_code, msg):
        print(f"Disconnected (status={status_code})")
        stop_event.set()


    ws_app = websocket.WebSocketApp(
        API_ENDPOINT,
        header={"Authorization": ASSEMBLYAI_API_KEY},
        on_open=on_open,
        on_message=on_message,
        on_error=on_error,
        on_close=on_close,
    )

    ws_thread = threading.Thread(target=ws_app.run_forever, daemon=True)
    ws_thread.start()

    try:
        while ws_thread.is_alive():
            time.sleep(0.1)
    except KeyboardInterrupt:
        print("\nStopping...")
        stop_event.set()
        if ws_app and ws_app.sock and ws_app.sock.connected:
            try:
                ws_app.send(json.dumps({"type": "Terminate"}))
                time.sleep(2)
            except Exception:
                pass
        if ws_app:
            ws_app.close()
        ws_thread.join(timeout=2.0)
    ```
  </Tab>

  <Tab language="javascript" title="JavaScript">
    ```javascript expandable theme={null}
    import WebSocket from "ws";
    import { readFileSync } from "fs";
    import { stringify } from "querystring";

    // --- Configuration ---
    const ASSEMBLYAI_API_KEY = process.env.ASSEMBLYAI_API_KEY;
    const AUDIO_FILE = "audio.wav";
    const CHUNK_DURATION = 0.1; // Send 100ms of audio per chunk
    const SAMPLE_RATE = 16000;  // Must match your audio file's sample rate

    const CONNECTION_PARAMS = {
      speech_model: "u3-rt-pro",
      sample_rate: SAMPLE_RATE,
    };
    const API_ENDPOINT = `wss://streaming.assemblyai.com/v3/ws?${stringify(CONNECTION_PARAMS)}`;

    function readWavAudioData(filePath) {
      const fileBuffer = readFileSync(filePath);

      // Verify RIFF/WAVE header
      if (fileBuffer.toString("ascii", 0, 4) !== "RIFF" ||
          fileBuffer.toString("ascii", 8, 12) !== "WAVE") {
        throw new Error("Not a valid WAV file");
      }

      // Find the "data" chunk
      let offset = 12;
      while (offset < fileBuffer.length - 8) {
        const chunkId = fileBuffer.toString("ascii", offset, offset + 4);
        const chunkSize = fileBuffer.readUInt32LE(offset + 4);
        if (chunkId === "data") {
          return fileBuffer.subarray(offset + 8, offset + 8 + chunkSize);
        }
        // RIFF chunks are word-aligned: an odd-sized chunk is followed by one pad byte
        offset += 8 + chunkSize + (chunkSize % 2);
      }
      throw new Error("No data chunk found in WAV file");
    }

    function run() {
      const ws = new WebSocket(API_ENDPOINT, {
        headers: { Authorization: ASSEMBLYAI_API_KEY },
      });

      ws.on("open", () => {
        console.log("Connected. Streaming audio at real-time speed...");

        const audioData = readWavAudioData(AUDIO_FILE);

        const bytesPerSample = 2; // 16-bit PCM
        const bytesPerChunk = Math.floor(SAMPLE_RATE * CHUNK_DURATION) * bytesPerSample;
        let offset = 0;
        let chunksSent = 0;
        const startTime = Date.now();

        function sendNextChunk() {
          // Stop scheduling chunks if the socket closed early
          if (ws.readyState !== WebSocket.OPEN) return;
          if (offset >= audioData.length) {
            console.log("Finished sending audio. Waiting for final transcripts...");
            ws.send(JSON.stringify({ type: "Terminate" }));
            return;
          }

          const chunk = audioData.subarray(offset, offset + bytesPerChunk);
          ws.send(chunk);
          offset += bytesPerChunk;
          chunksSent++;

          // Wall-clock pacing: schedule the next chunk at the correct time
          const nextChunkTime = startTime + chunksSent * CHUNK_DURATION * 1000;
          const delay = nextChunkTime - Date.now();
          setTimeout(sendNextChunk, Math.max(0, delay));
        }

        sendNextChunk();
      });

      ws.on("message", (message) => {
        const data = JSON.parse(message);

        if (data.type === "Begin") {
          console.log(`Session ID: ${data.id}`);
        } else if (data.type === "Turn") {
          const transcript = data.transcript || "";
          if (!transcript) return;
          if (data.end_of_turn) {
            console.log(`[Final]: ${transcript}`);
          } else {
            console.log(`[Partial]: ${transcript}`);
          }
        } else if (data.type === "Termination") {
          console.log(`Done. Processed ${data.audio_duration_seconds || 0}s of audio.`);
        }
      });

      ws.on("error", (error) => console.error(`Error: ${error}`));

      ws.on("close", (code, reason) => {
        console.log(`Disconnected (status=${code})`);
      });
    }

    run();
    ```
  </Tab>
</Tabs>

## Step-by-step guide

### Install dependencies

<Tabs groupId="language">
  <Tab language="python" title="Python" default>
    ```bash theme={null}
    pip install websocket-client
    ```
  </Tab>

  <Tab language="javascript" title="JavaScript">
    ```bash theme={null}
    npm install ws
    ```
  </Tab>
</Tabs>

### Prepare your audio file

The Streaming API accepts raw audio samples. WAV is the simplest format to work with because it contains uncompressed PCM data that you can read directly.

Your audio file must be:

* **Mono** (single channel)
* **16-bit PCM** encoding
* A **sample rate** that matches the `sample_rate` connection parameter

If your file doesn't meet these requirements, convert it with [FFmpeg](https://ffmpeg.org/):

```bash theme={null}
ffmpeg -i input.mp3 -ar 16000 -ac 1 -sample_fmt s16 output.wav
```

To check your file's properties:

```bash theme={null}
ffprobe -v quiet -print_format json -show_streams audio.wav
```
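If FFmpeg isn't installed, you can check the same properties from Python. The following is a minimal sketch using the standard-library `wave` module. `check_wav` is a hypothetical helper name, and the 16 kHz default is an assumption taken from the examples in this guide.

```python
import wave


def check_wav(path, expected_sample_rate=16000):
    """Return a list of problems that would prevent streaming this file as-is."""
    problems = []
    with wave.open(path, "rb") as wf:
        if wf.getnchannels() != 1:
            problems.append(f"expected mono, found {wf.getnchannels()} channels")
        if wf.getsampwidth() != 2:  # sample width is reported in bytes
            problems.append(f"expected 16-bit samples, found {wf.getsampwidth() * 8}-bit")
        if wf.getframerate() != expected_sample_rate:
            problems.append(
                f"expected {expected_sample_rate} Hz, found {wf.getframerate()} Hz"
            )
    return problems
```

`check_wav("audio.wav")` returns an empty list when the file is ready to stream. Note that `wave.open` raises `wave.Error` for WAV files that contain compressed audio.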

### Configure the connection

Set your API key and match the `sample_rate` parameter to your audio file:

<Tabs groupId="language">
  <Tab language="python" title="Python" default>
    ```python theme={null}
    import websocket
    import json
    import threading
    import time
    import wave
    import os
    from urllib.parse import urlencode

    ASSEMBLYAI_API_KEY = os.environ["ASSEMBLYAI_API_KEY"]
    AUDIO_FILE = "audio.wav"
    CHUNK_DURATION = 0.1  # 100ms per chunk
    SAMPLE_RATE = 16000

    CONNECTION_PARAMS = {
        "speech_model": "u3-rt-pro",
        "sample_rate": SAMPLE_RATE,
    }
    API_ENDPOINT = f"wss://streaming.assemblyai.com/v3/ws?{urlencode(CONNECTION_PARAMS)}"
    ```
  </Tab>

  <Tab language="javascript" title="JavaScript">
    ```javascript theme={null}
    import WebSocket from "ws";
    import { readFileSync } from "fs";
    import { stringify } from "querystring";

    const ASSEMBLYAI_API_KEY = process.env.ASSEMBLYAI_API_KEY;
    const AUDIO_FILE = "audio.wav";
    const CHUNK_DURATION = 0.1; // 100ms per chunk
    const SAMPLE_RATE = 16000;

    const CONNECTION_PARAMS = {
      speech_model: "u3-rt-pro",
      sample_rate: SAMPLE_RATE,
    };
    const API_ENDPOINT = `wss://streaming.assemblyai.com/v3/ws?${stringify(CONNECTION_PARAMS)}`;
    ```
  </Tab>
</Tabs>

### Implement wall-clock pacing

The key to simulating real-time audio is **wall-clock pacing**. Instead of calling `sleep` for a fixed duration after each chunk (which accumulates drift from processing time), track elapsed time from the start and sleep only until the next chunk is due.

Here's the difference:

**Naive approach (not recommended)** — Fixed sleep after each send. Processing time adds up, so audio arrives progressively later than real time:

```python theme={null}
# Don't do this for benchmarking
while True:
    frames = wav_file.readframes(frames_per_chunk)
    if not frames:
        break
    ws.send(frames, websocket.ABNF.OPCODE_BINARY)
    time.sleep(CHUNK_DURATION)  # Drift accumulates over time
```

**Wall-clock approach (recommended)** — Calculate when each chunk should be sent based on the start time. This self-corrects any drift:

<Tabs groupId="language">
  <Tab language="python" title="Python" default>
    ```python theme={null}
    start_time = time.monotonic()
    chunks_sent = 0

    while not stop_event.is_set():
        frames = wav_file.readframes(frames_per_chunk)
        if not frames:
            break

        ws.send(frames, websocket.ABNF.OPCODE_BINARY)
        chunks_sent += 1

        # Sleep until the next chunk is due
        next_chunk_time = start_time + (chunks_sent * CHUNK_DURATION)
        sleep_duration = next_chunk_time - time.monotonic()
        if sleep_duration > 0:
            time.sleep(sleep_duration)
    ```
  </Tab>

  <Tab language="javascript" title="JavaScript">
    ```javascript expandable theme={null}
    const audioData = readWavAudioData(AUDIO_FILE); // Raw PCM bytes from the WAV data chunk (helper defined in the Quickstart)
    const bytesPerSample = 2; // 16-bit PCM
    const bytesPerChunk = Math.floor(SAMPLE_RATE * CHUNK_DURATION) * bytesPerSample;
    let offset = 0;
    let chunksSent = 0;
    const startTime = Date.now();

    function sendNextChunk() {
      if (offset >= audioData.length) {
        ws.send(JSON.stringify({ type: "Terminate" }));
        return;
      }

      const chunk = audioData.subarray(offset, offset + bytesPerChunk);
      ws.send(chunk);
      offset += bytesPerChunk;
      chunksSent++;

      // Schedule the next chunk at the correct wall-clock time
      const nextChunkTime = startTime + chunksSent * CHUNK_DURATION * 1000;
      const delay = nextChunkTime - Date.now();
      setTimeout(sendNextChunk, Math.max(0, delay));
    }
    ```
  </Tab>
</Tabs>

This approach uses `time.monotonic()` (Python) or `Date.now()` (JavaScript) to track elapsed time from the start of streaming. Each chunk is scheduled based on its position in the file, not relative to the previous chunk. If one iteration takes longer than expected, the next wait is shortened to compensate, which keeps the overall pace at real time. Note that `Date.now()` follows the system clock, so a clock adjustment mid-stream shifts the schedule; `performance.now()` is a monotonic alternative.
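To see how much the two strategies differ, you can simulate both with a virtual clock instead of real sleeps. This is an illustrative sketch only: `SEND_COST` is an assumed per-chunk overhead of 5 ms for reading and sending, and the function names are hypothetical.

```python
CHUNK_DURATION = 0.1   # seconds of audio per chunk, as in this guide
SEND_COST = 0.005      # assumed time spent reading and sending each chunk


def simulate_naive(num_chunks):
    """Fixed sleep after every send: per-chunk overhead is never recovered."""
    clock = 0.0
    for _ in range(num_chunks):
        clock += SEND_COST        # read + send
        clock += CHUNK_DURATION   # stands in for time.sleep(CHUNK_DURATION)
    return clock


def simulate_wall_clock(num_chunks):
    """Sleep until chunks_sent * CHUNK_DURATION: overhead is absorbed by the sleep."""
    clock = 0.0
    for chunks_sent in range(1, num_chunks + 1):
        clock += SEND_COST
        next_chunk_time = chunks_sent * CHUNK_DURATION
        clock += max(0.0, next_chunk_time - clock)
    return clock


num_chunks = 600  # 60 seconds of audio at 100 ms per chunk
audio_seconds = num_chunks * CHUNK_DURATION
naive_drift = simulate_naive(num_chunks) - audio_seconds
wall_clock_drift = simulate_wall_clock(num_chunks) - audio_seconds
print(f"naive drift after 60 s of audio: {naive_drift:.3f} s")
print(f"wall-clock drift after 60 s of audio: {abs(wall_clock_drift):.3f} s")
```

Under these assumptions the naive loop finishes about 3 seconds late (600 chunks × 5 ms), while the wall-clock loop stays on schedule because each sleep shrinks by the time already spent sending.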

### End the session

After you send all audio, send a `Terminate` message so the server can flush its buffers and return any remaining transcripts:

<Tabs groupId="language">
  <Tab language="python" title="Python" default>
    ```python theme={null}
    ws.send(json.dumps({"type": "Terminate"}))
    ```
  </Tab>

  <Tab language="javascript" title="JavaScript">
    ```javascript theme={null}
    ws.send(JSON.stringify({ type: "Terminate" }));
    ```
  </Tab>
</Tabs>

The server responds with a `Termination` message that includes the total audio duration processed. Wait for this message before closing the WebSocket connection so you don't miss any final transcripts.
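One way to enforce this ordering is to close the connection from the message handler itself. The sketch below assumes the message fields used elsewhere in this guide (`type`, `transcript`, `end_of_turn`). `make_on_message` and `on_transcript` are hypothetical names, and `ws` is any object with a `close()` method, such as a `websocket.WebSocketApp`.

```python
import json


def make_on_message(on_transcript):
    """Build an on_message callback that closes the socket only after Termination."""

    def on_message(ws, message):
        data = json.loads(message)
        msg_type = data.get("type")
        if msg_type == "Turn" and data.get("transcript"):
            # Forward the transcript text and whether it ends the turn.
            on_transcript(data["transcript"], bool(data.get("end_of_turn")))
        elif msg_type == "Termination":
            # The server has flushed its buffers, so closing here loses nothing.
            ws.close()

    return on_message
```

Pass the result as `on_message=make_on_message(your_callback)` when you create the `WebSocketApp`. The server may also close the connection on its own after `Termination`; calling `close()` in that case is harmless.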

## Choosing a chunk duration

The `CHUNK_DURATION` value controls how much audio you send in each message. Common values:

* **100ms** (`0.1`) — Good default. Balances network overhead with smooth pacing.
* **50ms** (`0.05`) — More closely simulates microphone input. Use this if you want behavior closest to a live mic stream.
* **200ms** (`0.2`) — Fewer network calls, slightly less real-time feel. Acceptable for most benchmarks.

Smaller chunks send more WebSocket messages but more closely approximate continuous microphone input. For benchmarking, 100ms is a good starting point.
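The trade-off is easy to quantify. The following sketch computes the message size and message rate for each chunk duration listed above, assuming 16 kHz mono 16-bit audio as in this guide's examples. `chunk_stats` is a hypothetical helper name.

```python
SAMPLE_RATE = 16000      # Hz, as in this guide's examples
BYTES_PER_SAMPLE = 2     # 16-bit PCM, mono


def chunk_stats(chunk_duration):
    """Return (bytes per message, messages per second) for a chunk duration in seconds."""
    samples_per_chunk = round(SAMPLE_RATE * chunk_duration)
    return samples_per_chunk * BYTES_PER_SAMPLE, round(1 / chunk_duration)


for duration in (0.05, 0.1, 0.2):
    size, rate = chunk_stats(duration)
    print(f"{round(duration * 1000)} ms: {size} bytes per message, {rate} messages/s")
```

At 16 kHz, a 100 ms chunk is 3,200 bytes sent 10 times per second. Halving the duration halves the message size and doubles the message count, while the total data rate stays at 32,000 bytes per second.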

## Common mistakes

| Mistake                 | Impact                                                                                 | Fix                                                            |
| ----------------------- | -------------------------------------------------------------------------------------- | -------------------------------------------------------------- |
| No pacing at all        | Audio arrives in seconds; session may close or return errors                           | Add wall-clock pacing as shown above                           |
| Naive fixed sleep       | Drift accumulates over a long file; audio arrives late                                 | Use wall-clock pacing with `time.monotonic()` or `Date.now()`  |
| Wrong sample rate       | Server interprets audio at the wrong speed                                             | Match `sample_rate` to your file. Check with `ffprobe`         |
| Sending stereo audio    | Only the first channel is used, or the session errors                                  | Convert to mono: `ffmpeg -i input.wav -ac 1 output.wav`        |
| Not sending `Terminate` | Server waits for more audio until the session times out, so you miss final transcripts | Always send `{"type": "Terminate"}` after the last audio chunk |

## Next steps

* [Transcribe audio files with Streaming](/streaming/guides/streaming_transcribe_audio_file) — Full example with audio playback and transcript saving.
* [Evaluate Streaming transcription accuracy with WER](/streaming/guides/evaluate_streaming_wer) — Benchmark your streaming transcription quality.
* [Common session errors and closures](/streaming/common-session-errors-and-closures) — Troubleshoot session disconnects.
