Transcribe streaming audio | AssemblyAI

Overview

By the end of this tutorial, you’ll be able to transcribe audio from your microphone.

By default, Universal-Streaming is set to transcribe English audio. If you’d like to enable multilingual streaming (support for English, Spanish, French, German, Italian, and Portuguese), enable multilingual transcription instead.

Streaming is now available in EU-West via streaming.eu.assemblyai.com. To use the EU streaming endpoint, replace streaming.assemblyai.com with streaming.eu.assemblyai.com in your connection configuration.

Before you begin

To complete this tutorial, you need:

Python or Node.

Here’s the full sample code of what you’ll build in this tutorial:

Python SDK

Python

JavaScript SDK

JavaScript

1 import logging
2 from typing import Type
3 
4 import assemblyai as aai
5 from assemblyai.streaming.v3 import (
6     BeginEvent,
7     StreamingClient,
8     StreamingClientOptions,
9     StreamingError,
10     StreamingEvents,
11     StreamingParameters,
12     StreamingSessionParameters,
13     TerminationEvent,
14     TurnEvent,
15 )
16 
17 api_key = "<YOUR_API_KEY>"
18 
19 logging.basicConfig(level=logging.INFO)
20 logger = logging.getLogger(__name__)
21 
22 
23 def on_begin(self: Type[StreamingClient], event: BeginEvent):
24     print(f"Session started: {event.id}")
25 
26 
27 def on_turn(self: Type[StreamingClient], event: TurnEvent):
28     print(f"{event.transcript} ({event.end_of_turn})")
29 
30 
31 def on_terminated(self: Type[StreamingClient], event: TerminationEvent):
32     print(
33         f"Session terminated: {event.audio_duration_seconds} seconds of audio processed"
34     )
35 
36 
37 def on_error(self: Type[StreamingClient], error: StreamingError):
38     print(f"Error occurred: {error}")
39 
40 
41 def main():
42     client = StreamingClient(
43         StreamingClientOptions(
44             api_key=api_key,
45             api_host="streaming.assemblyai.com",
46         )
47     )
48 
49     client.on(StreamingEvents.Begin, on_begin)
50     client.on(StreamingEvents.Turn, on_turn)
51     client.on(StreamingEvents.Termination, on_terminated)
52     client.on(StreamingEvents.Error, on_error)
53 
54     client.connect(
55         StreamingParameters(
56             sample_rate=16000,
57             format_turns=True,
58         )
59     )
60 
61     try:
62         client.stream(
63           aai.extras.MicrophoneStream(sample_rate=16000)
64         )
65     finally:
66         client.disconnect(terminate=True)
67 
68 
69 if __name__ == "__main__":
70     main()

Step 1: Install and import dependencies

Python SDK

Python

JavaScript SDK

JavaScript

Install the AssemblyAI Python SDK via PIP:

$ pip install assemblyai

Create a file called main.py and import the following packages at the top of your file:

1 import logging
2 from typing import Type
3 
4 import assemblyai as aai
5 from assemblyai.streaming.v3 import (
6     BeginEvent,
7     StreamingClient,
8     StreamingClientOptions,
9     StreamingError,
10     StreamingEvents,
11     StreamingParameters,
12     StreamingSessionParameters,
13     TerminationEvent,
14     TurnEvent,
15 )

Step 2: Configure the API key

In this step, you’ll configure your AssemblyAI API key to authenticate your application and enable access to the streaming transcription service.

Browse to API Keys in your dashboard, and then copy your API key.

Python SDK

Python

JavaScript SDK

JavaScript

Configure the SDK to use your API key. Replace <YOUR_API_KEY> with your copied API key.

1 api_key = "<YOUR_API_KEY>"

Authenticate with a temporary token

If you need to authenticate on the client, you can avoid exposing your API key by using temporary authentication tokens.

Step 3: Set up audio and websocket configuration

Python SDK

Python

JavaScript SDK

JavaScript

The Python SDK handles audio configuration automatically. You’ll specify the sample rate when connecting to the transcriber. If you don’t set a sample rate, it defaults to 16 kHz.

Step 4: Create event handlers

In this step, you’ll define event handlers to manage the different types of events emitted during the streaming session. The handlers will respond to session lifecycle events, transcription turns, errors, and session termination.

Python SDK

Python

JavaScript SDK

JavaScript

Implement basic event handlers. These handlers let your app respond to key streaming events:

on_begin – Logs when the session starts.
on_turn – Handles each transcription turn and optionally enables formatted turns.
on_terminated – Logs when the session ends and how much audio was processed.
on_error – Captures and prints any errors during streaming.

1 logging.basicConfig(level=logging.INFO)
2 logger = logging.getLogger(__name__)
3 
4 def on_begin(self: Type[StreamingClient], event: BeginEvent):
5     print(f"Session started: {event.id}")
6 
7 
8 def on_turn(self: Type[StreamingClient], event: TurnEvent):
9     print(f"{event.transcript} ({event.end_of_turn})")
10 
11 
12 def on_terminated(self: Type[StreamingClient], event: TerminationEvent):
13     print(
14         f"Session terminated: {event.audio_duration_seconds} seconds of audio processed"
15     )
16 
17 
18 def on_error(self: Type[StreamingClient], error: StreamingError):
19     print(f"Error occurred: {error}")

Message sequence and turn events

To get a better understanding of the turn event and the message sequences, check out our Message Sequence Breakdown page. This object is how you’ll receive your transcripts.

Step 5: Connect and start transcription

Streaming Speech-to-Text uses WebSockets to stream audio to AssemblyAI. This requires first establishing a connection to the API.

Python SDK

Python

JavaScript SDK

JavaScript

In the main function create a client and connect to the streaming service:

1     client = StreamingClient(
2         StreamingClientOptions(
3             api_key=api_key,
4             api_host="streaming.assemblyai.com",
5         )
6     )
7 
8     client.on(StreamingEvents.Begin, on_begin)
9     client.on(StreamingEvents.Turn, on_turn)
10     client.on(StreamingEvents.Termination, on_terminated)
11     client.on(StreamingEvents.Error, on_error)
12 
13     client.connect(
14         StreamingParameters(
15             sample_rate=16000,
16             format_turns=True,
17         )
18     )

Next, create a microphone stream and begin transcribing audio. Make sure the sample_rate matches the value you specified in the StreamingParameters when initializing the streaming client.

1     try:
2         client.stream(
3           aai.extras.MicrophoneStream(sample_rate=16000)
4         )

Step 6: Close the connection

Python SDK

Python

JavaScript SDK

JavaScript

Disconnect the client when you’re done:

1     finally:
2         client.disconnect(terminate=True)

The connection will also close automatically when you press Ctrl+C. In both cases, the .disconnect() handler will clean up the audio resources.

Note: Pricing is based on session duration so it is very important to close sessions properly to avoid unexpected usage and cost.

Next steps

To learn more about Streaming Speech-to-Text, see the following resources:

Need some help?

If you get stuck, or have any other questions, we’d love to help you out. Contact our support team at support@assemblyai.com or create a support ticket.