Streaming Migration Guide: Gladia to AssemblyAI

This guide walks through the process of migrating from Gladia to AssemblyAI for transcribing streaming audio.

Get Started

Before we begin, make sure you have an AssemblyAI account and an API key. You can sign up for a free account and get your API key from your AssemblyAI dashboard.

Side-By-Side Code Comparison

Below is a side-by-side comparison of basic snippets that transcribe live audio from a microphone with Gladia and with AssemblyAI:

import asyncio
import base64
import json
import signal

import pyaudio
import requests
from websockets.asyncio.client import connect
from websockets.exceptions import ConnectionClosedOK

# Constants
GLADIA_API_KEY = "<YOUR_GLADIA_API_KEY>"
GLADIA_API_URL = "https://api.gladia.io"

# Audio configuration
CHANNELS = 1
FORMAT = pyaudio.paInt16
FRAMES_PER_BUFFER = 3200
SAMPLE_RATE = 16_000

async def main():
    # Initialize the session
    config = {
        "encoding": "wav/pcm",
        "sample_rate": SAMPLE_RATE,
        "bit_depth": 16,
        "channels": CHANNELS,
        "language_config": {
            "languages": ["en"],
        },
    }

    # Start the live session
    response = requests.post(
        f"{GLADIA_API_URL}/v2/live",
        headers={"X-Gladia-Key": GLADIA_API_KEY},
        json=config,
        timeout=3,
    )

    if not response.ok:
        print(f"{response.status_code}: {response.text or response.reason}")
        exit(response.status_code)

    session_data = response.json()

    # Connect to the websocket
    async with connect(session_data["url"]) as websocket:
        print("\n################ Begin session ################\n")

        # Handle Ctrl+C to stop recording
        loop = asyncio.get_running_loop()
        loop.add_signal_handler(
            signal.SIGINT,
            lambda: loop.create_task(stop_recording(websocket)),
        )

        # Create tasks for sending audio and receiving transcripts
        send_audio_task = asyncio.create_task(send_audio(websocket))
        receive_transcript_task = asyncio.create_task(receive_transcript(websocket))

        await asyncio.wait([send_audio_task, receive_transcript_task])

async def stop_recording(websocket):
    print(">>>>> Ending the recording…")
    await websocket.send(json.dumps({"type": "stop_recording"}))
    await asyncio.sleep(0)

async def send_audio(websocket):
    # Initialize PyAudio
    p = pyaudio.PyAudio()

    # Open audio stream
    stream = p.open(
        format=FORMAT,
        channels=CHANNELS,
        rate=SAMPLE_RATE,
        input=True,
        frames_per_buffer=FRAMES_PER_BUFFER,
    )

    # Send audio chunks
    while True:
        data = stream.read(FRAMES_PER_BUFFER)
        data = base64.b64encode(data).decode("utf-8")
        json_data = json.dumps({"type": "audio_chunk", "data": {"chunk": data}})
        try:
            await websocket.send(json_data)
            await asyncio.sleep(0.1)  # Send audio every 100ms
        except ConnectionClosedOK:
            return

async def receive_transcript(websocket):
    # Process incoming messages
    async for message in websocket:
        content = json.loads(message)

        # Print transcripts
        if content["type"] == "transcript" and content["data"]["is_final"]:
            text = content["data"]["utterance"]["text"].strip()
            print(f"Final: {text}")

        # Print final results
        if content["type"] == "post_final_transcript":
            print("\n################ End of session ################\n")
            print(json.dumps(content, indent=2, ensure_ascii=False))

if __name__ == "__main__":
    asyncio.run(main())

Authentication

import asyncio
import base64
import json
import signal

import pyaudio
import requests
from websockets.asyncio.client import connect
from websockets.exceptions import ConnectionClosedOK

GLADIA_API_KEY = "<YOUR_GLADIA_API_KEY>"
Protect Your API Key

For improved security, store your API key as an environment variable.
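As a minimal sketch, the key can be read from the environment at startup (the variable name ASSEMBLYAI_API_KEY used here is a convention, not something the API requires):

```python
import os

# Read the key from the environment instead of hard-coding it.
# ASSEMBLYAI_API_KEY is a conventional name chosen for this example;
# the fallback placeholder keeps the snippet runnable during setup.
ASSEMBLYAI_API_KEY = os.environ.get("ASSEMBLYAI_API_KEY", "<YOUR_API_KEY>")
```

This keeps the key out of source control: set it once in your shell (for example, export ASSEMBLYAI_API_KEY=... on macOS/Linux) and the code picks it up at runtime.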

Connection Parameters & Microphone Setup

GLADIA_API_URL = "https://api.gladia.io"

CHANNELS = 1
FORMAT = pyaudio.paInt16
FRAMES_PER_BUFFER = 3200
SAMPLE_RATE = 16_000

config = {
    "encoding": "wav/pcm",
    "sample_rate": SAMPLE_RATE,
    "bit_depth": 16,
    "channels": CHANNELS,
    "language_config": {
        "languages": ["en"],
    },
}

Helpful information about our streaming model:

  • Universal-3 Pro Model — Connect to wss://streaming.assemblyai.com/v3/ws with speech_model=u3-rt-pro to use our latest, highest-accuracy streaming model — Universal-3 Pro.

  • Built-in Formatting — Universal-3 Pro always returns formatted transcripts with smart punctuation & casing. No extra parameter is needed.

  • Partial Transcripts — AssemblyAI streams interim results automatically. Universal-3 Pro emits partials during periods of silence, with at most one partial per silence period.
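The notes above translate into a connection URL rather than a JSON config body. A hedged sketch of assembling that URL (the speech_model value comes from the notes above; the sample_rate query-parameter name is an assumption to verify against the API reference):

```python
from urllib.parse import urlencode

# Sketch: AssemblyAI configures the stream via query parameters on the
# WebSocket URL instead of a POSTed config object. speech_model=u3-rt-pro
# is from the notes above; sample_rate is an assumed parameter name.
params = {
    "sample_rate": 16_000,
    "speech_model": "u3-rt-pro",
}
ENDPOINT = f"wss://streaming.assemblyai.com/v3/ws?{urlencode(params)}"
```

Because the configuration rides on the URL, there is no equivalent of Gladia's POST-to-/v2/live step before connecting.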

Open Microphone Stream & Create WebSocket

# Start the live session
response = requests.post(
    f"{GLADIA_API_URL}/v2/live",
    headers={"X-Gladia-Key": GLADIA_API_KEY},
    json=config,
    timeout=3,
)

if not response.ok:
    print(f"{response.status_code}: {response.text or response.reason}")
    exit(response.status_code)

session_data = response.json()

async def send_audio(websocket):
    # Initialize PyAudio
    p = pyaudio.PyAudio()

    # Open audio stream
    stream = p.open(
        format=FORMAT,
        channels=CHANNELS,
        rate=SAMPLE_RATE,
        input=True,
        frames_per_buffer=FRAMES_PER_BUFFER,
    )

    # Send audio chunks
    while True:
        data = stream.read(FRAMES_PER_BUFFER)
        data = base64.b64encode(data).decode("utf-8")
        json_data = json.dumps({"type": "audio_chunk", "data": {"chunk": data}})
        try:
            await websocket.send(json_data)
            await asyncio.sleep(0.1)  # Send audio every 100ms
        except ConnectionClosedOK:
            return
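One framing difference worth noting when porting send_audio: Gladia expects each PCM chunk base64-encoded inside a JSON envelope, while AssemblyAI's v3 streaming endpoint accepts the raw PCM bytes as a binary WebSocket frame. A hedged sketch of the two framings side by side (gladia_frame and assemblyai_frame are hypothetical helper names for illustration):

```python
import base64
import json

def gladia_frame(pcm: bytes) -> str:
    # Gladia: base64-encode the PCM and wrap it in a JSON envelope.
    chunk = base64.b64encode(pcm).decode("utf-8")
    return json.dumps({"type": "audio_chunk", "data": {"chunk": chunk}})

def assemblyai_frame(pcm: bytes) -> bytes:
    # AssemblyAI: the raw PCM bytes go out as-is in a binary frame,
    # so there is no envelope and no base64 size overhead.
    return pcm
```

In practice this means the AssemblyAI send loop can pass stream.read(...) straight to websocket.send(...), dropping the base64 and json.dumps steps entirely.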

Open WebSocket

# Connect to the websocket
async with connect(session_data["url"]) as websocket:
    print("\n################ Begin session ################\n")

    # Create tasks for sending audio and receiving transcripts
    send_audio_task = asyncio.create_task(send_audio(websocket))
    receive_transcript_task = asyncio.create_task(receive_transcript(websocket))

    await asyncio.wait([send_audio_task, receive_transcript_task])

Receive Messages from WebSocket

async def receive_transcript(websocket):
    # Process incoming messages
    async for message in websocket:
        content = json.loads(message)

        # Print transcripts
        if content["type"] == "transcript" and content["data"]["is_final"]:
            text = content["data"]["utterance"]["text"].strip()
            print(f"Final: {text}")

        # Print final results
        if content["type"] == "post_final_transcript":
            print("\n################ End of session ################\n")
            print(json.dumps(content, indent=2, ensure_ascii=False))

Helpful information about AssemblyAI’s message payloads:

  • Clear Message Types – Instead of checking is_final, you’ll receive explicit "Begin", "Turn", and "Termination" events, making your logic simpler and more readable.

  • Session Metadata Up-Front – The first "Begin" message delivers a session_id and expiry timestamp so you can immediately log or surface these for tracing or billing.

  • End-of-Turn Detection – Each "Turn" object includes an end_of_turn boolean. When end_of_turn is true, the transcript is a final, formatted result. When false, it is a partial transcript. Universal-3 Pro always returns formatted transcripts with smart punctuation & casing built in.

Close the WebSocket

async def stop_recording(websocket):
    print(">>>>> Ending the recording…")
    await websocket.send(json.dumps({"type": "stop_recording"}))
    await asyncio.sleep(0)

Helpful information about AssemblyAI’s WebSocket Closure:

  • Connection Diagnostics - If the socket closes unexpectedly, AssemblyAI supplies both a status code and a reason message (close_status_code, close_msg), so you know immediately whether the server timed out, refused authentication, or encountered a different error.

Session Shutdown

async with connect(session_data["url"]) as websocket:
    ...
    # Handle Ctrl+C to stop recording
    loop = asyncio.get_running_loop()
    loop.add_signal_handler(
        signal.SIGINT,
        lambda: loop.create_task(stop_recording(websocket)),
    )

Helpful information to know about AssemblyAI’s shutdown:

  • JSON Payload Difference - When closing the stream with AssemblyAI, your JSON payload will be { "type": "Terminate" } instead of { "type": "stop_recording" }.

  • No Metadata Race Condition - AssemblyAI provides session info at “Begin” and doesn’t append extra data at shutdown, making the exit faster and less error-prone.
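The payload difference above makes the AssemblyAI shutdown handler a near drop-in replacement for stop_recording. A minimal sketch (the websocket object is assumed to expose an async send(), as the websockets library used throughout this guide does):

```python
import json

async def terminate_session(websocket):
    # Request a graceful shutdown. Per the note above, AssemblyAI's
    # payload is {"type": "Terminate"} rather than Gladia's
    # {"type": "stop_recording"}.
    await websocket.send(json.dumps({"type": "Terminate"}))
```

Wire this into the same SIGINT handler shown earlier, replacing the stop_recording task with terminate_session.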

Resources

For additional information about using AssemblyAI’s Streaming Speech-to-Text API, you can also refer to: