Apply LLM Gateway to Streaming

Learn how to analyze streaming audio transcripts using LLM Gateway.

Overview

A Large Language Model (LLM) is a machine learning model that uses natural language processing (NLP) to generate text. LLM Gateway is a unified API that provides access to 15+ models from Claude, GPT, and Gemini through a single interface. You can use LLM Gateway to analyze streaming audio transcripts in real time, for example to summarize a live conversation or extract action items as they happen.

By the end of this tutorial, you’ll be able to use LLM Gateway to analyze a streaming audio transcript from your microphone.

Here’s the full sample code for what you’ll build in this tutorial:

```python
import logging
from typing import Type

import assemblyai as aai
from assemblyai.streaming.v3 import (
    BeginEvent,
    LLMGatewayResponseEvent,
    StreamingClient,
    StreamingClientOptions,
    StreamingError,
    StreamingEvents,
    StreamingParameters,
    TurnEvent,
    TerminationEvent,
)
from assemblyai.streaming.v3.models import LLMGatewayConfig, LLMGatewayMessage

api_key = "<YOUR_API_KEY>"
prompt = "Provide a brief summary of the transcript.\n\nTranscript: {{turn}}"

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

def on_begin(self: Type[StreamingClient], event: BeginEvent):
    print(f"Session started: {event.id}")

def on_turn(self: Type[StreamingClient], event: TurnEvent):
    if event.end_of_turn:
        print(f"\nTranscript:\n{event.transcript}\n")

def on_llm_response(self: Type[StreamingClient], event: LLMGatewayResponseEvent):
    # Extract the actual LLM response content from the data
    llm_content = event.data.get("choices", [{}])[0].get("message", {}).get("content", "")
    print(f"LLM Response:\n{llm_content}\n")

def on_terminated(self: Type[StreamingClient], event: TerminationEvent):
    print(
        f"Session terminated: {event.audio_duration_seconds} seconds of audio processed"
    )

def on_error(self: Type[StreamingClient], error: StreamingError):
    print(f"Error occurred: {error}")

def main():
    client = StreamingClient(
        StreamingClientOptions(
            api_key=api_key,
            api_host="streaming.assemblyai.com",
        )
    )

    client.on(StreamingEvents.Begin, on_begin)
    client.on(StreamingEvents.Turn, on_turn)
    client.on(StreamingEvents.LLMGatewayResponse, on_llm_response)
    client.on(StreamingEvents.Termination, on_terminated)
    client.on(StreamingEvents.Error, on_error)

    client.connect(
        StreamingParameters(
            sample_rate=16000,
            speech_model="u3-rt-pro",
            format_turns=True,
            llm_gateway=LLMGatewayConfig(
                model="claude-sonnet-4-20250514",
                messages=[
                    LLMGatewayMessage(role="user", content=prompt)
                ],
                max_tokens=4000
            )
        )
    )

    try:
        client.stream(
            aai.extras.MicrophoneStream(sample_rate=16000)
        )
    finally:
        client.disconnect(terminate=True)

if __name__ == "__main__":
    main()
```

Before you begin

To complete this tutorial, you need:

  • An AssemblyAI account with an API key.
  • Python installed on your machine.

Step 1: Install prerequisites

Install the AssemblyAI Python SDK via pip:

$ pip install "assemblyai[extras]"

Step 2: Connect to Universal Streaming

In this step, you’ll set up a connection to the Universal Streaming API with the llm_gateway parameter. This parameter configures LLM Gateway to process your streaming transcripts.

For more information about streaming transcription, see Transcribe streaming audio.

```python
import logging
from typing import Type

import assemblyai as aai
from assemblyai.streaming.v3 import (
    BeginEvent,
    LLMGatewayResponseEvent,
    StreamingClient,
    StreamingClientOptions,
    StreamingError,
    StreamingEvents,
    StreamingParameters,
    TurnEvent,
    TerminationEvent,
)
from assemblyai.streaming.v3.models import LLMGatewayConfig, LLMGatewayMessage

api_key = "<YOUR_API_KEY>"
prompt = "Provide a brief summary of the transcript.\n\nTranscript: {{turn}}"

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)
```

The llm_gateway parameter is a JSON-stringified object that follows the same interface as the LLM Gateway chat completions API. It accepts the following fields:

  • model (string): The model to use. See Available models.
  • messages (array): An array of message objects. The content field contains your prompt.
  • max_tokens (number): The maximum number of tokens to generate.
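
When calling the streaming API directly over a WebSocket rather than through the Python SDK, this object would be passed as a JSON string. A minimal sketch of building one, assuming only the field names in the table above (the SDK's LLMGatewayConfig builds this for you):

```python
import json

# Serialize an llm_gateway object as a JSON string, using the fields from
# the table above. The exact wire format is an assumption based on that
# table; with the Python SDK you pass LLMGatewayConfig instead.
llm_gateway = json.dumps({
    "model": "claude-sonnet-4-20250514",
    "messages": [
        {
            "role": "user",
            "content": "Provide a brief summary of the transcript.\n\nTranscript: {{turn}}",
        }
    ],
    "max_tokens": 4000,
})

print(llm_gateway)
```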

Step 3: Stream audio and analyze with LLM Gateway

In this step, you’ll stream audio from your microphone, print the transcribed text from completed turns, and receive LLM Gateway’s analysis of the transcript through the response event handler.

1. Set up the event handlers to stream audio and handle transcripts from completed turns.

```python
def on_begin(self: Type[StreamingClient], event: BeginEvent):
    print(f"Session started: {event.id}")

def on_turn(self: Type[StreamingClient], event: TurnEvent):
    if event.end_of_turn:
        print(f"\nTranscript:\n{event.transcript}\n")

def on_llm_response(self: Type[StreamingClient], event: LLMGatewayResponseEvent):
    # Extract the actual LLM response content from the data
    llm_content = event.data.get("choices", [{}])[0].get("message", {}).get("content", "")
    print(f"LLM Response:\n{llm_content}\n")

def on_terminated(self: Type[StreamingClient], event: TerminationEvent):
    print(
        f"Session terminated: {event.audio_duration_seconds} seconds of audio processed"
    )

def on_error(self: Type[StreamingClient], error: StreamingError):
    print(f"Error occurred: {error}")
```
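
The nested .get() chain in on_llm_response follows a chat-completions-style response shape (choices[0].message.content) and falls back to an empty string if any level is missing. A standalone sketch of that pattern, using hypothetical sample payloads:

```python
def extract_content(data: dict) -> str:
    # Same defensive extraction pattern as in on_llm_response above:
    # each .get() supplies a default so missing keys never raise KeyError.
    return data.get("choices", [{}])[0].get("message", {}).get("content", "")

# A well-formed payload (hypothetical sample data):
full = {"choices": [{"message": {"content": "A short summary."}}]}
print(extract_content(full))  # A short summary.

# A payload missing every level falls through to the empty-string default:
print(repr(extract_content({})))  # ''
```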
2. Handle LLM Gateway responses. When you use the Python SDK with LLMGatewayConfig, analysis responses arrive automatically through the LLMGatewayResponseEvent handler registered in the previous step; no separate call to the LLM Gateway chat completions API is needed.

3. Run the streaming session and analyze the transcript with LLM Gateway when the session ends.

```python
def main():
    client = StreamingClient(
        StreamingClientOptions(
            api_key=api_key,
            api_host="streaming.assemblyai.com",
        )
    )

    client.on(StreamingEvents.Begin, on_begin)
    client.on(StreamingEvents.Turn, on_turn)
    client.on(StreamingEvents.LLMGatewayResponse, on_llm_response)
    client.on(StreamingEvents.Termination, on_terminated)
    client.on(StreamingEvents.Error, on_error)

    client.connect(
        StreamingParameters(
            sample_rate=16000,
            speech_model="u3-rt-pro",
            format_turns=True,
            llm_gateway=LLMGatewayConfig(
                model="claude-sonnet-4-20250514",
                messages=[
                    LLMGatewayMessage(role="user", content=prompt)
                ],
                max_tokens=4000
            )
        )
    )

    try:
        client.stream(
            aai.extras.MicrophoneStream(sample_rate=16000)
        )
    finally:
        client.disconnect(terminate=True)

if __name__ == "__main__":
    main()
```

The output will look something like this:

Session started: de5d9927-73a6-4be8-b52d-b4c07be37e6b
Transcript: Hi, my name is Sonny.
Transcript: I am a voice agent.
Stopping...
Session terminated: 12s of audio processed
Analyzing conversation with LLM Gateway...
The speaker introduces themselves as Sonny and identifies as a voice agent.

Next steps

In this tutorial, you’ve learned how to analyze streaming audio transcripts using LLM Gateway. The type of output depends on your prompt, so try exploring different prompts to see how they affect the output. Here are a few more prompts to try:

  • “Provide an analysis of the transcript and offer areas to improve with exact quotes.”
  • “What’s the main take-away from the transcript?”
  • “Generate a set of action items from this transcript.”
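
Each of these can be swapped in by changing the prompt string before connecting, keeping the {{turn}} placeholder so the service can substitute the transcript (how the placeholder is filled is handled by the service, not your code). A minimal sketch:

```python
# Build an alternative prompt for LLMGatewayConfig, keeping the {{turn}}
# placeholder used throughout this tutorial. The placeholder is left
# verbatim for the service to fill in.
instruction = "Generate a set of action items from this transcript."
prompt = instruction + "\n\nTranscript: {{turn}}"

print(prompt)
```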

To learn more about LLM Gateway and streaming, see the following resources:

Need some help?

If you get stuck, or have any other questions, we’d love to help you out. Contact our support team at support@assemblyai.com or create a support ticket.