Segment a Phone Call Using LLM Gateway
This guide shows how to use AssemblyAI’s LLM Gateway to segment a phone call into phases, each with start and end timestamps and a short summary. LLM Gateway provides access to multiple LLM providers through a unified API.
Quickstart
import requests
import time

base_url = "https://api.assemblyai.com"

headers = {
    "authorization": "<YOUR_API_KEY>"
}

# Step 1: Transcribe and get VTT subtitles
with open("./my-audio.mp3", "rb") as f:
    response = requests.post(base_url + "/v2/upload", headers=headers, data=f)

upload_url = response.json()["upload_url"]
data = {"audio_url": upload_url}

response = requests.post(base_url + "/v2/transcript", json=data, headers=headers)
transcript_id = response.json()['id']
polling_endpoint = base_url + "/v2/transcript/" + transcript_id

while True:
    transcription_result = requests.get(polling_endpoint, headers=headers).json()
    if transcription_result['status'] == 'completed':
        break
    elif transcription_result['status'] == 'error':
        raise RuntimeError(f"Transcription failed: {transcription_result['error']}")
    else:
        time.sleep(3)

# Get VTT subtitles
vtt_response = requests.get(f"{polling_endpoint}/vtt", headers=headers)
vtt_content = vtt_response.text

# Step 2: Define phases and analyze with LLM Gateway
phases = ["Introduction", "Complaint", "Resolution", "Goodbye"]

prompt = f'''
Analyze the following transcript of a phone call conversation and divide it into the following phases:
{', '.join(phases)}

You will be given the transcript in the format of VTT captions.

For each phase:
1. Identify the start and end timestamps (in seconds)
2. Provide a brief summary of what happened in that phase

Format your response as a JSON object with the following structure:
{{
    "phases": [
        {{
            "name": "Phase Name",
            "start_time": start_time_in_seconds,
            "end_time": end_time_in_seconds,
            "summary": "Brief summary of the phase"
        }},
        ...
    ]
}}

Ensure that all parts of the conversation are covered by a phase, using "Other" for any parts that don't fit into the specified phases.
'''

llm_gateway_data = {
    "model": "claude-sonnet-4-5-20250929",
    "messages": [
        {"role": "user", "content": f"{prompt}\n\nVTT Transcript:\n{vtt_content}"}
    ],
    "max_tokens": 2000
}

response = requests.post(
    "https://llm-gateway.assemblyai.com/v1/chat/completions",
    headers=headers,
    json=llm_gateway_data
)

result = response.json()["choices"][0]["message"]["content"]
print(result)
Get Started
Before we begin, make sure you have an AssemblyAI account and an API key. You can sign up for an AssemblyAI account and get your API key from your dashboard.
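If you prefer not to hard-code the key in your script, one option is to read it from an environment variable. The following is a minimal sketch: the variable name `ASSEMBLYAI_API_KEY` and the helper `build_headers` are assumptions made for this example, not names required by the API.

```python
import os


def build_headers(env_var: str = "ASSEMBLYAI_API_KEY") -> dict:
    """Build request headers from an API key stored in an environment variable.

    The variable name is an assumption for this sketch; use whatever
    name fits your setup.
    """
    api_key = os.environ.get(env_var)
    if not api_key:
        raise RuntimeError(f"Set the {env_var} environment variable to your API key.")
    # Same header shape as the "authorization" header used in this guide.
    return {"authorization": api_key}
```

You could then write `headers = build_headers()` in place of the literal `headers` dictionary used in the code in this guide.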
Step-by-Step Instructions
Install the required packages:
$ pip install requests
Set up your API client and transcribe the audio file:
import requests
import time

base_url = "https://api.assemblyai.com"

headers = {
    "authorization": "<YOUR_API_KEY>"
}

with open("./my-audio.mp3", "rb") as f:
    response = requests.post(base_url + "/v2/upload",
                             headers=headers,
                             data=f)

upload_url = response.json()["upload_url"]

data = {
    "audio_url": upload_url  # You can also use a URL to an audio or video file on the web
}

url = base_url + "/v2/transcript"
response = requests.post(url, json=data, headers=headers)

transcript_id = response.json()['id']
polling_endpoint = base_url + "/v2/transcript/" + transcript_id

while True:
    transcription_result = requests.get(polling_endpoint, headers=headers).json()

    if transcription_result['status'] == 'completed':
        print("Transcript ID:", transcript_id)
        break

    elif transcription_result['status'] == 'error':
        raise RuntimeError(f"Transcription failed: {transcription_result['error']}")

    else:
        time.sleep(3)

# Get VTT subtitles for timestamp information
vtt_response = requests.get(f"{polling_endpoint}/vtt", headers=headers)
vtt_content = vtt_response.text

with open(f"transcript_{transcript_id}.vtt", "w") as vtt_file:
    vtt_file.write(vtt_content)
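VTT cues carry timestamps in `HH:MM:SS.mmm` form (the hours field is optional), while the prompt in this guide asks the model for times in seconds. If you want to compare the two, a small converter can help. This is a sketch only: `vtt_cue_times` is a hypothetical helper written for this example, and it assumes standard WebVTT cue timing lines.

```python
import re

# Matches a WebVTT cue timing line such as "00:01.520 --> 00:15.570".
# The hours field is optional in WebVTT timestamps.
_CUE_RE = re.compile(
    r"(?:(\d+):)?(\d{2}):(\d{2})\.(\d{3})\s+-->\s+"
    r"(?:(\d+):)?(\d{2}):(\d{2})\.(\d{3})"
)


def _to_seconds(hours, minutes, seconds, millis) -> float:
    # Sum in integer milliseconds first to avoid accumulating float error.
    total_ms = (
        int(hours or 0) * 3_600_000
        + int(minutes) * 60_000
        + int(seconds) * 1_000
        + int(millis)
    )
    return total_ms / 1000


def vtt_cue_times(vtt_text: str) -> list:
    """Return (start, end) pairs in seconds for every cue in a VTT document."""
    cues = []
    for line in vtt_text.splitlines():
        match = _CUE_RE.match(line.strip())
        if match:
            groups = match.groups()
            cues.append((_to_seconds(*groups[:4]), _to_seconds(*groups[4:])))
    return cues
```

For instance, `vtt_cue_times(vtt_content)[-1][1]` would give the end time of the last cue, which is a rough upper bound for any `end_time` the model should report.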
Define the phases you want to identify.
Here is an example of phases that can be used for customer support calls:
phases = [
    "Introduction",
    "Complaint",
    "Resolution",
    "Goodbye"
]
Use LLM Gateway to analyze the transcript and divide it into phases. This is an example prompt, which you can modify to suit your specific requirements. See our documentation for more information about prompt engineering.
prompt = f'''
Analyze the following transcript of a phone call conversation and divide it into the following phases:
{', '.join(phases)}

You will be given the transcript in the format of VTT captions.

For each phase:
1. Identify the start and end timestamps (in seconds)
2. Provide a brief summary of what happened in that phase

Format your response as a JSON object with the following structure:
{{
    "phases": [
        {{
            "name": "Phase Name",
            "start_time": start_time_in_seconds,
            "end_time": end_time_in_seconds,
            "summary": "Brief summary of the phase"
        }},
        ...
    ]
}}

Ensure that all parts of the conversation are covered by a phase, using "Other" for any parts that don't fit into the specified phases.
'''

llm_gateway_data = {
    "model": "claude-sonnet-4-5-20250929",
    "messages": [
        {"role": "user", "content": f"{prompt}\n\nVTT Transcript:\n{vtt_content}"}
    ],
    "max_tokens": 2000
}

response = requests.post(
    "https://llm-gateway.assemblyai.com/v1/chat/completions",
    headers=headers,
    json=llm_gateway_data
)

result = response.json()["choices"][0]["message"]["content"]
print(result)
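The model's reply is a string, so to work with the phases programmatically you need to parse it. Models sometimes wrap JSON in a Markdown code fence or add a sentence around it, so a tolerant parser can be useful. The sketch below is one possible approach, not part of the LLM Gateway API: `parse_phases` is a hypothetical helper, and it assumes the reply contains exactly one JSON object with a `"phases"` key, as requested in the prompt.

```python
import json


def parse_phases(llm_text: str) -> list:
    """Extract the "phases" list from the model's reply.

    Assumes the reply contains a single JSON object, possibly wrapped in
    a Markdown code fence or surrounded by a little prose.
    """
    # Take everything from the first "{" to the last "}".
    start = llm_text.find("{")
    end = llm_text.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("No JSON object found in the model reply.")
    data = json.loads(llm_text[start:end + 1])
    if "phases" not in data:
        raise ValueError('The JSON object has no "phases" key.')
    return data["phases"]
```

With the code above, `parse_phases(result)` would return a list of dictionaries with the keys `name`, `start_time`, `end_time`, and `summary`, provided the model followed the requested format.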
Example output, with the transcript divided into phases and formatted as a JSON object:
{
  "phases": [
    {
      "name": "Introduction",
      "start_time": 1.52,
      "end_time": 15.57,
      "summary": "The customer service representative greets the caller and asks how they can help. The caller states they want to know the status of their order refund."
    },
    {
      "name": "Complaint",
      "start_time": 15.57,
      "end_time": 59.41,
      "summary": "The representative asks for the order ID, which the caller provides. The representative confirms the order details and that it was cancelled. The caller mentions they couldn't complete their test."
    },
    {
      "name": "Resolution",
      "start_time": 59.41,
      "end_time": 210.01,
      "summary": "The representative informs the caller that the refund was initiated on April 8th and will be credited by April 21st. They explain the refund timeline and bank processing days. The caller expresses some confusion about the timeline, and the representative clarifies the process."
    },
    {
      "name": "Goodbye",
      "start_time": 210.01,
      "end_time": 235.8,
      "summary": "The caller accepts the explanation. The representative asks if there's anything else they can help with, requests feedback, and concludes the call with a farewell."
    }
  ]
}
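Because the prompt asks for every part of the conversation to be covered, it can be worth checking the returned phases for gaps or overlaps before using them. The following sketch shows one way to do that and to compute each phase's duration. `find_gaps` and `phase_durations` are hypothetical helpers, the 0.5-second tolerance is an arbitrary choice, and the sample data reuses the timestamps from the example output above.

```python
def find_gaps(phases: list, tolerance: float = 0.5) -> list:
    """Return (previous_end, next_start) pairs where consecutive phases
    are separated, or overlap, by more than `tolerance` seconds."""
    ordered = sorted(phases, key=lambda p: p["start_time"])
    gaps = []
    for prev, nxt in zip(ordered, ordered[1:]):
        if abs(nxt["start_time"] - prev["end_time"]) > tolerance:
            gaps.append((prev["end_time"], nxt["start_time"]))
    return gaps


def phase_durations(phases: list) -> list:
    """Return (name, duration_in_seconds) for each phase, in input order."""
    return [
        (p["name"], round(p["end_time"] - p["start_time"], 2))
        for p in phases
    ]


# Timestamps taken from the example output; summaries omitted for brevity.
example = [
    {"name": "Introduction", "start_time": 1.52, "end_time": 15.57},
    {"name": "Complaint", "start_time": 15.57, "end_time": 59.41},
    {"name": "Resolution", "start_time": 59.41, "end_time": 210.01},
    {"name": "Goodbye", "start_time": 210.01, "end_time": 235.8},
]

print(find_gaps(example))        # an empty list means the phases are contiguous
print(phase_durations(example))
```

A non-empty result from `find_gaps` suggests the model skipped or double-counted part of the call, in which case you might adjust the prompt or rerun the request.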