Creating summarized chapters from podcasts

The Auto Chapters approach uses LLM Gateway to summarize audio data over time into chapters. Chapters make it easy for users to navigate and find specific information.

Each chapter contains the following:

  • Summary
  • One-line gist
  • Headline
  • Start and end timestamps
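
If you want to hold these fields in your own code, one option is a small container type. The sketch below is illustrative only: `Chapter` and `duration_seconds` are hypothetical names, not part of any AssemblyAI SDK, and it assumes timestamps are in milliseconds, as in the transcript responses.

```python
from dataclasses import dataclass


@dataclass
class Chapter:
    """One chapter of a transcript (hypothetical container, not an API type)."""
    headline: str
    gist: str
    summary: str
    start: int  # start of the chapter, assumed to be in milliseconds
    end: int    # end of the chapter, assumed to be in milliseconds

    def duration_seconds(self) -> float:
        """Length of the chapter in seconds."""
        return (self.end - self.start) / 1000


# Placeholder values, only to show the shape of the data.
example = Chapter(
    headline="Placeholder headline",
    gist="Placeholder gist",
    summary="Placeholder summary of the section.",
    start=0,
    end=90_000,
)
print(example.duration_seconds())
```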

The auto_chapters parameter on the transcription API is deprecated. Use LLM Gateway as shown below for more flexible and powerful chapter summaries.

In this step-by-step guide, you’ll learn how to generate chapter summaries using LLM Gateway. You’ll transcribe your audio, retrieve the paragraphs, and then use LLM Gateway to generate chapter summaries.

Get started

Before we begin, make sure you have an AssemblyAI account and an API key. You can sign up for a free account and get your API key from your dashboard.

Here’s an audio example for this guide:

https://assembly.ai/wildfires.mp3

Step-by-step instructions

1. Create a new file and import the necessary libraries.

import requests
import time

2. Set up the API endpoint and headers.

base_url = "https://api.assemblyai.com"

headers = {
    "authorization": "<YOUR_API_KEY>"
}
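
Rather than pasting the key into the file, you may prefer to read it from the environment. This is a minimal sketch: the helper `build_headers` and the variable name `ASSEMBLYAI_API_KEY` are assumptions made for illustration, so use whatever name your setup already defines.

```python
import os


def build_headers(env_var: str = "ASSEMBLYAI_API_KEY") -> dict:
    """Build the request headers from an environment variable.

    The variable name is an assumption for this sketch, not an official one.
    """
    api_key = os.environ.get(env_var)
    if not api_key:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return {"authorization": api_key}
```

You could then write `headers = build_headers()` in place of the literal dictionary.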

3. Upload your local file and submit a transcription request.

with open("./my-audio.mp3", "rb") as f:
    response = requests.post(base_url + "/v2/upload",
                             headers=headers,
                             data=f)

upload_url = response.json()["upload_url"]

data = {
    "audio_url": upload_url,
    "speech_models": ["universal-3-pro", "universal-2"],
    "language_detection": True
}

response = requests.post(base_url + "/v2/transcript", json=data, headers=headers)

transcript_id = response.json()['id']
polling_endpoint = base_url + "/v2/transcript/" + transcript_id

while True:
    transcription_result = requests.get(polling_endpoint, headers=headers).json()
    if transcription_result['status'] == 'completed':
        break
    elif transcription_result['status'] == 'error':
        raise RuntimeError(f"Transcription failed: {transcription_result['error']}")
    else:
        time.sleep(3)
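
The polling loop above runs until the transcript completes or fails. If you would rather give up after a fixed time, a variant with a timeout might look like the sketch below. The helper `wait_for_transcript` and its parameters are hypothetical; it takes any zero-argument callable that returns the transcript JSON, so it is not tied to a particular HTTP client.

```python
import time


def wait_for_transcript(fetch_status, interval=3.0, timeout=600.0, sleep=time.sleep):
    """Poll until the transcript completes, fails, or the timeout elapses.

    fetch_status: zero-argument callable returning the transcript JSON as a dict.
    interval:     seconds to wait between polls.
    timeout:      maximum number of seconds to keep polling.
    sleep:        injectable sleep function, handy for testing.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = fetch_status()
        if result["status"] == "completed":
            return result
        if result["status"] == "error":
            raise RuntimeError(f"Transcription failed: {result.get('error')}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Transcript not ready after {timeout} seconds")
        sleep(interval)
```

With the variables from this guide, a call could be `wait_for_transcript(lambda: requests.get(polling_endpoint, headers=headers).json())`.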

4. Get paragraphs from the transcript and group them into chapters.

paragraphs = requests.get(polling_endpoint + '/paragraphs', headers=headers).json()['paragraphs']

# Group paragraphs into chapters
step = 2  # Adjust to control chapter length
combined_paragraphs = []

for i in range(0, len(paragraphs), step):
    paragraph_group = paragraphs[i : i + step]
    start = paragraph_group[0]['start']
    end = paragraph_group[-1]['end']
    text = " ".join(p['text'] for p in paragraph_group)
    combined_paragraphs.append({"text": text, "start": start, "end": end})
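
To see what the grouping produces without calling the API, here is the same logic wrapped in a function and run on made-up data. The name `group_paragraphs` and the sample paragraphs are illustrative assumptions; real paragraphs carry additional fields.

```python
def group_paragraphs(paragraphs, step=2):
    """Combine every `step` consecutive paragraphs into one chapter dict."""
    chapters = []
    for i in range(0, len(paragraphs), step):
        group = paragraphs[i:i + step]
        chapters.append({
            "text": " ".join(p["text"] for p in group),
            "start": group[0]["start"],  # start of the first paragraph
            "end": group[-1]["end"],     # end of the last paragraph
        })
    return chapters


# Made-up paragraphs, for illustration only.
sample = [
    {"text": "First.", "start": 0, "end": 1000},
    {"text": "Second.", "start": 1000, "end": 2500},
    {"text": "Third.", "start": 2500, "end": 4000},
]
chapters = group_paragraphs(sample, step=2)
print(chapters)
```

When the number of paragraphs is not a multiple of `step`, the final chapter is simply shorter, as with the third paragraph here.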

5. Generate chapter summaries using LLM Gateway.

for chapter in combined_paragraphs:
    llm_gateway_data = {
        "model": "claude-sonnet-4-6",
        "messages": [
            {"role": "user", "content": f"Provide a brief one-paragraph summary, a one-line gist, and a headline for this section of a transcript.\n\nText: {chapter['text']}"}
        ],
        "max_tokens": 500
    }

    response = requests.post(
        "https://llm-gateway.assemblyai.com/v1/chat/completions",
        headers=headers,
        json=llm_gateway_data
    )

    result = response.json()["choices"][0]["message"]["content"]
    print(f"Chapter Start Time: {chapter['start']}")
    print(f"Chapter End Time: {chapter['end']}")
    print(f"Chapter Summary: {result}\n")
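
The start and end times are printed as raw numbers. Assuming they are in milliseconds, as in the transcript responses, a small formatter can make them easier to read. The helper name `format_timestamp` is hypothetical.

```python
def format_timestamp(ms: int) -> str:
    """Convert a millisecond offset into HH:MM:SS (fractions are dropped)."""
    total_seconds = ms // 1000
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


print(format_timestamp(83_500))     # 00:01:23
print(format_timestamp(3_725_000))  # 01:02:05
```

In the loop above, you could then print `format_timestamp(chapter['start'])` instead of the raw value.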

Understanding the response

The LLM Gateway returns a summary for each chapter group. You can customize the output format by adjusting the prompt. For example, you can request a JSON response:

prompt = """For this section of a transcript, provide the following in JSON format:
{
    "headline": "A single sentence headline",
    "gist": "A few words summarizing the section",
    "summary": "A one paragraph summary"
}

Text: """ + chapter['text']
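
A model asked for JSON may still wrap the object in extra text, so it can help to parse the reply defensively. The sketch below is one possible approach rather than a documented part of LLM Gateway: `parse_chapter_json` is a hypothetical helper that extracts the outermost `{...}` span and checks for the three keys requested in the prompt.

```python
import json

REQUIRED_KEYS = {"headline", "gist", "summary"}


def parse_chapter_json(content: str) -> dict:
    """Extract and validate the JSON object contained in a model reply."""
    start = content.find("{")
    end = content.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("No JSON object found in model output")
    # json.loads raises json.JSONDecodeError (a ValueError) on malformed JSON.
    data = json.loads(content[start:end + 1])
    missing = REQUIRED_KEYS - set(data)
    if missing:
        raise ValueError(f"Missing keys in model output: {sorted(missing)}")
    return data


# Made-up reply, for illustration only.
reply = 'Here is the JSON:\n{"headline": "H", "gist": "G", "summary": "S"}\nDone.'
parsed = parse_chapter_json(reply)
print(parsed["headline"])
```

You would pass it the `content` string read from the LLM Gateway response in the previous step.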

Conclusion

Creating chapter summaries using LLM Gateway gives you full control over the format and content of each chapter. You can customize the prompt to match your needs, use different models, and even use Structured Outputs for consistent JSON formatting.

This approach works on all kinds of input sources, not just podcasts. For example, you can use it to summarize lecture videos or other long-form content.

Next steps