Auto Chapters

The Auto Chapters model summarizes audio data over time into chapters. Chapters make it easy for users to navigate the audio and find specific information.

Each chapter contains the following:

  • Summary
  • One-line gist
  • Headline
  • Start and end timestamps

Quickstart

In the Creating summarized chapters from podcasts guide, the client uploads an audio file and configures the API request to generate auto chapters for the content.

You can explore the full JSON response in that guide.

You can run this code snippet in Colab, or view the full source code.

Note that Auto Chapters and Summarization cannot both be used in the same request. Additionally, Auto Chapters requires that punctuate: True be set in the request, as in the sketch below.
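
For reference, here is a minimal sketch of such a request using the AssemblyAI Python SDK. The API key and audio URL are placeholders, and the exact configuration used in the guide may differ:

```python
import assemblyai as aai

# Placeholder API key; substitute your own.
aai.settings.api_key = "YOUR_API_KEY"

# Enable Auto Chapters; punctuate must stay enabled (it defaults to True).
config = aai.TranscriptionConfig(auto_chapters=True, punctuate=True)

# Placeholder audio URL for illustration.
transcript = aai.Transcriber().transcribe(
    "https://example.com/podcast.mp3", config
)

for chapter in transcript.chapters:
    print(f"[{chapter.start}-{chapter.end} ms] {chapter.headline}")
```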

Understanding the response

The JSON response contains all information about the transcription. Depending on which models are used to analyze the audio, the attributes of this object will vary. For example, in the quickstart above we didn't enable Summarization, which is reflected by the summarization: false key-value pair in the JSON. Had we enabled Summarization, the summary, summary_type, and summary_model keys would contain the file summary (and additional details) rather than their current null values.

To access the Auto Chapters information, we use the auto_chapters and chapters keys:
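
As a minimal sketch (with an abbreviated, made-up response body standing in for the real one), reading those keys in Python might look like this:

```python
import json

# Illustrative sample of the relevant part of the response (not real output).
response_text = """
{
  "auto_chapters": true,
  "chapters": [
    {"gist": "Intro", "headline": "The hosts introduce the show.",
     "summary": "A short opening segment.", "start": 0, "end": 45000}
  ]
}
"""

results = json.loads(response_text)

# Only read chapters if Auto Chapters was enabled for this transcription.
if results["auto_chapters"]:
    for chapter in results["chapters"]:
        print(f'{chapter["start"]}-{chapter["end"]} ms: {chapter["gist"]}')
```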

The reference table below lists all relevant attributes along with their descriptions, where we've called the JSON response object results. Object attributes are accessed via dot notation, and arbitrary array elements are denoted with [i]. For example, results.words[i].text refers to the text attribute of the i-th element of the words array in the JSON results object.

| Attribute | Type | Description |
| --- | --- | --- |
| results.auto_chapters | boolean | Whether Auto Chapters was enabled in the transcription request |
| results.chapters | array | An array of temporally sequential chapters for the audio file |
| results.chapters[i].gist | string | An ultra-short summary (just a few words) of the content spoken in the i-th chapter |
| results.chapters[i].headline | string | A single-sentence summary of the content spoken during the i-th chapter |
| results.chapters[i].summary | string | A one-paragraph summary of the content spoken during the i-th chapter |
| results.chapters[i].start | number | The starting time, in milliseconds, of the i-th chapter |
| results.chapters[i].end | number | The ending time, in milliseconds, of the i-th chapter |
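
Since start and end are expressed in milliseconds, a small helper like the following (a sketch, not part of the API) can format them for display:

```python
def format_ms(ms: int) -> str:
    """Format a millisecond offset as H:MM:SS."""
    seconds, _ = divmod(ms, 1000)
    minutes, seconds = divmod(seconds, 60)
    hours, minutes = divmod(minutes, 60)
    return f"{hours}:{minutes:02d}:{seconds:02d}"

print(format_ms(754500))  # prints "0:12:34"
```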

Troubleshooting

Why am I not getting any chapter predictions for my audio file?

One possible reason is that the audio file doesn't contain enough variety in topic or tone for the model to identify separate chapters. Another is that background noise or low-quality audio is interfering with the model's analysis.

Can I specify the number of chapters to be generated by the Auto Chapters model?

No, the number of chapters generated by the Auto Chapters model isn't configurable by the user. The model automatically segments the audio file into logical chapters as the topic of conversation changes.

Can I use the Auto Chapters model and the Summarization model together in the same request?

No, the Auto Chapters model and the Summarization model can't be used together in the same request. If you attempt to enable both models in a single request, an error message is returned indicating that only one of the models can be enabled at a time.