Auto Chapters
The Auto Chapters model summarizes audio data over time into chapters. Chapters make it easy for users to navigate the content and find specific information.
Each chapter contains the following:
- Summary
- One-line gist
- Headline
- Start and end timestamps
Quickstart
In the Creating summarized chapters from podcasts guide, the client uploads an audio file and configures the API request to generate chapters for the content.
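As a minimal sketch of such a request, the snippet below submits a transcription job with Auto Chapters enabled using the `requests` library and polls for the result. The API key and audio URL are placeholders you would replace with your own values:

```python
import time

import requests

# Placeholders: substitute your own API key and a publicly accessible audio URL.
API_KEY = "YOUR_API_KEY"
AUDIO_URL = "https://example.com/podcast.mp3"

headers = {"authorization": API_KEY}

# Submit the transcription request with Auto Chapters enabled.
# punctuate is set explicitly because Auto Chapters requires it.
response = requests.post(
    "https://api.assemblyai.com/v2/transcript",
    headers=headers,
    json={"audio_url": AUDIO_URL, "auto_chapters": True, "punctuate": True},
)
transcript_id = response.json()["id"]

# Poll until the transcription finishes (status becomes "completed" or "error").
while True:
    result = requests.get(
        f"https://api.assemblyai.com/v2/transcript/{transcript_id}",
        headers=headers,
    ).json()
    if result["status"] in ("completed", "error"):
        break
    time.sleep(3)
```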
You can explore the full JSON response here.
You can run this code snippet in Colab here, or view the full source code here.
Note that Auto Chapters and Summarization cannot both be used in the same request. Additionally, Auto Chapters requires that `punctuate: True` be set in the request, as shown above.
Understanding the response
The JSON object above contains all information about the transcription. Depending on which Models are used to analyze the audio, the attributes of this object will vary. For example, in the quickstart above we did not enable Summarization, which is reflected by the `summarization: false` key-value pair in the JSON above. Had we activated Summarization, then the `summary`, `summary_type`, and `summary_model` key values would contain the file summary (and additional details) rather than the current `null` values.
To access the Auto Chapters information, we use the `auto_chapters` and `chapters` keys.
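For example, assuming the response has been parsed into a Python dictionary named `result` (as in the polling sketch above), a minimal access pattern might look like this:

```python
# `result` is the parsed JSON response from the polling loop above.
if result["auto_chapters"]:
    for chapter in result["chapters"]:
        print(chapter["gist"])
```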
The reference table below lists all relevant attributes along with their descriptions, where we've called the JSON response object `results`. Object attributes are accessed via dot notation, and arbitrary array elements are denoted with `[i]`.
For example, `results.words[i].text` refers to the `text` attribute of the i-th element of the `words` array in the JSON `results` object.
| Attribute | Type | Description |
| --- | --- | --- |
| `results.auto_chapters` | boolean | Whether Auto Chapters was enabled in the transcription request |
| `results.chapters` | array | An array of temporally sequential chapters for the audio file |
| `results.chapters[i].gist` | string | An ultra-short summary (just a few words) of the content spoken in the i-th chapter |
| `results.chapters[i].headline` | string | A single-sentence summary of the content spoken during the i-th chapter |
| `results.chapters[i].summary` | string | A one-paragraph summary of the content spoken during the i-th chapter |
| `results.chapters[i].start` | number | The starting time, in milliseconds, for the i-th chapter |
| `results.chapters[i].end` | number | The ending time, in milliseconds, for the i-th chapter |
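As an illustration of the attributes above, the following sketch (again assuming the parsed response is stored in a dict named `result`) iterates over the chapters and prints each headline and summary, converting the millisecond timestamps into a readable H:MM:SS format:

```python
def ms_to_timestamp(ms: int) -> str:
    """Convert a millisecond offset to an H:MM:SS string."""
    seconds = ms // 1000
    return f"{seconds // 3600}:{seconds % 3600 // 60:02d}:{seconds % 60:02d}"

for i, chapter in enumerate(result["chapters"], start=1):
    print(f"Chapter {i}: {chapter['headline']}")
    print(f"  {ms_to_timestamp(chapter['start'])} - {ms_to_timestamp(chapter['end'])}")
    print(f"  {chapter['summary']}")
```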
Troubleshooting
Why did the model not identify separate chapters in my audio file?
One possible reason is that the audio file doesn't contain enough variety in topic or tone for the model to identify separate chapters. Another possible cause is background noise or low-quality audio interfering with the model's analysis.
Can I specify the number of chapters that the model generates?
No, the number of chapters generated by the Auto Chapters model isn't configurable by the user. The model automatically segments the audio file into logical chapters as the topic of conversation changes.
Can I use Auto Chapters and Summarization in the same request?
No, the Auto Chapters model and the Summarization model can't be used together in the same request. If you attempt to enable both models in a single request, an error message is returned indicating that only one of the models can be enabled at a time.