Auto chapters
The Auto Chapters model provides a "summary over time" of the audio content, which makes it easy for users to navigate the audio and find specific information.
Quickstart
In the Creating summarized chapters from podcasts guide, the client uploads an audio file and configures the API request to create auto chapters of the content.
You can also view the full source code here.
Understanding the response
After you submit an audio file for transcription, the resulting object contains a chapters key that holds an array of the chapters detected in the audio file. Each chapter object in the array includes a summary, headline, and gist of the content spoken during that timeframe, along with start and end timestamps. You can access this information and display it in a format suited to the needs of your users.
# Summary: Dan Gilbert is a psychologist and happiness expert. His talk is recorded live at the Ted conference and contains powerful visuals. In 2 million years, the human brain has nearly tripled in mass. The prefrontal cortex is an experience simulator. Pilots practice in flight simulators so that they don't make real mistakes in planes. People can have experiences in their heads before they try them out in real life...
# Start: 8590, End: 1278920
# Headline: One of the main reasons that our brain got so big is because it got a new part called the frontal lobe, and particularly a part called the prefrontal cortex.
# Gist: The big brain.
# ...
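The example output above could come from a loop along the following lines. This is a minimal sketch rather than the guide's own code: it assumes the transcript has already been retrieved and parsed into a Python dictionary, `format_chapters` is a hypothetical helper name, and `sample_transcript` is a hand-built stand-in that reuses values from the output above.

```python
from typing import Any


def format_chapters(transcript: dict[str, Any]) -> list[str]:
    """Return one display block per chapter in a parsed transcript response."""
    blocks = []
    # Treat a missing or null "chapters" key as "no chapters".
    for chapter in transcript.get("chapters") or []:
        blocks.append(
            f"Summary: {chapter['summary']}\n"
            f"Start: {chapter['start']}, End: {chapter['end']}\n"
            f"Headline: {chapter['headline']}\n"
            f"Gist: {chapter['gist']}"
        )
    return blocks


# Hand-built sample shaped like the chapters array (assumption, not real API output).
sample_transcript = {
    "chapters": [
        {
            "summary": "Dan Gilbert is a psychologist and happiness expert.",
            "headline": (
                "One of the main reasons that our brain got so big is because "
                "it got a new part called the frontal lobe, and particularly a "
                "part called the prefrontal cortex."
            ),
            "gist": "The big brain.",
            "start": 8590,
            "end": 1278920,
        }
    ]
}

for block in format_chapters(sample_transcript):
    print(block)
    print("...")
```

Returning the blocks as strings, instead of printing inside the helper, keeps the formatting step separate from however you choose to display the chapters.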
Here is a reference table with all keys of a chapter object:
Key | Description |
start | Starting timestamp (in milliseconds) of the portion of audio being summarized |
end | Ending timestamp (in milliseconds) of the portion of audio being summarized |
summary | A one-paragraph summary of the content spoken during this timeframe |
headline | A single-sentence summary of the content spoken during this timeframe |
gist | An ultra-short summary, just a few words, of the content spoken during this timeframe |
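Because start and end are given in milliseconds, a navigation UI will usually want to convert them to a clock-style label. The sketch below shows one way to do that; `format_timestamp` and `chapter_label` are hypothetical helper names, and the chapter dictionary is hand-built from the example values in this section.

```python
def format_timestamp(ms: int) -> str:
    """Convert a millisecond offset to HH:MM:SS, discarding the fractional second."""
    total_seconds = ms // 1000
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


def chapter_label(chapter: dict) -> str:
    """Build a one-line navigation label from a chapter's timestamps and gist."""
    start = format_timestamp(chapter["start"])
    end = format_timestamp(chapter["end"])
    return f"[{start} - {end}] {chapter['gist']}"


# Values taken from the example chapter in this section.
example_chapter = {"start": 8590, "end": 1278920, "gist": "The big brain."}
print(chapter_label(example_chapter))  # [00:00:08 - 00:21:18] The big brain.
```

The gist is used for the label because it is the shortest of the three summaries; the headline or summary could be shown when the user expands a chapter.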
Troubleshooting
If the model returns no chapters, or fewer than you expect, one possible reason is that the audio does not vary enough in topic or tone for the model to identify separate chapters. Another possible reason is that background noise or low-quality audio is interfering with the model's analysis.
The number of chapters generated by the Auto Chapters model is not configurable. The model automatically segments the audio file into logical chapters as the topic of conversation changes.
The Auto Chapters model and the Summarization model cannot be used together in the same request. If you enable both in a single request, the API returns an error message indicating that only one of the models can be enabled at a time.
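Since the two models are mutually exclusive, a client can catch the conflict before sending the request. The following is a minimal sketch of such a guard, not part of any official SDK: `build_transcript_request` is a hypothetical helper, and the request field names `audio_url`, `auto_chapters`, and `summarization` are assumptions that should be checked against the API reference.

```python
def build_transcript_request(
    audio_url: str,
    auto_chapters: bool = False,
    summarization: bool = False,
) -> dict:
    """Build a transcription request body, rejecting the unsupported combination."""
    if auto_chapters and summarization:
        # Fail locally instead of waiting for the API to return an error.
        raise ValueError(
            "Auto Chapters and Summarization cannot be enabled in the same request"
        )
    payload = {"audio_url": audio_url}
    # Field names below are assumed; only include the flags that are switched on.
    if auto_chapters:
        payload["auto_chapters"] = True
    if summarization:
        payload["summarization"] = True
    return payload


# Placeholder URL for illustration only.
request_body = build_transcript_request(
    "https://example.com/podcast.mp3", auto_chapters=True
)
print(request_body)
```

Validating on the client side gives a clearer error at the call site and avoids a round trip for a request that is known to be rejected.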