
Auto Chapters

The Auto Chapters model provides a "summary over time" for audio content, making it easy for users to navigate and find specific information.

Quickstart

In the Creating summarized chapters from podcasts guide, the client uploads an audio file and configures the API request to generate Auto Chapters for the content.

You can also view the full source code here.
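For reference, enabling Auto Chapters typically requires setting only a single flag on the transcription request. The snippet below is a minimal sketch, assuming the AssemblyAI Python SDK, a placeholder API key, and a placeholder audio URL:

# Minimal sketch: enable Auto Chapters on a transcription request
import assemblyai as aai

aai.settings.api_key = "YOUR_API_KEY"  # placeholder key

# Request chapter generation alongside the transcript
config = aai.TranscriptionConfig(auto_chapters=True)

transcriber = aai.Transcriber()
transcript = transcriber.transcribe("https://example.com/podcast-episode.mp3", config=config)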

Understanding the response

After submitting an audio file for transcription, the resulting object contains a chapters key with information on the detected chapters in the audio file. Each chapter object within the array includes a summary, a headline, a gist, and start and end timestamps for the content spoken during that timeframe. This information can be accessed and displayed in a format customized to the specific needs of your users.

# Summary: Dan Gilbert is a psychologist and happiness expert. His talk is recorded live at the Ted conference and contains powerful visuals. In 2 million years, the human brain has nearly tripled in mass. The prefrontal cortex is an experience simulator. Pilots practice in flight simulators so that they don't make real mistakes in planes. People can have experiences in their heads before they try them out in real life...
# Start: 8590, End: 1278920
# Headline: One of the main reasons that our brain got so big is because it got a new part called the frontal lobe, and particularly a part called the prefrontal cortex.
# Gist: The big brain.
# ...
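Output like the sample above can be produced by iterating over the chapters array. The snippet below is a minimal sketch, assuming the same AssemblyAI Python SDK transcript object from the Quickstart:

# Print the details of each detected chapter
for chapter in transcript.chapters:
    print(f"Summary: {chapter.summary}")
    print(f"Start: {chapter.start}, End: {chapter.end}")
    print(f"Headline: {chapter.headline}")
    print(f"Gist: {chapter.gist}")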

Here is a reference table with all parameters of a chapter:

start: Starting timestamp (in milliseconds) of the portion of audio being summarized
end: Ending timestamp (in milliseconds) of the portion of audio being summarized
summary: A one-paragraph summary of the content spoken during this timeframe
headline: A single-sentence summary of the content spoken during this timeframe
gist: An ultra-short summary, just a few words, of the content spoken during this timeframe

Troubleshooting

Why am I not getting any chapter predictions for my audio file?

One possible reason is that the audio file does not contain enough variety in topic or tone for the model to identify separate chapters. Another is that background noise or low-quality audio is interfering with the model's analysis.

Can I specify the number of chapters to be generated by the Auto Chapters model?

No, the number of chapters generated by the Auto Chapters model is not configurable by the user. The model automatically segments the audio file into logical chapters as the topic of conversation changes.

Can I use the Auto Chapters model and the Summarization model together in the same request?

No, the Auto Chapters model and the Summarization model cannot be used together in the same request. If you attempt to enable both models in a single request, an error message will be returned indicating that only one of the models can be enabled at a time.
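For example, assuming the AssemblyAI Python SDK, a request configuration should enable only one of the two models at a time:

# Valid: Auto Chapters enabled on its own
config = aai.TranscriptionConfig(auto_chapters=True)

# Invalid: enabling both models in one request returns an error
# config = aai.TranscriptionConfig(auto_chapters=True, summarization=True)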