Identifying highlights in audio and video files

The Key Phrases model identifies significant words and phrases in your transcript and lets you extract the most important concepts or highlights from your audio or video file.

For example, a call center can use it to analyze highlights from recorded phone calls.

In this step-by-step guide, you’ll learn how to apply the model. You’ll send the auto_highlights parameter in your request, and then use the auto_highlights_result property in the response.

Get started

Before we begin, make sure you have an AssemblyAI account and an API key. You can sign up for a free account and get your API key from your dashboard.

The complete source code for this guide can be viewed here.

Here’s an audio sample for this guide:

https://assembly.ai/wildfires.mp3

Step-by-step instructions

Step 1

Create a new Python file and import the requests and time modules.

import requests
import time
Step 2

Set up the API endpoint and headers. The headers should include your API key.

base_url = "https://api.assemblyai.com"

headers = {
    "authorization": "<YOUR_API_KEY>"
}
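
Hard-coding the key works for a quick test, but a common alternative is to read it from an environment variable so it stays out of source control. The sketch below assumes a variable named ASSEMBLYAI_API_KEY; both that name and the build_headers helper are illustrative choices, not something this guide defines.

```python
import os


def build_headers(env=os.environ):
    """Return request headers, reading the API key from the environment.

    ASSEMBLYAI_API_KEY is an assumed variable name chosen for this sketch.
    """
    api_key = env.get("ASSEMBLYAI_API_KEY")
    if not api_key:
        raise RuntimeError("Set the ASSEMBLYAI_API_KEY environment variable first.")
    return {"authorization": api_key}
```

With the variable exported in your shell, headers = build_headers() would produce the same dictionary as the hard-coded version.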
Step 3

Upload your local file to the AssemblyAI API.

with open("./my-audio.mp3", "rb") as f:
    response = requests.post(base_url + "/v2/upload",
                             headers=headers,
                             data=f)

upload_url = response.json()["upload_url"]
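
The upload snippet reads upload_url without checking whether the request succeeded. One way to surface failures early is to wrap the call in a small helper, sketched below. The upload_file name and its post parameter are hypothetical; post is expected to behave like requests.post and is passed in so the helper can be exercised without network access.

```python
def upload_file(path, post, base_url, headers):
    """Upload a local file and return the upload URL reported by the API.

    `post` is any callable with the same interface as requests.post.
    """
    with open(path, "rb") as f:
        response = post(base_url + "/v2/upload", headers=headers, data=f)
    # Raise on HTTP errors (for example, a rejected API key) instead of
    # failing later with a confusing KeyError.
    response.raise_for_status()
    return response.json()["upload_url"]
```

In the guide's setup this would be called as upload_url = upload_file("./my-audio.mp3", requests.post, base_url, headers).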
Step 4

Use the upload_url returned by the AssemblyAI API to create a JSON payload containing the audio_url parameter and the auto_highlights parameter set to True. The example payload also sets the speech_models and language_detection parameters.

data = {
    "audio_url": upload_url,
    "speech_models": ["universal-3-pro", "universal-2"],
    "language_detection": True,
    "auto_highlights": True
}
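
If you build this payload in several places, a small helper can keep auto_highlights switched on while letting other options vary. The following is a minimal sketch; build_highlights_payload and extra_options are hypothetical names. It also assumes that audio_url accepts any publicly reachable URL, which would let you pass the sample file (https://assembly.ai/wildfires.mp3) directly instead of uploading a local copy.

```python
def build_highlights_payload(audio_url, extra_options=None):
    """Build the JSON body for a transcript request with Key Phrases enabled."""
    if not audio_url:
        raise ValueError("audio_url must be a non-empty string")
    # Start from the caller's extra options, then set the required keys so
    # they cannot be overridden by accident.
    payload = dict(extra_options or {})
    payload["audio_url"] = audio_url
    payload["auto_highlights"] = True
    return payload
```

For example, build_highlights_payload(upload_url, {"language_detection": True}) yields a dictionary with audio_url, language_detection, and auto_highlights set.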
Step 5

Make a POST request to the AssemblyAI API endpoint with the payload and headers.

url = base_url + "/v2/transcript"
response = requests.post(url, json=data, headers=headers)
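
Before reading the transcript ID from this response, it can help to confirm that the request was accepted. The helper below is one possible sketch; extract_transcript_id is a hypothetical name, and it assumes that a rejected request returns a JSON body with an error field, mirroring the error field of a failed transcript.

```python
def extract_transcript_id(status_code, body):
    """Return the transcript ID from a submit response, or raise on failure.

    `body` is the parsed JSON response as a dict.
    """
    if status_code >= 400:
        # Assumption: error responses carry a human-readable "error" field.
        raise RuntimeError(f"Request failed ({status_code}): {body.get('error', body)}")
    return body["id"]
```

It would be called as extract_transcript_id(response.status_code, response.json()).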
Step 6

After making the request, you’ll receive an ID for the transcription. Use it to poll the API every few seconds to check the status of the transcript job. Once the status is completed, you can retrieve the transcript from the API response, as well as the auto highlight results.

transcription_id = response.json()['id']
polling_endpoint = f"https://api.assemblyai.com/v2/transcript/{transcription_id}"

while True:
    # Fetch the current state of the transcript job.
    transcription_result = requests.get(polling_endpoint, headers=headers).json()

    if transcription_result['status'] == 'completed':
        # Print every highlight the model found.
        auto_highlights_result = transcription_result['auto_highlights_result']
        for highlight in auto_highlights_result['results']:
            print(f"Highlight: {highlight['text']}, Count: {highlight['count']}, Rank: {highlight['rank']}, Timestamps: {highlight['timestamps']}")
        break

    elif transcription_result['status'] == 'error':
        raise RuntimeError(f"Transcription failed: {transcription_result['error']}")

    else:
        # The job is not finished yet: wait before polling again.
        time.sleep(3)
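
The loop above polls indefinitely if the job never leaves a non-terminal state. A bounded variant might look like the sketch below. The wait_for_transcript helper, its fetch_status callable, and the 600-second default timeout are all assumptions made for illustration; the sleep and clock parameters exist only so the function can be tested without real delays.

```python
import time


def wait_for_transcript(fetch_status, interval=3.0, timeout=600.0,
                        sleep=time.sleep, clock=time.monotonic):
    """Poll until the job completes, fails, or the timeout elapses.

    fetch_status is any zero-argument callable that returns the transcript
    JSON as a dict.
    """
    deadline = clock() + timeout
    while True:
        result = fetch_status()
        if result["status"] == "completed":
            return result
        if result["status"] == "error":
            raise RuntimeError(f"Transcription failed: {result.get('error')}")
        if clock() >= deadline:
            raise TimeoutError("Transcript was not ready before the timeout.")
        sleep(interval)
```

With the variables from this guide, a call could look like wait_for_transcript(lambda: requests.get(polling_endpoint, headers=headers).json()).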

Understanding the response

The auto_highlights_result key in the response contains a results list with all the highlights found in the transcription text. Each entry includes the text of the phrase or word detected (text), how many times it occurred in the text (count), its relevancy score (rank), and a list of all the timestamps (timestamps), in milliseconds, in the audio where the phrase or word is spoken.
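
To make these fields easier to scan, you could sort the highlights by rank and print readable time offsets. The sketch below rests on two assumptions not stated in this guide: that a higher rank means a more relevant phrase, and that each timestamp is an object with start and end values in milliseconds. The helper names are hypothetical.

```python
def format_ms(ms):
    """Format a millisecond offset as M:SS."""
    minutes, seconds = divmod(ms // 1000, 60)
    return f"{minutes}:{seconds:02d}"


def summarize_highlights(auto_highlights_result, top_n=5):
    """Return the top_n highlights, highest rank first, as printable lines."""
    results = sorted(auto_highlights_result["results"],
                     key=lambda h: h["rank"], reverse=True)
    lines = []
    for h in results[:top_n]:
        # Assumption: each timestamp is a dict with "start" and "end" in ms.
        starts = ", ".join(format_ms(t["start"]) for t in h["timestamps"])
        lines.append(f"{h['text']} (x{h['count']}, rank {h['rank']:.2f}) at {starts}")
    return lines
```

Calling summarize_highlights(transcription_result["auto_highlights_result"]) after the job completes returns one line per phrase, ready to print.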

For more information about the API response, see API/Model reference.

Conclusion

Automatically highlighting relevant phrases in calls is a great way to focus on important information at a glance. In general, adding AI to Conversation Intelligence tools can augment them by generating actionable summaries to speed up call review, generating insights, monitoring for concerns, increasing engagement, and more. Our AI summarization model has several customizable parameters that you can experiment with for other types of recordings.

To learn more about how to use AI summarization for call coaching, see the AssemblyAI blog.