Key Phrases

The Key Phrases model identifies significant words and phrases in your transcription, so you can extract the most pertinent concepts or highlights from your audio or video file.

Getting started

In the Analyzing highlights of call center recordings guide, the client uploads an audio file and configures the API request to use Key Phrase extraction.

You can also view the full source code here.

Understanding the response

After you submit an audio file for transcription, the resulting object contains a key_phrases key that holds an array of the key phrases detected in the audio file. Each key phrase object in the array includes the text, count, rank, and timestamps fields. You can access these fields and display them in whatever format suits your users, for example:

# Key Phrase: human brain
# Timestamps: [{'start': 324900, 'end': 325516}, {'start': 325740, 'end': 326095}]
# Count: 1
# Rank: 0.22
#
# Key Phrase: prefrontal cortex
# Timestamps: [{'start': 327656, 'end': 328019}, {'start': 329876, 'end': 330294}, {'start': 332066, 'end': 332431}]
# Count: 18
# Rank: 1.0
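A listing like the one above could be produced with a few lines of Python. This is a minimal sketch, not the code from the guide: it assumes the transcript has already been fetched and parsed into a dict, and that key_phrases is a list of objects with the four fields described here. The function name format_key_phrases and the sample dict are hypothetical.

```python
from typing import Any


def format_key_phrases(transcript: dict[str, Any]) -> str:
    """Render each key phrase as a small text block.

    Assumption: transcript["key_phrases"] is a list of dicts with
    text, timestamps, count, and rank fields.
    """
    blocks = []
    for phrase in transcript.get("key_phrases", []):
        blocks.append(
            f"Key Phrase: {phrase['text']}\n"
            f"Timestamps: {phrase['timestamps']}\n"
            f"Count: {phrase['count']}\n"
            f"Rank: {phrase['rank']}"
        )
    # Separate the phrases with a blank line.
    return "\n\n".join(blocks)


# Hypothetical response fragment; the values are copied from the listing above.
sample = {
    "key_phrases": [
        {
            "text": "human brain",
            "timestamps": [
                {"start": 324900, "end": 325516},
                {"start": 325740, "end": 326095},
            ],
            "count": 1,
            "rank": 0.22,
        }
    ]
}

print(format_key_phrases(sample))
```

Using transcript.get with an empty default means a response without key phrases yields an empty string rather than raising a KeyError.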

Here is a reference table with all parameters of a result:

text: The phrase/word itself that was detected
count: How many times this phrase occurred in the text
rank: The relevancy of this phrase - the higher the score, the better
timestamps: A list of all the timestamps, in milliseconds, in the audio where each phrase/word is spoken
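Because timestamps are given in milliseconds, it can help to convert them to a clock-style string before showing them to users. The sketch below is one way to do that; the helper names ms_to_clock and describe_occurrences are hypothetical, and the phrase dict is assumed to follow the field layout in the table.

```python
def ms_to_clock(ms: int) -> str:
    """Convert a millisecond offset into an MM:SS.mmm string."""
    minutes, remainder = divmod(ms, 60_000)
    seconds, millis = divmod(remainder, 1_000)
    return f"{minutes:02d}:{seconds:02d}.{millis:03d}"


def describe_occurrences(phrase: dict) -> list[str]:
    """Return one 'start - end' string per timestamp of a key phrase."""
    return [
        f"{ms_to_clock(t['start'])} - {ms_to_clock(t['end'])}"
        for t in phrase["timestamps"]
    ]


# Hypothetical key phrase object using values from the example response.
phrase = {
    "text": "human brain",
    "timestamps": [{"start": 324900, "end": 325516}],
}
print(describe_occurrences(phrase))  # ['05:24.900 - 05:25.516']
```

Minutes are not wrapped into hours here, so a phrase spoken 75 minutes into a recording is shown as 75:00.000.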

Frequently Asked Questions

How does the Key Phrase model identify important phrases in my transcription?

The Key Phrase model uses natural language processing and machine learning algorithms to analyze the frequency and distribution of words and phrases in your transcription. The algorithm identifies key phrases based on their relevancy score, which takes into account factors such as the number of times a phrase occurs, the distance between occurrences, and the overall length of the transcription.

What is the difference between the Key Phrase model and the Topic Detection model?

The Key Phrase model is designed to identify important phrases and words in your transcription, whereas the Topic Detection model is designed to categorize your transcription into predefined topics. While both models use natural language processing and machine learning algorithms, they have different goals and approaches to analyzing your text.

Can the Key Phrase model handle misspelled or unrecognized words?

Yes, the Key Phrase model can handle misspelled or unrecognized words to some extent. However, the accuracy of the detection may depend on the severity of the misspelling or the obscurity of the word. It is recommended to provide high-quality, relevant audio files with accurate transcriptions for the best results.

What are some limitations of the Key Phrase model?

The Key Phrase model has a limited understanding of context, which may lead it to misjudge the most important phrases in some cases, such as text with heavy use of jargon or idioms. The model also assigns higher scores to words or phrases that occur more frequently in the text, so common words and phrases may be over-represented even when they are not especially important in context. Finally, the Key Phrase model is a general-purpose algorithm that cannot be easily customized or fine-tuned for specific domains, so it may not perform as well on specialized texts where certain keywords or concepts matter more than others.
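One possible way to soften the effect of frequent but unimportant phrases is to post-process the results on the client side. The sketch below is an illustration only and is not a feature of the model or the API: the function top_key_phrases, its min_rank, stop_phrases, and limit parameters, and the sample data are all hypothetical, and the input is assumed to be the list of key phrase objects described under "Understanding the response".

```python
from typing import Optional


def top_key_phrases(
    key_phrases: list[dict],
    min_rank: float = 0.0,
    stop_phrases: frozenset = frozenset(),
    limit: Optional[int] = None,
) -> list[dict]:
    """Filter and sort key phrases on the client side.

    Drops phrases whose rank is below min_rank or whose text is in
    stop_phrases (compared in lowercase), then sorts by rank, highest first.
    """
    kept = [
        p for p in key_phrases
        if p["rank"] >= min_rank and p["text"].lower() not in stop_phrases
    ]
    kept.sort(key=lambda p: p["rank"], reverse=True)
    return kept if limit is None else kept[:limit]


# Hypothetical data using the phrases from the example response.
sample_phrases = [
    {"text": "human brain", "count": 1, "rank": 0.22},
    {"text": "prefrontal cortex", "count": 18, "rank": 1.0},
]

for p in top_key_phrases(sample_phrases, min_rank=0.5):
    print(p["text"], p["rank"])
```

A domain-specific stop list (for example, a company name that appears in every call) is supplied by the caller, so the same helper can be reused across different kinds of recordings.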

How can I optimize the performance of the Key Phrase model?

To optimize the performance of the Key Phrase model, it is recommended to provide high-quality, relevant audio files with accurate transcriptions. It is also recommended to review and adjust the model's configuration parameters, such as the confidence threshold for key phrase detection, and to refer to the list of identified key phrases to guide the analysis. It may also be helpful to consider adding training data to the model or consulting with AssemblyAI support for further assistance.