Topic detection
The Topic Detection model leverages the IAB Content Taxonomy, a comprehensive list of 698 topics, to provide a "common language" for content description.
Quickstart
When submitting files for transcription, include the iab_categories parameter in your request body and set it to true.
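As a rough illustration of the request described above, the following Python sketch builds (but does not send) a transcription request with iab_categories enabled. The endpoint URL, the audio_url field, the authorization header, and the helper name build_transcript_request are assumptions made for this example; check them against the API reference before use.

```python
import json
import urllib.request

# Assumed endpoint for submitting a transcription job; verify in the API reference.
TRANSCRIPT_ENDPOINT = "https://api.assemblyai.com/v2/transcript"


def build_transcript_request(audio_url, api_key):
    """Build a POST request that asks for Topic Detection on one audio file.

    The request is only constructed here, not sent.
    """
    body = {
        "audio_url": audio_url,   # assumed field name for the audio location
        "iab_categories": True,   # enables Topic Detection, as described above
    }
    return urllib.request.Request(
        TRANSCRIPT_ENDPOINT,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "authorization": api_key,  # assumed header name for the API key
            "content-type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen would submit the job. The sketch stops short of that so it can be inspected without network access or a real API key.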
You can also view the transcription source code here.
Understanding the response
The response object contains information about the transcription process and its result, including the predicted topics and any additional data associated with it.
Most of the results are stored under the iab_categories_result key.
Key | Description |
status | Either success, or unavailable in the rare case that the Topic Detection model failed |
results | A list of topics predicted for the audio file, including the text that influenced each topic label prediction and other metadata about relevancy and timestamps |
results.text | The transcription text for the portion of audio that was classified with topic labels |
results.timestamp | The start and end time, in milliseconds, of where the text in results.text was spoken in the audio file |
results.labels | The list of labels predicted for this portion of text. The relevance key gives a score between 0 and 1.0 for how relevant each label is to the portion of text |
summary | The twenty topic labels from the results array with the highest relevancy score across the entire audio file. For example, if the Science>Environment label is detected only once in a 60-minute audio file, the summary key will show a low relevancy score for that label, because the transcription as a whole was not consistently about Science>Environment |
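The keys described above can be read with a few lines of Python. This is a minimal sketch, not the official client: the sample response is made up for illustration, the helper names and the min_relevance cutoff are hypothetical, and the code assumes that each label carries label and relevance keys, that timestamp carries start and end, and that summary maps label names to scores.

```python
# Made-up response fragment that follows the keys described above.
sample_response = {
    "iab_categories_result": {
        "status": "success",
        "results": [
            {
                "text": "First portion of the transcript.",
                "labels": [
                    {"relevance": 0.92, "label": "Science>Environment"},
                    {"relevance": 0.12, "label": "Example>LowRelevance"},
                ],
                "timestamp": {"start": 250, "end": 28920},
            },
            {
                "text": "Second portion of the transcript.",
                "labels": [{"relevance": 0.30, "label": "Example>Other"}],
                "timestamp": {"start": 29610, "end": 56850},
            },
        ],
        # Assumed shape: label name mapped to its overall relevance score.
        "summary": {
            "Science>Environment": 0.92,
            "Example>Other": 0.30,
            "Example>LowRelevance": 0.12,
        },
    }
}


def topics_by_segment(response, min_relevance=0.5):
    """Return (start_ms, end_ms, [labels]) for segments with relevant labels."""
    result = response.get("iab_categories_result") or {}
    if result.get("status") != "success":
        return []  # the model was unavailable, so there is nothing to read
    segments = []
    for item in result.get("results", []):
        # Keep labels at or above the cutoff, most relevant first.
        kept = sorted(
            (lab for lab in item.get("labels", []) if lab["relevance"] >= min_relevance),
            key=lambda lab: lab["relevance"],
            reverse=True,
        )
        if kept:
            span = item["timestamp"]
            segments.append((span["start"], span["end"], [lab["label"] for lab in kept]))
    return segments


def top_summary_labels(response, limit=5):
    """Return the highest-scoring (label, relevance) pairs from summary."""
    result = response.get("iab_categories_result") or {}
    summary = result.get("summary") or {}
    ranked = sorted(summary.items(), key=lambda pair: pair[1], reverse=True)
    return ranked[:limit]
```

With the sample above, topics_by_segment(sample_response) keeps only the first segment, because the second segment has no label at or above the 0.5 cutoff. Checking status first avoids reading results from a response where the model was unavailable.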
Troubleshooting
How does the Topic Detection model handle misspelled or unrecognized words?
The Topic Detection model uses natural language processing and machine learning to identify related words and phrases even when they are misspelled or unrecognized. Accuracy still depends on how severe the misspelling is and how obscure the word is.
Can the Topic Detection model identify topics that are not part of the IAB Taxonomy?
No. The Topic Detection model can only identify topics that are part of the IAB Taxonomy. The model is optimized for contextual targeting use cases, so using the predefined IAB categories ensures the most accurate results.
Why am I not getting any topic predictions for my audio file?
There are several possible reasons. The audio file may not contain enough relevant content for the model to analyze. Predictions can also be affected by background noise, low-quality audio, or the confidence threshold applied to topic detection. Review the model's configuration parameters and provide high-quality, relevant audio files for analysis.
Why am I getting inaccurate or irrelevant topic predictions for my audio file?
There are several possible reasons. The audio file may contain background noise or other irrelevant content that interferes with the model's analysis. Predictions can also be affected by low-quality audio, a low confidence threshold for topic detection, or insufficient training data. Review the model's configuration parameters, provide high-quality, relevant audio files for analysis, and consider adding training data to the model.
How can I optimize the performance of the Topic Detection model?
Provide high-quality, relevant audio files for analysis, review the model's configuration parameters (such as the confidence threshold for topic detection), and refer to the list of available IAB topics to guide the analysis. It may also help to add training data to the model or to contact AssemblyAI support for further assistance.