AssemblyAI offers a cutting-edge Sentiment Analysis model that can detect the sentiment of each sentence spoken in audio files. When you enable it, you can get a detailed analysis of the positive, negative, or neutral sentiment conveyed in the audio, along with a confidence score for each result.
You can also learn the content on this page from Sentiment Classification for Audio Files in Python on AssemblyAI's YouTube channel.
Understanding the response
The JSON object above contains all information about the transcription. Depending on which Models are used to analyze the audio, the attributes of this object will vary. For example, in the quickstart above we did not enable Summarization, which is reflected by the summarization: false key-value pair in the JSON above. Had we activated Summarization, the summary_model key values would instead contain the file summary (and additional details).
To access the Sentiment Analysis information, we use the attributes described in the reference table below. The table lists all relevant attributes along with their descriptions, where we've called the JSON response object results. Object attributes are accessed via dot notation, and arbitrary array elements are denoted with [i]. For example, results.words[i].text refers to the text attribute of the i-th element of the words array in the JSON results object.
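The dot notation described here maps onto ordinary dictionary and list indexing once the JSON response has been parsed. The following is a minimal sketch in Python; the response fragment is hypothetical and only its shape matters, and the helper name word_text is introduced purely for illustration.

```python
# Hypothetical, hand-written fragment of a parsed JSON response.
# Only the "words" array and the "summarization" flag from the text are shown.
results = {
    "summarization": False,
    "words": [
        {"text": "Hello"},
        {"text": "world"},
    ],
}


def word_text(results: dict, i: int) -> str:
    """Return results.words[i].text using dict and list indexing."""
    return results["words"][i]["text"]


# results.words[0].text in the document's notation
first_word = word_text(results, 0)

# summarization: false in the JSON becomes False in Python
summarization_enabled = results["summarization"]
```

In a real script, results would typically come from parsing the API response body (for example with json.loads) rather than from a literal.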
|Type|Description|
|---|---|
|boolean|Whether Sentiment Analysis was enabled in the transcription request|
|array|A temporal sequence of Sentiment Analysis results for the audio file, one element for each sentence in the file|
|string|The transcript of the i-th sentence|
|number|The starting time, in milliseconds, of the i-th sentence|
|number|The ending time, in milliseconds, of the i-th sentence|
|string|The detected sentiment for the i-th sentence, one of positive, negative, or neutral|
|number|The confidence score for the detected sentiment of the i-th sentence, from 0 to 1|
|string or null|The speaker of the i-th sentence if Speaker Diarization is enabled, else null|
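As a rough illustration of how the attributes in this table might be consumed, the sketch below walks over a per-sentence result array, converts the millisecond timestamps to seconds, and tallies the sentiment labels. The key names (sentiment_analysis, sentiment_analysis_results, text, start, end, sentiment, confidence, speaker) and the uppercase label spellings are assumptions, since the table does not show them, and the sample data is invented. Check them against an actual response before relying on this.

```python
from collections import Counter

# Invented sample shaped like the table above. Key names and label
# spellings are ASSUMPTIONS, not confirmed by this page.
results = {
    "sentiment_analysis": True,
    "sentiment_analysis_results": [
        {"text": "I love this product.", "start": 250, "end": 1900,
         "sentiment": "POSITIVE", "confidence": 0.97, "speaker": None},
        {"text": "The delivery was late.", "start": 2100, "end": 3800,
         "sentiment": "NEGATIVE", "confidence": 0.88, "speaker": None},
        {"text": "It arrived on Tuesday.", "start": 4000, "end": 5500,
         "sentiment": "NEUTRAL", "confidence": 0.61, "speaker": None},
    ],
}


def format_sentence(item: dict) -> str:
    """Render one sentence result, converting milliseconds to seconds."""
    start_s = item["start"] / 1000
    end_s = item["end"] / 1000
    # speaker is null (None) unless Speaker Diarization is enabled
    speaker = item["speaker"] if item["speaker"] is not None else "unknown"
    return (f"[{start_s:.2f}s-{end_s:.2f}s] speaker={speaker} "
            f"{item['sentiment']} ({item['confidence']:.2f}): {item['text']}")


def sentiment_counts(results: dict) -> Counter:
    """Count sentences per label; empty if the feature was not enabled."""
    if not results.get("sentiment_analysis"):
        return Counter()
    return Counter(r["sentiment"] for r in results["sentiment_analysis_results"])


lines = [format_sentence(r) for r in results["sentiment_analysis_results"]]
counts = sentiment_counts(results)
```

Checking the boolean flag first avoids indexing into a result array that may be absent or null when Sentiment Analysis was not requested.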
The Sentiment Analysis model bases its results on an interpretation of the transcript, so it may not always capture the speaker's intended sentiment. Take the context of the transcript into account and, when possible, validate the results with human judgment.
Make sure that the audio being analyzed is relevant to your use case, and evaluate the confidence score for each sentiment label in the context of the transcript.
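One way to act on the confidence score is to separate results the model is confident about from those that deserve a human look. The sketch below is only an illustration: the 0.8 threshold is an arbitrary choice, the function name split_by_confidence is hypothetical, and the key names (text, sentiment, confidence) and sample sentences are assumptions.

```python
def split_by_confidence(sentences, threshold=0.8):
    """Split sentence results into (confident, needs_review) lists.

    The 0.8 default is an arbitrary example value, not a recommendation
    from the documentation; tune it for your own use case.
    """
    confident, needs_review = [], []
    for sentence in sentences:
        if sentence["confidence"] >= threshold:
            confident.append(sentence)
        else:
            needs_review.append(sentence)
    return confident, needs_review


# Invented sample data; key names are assumptions.
sample = [
    {"text": "Great service.", "sentiment": "POSITIVE", "confidence": 0.95},
    {"text": "Yeah, right.", "sentiment": "POSITIVE", "confidence": 0.52},
]

confident, needs_review = split_by_confidence(sample)
```

Here the second sentence falls below the threshold and would be routed to manual review, which is where surrounding context matters most.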
The Sentiment Analysis model is designed to be fast and efficient, but processing times may vary depending on the size of the audio file and the complexity of the language used. If you experience longer processing times than expected, don't hesitate to contact our support team.