In Automatic Speech Recognition (ASR), Sentiment Analysis refers to detecting the sentiment of specific speech segments throughout an audio or video file, though it can also be applied to any body of text for textual analysis. Sentiment Analysis is sometimes referred to as Sentiment “Mining” because you are identifying and extracting (or mining) subjective information from your source material.
Sentiment Analysis is a well-studied field with interesting and useful applications across a diverse range of industries. In this post, we’ll look more closely at how Sentiment Analysis works, current models, use cases, the best APIs to use when performing Sentiment Analysis, and some of its current limitations.
How Does Sentiment Analysis Work?
In Sentiment Analysis, the goal is to take your audio or video file or transcript and produce one of three outputs: positive, negative, or neutral.
To achieve this, most models output a number between -1 and 1 with:
- -1 = negative
- 0 = neutral
- 1 = positive
This is also referred to as sentiment polarity. Your model can be set up to categorize these numbers either on a scale or by probability. On a scale, for example, an output of 0.6 would be classified as positive since it is closer to 1 than to 0 or -1. The probability approach instead uses multiclass classification to output a probability for each class: say the model is 25% sure the text is positive, 50% sure it is negative, and 25% sure it is neutral. The sentiment with the highest probability, in this case negative, would be your output.
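The two approaches described above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular model is implemented: the function names and the `ANCHORS` mapping are hypothetical, and the scale approach is read here as "pick whichever of -1, 0, or 1 is closest to the score."

```python
from typing import Dict

# Hypothetical anchor points on the polarity scale, taken from the list above.
ANCHORS: Dict[str, float] = {"negative": -1.0, "neutral": 0.0, "positive": 1.0}


def label_from_polarity(score: float) -> str:
    """Scale approach: return the label whose anchor is closest to the score."""
    if not -1.0 <= score <= 1.0:
        raise ValueError("polarity score must be between -1 and 1")
    # On an exact tie (e.g. 0.5), min() keeps the first label in ANCHORS order.
    return min(ANCHORS, key=lambda label: abs(score - ANCHORS[label]))


def label_from_probabilities(probabilities: Dict[str, float]) -> str:
    """Probability approach: return the class with the highest probability."""
    return max(probabilities, key=probabilities.get)


if __name__ == "__main__":
    print(label_from_polarity(0.6))
    print(label_from_probabilities(
        {"positive": 0.25, "negative": 0.50, "neutral": 0.25}
    ))
```

With the numbers used in the text, the first call classifies 0.6 as positive and the second picks negative, since 50% is the highest of the three probabilities.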
Sentiment Analysis Models
Sentiment Analysis is a very active area of study in the field of Natural Language Processing (NLP), with recent advances made possible through cutting-edge Machine Learning and Deep Learning research. Sentiment Analysis is mainly accomplished by fine-tuning transformers, since this method has been proven to deal well with sequential data like text and speech, and scales extremely well to parallel processing hardware like GPUs.
There are also strong open source datasets and benchmarks to work with as you fine-tune. Reviews from sites such as Amazon, IMDb (for movies), and Yelp, along with posts from Twitter, all make excellent training data since sentiments are usually strong and lean more toward one side of our positive-negative scale.
Best APIs for Sentiment Analysis
Looking to perform Sentiment Analysis on a piece of pre-written text or an audio or video file? Here are the top Sentiment Analysis tools and APIs to consider (note that these APIs support either Sentiment Analysis on pre-written texts or audio streams, or both):
1. Twinword Sentiment Analysis API
Twinword’s Sentiment Analysis API is a great option for simple textual analysis. The API’s basic package is free for up to 500 words per month, with paid plans ranging from $19 to $250 per month depending on usage.
The API applies scores and ratios to mark a text as positive, negative, or neutral. Ratios are determined by comparing the overall scores of negative sentiments to positive sentiments and are applied on a -1 to 1 scale.
In addition to Sentiment Analysis, Twinword also offers other forms of textual analysis such as Emotion Analysis, Text Similarity, and Word Associations.
2. AssemblyAI’s Sentiment Analysis API
Released in November 2021, AssemblyAI’s Sentiment Analysis API has high accuracy for those looking to perform Sentiment Analysis on audio or video streams, and is more affordable than many other Sentiment Analysis APIs on the market today. As discussed above, its Sentiment Analysis model leverages sentiment polarity to determine the probability that speech segments are positive, negative, or neutral.
3. Watson Natural Language Understanding
IBM Watson’s Natural Language Understanding API performs Sentiment Analysis on static texts, along with more nuanced analysis such as detecting emotions, relations, and semantic roles.
However, keep in mind that the technology used to accurately identify these emotional complexities is still in its infancy, so use these more advanced features with caution.
The pure Sentiment Analysis API assigns both a magnitude and a score to the sentiments it detects for entities or keywords, helping users better understand chosen texts.
4. Amazon Comprehend for AWS Transcribe
As an add-on feature to AWS Transcribe, Amazon Comprehend rates text sentiments found in audio streams as positive, negative, or neutral. In addition, Amazon Comprehend can assign “mixed” to a text if the sentiments extracted from it aren’t clear or flip back and forth.
When Amazon Comprehend is enabled, transcripts will display a probability score for each of the sentiments described above, as well as the overall ascribed sentiment for each text segment.
Be aware that in order to use Amazon Comprehend, you’ll need to host your transcription files in Amazon S3 cloud storage.
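To make the per-sentiment probability scores more concrete, here is a minimal sketch of reading one result. It assumes a response shaped like the one returned by Comprehend's DetectSentiment operation (a `Sentiment` label plus a `SentimentScore` map); the helper name and the sample values are made up for illustration, and no AWS call is made.

```python
from typing import Any, Dict, Tuple


def summarize_sentiment(response: Dict[str, Any]) -> Tuple[str, float]:
    """Return the overall sentiment label and the score reported for it."""
    label = response["Sentiment"]        # e.g. "POSITIVE", "MIXED"
    scores = response["SentimentScore"]  # keys: Positive, Negative, Neutral, Mixed
    # Score keys are capitalized ("Mixed") while the label is upper case ("MIXED").
    return label, scores[label.capitalize()]


# Illustrative sample only: these numbers are invented, not real output.
sample_response = {
    "Sentiment": "MIXED",
    "SentimentScore": {
        "Positive": 0.20,
        "Negative": 0.20,
        "Neutral": 0.05,
        "Mixed": 0.55,
    },
}

if __name__ == "__main__":
    label, score = summarize_sentiment(sample_response)
    print(f"{label} ({score:.0%} confidence)")
```

Keeping the full score map around, rather than only the top label, lets you spot segments where two sentiments are nearly tied.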
5. Google Cloud Natural Language API for Google Speech-to-Text
Google also has a Sentiment Analysis API called Google Cloud Natural Language API that works similarly to Amazon Comprehend.
With its analyzeSentiment feature, you’ll receive a sentiment of positive, neutral, or negative for each speech segment in a transcription text. Each text segment will also be assigned a magnitude score that indicates how much emotional content was present for analysis.
Using Google Speech-to-Text and Cloud Natural Language can be quite expensive, but it’s a good option if you’re already familiar with Google’s NLP offerings.
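One way to combine a segment's sentiment with its magnitude is sketched below. It assumes the API reports the sentiment as a numeric score between -1 and 1 alongside a non-negative magnitude; the `read_segment` helper and both cutoff values are hypothetical and would need tuning for your own data.

```python
from dataclasses import dataclass


@dataclass
class SegmentReading:
    label: str                # "positive", "neutral", or "negative"
    has_strong_emotion: bool  # True when the magnitude clears the cutoff


def read_segment(
    score: float,
    magnitude: float,
    score_cutoff: float = 0.25,     # hypothetical threshold, not an API default
    magnitude_cutoff: float = 1.0,  # hypothetical threshold, not an API default
) -> SegmentReading:
    """Turn a (score, magnitude) pair into a label plus an emotion-strength flag."""
    if score >= score_cutoff:
        label = "positive"
    elif score <= -score_cutoff:
        label = "negative"
    else:
        label = "neutral"
    return SegmentReading(label, magnitude >= magnitude_cutoff)


if __name__ == "__main__":
    print(read_segment(score=0.8, magnitude=3.0))
    print(read_segment(score=0.0, magnitude=0.1))
```

Reading the two numbers together helps separate a segment that is neutral because little emotion was expressed from one where the score sits near zero despite a lot of emotional content.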
Sentiment Analysis Tutorial
Want to learn how to perform Sentiment Analysis yourself? This video tutorial walks you through applying Sentiment Analysis to mock earnings calls.
Applications and Use Cases
What is Sentiment Analysis used for? A lot! Telephony companies use Sentiment Analysis to extract the sentiments of customer-agent conversations via cloud-based contact centers. Then, they can track customer feelings and feedback toward particular products, events, or even agents, aiding customer service. They can also use it to analyze agent behavior.
Virtual meeting platforms use Sentiment Analysis to determine participant sentiments by portion of meeting, meeting topic, meeting time, etc. This can be a powerful analytic tool that helps companies make better informed decisions to improve products, customer relations, agent training, and more.
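As a rough illustration of breaking sentiment down by portion of a meeting, the sketch below groups labeled speech segments into fixed time windows and counts the labels in each. The segment format (a `start` time in seconds and a `sentiment` label) and the function name are assumptions made for this example, not the output of any specific API.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Union

Segment = Dict[str, Union[float, str]]


def sentiment_by_window(
    segments: List[Segment], window_seconds: int = 300
) -> Dict[int, Dict[str, int]]:
    """Count sentiment labels per time window (window 0 = first 5 minutes by default)."""
    buckets: Dict[int, Counter] = defaultdict(Counter)
    for segment in segments:
        window_index = int(segment["start"] // window_seconds)
        buckets[window_index][segment["sentiment"]] += 1
    # Return plain dicts, ordered by window index.
    return {index: dict(counts) for index, counts in sorted(buckets.items())}


# Made-up segments for illustration.
example_segments = [
    {"start": 10.0, "sentiment": "positive"},
    {"start": 200.0, "sentiment": "negative"},
    {"start": 320.0, "sentiment": "neutral"},
    {"start": 650.0, "sentiment": "positive"},
]

if __name__ == "__main__":
    for window, counts in sentiment_by_window(example_segments).items():
        print(window, counts)
```

The same grouping idea works for other keys, such as meeting topic or speaker, by swapping the window index for whatever field you want to group on.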
Sentiment Analysis also has its limitations. As you can see in the examples above, most Sentiment Analysis APIs can only accurately ascribe three attributes: positive, negative, or neutral. As we know, human sentiments are much more nuanced than this black-and-white output.
Another limitation is in open source datasets. While there is an abundance of datasets available to train Sentiment Analysis models, the majority of them are text, not audio. Because of this, some of the connotations that may have been implied in an audio stream are often lost. For example, someone could say the same phrase “Let’s go to the grocery store” enthusiastically, neutrally, or begrudgingly, depending on the situation.