Aloware, a contact center software-as-a-service (SaaS) provider, specializes in helping companies worldwide turn more leads into deals. Aloware also supports company compliance and efficiency initiatives, and has powered over 20 million calls, sent over 30 million SMS/MMS messages, and reached over 15 million contacts.
Facilitating customer calls and messages was a great starting point for Aloware. But these calls and messages generated gigabytes of unstructured data that was sitting unused. Aloware’s product team wondered if AI models could help.
Thanks to recent advances in Machine Learning and Deep Learning, AI models are more accurate today than ever before. The most sophisticated models serve as the brains behind today’s most impressive tech, such as self-driving cars, automated fraud detection, personalized recommendations, and more. The principles behind these models are also being integrated into technology such as speech recognition and automated text analysis.
With this in mind, Aloware began searching for the right AI tech for their use case. Their product team knew automated, accurate transcription was a good fit for their product roadmap, but wondered what else they could do with all this transcription data. The product team also had a short timeframe to deployment, so they needed a single API that could meet all of their requirements.
Aloware’s search led them to AssemblyAI. AssemblyAI’s Speech-to-Text and Audio Intelligence APIs help product teams apply the latest AI models to build new features that deliver winning outcomes for their customers across performance, productivity, and efficiency.
Deploying AI in Just 6 Weeks
Leveraging AssemblyAI’s AI models, Aloware was able to ship Smart Transcription for its customers in just 6 weeks. Now, most QA tasks on Aloware’s platform are automated, helping their customers conduct QA significantly faster.
With AssemblyAI, each call Aloware receives is transcribed automatically and at near human-level accuracy. In the past, automatic transcription for phone calls was available, but only at low accuracy: traditional speech transcription models could transcribe at only about 70% accuracy. They would also omit basic punctuation, casing, and formatting, which made the transcripts difficult to read.
Here’s a snippet of what an unformatted transcript would look like, using a transcribed interview as an example:
When i'm gone for a while but hes always supportive so that always takes a lot of stress off and lets me play and its a lot easier sure wow well back to your college and pro career i know you are a usc player and im sure that was an amazing team experience but a lot of college players dont go on to go pro even though theyre incredible players and the college level is very high its a shame that there isnt much more interest in college tennis but how do you talk about mindset shift from choosing tennis as a career versus like business or coding or something you know and then making that decision from usc to just go pro one of my goals when I was little was to always play professional i think maybe some people just want to go to college or get a scholarship and then end there but i knew i always wanted to continue my tennis
Thankfully, modern AI models, like AssemblyAI's, include Automatic Punctuation and Casing, Paragraph Detection, and Speaker Diarization, which result in transcripts being much more readable for end users.
Here’s what the same transcript looks like with the above models applied:
<Speaker A> When I'm gone for a while, but he's always supportive, so that always takes a lot of stress off and lets me play, and it's a lot easier. <Speaker B> Sure. Wow. Well, back to your college and pro career. I know you are a USC player and I'm sure that was an amazing team experience but a lot of college players don't go on to go pro, even though they're incredible players and the college level is very high. It's a shame that there isn't much more interest in college tennis. But how do you talk about mindset shift from choosing tennis as a career versus, like, business or coding or something, you know, and then making that decision from USC to just go pro? <Speaker A> One of my goals when I was little was to always play professional. I think maybe some people just want to go to college or get a scholarship and then end there, but I knew I always wanted to continue my tennis.
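Developers can enable these formatting models when submitting a transcription request. Here is a minimal sketch against AssemblyAI's v2 transcript endpoint; the API key and audio URL are placeholders, and the exact parameter set should be verified against the current API documentation:

```python
import json
import urllib.request

API_ENDPOINT = "https://api.assemblyai.com/v2/transcript"

def build_transcript_request(audio_url: str) -> dict:
    """Build the JSON body for a transcription request with
    punctuation, casing, and speaker diarization enabled."""
    return {
        "audio_url": audio_url,
        "punctuate": True,       # Automatic Punctuation
        "format_text": True,     # Casing and text formatting
        "speaker_labels": True,  # Speaker Diarization
    }

def submit_transcript(api_key: str, audio_url: str) -> dict:
    """POST the request to AssemblyAI and return the parsed JSON response,
    which includes the transcript id used for subsequent polling."""
    body = json.dumps(build_transcript_request(audio_url)).encode("utf-8")
    req = urllib.request.Request(
        API_ENDPOINT,
        data=body,
        headers={
            "authorization": api_key,
            "content-type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# Usage (requires a real API key and network access):
# response = submit_transcript("YOUR_API_KEY", "https://example.com/call.mp3")
# print(response["id"], response["status"])
```

Transcription is asynchronous: the initial response returns an `id`, and the client polls `GET /v2/transcript/{id}` until the `status` field reads `completed`.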
Aloware was able to apply these models to package a high-value smart transcription tool for its end users.
AssemblyAI also offers a suite of AI models that help product teams build high-ROI tools on top of audio data. For Aloware, these models, particularly Auto Chapters and Sentiment Analysis, helped the team fill key gaps in its offering to end users. This includes building new tools that help customers gain insights into customer sentiment, sales representative performance, and call analysis for improving customer experience and interactions.
Aloware also liked that these AI models came from a single provider, making the tools easier and faster to build.
“The accuracy was strong,” explains Nathan Webb, Product Manager at Aloware. But the “great documentation and unique models like Auto Chapters and Sentiment Analysis is what really won us over,” he continues.
Auto Chapters is a Text Summarization model that automatically surfaces key highlights and summaries from audio and video streams. The Auto Chapters model works by first segmenting the audio/video stream into logical, time-stamped chapters, or points where the topic of conversation naturally changes. The model then generates a short summary for each of these chapters. The result is similar to what YouTube displays beneath videos when automatic chapters are enabled.
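In practice, Auto Chapters is enabled with a single request parameter, and a completed transcript then carries a `chapters` list whose entries include a headline, a summary, and start/end timestamps in milliseconds. The helper below is an illustrative sketch for rendering those chapters as a YouTube-style index; the sample field names follow AssemblyAI's documented response shape:

```python
def build_auto_chapters_request(audio_url: str) -> dict:
    """Request body enabling the Auto Chapters model."""
    return {
        "audio_url": audio_url,
        "auto_chapters": True,
    }

def format_chapter(chapter: dict) -> str:
    """Render one chapter entry as 'MM:SS  headline'.
    AssemblyAI reports chapter start/end times in milliseconds."""
    start_s = chapter["start"] // 1000
    return f"{start_s // 60:02d}:{start_s % 60:02d}  {chapter['headline']}"

def format_chapters(transcript: dict) -> list[str]:
    """Build a chapter index from a completed transcript response."""
    return [format_chapter(ch) for ch in transcript.get("chapters", [])]

# Example with a mocked response fragment:
sample = {
    "chapters": [
        {"start": 0, "end": 124000, "headline": "Introductions",
         "summary": "The host welcomes the guest."},
        {"start": 125000, "end": 300000, "headline": "College and pro career",
         "summary": "The guest discusses going pro after USC."},
    ]
}
print(format_chapters(sample))
```

Each entry also carries a `gist` and full `summary`, so a QA reviewer can jump straight to the chapter that matters instead of listening to the whole call.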
For Aloware, the Auto Chapters model speeds up Quality Assurance (QA) by making call transcripts easier to digest and process.
Webb explains, “Auto Chapters is especially helpful to customers looking to quickly and intelligently perform QA on their recorded calls.”
Sentiment Analysis, another of AssemblyAI’s AI models, detects and labels positive, negative, and neutral sentiments in speech segments. Sentiment Analysis is useful for tracking customer opinions and attitudes across various locations, time zones, products, support agents, and more.
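Enabling the model adds a `sentiment_analysis_results` list to the completed transcript, where each segment carries its text, a `POSITIVE`/`NEGATIVE`/`NEUTRAL` label, a confidence score, and (when diarization is on) a speaker label. The sketch below shows one way a contact center might aggregate those labels per agent; the aggregation logic is illustrative, not part of the API:

```python
from collections import Counter

def build_sentiment_request(audio_url: str) -> dict:
    """Request body enabling Sentiment Analysis alongside diarization."""
    return {
        "audio_url": audio_url,
        "sentiment_analysis": True,
        "speaker_labels": True,
    }

def sentiment_breakdown(results: list[dict]) -> dict:
    """Tally sentiment labels per speaker from the
    sentiment_analysis_results list of a completed transcript."""
    tallies: dict = {}
    for segment in results:
        speaker = segment.get("speaker", "unknown")
        tallies.setdefault(speaker, Counter())[segment["sentiment"]] += 1
    return tallies

# Example with a mocked response fragment:
segments = [
    {"speaker": "A", "sentiment": "POSITIVE", "text": "Happy to help!"},
    {"speaker": "A", "sentiment": "NEUTRAL", "text": "One moment please."},
    {"speaker": "B", "sentiment": "NEGATIVE", "text": "This is frustrating."},
]
print(sentiment_breakdown(segments))
```

A breakdown like this is the building block for the kind of agent-performance and customer-experience reporting described above.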
Finally, Webb explains that AssemblyAI’s demonstrated commitment to continuous model and feature improvement through its AI research was a big factor in the decision to go with the startup’s services.
Infusing AI into Contact Centers
Aloware has been thrilled with the accurate transcription and AI features it can now offer customers with AssemblyAI’s state-of-the-art AI models.
In addition, working with AssemblyAI has gone smoothly, says Webb: “The ongoing support has been strong and AssemblyAI continues to act like real partners, not just vendors.”
Aloware’s results have been just as impressive. “AssemblyAI is the first true Machine Learning feature we have developed and provided to our customers,” explains Webb. “It saves our customers hours of call listening on lengthy calls. Moreover, the tool has opened a new world of unforeseen insights and performance tracking for call reviews. Customers consistently tell me that this is one of the coolest things that Aloware has ever built,” he continues.
AssemblyAI’s AI models have also helped Aloware win more customers with its new automated call QA feature.
What’s next for Aloware? “In the immediate future, our team is developing aggregated reporting for managers to quickly view agent call performance,” says Webb. “In the longer term, we want to use AssemblyAI to provide in-moment notifications for relevant poor call quality. There may be more exciting features on the horizon.”
Those looking to learn more about Aloware or sign up for its smart transcription service can do so here.