Boosted by our recent $30M Series B announcement, product velocity at AssemblyAI has been accelerating faster than ever before. Now, we’re thrilled to announce a slew of model updates and new services on the horizon for fall 2022.
AssemblyAI’s API platform already helps product teams and developers integrate state-of-the-art, production-ready AI models, such as Asynchronous and Real-time audio/video transcription, Content Moderation, Topic Detection, Entity Detection, and Sentiment Analysis, into their products.
As an AI company, we’re always looking for new ways to improve, expand, and iterate on our current offerings. That’s why our research, engineering, and product teams have been hard at work optimizing our models and designing and building new services we think will be of most value to our customers.
- AutoTune
- Premier Support
- Summarization Model
- Improvements to Real-time Transcription
- Support for Additional Languages
Let’s explore each further.
AutoTune
AutoTune is a no-code way to automatically unlock next-level transcription accuracy. When enabled for your account, AutoTune automatically identifies any errors our models make with your data. Our team of AI researchers and engineers can then optimize our models to reduce these errors going forward. The result: improved accuracy on the things that matter most to your use case.
With AutoTune, we will be able to better support enterprises across a wide variety of use cases, including:
- Boosting Conversation Intelligence through more precise keyword identification and subsequent downstream task triggers
- Improving competitor name recognition for better media monitoring
- Increasing proper noun identification accuracy for Brand Safety and Topic Detection
Premier Support
We will also be rolling out our new Premier Support this fall. With Premier Support, you’ll partner with a dedicated AssemblyAI Support Engineer and Technical Account Manager to help your product team launch AI-powered tools and features faster.
Customers who opt in to this end-to-end support service will gain early access to our latest AI models, 1:1 implementation, proactive and personalized health checks, 24/7 support via Slack, customized training sessions, and more.
Summarization Model
With our new Summarization model, product teams and developers will be able to build tools that automatically summarize phone calls, podcasts, virtual interviews, virtual meetings, and other audio or video files processed with our Speech-to-Text API. These tools can then help end users process every interaction more easily and quickly, helping them gain intelligent, actionable insights from the data.
Our Summarization model is powered by advanced AI research for the highest accuracy and utility across a wide range of use cases, including:
- Automatically summarizing and sharing key parts of virtual meetings
- Speeding QA review, identifying key sections of calls, and flagging call sections for further follow-up
- Auto-summarizing large analytical and legal documents
- Generating content summaries for podcasts and YouTube videos
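Once the model ships, enabling it should be a matter of adding a flag to the usual transcript request. The sketch below builds such a request payload locally; the `summarization` and `summary_type` parameter names are assumptions for illustration, since the final API surface hadn’t been published at the time of this announcement.

```python
import json

# Core Transcription endpoint (from the existing Speech-to-Text API)
API_ENDPOINT = "https://api.assemblyai.com/v2/transcript"


def build_summarization_request(audio_url, summary_type="bullets"):
    """Build the JSON payload for a transcription job with summarization
    enabled. The parameter names below are illustrative assumptions and
    may differ once the Summarization model is released."""
    return {
        "audio_url": audio_url,        # publicly accessible audio/video file
        "summarization": True,         # hypothetical flag to enable the model
        "summary_type": summary_type,  # hypothetical: e.g. "bullets" or "paragraph"
    }


payload = build_summarization_request("https://example.com/meeting.mp3")
print(json.dumps(payload, indent=2))
```

In practice you would POST this payload to `API_ENDPOINT` with your API key in the `Authorization` header, then poll the returned transcript ID for the finished summary.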
Improvements to Real-time Transcription
In addition to shipping the new products and services described above, our AI research and engineering teams have invested significant time in upgrading our Real-time Transcription model. Our previous v8 transcription model release improved accuracy by 18.72%, pushing AssemblyAI’s accuracy ahead of other providers in the space.
Now, our latest updates to the model will push model accuracy to its highest yet, helping to solidify our position as industry leaders for AI and speech transcription. The updates will also include improvements to proper noun recognition accuracy. We’ll provide a further announcement with accuracy and improvement metrics when the update ships later this year.
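Real-time Transcription is consumed over a WebSocket connection, so the model upgrade requires no client changes. As a reminder of the shape of that connection, here is a minimal sketch that constructs the streaming URL; the base endpoint reflects the current Real-time API, while the temporary-token query parameter is shown as an assumption (the API also accepts your API key in an `Authorization` header).

```python
from urllib.parse import urlencode

# Real-time Transcription WebSocket endpoint
REALTIME_BASE = "wss://api.assemblyai.com/v2/realtime/ws"


def realtime_url(sample_rate=16000, token=None):
    """Construct the WebSocket URL for a Real-time Transcription session.

    sample_rate: sample rate (Hz) of the raw audio you will stream.
    token: optional temporary session token (shown as an assumption).
    """
    params = {"sample_rate": sample_rate}
    if token:
        params["token"] = token
    return f"{REALTIME_BASE}?{urlencode(params)}"


print(realtime_url())
```

A client would open a WebSocket to this URL, stream base64-encoded audio chunks, and receive partial and final transcripts as JSON messages.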
Additional Language Support
AssemblyAI’s Core Transcription and Audio Intelligence APIs will be available in even more languages by the end of 2022, with additional language support coming in early 2023. Our product team has already shipped language support for our Core Transcription API for many highly requested languages, including Spanish, German, French, Italian, Dutch, and Portuguese. Support for Norwegian, Swedish, and Danish, and potentially others, will be added this fall.
Automatic Language Detection, released earlier this year, can automatically identify the dominant language spoken in an audio or video file and provide a transcription in that language, provided certain conditions are met, such as at least 50 seconds of talk time in that language.
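In a transcript request, these two options are mutually exclusive: you either pin a language explicitly or let the API detect it. The sketch below builds both payload variants; `language_code` and `language_detection` follow the parameter names of the released Core Transcription and Automatic Language Detection features, but treat them as assumptions and check the API docs for your account.

```python
def build_transcript_request(audio_url, language_code=None, detect_language=False):
    """Build a Core Transcription payload with language options.

    language_code: explicit language, e.g. "es" (Spanish) or "de" (German),
                   matching the supported languages listed above.
    detect_language: if True, ask the API to detect the dominant language
                     instead of pinning one.
    """
    payload = {"audio_url": audio_url}
    if detect_language:
        payload["language_detection"] = True   # API picks the dominant language
    elif language_code:
        payload["language_code"] = language_code
    return payload


# Pin Spanish explicitly:
spanish = build_transcript_request("https://example.com/call.mp3", language_code="es")

# Or let Automatic Language Detection decide:
detected = build_transcript_request("https://example.com/call.mp3", detect_language=True)
```

Either payload is POSTed to the same `/v2/transcript` endpoint as a regular English transcription job.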
We're excited to see what you think, and as always, thank you so much for your continued support — let’s go!