The end of 2022 is quickly approaching, and what a year it has been! As we get closer to 2023, we wanted to take a moment to look back and reflect on some of the highlights of the past year.
In 2022, we:
- Launched our v9 Core Transcription Model, with significant improvements over v8.
- Launched new Summarization models, including Summarization models trained for specific use cases.
- Released the AssemblyAI CLI to help developers test our AI models directly from their terminal.
- Announced our Enterprise launch, which includes additional features such as AutoTune, Premier Support, and advanced security measures like our SOC 2 Type 2 Compliance for 2022/2023.
- Released support for additional languages for our Core Transcription API, including Spanish, German, French, Italian, Dutch, and Portuguese. We also released our Automatic Language Detection feature in conjunction with this roll-out.
- Raised a $28M Series A led by Accel and $30M Series B led by Insight Partners, giving us a runway to accelerate our product roadmap, build out our AI infrastructure, and grow our AI research team.
- Launched the AssemblyAI playground, making it easier to test our AI models with just a few clicks.
- Accelerated our growth initiatives, growing our team from 20 to 62 in 2022 (P.S. We’re still hiring!).
- Announced our AssemblyAI Creators program, a community of creators in the AI space.
- Hosted our first AI Hackathon!
New Models and Features
V9 Core Transcription Model
We released our v9 Core Transcription Model, which demonstrates increased accuracy across the board compared to our previous model. v9 boasts an average 11% reduction in Word Error Rate (WER), a 15% improvement on proper nouns, and overall improved transcription quality.
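For context on these numbers, WER is the word-level edit distance (substitutions, insertions, and deletions) between a model's transcript and a human reference, divided by the number of reference words. Here is a minimal illustrative sketch of the computation - not our evaluation pipeline, just the standard textbook definition:

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word Error Rate: word-level Levenshtein distance / reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    # Dynamic-programming table for edit distance over words
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + cost) # substitution
    return d[len(ref)][len(hyp)] / len(ref)
```

A lower WER is better, so an 11% improvement means v9's error rate is 11% lower on average than v8's on the same benchmark audio.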
New Summarization Models
In October, we released new AI-powered Summarization models that achieve state-of-the-art results on conversational data. Then, in December, we added three new Summarization models: informative, conversational, and catchy. Together, these models help developers and product teams build products that accurately summarize audio and video content.
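As a rough sketch of how a developer could enable these models through the API - the endpoint and parameter names (`summarization`, `summary_model`, `summary_type`) reflect the v2 API as documented at the time and may change, so treat this as illustrative rather than definitive:

```python
import json
import urllib.request

API_BASE = "https://api.assemblyai.com/v2"

def build_summary_request(audio_url, summary_model="informative",
                          summary_type="bullets"):
    """Build the JSON body for a transcript request with summarization on.

    summary_model selects one of the three models ("informative",
    "conversational", or "catchy"); summary_type controls the output shape.
    """
    return {
        "audio_url": audio_url,
        "summarization": True,
        "summary_model": summary_model,
        "summary_type": summary_type,
    }

def submit(api_key, body):
    """POST the request to the /transcript endpoint (needs a valid API key)."""
    req = urllib.request.Request(
        f"{API_BASE}/transcript",
        data=json.dumps(body).encode(),
        headers={"authorization": api_key, "content-type": "application/json"},
    )
    return json.load(urllib.request.urlopen(req))
```

The response includes a transcript ID you poll until processing completes, at which point the summary appears alongside the transcript text.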
AssemblyAI CLI
We also released the new AssemblyAI CLI to help developers quickly test our latest AI models right from their terminal.
The CLI is easy to install and makes experimenting with AssemblyAI's catalog of models a seamless process.
AutoTune, Premier Support, and SOC 2 Type 2
In September, we released two new Enterprise offerings: AutoTune Early Access and Premier Support. AutoTune is a premium add-on service that combines our AI expertise with improved transcription accuracy for AssemblyAI customers. Premier Support gives Enterprises that build with AssemblyAI exclusive access to our team of AI specialists, product experts, and engineers to guide them through their journey with us.
We also announced that we obtained our SOC 2 Type 2 certification for 2022/2023 as part of our commitment to enhanced security.
Additional language support
Our product team shipped support for many highly requested languages, including Spanish, German, French, Italian, Dutch, and Portuguese. You can see the full list of languages currently supported by our Core Transcription API here.
We also released Automatic Language Detection, which identifies the dominant language spoken in an audio or video file and provides a transcription in that language.
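In API terms, the two features are complementary: you can pin a transcript to a known language, or let the service detect it. A hedged sketch of the request body - the parameter names `language_code` and `language_detection` follow the v2 docs as published at the time:

```python
def build_transcript_request(audio_url, language_code=None):
    """Build a v2 transcript request body.

    Pass an explicit language_code (e.g. "es" for Spanish, "de" for German)
    when the language is known. Otherwise, language_detection asks the API
    to identify the dominant spoken language automatically.
    """
    body = {"audio_url": audio_url}
    if language_code is not None:
        body["language_code"] = language_code
    else:
        body["language_detection"] = True
    return body
```

When detection is enabled, the identified language code is returned with the completed transcript, so downstream code can branch on it.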
AssemblyAI Playground
We also announced our new AssemblyAI playground! The playground lets you test our AI models with just a few clicks - simply drop in any YouTube video link, local audio file, or local video file to transcribe and analyze it in seconds.
New pricing calculator
Finally, we added a new pricing calculator to our website to help product teams gain a better understanding of expected costs associated with using our API.
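The calculator's core arithmetic is straightforward: projected audio volume times a per-hour rate, plus any add-on models. A toy version - every rate below is a hypothetical placeholder, so consult the pricing page for real numbers:

```python
def estimated_monthly_cost(audio_hours, rate_per_hour, addon_rates=None):
    """Estimate monthly spend: base transcription plus optional add-on
    models, all billed per audio hour.

    All rates passed in are hypothetical placeholders for illustration.
    """
    total_rate = rate_per_hour + sum((addon_rates or {}).values())
    return round(audio_hours * total_rate, 2)
```

For example, 100 hours at a hypothetical $0.65/hour base rate with a $0.03/hour add-on would come to $68.00 for the month.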
AI Research
2022 has been packed with incredible research. Most notably, physics-inspired Diffusion Models have made great strides, powering text-to-image models like DALL-E 2 and Imagen. Stable Diffusion, a publicly-available, high-performing open-source text-to-image model, was also released this year. In the months after its release, countless projects were built on top of Stable Diffusion, marking a wave of creative progress unparalleled in recent years. Stable Diffusion 2 was released just a couple of months later, constituting a seminal achievement as the world’s first open-source, open-data, open-weight text-to-image model.
Additionally, beyond Diffusion Models, we have seen other physics-inspired models being invented and developed, like Poisson Flow Generative Models which use principles from electrostatics to generate images.
Beyond image generation, there has been progress on the natural language front. One of our AI researchers, Wancong (Kevin) Zhang, published "Light-weight probing of unsupervised representations for Reinforcement Learning" together with a team of researchers from New York University and Facebook AI Research. Additionally, Whisper, a publicly-available speech-to-text model from OpenAI, was released this year.
We’re excited to see where these developments take us in the new year.
Funding and Recognition
In January, we announced our $28M Series A led by Accel, with participation from investors such as Y Combinator, John and Patrick Collison (Stripe), Nat Friedman (GitHub), and Daniel Gross (Pioneer). This was followed by our July announcement that we raised an additional $30M in our Series B round led by global software investor Insight Partners. Combined, this funding gives us the ability to accelerate our product roadmap, build out a robust AI infrastructure, and grow our AI research team.
We were also proud to be recognized as a G2 High Performer and Momentum Leader in the Voice Recognition Software category for Fall, Summer, and Spring 2022. To receive G2’s High Performer recognition, companies must maintain the highest customer satisfaction scores in their category. To receive G2’s Momentum Leader recognition, companies must rank in the top 25% of companies in the same category.
Our First AI Hackathon
In early December, we held our first AI hackathon! We had 1,785 registrants and 440 participants from over 80 countries. There were 151 projects submitted, with Superpaint taking home the $35K top prize!
We also helped sponsor several other successful hackathons, including HackHawks 2022 and DeltaHacks.
Content and Community
2022 saw huge growth for our YouTube channel. We started the year with just over 500 subscribers and grew to 22.8K by the end of the year, with 1.2 million views of our videos.
We also published over 110 pieces of content on the AssemblyAI blog across tutorials, Deep Learning, industry, announcements, and more.
Here are the top pieces of content across each channel for 2022:
Top YouTube Videos
How does DALLE-2 actually work?
Getting Started with Hugging Face in 15 Minutes | Transformers, Pipeline, Tokenizer, Models
Diffusion models explained in 4-difficulty levels
Top Blog Posts
Introduction to Diffusion Models for Machine Learning
How to Run Stable Diffusion Locally to Generate Images
How DALLE-2 Actually Works
The Top Free Speech-to-Text APIs, AI Models, and Open Source Engines
How to Run OpenAI’s Whisper Speech Recognition Model
AssemblyAI Creators Program
Finally, we introduced the AssemblyAI Creators Program, a community of like-minded creators in the AI space who work to grow together and give back to the developer community.
Thank you to everyone who joined us on this exciting journey in 2022 – we can’t wait to see what 2023 has in store!