Changelog
Follow along to see weekly accuracy and product improvements.
Introducing Universal-2
Last week we released Universal-2, our latest Speech-to-Text model. Universal-2 builds on our previous model, Universal-1, with significant improvements in "last mile" challenges critical to real-world use cases: proper nouns, formatting, and alphanumerics.

*Figure: Comparison of error rates for Universal-2 vs. Universal-1 across overall performance (Standard ASR) and four last-mile areas, each measured by the appropriate metric.*
Universal-2 is now the default model for English files sent to our `v2/transcript` endpoint for async processing. You can read more about Universal-2 in our announcement blog or research blog, or you can try it out now on our Playground.
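Since Universal-2 is now the default, no code changes are needed. As a minimal sketch (the API key and file path are placeholders), a standard async request through our Python SDK already runs on it:

```python
import assemblyai as aai

aai.settings.api_key = "YOUR-KEY-HERE"  # placeholder

# English files sent for async processing now run on Universal-2 by default
transcript = aai.Transcriber().transcribe("./meeting.mp3")  # hypothetical file
print(transcript.text)
```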
Claude Instant 1.2 removed from LeMUR
The following models were removed from LeMUR: `anthropic/claude-instant-1-2` and `basic` (legacy, equivalent to `anthropic/claude-instant-1-2`). Calls to either model will now return a `400` validation error.
These models were removed because Anthropic is sunsetting its legacy models in favor of newer models that are more performant, faster, and cheaper. We recommend that users of the removed models switch to Claude 3 Haiku (`anthropic/claude-3-haiku`).
French performance patch; bugfix
We recently observed a degradation in accuracy when transcribing French files through our API. We have since pushed a bugfix to restore performance to prior levels.
We've improved error messaging for greater clarity, both for our file download service and for `Invalid LLM response` errors from LeMUR.
We've released a fix to ensure that rate limit headers are returned for all LeMUR requests, not just those with `200` and `400` responses.
New and improved - AssemblyAI Q3 recap
Check out our quarterly wrap-up for a summary of the new features and integrations we launched this quarter, as well as improvements we made to existing models and functionality.
Claude 3 in LeMUR
We added support for Claude 3 in LeMUR, allowing users to prompt the following LLMs in relation to their transcripts:
- Claude 3.5 Sonnet
- Claude 3 Opus
- Claude 3 Sonnet
- Claude 3 Haiku
Check out our related blog post to learn more.
Automatic Language Detection
We made significant improvements to our Automatic Language Detection (ALD) model, adding support for 10 new languages for a total of 17, with best-in-class accuracy in 15 of those 17 languages. We also added a customizable confidence threshold for ALD.
Learn more about these improvements in our announcement post.
We released the AssemblyAI Ruby SDK and the AssemblyAI C# SDK, allowing Ruby and C# developers to easily add SpeechAI to their applications with AssemblyAI. The SDKs let developers use our asynchronous Speech-to-Text and Audio Intelligence models, as well as LeMUR through a simple interface.
Learn more in our Ruby SDK announcement post and our C# SDK announcement post.
This quarter, we shipped two new integrations:
Activepieces 🤝 AssemblyAI
The AssemblyAI integration for Activepieces allows no-code and low-code builders to incorporate AssemblyAI's powerful SpeechAI in Activepieces automations. Learn how to use AssemblyAI in Activepieces in our Docs.
Langflow 🤝 AssemblyAI
We've released the AssemblyAI integration for Langflow, allowing users to build with AssemblyAI in Langflow - a popular open-source, low-code app builder for RAG and multi-agent AI applications. Check out the Langflow docs to learn how to use AssemblyAI in Langflow.
Assembly Required
This quarter we launched Assembly Required - a series of candid conversations with AI founders sharing insights, learnings, and the highs and lows of building a company.
Click here to check out the first conversation in the series, between Edo Liberty, founder and CEO of Pinecone, and Dylan Fox, founder and CEO of AssemblyAI.
We released the AssemblyAI API Postman Collection, which provides a convenient way for Postman users to try our API, featuring endpoints for Speech-to-Text, Audio Intelligence, LeMUR, and Streaming. Like our API reference, the Postman collection provides example responses so you can quickly browse endpoint results.
Free offer improvements
This quarter, we improved our free offer with:
- $50 in free credits upon signing up
- Access to usage dashboard, billing rates, and concurrency limit information
- Transfer of unused free credits to account balance upon upgrading to Pay as you go
We released 36 new blogs this quarter, from tutorials to projects to technical deep dives. Here are some of the blogs we released this quarter:
- Build an AI-powered video conferencing app with Next.js and Stream
- Decoding Strategies: How LLMs Choose The Next Word
- Florence-2: How it works and how to use it
- Speaker diarization vs speaker recognition - what's the difference?
- Analyze Audio from Zoom Calls with AssemblyAI and Node.js
We also released 10 new YouTube videos, demonstrating how to build SpeechAI applications and more, including:
- Best AI Tools and Helpers Apps for Software Developers in 2024
- Build a Chatbot with Claude 3.5 Sonnet and Audio Data
- How to build an AI Voice Translator
- Real-Time Medical Transcription Analysis Using AI - Python Tutorial
We also made improvements to a range of other features, including:
- Timestamp accuracy, with 86% of timestamps now accurate to within 0.1s and 96% accurate to within 0.2s
- Enhancements to the AssemblyAI app for Zapier, supporting 5 new events. Check out our tutorial on generating subtitles with Zapier to see it in action.
- Various upgrades to our API, including improved error messaging and scaling improvements that reduce p90 latency
- Improvements to billing, now alerting users upon auto-refill failures
- Speaker Diarization improvements, especially greater robustness in distinguishing speakers with similar voices
- A range of new and improved Docs
And more!
We can't wait for you to see what we have in store to close out the year 🚀
Claude 1 & 2 sunset
Recently, Anthropic announced that they will be deprecating legacy LLM models that are usable via LeMUR. We will therefore be sunsetting these models in advance of Anthropic's end-of-life for them:
- Claude Instant 1.2 (“LeMUR Basic”) will be sunset on October 28th, 2024
- Claude 2.0 and 2.1 (“LeMUR Default”) will be sunset on February 6th, 2025
You will receive API errors rejecting your LeMUR requests if you attempt to use any of the above models after their sunset dates. Users who have used these models recently have been alerted via email with notice to select an alternative model to use via LeMUR.
We have a number of newer models to choose from, which are not only more performant but also ~50% more cost-effective than the legacy models.
- If you are using Claude Instant 1.2 (“LeMUR Basic”), we recommend switching to Claude 3 Haiku.
- If you are using Claude 2.0 (“LeMUR Default”) or Claude 2.1, we recommend switching to Claude 3.5 Sonnet.
Check out our docs to learn how to select which model you use via LeMUR.
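As a minimal sketch of the switch (the API key and file path are placeholders), a LeMUR request pinned to Claude 3 Haiku in our Python SDK looks roughly like this:

```python
import assemblyai as aai

aai.settings.api_key = "YOUR-KEY-HERE"  # placeholder

transcript = aai.Transcriber().transcribe("./call.mp3")  # hypothetical file

# Select a current model explicitly instead of a sunset legacy model
result = transcript.lemur.task(
    "Summarize the key points of this call.",
    final_model=aai.LemurModel.claude3_haiku,
)
print(result.response)
```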
Langflow 🤝 AssemblyAI
We've released the AssemblyAI integration for Langflow, allowing low-code builders to incorporate Speech AI into their workflows.
Langflow is a popular open-source, low-code app builder for RAG and multi-agent AI applications. Using Langflow, you can easily connect different components via drag and drop and build your AI flow. Check out the Langflow docs for AssemblyAI's integration here to learn more.

Speaker Labels bugfix
We've fixed an edge-case issue that would cause requests using Speaker Labels to fail for some files.
Activepieces 🤝 AssemblyAI
We've released the AssemblyAI integration for Activepieces, allowing no-code and low-code builders to incorporate Speech AI into their workflows.
Activepieces is an open-source, no-code automation platform that allows users to build workflows that connect various applications. Now, you can use AssemblyAI's powerful models to transcribe speech, analyze audio, and build generative features in Activepieces.
Read more about how you can use AssemblyAI in Activepieces in our Docs.

Language confidence threshold bugfix
We've fixed an edge case in which language fallback, when Automatic Language Detection (ALD) was used in conjunction with `language_confidence_threshold`, would sometimes execute transcriptions that violated the user-set `language_confidence_threshold`. Now such transcriptions will not execute, and instead return an error to the user.
Automatic Language Detection improvements
We've made improvements to our Automatic Language Detection (ALD) model, yielding increased accuracy, expanded language support, and customizable confidence thresholds.
In particular, we have added support for 10 new languages, including Chinese, Finnish, and Hindi, bringing the total to 17 languages in our Best tier. Additionally, we've achieved best-in-class accuracy in 15 of those 17 languages when benchmarked against four leading providers.
Finally, we've added a customizable confidence threshold for ALD, allowing you to set a minimum confidence threshold for the detected language and be alerted if this threshold is not satisfied.
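As a rough sketch of the new threshold (the 0.8 value and file path are illustrative assumptions, not recommendations), here is how it can be set through our Python SDK:

```python
import assemblyai as aai

aai.settings.api_key = "YOUR-KEY-HERE"  # placeholder

# Require at least 80% confidence in the detected language (illustrative value)
config = aai.TranscriptionConfig(
    language_detection=True,
    language_confidence_threshold=0.8,
)

transcript = aai.Transcriber().transcribe("./multilingual.mp3", config)

if transcript.status == aai.TranscriptStatus.error:
    # The transcription errors out instead of proceeding below the threshold
    print(transcript.error)
```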
Read more about these recent improvements in our announcement post.
Free Offer improvements
We've made a series of improvements to our Free Offer:
- All new and existing users will get $50 in free credits (equivalent to 135 hours of Best transcription, or 417 hours of Nano transcription)
- All unused free credits will be automatically transferred to a user's account balance upon upgrading to pay-as-you-go pricing
- Free Offer users will now see a tracker in their dashboard showing how many credits they have remaining
- Free Offer users will now have access to the usage dashboard, their billing rates, concurrency limit, and billing alerts
Learn more about our Free Offer on our Pricing page, and then check out our Quickstart in our Docs to get started.
Speaker Diarization improvements
We've made improvements to our Speaker Diarization model, especially its robustness in distinguishing between speakers with similar voices.
We've fixed an error in which the last word in a transcript was always attributed to the same speaker as the second-to-last word.
File upload improvements and more
We've made improvements to error handling for file uploads that fail. Now if there is an error, such as a file containing no audio, the following `422` error will be returned:
`Upload failed, please try again. If you continue to have issues please reach out to support@assemblyai.com`
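As a hedged sketch of how such a failure surfaces through our Python SDK (the file path is hypothetical, and we assume the SDK raises its `TranscriptError` for the failed upload):

```python
import assemblyai as aai

aai.settings.api_key = "YOUR-KEY-HERE"  # placeholder

try:
    # hypothetical file with no audio track
    transcript = aai.Transcriber().transcribe("./silent_clip.mp4")
    print(transcript.text)
except aai.TranscriptError as e:
    # e.g. "Upload failed, please try again. If you continue to have
    # issues please reach out to support@assemblyai.com"
    print(e)
```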
We've made scaling improvements that reduce p90 latency for some non-English languages when using the Best tier.
We've made improvements to notifications for auto-refill failures. Now, users will be alerted more rapidly when their automatic payments are unsuccessful.
New endpoints for LeMUR Claude 3
Last month, we announced support for Claude 3 in LeMUR. Today, we are adding support for two new endpoints - Question & Answer and Summary (in addition to the pre-existing Task endpoint) - for these newest models:
- Claude 3 Opus
- Claude 3.5 Sonnet
- Claude 3 Sonnet
- Claude 3 Haiku
Here's how you can use Claude 3.5 Sonnet to summarize a virtual meeting with LeMUR:
```python
import assemblyai as aai

aai.settings.api_key = "YOUR-KEY-HERE"

audio_url = "https://storage.googleapis.com/aai-web-samples/meeting.mp4"
transcript = aai.Transcriber().transcribe(audio_url)

result = transcript.lemur.summarize(
    final_model=aai.LemurModel.claude3_5_sonnet,
    context="A GitLab meeting to discuss logistics",
    answer_format="TLDR"
)

print(result.response)
```
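And as a brief sketch of the Question & Answer endpoint (the question text is illustrative), reusing the same transcript:

```python
# Ask questions about the same transcript via the Q&A endpoint
qa_results = transcript.lemur.question(
    [aai.LemurQuestion(question="What logistics were discussed?")],
    final_model=aai.LemurModel.claude3_5_sonnet
)

for qa in qa_results.response:
    print(qa.question, "->", qa.answer)
```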
Learn more about these specialized endpoints and how to use them in our Docs.
Enhanced AssemblyAI app for Zapier
We've launched our Zapier integration v2.0, which makes it easy to use our API in a no-code way. The enhanced app is more flexible, supports more Speech AI features, and integrates more closely into the Zap editor.
The `Transcribe` event (formerly `Get Transcript`) now supports all of the options available in our transcript API, making all of our Speech Recognition and Audio Intelligence features available to Zapier users, including asynchronous transcription. In addition, we've added 5 new events to the AssemblyAI app for Zapier:
- `Get Transcript`: Retrieve a transcript that you have previously created.
- `Get Transcript Subtitles`: Generate SRT or VTT subtitles for the transcript.
- `Get Transcript Paragraphs`: Retrieve the transcript segmented into paragraphs.
- `Get Transcript Sentences`: Retrieve the transcript segmented into sentences.
- `Get Transcript Redacted Audio Result`: Retrieve the result of the PII audio redaction model. The result contains the status and the URL to the redacted audio file.
Read more about how to use the new app in our Docs, or check out our tutorial to see how you can generate subtitles with Zapier and AssemblyAI.
LeMUR browser support
LeMUR can now be used from browsers, either via our JavaScript SDK or `fetch`.
LeMUR - Claude 3 support
Last week, we released Anthropic's Claude 3 model family into LeMUR, our LLM framework for speech:
- Claude 3.5 Sonnet
- Claude 3 Opus
- Claude 3 Sonnet
- Claude 3 Haiku
You can now easily apply any of these models to your audio data. Learn more about how to get started in our docs or try out the new models in a no-code way through our playground.
For more information, check out our blog post about the release.
```python
import assemblyai as aai

# Step 1: Transcribe an audio file
transcriber = aai.Transcriber()
transcript = transcriber.transcribe("./common_sports_injuries.mp3")

# Step 2: Define a prompt
prompt = "Provide a brief summary of the transcript."

# Step 3: Choose an LLM to use with LeMUR
result = transcript.lemur.task(
    prompt,
    final_model=aai.LemurModel.claude3_5_sonnet
)

print(result.response)
```
JavaScript SDK fix
We've fixed an issue which was causing the JavaScript SDK to surface the following error when using the SDK in the browser:
`Access to fetch at 'https://api.assemblyai.com/v2/transcript' from origin 'https://exampleurl.com' has been blocked by CORS policy: Request header field assemblyai-agent is not allowed by Access-Control-Allow-Headers in preflight response.`
Timestamps improvement; bugfixes
We've made significant improvements to the timestamp accuracy of our Speech-to-Text Best tier for English, Spanish, and German. 96% of timestamps are accurate within 200ms, and 86% of timestamps are now accurate within 100ms.
We've fixed a bug in which confidence scores of transcribed words for the Nano tier would sometimes fall outside the range [0, 1].
We've fixed a rare issue in which only one channel of a short dual-channel file would be transcribed when `disfluencies` was also enabled.
Streaming (formerly Real-time) improvements
We've made model improvements that significantly improve the accuracy of timestamps when using our Streaming Speech-to-Text service. Most timestamps are now accurate within 100 ms.
Our Streaming Speech-to-Text service will now return a new error, `Audio too small to be transcoded` (code `4034`), when a client submits an audio chunk that is too small to be transcoded (less than 10 ms).
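As a hedged sketch of where this error surfaces (the handler bodies and sample rate are illustrative), our Python SDK's streaming client reports it through the `on_error` callback:

```python
import assemblyai as aai

aai.settings.api_key = "YOUR-KEY-HERE"  # placeholder

def on_data(transcript: aai.RealtimeTranscript):
    print(transcript.text)

def on_error(error: aai.RealtimeError):
    # Chunks under ~10 ms now surface 'Audio too small to be transcoded' (code 4034)
    print("Streaming error:", error)

transcriber = aai.RealtimeTranscriber(
    sample_rate=16_000,
    on_data=on_data,
    on_error=on_error,
)
transcriber.connect()
# ... stream audio chunks, then:
transcriber.close()
```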