Newsletter

🚀 New Punctuation & Casing Model For Real-Time Transcription

We are excited to announce the release of our latest Punctuation and Truecasing model!

We recently released a significant improvement to our Punctuation and Truecasing model for asynchronous transcription. 

This week we updated our real-time transcription service with the new punctuation and truecasing model, which provides the following improvements:

  • Question marks are now correctly predicted in real-time streaming.
  • Significant improvements in handling casing for challenging linguistic forms such as mixed-case words (+39% F1 score), acronyms (+20% F1 score), and capital-case words (+11% F1 score).
  • An overall 17% relative improvement, on average across our test datasets, in upper-case letter classification.
  • Overall punctuation accuracy up by 11% (F1 score).
  • In our qualitative analysis, human evaluators preferred the new model over the previous one 61% of the time on average.

Try it out now in our new real-time playground.
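
If you'd rather hear the new model from code than from the playground, here is a minimal sketch using the JavaScript SDK covered below. The realtime.transcriber() factory, the 'transcript' event, and the 16 kHz sample rate shown here are illustrative assumptions based on the v4 SDK; check the SDK reference for the exact surface in your version.

import { AssemblyAI } from 'assemblyai'

const client = new AssemblyAI({
  apiKey: 'YOUR_API_KEY'
})

const run = async () => {
  // Open a real-time session; 16 kHz single-channel PCM audio is assumed here.
  const rt = client.realtime.transcriber({ sampleRate: 16_000 })

  rt.on('transcript', (transcript) => {
    // Final transcripts carry the improved punctuation and truecasing,
    // including question marks.
    if (transcript.message_type === 'FinalTranscript') {
      console.log(transcript.text)
    }
  })

  await rt.connect()
  // Stream audio chunks into the session, e.g. from a microphone or file stream:
  // audioStream.on('data', (chunk) => rt.sendAudio(chunk))
  await rt.close()
}

run()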

AssemblyAI Node/JavaScript SDK v4 Released

Version 4 of our JavaScript SDK is live! The new version works not only on Node.js but also in other JavaScript runtimes, including web browsers, Bun, Cloudflare Workers, and more.

We've revamped our sample applications so that integrations that were previously incompatible can now use the new SDK version.

import { AssemblyAI } from 'assemblyai'

// Create a client with your AssemblyAI API key
const client = new AssemblyAI({
  apiKey: 'YOUR_API_KEY'
})

const audioUrl =
  'https://storage.googleapis.com/aai-web-samples/5_common_sports_injuries.mp3'

const run = async () => {
  // Submit the audio file and wait for the completed transcript
  const transcript = await client.transcripts.transcribe({ audio: audioUrl })
  console.log(transcript.text)
}

run()
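
The same client can also turn a finished transcript into caption files. Here is a minimal sketch, assuming the SDK exposes a transcripts.subtitles() helper that returns the SRT contents as a string (check the SDK reference for the exact signature):

import { writeFileSync } from 'node:fs'
import { AssemblyAI } from 'assemblyai'

const client = new AssemblyAI({
  apiKey: 'YOUR_API_KEY'
})

const exportCaptions = async (transcriptId) => {
  // Assumed helper: fetch the transcript's captions in SRT format as a string
  const srt = await client.transcripts.subtitles(transcriptId, 'srt')
  writeFileSync('captions.srt', srt)
}

// Use the id of a completed transcript, e.g. the one created above
exportCaptions('YOUR_TRANSCRIPT_ID')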

Fresh From Our Blog

AI for Universal Audio Understanding: Qwen-Audio Explained: Researchers have recently made progress toward universal audio understanding, a step toward foundational audio models. Qwen-Audio's approach is based on joint audio-language pre-training that improves performance without task-specific fine-tuning. Read more>>

How to Create SRT Files for Videos in Python: Learn how to create SRT subtitle files for videos using Python in this easy-to-follow guide. Read more>>

Key phrase detection in audio files using Python: Learn how to identify key phrases and important words using Python and AssemblyAI. Read more>>

Convert Speech to Text In Java (Basic Tutorial): Learn how to transcribe an audio file in Java using AssemblyAI's speech-to-text Java SDK. 

Build AI App Prototypes Visually with No-Code (Open-source): Learn how to use Rivet to build a no-code AI app that transcribes a podcast episode and then answers your questions about it using LeMUR. 

Run LLMs locally - 5 Must-Know Frameworks!: Learn how to run LLMs locally using frameworks including Ollama, GPT4All, PrivateGPT, llama.cpp, and LangChain.