Newsletter

🚀 New Punctuation & Casing Model For Real-Time Transcription

We are excited to announce the release of our latest Punctuation and Truecasing model!

We recently released a significant improvement to our Punctuation and Truecasing model for asynchronous transcription. 

This week we updated our real-time transcription service with the new punctuation and truecasing model, which provides the following improvements:

  • Question marks are now correctly predicted in real-time streaming.
  • Significant improvements in handling casing for challenging linguistic forms such as mixed-case words (+39% F1 score), acronyms (+20% F1 score), and capital-case words (+11% F1 score).
  • An overall 17% relative improvement, on average across our test datasets, in upper-case letter classification.
  • Overall punctuation accuracy up by 11% (F1 score).
  • In our qualitative analysis, human evaluators preferred the new model over the previous one 61% of the time on average.

Try it out now in our new real-time playground.
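
If you'd rather hear the new model from code than from the playground, here is a minimal sketch using the JavaScript SDK covered below. The realtime.transcriber() factory, the 'transcript' event, and the 16 kHz sample rate shown here are illustrative assumptions based on the v4 SDK; check the SDK reference for the exact surface in your version.

import { AssemblyAI } from 'assemblyai'

const client = new AssemblyAI({
  apiKey: 'YOUR_API_KEY'
})

const run = async () => {
  // Open a real-time session; 16 kHz single-channel PCM audio is assumed here.
  const rt = client.realtime.transcriber({ sampleRate: 16_000 })

  rt.on('transcript', (transcript) => {
    // Final transcripts carry the improved punctuation and truecasing,
    // including question marks.
    if (transcript.message_type === 'FinalTranscript') {
      console.log(transcript.text)
    }
  })

  await rt.connect()
  // Stream audio chunks into the session, e.g. from a microphone or file stream:
  // audioStream.on('data', (chunk) => rt.sendAudio(chunk))
  await rt.close()
}

run()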

AssemblyAI Node/JavaScript SDK v4 Released

Version 4 of our JavaScript SDK is live! The new version works not only on Node.js but also in other JavaScript runtimes, including web browsers, Bun, Cloudflare Workers, and more.

We've revamped our sample applications so that integrations that were previously incompatible can now use the new SDK version.

import { AssemblyAI } from 'assemblyai'

// Create a client with your AssemblyAI API key
const client = new AssemblyAI({
  apiKey: 'YOUR_API_KEY'
})

const audioUrl =
  'https://storage.googleapis.com/aai-web-samples/5_common_sports_injuries.mp3'

const run = async () => {
  // Submit the audio file and wait for the completed transcript
  const transcript = await client.transcripts.transcribe({ audio: audioUrl })
  console.log(transcript.text)
}

run()
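
The same client can also turn a finished transcript into caption files. Here is a minimal sketch, assuming the SDK exposes a transcripts.subtitles() helper that returns the SRT contents as a string (check the SDK reference for the exact signature):

import { writeFileSync } from 'node:fs'
import { AssemblyAI } from 'assemblyai'

const client = new AssemblyAI({
  apiKey: 'YOUR_API_KEY'
})

const exportCaptions = async (transcriptId) => {
  // Assumed helper: fetch the transcript's captions in SRT format as a string
  const srt = await client.transcripts.subtitles(transcriptId, 'srt')
  writeFileSync('captions.srt', srt)
}

// Use the id of a completed transcript, e.g. the one created above
exportCaptions('YOUR_TRANSCRIPT_ID')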

Fresh From Our Blog

AI for Universal Audio Understanding: Qwen-Audio Explained: Researchers have recently made progress toward universal audio understanding, a step toward foundational audio models. Qwen-Audio's approach is based on joint audio-language pre-training that improves performance without task-specific fine-tuning. Read more>>

How to Create SRT Files for Videos in Python: Learn how to create SRT subtitle files for videos using Python in this easy-to-follow guide. Read more>>

Key phrase detection in audio files using Python: Learn how to identify key phrases and important words using Python and AssemblyAI. Read more>>

Convert Speech to Text In Java (Basic Tutorial): Learn how to transcribe an audio file in Java using AssemblyAI's speech-to-text Java SDK. 

Build AI App Prototypes Visually with No-Code (Open-source): Learn how to use Rivet to build a no-code AI app that transcribes a podcast episode and then answers your questions about it using LeMUR. 

Run LLMs locally - 5 Must-Know Frameworks!: Learn how to run LLMs locally using frameworks including Ollama, GPT4All, PrivateGPT, llama.cpp, and LangChain.