PII Redaction for Speech-to-Text Transcriptions (March 2020 Update)

AssemblyAI's research team has launched a new neural network update, with big improvements in accuracy and speed. Below are some of the highlights that we're excited to share with you!

Further Accuracy Improvements

With our most recent update, our asynchronous English model is consistently out-benchmarking all other APIs on accuracy. Below are our average accuracy percentages (measured using Word Error Rate, or WER), based on benchmark reports run this month against Google Cloud's video model, AWS Transcribe, and Microsoft Azure:

[Figure: WER Results - March]
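
For readers new to the metric, here is a minimal sketch of the standard WER definition: the word-level edit distance (substitutions, deletions, and insertions) divided by the number of words in the reference transcript. The function name, the lowercase-and-split normalization, and the conversion "accuracy = 1 - WER" are assumptions made for illustration; the benchmark reports may normalize text differently.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    # Assumed normalization: lowercase and split on whitespace.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


wer = word_error_rate("the quick brown fox", "the quick brown box")
print(wer)                # 0.25 (one substitution over four reference words)
print(100 * (1 - wer))    # 75.0, read as an accuracy percentage (assumed conversion)
```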

Interested in benchmarking? Compare our accuracy and price side-by-side with your current provider by submitting a few of your files here.

Improved Security: PII redaction

Dealing with sensitive personally identifiable information (PII)?

PII redaction automatically detects and removes sensitive numbers, like credit card and social security numbers, from the transcription text. These sensitive numbers are replaced with pound signs "#" in the transcript.
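
The sketch below only illustrates the output format described above, where digits are replaced with "#". It is a toy regex run locally, not the detection the API performs; the pattern (9 to 19 digits, optionally separated by spaces or hyphens) and the `mask_sensitive_numbers` name are assumptions made for illustration.

```python
import re

# Hypothetical pattern: a run of 9 to 19 digits, each optionally followed
# by a single space or hyphen. This is an illustrative stand-in and would
# also match other long numbers, such as phone numbers.
SENSITIVE_NUMBER = re.compile(r"\b(?:\d[ -]?){8,18}\d\b")


def mask_sensitive_numbers(text: str) -> str:
    """Replace every digit inside a matched number with '#', keeping separators."""
    return SENSITIVE_NUMBER.sub(lambda m: re.sub(r"\d", "#", m.group()), text)


print(mask_sensitive_numbers("My card number is 4111 1111 1111 1111 thanks"))
# My card number is #### #### #### #### thanks
print(mask_sensitive_numbers("My SSN is 123-45-6789."))
# My SSN is ###-##-####.
```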

For complete examples of how to turn on PII Redaction, take a look at the full API docs here.

Faster Transcription with Speed Boost

We power visual voicemail for a large number of telephony companies. In visual voicemail applications, turnaround time for transcriptions is key, so that customers can see their visual voicemail shortly after a call ends.

To improve on this, we've added the speed boost feature, which is built to transcribe audio files of 1 minute or less in seconds. With this feature turned on, your transcription will complete 25-50% faster than normal.

Below is a Python example of how to turn on speed boost. For snippets of code in other languages, take a look at the full API docs here.

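import requests

# Transcript endpoint of the AssemblyAI v2 API
endpoint = "https://api.assemblyai.com/v2/transcript"

# Request body: the audio file to transcribe, with speed boost turned on.
# Named "payload" so it does not shadow the standard-library json module.
payload = {
    "audio_url": "https://s3-us-west-2.amazonaws.com/blog.assemblyai.com/audio/8-7-2018-post/7510.mp3",
    "speed_boost": True
}

headers = {
    "authorization": "YOUR-API-TOKEN",  # replace with your own API token
    "content-type": "application/json"
}

# Submit the transcription job and print the API's JSON response
response = requests.post(endpoint, json=payload, headers=headers)

print(response.json())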
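
Because the request above is asynchronous, the printed response is not the finished transcript. A minimal polling sketch might look like the following. The helper name `wait_for_transcript`, the `status` field, and its `"completed"` and `"error"` values are assumptions here, as is fetching the transcript with a GET on the same endpoint plus the transcript id; check the API docs for the exact shapes.

```python
import time
from typing import Callable


def wait_for_transcript(
    fetch: Callable[[], dict],
    poll_interval: float = 1.0,
    timeout: float = 60.0,
) -> dict:
    """Call fetch() until it reports a finished transcript or the timeout passes.

    fetch is any zero-argument callable returning the transcript JSON as a dict.
    The "status" key and its terminal values are assumed, not confirmed here.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = fetch()
        if result.get("status") in ("completed", "error"):
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError("transcript did not finish before the timeout")
        time.sleep(poll_interval)


# Hypothetical usage with the request above (endpoint shape is an assumption):
#   transcript_id = response.json()["id"]
#   result = wait_for_transcript(
#       lambda: requests.get(f"{endpoint}/{transcript_id}", headers=headers).json()
#   )
```

Passing the fetch step in as a callable keeps the loop independent of the HTTP client, so it can be exercised with canned responses before pointing it at the live API.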


100% Uptime

Another month of 100% uptime across all our models. Subscribe to our status page to stay up to date!
