What to Know About Speech-to-Text Privacy

You’re testing out a new Speech-to-Text API or Speech Recognition software. You upload your audio or video files, select your requirements, receive your transcription, and move on to the next step in your project.

But what happens to the files you uploaded? Or the transcription itself? Unfortunately, you can’t just assume the API has your data security and privacy at top of mind.

If you’re an application developer, here are the questions about Speech-to-Text data privacy and security to explore before choosing the best Speech-to-Text API for your project:

Does the API or Software Keep a Copy of My Data?

Unfortunately, big tech companies have a long history of putting data uploaded via their APIs to their own use. This means you can’t assume that the API or software will automatically delete your raw audio or video files, or your transcription text, once you’re finished using it.

Read the fine print. Some APIs offer an “opt-out” setting that prevents the company from using your data to train its models, for its own research, for competitor products, and so on. Often, however, this is only a partial opt-out: the company may still keep metadata, or a copy of your files in the cloud, such as Google Cloud or another provider’s storage.

This can raise serious security concerns, especially if you work with sensitive data. What if there were a security breach? Or if the company changed its policies to allow freer sharing of uploaded data? You don’t want to get caught in that position.

What Does the API Do with My Data?

You’ve determined that the API keeps a copy of your audio/video files and/or transcription text. Now, you need to determine what they actually do with that data.

Here are the top actions we’ve seen API and software companies take with user data:

  1. They use it to train and improve their models. This may not seem like a big deal in theory, but if your files contain sensitive customer data or proprietary information, you may not want it in someone else’s hands.
  2. They use it to optimize their own products. The unintentional training data you provide may be used to improve the company’s own offerings, which could operate in direct competition with yours.
  3. They share the data with third parties. It's common for some APIs to send data to a third-party human transcription service. If your files contain sensitive information, you may want to ensure this doesn't happen automatically.

Does My Data Contain Sensitive Information?

Your audio or video files may contain sensitive customer or personal data such as credit card numbers, social security numbers, addresses, phone numbers, medical history, and more. While this Personally Identifiable Information (PII) can be redacted from the transcript by the API, for example via a PII Redaction feature, the raw files may still contain the unredacted versions.

This would be a serious problem in the event of a data breach, or even if the API were using this sensitive information to train its own models without your customers’ permission.
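As a concrete sketch, requesting redaction is usually just a flag on the transcription request. The endpoint URL and the `redact_pii` parameter name below are hypothetical placeholders modeled on common Speech-to-Text APIs, not any specific vendor’s documented interface; check your provider’s documentation for the real names:

```python
import json
import urllib.request

API_BASE = "https://api.example-stt.com/v2"  # hypothetical endpoint


def build_transcription_request(audio_url: str, redact_pii: bool = True) -> urllib.request.Request:
    """Build a transcription request that asks the service to redact PII.

    `redact_pii` is a hypothetical parameter name; your vendor's docs will
    specify the exact flag and which PII categories it covers.
    """
    payload = {
        "audio_url": audio_url,
        # Ask the service to strip e.g. credit card and SSN mentions
        # from the returned transcript text.
        "redact_pii": redact_pii,
    }
    return urllib.request.Request(
        f"{API_BASE}/transcript",
        data=json.dumps(payload).encode("utf-8"),
        headers={"authorization": "YOUR_API_KEY", "content-type": "application/json"},
        method="POST",
    )


req = build_transcription_request("https://example.com/call-recording.mp3")
```

Even when such a flag is honored, redaction typically applies only to the transcript text; the raw audio you uploaded may still contain the unredacted PII, which is exactly why the provider’s storage and deletion policies matter.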

Can I Delete the Data I Send?

You should always ask whether the API or software lets you delete any data you upload, including both your audio or video files and your transcription text. Even if the API does automatically store any of this information, it should respond to deletion requests promptly and completely. It’s a huge red flag if you aren’t allowed to delete your own data, even when you explicitly request it.
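To make this concrete, a deletion call is often a single HTTP DELETE against the stored transcript’s ID. The base URL and path below are assumptions for illustration; a trustworthy vendor will document an equivalent deletion endpoint (or honor deletion requests through support):

```python
import urllib.request

API_BASE = "https://api.example-stt.com/v2"  # hypothetical endpoint


def build_delete_request(transcript_id: str) -> urllib.request.Request:
    """Build an HTTP DELETE for a stored transcript.

    The path is a hypothetical example; check your vendor's API reference
    for the real deletion endpoint and what it actually removes
    (transcript text only, or the uploaded media as well).
    """
    return urllib.request.Request(
        f"{API_BASE}/transcript/{transcript_id}",
        headers={"authorization": "YOUR_API_KEY"},
        method="DELETE",
    )


req = build_delete_request("abc123")
```

When evaluating a provider, it’s worth confirming not just that such an endpoint exists, but that a successful response means the data is actually purged from their systems, including backups, within a stated window.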

Speech-to-Text Privacy Best Practices

Now that you know what to look out for, here are the data privacy best practices to expect from Speech-to-Text APIs.

Look for Speech-to-Text APIs that:

  • Don’t store any raw audio/video files after transcription is complete.
  • Only keep encrypted versions of your transcription files in a secure database, in case developers need to access them. Make sure that you can delete these files on request at any time and that such requests will be promptly met.
  • Don’t use customer data to improve their models unless explicit customer permission is given. This should be more than an opt-in button: it should be a conversation and commitment that both parties enter into knowingly, with explicit terms.
  • Handle sensitive customer data with care. Raw files containing such information should never be shared with third-party and/or human transcription services without explicit consent. Data should be stored securely and promptly deleted post-transcription.

Any API or software that violates these best practices should be avoided.

Ensuring Confidence in Data Security and Speech-to-Text Privacy

At AssemblyAI, handling data responsibly is our top priority, and we strictly adhere to the best practices outlined above. If you can’t find data security and privacy outlined explicitly in the terms of your partnership, be wary. That doesn’t necessarily mean your data will be mishandled, but you don’t want to find out after the fact, when a request to delete sensitive data is denied or a data breach occurs.

By choosing an API that is transparent about its data security practices up front, you can be confident in the relationship moving forward.

If you’d like to speak with us about data security, Speech-to-Text privacy, or other concerns prior to using our API, please reach out!