Redact PII in Audio with Make and AssemblyAI

Create a Make scenario using the AssemblyAI app that watches a Google Drive folder for new audio files, and then creates both a transcript and an audio file in which PII is redacted.

Make (formerly Integromat) is a workflow automation tool that lets you integrate various services without requiring coding knowledge. With the AssemblyAI app for Make, you can use our AI models to transcribe audio with speech recognition models, analyze it with audio intelligence models, and build generative features on top of it with LLMs.

In this tutorial, you'll create a Make scenario that watches a Google Drive folder for new audio files, and then creates both a transcript and an audio file in which PII is redacted.

Note

We're using Google Drive as the source of audio files, but you could pull them from any other source, such as AWS S3. If the audio files already have public URLs, you don't need to upload them to AssemblyAI.

Build a PII redaction scenario

Sign up for or log in to your Make account.

Create a new scenario and set the Google Drive Watch Files in a Folder module as the trigger.
If this is the first time you're creating a Google Drive module, you'll need to create a new connection and select it.
Configure the trigger to watch a specific folder in your Google Drive where you will upload your audio files. Let's assume this folder is named media.

Configure Google Drive Watch Files in a Folder module

Next, add a Google Drive Download a File module connected to the trigger, and configure the File ID with the File ID from the trigger.

Configure Google Drive Download a File module

Right-click on the connection between the trigger and Download a File module, and click on Set up a filter. Configure the filter condition so that it is true when the file name does not contain the word "redacted".

Set up a filter to only pass through files that aren't redacted

Later, when the scenario is fully built, it will upload the redacted transcript and audio back to the same Google Drive folder. Those uploads would trigger the scenario again, and this filter keeps it from reprocessing files that have already been redacted.
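To make the filter logic concrete, here is a minimal Python sketch of the same condition. It is only an illustration of what the Make filter checks, not something you need to run. The function name `should_process` and the sample file names are hypothetical, and the case-insensitive comparison is an assumption; in Make, choose the case-sensitive or case-insensitive "does not contain" operator to match your naming.

```python
def should_process(file_name: str) -> bool:
    """Return True for files that should pass the filter.

    Mirrors the scenario filter: a file passes only when its name
    does not contain the word "redacted".
    """
    # Assumption: compare case-insensitively so "Redacted" is also skipped.
    return "redacted" not in file_name.lower()


# Hypothetical folder contents after one run of the scenario.
incoming = ["call.mp3", "call.redacted.mp3", "call.redacted.txt"]
to_process = [name for name in incoming if should_process(name)]
print(to_process)  # ['call.mp3']
```

Only the original recording passes, so the scenario's own uploads are ignored.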

Add an AssemblyAI Upload a File module and connect it to the Download a File module you just configured. The Google Drive file should automatically be selected.
If this is the first time you're adding an AssemblyAI module, you'll need to create a connection and select it.

Configure AssemblyAI Upload a File module

Add an AssemblyAI Transcribe an Audio File module. Since this module has a lot of parameters, we recommend clicking the three dots button and then clicking Collapse all.

Collapse all parameters from Make module

Pass the Uploaded File URL from the previous module to the Audio URL parameter.

Configure AssemblyAI Transcribe an Audio File module

Configure the PII redaction model by setting the following parameters:

  • Set Redact PII to Yes
  • Set Redact PII Audio to Yes
  • Configure at least one PII policy in the Redact PII Policies list

Configure AssemblyAI PII redaction in AssemblyAI Transcribe an Audio File module
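For readers who want to see what these three settings amount to outside of Make, the sketch below builds a roughly equivalent request body for AssemblyAI's transcript endpoint. The field names `redact_pii`, `redact_pii_audio`, and `redact_pii_policies` follow AssemblyAI's REST API as we understand it; that the Make module maps onto exactly these fields is an assumption. The helper name, the example URL, and the chosen policies are illustrative only, and no request is actually sent.

```python
import json


def build_transcript_request(audio_url: str, policies: list[str]) -> dict:
    """Build a transcript request body with PII redaction enabled.

    Hypothetical helper: it only assembles the JSON body and sends nothing.
    """
    if not policies:
        # Matches the tutorial's rule: configure at least one PII policy.
        raise ValueError("at least one PII policy is required")
    return {
        "audio_url": audio_url,           # Uploaded File URL from the previous module
        "redact_pii": True,               # Redact PII = Yes
        "redact_pii_audio": True,         # Redact PII Audio = Yes
        "redact_pii_policies": policies,  # Redact PII Policies list
    }


# Placeholder URL and example policies, not values from the tutorial.
body = build_transcript_request(
    "https://example.com/call.mp3",
    ["person_name", "phone_number"],
)
print(json.dumps(body, indent=2))
```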

Add an AssemblyAI Get Redacted Audio of a Transcript module and pass the transcript ID from the Transcribe an Audio File module to the Transcript ID parameter.

Configure AssemblyAI Get Redacted Audio of a Transcript module
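As a rough picture of what this module does with the transcript ID, the sketch below shows the REST endpoint we believe it corresponds to and how a response could be read. The endpoint path and the `status` and `redacted_audio_url` fields reflect our understanding of AssemblyAI's API and should be treated as assumptions; the helper names, transcript ID, and sample response are hypothetical, and no network call is made.

```python
API_BASE = "https://api.assemblyai.com/v2"


def redacted_audio_endpoint(transcript_id: str) -> str:
    """Return the URL to request the redacted audio for a transcript."""
    return f"{API_BASE}/transcript/{transcript_id}/redacted-audio"


def extract_redacted_audio_url(response: dict) -> str:
    """Pull the download URL out of a redacted-audio response."""
    # Assumption: the API reports this status once the file is available.
    if response.get("status") != "redacted_audio_ready":
        raise RuntimeError("redacted audio is not ready yet")
    return response["redacted_audio_url"]


# Hypothetical transcript ID and response, for illustration only.
print(redacted_audio_endpoint("hypothetical-transcript-id"))
sample_response = {
    "status": "redacted_audio_ready",
    "redacted_audio_url": "https://example.com/redacted.mp3",
}
print(extract_redacted_audio_url(sample_response))
```

In the Make scenario you don't handle this URL yourself; the module exposes the audio as a file you can map into later modules.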

Create a Google Drive Create a File from Text module and configure the parameters:

  • Set New text File Location to the media folder
  • Set File Name to the Original Filename from the Google Drive Download a File module, but replace the file extension of the file name with redacted.txt
  • Set File Content to the Text property from the AssemblyAI Transcribe an Audio File module

Configure Create a File from Text module

Warning

The Transcribe an Audio File module output has multiple Text properties, some of which are nested inside other objects and arrays. Use the Text property from the root of the output.

Select text output
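The difference between the root Text property and the nested ones is easier to see on a simplified output. The dictionary below is made-up sample data shaped like a transcript, with a top-level `text` and a `words` array whose items each carry their own `text`. The values, the timings, and the `####` placeholder for redacted PII are illustrative assumptions.

```python
# Made-up, simplified transcript output; real output has many more fields.
sample_output = {
    "id": "hypothetical-transcript-id",
    "text": "Hello, my name is ####.",  # root Text: the full redacted transcript
    "words": [
        {"text": "Hello,", "start": 0, "end": 400},  # nested Text: a single word
        {"text": "my", "start": 400, "end": 550},
    ],
}

# The value to map into File Content is the root one.
full_text = sample_output["text"]

# A nested Text only holds one fragment, which is not what you want here.
first_word = sample_output["words"][0]["text"]

print(full_text)
print(first_word)
```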

Warning

If you haven't run the scenario yet, the output shown for the Transcribe an Audio File module is sample data. Once the scenario has run, it'll use real data.

Add a Google Drive Upload a File module and configure the parameters:

  • Set New Folder Location to media
  • Select Map under the File parameter
    • Set File Name to the Original Filename from the Google Drive Download a File module, but replace the file extension of the file name with redacted.mp3
    • Set Data to the redacted_audio_file property from the Get Redacted Audio of a Transcript module.

Configure Google Drive Upload a File module.
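The two output file names used in this scenario follow the same rule: keep the original name, drop its extension, and append redacted.txt or redacted.mp3. The Python sketch below shows that rule on a sample name. The function name `redacted_name` and the sample file names are hypothetical; in Make you would build the same string with its text functions in the File Name field.

```python
from pathlib import PurePosixPath


def redacted_name(original: str, new_extension: str) -> str:
    """Replace the file extension with 'redacted.<new_extension>'."""
    # .stem drops only the last extension, so "my.call.wav" keeps "my.call".
    stem = PurePosixPath(original).stem
    return f"{stem}.redacted.{new_extension}"


print(redacted_name("call.mp3", "txt"))  # call.redacted.txt
print(redacted_name("call.mp3", "mp3"))  # call.redacted.mp3
```

Because both names contain the word "redacted", the filter set up earlier in the scenario skips them when they land in the watched folder.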

Your scenario is complete. Save it and let's test it out.

Go to Google Drive and upload an audio file to the media folder. Here's a sample audio file of a phone call that you can use. Then switch back to your Make scenario and click the Run once button.
Once your scenario is finished, you should see the transcript and redacted audio file appear in your Google Drive folder.

Conclusion

You just learned how to build a Make scenario that redacts PII in audio files stored in Google Drive. Instead of Google Drive, you could plug in other file services, CRMs, or wherever you store your audio files, as long as Make has an app for that service that can download and upload your files.

You can do a lot more with the AssemblyAI app. You can use additional speech recognition features, analyze your audio with audio intelligence models, and build generative features with LeMUR. Check out the AssemblyAI app for Make documentation to learn more.