Personally Identifiable Information (PII) Redaction is an AI model that automatically removes information that could be used to uniquely identify an individual from your transcript text.
When submitting files for transcription, include the redact_pii parameter in your request body and set it to true, together with the required redact_pii_policies parameter, which lists every policy that should be redacted.
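As a minimal sketch of such a request body, the Python below builds the JSON payload and prepares the POST request without sending it. The endpoint URL, the audio_url field, the authorization header, and the policy identifiers person_name and phone_number are assumptions that this page does not confirm; only redact_pii and redact_pii_policies come from the text above.

```python
import json
import urllib.request

# Assumed submission endpoint; this page does not state the URL.
TRANSCRIPT_ENDPOINT = "https://api.assemblyai.com/v2/transcript"


def build_redaction_body(audio_url, policies):
    """Build a request body that enables PII redaction for the given policies."""
    if not policies:
        # redact_pii_policies is described as required alongside redact_pii.
        raise ValueError("redact_pii_policies must list at least one policy")
    return {
        "audio_url": audio_url,  # assumed field name for the media location
        "redact_pii": True,
        "redact_pii_policies": list(policies),
    }


def build_request(body, api_key):
    """Prepare (but do not send) the POST request carrying the JSON body."""
    return urllib.request.Request(
        TRANSCRIPT_ENDPOINT,
        data=json.dumps(body).encode("utf-8"),
        headers={"authorization": api_key, "content-type": "application/json"},
        method="POST",
    )


body = build_redaction_body(
    "https://example.com/call.mp3",   # hypothetical audio file
    ["person_name", "phone_number"],  # assumed policy identifiers
)
print(json.dumps(body, indent=2))
```

Passing the prepared request to urllib.request.urlopen would submit it; that step is left out so the sketch can be run without an API key.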
All policies supported by the model
With PII Redaction, the API can automatically remove Personally Identifiable Information (PII) such as phone numbers and social security numbers from the transcription text before it is returned. In the redacted text, any sensitive information is replaced with "#" characters.
Below is a table that lists all the available PII redaction policies and their descriptions:
|Medical process, including treatments, procedures, and tests (e.g., heart surgery, CT scan)|
|Name of a medical condition, disease, syndrome, deficit, or disorder (e.g., chronic fatigue syndrome, arrhythmia, depression)|
|Blood type (e.g., O-, AB positive)|
|Medications, vitamins, or supplements (e.g., Advil, Acetaminophen, Panadol)|
|Bodily injury (e.g., I broke my arm, I have a sprained wrist)|
|A "lazy" rule that will redact any sequence of two or more numbers|
|Email address (e.g., email@example.com)|
|Date of birth (e.g., Date of Birth: March 7, 1961)|
|Telephone or fax number|
|Social Security Number or equivalent|
|Credit card number|
|Expiration date of a credit card|
|Credit card verification code (e.g., CVV: 080)|
|Specific calendar date (e.g., December 18)|
|Terms indicating nationality, ethnicity, or race (e.g., American, Asian, Caucasian)|
|Name of an event or holiday (e.g., Olympics, Yom Kippur)|
|Name of a natural language (e.g., Spanish, French)|
|Any location reference, including mailing address, postal code, city, state, province, or country|
|Name and/or amount of currency (e.g., 15 pesos, $94.50)|
|Name of a person (e.g., Bob, Doug Jones)|
|Number associated with an age (e.g., 27, 75)|
|Name of an organization (e.g., CNN, McDonald's, University of Alaska)|
|Terms referring to a political party, movement, or ideology (e.g., Republican, Liberal)|
|Job title or profession (e.g., professor, actors, engineer, CPA)|
|Terms indicating religious affiliation (e.g., Hindu, Catholic)|
|Driver’s license number (e.g., DL# 356933-540)|
|Banking information, including account and routing numbers|
In addition to the redact_pii_policies parameter, you can use the redact_pii_sub parameter to further customize PII redaction. This parameter controls what each detected piece of PII is replaced with, regardless of the policy that matched it:
|PII that is detected is replaced with a hash - #. For example, I'm calling for John is replaced with ####. (Applied by default)|
|PII that is detected is replaced with the associated policy name. For example, John is replaced with |
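To make the default hash substitution concrete, here is a small local illustration of the masking described in the table. It does not call the API or detect PII; it assumes one "#" per redacted character, which matches the John example, and the helper name mask_with_hashes is hypothetical.

```python
def mask_with_hashes(text, spans):
    """Replace each (start, end) character span in text with '#' characters."""
    chars = list(text)
    for start, end in spans:
        for i in range(start, end):
            chars[i] = "#"
    return "".join(chars)


sentence = "I'm calling for John"
start = sentence.index("John")
# The span is supplied by hand here; in the API the model finds it.
redacted = mask_with_hashes(sentence, [(start, start + len("John"))])
print(redacted)  # I'm calling for ####
```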
Create a redacted audio file
In addition to redacting sensitive information from the transcription text, the API can also generate a version of the original audio file in which the PII is "beeped" out wherever it is spoken. To do so, include the redact_pii_audio parameter in your request when submitting files for transcription.
When the transcription is complete, you can retrieve a URL that points to your redacted audio file by making a request to the following API endpoint:
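A rough sketch of that retrieval step in Python follows. The endpoint path used here (/v2/transcript/{id}/redacted-audio), the base URL, and the authorization header are assumptions not confirmed by this page; the redacted_audio_url response field is the one this guide describes for the webhook payload and is assumed to appear in this response as well.

```python
import json
import urllib.request

API_BASE = "https://api.assemblyai.com/v2"  # assumed base URL


def redacted_audio_request(transcript_id, api_key):
    """Prepare a GET request for the redacted audio of a finished transcript."""
    # Assumed path; check the API reference for the exact endpoint.
    url = f"{API_BASE}/transcript/{transcript_id}/redacted-audio"
    return urllib.request.Request(url, headers={"authorization": api_key})


def fetch_redacted_audio_url(transcript_id, api_key):
    """Send the request and return the URL of the redacted audio file."""
    request = redacted_audio_request(transcript_id, api_key)
    with urllib.request.urlopen(request) as response:
        payload = json.load(response)
    return payload["redacted_audio_url"]
```

Calling fetch_redacted_audio_url with a real transcript ID and API key performs the network request; redacted_audio_request alone can be used to inspect the URL and headers first.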
Webhooks allow you to receive real-time updates about the status of your PII redacted audio file. If a webhook_url was provided in your request when submitting your audio file for transcription, we will send a POST request to that URL.
When you receive the request from AssemblyAI, it will include the following headers.
accept-encoding: gzip, deflate
And the request body will include the following parameters.
The status field indicates whether the PII redaction completed successfully or failed with an error. The redacted_audio_url field contains a URL to the redacted audio file.
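A webhook receiver might read those two fields as in the sketch below. The function name and the sample payload values are hypothetical; in particular, the status string shown is only a placeholder, since this page does not list the possible values.

```python
import json


def parse_redaction_webhook(raw_body):
    """Return (status, redacted_audio_url) from a webhook request body."""
    payload = json.loads(raw_body)
    # Either field may be absent, for example when redaction failed.
    return payload.get("status"), payload.get("redacted_audio_url")


# Hypothetical payload, shaped after the two fields described above.
sample_body = json.dumps(
    {
        "status": "redacted_audio_ready",  # placeholder status value
        "redacted_audio_url": "https://example.com/redacted.mp3",
    }
).encode("utf-8")

status, audio_url = parse_redaction_webhook(sample_body)
print(status, audio_url)
```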
Ensure that the redact_pii_policies parameter is included in your request with the desired policy names. If you're still experiencing issues, please reach out to our support team for assistance.
Ensure that the webhook_url parameter is included with a valid URL that can be reached by AssemblyAI's API. If you're using custom authentication headers, ensure that the webhook_auth_header_name and webhook_auth_header_value parameters are included and correct. If you're still having issues, please contact our support team for assistance.
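The checks suggested above can be run locally before submitting a request. The following is a sketch of such a pre-flight check, not the API's own validation; the function name is hypothetical, and webhook_auth_header_name is assumed to be the companion of webhook_auth_header_value.

```python
from urllib.parse import urlparse


def check_redaction_request(body):
    """Return a list of likely problems in a transcription request body."""
    problems = []
    if body.get("redact_pii") is not True:
        problems.append("redact_pii is not set to true")
    if not body.get("redact_pii_policies"):
        problems.append("redact_pii_policies is missing or empty")

    webhook_url = body.get("webhook_url")
    if webhook_url is not None:
        parsed = urlparse(webhook_url)
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            problems.append("webhook_url is not an absolute http(s) URL")

    # Assumed pairing: a custom auth header needs both a name and a value.
    name = body.get("webhook_auth_header_name")
    value = body.get("webhook_auth_header_value")
    if (name is None) != (value is None):
        problems.append("webhook auth header name and value must both be set")
    return problems


print(check_redaction_request({"redact_pii": True, "webhook_url": "example.com/hook"}))
```

An empty list means none of these local checks failed; it does not guarantee that the webhook URL is reachable from AssemblyAI's servers.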