Automatically detect and redact personally identifiable information from streaming transcripts in real time.

Overview

Streaming PII Redaction lets you automatically detect and remove personally identifiable information from your streaming transcripts in real time. When enabled, the API redacts PII in final turns before sending them to the client. Partial turns are not redacted.
Final turns only: PII redaction applies only to final turns. When redact_pii is true, include_partial_turns defaults to false automatically so no unredacted text reaches the client. Set include_partial_turns to true only if you explicitly want partial (non-final) turns, which will contain unredacted PII alongside the redacted final turns.
When you enable PII redaction, your final turns will look like this:
  • With hash substitution: Hi, my name is ####!
  • With entity_name substitution: Hi, my name is [PERSON_NAME]!
Pre-recorded PII redaction: For PII redaction on pre-recorded audio, including generating redacted audio files, see Redact PII from transcripts.

Connection parameters

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| redact_pii | boolean | Yes | false | Enable PII text redaction. Only applies to final turns. |
| redact_pii_policies | array | No | All | PII entity types to redact. Over the raw WebSocket, pass a JSON-encoded array of policy names (e.g. ["person_name","phone_number"]). The SDKs accept a native list/array. If omitted and redact_pii is true, all detected PII is redacted. See PII policies for the full list. |
| redact_pii_sub | string | No | hash | Replacement scheme. hash replaces PII with # characters, entity_name replaces with [ENTITY_TYPE]. |
| include_partial_turns | boolean | No | false when redact_pii is true, otherwise true | Whether to include partial (non-final) turns. Defaults to false automatically when PII redaction is enabled, so no unredacted text reaches the client. Set to true only if you explicitly want to receive partial turns, which will contain unredacted PII. |
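As a rough illustration of how these parameters might be serialized for a raw WebSocket connection, the sketch below builds a query string from a Python dict. The base URL is a hypothetical placeholder, the helper names encode_param and build_url are invented for this example, and sending booleans as lowercase true/false is an assumption. Only the JSON encoding of redact_pii_policies comes from the table above.

```python
import json
from urllib.parse import urlencode

# Hypothetical placeholder; substitute the real streaming endpoint.
BASE_URL = "wss://streaming.example.invalid/ws"


def encode_param(value):
    """Encode one connection parameter as a query-string value."""
    if isinstance(value, bool):
        # Assumption: booleans are sent as lowercase "true"/"false".
        return "true" if value else "false"
    if isinstance(value, (list, tuple)):
        # Arrays are passed as a JSON-encoded string over the raw WebSocket.
        return json.dumps(list(value))
    return str(value)


def build_url(base_url, params):
    """Append URL-encoded connection parameters to the base URL."""
    query = urlencode({key: encode_param(val) for key, val in params.items()})
    return f"{base_url}?{query}"


url = build_url(BASE_URL, {
    "redact_pii": True,
    "redact_pii_policies": ["person_name", "phone_number"],
    "redact_pii_sub": "entity_name",
})
print(url)
```

If you use an SDK instead, this encoding step is likely unnecessary, since the SDKs accept a native list for redact_pii_policies.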

Quickstart

Enable PII redaction by setting redact_pii to true when you open the WebSocket. Optionally pass redact_pii_policies to limit which entity types are redacted, and redact_pii_sub to choose the replacement scheme.
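import json

CONNECTION_PARAMS = {
    "sample_rate": 16000,
    "speech_model": "u3-rt-pro",
    # Turn on PII text redaction (applies to final turns only).
    "redact_pii": True,
    # Over the raw WebSocket, policies are passed as a JSON-encoded array.
    "redact_pii_policies": json.dumps(["person_name", "phone_number", "email_address"]),
    # Replacement scheme: "hash" (default) or "entity_name".
    "redact_pii_sub": "entity_name",
}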

Example output

With entity_name substitution:
Hi, my name is [PERSON_NAME] and you can reach me at [PHONE_NUMBER] or [EMAIL_ADDRESS].
With hash substitution:
Hi, my name is #### and you can reach me at ###-###-#### or ####@#####.###.
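To make the difference between the two schemes concrete, here is a small local illustration that applies each one to a sentence with known PII spans. It is not the service's implementation: the substitute and find_span helpers are hypothetical, and the rule that hash replaces letters and digits while keeping punctuation is an assumption inferred from the example output above.

```python
def find_span(text, value, entity_type):
    """Locate value in text and return a (start, end, entity_type) span."""
    start = text.index(value)
    return (start, start + len(value), entity_type)


def substitute(text, spans, scheme="hash"):
    """Replace each span in text according to the chosen scheme."""
    pieces = []
    cursor = 0
    for start, end, entity_type in sorted(spans):
        pieces.append(text[cursor:start])
        if scheme == "entity_name":
            # Replace the span with its upper-cased entity type in brackets.
            pieces.append(f"[{entity_type.upper()}]")
        elif scheme == "hash":
            # Assumption: letters and digits become '#', punctuation is kept.
            pieces.append("".join("#" if ch.isalnum() else ch
                                  for ch in text[start:end]))
        else:
            raise ValueError(f"unknown scheme: {scheme}")
        cursor = end
    pieces.append(text[cursor:])
    return "".join(pieces)


text = "Hi, my name is John and you can reach me at 555-123-4567."
spans = [
    find_span(text, "John", "person_name"),
    find_span(text, "555-123-4567", "phone_number"),
]
print(substitute(text, spans, "hash"))
print(substitute(text, spans, "entity_name"))
```

The entity_name scheme keeps the transcript readable for downstream processing, while hash hides the entity type as well as the value.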

Supported PII policies

Streaming PII redaction supports the same policies as pre-recorded PII redaction, including person_name, phone_number, email_address, credit_card_number, us_social_security_number, date_of_birth, and more. For the full list of available policies, see PII policies.

Troubleshooting

PII redaction only applies to final turns. If you still see PII in your transcripts, you have likely set include_partial_turns to true, which returns unredacted partial turns alongside redacted finals. Remove that override (or set it to false) to receive only redacted final turns. This is the default when redact_pii is enabled.
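A quick client-side check can catch this misconfiguration before you connect. The sketch below resolves the effective include_partial_turns value using the defaults described in the connection parameters, then flags the combination that exposes unredacted text. Both function names are hypothetical helpers for this example, and the check assumes the parameters are held as native Python booleans.

```python
def partial_turns_enabled(params):
    """Resolve the effective include_partial_turns value.

    The default is false when redact_pii is true, otherwise true.
    """
    redact = bool(params.get("redact_pii", False))
    return bool(params.get("include_partial_turns", not redact))


def leaks_unredacted_partials(params):
    """Return True if redaction is on but partial turns are still requested."""
    redact = bool(params.get("redact_pii", False))
    return redact and partial_turns_enabled(params)


config = {"redact_pii": True, "include_partial_turns": True}
if leaks_unredacted_partials(config):
    print("Warning: partial turns will contain unredacted PII.")
```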
Audio redaction is not available for streaming. To generate a redacted audio file, use pre-recorded PII redaction with the redact_pii_audio parameter.