Does AssemblyAI offer Zero Data Retention?
We do not have a formal Zero Data Retention (ZDR) policy. If you are interested in achieving ZDR, please contact our Sales team.
AssemblyAI maintains customer data according to applicable law, customer contract, and data minimization principles.
Files are maintained in our Production Environment according to the protocol below:
Audio & transcripts in production environment
Pre-recorded audio in production environment
This section applies only to the pre-recorded audio transcription endpoint: https://api.assemblyai.com/v2/transcript.
For pre-recorded audio, AssemblyAI can perform deletion at the customer’s request after a file has been transcribed by the API. Each file is provided with a unique identifier known as the transcript_id, which can be stored by the customer and/or retrieved via a GET request.
Upon making a successful POST request for transcription to https://api.assemblyai.com/v2/transcript, the JSON response will include an id key. This id is the unique identifier for the transcript and can be used to retrieve the final transcript once the transcription job has completed on the AssemblyAI side. We do not recommend running any deletion until a GET request has retrieved the final transcript, since you will not yet have been able to retrieve the result from the API.
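As an illustration of the submission step, here is a minimal Python sketch that sends the POST request and reads the id key from the response. It uses only the standard library. The authorization header name, the audio_url body field, and the helper names http_json and submit_transcription are assumptions made for this example; confirm the exact request shape against the API documentation.

```python
import json
import urllib.request

BASE_URL = "https://api.assemblyai.com/v2/transcript"


def http_json(method, url, api_key, payload=None):
    """Send one request and return the decoded JSON response body."""
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    request = urllib.request.Request(url, data=data, method=method)
    # Assumption: the API key is passed in an "authorization" header.
    request.add_header("authorization", api_key)
    if data is not None:
        request.add_header("content-type", "application/json")
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


def submit_transcription(audio_url, api_key, send=http_json):
    """POST a transcription job and return the transcript id from the response."""
    # Assumption: the request body names the audio location "audio_url".
    response = send("POST", BASE_URL, api_key, {"audio_url": audio_url})
    return response["id"]
```

Store the returned id on your side; it is the transcript_id needed for the retrieval and deletion steps.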
Upon making a successful GET request to https://api.assemblyai.com/v2/transcript/:transcript_id, check the status key of the response for a value of completed or error. If the status is neither completed nor error, continue polling until one of them is returned. Once a completed status is returned, store all outputs in your own database for record-keeping. If an error status is returned, check the error key of the response to diagnose what went wrong, then re-run the file with the recommended changes.
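The polling step described above might look like the following Python sketch. The path layout /v2/transcript/<transcript_id>, the authorization header, the polling interval, and the attempt limit are assumptions chosen for illustration, and fetch_transcript and wait_for_transcript are hypothetical helper names.

```python
import json
import time
import urllib.request

BASE_URL = "https://api.assemblyai.com/v2/transcript"


def fetch_transcript(transcript_id, api_key):
    """GET the current state of one transcript as a dict."""
    request = urllib.request.Request(f"{BASE_URL}/{transcript_id}", method="GET")
    # Assumption: the API key is passed in an "authorization" header.
    request.add_header("authorization", api_key)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


def wait_for_transcript(transcript_id, fetch, interval_seconds=3.0,
                        max_attempts=100, sleep=time.sleep):
    """Poll until the status is completed or error.

    Returns the transcript dict on completed, raises RuntimeError on error,
    and raises TimeoutError if neither status appears within max_attempts.
    """
    for _ in range(max_attempts):
        transcript = fetch(transcript_id)
        status = transcript.get("status")
        if status == "completed":
            return transcript
        if status == "error":
            # The error key explains what went wrong with the file.
            raise RuntimeError(
                f"Transcription {transcript_id} failed: {transcript.get('error')}"
            )
        sleep(interval_seconds)
    raise TimeoutError(f"Transcript {transcript_id} did not finish in time")
```

A typical call would be wait_for_transcript(transcript_id, lambda tid: fetch_transcript(tid, api_key)). Passing the fetch function in as a parameter keeps the polling logic easy to exercise without network access.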
Once a transcript has been retrieved with a completed status, or has returned an error status, it should be deleted via a DELETE request to https://api.assemblyai.com/v2/transcript/:transcript_id. This will trigger a deletion job on AssemblyAI’s end.
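The ordering matters here: retrieve, store, then delete. The sketch below is one way to enforce that order in Python, under the same assumptions as before (path layout /v2/transcript/<transcript_id> and an authorization header). The names call_api and archive_then_delete, and the store callback, are hypothetical and stand in for your own persistence code.

```python
import json
import urllib.request

BASE_URL = "https://api.assemblyai.com/v2/transcript"


def call_api(method, transcript_id, api_key):
    """Send GET or DELETE for one transcript; return parsed JSON, or None if the body is empty."""
    request = urllib.request.Request(f"{BASE_URL}/{transcript_id}", method=method)
    # Assumption: the API key is passed in an "authorization" header.
    request.add_header("authorization", api_key)
    with urllib.request.urlopen(request) as response:
        body = response.read().decode("utf-8")
    return json.loads(body) if body else None


def archive_then_delete(transcript_id, api_key, store, call=call_api):
    """Store the final result locally, then request deletion.

    Returns False without deleting if the job is still in progress.
    """
    transcript = call("GET", transcript_id, api_key)
    if transcript.get("status") not in ("completed", "error"):
        return False  # Not final yet: deleting now would lose the result.
    store(transcript_id, transcript)  # Persist on your side first.
    call("DELETE", transcript_id, api_key)  # Triggers the deletion job.
    return True
```

Here store can be any callable that takes the id and the transcript dict, for example a function that writes a row to your database.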
Regarding the deletion of audio data: if you used a presigned URL and hosted the audio file in your own cloud environment, all audio data will also be deleted when the deletion request for a given transcript_id is submitted. If you used the upload endpoint for your audio file (https://www.assemblyai.com/docs/api-reference/files/upload), the audio file data from this endpoint will be deleted on a schedule after 2 days. All intermediate artifacts used for processing the file (transcoded audio, original audio files, etc.) will be deleted on a schedule after 3 days, unless a deletion request is submitted first.
You can also reference the API documentation for the requests above:
- Transcribe a file - POST request
- Retrieve a transcript - GET request
- Delete a transcript - DELETE request
- Handling transcription errors
Confirming Data Deletion: Pre-recorded audio
Should you wish to confirm that a file has been deleted, or if you did not store the transcript_id when the transcription request was made, you can make a GET request to https://api.assemblyai.com/v2/transcript. This returns a list of all transcripts associated with your account and their current status. To check the status of a single transcript, specify its transcript_id in the request instead.
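A short Python sketch of the listing request follows. The response field names used here (a transcripts array whose items carry id and status) and the authorization header are assumptions for illustration, as are the helper names list_transcripts and statuses_by_id. The listing may also be paginated, which this sketch does not handle.

```python
import json
import urllib.request

BASE_URL = "https://api.assemblyai.com/v2/transcript"


def list_transcripts(api_key):
    """GET the transcript listing for the account as a dict."""
    request = urllib.request.Request(BASE_URL, method="GET")
    # Assumption: the API key is passed in an "authorization" header.
    request.add_header("authorization", api_key)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


def statuses_by_id(listing):
    """Map each transcript id in a listing to its reported status."""
    # Assumption: items live under "transcripts" with "id" and "status" keys.
    return {item["id"]: item.get("status") for item in listing.get("transcripts", [])}
```

Calling statuses_by_id(list_transcripts(api_key)) gives a quick lookup table for checking the current status of a given transcript_id, or for recovering ids that were not stored at submission time.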
Streaming audio in production environment
This section applies only to the streaming audio transcription endpoint: wss://streaming.assemblyai.com/v3/ws.
If you have opted out of model training, we do not store or maintain any information about the audio streamed to us in our Streaming Product: it is processed for transcription and then discarded. Certain metadata about the transcript is stored and maintained for logging and billing purposes, but none of the original audio is stored.
Model training:
The model training environment differs from the production environment. You can find more information on model training in our Model Training FAQ. If you would like to opt out of model training, please see our Opt-Out FAQ.