Reference
The transcript object represents a transcription. You can create transcriptions, retrieve them to see their status and results, and delete them.
Attribute | Description | Required |
---|---|---|
id string | The unique identifier of your transcription | Yes |
status string | The status of your transcription: queued, processing, completed, or error | No |
language_code string | The language of your audio file. Possible values are found in Supported Languages. The default value is en_us. | No |
audio_url string | The URL of your media file to transcribe | No |
text string | The text transcription of your media file | No |
words array | A list of all the individual words transcribed | No |
utterances array | When dual_channel or speaker_labels is enabled, a list of turn-by-turn utterances | No |
confidence float | The confidence our model has in the transcribed text, between 0.0 and 1.0 | No |
audio_duration float | The duration of your media file, in seconds | No |
punctuate boolean | Enable Automatic Punctuation, can be true or false | No |
format_text boolean | Enable Text Formatting, can be true or false | No |
dual_channel boolean | Enable Dual Channel transcription, can be true or false | No |
webhook_url string | The URL we should send webhooks to when your transcript is complete | No |
webhook_status_code string | The status code we received from your server when delivering your webhook | No |
auto_highlights_result array | The list of results when enabling Automatic Transcript Highlights | No |
audio_start_from integer | The point in time, in milliseconds, to begin transcription from in your media file | No |
audio_end_at integer | The point in time, in milliseconds, to stop transcribing in your media file | No |
word_boost array | A list of custom vocabulary to boost accuracy for | No |
boost_param string | The weight to apply to words/phrases in the word_boost array; can be "low", "default", or "high" | No |
filter_profanity boolean | Filter profanity from the transcribed text, can be true or false | No |
redact_pii boolean | Redact PII from the transcribed text, can be true or false | No |
redact_pii_audio boolean | Generate a copy of the original media file with spoken PII "beeped" out, can be true or false | No |
redact_pii_policies array | The list of PII Redaction policies to enable | No |
redact_pii_sub string | The replacement logic for detected PII, can be "entity_type" or "hash" | No |
speaker_labels boolean | Enable Speaker Diarization, can be true or false | No |
content_safety boolean | Enable Content Safety Detection, can be true or false | No |
iab_categories boolean | Enable Topic Detection, can be true or false | No |
content_safety_labels array | The list of results when content_safety is true | No |
iab_categories_result array | The list of results when iab_categories is true | No |
custom_spelling array | Customize how words are spelled and formatted using to and from values | No |
disfluencies boolean | Transcribe Filler Words, like "umm", in your media file; can be true or false | No |
sentiment_analysis boolean | Enable Sentiment Analysis, can be true or false | No |
auto_chapters boolean | Enable Auto Chapters, can be true or false | No |
chapters array | When Auto Chapters is enabled, the list of Auto Chapters results | No |
sentiment_analysis_results array | When Sentiment Analysis is enabled, the list of Sentiment Analysis results | No |
entity_detection boolean | Enable Entity Detection, can be true or false | No |
entities array | When Entity Detection is enabled, the list of detected Entities | No |
Create a transcription.
Attribute | Description |
---|---|
audio_url string required | The URL of your media file to transcribe |
language_code string | The language of your audio file. Possible values are found in Supported Languages. The default value is en_us. |
punctuate boolean | Enable Automatic Punctuation, can be true or false |
format_text boolean | Enable Text Formatting, can be true or false |
dual_channel boolean | Enable Dual Channel transcription, can be true or false |
webhook_url string | The URL we should send webhooks to when your transcript is complete |
audio_start_from integer | The point in time, in milliseconds, to begin transcription from in your media file |
audio_end_at integer | The point in time, in milliseconds, to stop transcribing in your media file |
word_boost array | A list of custom vocabulary to boost accuracy for |
boost_param string | The weight to apply to words/phrases in the word_boost array; can be "low", "default", or "high" |
filter_profanity boolean | Filter profanity from the transcribed text, can be true or false |
redact_pii boolean | Redact PII from the transcribed text, can be true or false |
redact_pii_audio boolean | Generate a copy of the original media file with spoken PII "beeped" out, can be true or false |
redact_pii_policies array | The list of PII Redaction policies to enable |
redact_pii_sub string | The replacement logic for detected PII, can be "entity_type" or "hash" |
speaker_labels boolean | Enable Speaker Diarization, can be true or false |
content_safety boolean | Enable Content Safety Detection, can be true or false |
iab_categories boolean | Enable Topic Detection, can be true or false |
custom_spelling array | Customize how words are spelled and formatted using to and from values |
disfluencies boolean | Transcribe Filler Words, like "umm", in your media file; can be true or false |
sentiment_analysis boolean | Enable Sentiment Analysis, can be true or false |
auto_chapters boolean | Enable Auto Chapters, can be true or false |
entity_detection boolean | Enable Entity Detection, can be true or false |
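
The parameters above can be combined into a single JSON request body. The following is a minimal sketch, not an official client: the endpoint URL `https://api.assemblyai.com/v2/transcript`, the `authorization` header name, and the helper `build_transcript_request` are assumptions that do not appear in this section. The function only builds the request; nothing is sent until you pass it to `urllib.request.urlopen`.

```python
import json
import urllib.request

# Assumed endpoint; this section does not state the URL.
API_URL = "https://api.assemblyai.com/v2/transcript"

BOOST_LEVELS = {"low", "default", "high"}


def build_transcript_request(api_key, audio_url, **options):
    """Return an unsent urllib Request that would create a transcription.

    `options` may hold any optional attribute from the table above,
    for example speaker_labels=True or word_boost=[...].
    """
    if not audio_url:
        raise ValueError("audio_url is required")
    boost = options.get("boost_param")
    if boost is not None and boost not in BOOST_LEVELS:
        raise ValueError('boost_param must be "low", "default", or "high"')

    payload = {"audio_url": audio_url, **options}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        # Assumed header names for authentication and content type.
        headers={"authorization": api_key, "content-type": "application/json"},
        method="POST",
    )


request = build_transcript_request(
    "YOUR_API_KEY",
    "https://example.com/audio.mp3",
    speaker_labels=True,
    word_boost=["AssemblyAI"],
    boost_param="high",
)
# To actually send it: urllib.request.urlopen(request)
```

Validating `boost_param` locally is a design choice for the sketch; the API may apply its own validation as well.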
Get the detailed information of a specific transcript by id.
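Because a transcript moves through the queued and processing statuses before reaching completed or error, a client usually retrieves it repeatedly until it finishes. The sketch below shows one way to do that. The `fetch` callable is hypothetical: it stands in for whatever code performs the GET request and returns the decoded transcript object as a dict. The polling interval and attempt limit are arbitrary illustrative values.

```python
import time

# Statuses after which the transcript will no longer change.
TERMINAL_STATUSES = {"completed", "error"}


def wait_for_transcript(fetch, transcript_id, interval=3.0,
                        max_attempts=100, sleep=time.sleep):
    """Call fetch(transcript_id) until the status is completed or error."""
    for _ in range(max_attempts):
        transcript = fetch(transcript_id)
        if transcript["status"] in TERMINAL_STATUSES:
            return transcript
        sleep(interval)
    raise TimeoutError(f"transcript {transcript_id} did not finish in time")


# Usage with a stand-in fetch function that replays canned responses.
responses = iter([
    {"id": "abc", "status": "queued"},
    {"id": "abc", "status": "processing"},
    {"id": "abc", "status": "completed", "text": "hello"},
])
result = wait_for_transcript(lambda _id: next(responses), "abc",
                             sleep=lambda seconds: None)
print(result["text"])  # prints "hello"
```

If you set a webhook_url when creating the transcription, polling like this is not needed.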
Query for just the sentences of a transcript by id.
Query for just the paragraphs of a transcript by id.
List all your transcripts.
All of the below parameters are optional.
Attribute | Description |
---|---|
limit integer | Max results to return in a single response, between 1 and 200 inclusive |
status string | Filter by transcript status, "processing", "queued", "completed", or "error" |
created_on string | Only return transcripts created on this date; format: "YYYY-MM-DD" |
before_id string | Return transcripts that were created before this id |
after_id string | Return transcripts that were created after this id |
throttled_only boolean | Only return throttled transcripts, overrides status filter |
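
As a rough illustration of the filters above, the following sketch builds a query string and checks the documented constraints before any request is made. The helper name `build_list_query` is hypothetical, and serializing booleans as "true"/"false" is an assumption; this section does not say how the API expects them.

```python
from datetime import datetime
from urllib.parse import urlencode

VALID_STATUSES = {"queued", "processing", "completed", "error"}


def build_list_query(limit=None, status=None, created_on=None,
                     before_id=None, after_id=None, throttled_only=None):
    """Return a URL query string containing only the filters that were set."""
    params = {}
    if limit is not None:
        if not 1 <= limit <= 200:
            raise ValueError("limit must be between 1 and 200 inclusive")
        params["limit"] = limit
    if status is not None:
        if status not in VALID_STATUSES:
            raise ValueError(f"unknown status: {status!r}")
        params["status"] = status
    if created_on is not None:
        # Raises ValueError if the string does not parse as a date.
        datetime.strptime(created_on, "%Y-%m-%d")
        params["created_on"] = created_on
    if before_id is not None:
        params["before_id"] = before_id
    if after_id is not None:
        params["after_id"] = after_id
    if throttled_only is not None:
        # Assumed boolean serialization.
        params["throttled_only"] = "true" if throttled_only else "false"
    return urlencode(params)


query = build_list_query(limit=50, status="completed", created_on="2021-06-01")
print(query)  # limit=50&status=completed&created_on=2021-06-01
```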
Permanently delete a transcript by id. The record of the transcript will still exist and remain queryable; however, all fields containing sensitive data (like text transcriptions) will be permanently deleted.
Uploads can be used to upload media files directly to the AssemblyAI API for transcription.
Attribute | Description |
---|---|
upload_url string | A URL that points to your audio file, accessible only by AssemblyAI's servers |
Upload a file to our servers for transcription. Learn more at Uploading Local Files for Transcription.
Attribute | Value |
---|---|
Transfer-Encoding | chunked |
The contents of your media file.
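To send a request body with chunked transfer encoding, a client typically streams the file in pieces rather than loading it into memory at once. The generator below is a minimal sketch of that idea. The name `read_in_chunks` and the 5 MB default chunk size are illustrative assumptions, and the upload URL is left out because this section does not state it. In Python's standard library, passing an iterable like this as the `data` of a `urllib.request.Request` without a Content-Length header causes it to be sent with `Transfer-Encoding: chunked`.

```python
import io


def read_in_chunks(stream, chunk_size=5 * 1024 * 1024):
    """Yield successive byte chunks from a binary file-like object."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        yield chunk


# Demonstration on an in-memory stream; in practice, pass open(path, "rb").
chunks = list(read_in_chunks(io.BytesIO(b"x" * 10), chunk_size=4))
print([len(c) for c in chunks])  # [4, 4, 2]
```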
If you're working with short bursts of audio, less than 15 seconds, you can send the audio data directly to the /v2/stream endpoint, which will return a transcript to you within a few hundred milliseconds, directly in the request-response loop.
The audio data you send to this endpoint has to comply with a strict format. This is because we don't transcode your data; we send it directly to the model for transcription. You can send the content of a .wav file to this endpoint, or raw data read directly from a microphone. Either way, you must record your audio in the following format to use this endpoint:
When making a POST request to this endpoint, you should include the following parameters.
Param | Example | Info | Required |
---|---|---|---|
audio_data | UklGRtjIAABXQVZFZ… | Raw audio data, base64 encoded. This can be the raw data recorded directly from a microphone or read from a wav file. | Yes |
format_text | true | This is set to false by default; however, a developer can add auto formatting of text by setting it to true. | No |
punctuate | true | This is set to false by default; however, a developer can add auto punctuation by setting it to true. | No |
base64 encoding:
base64 encoding is a simple way to encode your raw audio data so that it can be included as a JSON parameter in your POST request. Most programming languages have very simple built-in functions for encoding binary data to base64.
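In Python, for example, the standard `base64` module covers this. The sketch below builds the JSON body for the request using the three parameters from the table above. The helper name `build_stream_payload` is hypothetical, and the four placeholder bytes stand in for real audio, which you would normally read from a .wav file or a microphone buffer.

```python
import base64
import json


def build_stream_payload(raw_audio, format_text=False, punctuate=False):
    """Return the JSON request body with the audio base64 encoded."""
    return json.dumps({
        # b64encode returns bytes; decode to str so it is JSON serializable.
        "audio_data": base64.b64encode(raw_audio).decode("ascii"),
        "format_text": format_text,
        "punctuate": punctuate,
    })


# Placeholder bytes; in practice use open("audio.wav", "rb").read().
body = build_stream_payload(b"RIFF", punctuate=True)
print(json.loads(body)["audio_data"])  # UklGRg==
```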
Depending on how much audio data you send, the API will respond within 100-750 milliseconds. The following keys will be in the JSON response.
Param | Example | Info |
---|---|---|
id | 5551722-f677-48a6-9287-39c0aafd9ac1 | The unique id of your transcription. |
status | completed | The status of your transcription. |
confidence | 0.956 | The confidence score of the entire transcription, between 0 and 1. |
text | You know Demons on TV like... | The complete transcription for your audio. |
words | [{"confidence": 1.0, "end": 440, "start": 0, "text": "You"}, ...] | An array of objects, with the information for each word in the transcription text. Will include the start/end time (in milliseconds) of the word and the confidence score of the word. |
created | 2019-06-27 22:26:47.048512 | The timestamp for your request. |
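
The response keys above can be read like any other JSON object. The following sketch parses a sample response assembled from the example values in the table and converts each word's millisecond timings to seconds. The sample is trimmed to the single word shown in the table, and the helper name `word_timings` is hypothetical.

```python
import json

# Sample built from the example column above, trimmed to one word.
sample_response = """
{
  "id": "5551722-f677-48a6-9287-39c0aafd9ac1",
  "status": "completed",
  "confidence": 0.956,
  "text": "You know Demons on TV like...",
  "words": [{"confidence": 1.0, "end": 440, "start": 0, "text": "You"}],
  "created": "2019-06-27 22:26:47.048512"
}
"""


def word_timings(response):
    """Return (text, start_seconds, end_seconds, confidence) for each word."""
    return [
        (w["text"], w["start"] / 1000, w["end"] / 1000, w["confidence"])
        for w in response["words"]
    ]


response = json.loads(sample_response)
if response["status"] == "completed":
    for text, start, end, confidence in word_timings(response):
        print(f"{text}: {start:.2f}s to {end:.2f}s (confidence {confidence})")
```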