The Entity Detection model automatically identifies and categorizes key information in transcribed audio content, such as the names of people, organizations, addresses, phone numbers, medical data, social security numbers, and more.
When submitting files for transcription, include the entity_detection parameter in your request body and set it to true.
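As a minimal sketch of such a request in Python, the snippet below builds a POST request whose JSON body enables entity detection. The endpoint URL, the `audio_url` field, the `authorization` header, and the helper name `build_transcript_request` are assumptions made for illustration; check them against the API reference before relying on them.

```python
import json
import urllib.request

# Assumed endpoint for submitting a transcription job.
API_URL = "https://api.assemblyai.com/v2/transcript"


def build_transcript_request(audio_url, api_key):
    """Build (but do not send) a transcription request with entity detection on."""
    payload = {
        "audio_url": audio_url,      # assumed name of the audio location field
        "entity_detection": True,    # the parameter described above
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "authorization": api_key,  # assumed authentication header
            "content-type": "application/json",
        },
        method="POST",
    )
```

Sending the request would then be a matter of passing the result to `urllib.request.urlopen` and decoding the JSON reply; that step is left out here so the sketch runs without network access or a real API key.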
All entities supported by the model
The model is designed to automatically detect and classify various types of entities within the transcription text. The detected entities and their corresponding types will be listed individually in the entities key of the response object, ordered by when they first appear in the transcript.
|Blood type (e.g., O-, AB positive)|
|Credit card verification code (e.g., CVV: 080)|
|Expiration date of a credit card|
|Credit card number|
|Specific calendar date (e.g., December 18)|
|Date of Birth (e.g., Date of Birth: March 7, 1961)|
|Medications, vitamins, or supplements (e.g., Advil, Acetaminophen, Panadol)|
|Name of an event or holiday (e.g., Olympics, Yom Kippur)|
|Email address (e.g., firstname.lastname@example.org)|
|Bodily injury (e.g., I broke my arm, I have a sprained wrist)|
|Name of a natural language (e.g., Spanish, French)|
|Any location reference including mailing address, postal code, city, state, province, or country|
|Name of a medical condition, disease, syndrome, deficit, or disorder (e.g., chronic fatigue syndrome, arrhythmia, depression)|
|Medical process, including treatments, procedures, and tests (e.g., heart surgery, CT scan)|
|Name and/or amount of currency (e.g., 15 pesos, $94.50)|
|Terms indicating nationality, ethnicity, or race (e.g., American, Asian, Caucasian)|
|Job title or profession (e.g., professor, actors, engineer, CPA)|
|Name of an organization (e.g., CNN, McDonald's, University of Alaska)|
|Number associated with an age (e.g., 27, 75)|
|Name of a person (e.g., Bob, Doug Jones)|
|Telephone or fax number|
|Terms referring to a political party, movement, or ideology (e.g., Republican, Liberal)|
|Terms indicating religious affiliation (e.g., Hindu, Catholic)|
|Social Security Number or equivalent|
|Driver's license number (e.g., DL #356933-540)|
|Banking information, including account and routing numbers|
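Because every detected entity carries one of the types above, a common follow-up step is to group the detected text by type. The sketch below assumes each entity is a dictionary with an `entity_type` key and a `text` key; `entity_type` is an assumed field name, and `group_by_type` is a hypothetical helper, not part of any SDK.

```python
from collections import defaultdict


def group_by_type(entities):
    """Map each entity type to the list of texts detected for it.

    Order within each list follows the order of the input, which the
    response keeps in order of first appearance in the transcript.
    """
    grouped = defaultdict(list)
    for entity in entities:
        # "entity_type" is an assumed key name; "text" is the key named in the docs.
        grouped[entity["entity_type"]].append(entity["text"])
    return dict(grouped)
```

Grouping this way makes it straightforward to pull out, for example, only the people or only the organizations mentioned in a recording.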
Understanding the response
The identified entities are stored under the entities key, which contains an array of objects, one for each entity identified in the transcribed text. Each object includes the entity type, the start and end timestamps, and the corresponding text. Use the text key to retrieve the detected entity itself.
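A short sketch of reading that array in Python follows. It assumes the response has already been parsed into a dictionary and that the type is stored under an `entity_type` key (an assumed name; only `entities` and `text` are named in this page). The sample values are taken from the example output that follows, and the timestamps are omitted rather than invented.

```python
def format_entities(response):
    """Return 'Entity: ...' / 'Type: ...' lines for each detected entity."""
    lines = []
    # Fall back to an empty list if the key is missing or null.
    for entity in response.get("entities") or []:
        lines.append(f"Entity: {entity['text']}")
        lines.append(f"Type: {entity['entity_type']}")  # assumed key name
    return lines


# Hand-built stand-in for a parsed response; real objects also carry
# start and end timestamps, which are left out here.
sample_response = {
    "entities": [
        {"entity_type": "event", "text": "Ted Talks"},
        {"entity_type": "event", "text": "Ted Conference"},
        {"entity_type": "person", "text": "Dan Gilbert"},
        {"entity_type": "occupation", "text": "psychologist"},
    ]
}

for line in format_entities(sample_response):
    print(line)
```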
# Entity: Ted Talks
# Type: event
# Entity: Ted Conference
# Type: event
# Entity: Dan Gilbert
# Type: person
# Entity: psychologist
# Type: occupation
The model is capable of identifying entities with variations in spelling or formatting. However, the accuracy of the detection may depend on the severity of the variation or misspelling.
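If the returned entity text still contains a spelling variant, one option is to map it onto a list of names you already know on the client side. The sketch below uses Python's standard `difflib` module for approximate matching; it is a post-processing idea, not something the model does, and the helper name `canonicalize` and the 0.8 cutoff are illustrative assumptions.

```python
import difflib


def canonicalize(text, known_names, cutoff=0.8):
    """Return the closest known name for a detected entity, or None.

    Comparison is case-insensitive; `cutoff` is the minimum similarity
    ratio (0 to 1) a candidate must reach to count as a match.
    """
    lookup = {name.casefold(): name for name in known_names}
    matches = difflib.get_close_matches(
        text.casefold(), list(lookup), n=1, cutoff=cutoff
    )
    return lookup[matches[0]] if matches else None
```

A higher cutoff reduces false merges between distinct names at the cost of missing heavier misspellings, so the value is worth tuning against your own data.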
The Entity Detection model does not currently support custom entity types. However, it can detect a wide range of predefined entity types, including people, organizations, locations, dates, times, addresses, phone numbers, medical data, and banking information, among others.
To improve the accuracy of the Entity Detection model, provide high-quality audio files with clear, distinct speech. Make sure the audio content is relevant to your use case and that the entities you want to detect are relevant to the intended analysis. Finally, it may help to review and adjust the model's configuration parameters, such as the confidence threshold for entity detection, to optimize the results.