
Entity Detection

The Entity Detection model lets you automatically identify and categorize key information in transcribed audio content.

Here are a few examples of what you can detect:

  • Names of people
  • Organizations
  • Addresses
  • Phone numbers
  • Medical data
  • Social security numbers

For the full list of entities that you can detect, see Supported entities.

Supported languages

Entity Detection is available in multiple languages. See Supported languages.


Enable Entity Detection by setting entity_detection to true in the transcription config.
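As a minimal sketch of that setting, the snippet below builds a JSON request body with entity_detection switched on. The helper name build_transcript_request is hypothetical and only for illustration; the two keys, audio_url and entity_detection, are the ones used in the API reference on this page.

```python
import json


def build_transcript_request(audio_url: str) -> str:
    """Return a JSON request body with Entity Detection enabled.

    Hypothetical helper for illustration; only the two keys below
    come from this page.
    """
    config = {
        "audio_url": audio_url,
        "entity_detection": True,  # serialized as JSON `true`
    }
    return json.dumps(config)


body = build_transcript_request("YOUR_AUDIO_URL")
print(body)
```

The resulting string can be sent as the request body with whichever HTTP client you already use.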

Example output

Timestamp: 2548 - 3130

the US
Timestamp: 5498 - 6350
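Output in this shape could be produced from the entities array with a few lines of Python. This is a sketch rather than the exact code behind the example: format_entities is a hypothetical helper, and the sample entry reuses the "the US" timestamps shown above with an assumed entity_type of location.

```python
def format_entities(entities):
    """Render each entity as its text followed by a timestamp line."""
    lines = []
    for entity in entities:
        lines.append(entity["text"])
        lines.append(f"Timestamp: {entity['start']} - {entity['end']}")
        lines.append("")  # blank line between entities
    return "\n".join(lines).rstrip()


# Sample entry: text and timestamps from the example output;
# the entity_type value is an assumption.
sample = [
    {"entity_type": "location", "text": "the US", "start": 5498, "end": 6350},
]
print(format_entities(sample))
```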


API reference


curl \
--header "Authorization: YOUR_API_KEY" \
--header "Content-Type: application/json" \
--data '{
"audio_url": "YOUR_AUDIO_URL",
"entity_detection": true
}'

entity_detection (boolean): Enable Entity Detection.

entities (array): An array of detected entities.
entities[i].entity_type (string): The type of entity for the i-th detected entity.
entities[i].text (string): The text for the i-th detected entity.
entities[i].start (number): The starting time, in milliseconds, at which the i-th detected entity appears in the audio file.
entities[i].end (number): The ending time, in milliseconds, for the i-th detected entity in the audio file.

The response also includes the request parameters used to generate the transcript.
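To work with these fields in a typed way, one option is to map each item of the entities array onto a small dataclass. The sketch below assumes the response has already been decoded into a Python dict; the Entity class and parse_entities function are hypothetical names, and the sample timestamps are taken from the example output.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    entity_type: str
    text: str
    start_ms: int  # `start` in the response, in milliseconds
    end_ms: int    # `end` in the response, in milliseconds

    @property
    def duration_seconds(self) -> float:
        """Length of the entity's span in the audio, in seconds."""
        return (self.end_ms - self.start_ms) / 1000


def parse_entities(response: dict) -> list[Entity]:
    """Convert the `entities` array of a decoded response into Entity objects."""
    return [
        Entity(
            entity_type=item["entity_type"],
            text=item["text"],
            start_ms=item["start"],
            end_ms=item["end"],
        )
        for item in response.get("entities") or []
    ]


# Sample response fragment; the entity_type value is an assumption.
response = {
    "entities": [
        {"entity_type": "location", "text": "the US", "start": 5498, "end": 6350},
    ]
}
for entity in parse_entities(response):
    print(entity.text, entity.entity_type, entity.duration_seconds)
```

Using `response.get("entities") or []` keeps the function safe when the key is missing or null, for example when Entity Detection was not enabled for the request.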

Supported entities

The model automatically detects and classifies various types of entities within the transcription text. The detected entities and their corresponding types are listed individually in the entities key of the response object, ordered by when they first appear in the transcript.
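Because each entry carries its own entity_type, it is often convenient to group the detected text by type while keeping the transcript order within each group. The following is a minimal sketch; group_by_type is a hypothetical helper, and the sample entries (including their timestamps) are invented for illustration using example values from the list below.

```python
from collections import defaultdict


def group_by_type(entities):
    """Group entity text by entity_type, preserving order of appearance."""
    grouped = defaultdict(list)
    for entity in entities:
        grouped[entity["entity_type"]].append(entity["text"])
    return dict(grouped)


# Invented sample data, already ordered by first appearance.
sample_entities = [
    {"entity_type": "person_name", "text": "Doug Jones", "start": 100, "end": 900},
    {"entity_type": "organization", "text": "CNN", "start": 1500, "end": 1900},
    {"entity_type": "person_name", "text": "Bob", "start": 2400, "end": 2700},
]
print(group_by_type(sample_entities))
```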

banking_information: Banking information, including account and routing numbers.
blood_type: Blood type (e.g., O-, AB positive).
credit_card_cvv: Credit card verification code (e.g., CVV: 080).
credit_card_expiration: Expiration date of a credit card.
credit_card_number: Credit card number.
date: Specific calendar date (e.g., December 18).
date_of_birth: Date of birth (e.g., Date of Birth: March 7, 1961).
drivers_license: Driver's license number (e.g., DL #356933-540).
drug: Medications, vitamins, or supplements (e.g., Advil, Acetaminophen, Panadol).
email_address: Email address.
event: Name of an event or holiday (e.g., Olympics, Yom Kippur).
injury: Bodily injury (e.g., I broke my arm, I have a sprained wrist).
language: Name of a natural language (e.g., Spanish, French).
location: Any location reference including mailing address, postal code, city, state, province, or country.
medical_condition: Name of a medical condition, disease, syndrome, deficit, or disorder (e.g., chronic fatigue syndrome, arrhythmia, depression).
medical_process: Medical process, including treatments, procedures, and tests (e.g., heart surgery, CT scan).
money_amount: Name and/or amount of currency (e.g., 15 pesos, $94.50).
nationality: Terms indicating nationality, ethnicity, or race (e.g., American, Asian, Caucasian).
occupation: Job title or profession (e.g., professor, actors, engineer, CPA).
organization: Name of an organization (e.g., CNN, McDonalds, University of Alaska).
password: Account passwords, PINs, access keys, or verification answers (e.g., 27%alfalfa, temp1234, My mother's maiden name is Smith).
person_age: Number associated with an age (e.g., 27, 75).
person_name: Name of a person, such as "Bob" and "Doug Jones".
phone_number: Telephone or fax number.
political_affiliation: Terms referring to a political party, movement, or ideology, such as "Republican" and "Liberal".
religion: Terms indicating religious affiliation, such as "Hindu" and "Catholic".
time: Expressions indicating clock times, such as "19:37:28" and "10pm EST".
url: Internet addresses.
us_social_security_number: Social Security Number or equivalent.
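When filtering results by type, it can help to check the type names you intend to use against this list first, since custom entity types are not detected. The sketch below copies the identifiers from the list above into a set; SUPPORTED_ENTITY_TYPES and unsupported_types are hypothetical names, not part of the API.

```python
# Entity type identifiers copied from the list above.
SUPPORTED_ENTITY_TYPES = frozenset({
    "banking_information", "blood_type", "credit_card_cvv",
    "credit_card_expiration", "credit_card_number", "date",
    "date_of_birth", "drivers_license", "drug", "email_address",
    "event", "injury", "language", "location", "medical_condition",
    "medical_process", "money_amount", "nationality", "occupation",
    "organization", "password", "person_age", "person_name",
    "phone_number", "political_affiliation", "religion", "time",
    "url", "us_social_security_number",
})


def unsupported_types(requested):
    """Return the requested type names that are not in the supported list."""
    return sorted(set(requested) - SUPPORTED_ENTITY_TYPES)


# "product_sku" is a made-up custom type used to show the check failing.
print(unsupported_types(["person_name", "location", "product_sku"]))
```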

Frequently asked questions

How does the Entity Detection model handle misspellings or variations of entities?

The model is capable of identifying entities with variations in spelling or formatting. However, the accuracy of the detection may depend on the severity of the variation or misspelling.

Can the Entity Detection model identify custom entity types?

No, the Entity Detection model doesn't support the detection of custom entity types. However, the model is capable of detecting a wide range of predefined entity types, including people, organizations, locations, dates, times, addresses, phone numbers, medical data, and banking information, among others.

How can I improve the accuracy of the Entity Detection model?

To improve accuracy, provide high-quality audio files with clear and distinct speech. Also make sure that the audio content is relevant to your use case and that the entities being detected are relevant to the intended analysis. Finally, it may help to review and adjust the model's configuration parameters, such as the confidence threshold for entity detection, to optimize the results.