
Entity Detection

The Entity Detection model automatically identifies and categorizes key information in transcribed audio content, such as names of people, organizations, addresses, phone numbers, medical data, Social Security numbers, and more.

Quickstart

When submitting files for transcription, include the entity_detection parameter in your request body and set it to true.
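The following is a minimal sketch in Python using the requests library against the v2 REST transcript endpoint; the API key and audio URL are placeholders, and error handling is omitted.

```python
import time
import requests

API_KEY = "YOUR_API_KEY"  # placeholder: your AssemblyAI API key
BASE_URL = "https://api.assemblyai.com/v2"
HEADERS = {"authorization": API_KEY}

# Submit an audio file for transcription with Entity Detection enabled.
response = requests.post(
    f"{BASE_URL}/transcript",
    headers=HEADERS,
    json={
        "audio_url": "https://example.com/audio.mp3",  # placeholder audio URL
        "entity_detection": True,
    },
)
transcript_id = response.json()["id"]

# Poll until the transcription completes (or errors out).
while True:
    result = requests.get(f"{BASE_URL}/transcript/{transcript_id}", headers=HEADERS).json()
    if result["status"] in ("completed", "error"):
        break
    time.sleep(3)
```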


All entities supported by the model

The model is designed to automatically detect and classify various types of entities within the transcription text. The detected entities and their corresponding types will be listed individually in the entities key of the response object, ordered by when they first appear in the transcript.

blood_type: Blood type (e.g., O-, AB positive)
credit_card_cvv: Credit card verification code (e.g., CVV: 080)
credit_card_expiration: Expiration date of a credit card
credit_card_number: Credit card number
date: Specific calendar date (e.g., December 18)
date_of_birth: Date of birth (e.g., Date of Birth: March 7, 1961)
drug: Medications, vitamins, or supplements (e.g., Advil, Acetaminophen, Panadol)
event: Name of an event or holiday (e.g., Olympics, Yom Kippur)
email_address: Email address (e.g., support@assemblyai.com)
injury: Bodily injury (e.g., I broke my arm, I have a sprained wrist)
language: Name of a natural language (e.g., Spanish, French)
location: Any location reference, including mailing address, postal code, city, state, province, or country
medical_condition: Name of a medical condition, disease, syndrome, deficit, or disorder (e.g., chronic fatigue syndrome, arrhythmia, depression)
medical_process: Medical process, including treatments, procedures, and tests (e.g., heart surgery, CT scan)
money_amount: Name and/or amount of currency (e.g., 15 pesos, $94.50)
nationality: Terms indicating nationality, ethnicity, or race (e.g., American, Asian, Caucasian)
occupation: Job title or profession (e.g., professor, actors, engineer, CPA)
organization: Name of an organization (e.g., CNN, McDonalds, University of Alaska)
person_age: Number associated with an age (e.g., 27, 75)
person_name: Name of a person (e.g., Bob, Doug Jones)
phone_number: Telephone or fax number
political_affiliation: Terms referring to a political party, movement, or ideology (e.g., Republican, Liberal)
religion: Terms indicating religious affiliation (e.g., Hindu, Catholic)
us_social_security_number: Social Security Number or equivalent
drivers_license: Driver's license number (e.g., DL #356933-540)
banking_information: Banking information, including account and routing numbers

Understanding the response

The identified entities are stored within the entities key, which contains an array of objects, one for each entity detected in the transcribed text. Each object includes the type of entity, start and end timestamps, and the corresponding text. The text key holds the exact text of the detected entity.
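As a sketch, continuing from the `result` JSON returned by the polling loop in the quickstart above (and assuming the entity type is exposed under an entity_type key alongside text, start, and end), you could list each detected entity like this, producing output similar to the example below:

```python
# Iterate over the detected entities in the completed transcript response.
for entity in result.get("entities", []):
    print(f"Entity: {entity['text']}")
    print(f"Type: {entity['entity_type']}")  # assumed key name for the entity type
    print()
```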

# Entity: Ted Talks
# Type: event
#
# Entity: Ted Conference
# Type: event
#
# Entity: Dan Gilbert
# Type: person
#
# Entity: psychologist
# Type: occupation

Troubleshooting

How does the Entity Detection model handle misspellings or variations of entities?

The model is capable of identifying entities with variations in spelling or formatting. However, the accuracy of the detection may depend on the severity of the variation or misspelling.

Can the Entity Detection model identify custom entity types?

No, the Entity Detection model currently does not support the detection of custom entity types. However, the model is capable of detecting a wide range of predefined entity types, including people, organizations, locations, dates, times, addresses, phone numbers, medical data, and banking information, among others.

How can I improve the accuracy of the Entity Detection model?

To improve the accuracy of the Entity Detection model, provide high-quality audio files with clear, distinct speech. In addition, make sure the audio content is relevant to your use case and that the entities being detected are relevant to the intended analysis. Finally, reviewing and adjusting the model's configuration parameters, such as the confidence threshold for entity detection, may help optimize results.