Introducing Entity Detection - Detect Named Entities in Audio/Video

We are thrilled to announce the release of the latest feature for our Speech-to-Text API: Entity Detection! As an introduction to this new feature, we’ll look at what Entity Detection is, how it works, how to use it, and several use cases.

What is Entity Detection?

Entity Detection is a feature that identifies and categorizes key information in text, such as a transcript generated by Automatic Speech Recognition (ASR) technology.

For example, you could use Entity Detection to locate the names of people, organizations, or other “entities” such as addresses, phone numbers, social security numbers, locations, and more.

Generally, Entity Detection is a two-step process: (1) identifying entities (such as people's names, organization names, etc.) and (2) classifying the entities that were identified (such as “location” or “occupation”).

For example, you might identify the entity “New York City” and classify it as a “location”, or identify the entity “AssemblyAI” and classify it as a “company”.
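
The two steps can be illustrated with a toy sketch. This is not how AssemblyAI's models work: the lookup table and function names below are hypothetical, and a production system would rely on trained models rather than capitalization rules and a fixed dictionary.

```python
import re

# Hypothetical lookup table, for illustration only.
CATEGORIES = {
    "New York City": "location",
    "AssemblyAI": "company",
}

def identify_candidates(text):
    """Step 1: find candidate entities (here, runs of capitalized words)."""
    return re.findall(r"[A-Z]\w*(?:\s+[A-Z]\w*)*", text)

def classify(candidate):
    """Step 2: assign a category, or None if the candidate is not a known entity."""
    return CATEGORIES.get(candidate)

def detect_entities(text):
    """Run both steps and keep only candidates that received a category."""
    results = []
    for candidate in identify_candidates(text):
        category = classify(candidate)
        if category is not None:
            results.append((candidate, category))
    return results

print(detect_entities("She joined AssemblyAI after moving to New York City."))
# [('AssemblyAI', 'company'), ('New York City', 'location')]
```

Note that “She” is picked up as a candidate in step 1 but dropped in step 2, which is why identification and classification are treated as separate stages.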

How Does AssemblyAI Entity Detection Work?

The AssemblyAI Entity Detection feature automatically detects a wide range of entities found in your transcription text, and then returns them in the JSON response when fetching a completed transcription from the API.

For example, say you want to detect the following entities:

  • Name
  • Phone Number
  • Occupation

And your transcription text was:

John Bishop is a photographer from Chicago, Illinois. His phone number
is 111-333-1111.

AssemblyAI’s Entity Detection feature would output these associated entities:

  • John Bishop - “person_name”
  • 111-333-1111 - “phone_number”
  • Photographer - “occupation”

This is what the API response would look like when you enable the feature:

{
	...
	"entity_detection": true,
	"entities": [
		{
			"entity_type": "person_name",
			"text": "John Bishop",
			"start": 100,
			"end": 140
		},
		{
			"entity_type": "phone_number",
			"text": "111-333-1111",
			"start": 140,
			"end": 160
		},
		{
			"entity_type": "occupation",
			"text": "photographer",
			"start": 214,
			"end": 229
		}
	]
}

Note that the detected entities appear under the entities key, towards the bottom of the JSON response, once your transcript is complete.
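
Once you have the response, the entities list is easy to work with. The following Python sketch groups the detected entities by type. It uses a trimmed copy of the example response (other transcript fields omitted); the function name group_entities is our own, and the handling of a missing or null entities key is an assumption about what a response without the feature enabled might look like.

```python
import json
from collections import defaultdict

# Trimmed copy of the example response (other transcript fields omitted).
response_body = """
{
  "entity_detection": true,
  "entities": [
    {"entity_type": "person_name", "text": "John Bishop", "start": 100, "end": 140},
    {"entity_type": "phone_number", "text": "111-333-1111", "start": 140, "end": 160},
    {"entity_type": "occupation", "text": "photographer", "start": 214, "end": 229}
  ]
}
"""

def group_entities(transcript):
    """Map each entity_type to the list of entity texts found for it."""
    grouped = defaultdict(list)
    # Assumption: "entities" may be absent or null when the feature is disabled.
    for entity in transcript.get("entities") or []:
        grouped[entity["entity_type"]].append(entity["text"])
    return dict(grouped)

transcript = json.loads(response_body)
print(group_entities(transcript))
```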

Entity Types

Here are our currently supported entity types:

Using the Entity Detection Feature

To enable Entity Detection when transcribing your content with the API, set the entity_detection parameter to true in the body of your POST request to /v2/transcript. Entity Detection is disabled by default.

For example, in cURL:

curl --request POST \
	--url https://api.assemblyai.com/v2/transcript \
	--header 'authorization: YOUR-API-TOKEN' \
	--header 'content-type: application/json' \
	--data '{"audio_url": "https://foo.bar/7510.mp3", "entity_detection": true}'
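
If you prefer Python, the same request can be sketched with the standard library alone. The URL, headers, and body mirror the cURL command; the helper names build_transcript_request and submit are our own, and the actual network call is left commented out because it needs a real API token and audio URL.

```python
import json
import urllib.request

API_URL = "https://api.assemblyai.com/v2/transcript"

def build_transcript_request(audio_url, api_token):
    """Build the POST request for a transcript with Entity Detection enabled."""
    payload = {"audio_url": audio_url, "entity_detection": True}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "authorization": api_token,
            "content-type": "application/json",
        },
        method="POST",
    )

def submit(request):
    """Send the request and return the decoded JSON response (needs network access)."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))

request = build_transcript_request("https://foo.bar/7510.mp3", "YOUR-API-TOKEN")
# response = submit(request)  # uncomment once you have a real token and audio URL
```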

Entities are returned as a list, and each entity includes its entity type, its text, and start and end timestamps.
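
As a rough sketch of how the timestamps might be used, the Python function below orders entities by start time and prints a readable line for each. It assumes start and end are millisecond offsets into the audio, which this post does not state explicitly; the function name describe_entities is our own.

```python
def describe_entities(entities):
    """Return one line per entity, ordered by start time.

    Assumes start/end are millisecond offsets into the audio.
    """
    lines = []
    for entity in sorted(entities, key=lambda e: e["start"]):
        start_s = entity["start"] / 1000
        end_s = entity["end"] / 1000
        lines.append(
            f'{start_s:.3f}s-{end_s:.3f}s  {entity["entity_type"]}: {entity["text"]}'
        )
    return lines

# Two entities from the example response, deliberately out of order.
entities = [
    {"entity_type": "occupation", "text": "photographer", "start": 214, "end": 229},
    {"entity_type": "person_name", "text": "John Bishop", "start": 100, "end": 140},
]
for line in describe_entities(entities):
    print(line)
```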

Check out the AssemblyAI Entity Detection docs for more details.

Top Use Cases

Why is Entity Detection important? Entity Detection can be an extremely valuable data collection and analytical tool for a wide range of industries.

For example:

  • Telephony and CRM Platforms: Identify specific people, companies, or competitor names and automatically populate associated fields. Improve customer response time by categorizing conversations.
  • Hiring Platforms: Identify certain roles, positions, companies, salaries, and more, and automatically populate associated fields. Quickly sort through resumes and CVs to facilitate the hiring process.
  • Virtual Meeting Platforms: Identify specific people, companies, or competitor names and automatically populate associated fields. Analyze topics of conversation, participants, locations, and more.
  • Voice Bots: Identify people, companies, or competitor names and automatically trigger associated actions to automate and personalize interactions.
  • Medical: Identify conditions, statistics, medicines, injuries, and more to sort patient information and analyze results.
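
To make “automatically populate associated fields” concrete, here is a minimal Python sketch that copies detected entities into a record. The field names (contact_name, contact_phone, job_title) and the mapping itself are hypothetical; a real integration would use whatever fields your platform defines.

```python
# Hypothetical mapping from entity types to CRM field names.
FIELD_FOR_ENTITY_TYPE = {
    "person_name": "contact_name",
    "phone_number": "contact_phone",
    "occupation": "job_title",
}

def populate_fields(entities):
    """Fill each mapped field with the text of the first matching entity."""
    record = {}
    for entity in entities:
        field = FIELD_FOR_ENTITY_TYPE.get(entity["entity_type"])
        # Skip unmapped entity types and keep the first value seen per field.
        if field is not None and field not in record:
            record[field] = entity["text"]
    return record

entities = [
    {"entity_type": "person_name", "text": "John Bishop", "start": 100, "end": 140},
    {"entity_type": "phone_number", "text": "111-333-1111", "start": 140, "end": 160},
    {"entity_type": "occupation", "text": "photographer", "start": 214, "end": 229},
]
print(populate_fields(entities))
```

Keeping only the first match per field is one simple policy; depending on the use case, you might instead collect every match or prefer the most recent one.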

By collecting this entity information, you empower your company with invaluable customer or employee information, regardless of industry. Then, you can perform analytics to boost understanding of customers, adjust marketing campaigns, modify products, and much more.