Virtual interviews are on the rise: 60% of HR professionals now use or have used video interviews in the hiring process. Not surprisingly, the market for smart tools that help companies better facilitate and process these virtual interviews, including Hiring Intelligence Platforms, is also expanding.
Powered by cutting-edge Machine Learning and Deep Learning research, Hiring Intelligence Platforms do much more than host virtual interviews. The top platforms provide powerful tools that help companies transcribe interviews, create soundbites, support team collaboration, and boost interview equity.
One set of applications emerging from this Machine Learning and Deep Learning research includes Automatic Speech Recognition (ASR), Natural Language Processing (NLP), and Natural Language Understanding (NLU).
This article looks at how top ASR, NLP, and NLU tools and AI models can be integrated into Hiring Intelligence Platforms to help users:
- Automate asynchronous and real-time transcription for interview audio and video, supporting faster and easier candidate review.
- Generate highlights and key analysis, helping hiring managers spend less time on post-interview manual tasks and helping eliminate unconscious hiring biases.
- Surface insights that can be easily searched, tagged, and categorized, supporting seamless team collaboration.
First, we’ll explore what Hiring Intelligence Platforms do before examining the top three ways that ASR, NLP, and NLU applications can support the best Hiring Intelligence Platforms today.
What are Hiring Intelligence Platforms?
Hiring Intelligence Platforms, also referred to as Talent Intelligence Platforms, help companies screen, interview, and hire smarter for every posted job. At their core, these platforms help companies attract and hire more qualified candidates much faster and more efficiently than traditional hiring methods.
Tools offered by the platforms can assess a candidate’s soft skills, analyze tone, gauge emotions, examine proficiency, and more, all with the help of AI and Machine Learning.
And Hiring Intelligence Platforms boast impressive results. Users of one platform, Screenloop, realized on average a 20% reduction in time to hire, 60% less candidate drop-off, 50% fewer rejected offers, and 90% less time spent on manual tasks during the hiring process.
In addition, Hiring Intelligence Platforms help companies reduce or eliminate bias in the hiring process by using “ethical AI”. According to Modern Hire, AI can be used to automate interview scoring, increase transparency, and ensure a fair interview process for each candidate.
To accomplish the above, many Hiring Intelligence Platforms are investing in ASR and NLP/NLU technology to optimize their offerings.
Speech-to-Text and Audio Intelligence for Hiring Intelligence Platforms
The Automatic Speech Recognition, or ASR, systems available today are more accurate, affordable, and accessible than ever before. This is due to major advances in the Deep Learning and Machine Learning models that power ASR tools.
For example, the best Speech-to-Text APIs transcribe asynchronous and real-time audio and video data–like virtual interviews–at near human-level accuracy.
In addition, NLP and NLU tools can help Hiring Intelligence Platforms build high ROI features and tools on top of interview transcription data for platform users. Referred to as Audio Intelligence APIs, these high value tools can help users summarize interview responses, surface key points and highlights, analyze responses, and facilitate team collaboration at scale.
Now, let’s look more closely at the three biggest impacts Speech-to-Text and Audio Intelligence technology can have on Hiring Intelligence Platforms:
1. Automate Interview Transcription
Speech-to-Text APIs can automatically transcribe video interviews as interviews occur in real-time or after the fact. When looking to incorporate speech transcription into a platform, consider accuracy and transcript readability. Today’s top Speech-to-Text APIs should demonstrate high accuracy, or low Word Error Rate (WER).
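Since WER is the standard accuracy metric here, it helps to see how it is computed: the word-level edit distance between a reference transcript and the ASR output, divided by the number of words in the reference. A minimal sketch in Python:

```python
def word_error_rate(reference, hypothesis):
    """Word Error Rate: word-level edit distance between a reference
    transcript and an ASR hypothesis, divided by reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    # Levenshtein distance over words via dynamic programming.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,      # deletion
                          d[i][j - 1] + 1,      # insertion
                          d[i - 1][j - 1] + cost)  # substitution
    return d[len(ref)][len(hyp)] / len(ref)

# One substitution ("your" -> "her") over five reference words -> 0.2
print(word_error_rate("tell me about your experience",
                      "tell me about her experience"))
```

Lower is better: a WER of 0.2 means one word in five was transcribed incorrectly relative to the reference.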
But just as important–if not more so–is transcript readability.
What is transcript readability? Many transcription APIs output transcriptions that, though technically accurate, are missing sentence and paragraph structure, basic punctuation and casing, and speaker labels. This makes the transcripts difficult to read through, lessening the utility of the transcription.
See this shortened transcript of an interview, without formatting, as an example:
When i'm gone for a while but hes always supportive so that always takes a lot of stress off and lets me play and its a lot easier sure wow well back to your college and pro career i know you are a usc player and im sure that was an amazing team experience but a lot of college players dont go on to go pro even though theyre incredible players and the college level is very high its a shame that there isnt much more interest in college tennis but how do you talk about mindset shift from choosing tennis as a career versus like business or coding or something you know and then making that decision from usc to just go pro one of my goals when I was little was to always play professional i think maybe some people just want to go to college or get a scholarship and then end there but i knew i always wanted to continue my tennis
Thankfully, top Speech-to-Text APIs use Automatic Casing and Punctuation models and Paragraph Detection features that automatically add these missing features, making transcripts much more readable at a glance. Speaker Diarization APIs also increase transcript readability by automatically adding speaker labels to a transcript–a very important addition to interview transcripts.
Now, here’s what the same transcript would look like with these features added:
<Speaker A> When I'm gone for a while, but he's always supportive, so that always takes a lot of stress off and lets me play, and it's a lot easier. <Speaker B> Sure. Wow. Well, back to your college and pro career. I know you are a USC player and I'm sure that was an amazing team experience but a lot of college players don't go on to go pro, even though they're incredible players and the college level is very high. It's a shame that there isn't much more interest in college tennis. But how do you talk about mindset shift from choosing tennis as a career versus, like, business or coding or something, you know, and then making that decision from USC to just go pro? <Speaker A> One of my goals when I was little was to always play professional. I think maybe some people just want to go to college or get a scholarship and then end there, but I knew I always wanted to continue my tennis.
It’s significantly easier to read, right?
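On the platform side, turning a diarized API response into this labeled format is a small transformation. Assuming the transcription service returns utterances as a list of records with speaker and text fields (an illustrative shape, not any specific API's schema), it might look like:

```python
def format_transcript(utterances):
    """Render diarized utterances as labeled, readable transcript lines.

    `utterances` is assumed to be a list of dicts with "speaker" and
    "text" keys -- an illustrative response shape, not a real schema.
    """
    return "\n".join(
        f"<Speaker {utt['speaker']}> {utt['text']}" for utt in utterances
    )

# Toy diarized response for illustration:
utterances = [
    {"speaker": "A", "text": "One of my goals was to always play professional."},
    {"speaker": "B", "text": "Sure. Wow. Well, back to your college career."},
]
print(format_transcript(utterances))
```

Each utterance arrives already punctuated and cased by the upstream models, so the platform only needs to attach the speaker labels.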
Automatic interview transcription helps hiring managers spend less time on manual interview tasks, like response review.
Sometimes, companies using a Hiring Intelligence Platform will need to make the interview transcript anonymous, perhaps to support a more ethical or unbiased interview process. To support this anonymity, platforms can use PII Redaction APIs to automatically remove or redact Personally Identifiable Information (PII) such as a candidate’s name. Platforms may even want to help companies remove other attributes of potential bias, such as age, gender, location, etc. A PII Redaction API typically replaces each specified piece of PII with a placeholder, such as a string of hash characters, so the rest of the transcript stays intact.
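Real PII Redaction APIs rely on trained models, but the substitution mechanic itself is simple: each detected span is replaced with hash characters of the same length so the transcript's alignment is preserved. A minimal sketch, using illustrative regexes as stand-ins for two easy PII types:

```python
import re

# Illustrative patterns only -- production PII detection uses trained
# models, not regexes. These cover two easily pattern-matched cases.
PII_PATTERNS = {
    "email_address": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "phone_number": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}

def redact(text, pii_types=("email_address", "phone_number")):
    """Replace each detected PII span with same-length hash characters,
    preserving the transcript's length and word alignment."""
    for pii_type in pii_types:
        pattern = PII_PATTERNS[pii_type]
        text = pattern.sub(lambda m: "#" * len(m.group()), text)
    return text

print(redact("Reach me at jane@example.com or 555-123-4567."))
```

Same-length replacement keeps any word-level timestamps from the transcription step valid after redaction.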
PII types that can typically be redacted include names, email addresses, phone numbers, physical addresses, and dates of birth.
Transcription is also a necessary first step to unlock some of the most powerful Audio Intelligence tools, such as generating highlights and key analysis and surfacing insights.
2. Generate Highlights and Key Analysis
In addition to providing an accurate, highly readable transcript, Hiring Intelligence Platforms must generate highlights and key analysis for users. This could include automatically searching transcripts for relevant skills or experience, creating highlight reels of key talking points, or analyzing a candidate’s overall behavior to determine best fit for the available role.
There are a few Audio Intelligence APIs that operate both individually and collectively to produce this sophisticated analysis.
First, Auto Chapters or Text Summarization APIs create short summaries or soundbites for each section of the interview. Text Summarization APIs work by:
- Breaking an audio or video file transcription into logical chapters, or where the topic of conversation changes (like a new interview question).
- Generating both a single sentence headline and multi-sentence summary for each of these chapters.
With Text Summarization, hiring managers can more easily review candidate responses to key interview questions or even create video snippets of desired responses based on interview time stamps in the transcription text.
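If the summarization step returns chapters with start/end timestamps and generated headlines (an assumed, illustrative response shape), building reviewable snippet markers is a simple transformation:

```python
def chapters_to_snippets(chapters):
    """Turn chapter records into human-readable review snippets.

    Each chapter is assumed to carry millisecond timestamps plus an
    auto-generated headline -- an illustrative schema only.
    """
    snippets = []
    for ch in chapters:
        start = ch["start_ms"] // 1000  # convert ms to whole seconds
        end = ch["end_ms"] // 1000
        snippets.append(
            f"[{start // 60:02d}:{start % 60:02d}-"
            f"{end // 60:02d}:{end % 60:02d}] {ch['headline']}"
        )
    return snippets

chapters = [
    {"start_ms": 0, "end_ms": 95_000,
     "headline": "Candidate introduces background"},
    {"start_ms": 95_000, "end_ms": 210_000,
     "headline": "Discussion of team leadership experience"},
]
for snippet in chapters_to_snippets(chapters):
    print(snippet)
```

The same timestamps can drive video players directly, letting a reviewer jump straight to the chapter instead of scrubbing through the recording.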
Next, Entity Detection APIs, also referred to as Named Entity Recognition APIs, can be used to identify and classify important recurring information in an interview transcript. For example, French is an entity that would be classified as a language.
Entities that can commonly be detected in a text include people, organizations, locations, languages, occupations, and dates.
Topic Detection APIs work similarly to Entity Detection. Topic Detection APIs identify and label important or recurring topics in a transcription text. Topics can be ascribed according to the 698 topics delineated in the standardized IAB Taxonomy.
With Entity Detection and Topic Detection, hiring managers can find mentions of relevant skills and experience or other topics of interest. They can perform analysis across interviews as well–of the same candidate or across different ones–to find trends or flag recurring topics for further follow-up.
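Cross-interview trend analysis like this can be as simple as counting how many interviews each detected topic appears in. A sketch, assuming Topic Detection has already returned a list of topic labels per interview (the input shape here is illustrative):

```python
from collections import Counter

def recurring_topics(interviews, min_count=2):
    """Flag topics that recur across a set of interviews.

    `interviews` maps an interview id to the list of topic labels a
    Topic Detection step returned for it (illustrative shape).
    """
    counts = Counter()
    for topics in interviews.values():
        counts.update(set(topics))  # count each topic once per interview
    return [topic for topic, c in counts.items() if c >= min_count]

interviews = {
    "candidate_1": ["Careers>Remote Working", "Technology & Computing"],
    "candidate_2": ["Careers>Remote Working", "Business and Finance"],
}
# Flags the topic both candidates raised: ["Careers>Remote Working"]
print(recurring_topics(interviews))
```

Deduplicating topics per interview before counting keeps one talkative candidate from inflating a topic into a "trend" on their own.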
Finally, Sentiment Analysis APIs can be used to identify and label speech segments as positive, negative, or neutral. Hiring Intelligence Platforms can use Sentiment Analysis APIs to track a candidate’s emotional response toward a particular job attribute or interview question, helping give hiring managers even deeper insights into candidate behaviors.
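Tracking emotional response per question then reduces to tallying the sentiment labels attached to each answer segment. A sketch, with the segment fields assumed for illustration:

```python
def sentiment_by_question(segments):
    """Tally sentiment labels per interview question.

    Each segment is assumed to carry the question it answers and a
    sentiment label of POSITIVE, NEGATIVE, or NEUTRAL (illustrative
    field names, not any specific API's schema).
    """
    tally = {}
    for seg in segments:
        question = seg["question"]
        tally.setdefault(question,
                         {"POSITIVE": 0, "NEGATIVE": 0, "NEUTRAL": 0})
        tally[question][seg["sentiment"]] += 1
    return tally

segments = [
    {"question": "Why this role?", "sentiment": "POSITIVE"},
    {"question": "Why this role?", "sentiment": "POSITIVE"},
    {"question": "How do you feel about on-call duties?",
     "sentiment": "NEGATIVE"},
]
print(sentiment_by_question(segments))
```

A hiring manager scanning these tallies can quickly spot which questions drew consistently negative responses across an interview.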
When combined in one Hiring Intelligence Platform, these Audio Intelligence APIs can unlock valuable data about candidates and interviews.
3. Categorize, Tag, and Search Insights
In addition to identifying trends, surfacing highlights, and supporting key analysis, Hiring Intelligence Platforms can use Speech Transcription and Audio Intelligence to categorize, tag, and search candidate interviews.
For example, platforms could use this data to generate indexable categories that users can then tag or search, similar to hashtags on platforms like Twitter.
These “auto” or “smart” tags can be used for internal collaboration or for automatically attaching notes to indexed sections of interviews. Smart tags also make searching through and screening responses much faster and more efficient for hiring managers. Once a key tag is indicated, hiring managers can even share the tagged video content for review with other key members of a hiring committee, without having to sift through the entire video response.
The Auto Chapters, Entity Detection, and Topic Detection APIs discussed above work together to create this smart tagging system feature. Entity Detection and Topic Detection APIs, for example, can identify common themes in interview responses. Then, the Hiring Intelligence Platform can use this data to generate smart tags that make these themes searchable. Auto Chapters APIs can then be used to auto-generate short contextual summaries to attach to the appropriate tag.
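One way to sketch how this pipeline fits together: normalize each chapter's detected topics into hashtag-style tags, then index the chapter's auto-generated summary under each tag. The field names below are illustrative assumptions, not a real API schema:

```python
def build_tag_index(chapters):
    """Build a searchable smart-tag index from chapter-level analysis.

    Each chapter is assumed to carry detected topics plus a short
    auto-generated summary (illustrative field names only).
    """
    index = {}
    for ch in chapters:
        for topic in ch["topics"]:
            tag = "#" + topic.lower().replace(" ", "-")
            index.setdefault(tag, []).append(ch["summary"])
    return index

chapters = [
    {"topics": ["Team Leadership"],
     "summary": "Led a five-person analytics team."},
    {"topics": ["Team Leadership", "Remote Work"],
     "summary": "Managed a distributed team across time zones."},
]
index = build_tag_index(chapters)
print(sorted(index))  # -> ['#remote-work', '#team-leadership']
```

Looking up a tag then returns every chapter summary where the theme appeared, which is exactly what a reviewer shares with the rest of the hiring committee instead of the full video.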
These APIs all combine to create a seamless hiring experience for candidates and hiring committees alike.
Competitive Hiring Intelligence Platforms
Hiring Intelligence Platforms must invest in State-of-the-Art Machine Learning and Deep Learning-powered technology to create highly useful and powerful tools for users. Speech-to-Text and Audio Intelligence are one subset of this intelligent technology, helping platforms create accurate, readable transcripts, generate highlights and analysis, and build smart tags for key segments of interviews.
Used strategically, Speech Transcription and Audio Intelligence can support an easy transition to virtual screenings/interviews and hiring, smarter hiring decisions, and higher job retention.