2020 saw the rise of virtual events and meetings, brought about by the start of the COVID-19 pandemic. Rather than remaining a one-year novelty, however, virtual and hybrid events are now an enduring and dominant trend across a diverse range of industries. To stay relevant and profitable, conference and event hosts have had to quickly become virtual and hybrid event experts who deliver high-quality, engaging experiences for attendees, regardless of event format.
Thankfully, hybrid event solutions and platforms are working to support hosts as they make this transition. These solutions are designed to help hosts plan, market, execute, and follow up on all event-related activities, working together to build the most engaging experience for all attendees.
To build these complex hybrid event platforms, many companies are turning to AI-backed Automatic Speech Recognition (ASR), Natural Language Processing (NLP), and Natural Language Understanding (NLU) tools.
This article examines how these ASR, NLP, and NLU tools and AI Models help hybrid event solutions:
- Quickly discover and surface important event sections to create highlights and summaries.
- Add video subtitles in real time for better accessibility and compliance.
- Create transcripts of videos for better searchability, indexing, and discovery.
First, we’ll look at exactly what a hybrid event solution is before diving into the top three ways that ASR, NLP, and NLU tools serve as the foundation of the best hybrid event platforms today.
What are Hybrid Event Solutions?
Hybrid event solutions are platforms that offer all-in-one event management for virtual, hybrid, in-person, and internal conferences and events. The goal of a hybrid event solution is to facilitate easier event planning, more engaging event environments, and seamless post-event connection.
According to Hopin, the overall aim of a hybrid event solution is to help brands “create a truly engaging virtual event that differentiates brand[s] and enables [them] to exceed [their] event goals.”
Hybrid event solutions also support best practices for digital events, including virtual and in-person networking, event personalization, attendee engagement and support, and more.
Many hybrid event solutions are investing in AI, Deep Learning (DL), and Machine Learning (ML) tools to optimize these services.
Speech-to-Text and Audio Intelligence for Hybrid Event Tools
One of the AI-powered tools attracting this investment is Automatic Speech Recognition, or ASR. Today’s ASR systems are more accurate, affordable, and accessible than ever thanks to major advances in AI and Deep Learning research. Top Speech-to-Text APIs, for example, can transcribe conference videos both asynchronously and in real time at near-human accuracy.
In addition, Audio Intelligence APIs apply NLP and NLU technology on top of this transcription data, helping hybrid event solutions quickly build high ROI features and applications that serve as the “brains” of their offerings. This could include automatically surfacing key sections of a webinar or talk, building accessibility features, or making video content more easily searchable.
Let’s explore the three biggest ways Speech-to-Text and Audio Intelligence technology can help build standout hybrid event tools:
1. Discovering Key Highlights
First, Speech-to-Text and Audio Intelligence APIs can help hybrid event solutions create tools that quickly surface important sections of talks or keynote presentations. These “soundbites” can then be used to create short clips for social media sharing or concise summaries of each talk that a user attends. When used for summarization, this process also increases attendee engagement because attendees can focus more on the event itself and less on note-taking.
There are a few Audio Intelligence APIs that help support this feature. First, Auto Chapters, or Text Summarization, APIs generate “summaries over time” for transcription texts. An Auto Chapters API segments an audio or video stream into logical chapters, or points where conversational topics change. Then, the API outputs a single sentence headline and multi-sentence summary for each of these chapters.
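As a rough illustration of how chapter output could feed a highlights feature, the sketch below renders a list of chapters as a plain-text highlight reel. The `Chapter` fields (`start_ms`, `end_ms`, `headline`, `summary`) and the sample data are assumptions made for this example, not the schema of any particular API.

```python
from dataclasses import dataclass


@dataclass
class Chapter:
    """One chapter of a talk (hypothetical shape, not a specific vendor's schema)."""
    start_ms: int
    end_ms: int
    headline: str
    summary: str


def format_timestamp(ms: int) -> str:
    """Convert milliseconds to an H:MM:SS string."""
    total_seconds = ms // 1000
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours}:{minutes:02d}:{seconds:02d}"


def render_highlights(chapters: list[Chapter]) -> str:
    """Render each chapter as a timestamped headline followed by its summary."""
    lines = []
    for chapter in chapters:
        span = f"{format_timestamp(chapter.start_ms)}-{format_timestamp(chapter.end_ms)}"
        lines.append(f"[{span}] {chapter.headline}")
        lines.append(f"    {chapter.summary}")
    return "\n".join(lines)


# Illustrative data only.
chapters = [
    Chapter(0, 95_000, "Welcome and agenda",
            "The host opens the session and outlines the agenda."),
    Chapter(95_000, 742_000, "Keynote: hybrid events",
            "The speaker explains how hybrid formats changed event planning."),
]
print(render_highlights(chapters))
```

The start and end times of each chapter are also what a clipping tool would need to cut a short video for social sharing.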
In addition to Text Summarization, some APIs can also detect important words and phrases in the transcription text and extract these for further analysis or use. This could include words or phrases that are thematically important, recurring, or topically relevant. Word Search features also let end users search for specific sets of keywords in a transcription text to more easily locate needed information.
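A Word Search feature can be sketched in a few lines, assuming the transcript is available as a list of words with start times. The `(word, start_ms)` tuple shape and the sample words below are hypothetical, chosen only to show the idea.

```python
from collections import defaultdict

# Hypothetical word-level transcript: (word, start time in milliseconds).
words = [
    ("Welcome", 0), ("to", 400), ("the", 550), ("keynote.", 700),
    ("Today's", 1500), ("keynote", 1900), ("covers", 2400), ("accessibility.", 2900),
]


def normalize(token: str) -> str:
    """Lowercase a token and strip surrounding punctuation."""
    return token.strip(".,!?;:\"'()").lower()


def word_search(words, keywords):
    """Return a mapping of each matched keyword to the times it was spoken."""
    wanted = {normalize(keyword) for keyword in keywords}
    hits = defaultdict(list)
    for token, start_ms in words:
        key = normalize(token)
        if key in wanted:
            hits[key].append(start_ms)
    return dict(hits)


print(word_search(words, ["keynote", "accessibility"]))
# {'keynote': [700, 1900], 'accessibility': [2900]}
```

Matching on normalized tokens keeps the search insensitive to casing and trailing punctuation, so "keynote." and "Keynote" both count as hits.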
2. Meeting Accessibility and Compliance Regulations
Another core function of a hybrid event solution is to help clients meet accessibility and compliance regulations. Fifteen percent of the world’s population has some form of disability. Professional events must proactively work to serve this community, regardless of the channel (virtual, in-person, or hybrid) in which the event takes place. There are also regulations that require this. For example, the Americans with Disabilities Act (ADA) requires audio descriptions and captioning to increase web accessibility. Sections 504 and 508 of the Rehabilitation Act also require captions for live video.
Thankfully, Speech-to-Text APIs can support both real-time and asynchronous transcription at extremely high accuracy. Both high accuracy and high readability are important to accessibility, although the latter is often overlooked.
For example, a transcript may be highly accurate–meaning it has correctly identified the words within an audio or video stream–but lacking in basic paragraph structure, punctuation and casing, or speaker labels. This transcript would have high accuracy but low readability, ultimately making it difficult to read and defeating the purpose of transcribing in the first place.
To combat this, the best Speech-to-Text APIs offer Automatic Casing and Punctuation Models and Paragraph Detection that automatically format transcription texts with proper capitalization, punctuation, sentence structure, paragraphs, and more, to boost readability and accessibility.
Moreover, Speaker Diarization APIs can automatically detect and label multiple speakers in an audio or video stream, taking the guesswork out of who spoke when during an event talk.
Compare the two transcription texts below, the first without Speaker Diarization and the second with it, to see the difference. Both use Automatic Casing and Punctuation.
Without Speaker Diarization:
But how did you guys first meet and how do you guys know each other? I
actually met her not too long ago. I met her, I think last year in
December, during pre season, we were both practicing at Carson a lot.
And then we kind of met through other players. And then I saw her a few
her last few torments this year, and we would just practice together
sometimes, and she's really, really nice. I obviously already knew who
she was because she was so good. Right. So. And I looked up to and I met
her. I already knew who she was, but that was cool for me. And then I
watch her play her last few events, and then I'm actually doing an
exhibition for her charity next month. I think super cool. Yeah. I'm
excited to be a part of that. Yeah. Well, we'll definitely highly
promote that. Vania and I are both together on the Diversity and
Inclusion committee for the USDA, so I'm sure she'll tell me all about
that. And we're really excited to have you as a part of that tournament.
So thank you so much. And you have had an exciting year so far. My
goodness. Within your first WTI 1000 doubles tournament, the Italian
Open. Congrats to that. That's huge. Thank you.
With Speaker Diarization:
<Speaker A> But how did you guys first meet and how do you guys know each
other?
<Speaker B> I actually met her not too long ago. I met her, I think last
year in December, during pre season, we were both practicing at Carson a
lot. And then we kind of met through other players. And then I saw her a
few her last few torments this year, and we would just practice together
sometimes, and she's really, really nice. I obviously already knew who
she was because she was so good.
<Speaker A> Right. So.
<Speaker B> And I looked up to and I met her. I already knew who she
was, but that was cool for me. And then I watch her play her last few
events, and then I'm actually doing an exhibition for her charity next
month.
<Speaker A> I think super cool.
<Speaker B> Yeah. I'm excited to be a part of that.
<Speaker A> Yeah. Well, we'll definitely highly promote that. Vania and
I are both together on the Diversity and Inclusion committee for the
USDA. So I'm sure she'll tell me all about that. And we're really
excited to have you as a part of that tournament. So thank you so much.
And you have had an exciting year so far. My goodness. Within your first
WTI 1000 doubles tournament, the Italian Open. Congrats to that. That's
huge.
<Speaker B> Thank you.
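A labeled transcript like the one above can be assembled from diarized output with very little code. The sketch below assumes the diarization step yields `(speaker, text)` pairs in spoken order; that shape is an assumption for illustration, and real APIs may return richer objects.

```python
from itertools import groupby

# Hypothetical diarized utterances: (speaker label, text), in spoken order.
utterances = [
    ("A", "But how did you guys first meet and how do you guys know each other?"),
    ("B", "I actually met her not too long ago."),
    ("B", "I met her, I think last year in December."),
    ("A", "Right. So."),
]


def format_diarized(utterances):
    """Merge consecutive utterances by the same speaker and prefix a label."""
    lines = []
    for speaker, group in groupby(utterances, key=lambda utterance: utterance[0]):
        text = " ".join(part for _, part in group)
        lines.append(f"<Speaker {speaker}> {text}")
    return "\n".join(lines)


print(format_diarized(utterances))
```

Grouping consecutive utterances keeps each speaker turn in a single block, which is what makes the labeled version easier to follow.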
Together, these ASR and NLP/NLU APIs ensure accessibility and compliance regulations are met.
3. Boosting Searchability, Indexing, and Discovery
Finally, hybrid event solutions must support the searchability, indexing, and discovery of all event content. Transcription, captions, and subtitles, discussed previously, all aid this effort. Adding Timestamps to transcripts also lets end users quickly scan transcription texts and match where in the video these statements occur for ease of use.
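To show how Timestamps connect transcript text back to the video, the sketch below turns timestamped segments into SubRip (SRT) caption blocks. The `(start_ms, end_ms, text)` segment shape and the sample timings are assumptions for this example.

```python
def srt_timestamp(ms: int) -> str:
    """Format milliseconds as the HH:MM:SS,mmm form used by SRT files."""
    hours, remainder = divmod(ms, 3_600_000)
    minutes, remainder = divmod(remainder, 60_000)
    seconds, millis = divmod(remainder, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def to_srt(segments) -> str:
    """Build SRT text: a counter, a time range, and the caption per block."""
    blocks = []
    for index, (start_ms, end_ms, text) in enumerate(segments, start=1):
        time_range = f"{srt_timestamp(start_ms)} --> {srt_timestamp(end_ms)}"
        blocks.append(f"{index}\n{time_range}\n{text}\n")
    return "\n".join(blocks)


# Hypothetical timestamped segments: (start_ms, end_ms, text).
segments = [
    (0, 2_500, "But how did you guys first meet?"),
    (2_500, 5_200, "I actually met her not too long ago."),
]
print(to_srt(segments))
```

The same start times could also drive "jump to this moment" links next to each line of a searchable transcript.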
In addition, Entity Detection is another Audio Intelligence API that can be used to support searchability or indexing.
Entity Detection, also referred to as Named Entity Recognition, is used to identify and classify important information in a transcription text. For example, Cal Tech is an entity that would be classified as a university. Entity Detection can also be used to create smart tags for each entity, aiding search and discovery algorithms.
Other common entities that can be detected include company names, locations, and occupations.
With Entity Detection, hybrid event solutions can help companies identify commonly recurring entities, such as company names, locations, occupations, etc., and aggregate them for further analysis. Or this data could be included in tools that support video or talk indexing and tagging, further aiding event content searchability and discovery by end users.
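The aggregation and tagging described above might look like the following sketch. It assumes entity detection has already produced `(entity_type, text)` pairs for each talk; the talk IDs and entities are made up for illustration.

```python
from collections import Counter

# Hypothetical entity detection output per talk: (entity_type, text) pairs.
detected = {
    "opening-keynote": [
        ("university", "Cal Tech"),
        ("location", "Carson"),
        ("university", "Cal Tech"),
    ],
    "panel": [
        ("location", "Carson"),
        ("occupation", "engineer"),
    ],
}


def aggregate_entities(detected):
    """Count how often each (type, text) entity occurs across all talks."""
    counts = Counter()
    for entities in detected.values():
        counts.update(entities)
    return counts


def build_tag_index(detected):
    """Map each lowercased entity text to the set of talks that mention it."""
    index = {}
    for talk_id, entities in detected.items():
        for _, text in entities:
            index.setdefault(text.lower(), set()).add(talk_id)
    return index


print(aggregate_entities(detected))
print(build_tag_index(detected)["carson"])
```

The counts support the "commonly recurring entities" analysis, while the tag index is the kind of lookup a search or discovery feature could query.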
Intelligent Hybrid Event Solutions
Hybrid event solutions need intelligent tools to help companies make the transition to virtual and hybrid events. Speech-to-Text and Audio Intelligence APIs help event platforms build the smart features that let companies and end users discover key highlights, meet accessibility and compliance requirements, and support event indexing and searchability.
And since these ASR and NLP/NLU tools are powered by cutting-edge AI, Machine Learning, and Deep Learning technology, hybrid event solutions built on them can keep pace with the state of the art and remain competitive for years to come.