Inside AssemblyAI's NYC voice agents January 2026 meetup: Production insights from the front lines
100+ voice AI builders gathered at AssemblyAI's NYC office to discuss production insights, where voice agents break, and 2026 predictions.



Over 100 voice AI builders packed AssemblyAI's NYC office on January 28, 2026. The conversation had shifted entirely from "what if" to "how we fixed it in production." Attendees asked about pipeline redundancy, voicemail detection failures, and why female voices consistently outperform male ones in production.
That's where voice agents stand right now: widespread deployment, but a long way from satisfaction. According to AssemblyAI's report on What actually makes a good voice agent, 87% of respondents have deployed a voice agent to production. Yet 75% aren't satisfied with what they've built, leaving just 12% of builders happy with their current solution.
This meetup brought together the people actively closing that gap. The panel covered production architecture choices, what actually matters to customers, common failure modes, and where the industry is headed in 2026.

The speakers
AssemblyAI partnered with Hathora to host a panel of builders with hands-on production experience across different voice agent applications.
Blesson Abraham, Co-founder & CEO of Aviary AI, builds outbound voice agents for banks and credit unions, an industry where only 18% currently make outbound calls with any technology. Aviary handles use cases like welcome calls for new customers, account reactivation, and collections.
Craig Bonnoit, Co-Founder of Trellus, runs voice agent infrastructure at scale; his team processes around 3 million calls per month, mostly outbound. They've learned hard lessons about redundancy and what happens when vendors fail at high volume.
Luka Chkhetiani, Staff Researcher and Voice Agents Team Lead at AssemblyAI, shared perspectives on how production feedback shapes model development and why purpose-built models for specific use cases are becoming necessary.
Ryan Seams, VP of Customer Solutions at AssemblyAI, moderated the conversation.

What was discussed
Why satisfaction is so low and how to fix it
The 12% satisfaction number sparked immediate discussion: What separates successful deployments from the rest?
According to the panel, it starts before any code is written: you have to define what success actually means.
"When it's the CFO of a bank that's never done outbound calls, he wants all calls to be 100% perfect," Abraham explained. "You can't even get that with humans."
The builders who struggle are often chasing perfection on the wrong metrics, obsessing over a momentary delay or an imperfect response instead of asking whether the call accomplished its business objective.
Abraham's team tracks a metric they call "natural goodbye rate": the percentage of calls that end the way a human conversation would, rather than with a frustrated hang-up. Regardless of minor hiccups during the conversation, a natural goodbye indicates the call met its purpose.
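As a rough illustration of how a metric like this could be computed from call logs, here is a minimal sketch. The `CallRecord` type and the `end_reason` labels are hypothetical and are not Aviary's actual schema. How a call gets labeled as a natural goodbye, and which calls count in the denominator, are also assumptions.

```python
from dataclasses import dataclass


@dataclass
class CallRecord:
    call_id: str
    # Hypothetical label assigned after the call, e.g. by a reviewer or classifier.
    end_reason: str  # "natural_goodbye", "hang_up", ...


def natural_goodbye_rate(calls: list[CallRecord]) -> float:
    """Return the fraction of calls that ended with a natural goodbye."""
    if not calls:
        return 0.0  # avoid dividing by zero when there are no calls
    natural = sum(1 for c in calls if c.end_reason == "natural_goodbye")
    return natural / len(calls)


calls = [
    CallRecord("a", "natural_goodbye"),
    CallRecord("b", "natural_goodbye"),
    CallRecord("c", "hang_up"),
    CallRecord("d", "natural_goodbye"),
]
rate = natural_goodbye_rate(calls)
print(f"natural goodbye rate: {rate:.0%}")
```

Tracked over time, a single number like this makes it easier to judge whether a change helped calls reach their goal, independent of small imperfections within individual conversations.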
"Did it accomplish what you intended it to do? Maybe there was some redundancy in the conversation. That's okay," Abraham said. "It accomplished what it needed to do."

What customers actually care about
When you're building a voice agent, it's easy to over-index on metrics that matter to engineers but not to end users. The panel offered a reality check.
For outbound calls, two things matter most: latency and voice quality.
Abraham's team has driven end-to-end latency below 1.6 seconds (including Twilio). That's down from 3.5 seconds at launch, a number that made customers "pissed." Before that, their early consumer app had seven-second latency, which actually worked because it frustrated debt collectors.
On voice quality, the panel shared a surprising finding: female voices consistently outperform male voices across their deployments. Conversations last longer, response rates are higher, and the pattern holds across different client types.
"No science really behind it," Abraham admitted. "We've done a few different AB testing of voice types. Female voices have been performing way better for us than male voices have been."

Where voice agents still break
Some problems remain stubbornly unsolved, and voicemail detection topped the list.
"Voicemail detection sucks right now. Every vendor has been bad at this," Abraham said.
The usual failure mode is classifying a human as a voicemail, not the reverse. When someone answers with "Hello, this is Craig with AssemblyAI, how can I help you?" the system sometimes declares it a voicemail because the greeting sounds too polished.
Interruption handling is another ongoing challenge. End-of-thought detection varies dramatically by customer demographic; older customers (like those calling about life insurance) need longer pauses before the agent responds. One to two seconds of silence is normal in human conversation, but agents often jump in too quickly.
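One simple way to express demographic-dependent pauses is a per-segment silence threshold. The sketch below is illustrative only: the segment names, the millisecond values, and the `should_respond` helper are assumptions rather than anything the panel described, and a production system would likely combine silence with other end-of-turn signals.

```python
# Hypothetical silence thresholds, in milliseconds, before the agent may reply.
DEFAULT_PAUSE_MS = 1000
PAUSE_MS_BY_SEGMENT = {
    "life_insurance": 2000,  # assumed: callers who need longer pauses
}


def should_respond(silence_ms: int, segment: str) -> bool:
    """Return True once the caller has been silent long enough for this segment."""
    threshold = PAUSE_MS_BY_SEGMENT.get(segment, DEFAULT_PAUSE_MS)
    return silence_ms >= threshold


# The same 1.2 s pause ends the turn for a default caller,
# but not for a segment configured to wait longer.
print(should_respond(1200, "default"))
print(should_respond(1200, "life_insurance"))
```

Keeping the threshold in configuration rather than in code lets it be tuned per customer base without redeploying the agent.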
Background noise is a growing problem as more calls happen in unpredictable environments.

What's coming in 2026
The panel closed with predictions for the year ahead.
Chkhetiani pointed to purpose-built models as a major focus: systems designed specifically for voice agent use cases or conversational intelligence rather than general-purpose transcription. The tradeoff between broad capability and specialized performance is becoming clearer, and for production deployments, specialized often wins.
Abraham took a broader view. Consumer adoption is driving enterprise demand in ways that weren't true two years ago.
"Whether people like it or not, there's this undercurrent being driven by consumers embracing voice more and more," he said. Every Android commercial now features someone talking directly to Gemini. Apple is pushing Siri. Voice is becoming the expected interface.
Mark Zuckerberg recently noted that 97% of digital interactions are still text-based, but that's not how humans naturally communicate. The builders betting on voice expect that ratio to shift dramatically.
"Consumers are going to demand that this is the way businesses interact back," Abraham said.

Get the full report
For the complete picture based on responses from 450+ builders, download AssemblyAI's report on What actually makes a good voice agent. It covers what's working in production, where builders are struggling, and how the market is evolving as voice agents move from experimental to expected.
Want to join future events? Follow AssemblyAI on LinkedIn or Twitter/X for announcements.





