Build AI voice agents that qualify leads, book meetings, and handle outbound calls — then analyze every conversation with sentiment analysis, speaker diarization, and LLM-powered coaching scorecards. Powered by the fastest, most accurate speech-to-text.
Sales call scorecard (example): Rep: Sarah K. · Prospect: Acme Corp · 18 min · Talk / Listen: 38 / 62 · Sentiment: Positive · Panels: Sentiment timeline, Coaching suggestions
Sales managers review less than 2% of calls. Reps self-report outcomes. Competitor mentions, pricing objections, and buying signals vanish the moment the call ends. Meanwhile, manual lead qualification wastes hours that should be spent selling. Modern voice agents and AI-powered call intelligence — built on accurate streaming STT, a managed LLM, and sentiment analysis — capture everything and turn every conversation into structured, coachable data.
P50 median streaming latency for Universal-3 Pro Streaming.
Better alphanumeric accuracy than other providers.
SLA with SOC 2 Type 2 certification.
Audio processed daily in production.
Two ways to build
Ship an AI sales agent in an afternoon, or drop industry-leading STT into the conversation intelligence platform you already run.
Our proprietary voice stack via one WebSocket. Build lead qualification agents, appointment setters, and outbound dialers with zero infra to manage.
Free tier available · No credit card required
The STT and analytics layer for your conversation intelligence platform. Works natively with your preferred orchestrator and CRM integration.
No concurrency caps · Autoscaling included
Ingest call audio
Voice Agent API: single WebSocket for live agents. Or connect recordings from your dialer, Twilio, or call center platform.
Transcribe with speaker diarization
Speaker labels separate rep and customer. Sentiment analysis tracks emotional shifts. Entity detection catches competitor mentions and pricing.
Generate coaching scorecards
LLM Gateway produces talk/listen ratios, sentiment shift analysis, and specific coaching suggestions per rep. 25+ models across Claude, GPT, and Gemini.
Push to CRM
Summaries, action items, and deal risk scores pushed to Salesforce, HubSpot, or any CRM via webhook.
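The scorecard and CRM steps above can be sketched locally. This is a minimal illustration, not the LLM Gateway or any CRM API: it assumes diarized utterances arrive as dictionaries with hypothetical `speaker`, `start`, and `end` fields (times in milliseconds), and the `talk_listen_ratio` and `build_crm_payload` helpers and the payload field names are invented for this example.

```python
from collections import defaultdict


def talk_listen_ratio(utterances, rep_speaker):
    """Return (talk_pct, listen_pct) for the rep as whole percentages."""
    spoken_ms = defaultdict(int)
    for u in utterances:
        # Guard against malformed timestamps by clamping negative durations to 0.
        spoken_ms[u["speaker"]] += max(0, u["end"] - u["start"])
    total = sum(spoken_ms.values())
    if total == 0:
        return 0, 0
    talk = round(100 * spoken_ms[rep_speaker] / total)
    return talk, 100 - talk


def build_crm_payload(call_id, utterances, rep_speaker):
    """Assemble a JSON-serializable summary (hypothetical field names)."""
    talk, listen = talk_listen_ratio(utterances, rep_speaker)
    return {"call_id": call_id, "talk_pct": talk, "listen_pct": listen}


# Hypothetical diarized call: speaker "A" is the rep, "B" is the prospect.
utterances = [
    {"speaker": "A", "start": 0, "end": 38_000},
    {"speaker": "B", "start": 38_000, "end": 100_000},
]
payload = build_crm_payload("call-001", utterances, rep_speaker="A")
print(payload)
```

A payload like this could then be sent to a CRM webhook with any HTTP client. The endpoint and authentication depend on the CRM and are not shown here.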
Call intelligence (example): Talk ratio 38% · Listen ratio 62% · Sentiment timeline
Voice Agent API — recommended
# Voice Agent API: sales lead qualification agent
import asyncio
import json

import websockets

API_KEY = "YOUR_API_KEY"


def handle(event):
    # Placeholder handler (not part of the API): replace with your own logic.
    print(event.get("type"))  # transcript.user, reply.audio, tool.call, ...


async def run_agent():
    async with websockets.connect(
        "wss://agents.assemblyai.com/v1/ws",
        additional_headers={"Authorization": f"Bearer {API_KEY}"},
    ) as ws:
        # Configure the session: prompt, greeting, key terms, and voice.
        await ws.send(json.dumps({
            "type": "session.update",
            "session": {
                "system_prompt": (
                    "You are a sales qualification agent for Acme Corp. "
                    "Ask about budget, timeline, and decision-maker."
                ),
                "greeting": "Hi, thanks for your interest in Acme — how can I help?",
                "input": {"keyterms": ["Acme Pro", "Enterprise Plan", "tier-2"]},
                "output": {"voice": "ivy"},
            },
        }))
        # Stream audio in, get audio + transcript back
        async for msg in ws:
            handle(json.loads(msg))


if __name__ == "__main__":
    asyncio.run(run_agent())
Universal-3 Pro Streaming + LiveKit — BYO stack
# LiveKit + AssemblyAI STT in a cascading sales agent pipeline
from livekit.agents import Agent, AgentSession, TurnHandlingOptions
from livekit.plugins import assemblyai, cartesia, openai, silero


class SalesAgent(Agent):
    def __init__(self):
        super().__init__(
            instructions=(
                "You are a sales qualification agent for Acme Corp. "
                "Be concise. Qualify on budget, timeline, and authority."
            ),
        )


async def entrypoint(ctx):
    session = AgentSession(
        stt=assemblyai.STT(
            model="u3-rt-pro",
            min_turn_silence=100,
            max_turn_silence=1000,  # let buyers finish thoughts on email/phone numbers
            vad_threshold=0.3,
            keyterms_prompt=["Acme Pro", "Enterprise Plan", "tier-2"],
        ),
        llm=openai.LLM(model="gpt-4o"),
        tts=cartesia.TTS(),
        vad=silero.VAD.load(activation_threshold=0.3),
        turn_handling=TurnHandlingOptions(
            turn_detection="stt",
            endpointing={"min_delay": 0},  # avoid additive latency in STT-driven turns
        ),
    )
    await session.start(room=ctx.room, agent=SalesAgent())
Universal-3 Pro Streaming transcribes noisy phone audio with 94%+ accuracy, which is the difference between a captured competitor mention and a missed coaching opportunity.
Names, card numbers, addresses, and account IDs are masked before transcripts hit your CRM, data warehouse, or QA stack.
Topic detection, sentiment, and call outcomes are available on the live stream, so you can coach agents in the moment, not the next day.
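To make the masking step concrete, here is a small local stand-in. It is not AssemblyAI's PII redaction, which runs as part of transcription. It is a regex sketch that masks card-like digit runs in a transcript string before the string is stored anywhere. The `mask_card_numbers` helper and the 13 to 19 digit heuristic are assumptions made for illustration only.

```python
import re

# Heuristic: 13 to 19 digits, optionally separated by single spaces or hyphens.
CARD_RE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def mask_card_numbers(text, mask="#"):
    """Replace every digit inside a card-like run with the mask character."""
    def _mask(match):
        return re.sub(r"\d", mask, match.group(0))
    return CARD_RE.sub(_mask, text)


masked = mask_card_numbers("My card is 4111 1111 1111 1111, thanks.")
print(masked)
```

Short digit runs such as phone extensions or durations are left untouched because they fall below the 13-digit threshold. A production system would rely on model-based redaction, not a pattern like this.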
EdgeTier's customer CarTrawler reduced chat handling time by 25% through enhanced insights and agent optimization powered by AssemblyAI.
EdgeTier
The accuracy was strong, but the great documentation and unique models like Auto Chapters and Sentiment Analysis is what really won us over.
Nathan Webb, Product Manager — Aloware