Solutions

Voice agents for meeting intelligence & AI notetakers

Build AI meeting assistants that transcribe every conversation, identify speakers, and generate structured notes with action items, key decisions, and chapter summaries. Powered by the fastest, most accurate speech-to-text with built-in Speech Understanding.

Meeting notes — auto-generated

Q3 planning sync · 4 speakers · 42 min

Key decisions

Launch date moved to Sept 15. Budget approved for 2 additional engineers. Partner integration deprioritized to Q4.

Action items

Sarah: draft revised timeline by Friday. Mike: open 2 eng reqs in Greenhouse. Priya: update stakeholder deck

Chapters

0:00 Status update · 8:12 Timeline discussion · 22:40 Resourcing · 35:15 Next steps

Delphi
Happy Scribe
Granola
Supernormal
Runway
Ashby
Jiminny
JotPsych
Earmark
EdgeTier
Genio
Grain
Loop
Calabrio
Veed.io
Dovetail
WhatConverts
CallRail
Delphi
Happy Scribe
Granola
Supernormal
Runway
Ashby
Jiminny
JotPsych
Earmark
EdgeTier
Genio
Grain
Loop
Calabrio
Veed.io
Dovetail
WhatConverts
CallRail
Delphi
Happy Scribe
Granola
Supernormal
Runway
Ashby
Jiminny
JotPsych
Earmark
EdgeTier
Genio
Grain
Loop
Calabrio
Veed.io
Dovetail
WhatConverts
CallRail
Delphi
Happy Scribe
Granola
Supernormal
Runway
Ashby
Jiminny
JotPsych
Earmark
EdgeTier
Genio
Grain
Loop
Calabrio
Veed.io
Dovetail
WhatConverts
CallRail
The problem

Meetings end and the context disappears

Teams spend 31 hours per month in meetings, yet most decisions, action items, and commitments vanish within 24 hours. Manual note-taking splits attention, shared docs are incomplete, and recordings sit unwatched. AI meeting assistants — built on accurate speech-to-text, speaker diarization, and LLM-powered summarization — capture everything so your team can stay present during the conversation and act on the outcomes after.

Built for meeting transcription performance

Latency ~150ms

P50 median streaming latency for real-time meeting assistants.

Speakers 10+

Speakers identified across multi-participant meetings with built-in diarization.

Uptime 99.9%

SLA with SOC 2 Type 2 certification.

Scale 40TB+

Audio processed daily in production.

Two ways to build

Pick the API that fits your meeting product

Ship an interactive meeting assistant in an afternoon, or drop industry-leading STT and Speech Understanding into your notetaker product.

Recommended

Voice Agent API

Our proprietary voice stack via one WebSocket. Build interactive meeting assistants that transcribe, summarize, and answer questions about the conversation in real time.

Best for

  • Interactive meeting assistants with voice Q&A
  • Teams shipping fast — working assistant in an afternoon
  • Real-time transcription with speaker diarization
  • Claude Code compatible — paste the docs and build anything
$4.50/hr — speech, LLM, and voice all included
Get started for free

Free tier available · No credit card required

Bring Your Own Stack

Universal-3 Pro Streaming STT API

The STT and Speech Understanding layer for your meeting product. Transcription, diarization, chapters, action items, and LLM Gateway for custom summaries.

Best for

  • AI notetaker products with custom UX and workflows
  • Auto Chapters, action items, and topic detection built in
  • LLM Gateway for custom summaries and Q&A over transcripts
  • PII redaction before notes hit your workspace or CRM
  • High-scale deployments where margin and full control matter
$0.45/hr — transcription only, unlimited streams
View integration docs

No concurrency caps · Autoscaling included

Your meeting intelligence pipeline

Capture meeting audio

Voice Agent API: single WebSocket for real-time. Or ingest recordings from Zoom, Teams, Meet, or any conferencing platform via bot or API.

Transcribe with speaker diarization

Speaker labels identify who said what. ~150ms P50 streaming latency. Keyterm boosting for your product names, team members, and project codenames.

Extract structure and insights

Auto Chapters segment by topic. Summarization generates notes. LLM Gateway extracts action items, decisions, and follow-ups across 25+ models (Claude, GPT, Gemini).

Distribute to your workspace

Push notes, action items, and summaries to Slack, Notion, Google Docs, or your CRM via webhook. Searchable transcript archive for async review.

schedule

Meeting transcript

Sarah (PM)

"Let's lock the launch date — I'm proposing September 15."

Mike (Eng)

"That works if we get two more engineers. The API layer needs another sprint."

Priya (Design)

"I'll have the updated flows to eng by end of week."

Quickstart

Get a working assistant in minutes

Voice Agent API — recommended

# Voice Agent API: real-time meeting assistant
import asyncio, json, websockets

API_KEY = "YOUR_API_KEY"

async def run_agent():
    async with websockets.connect(
        "wss://agents.assemblyai.com/v1/ws",
        additional_headers={"Authorization": f"Bearer {API_KEY}"},
    ) as ws:
        await ws.send(json.dumps({
            "type": "session.update",
            "session": {
                "system_prompt": (
                    "You are a meeting assistant. Listen to the "
                    "conversation and answer questions about what was discussed. "
                    "Track action items, decisions, and follow-ups."
                ),
                "input": {"keyterms": ["Q3 launch", "Project Atlas", "Greenhouse"]},
                "output": {"voice": "ivy"},
            },
        }))
        # Stream meeting audio in, get transcript + answers back
        async for msg in ws:
            handle(json.loads(msg))  # transcript.user, reply.audio, tool.call, ...

Universal-3 Pro Streaming + LiveKit — BYO stack

# LiveKit + AssemblyAI STT in a real-time meeting notetaker pipeline
from livekit.agents import Agent, AgentSession
from livekit.plugins import assemblyai, cartesia, openai, silero

class MeetingAssistant(Agent):
    def __init__(self):
        super().__init__(
            instructions=(
                "You are a meeting assistant. Track action items, "
                "key decisions, and generate structured notes."
            ),
        )

async def entrypoint(ctx):
    session = AgentSession(
        stt=assemblyai.STT(
            model="u3-rt-pro",
            speaker_labels=True,                     # diarize multi-speaker meetings
            keyterms_prompt=["Q3 launch", "Project Atlas", "Greenhouse"],
        ),
        llm=openai.LLM(model="gpt-4o"),
        tts=cartesia.TTS(),
        vad=silero.VAD.load(),
    )
    await session.start(room=ctx.room, agent=MeetingAssistant())

Auto Chapters & summaries

Automatically segment meetings by topic with timestamped chapter headings. Generate paragraph, bullet, or headline summaries with a single API flag.

Speaker diarization

Identify who said what across meetings with 2 to 10+ participants — essential for attribution and action item assignment.

LLM Gateway

Ask questions about your transcripts, extract custom fields, generate meeting briefs, or build searchable archives. 25+ models across Claude, GPT, and Gemini through one unified API.

Frequently asked questions