April 2, 2026

Pipecat voice agent with AssemblyAI Universal-3 Pro Streaming

Build a real-time voice agent using Pipecat — Daily.co's open-source Voice AI framework — and the AssemblyAI Universal-3 Pro Streaming model as the speech-to-text engine.


Pipecat's modular pipeline design means you can swap any component without touching the rest. AssemblyAI has a first-party Pipecat plugin with full Universal-3 Pro Streaming support — no manual WebSocket wiring required.
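
To make the "swap any component" idea concrete, here is a toy model of a pipeline in plain Python. It is not Pipecat's API: the `Stage` alias, `run_pipeline`, and the `fake_*` stages are hypothetical names used only to illustrate the pattern of replacing one stage while the others stay unchanged.

```python
from typing import Callable, Iterable

# A stage is any callable that turns one payload into the next.
# Hypothetical type for illustration only; real Pipecat processors
# exchange frames rather than plain strings.
Stage = Callable[[str], str]


def run_pipeline(stages: Iterable[Stage], payload: str) -> str:
    """Pass the payload through each stage in order."""
    for stage in stages:
        payload = stage(payload)
    return payload


def fake_stt_a(audio: str) -> str:
    return f"text({audio})"


def fake_stt_b(audio: str) -> str:
    return f"TEXT[{audio}]"


def fake_llm(text: str) -> str:
    return f"reply({text})"


def fake_tts(text: str) -> str:
    return f"audio({text})"


# Swapping the speech-to-text stage leaves the LLM and TTS stages untouched.
pipeline_a = [fake_stt_a, fake_llm, fake_tts]
pipeline_b = [fake_stt_b, fake_llm, fake_tts]

print(run_pipeline(pipeline_a, "hello"))  # audio(reply(text(hello)))
print(run_pipeline(pipeline_b, "hello"))  # audio(reply(TEXT[hello]))
```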

Why AssemblyAI in Pipecat?

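| Metric                | AssemblyAI Universal-3 Pro | Deepgram Nova-3 |
|-----------------------|----------------------------|-----------------|
| P50 latency           | 307 ms                     | 516 ms          |
| P99 latency           | 1,012 ms                   | 1,907 ms        |
| Word Error Rate       | 8.14%                      | 9.87%           |
| Neural turn detection | ✅                         | ❌ (VAD only)   |
| Mid-session prompting |                            |                 |
| Anti-hallucination    |                            |                 |
| Real-time diarization |                            |                 |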

The roughly 41% lower P50 latency (307 ms versus 516 ms) is noticeable in live conversation, and neural turn detection means fewer awkward double-responses when users pause mid-thought.
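
The 41% figure appears to come from the P50 row of the comparison. As a quick check, the sketch below computes the relative reduction for each numeric row; `relative_reduction` is a helper name introduced here, and the inputs are the figures quoted in this article.

```python
def relative_reduction(ours: float, theirs: float) -> float:
    """Fraction by which `ours` is lower than `theirs`."""
    return (theirs - ours) / theirs


p50 = relative_reduction(307, 516)      # P50 latency in ms
p99 = relative_reduction(1012, 1907)    # P99 latency in ms
wer = relative_reduction(8.14, 9.87)    # Word Error Rate in percent

print(f"P50 latency reduction: {p50:.1%}")  # 40.5%
print(f"P99 latency reduction: {p99:.1%}")  # 46.9%
print(f"WER reduction:         {wer:.1%}")  # 17.5%
```

The P50 result rounds to the 41% quoted above; on these numbers the gap at P99 is somewhat larger.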

Prerequisites

  • Python 3.11+
  • AssemblyAI API key
  • Daily.co API key
  • OpenAI API key
  • Cartesia API key
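
Before starting the bot, it can help to confirm that the interpreter and keys are in place. The following is a minimal sketch using only the standard library. `ASSEMBLYAI_API_KEY` is the variable name used later in this article; the other three names are assumptions and should be matched to whatever the repository's `.env.example` actually uses.

```python
import os
import sys
from typing import Mapping

# ASSEMBLYAI_API_KEY appears in this article. The other three names are
# assumptions; adjust them to match the repo's .env.example.
REQUIRED_KEYS = (
    "ASSEMBLYAI_API_KEY",
    "DAILY_API_KEY",
    "OPENAI_API_KEY",
    "CARTESIA_API_KEY",
)


def missing_keys(env: Mapping[str, str]) -> list[str]:
    """Return the required keys that are unset or blank in `env`."""
    return [key for key in REQUIRED_KEYS if not env.get(key, "").strip()]


def python_is_new_enough(version: tuple[int, int] = sys.version_info[:2]) -> bool:
    """Check the interpreter against the Python 3.11+ prerequisite."""
    return version >= (3, 11)


# Report problems without exiting, so the check is safe to import or rerun.
if not python_is_new_enough():
    print("Python 3.11 or newer is required")
for key in missing_keys(os.environ):
    print(f"Missing environment variable: {key}")
```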

Quick start

git clone https://github.com/kelseyefoster/voice-agent-pipecat-universal-3-pro
cd voice-agent-pipecat-universal-3-pro

python -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt

cp .env.example .env
# Edit .env with your API keys

# Create a Daily.co room
python create_room.py

# Start the bot (paste the room URL from above)
python bot.py --url https://your-name.daily.co/your-room

Open the room URL in your browser and start talking.

Universal-3 Pro Streaming features

Keyterm prompting

Boost accuracy on domain-specific vocabulary without restarting the session:

stt = AssemblyAISTTService(
    connection_params=AssemblyAIConnectionParams(
        api_key=os.environ["ASSEMBLYAI_API_KEY"],
        speech_model="u3-rt-pro",
        keyterms_prompt=["AssemblyAI", "Universal-3", "Pipecat", "YourBrandName"],
    )
)

Up to 1,000 terms per session. Essential for medical, legal, and financial applications.
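
If the term list is generated from a product catalog or glossary, it may exceed the 1,000-term limit or contain duplicates. Here is a small sketch of a cleanup step; `prepare_keyterms` is a hypothetical helper, and the case-insensitive de-duplication is a conservative choice rather than documented service behavior.

```python
MAX_KEYTERMS = 1000  # per-session limit stated in this article


def prepare_keyterms(terms: list[str], limit: int = MAX_KEYTERMS) -> list[str]:
    """Strip whitespace, drop blanks and case-insensitive duplicates, cap the list."""
    seen: set[str] = set()
    cleaned: list[str] = []
    for term in terms:
        term = term.strip()
        key = term.casefold()
        if not term or key in seen:
            continue
        seen.add(key)
        cleaned.append(term)
    # Keep the earliest terms, so put the most important vocabulary first.
    return cleaned[:limit]


print(prepare_keyterms(["AssemblyAI", " assemblyai ", "", "Pipecat"]))
# ['AssemblyAI', 'Pipecat']
```

The returned list can then be passed as the `keyterms_prompt` value shown above.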

Real-time speaker diarization

connection_params=AssemblyAIConnectionParams(
    api_key=os.environ["ASSEMBLYAI_API_KEY"],
    speech_model="u3-rt-pro",
    speaker_labels=True,
    max_speakers=2,
)
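
Once speaker labels are available, a common next step is to merge consecutive segments from the same speaker into readable dialogue lines. The sketch below assumes each finalized segment has already been reduced to a `(speaker, text)` tuple; that shape and the `format_dialogue` helper are hypothetical and are not the plugin's actual output format.

```python
from itertools import groupby


def format_dialogue(segments: list[tuple[str, str]]) -> list[str]:
    """Merge consecutive segments from the same speaker into one line each."""
    lines: list[str] = []
    # groupby only groups adjacent items, which preserves turn order.
    for speaker, group in groupby(segments, key=lambda seg: seg[0]):
        text = " ".join(segment_text for _, segment_text in group)
        lines.append(f"Speaker {speaker}: {text}")
    return lines


segments = [
    ("A", "Hi there."),
    ("A", "Can you hear me?"),
    ("B", "Yes, loud and clear."),
    ("A", "Great."),
]
for line in format_dialogue(segments):
    print(line)
```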

Multilingual support

connection_params=AssemblyAIConnectionParams(
    api_key=os.environ["ASSEMBLYAI_API_KEY"],
    speech_model="u3-rt-pro",
    language_detection=True,
)

Supported languages: English, Spanish, French, German, Italian, Portuguese.

Tuning turn detection

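connection_params=AssemblyAIConnectionParams(
    api_key=os.environ["ASSEMBLYAI_API_KEY"],
    speech_model="u3-rt-pro",
    end_of_turn_confidence_threshold=0.7,        # 0.0-1.0; higher waits for more certainty
    min_end_of_turn_silence_when_confident=300,  # ms of silence required once confident
    max_turn_silence=1000,                       # ms of silence that ends the turn regardless
)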
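
To reason about how these three settings interact, a simplified local model can help. The sketch below encodes one reading of the parameters: a turn ends once the model is confident and a short silence has passed, or once silence reaches the maximum regardless of confidence. `TurnParams` and `turn_has_ended` are hypothetical names, the silence values are assumed to be milliseconds, and the service's real decision logic may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TurnParams:
    # Defaults mirror the values in the snippet above; silences assumed to be ms.
    end_of_turn_confidence_threshold: float = 0.7
    min_end_of_turn_silence_when_confident: int = 300
    max_turn_silence: int = 1000


def turn_has_ended(
    confidence: float,
    silence_ms: int,
    params: TurnParams = TurnParams(),
) -> bool:
    """Simplified model of the end-of-turn decision (an assumption, not the real logic)."""
    confident = confidence >= params.end_of_turn_confidence_threshold
    if confident and silence_ms >= params.min_end_of_turn_silence_when_confident:
        return True
    # Fallback: enough silence ends the turn even when confidence is low.
    return silence_ms >= params.max_turn_silence


print(turn_has_ended(0.9, 300))   # True: confident and silence is long enough
print(turn_has_ended(0.9, 200))   # False: confident but silence is too short
print(turn_has_ended(0.3, 800))   # False: not confident, below max silence
print(turn_has_ended(0.3, 1000))  # True: max silence reached
```

Under this model, raising the threshold or the minimum silence makes the agent more patient with mid-thought pauses, while lowering `max_turn_silence` bounds how long it waits when confidence stays low.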

Deploy to PipecatCloud

pip install pipecatcloud
pcc auth login
pcc init
pcc secrets set my-agent-secrets --file .env
pcc deploy


Add AssemblyAI to your Pipecat pipeline

Sign up for a free AssemblyAI account and drop Universal-3 Pro Streaming into any Pipecat voice agent in minutes.
