New Universal-3.5 Pro Realtime is here. Learn more

Cloudflare AI Gateway vs. AssemblyAI

Learn why developers building Voice AI choose AssemblyAI’s managed LLM Gateway over an edge control plane:

  • 0% markup and no provider keys to manage—call 25+ models through one API
  • Native, purpose-built speech-to-text in the same pipeline
  • EU data residency by default, opt-in zero data retention, and a BAA when you need one
Runway
Dovetail
Granola
Supernormal
Ashby
Jiminny
Calabrio
JotPsych
EdgeTier
Genio
Commure
Super
Retell
Loop
CallRail
Happy Scribe
Delphi
Runway
Dovetail
Granola
Supernormal
Ashby
Jiminny
Calabrio
JotPsych
EdgeTier
Genio
Commure
Super
Retell
Loop
CallRail
Happy Scribe
Delphi
Runway
Dovetail
Granola
Supernormal
Ashby
Jiminny
Calabrio
JotPsych
EdgeTier
Genio
Commure
Super
Retell
Loop
CallRail
Happy Scribe
Delphi
Runway
Dovetail
Granola
Supernormal
Ashby
Jiminny
Calabrio
JotPsych
EdgeTier
Genio
Commure
Super
Retell
Loop
CallRail
Happy Scribe
Delphi

At a glance: Cloudflare AI Gateway vs. AssemblyAI

Model
AssemblyAI LLM Gateway
Cloudflare AI Gateway
Model access
25+ curated models, billed at provider rates
Bring your own provider keys + Workers AI
Speech-to-text
First-party, purpose-built models
Whisper via Workers AI
Pricing
0% markup, no keys to manage
Free gateway; you manage provider billing
Automatic fallbacks
Configurable per-model
Fallback routing
EU data residency
Default-on for EU traffic
Global edge
Zero data retention
Opt-in, per request
Configurable logging
BAA available
OpenAI SDK compatible
Compatible endpoints

Managed model access, with native speech built in

Curated models, no keys to manage

Call 25+ frontier models through one API, billed at provider rates. There are no separate provider accounts or keys to create and rotate yourself.

Native, purpose-built speech-to-text

Transcribe with AssemblyAI’s own speech models and apply any LLM in one pipeline. Cloudflare offers Whisper via Workers AI, but not a purpose-built speech and audio-intelligence stack.

0% markup

Pay provider token rates with no gateway markup and no minimum commitment. The price you see is the price you pay.

Automatic fallbacks

Configure a primary model and any number of backups per request. If a provider errors or rate-limits, the Gateway retries the next model—no code changes.

OpenAI-compatible

Drop into any OpenAI SDK or HTTP client. Change a base URL and a model string, and the rest of your code keeps working.

EU data residency

Route requests through EU-resident infrastructure, on by default for EU traffic, for GDPR-sensitive workloads.

Zero data retention

Opt into zero data retention per request or project-wide, so prompts and responses are never stored.

BAA available

Sign a Business Associate Addendum for applications that process PHI, backed by SOC 2 Type 2, ISO 27001, PCI DSS, and GDPR.

Start building

Get your free API key and make your first Gateway call in minutes—no provider keys to wire up first.

Live demo

Pick a model. Take action on your audio.

Prompt

What is runner's knee?

claude-opus-4-7 412 ms · 47 tokens

Based on the transcript, runner's knee is a condition characterized by pain behind or around the kneecap. It is caused by overuse, muscle imbalance and inadequate stretching. Symptoms include pain under or around the kneecap and pain when walking.

Frequently asked questions