LLM Gateway Overview
Supported regions: US & EU

Overview
AssemblyAI’s LLM Gateway is a unified interface that connects you to multiple LLM providers, including Anthropic Claude, OpenAI GPT, and Google Gemini, so you can build sophisticated AI applications through a single API.
The LLM Gateway is available in both US and EU regions. Use the EU endpoint to ensure your data stays within the European Union. Currently, Anthropic Claude and Google Gemini models are supported in the EU. OpenAI models are only available in the US region. See Cloud Endpoints and Data Residency for more details.
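In practice, region selection comes down to pointing your client at the regional base URL and avoiding providers that region does not serve. The sketch below illustrates that pattern only; the base URLs shown are placeholders, not documented endpoints, so consult Cloud Endpoints and Data Residency for the real values.

```python
# Placeholder base URLs -- consult the Cloud Endpoints docs for the real values.
GATEWAY_ENDPOINTS = {
    "us": "https://llm-gateway.assemblyai.com",
    "eu": "https://llm-gateway.eu.assemblyai.com",
}

# Provider availability per region, as stated in this overview:
# OpenAI models are US-only; Anthropic and Google are available in both regions.
REGION_PROVIDERS = {
    "us": {"anthropic", "openai", "google"},
    "eu": {"anthropic", "google"},
}

def endpoint_for(region: str, provider: str) -> str:
    """Return the base URL for a region, guarding against unsupported providers."""
    if provider not in REGION_PROVIDERS[region]:
        raise ValueError(f"{provider} models are not available in the {region.upper()} region")
    return GATEWAY_ENDPOINTS[region]
```

Routing EU traffic this way keeps requests (and therefore data) inside the European Union.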
The LLM Gateway provides access to 15+ models across major AI providers with support for:
- Basic Chat Completions - Simple request/response interactions
- Streamed Responses - Stream output as it’s generated (OpenAI models)
- Multi-turn Conversations - Maintain context across multiple exchanges
- Structured Outputs - Constrain responses to a specific JSON schema
- Tool/Function Calling - Enable models to execute custom functions
- Agentic Workflows - Multi-step reasoning with automatic tool chaining
- Unified Interface - One API for Claude, GPT, Gemini, and more
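To make the multi-turn item above concrete: conversational context is typically carried by resending the prior turns with each request. This is a minimal sketch assuming an OpenAI-style messages array (the Multi-turn Conversations guide covers the exact request shape):

```python
# A conversation is a list of role/content turns; the model sees all of them,
# which is how context is maintained across exchanges.
conversation = [
    {"role": "user", "content": "What is speaker diarization?"},
    {"role": "assistant", "content": "It identifies who spoke when in an audio file."},
    # This follow-up only makes sense because the earlier turns are resent:
    {"role": "user", "content": "How is it different from speaker identification?"},
]

def add_turn(history: list, role: str, content: str) -> list:
    """Append a new turn while keeping the full prior context intact."""
    return history + [{"role": role, "content": content}]

updated = add_turn(
    conversation, "assistant", "Diarization separates speakers without naming them."
)
print(len(updated))  # 4 turns now carried in the next request
```

The structured-output and tool-calling features use the same request, extended with a schema or tool definitions; see their dedicated guides.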
Available models
Models are available from Anthropic (Claude), OpenAI (GPT), and Google (Gemini). You can browse them by quality (LMArena score), by latency (per 10,000 tokens), or by provider.
Anthropic will retire Claude 3.0 Haiku (claude-3-haiku-20240307) on
April 20, 2026. To ensure uninterrupted service, switch to Claude 4.5
Haiku (claude-haiku-4-5-20251001) before that date.
Claude Opus 4.5 and Claude Opus 4.6 are currently limited to context windows under 200k tokens when accessed via the LLM Gateway.
For information on data retention and model training policies for each provider, see Data Retention and Model Training.
Head to our Playground to test out the LLM Gateway without having to write any code!
Select a model
You can specify which model to use in your request by setting the model parameter. Here are examples showing how to use Claude 4.5 Sonnet:
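Here is a minimal Python sketch using only the standard library. The endpoint URL, authentication header, and model identifier below are assumptions for illustration; take the exact values from the API reference and the Available models section.

```python
import json
import os
import urllib.request

# Hypothetical endpoint -- check the API reference for the exact URL and auth scheme.
GATEWAY_URL = "https://llm-gateway.assemblyai.com/v1/chat/completions"

def build_request(model: str, prompt: str) -> dict:
    """The model parameter is the only thing that changes between providers."""
    return {
        "model": model,  # assumed identifier for Claude 4.5 Sonnet; see Available models
        "messages": [{"role": "user", "content": prompt}],
    }

def chat(model: str, prompt: str) -> str:
    """Send one chat completion request and return the model's reply text."""
    req = urllib.request.Request(
        GATEWAY_URL,
        data=json.dumps(build_request(model, prompt)).encode(),
        headers={
            "authorization": os.environ["ASSEMBLYAI_API_KEY"],
            "content-type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

Switching providers means changing only the string passed as model; the rest of the request stays the same.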
Simply change the model parameter to use any of the available models listed in the Available models section above.
Want to compare models side-by-side? Try the Model Comparison Tool, a Lovable application, to test different LLM models and see how they perform.
Next steps
- Basic Chat Completions - Learn how to send simple messages and receive responses
- Multi-turn Conversations - Maintain context across multiple exchanges
- Structured Outputs - Constrain model responses to follow a specific JSON schema
- Tool Calling - Enable models to execute custom functions
- Agentic Workflows - Build multi-step reasoning applications
The LLM Gateway API is separate from the Speech-to-Text and Speech Understanding APIs. It provides a unified interface to work with large language models across multiple providers.