Cloud Endpoints and Data Residency

Choose the endpoint that best fits your application’s requirements: the default US region, or the EU endpoint when your data must stay within the European Union.

Default Endpoint

The default endpoint (llm-gateway.assemblyai.com) processes your LLM Gateway requests in the US region. If you don’t specify a base URL, this is the endpoint used by default.

EU Data Residency

The EU endpoint (llm-gateway.eu.assemblyai.com) guarantees that your request and response data never leaves the European Union. It is designed for organizations with strict data residency and governance requirements.

Endpoints

Endpoint	Base URL	Description
US (default)	`https://llm-gateway.assemblyai.com/v1/chat/completions`	Data stays in the US
EU	`https://llm-gateway.eu.assemblyai.com/v1/chat/completions`	Data stays in the EU

Effective July 1, 2026, pricing for in-region LLM Gateway requests will increase by 10%. The increase is a direct pass-through of provider price increases, with no AssemblyAI upcharge. Opt into global routing to keep current pricing.
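
As a rough illustration of the arithmetic, the sketch below applies the 10% increase to a hypothetical monthly in-region spend. The dollar figure is invented for the example, and `projected_in_region_cost` is an illustrative helper, not part of any AssemblyAI SDK.

```python
from decimal import Decimal

# 10% pass-through increase for in-region requests, per the notice above.
IN_REGION_INCREASE = Decimal("0.10")


def projected_in_region_cost(current_cost: Decimal) -> Decimal:
    """Return the cost of the same in-region usage after the increase."""
    new_cost = current_cost * (1 + IN_REGION_INCREASE)
    return new_cost.quantize(Decimal("0.01"))  # round to cents


# Hypothetical spend of $250.00 per month on in-region requests.
print(projected_in_region_cost(Decimal("250.00")))  # 275.00
```

`Decimal` is used instead of `float` so the result is exact to the cent.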

EU model availability

The EU endpoint currently supports Anthropic Claude and Google Gemini models. OpenAI GPT, Alibaba Cloud Qwen, and Moonshot AI Kimi models are only available through the US endpoint.

Provider	US	EU
Anthropic Claude	Yes	Yes
Google Gemini	Yes	Yes
OpenAI GPT	Yes	No
Alibaba Cloud Qwen	Yes	No
Moonshot AI Kimi	Yes	No

For a full list of available models, see the LLM Gateway Overview.
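
If you want to check availability in code before sending a request, one option is to mirror the table above in a small lookup. This is a minimal sketch: the `AVAILABILITY` mapping and both helper functions are hypothetical names for this example, and the data is copied by hand from the table, so it would need updating whenever availability changes.

```python
# Provider availability per endpoint, copied from the table above.
AVAILABILITY = {
    "Anthropic Claude": {"US": True, "EU": True},
    "Google Gemini": {"US": True, "EU": True},
    "OpenAI GPT": {"US": True, "EU": False},
    "Alibaba Cloud Qwen": {"US": True, "EU": False},
    "Moonshot AI Kimi": {"US": True, "EU": False},
}


def is_available(provider: str, region: str) -> bool:
    """Return True if the provider's models can be used on the given endpoint."""
    try:
        return AVAILABILITY[provider][region]
    except KeyError:
        raise ValueError(
            f"unknown provider or region: {provider!r}, {region!r}"
        ) from None


def providers_in(region: str) -> list[str]:
    """List the providers available on the given endpoint ("US" or "EU")."""
    return [p for p, regions in AVAILABILITY.items() if regions.get(region, False)]


print(providers_in("EU"))  # ['Anthropic Claude', 'Google Gemini']
```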

Which endpoint should I use?

No data residency requirements? Use the default endpoint. No configuration change is needed.
Need EU data residency? Use the EU endpoint to ensure your data stays within the European Union. Note that only Claude and Gemini models are available in the EU region.
No data residency, compliance, or latency needs, and want cheaper calls? Opt into global routing on top of the default endpoint to route requests to the provider’s global endpoints at lower cost.
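
The guidance above can be sketched as a small decision function. The function name and its boolean parameters are assumptions made for this illustration; only the two URLs and the `"global"` value come from this page.

```python
from typing import Optional, Tuple

US_URL = "https://llm-gateway.assemblyai.com/v1/chat/completions"
EU_URL = "https://llm-gateway.eu.assemblyai.com/v1/chat/completions"


def choose_routing(
    needs_eu_residency: bool,
    wants_lower_cost: bool = False,
    has_compliance_or_latency_needs: bool = False,
) -> Tuple[str, Optional[str]]:
    """Return (request URL, model_region) following the guidance above.

    model_region is None when the parameter should be omitted from the
    request body, which keeps processing in-region.
    """
    if needs_eu_residency:
        # EU residency takes priority: stay on the EU endpoint, in-region.
        return EU_URL, None
    if wants_lower_cost and not has_compliance_or_latency_needs:
        # Opt into global routing on top of the default endpoint.
        return US_URL, "global"
    # No special requirements: the default US endpoint, in-region.
    return US_URL, None


print(choose_routing(needs_eu_residency=True))
print(choose_routing(needs_eu_residency=False, wants_lower_cost=True))
```

Returning `None` rather than a placeholder string makes it harder to send an invalid `model_region` by accident.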

How to use it

Update your request URL to your preferred endpoint. Examples for the US (default) endpoint come first, followed by the EU endpoint.

US (default)

The US endpoint is the default. No configuration change is required.

import requests

headers = {
    "authorization": "<YOUR_API_KEY>"
}

# No URL change needed — US is the default
response = requests.post(
    "https://llm-gateway.assemblyai.com/v1/chat/completions",
    headers=headers,
    json={
        "model": "claude-sonnet-4-6",
        "messages": [
            {"role": "user", "content": "What is the capital of France?"}
        ],
        "max_tokens": 1000
    }
)

result = response.json()
print(result["choices"][0]["message"]["content"])

// No URL change needed — US is the default
const response = await fetch(
  "https://llm-gateway.assemblyai.com/v1/chat/completions",
  {
    method: "POST",
    headers: {
      authorization: "<YOUR_API_KEY>",
      "content-type": "application/json",
    },
    body: JSON.stringify({
      model: "claude-sonnet-4-6",
      messages: [{ role: "user", content: "What is the capital of France?" }],
      max_tokens: 1000,
    }),
  }
);

const result = await response.json();
console.log(result.choices[0].message.content);

curl -X POST \
  "https://llm-gateway.assemblyai.com/v1/chat/completions" \
  -H "Authorization: <YOUR_API_KEY>" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-6",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ],
    "max_tokens": 1000
  }'

EU data residency

Use the EU endpoint to keep your data within the European Union.

import requests

headers = {
    "authorization": "<YOUR_API_KEY>"
}

response = requests.post(
    "https://llm-gateway.eu.assemblyai.com/v1/chat/completions",
    headers=headers,
    json={
        "model": "claude-sonnet-4-6",
        "messages": [
            {"role": "user", "content": "What is the capital of France?"}
        ],
        "max_tokens": 1000
    }
)

result = response.json()
print(result["choices"][0]["message"]["content"])

const response = await fetch(
  "https://llm-gateway.eu.assemblyai.com/v1/chat/completions",
  {
    method: "POST",
    headers: {
      authorization: "<YOUR_API_KEY>",
      "content-type": "application/json",
    },
    body: JSON.stringify({
      model: "claude-sonnet-4-6",
      messages: [{ role: "user", content: "What is the capital of France?" }],
      max_tokens: 1000,
    }),
  }
);

const result = await response.json();
console.log(result.choices[0].message.content);

curl -X POST \
  "https://llm-gateway.eu.assemblyai.com/v1/chat/completions" \
  -H "Authorization: <YOUR_API_KEY>" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-6",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ],
    "max_tokens": 1000
  }'

Global routing

Global routing is an opt-in feature that routes your request to the provider’s global (non-regional) endpoints for lower-cost processing. To enable it, set `model_region` to `"global"` in your request body. Omit the parameter for default in-region processing. `"global"` is the only valid value for `model_region`.
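
Because `"global"` is the only valid value, one way to avoid sending anything else is to build the request body from a boolean flag. The sketch below assumes a hypothetical helper named `build_request_body`; it only assembles the JSON body and does not send a request.

```python
from typing import Any, Dict, List


def build_request_body(
    model: str,
    messages: List[Dict[str, str]],
    use_global_routing: bool = False,
    max_tokens: int = 1000,
) -> Dict[str, Any]:
    """Build a chat completions body, adding model_region only on opt-in."""
    body: Dict[str, Any] = {
        "model": model,
        "messages": messages,
        "max_tokens": max_tokens,
    }
    if use_global_routing:
        # "global" is the only valid value for model_region.
        body["model_region"] = "global"
    # When the flag is False the key is omitted, so processing stays in-region.
    return body


messages = [{"role": "user", "content": "What is the capital of France?"}]
print(build_request_body("claude-sonnet-4-6", messages, use_global_routing=True))
```

The returned dictionary can be passed as the `json` argument of `requests.post`.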

Who should use global routing?

Global routing is designed for customers who:

Do not have data residency or compliance requirements that tie processing to a specific region.
Do not have strict latency requirements that depend on regional proximity.
Want lower-cost calls by using the provider’s global endpoints.

Global routing is always opt-in. If you don’t set `model_region`, requests continue to be processed in-region (US or EU), so data residency and compliance remain the default behavior. If you have data residency, compliance, or latency needs, keep using the default in-region processing on the US or EU endpoint.

Availability

Global routing is live for Anthropic Claude models. Support for Google Gemini 3 series models is coming soon.

Usage tracking

Global-routed usage appears as a new Global region in the spend and usage dashboard, separate from US and EU usage.

How to use it

Include `"model_region": "global"` in your request body alongside `model` and `messages`.

import requests

headers = {
    "authorization": "<YOUR_API_KEY>"
}

response = requests.post(
    "https://llm-gateway.assemblyai.com/v1/chat/completions",
    headers=headers,
    json={
        "model": "claude-sonnet-4-6",
        "messages": [
            {"role": "user", "content": "What is the capital of France?"}
        ],
        "model_region": "global",
        "max_tokens": 1000
    }
)

result = response.json()
print(result["choices"][0]["message"]["content"])

const response = await fetch(
  "https://llm-gateway.assemblyai.com/v1/chat/completions",
  {
    method: "POST",
    headers: {
      authorization: "<YOUR_API_KEY>",
      "content-type": "application/json",
    },
    body: JSON.stringify({
      model: "claude-sonnet-4-6",
      messages: [{ role: "user", content: "What is the capital of France?" }],
      model_region: "global",
      max_tokens: 1000,
    }),
  }
);

const result = await response.json();
console.log(result.choices[0].message.content);

curl -X POST \
  "https://llm-gateway.assemblyai.com/v1/chat/completions" \
  -H "Authorization: <YOUR_API_KEY>" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-6",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ],
    "model_region": "global",
    "max_tokens": 1000
  }'
