# Models

## Available models

### By quality ([LMArena Score](https://arena.ai/leaderboard))

| Model                             | Provider      | Parameter                       | [LMArena Score](https://arena.ai/leaderboard) | Latency per 10,000 tokens |
| --------------------------------- | ------------- | ------------------------------- | --------------------------------------------- | ------------------------- |
| **Claude Opus 4.6**               | Anthropic     | `claude-opus-4-6`               | 1498                                          | 7.4s                      |
| **Claude Opus 4.7**               | Anthropic     | `claude-opus-4-7`               | 1491                                          | TBD                       |
| **Gemini 3.5 Flash**              | Google        | `gemini-3.5-flash`              | 1480                                          | TBD                       |
| **GPT-5.5**                       | OpenAI        | `gpt-5.5`                       | 1475                                          | TBD                       |
| **Gemini 3 Flash Preview**        | Google        | `gemini-3-flash-preview`        | 1474                                          | 4.2s                      |
| **Claude Opus 4.5**               | Anthropic     | `claude-opus-4-5-20251101`      | 1468                                          | 3.9s                      |
| **Claude Sonnet 4.6**             | Anthropic     | `claude-sonnet-4-6`             | 1466                                          | 7.2s                      |
| **Claude 4.5 Sonnet**             | Anthropic     | `claude-sonnet-4-5-20250929`    | 1453                                          | 5.6s                      |
| **Gemini 2.5 Pro**                | Google        | `gemini-2.5-pro`                | 1448                                          | 4.0s                      |
| **GPT-5.1**                       | OpenAI        | `gpt-5.1`                       | 1439                                          | 2.7s                      |
| **Gemini 3.1 Flash Lite Preview** | Google        | `gemini-3.1-flash-lite-preview` | 1438                                          | TBD                       |
| **GPT-5.2**                       | OpenAI        | `gpt-5.2`                       | 1437                                          | 1.6s                      |
| **GPT-5**                         | OpenAI        | `gpt-5`                         | 1434                                          | 4.3s                      |
| **Kimi K2.5**                     | Moonshot AI   | `kimi-k2.5`                     | 1432                                          | 1.2s                      |
| **GPT-4.1**                       | OpenAI        | `gpt-4.1`                       | 1413                                          | 1.8s                      |
| **Gemini 2.5 Flash**              | Google        | `gemini-2.5-flash`              | 1411                                          | 2.6s                      |
| **Claude 4.5 Haiku**              | Anthropic     | `claude-haiku-4-5-20251001`     | 1409                                          | 4.1s                      |
| **Qwen3 Next 80B A3B**            | Alibaba Cloud | `qwen3-next-80b-a3b`            | 1402                                          | 3.1s                      |
| **GPT-5 mini**                    | OpenAI        | `gpt-5-mini`                    | 1390                                          | 3.8s                      |
| **Gemini 2.5 Flash-Lite**         | Google        | `gemini-2.5-flash-lite`         | 1380                                          | 1.1s                      |
| **gpt-oss-120b**                  | OpenAI        | `gpt-oss-120b`                  | 1353                                          | 1.4s                      |
| **Qwen3 32B**                     | Alibaba Cloud | `qwen3-32B`                     | 1347                                          | 3.7s                      |
| **GPT-5 nano**                    | OpenAI        | `gpt-5-nano`                    | 1337                                          | 3.2s                      |
| **gpt-oss-20b**                   | OpenAI        | `gpt-oss-20b`                   | 1317                                          | 1.1s                      |

### By latency (per 10,000 tokens)

| Model                             | Provider      | Parameter                       | Latency per 10,000 tokens | [LMArena Score](https://arena.ai/leaderboard) |
| --------------------------------- | ------------- | ------------------------------- | ------------------------- | --------------------------------------------- |
| **Gemini 2.5 Flash-Lite**         | Google        | `gemini-2.5-flash-lite`         | 1.1s                      | 1380                                          |
| **gpt-oss-20b**                   | OpenAI        | `gpt-oss-20b`                   | 1.1s                      | 1317                                          |
| **Kimi K2.5**                     | Moonshot AI   | `kimi-k2.5`                     | 1.2s                      | 1432                                          |
| **gpt-oss-120b**                  | OpenAI        | `gpt-oss-120b`                  | 1.4s                      | 1353                                          |
| **GPT-5.2**                       | OpenAI        | `gpt-5.2`                       | 1.6s                      | 1437                                          |
| **GPT-4.1**                       | OpenAI        | `gpt-4.1`                       | 1.8s                      | 1413                                          |
| **Gemini 2.5 Flash**              | Google        | `gemini-2.5-flash`              | 2.6s                      | 1411                                          |
| **GPT-5.1**                       | OpenAI        | `gpt-5.1`                       | 2.7s                      | 1439                                          |
| **Qwen3 Next 80B A3B**            | Alibaba Cloud | `qwen3-next-80b-a3b`            | 3.1s                      | 1402                                          |
| **GPT-5 nano**                    | OpenAI        | `gpt-5-nano`                    | 3.2s                      | 1337                                          |
| **Qwen3 32B**                     | Alibaba Cloud | `qwen3-32B`                     | 3.7s                      | 1347                                          |
| **GPT-5 mini**                    | OpenAI        | `gpt-5-mini`                    | 3.8s                      | 1390                                          |
| **Claude Opus 4.5**               | Anthropic     | `claude-opus-4-5-20251101`      | 3.9s                      | 1468                                          |
| **Gemini 2.5 Pro**                | Google        | `gemini-2.5-pro`                | 4.0s                      | 1448                                          |
| **Claude 4.5 Haiku**              | Anthropic     | `claude-haiku-4-5-20251001`     | 4.1s                      | 1409                                          |
| **Gemini 3 Flash Preview**        | Google        | `gemini-3-flash-preview`        | 4.2s                      | 1474                                          |
| **GPT-5**                         | OpenAI        | `gpt-5`                         | 4.3s                      | 1434                                          |
| **Claude 4.5 Sonnet**             | Anthropic     | `claude-sonnet-4-5-20250929`    | 5.6s                      | 1453                                          |
| **Claude Sonnet 4.6**             | Anthropic     | `claude-sonnet-4-6`             | 7.2s                      | 1466                                          |
| **Claude Opus 4.6**               | Anthropic     | `claude-opus-4-6`               | 7.4s                      | 1498                                          |
| **Claude Opus 4.7**               | Anthropic     | `claude-opus-4-7`               | TBD                       | 1491                                          |
| **GPT-5.5**                       | OpenAI        | `gpt-5.5`                       | TBD                       | 1475                                          |
| **Gemini 3.1 Flash Lite Preview** | Google        | `gemini-3.1-flash-lite-preview` | TBD                       | 1438                                          |
| **Gemini 3.5 Flash**              | Google        | `gemini-3.5-flash`              | TBD                       | 1480                                          |
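The two tables above trade quality against speed, so one way to use them together is to set a latency budget and take the highest-scoring model that fits. The following is a minimal sketch of that idea, not part of any SDK: the `Model` class, the `MODELS` list, and `best_within_budget` are hypothetical names, and the list holds only a subset of rows copied from the tables. Models whose latency is listed as TBD are stored with `None` and skipped.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Model:
    parameter: str              # value from the "Parameter" column
    score: int                  # LMArena Score from the table
    latency_s: Optional[float]  # seconds per 10,000 tokens; None where the table says TBD


# A subset of rows copied from the tables above (illustrative, not exhaustive).
MODELS = [
    Model("claude-opus-4-6", 1498, 7.4),
    Model("claude-opus-4-7", 1491, None),
    Model("gemini-3-flash-preview", 1474, 4.2),
    Model("claude-opus-4-5-20251101", 1468, 3.9),
    Model("gpt-5.2", 1437, 1.6),
    Model("kimi-k2.5", 1432, 1.2),
    Model("gpt-4.1", 1413, 1.8),
    Model("gemini-2.5-flash-lite", 1380, 1.1),
    Model("gpt-oss-20b", 1317, 1.1),
]


def best_within_budget(models, max_latency_s):
    """Return the highest-scoring model whose measured latency fits the budget.

    Models without a measured latency (TBD) are excluded. Returns None when
    no model fits.
    """
    candidates = [
        m for m in models
        if m.latency_s is not None and m.latency_s <= max_latency_s
    ]
    return max(candidates, key=lambda m: m.score, default=None)
```

With this subset, `best_within_budget(MODELS, 2.0)` selects `gpt-5.2`, and a budget of 4.0 seconds selects `claude-opus-4-5-20251101`. Both results match what the full tables give for the same budgets.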

### By provider

#### Anthropic Claude

| Model                 | Parameter                    | [LMArena Score](https://arena.ai/leaderboard) | Latency per 10,000 tokens |
| --------------------- | ---------------------------- | --------------------------------------------- | ------------------------- |
| **Claude Opus 4.7**   | `claude-opus-4-7`            | 1491                                          | TBD                       |
| **Claude Opus 4.6**   | `claude-opus-4-6`            | 1498                                          | 7.4s                      |
| **Claude Sonnet 4.6** | `claude-sonnet-4-6`          | 1466                                          | 7.2s                      |
| **Claude Opus 4.5**   | `claude-opus-4-5-20251101`   | 1468                                          | 3.9s                      |
| **Claude 4.5 Sonnet** | `claude-sonnet-4-5-20250929` | 1453                                          | 5.6s                      |
| **Claude 4.5 Haiku**  | `claude-haiku-4-5-20251001`  | 1409                                          | 4.1s                      |

#### OpenAI GPT

| Model            | Parameter      | [LMArena Score](https://arena.ai/leaderboard) | Latency per 10,000 tokens |
| ---------------- | -------------- | --------------------------------------------- | ------------------------- |
| **GPT-5.5**      | `gpt-5.5`      | 1475                                          | TBD                       |
| **GPT-5.2**      | `gpt-5.2`      | 1437                                          | 1.6s                      |
| **GPT-5.1**      | `gpt-5.1`      | 1439                                          | 2.7s                      |
| **GPT-5**        | `gpt-5`        | 1434                                          | 4.3s                      |
| **GPT-5 nano**   | `gpt-5-nano`   | 1337                                          | 3.2s                      |
| **GPT-5 mini**   | `gpt-5-mini`   | 1390                                          | 3.8s                      |
| **GPT-4.1**      | `gpt-4.1`      | 1413                                          | 1.8s                      |
| **gpt-oss-120b** | `gpt-oss-120b` | 1353                                          | 1.4s                      |
| **gpt-oss-20b**  | `gpt-oss-20b`  | 1317                                          | 1.1s                      |

#### Google Gemini

| Model                             | Parameter                       | [LMArena Score](https://arena.ai/leaderboard) | Latency per 10,000 tokens |
| --------------------------------- | ------------------------------- | --------------------------------------------- | ------------------------- |
| **Gemini 3.5 Flash**              | `gemini-3.5-flash`              | 1480                                          | TBD                       |
| **Gemini 3 Flash Preview**        | `gemini-3-flash-preview`        | 1474                                          | 4.2s                      |
| **Gemini 3.1 Flash Lite Preview** | `gemini-3.1-flash-lite-preview` | 1438                                          | TBD                       |
| **Gemini 2.5 Pro**                | `gemini-2.5-pro`                | 1448                                          | 4.0s                      |
| **Gemini 2.5 Flash**              | `gemini-2.5-flash`              | 1411                                          | 2.6s                      |
| **Gemini 2.5 Flash-Lite**         | `gemini-2.5-flash-lite`         | 1380                                          | 1.1s                      |

<Note>
  Gemini 3.1 Flash Lite Preview is currently available in the US region only.
</Note>

#### Alibaba Cloud Qwen

| Model                  | Parameter            | [LMArena Score](https://arena.ai/leaderboard) | Latency per 10,000 tokens |
| ---------------------- | -------------------- | --------------------------------------------- | ------------------------- |
| **Qwen3 Next 80B A3B** | `qwen3-next-80b-a3b` | 1402                                          | 3.1s                      |
| **Qwen3 32B**          | `qwen3-32B`          | 1347                                          | 3.7s                      |

#### Moonshot AI Kimi

| Model         | Parameter   | [LMArena Score](https://arena.ai/leaderboard) | Latency per 10,000 tokens |
| ------------- | ----------- | --------------------------------------------- | ------------------------- |
| **Kimi K2.5** | `kimi-k2.5` | 1432                                          | 1.2s                      |

<Note>
  Through the LLM Gateway, Claude Opus 4.5 and Claude Opus 4.6 currently
  support only context windows under 200k tokens.
</Note>
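A pre-flight check can catch oversized requests before they are sent to the two models named in the note above. This is a rough sketch under stated assumptions: `fits_gateway_context` and the constants are hypothetical names, the token count must come from the caller's own tokenizer or estimate, and "under 200k" is read as strictly fewer than 200,000 tokens.

```python
# Models the note above lists as limited to context windows under 200k tokens.
LIMITED_CONTEXT_MODELS = {"claude-opus-4-5-20251101", "claude-opus-4-6"}

# Limit in tokens, taken from the note above (assumed to mean exactly 200,000).
GATEWAY_CONTEXT_LIMIT = 200_000


def fits_gateway_context(model: str, token_count: int) -> bool:
    """Return False when a listed model would receive 200k tokens or more."""
    if model in LIMITED_CONTEXT_MODELS:
        return token_count < GATEWAY_CONTEXT_LIMIT
    # The note states no limit for other models, so this sketch does not check them.
    return True
```

A `True` result for any other model only means the note does not apply to it. It does not confirm that the request fits that model's own context window.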

<Note>
  For information on data retention and model training policies for each
  provider, see [Data Retention and Model Training](/data-retention-and-model-training#llm-gateway-production-environment).
</Note>

<Note>
  To try LLM Gateway without writing any code, head to
  [our Playground](https://www.assemblyai.com/dashboard/playground).
</Note>
