Prompting and Keyterms

Universal-3.5 Pro is highly accurate out of the box. For challenging audio, such as short clips with limited context, noisy recordings, or audio with niche references, you can give the model information about the audio to improve transcription accuracy. There are two ways to do this:

Contextual prompting (prompt) — a natural-language description of what the audio is about: the domain, the scenario, or the full details of the conversation.
Keyterms prompting (keyterms_prompt) — an explicit list of terms you want the model to recognize accurately.

Contextual prompting

Use the prompt parameter to provide context about your audio: describe what is being transcribed, not how to transcribe it. Formatting and behavioral instructions are ignored. The model stays grounded in the audio, so irrelevant context won’t cause hallucinated words.

For example, this is a 2-second clip from a League of Legends pro interview.

Without prompt:

And so look who I've been a dear.

With prompt:

In solo queue, I ban Azir.

Examples, in order: Python (requests), Python SDK, JavaScript (fetch), JavaScript SDK.

import requests
import time

base_url = "https://api.assemblyai.com"
headers = {"authorization": "<YOUR_API_KEY>"}

data = {
    "audio_url": "https://assembly.ai/prompt-8",
    "language_detection": True,
    "prompt": "League of Legends roles"
}

response = requests.post(base_url + "/v2/transcript", headers=headers, json=data)

if response.status_code != 200:
    print(f"Error: {response.status_code}, Response: {response.text}")
    response.raise_for_status()

transcript_response = response.json()
transcript_id = transcript_response["id"]
polling_endpoint = f"{base_url}/v2/transcript/{transcript_id}"

while True:
    transcript = requests.get(polling_endpoint, headers=headers).json()
    if transcript["status"] == "completed":
        print(transcript["text"])
        break
    elif transcript["status"] == "error":
        raise RuntimeError(f"Transcription failed: {transcript['error']}")
    else:
        time.sleep(3)

import assemblyai as aai

aai.settings.api_key = "<YOUR_API_KEY>"

audio_file = "https://assembly.ai/prompt-8"

config = aai.TranscriptionConfig(
    language_detection=True,
    prompt="League of Legends roles",
)

transcript = aai.Transcriber().transcribe(audio_file, config)

print(transcript.text)

const baseUrl = "https://api.assemblyai.com";
const headers = {
  authorization: "<YOUR_API_KEY>",
};

const data = {
  audio_url: "https://assembly.ai/prompt-8",
  language_detection: true,
  prompt: "League of Legends roles",
};

const url = `${baseUrl}/v2/transcript`;
let res = await fetch(url, {
  method: "POST",
  headers: { ...headers, "Content-Type": "application/json" },
  body: JSON.stringify(data),
});
if (!res.ok) throw new Error(`Error: ${res.status}`);
const response = await res.json();

const transcriptId = response.id;
const pollingEndpoint = `${baseUrl}/v2/transcript/${transcriptId}`;

while (true) {
  res = await fetch(pollingEndpoint, { headers });
  if (!res.ok) throw new Error(`Error: ${res.status}`);
  const transcriptionResult = await res.json();

  if (transcriptionResult.status === "completed") {
    console.log(transcriptionResult.text);
    break;
  } else if (transcriptionResult.status === "error") {
    throw new Error(`Transcription failed: ${transcriptionResult.error}`);
  } else {
    await new Promise((resolve) => setTimeout(resolve, 3000));
  }
}

import { AssemblyAI } from "assemblyai";

const client = new AssemblyAI({
  apiKey: "<YOUR_API_KEY>",
});

const audioFile = "https://assembly.ai/prompt-8";

const params = {
  audio: audioFile,
  language_detection: true,
  prompt: "League of Legends roles",
};

const run = async () => {
  const transcript = await client.transcripts.transcribe(params);
  console.log(transcript.text);
};

run();

Prompting guide

Contextual prompts work at three levels of specificity. Use the least specific level that covers your use case, and add detail when your audio contains uncommon names or terms the model can’t otherwise know.

Level	Length	What it contains	Example
Domain	2–5 words	The domain only	`Medical consultation call.`
Scenario	5–15 words	What the conversation is about	`Cardiology consultation about chest pain symptoms.`
Detailed	20–50 words	Full description, including names, products, or identifiers	`Cardiology consultation between Dr. Smith and an elderly patient regarding recurring chest pain, ECG results, and medication adjustment for hypertension.`

Guidelines for writing contextual prompts:

Write plain, complete sentences that describe the audio.
Keep it to one short block of text. Don’t pack lists of keywords into the contextual prompt; use keyterms_prompt for those.
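
As a rough illustration of the three levels in the table above, the sketch below counts the words in a prompt and reports which length ranges it falls into. The helper name `matching_levels` and the whitespace-based word count are assumptions made for this example; they are not part of the API, and the ranges are simply copied from the table.

```python
# Word-count ranges copied from the table above (inclusive bounds).
LEVEL_RANGES = {
    "domain": (2, 5),
    "scenario": (5, 15),
    "detailed": (20, 50),
}


def matching_levels(prompt: str) -> list:
    """Return the names of the levels whose length range contains this prompt.

    Hypothetical helper: words are counted by splitting on whitespace, which
    is only an approximation of how prompt length might be measured.
    """
    word_count = len(prompt.split())
    return [
        name
        for name, (low, high) in LEVEL_RANGES.items()
        if low <= word_count <= high
    ]


if __name__ == "__main__":
    for text in (
        "Medical consultation call.",
        "Cardiology consultation about chest pain symptoms.",
    ):
        print(text, "->", matching_levels(text))
```

The ranges overlap at 5 words and leave a gap between 16 and 19 words, so the helper can return two names or an empty list. Treat the result as a hint about length only, not as a validation rule.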

Keyterms prompting

Use the keyterms_prompt parameter to provide up to 1,000 words or phrases (maximum 6 words per phrase). This improves transcription accuracy for those terms and for related variations or contextually similar phrases.

The following example uses keyterms prompting to improve transcription accuracy for a name with distinctive spelling and formatting.

Without keyterms prompting:

Hi, this is Kelly Byrne Donahue

With keyterms prompting:

Hi, this is Kelly Byrne-Donoghue

Examples, in order: Python (requests), JavaScript (fetch), Python SDK, JavaScript SDK.

import requests
import time

base_url = "https://api.assemblyai.com"
headers = {"authorization": "<YOUR_API_KEY>"}

data = {
    "audio_url": "https://assemblyaiassets.com/audios/keyterms_prompting.wav",
    "language_detection": True,
    "keyterms_prompt": ["Kelly Byrne-Donoghue"]
}

response = requests.post(base_url + "/v2/transcript", headers=headers, json=data)

if response.status_code != 200:
    print(f"Error: {response.status_code}, Response: {response.text}")
    response.raise_for_status()

transcript_response = response.json()
transcript_id = transcript_response["id"]
polling_endpoint = f"{base_url}/v2/transcript/{transcript_id}"

while True:
    transcript = requests.get(polling_endpoint, headers=headers).json()
    if transcript["status"] == "completed":
        print(transcript["text"])
        break
    elif transcript["status"] == "error":
        raise RuntimeError(f"Transcription failed: {transcript['error']}")
    else:
        time.sleep(3)

const baseUrl = "https://api.assemblyai.com";
const headers = {
  authorization: "<YOUR_API_KEY>",
};

const data = {
  audio_url: "https://assemblyaiassets.com/audios/keyterms_prompting.wav",
  language_detection: true,
  keyterms_prompt: ["Kelly Byrne-Donoghue"],
};

const url = `${baseUrl}/v2/transcript`;
let res = await fetch(url, {
  method: "POST",
  headers: { ...headers, "Content-Type": "application/json" },
  body: JSON.stringify(data),
});
if (!res.ok) throw new Error(`Error: ${res.status}`);
const response = await res.json();

const transcriptId = response.id;
const pollingEndpoint = `${baseUrl}/v2/transcript/${transcriptId}`;

while (true) {
  res = await fetch(pollingEndpoint, { headers });
  if (!res.ok) throw new Error(`Error: ${res.status}`);
  const transcriptionResult = await res.json();

  if (transcriptionResult.status === "completed") {
    console.log(transcriptionResult.text);
    break;
  } else if (transcriptionResult.status === "error") {
    throw new Error(`Transcription failed: ${transcriptionResult.error}`);
  } else {
    await new Promise((resolve) => setTimeout(resolve, 3000));
  }
}

import assemblyai as aai

aai.settings.api_key = "<YOUR_API_KEY>"

audio_file = "https://assemblyaiassets.com/audios/keyterms_prompting.wav"

config = aai.TranscriptionConfig(
    language_detection=True,
    keyterms_prompt=["Kelly Byrne-Donoghue"]
)

transcript = aai.Transcriber(config=config).transcribe(audio_file)

print(transcript.text)

import { AssemblyAI } from "assemblyai";

const client = new AssemblyAI({
  apiKey: "<YOUR_API_KEY>",
});

const audioFile = "https://assemblyaiassets.com/audios/keyterms_prompting.wav";

const params = {
  audio: audioFile,
  language_detection: true,
  keyterms_prompt: ["Kelly Byrne-Donoghue"],
};

const transcript = await client.transcripts.transcribe(params);

console.log(transcript.text);

Keyword count limits

While we support up to 1,000 keywords and phrases, actual capacity may be lower due to internal tokenization and implementation constraints. Key points to remember:

Each word in a multi-word phrase counts towards the 1,000-keyword limit.
Capitalization affects capacity (uppercase tokens consume more than lowercase).
Longer words consume more capacity than shorter words.

For optimal results, use shorter phrases when possible and be mindful of your total token count when approaching the keyword limit.
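
The documented limits can be checked before sending a request. The sketch below is a minimal pre-flight check, assuming that words are separated by whitespace (so a hyphenated name counts as one word). The helper name `check_keyterms` is hypothetical. It cannot account for the tokenization effects described above, so a list that passes may still exceed the real capacity.

```python
MAX_WORDS_PER_PHRASE = 6   # documented per-phrase maximum
MAX_TOTAL_WORDS = 1000     # documented overall limit; every word in a phrase counts


def check_keyterms(keyterms):
    """Return a list of problems found; an empty list means none were detected.

    Hypothetical helper: word counting by whitespace is an assumption and does
    not reflect the service's internal tokenization.
    """
    problems = []
    total_words = 0
    for term in keyterms:
        word_count = len(term.split())
        total_words += word_count
        if word_count == 0:
            problems.append("empty keyterm")
        elif word_count > MAX_WORDS_PER_PHRASE:
            problems.append(
                f"'{term}' has {word_count} words (max {MAX_WORDS_PER_PHRASE})"
            )
    if total_words > MAX_TOTAL_WORDS:
        problems.append(
            f"{total_words} words in total (max {MAX_TOTAL_WORDS})"
        )
    return problems


if __name__ == "__main__":
    print(check_keyterms(["Kelly Byrne-Donoghue"]))
    print(check_keyterms(["one two three four five six seven"]))
```

Running the check on the list you plan to pass as keyterms_prompt gives early feedback, but the API response remains the authority on what is accepted.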

Need help?

If you’d like help building or optimizing a prompt for your audio, our team can help. Open a live chat or email us via the widget in the bottom-right corner.
