Tools let your agent take actions and fetch live data. There are two kinds, and most agents use the first:

HTTP tools (server-side)

Defined on your stored agent. You give a URL and a parameter list; AssemblyAI makes the request for you and feeds the result to the model. Your client does nothing.

Function tools (client-side)

Declared inline in session.tools. The agent emits a tool.call; your code runs the logic and sends back a tool.result. Use when the tool needs local state or custom logic.
This page covers what’s common to both: how parameter hints sharpen accuracy, how to get the agent to call your tools, execution modes, progressive reveal, and per-agent patterns.

Parameter hints improve accuracy

parameters is a JSON Schema object: { "type": "object", "properties": { … }, "required": [ … ] }, the same on both tool types. Beyond type and description, each property accepts standard JSON-Schema keywords that describe the shape of the value. They do two jobs:
  • Tool-calling accuracy. A spoken value that doesn’t fit the shape is rejected before the tool runs, and the agent re-asks for just that value instead of calling the tool with garbage.
  • Turn-detection accuracy. Knowing what a complete value looks like lets the agent tell whether the user has finished speaking it. It waits for all ten digits of a phone number instead of cutting in after “my number is four one five…”.
These keywords describe the value your tool receives, which is the value the agent extracts and normalizes, not the words the caller says out loud. A caller might say “four one five, five five five…”; the agent turns that into the value +14155552671. Your pattern and examples describe that final value, not the spoken form.
| Keyword | What it does | Good values |
| --- | --- | --- |
| enum | Restrict the value to a fixed set. The model picks one; anything else is rejected. | ["billing", "sales", "support"] |
| examples | Sample values in the exact form your tool receives them. Give 2–4 realistic ones. | ["+14155552671", "+442071838750"] |
| pattern | A regex the final value must match (Python re, matched against the whole value). Escape backslashes in JSON. | "\\+[1-9]\\d{1,14}" |
| format | A named JSON-Schema format the value should conform to. | "email", "date-time", "date" |
If you omit these, the agent infers the expected shape from the property’s description at runtime. Setting them explicitly is an override for when you want tight, predictable validation. A precise description is still what matters most; the hints sharpen it.

How to choose each keyword

enum: use it whenever the value is one of a known, small set. Removes “the model invented a category” bugs entirely.
{ "department": { "type": "string", "description": "Which team to route to.",
  "enum": ["billing", "sales", "support"] } }
examples: use these for any free-form value. Two to four realistic examples beat a long description. Every example should be a value the tool would actually accept (and match pattern if you set one).
{ "airport_code": { "type": "string", "description": "IATA airport code.",
  "examples": ["SFO", "LHR", "JFK"] } }
pattern: use this when the value has a strict format. It’s a Python regular expression matched against the whole value (so you don’t need ^ or $). In JSON you have to escape backslashes, so write \\d, not \d. The agent normalizes what it hears, then the value has to match or the agent re-asks for it.
| Value | pattern | format | examples |
| --- | --- | --- | --- |
| US ZIP | \\d{5}(-\\d{4})? | - | ["94103", "10001-2201"] |
| E.164 phone | \\+[1-9]\\d{1,14} | - | ["+14155552671"] |
| Order ID | [A-Z]{2}-\\d{5} | - | ["AB-12345"] |
| ISO date | \\d{4}-\\d{2}-\\d{2} | date | ["2026-06-09"] |
| Email | - | email | ["alex@acme.com"] |
\\d is any digit, [A-Z] is one uppercase letter, {2} is “exactly two”, {1,14} is “one to fourteen”, and ? makes the part before it optional. So [A-Z]{2}-\\d{5} matches an order ID like AB-12345.
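To see the JSON escaping and the whole-value rule together, the sketch below decodes a schema fragment and checks a few values locally. It assumes whole-value matching behaves like Python's re.fullmatch, which is a reasonable local approximation rather than a confirmed description of the server's validator; the helper name matches_whole_value is made up for illustration.

```python
import json
import re

# A property as it would appear in the JSON tool definition.
# Inside JSON the backslash is doubled: \\d
raw = r'{"order_id": {"type": "string", "pattern": "[A-Z]{2}-\\d{5}"}}'
schema = json.loads(raw)
pattern = schema["order_id"]["pattern"]  # after decoding: [A-Z]{2}-\d{5}


def matches_whole_value(pattern: str, value: str) -> bool:
    """True only if the entire value matches, so no ^ or $ is needed."""
    return re.fullmatch(pattern, value) is not None


print(pattern)
print(matches_whole_value(pattern, "AB-12345"))   # True
print(matches_whole_value(pattern, "AB-123456"))  # False: one digit too many
print(matches_whole_value(pattern, "xAB-12345"))  # False: extra leading character
```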
Keep patterns tight but not brittle. Too loose (.*) gives no benefit; too strict rejects legitimate values and traps the user in a re-ask loop. Make every value in examples match the pattern; if you can’t, loosen the pattern, not the examples.
format: a standard JSON-Schema named format (email, date-time, uri, …). Use it for well-known value types instead of hand-rolling a pattern.
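The rule that every example must match the pattern is easy to check before you deploy a tool. The following is a minimal sketch of such a check; examples_not_matching is a hypothetical helper, not part of any AssemblyAI SDK, and it again approximates whole-value matching with re.fullmatch.

```python
import re


def examples_not_matching(prop: dict) -> list[str]:
    """Return the examples that fail the property's pattern (empty if all pass)."""
    pattern = prop.get("pattern")
    if pattern is None:
        return []
    return [ex for ex in prop.get("examples", [])
            if re.fullmatch(pattern, str(ex)) is None]


# These are Python dicts, so the pattern uses a raw string instead of JSON escaping.
zip_ok = {"type": "string", "pattern": r"\d{5}(-\d{4})?",
          "examples": ["94103", "10001-2201"]}
zip_too_strict = {"type": "string", "pattern": r"\d{5}",
                  "examples": ["94103", "10001-2201"]}

print(examples_not_matching(zip_ok))          # []
print(examples_not_matching(zip_too_strict))  # ['10001-2201']
```

A non-empty result means the pattern is too strict for values you already consider valid, which is the case where you loosen the pattern.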

Worked example

A tool that books a callback, with phone constrained by pattern + examples and time-of-day by enum:
{
  "name": "schedule_callback",
  "description": "Schedule a callback to a phone number at a chosen time.",
  "parameters": {
    "type": "object",
    "properties": {
      "phone": {
        "type": "string",
        "description": "The number to call back, including country code.",
        "examples": ["+14155552671", "+442071838750"],
        "pattern": "\\+[1-9]\\d{1,14}"
      },
      "window": {
        "type": "string",
        "description": "Preferred time of day for the callback.",
        "enum": ["morning", "afternoon", "evening"]
      }
    },
    "required": ["phone", "window"]
  },
  "http": { "url": "https://api.example.com/callbacks", "http_method": "POST" }
}
With these set, if the user says “call me at four one five, five five five…” and trails off, the agent waits (the value doesn’t yet match \+[1-9]\d{1,14}) instead of cutting in. If STT garbles a digit, the agent re-asks for the phone number specifically rather than firing the tool with a bad value.
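If you want to reason about which values would be rejected, you can mirror the hints in a small local check. The sketch below is a hand-rolled approximation covering only enum, pattern, and required; it is not the platform's validator, and fields_to_reask is a hypothetical name.

```python
import re

# The hints from the schedule_callback definition above, as Python values.
SCHEDULE_CALLBACK_PROPERTIES = {
    "phone": {"type": "string", "pattern": r"\+[1-9]\d{1,14}"},
    "window": {"type": "string", "enum": ["morning", "afternoon", "evening"]},
}
REQUIRED = ["phone", "window"]


def fields_to_reask(args: dict) -> list[str]:
    """Return the parameters that are missing or fail their enum/pattern hint."""
    bad = []
    for name, spec in SCHEDULE_CALLBACK_PROPERTIES.items():
        if name not in args:
            if name in REQUIRED:
                bad.append(name)
            continue
        value = args[name]
        if "enum" in spec and value not in spec["enum"]:
            bad.append(name)
        elif "pattern" in spec and re.fullmatch(spec["pattern"], str(value)) is None:
            bad.append(name)
    return bad


print(fields_to_reask({"phone": "+14155552671", "window": "morning"}))  # []
print(fields_to_reask({"phone": "+1415555267O", "window": "morning"}))  # ['phone'] (letter O)
print(fields_to_reask({"phone": "+14155552671", "window": "noon"}))     # ['window']
print(fields_to_reask({"window": "evening"}))                           # ['phone']
```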

Getting the agent to call your tools

In rough order of impact:
  1. Strong tool descriptions. Treat description as “when should I reach for this?”, not “what does this do?”. Name the trigger (“Call this when the user asks about X”), name the anti-trigger (“Do not call this for Y”), and mention any precondition. Most “tool never fires” failures trace here.
  2. Strong parameter hints. Vague params produce missing or invented values, which the validator rejects (or worse, your tool runs on garbage). Lead with format, add an example, use enum for fixed sets. See Parameter hints.
  3. Default-to-call wording in system_prompt. “When in doubt, call the tool. A wasted call is fine. Answering wrong from memory is not.” Don’t stack exceptions.
  4. Few-shot examples in system_prompt are the strongest behavioural signal:
    User: “Where’s my order?” You: [call search_orders] “Looks like it’s out for delivery today.”
  5. Keep tool sets small (≤10 per phase). Past that, selection accuracy drops. See Progressive tool reveal.
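The points above can be turned into small helpers that keep descriptions and tool counts honest. This is one possible sketch: build_description and check_tool_count are hypothetical helpers, the search_orders wording is illustrative, and the limit of 10 simply restates the guidance in item 5.

```python
MAX_TOOLS_PER_PHASE = 10  # threshold from the guidance above


def build_description(trigger: str, anti_trigger: str = "", precondition: str = "") -> str:
    """Compose a description that answers 'when should I reach for this?'."""
    parts = [f"Call this when {trigger}."]
    if anti_trigger:
        parts.append(f"Do not call this for {anti_trigger}.")
    if precondition:
        parts.append(f"Precondition: {precondition}.")
    return " ".join(parts)


def check_tool_count(tools: list[dict]) -> None:
    """Raise if a phase exposes more tools than the recommended maximum."""
    if len(tools) > MAX_TOOLS_PER_PHASE:
        raise ValueError(
            f"{len(tools)} tools in one phase; split them with progressive reveal"
        )


search_orders = {
    "name": "search_orders",
    "description": build_description(
        trigger="the user asks where an order is or when it will arrive",
        anti_trigger="refunds or billing disputes",
        precondition="you have the order ID",
    ),
}

# Default-to-call wording plus one few-shot example, as described in items 3 and 4.
SYSTEM_PROMPT = (
    "When in doubt, call the tool. A wasted call is fine. "
    "Answering wrong from memory is not.\n"
    'User: "Where\'s my order?" You: [call search_orders] '
    '"Looks like it\'s out for delivery today."'
)

check_tool_count([search_orders])
print(search_orders["description"])
```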

Execution modes

Set execution_mode per tool to choose how the agent waits. This applies to both HTTP and function tools. The sequence diagrams below show the client-side tool.call/tool.result flow; for HTTP tools that round trip happens server-side, but the conversational behavior (interactive keeps talking, hold goes quiet) is identical.
| Use "interactive" for… | Use "hold" for… |
| --- | --- |
| DB lookups, REST calls, short calculations | Phone transfers, escalations |
| Returns under ~5 seconds | Long-running ops (>10s, async jobs) |
| Transition phrase (“let me check”) feels natural | Sensitive flows (payment auth, identity verification) |
Default to interactive. Two common mis-uses:
  • ❌ Wrapping a slow DB query in hold “to be safe”. Agent goes mute, user thinks the call dropped. Use interactive with a longer timeout_seconds.
  • ❌ Using interactive for a 30-second human transfer. Agent fills with small-talk; user gets suspicious.
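As a sketch of the first fix, here is a function tool for a slow lookup that stays in interactive mode with a longer timeout. The tool name lookup_order, its parameter, and the 20-second value are illustrative assumptions; the field names follow the tool definitions shown elsewhere on this page.

```python
import json

# A slow-ish lookup kept in interactive mode with a longer timeout,
# instead of being switched to hold.
lookup_order = {
    "type": "function",
    "name": "lookup_order",  # hypothetical tool name
    "description": "Call this when the user asks about the status of an order.",
    "parameters": {
        "type": "object",
        "properties": {
            "order_id": {"type": "string", "pattern": r"[A-Z]{2}-\d{5}"},
        },
        "required": ["order_id"],
    },
    "execution_mode": "interactive",
    "timeout_seconds": 20,  # illustrative value; size it to your slowest query
}

# json.dumps produces the doubled backslashes that JSON requires in the pattern.
print(json.dumps(lookup_order, indent=2))
```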

Interactive

server                                 client
  │  reply.started                       │
  │ ───────────────────────────────────► │
  │  reply.audio  ("let me check that")  │
  │ ───────────────────────────────────► │
  │  tool.call                           │  client accumulates result
  │ ───────────────────────────────────► │  (does NOT send tool.result yet)
  │  reply.done                          │
  │ ───────────────────────────────────► │  client drains pending results:
  │                                      │  tool.result
  │ ◄─────────────────────────────────── │
  │  reply.started                       │  agent delivers answer
  │ ───────────────────────────────────► │
  │  reply.audio  ("it's 22°C and sunny")│
  │ ───────────────────────────────────► │
  │  reply.done                          │
  │ ───────────────────────────────────► │
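The client side of this flow can be sketched as a handler that queues results on tool.call and sends them only on reply.done. The event field names call_id, name, arguments, and result are assumptions made for illustration (the diagram only names the event types), FakeSocket stands in for a real WebSocket, and get_weather returns canned data.

```python
import asyncio
import json


class FakeSocket:
    """Stand-in for the real WebSocket; records what the client sends."""

    def __init__(self):
        self.sent = []

    async def send(self, raw: str) -> None:
        self.sent.append(json.loads(raw))


def get_weather(city: str) -> dict:
    return {"city": city, "temp_c": 22, "conditions": "sunny"}  # canned data


TOOL_HANDLERS = {"get_weather": get_weather}


class InteractiveClient:
    def __init__(self, ws):
        self.ws = ws
        self.pending = []  # results waiting for reply.done

    async def handle_event(self, event: dict) -> None:
        if event["type"] == "tool.call":
            # Run the tool now, but hold the result back.
            result = TOOL_HANDLERS[event["name"]](**event["arguments"])
            self.pending.append({"type": "tool.result",
                                 "call_id": event["call_id"],  # assumed field
                                 "result": result})            # assumed field
        elif event["type"] == "reply.done":
            # The transition phrase has finished; drain pending results in order.
            while self.pending:
                await self.ws.send(json.dumps(self.pending.pop(0)))


async def demo():
    ws = FakeSocket()
    client = InteractiveClient(ws)
    await client.handle_event({"type": "reply.started"})
    await client.handle_event({"type": "tool.call", "call_id": "c1",
                               "name": "get_weather",
                               "arguments": {"city": "Paris"}})
    sent_before_done = len(ws.sent)
    await client.handle_event({"type": "reply.done"})
    return sent_before_done, ws.sent


sent_before_done, sent = asyncio.run(demo())
print(sent_before_done, sent)
```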

Hold

While the tool is in flight:
  1. Agent stays silent (no reply.started).
  2. User speech doesn’t trigger replies. Utterances are added to context but the agent doesn’t respond until you send tool.result or reply.create.
  3. tool.result auto-fires the next reply. Don’t also send reply.create after.
{
  "type": "function",
  "name": "transfer_call",
  "description": "Transfer the call to a human agent. Takes 15–30 seconds.",
  "parameters": {"type": "object", "properties": {"department": {"type": "string"}}, "required": ["department"]},
  "execution_mode": "hold",
  "timeout_seconds": 60
}
server                                 client
  │  tool.call (hold)                    │
  │ ───────────────────────────────────► │  kick off long-running op
  │                                      │  (agent silent, no reply.started)
  │                                      │
  │                                      │  reply.create { instructions: ... }
  │ ◄─────────────────────────────────── │  ── optional status update
  │  reply.started → reply.audio → done  │
  │ ───────────────────────────────────► │
  │                                      │  (op completes)
  │                                      │  tool.result
  │ ◄─────────────────────────────────── │
  │  reply.started                       │  auto-fired by tool.result
  │ ───────────────────────────────────► │
  │  reply.audio  ("all set...")         │
  │ ───────────────────────────────────► │
  │  reply.done                          │
  │ ───────────────────────────────────► │
During hold, the server does not emit transcript.user.delta or transcript.user in real time. Buffered transcripts flush when you send tool.result or reply.create. Live captioning pauses during the hold; nothing is dropped.

Status updates during hold

Send reply.create with optional instructions to make the agent speak mid-hold without ending it:
await ws.send(json.dumps({
    "type": "reply.create",
    "instructions": "Let the customer know you're still working on the transfer."
}))
The hold continues until you send the matching tool.result.
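One way to combine a long-running operation with periodic status updates is sketched below. The tool.result fields call_id and result are assumptions, FakeSocket replaces the real connection, and fake_transfer simulates the slow operation with a short sleep; run_hold_tool is a hypothetical helper.

```python
import asyncio
import json


class FakeSocket:
    """Stand-in for the real WebSocket; records what the client sends."""

    def __init__(self):
        self.sent = []

    async def send(self, raw: str) -> None:
        self.sent.append(json.loads(raw))


async def run_hold_tool(ws, call_id: str, operation, status_every: float) -> None:
    """Run a slow operation, sending a status update each time the interval elapses."""
    task = asyncio.create_task(operation)
    while True:
        done, _ = await asyncio.wait({task}, timeout=status_every)
        if done:
            break
        # Still running: ask the agent to speak without ending the hold.
        await ws.send(json.dumps({
            "type": "reply.create",
            "instructions": "Let the customer know you're still working on the transfer.",
        }))
    # The matching tool.result ends the hold and triggers the next reply.
    await ws.send(json.dumps({"type": "tool.result",
                              "call_id": call_id,        # assumed field
                              "result": task.result()}))  # assumed field


async def fake_transfer() -> dict:
    await asyncio.sleep(0.25)  # simulates a 15-30 second transfer
    return {"status": "connected"}


async def demo():
    ws = FakeSocket()
    await run_hold_tool(ws, "c1", fake_transfer(), status_every=0.1)
    return ws.sent


sent = asyncio.run(demo())
print([message["type"] for message in sent])
```

In a real client you would pick a status interval of several seconds; frequent updates make the agent sound anxious.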

Progressive tool reveal

For multi-step workflows (lookup → estimate → commit), don’t register all tools upfront. After each successful tool.result, send session.update adding the next phase’s tools, and update system_prompt to match. Why: a tool that isn’t in the current list can’t be called, so the model can’t fabricate a commit before the prerequisite step has run. Smaller per-phase tool sets also raise selection accuracy.

Worked example: taxi booking

session state              tools exposed
─────────────────────────  ──────────────────────────────────────────
session start              [lookup_postcode]
                                  │ user gives pickup postcode

                           ⚙ lookup_postcode("SW1A 1AA") → ✓
─────────────────────────  ──────────────────────────────────────────
tier 2 unlocked            [lookup_postcode, estimate_fare]
                                  │ user gives dropoff

                           ⚙ estimate_fare(...) → ✓
─────────────────────────  ──────────────────────────────────────────
tier 3 unlocked            [lookup_postcode, estimate_fare, book_ride,
                            get_booking, track_driver, cancel_ride]
                                  │ user confirms + name

                           ⚙ book_ride(name="Alex", ...) → ✓
Until lookup_postcode and then estimate_fare have both succeeded, the model has no book_ride tool. It can verbally promise a booking; it can’t create one.

Client-side wiring

import json

# The tool definitions (lookup_postcode, estimate_fare, ...), the prompts
# (TIER_2_PROMPT, TIER_3_PROMPT), and the open WebSocket `ws` are assumed to
# be defined elsewhere in your client.
TIER_1_TOOLS = [lookup_postcode]
TIER_2_TOOLS = [lookup_postcode, estimate_fare]
TIER_3_TOOLS = [lookup_postcode, estimate_fare, book_ride,
                get_booking, track_driver, cancel_ride]

tier_2_unlocked = tier_3_unlocked = False

async def maybe_unlock_next_tier(tool_name, result):
    """Call after each tool.result; widens the tool list once a step succeeds."""
    global tier_2_unlocked, tier_3_unlocked
    if result.get("error"):
        return  # a failed call never unlocks the next tier

    if not tier_2_unlocked and tool_name == "lookup_postcode" and result.get("postcode"):
        tier_2_unlocked = True
        # Send tools and prompt together so the prompt never names a hidden tool.
        await ws.send(json.dumps({"type": "session.update",
                                  "session": {"tools": TIER_2_TOOLS, "system_prompt": TIER_2_PROMPT}}))
    elif not tier_3_unlocked and tool_name == "estimate_fare" and result.get("estimated_fare"):
        tier_3_unlocked = True
        await ws.send(json.dumps({"type": "session.update",
                                  "session": {"tools": TIER_3_TOOLS, "system_prompt": TIER_3_PROMPT}}))
Update tools AND system_prompt together. Gating tools alone, while the prompt still references a now-hidden tool, can perform worse than no gating at all: the model hunts for a tool the prompt promised and stalls or improvises when it can’t find it. Strip or rewrite every prompt sentence that names a tool whose visibility changed.

Per-call state machine

The strongest form: every successful tool call is a state transition; each state owns a narrow prompt + small tool list.
| State | System prompt focus | Tools |
| --- | --- | --- |
| s0_greet | “Get pickup postcode. Nothing else.” | lookup_postcode |
| s2_quoting | “Call estimate_fare. Filler only; no fare numbers.” | estimate_fare |
| s4_have_name | “Call book_ride with captured pickup, dropoff, name.” | book_ride |
| s5_booked | “Read back confirmation. Offer track/cancel.” | get_booking, track_driver, cancel_ride |
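A table-driven version of this state machine might look like the sketch below. The states, prompts, and tool names come from the table; the transition map is an assumption (it jumps straight between the listed states and ignores any intermediate ones), and tool_def is a placeholder for your full tool definitions.

```python
STATES = {
    "s0_greet": {"prompt": "Get pickup postcode. Nothing else.",
                 "tools": ["lookup_postcode"]},
    "s2_quoting": {"prompt": "Call estimate_fare. Filler only; no fare numbers.",
                   "tools": ["estimate_fare"]},
    "s4_have_name": {"prompt": "Call book_ride with captured pickup, dropoff, name.",
                     "tools": ["book_ride"]},
    "s5_booked": {"prompt": "Read back confirmation. Offer track/cancel.",
                  "tools": ["get_booking", "track_driver", "cancel_ride"]},
}

# (current state, tool that succeeded) -> next state. Assumed for illustration.
TRANSITIONS = {
    ("s0_greet", "lookup_postcode"): "s2_quoting",
    ("s2_quoting", "estimate_fare"): "s4_have_name",
    ("s4_have_name", "book_ride"): "s5_booked",
}


def tool_def(name: str) -> dict:
    """Placeholder; a real definition also carries description and parameters."""
    return {"type": "function", "name": name}


def next_state(state: str, tool_name: str, result: dict) -> str:
    """Advance only when the tool call succeeded and a transition exists."""
    if result.get("error"):
        return state
    return TRANSITIONS.get((state, tool_name), state)


def session_update(state: str) -> dict:
    """Build the session.update payload: prompt and tools always change together."""
    spec = STATES[state]
    return {"type": "session.update",
            "session": {"tools": [tool_def(n) for n in spec["tools"]],
                        "system_prompt": spec["prompt"]}}


state = next_state("s0_greet", "lookup_postcode", {"postcode": "SW1A 1AA"})
print(state)                  # s2_quoting
print(session_update(state))
```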

Escape hatches

Real users go off-script. Two patterns, used together:
  • Transition tools: revise_pickup, revise_dropoff, restart, end_call exposed in every state. Model picks the right escape; orchestrator rolls state back.
  • respond_freely: a no-op tool in every state for tangential questions (“are you a real person?”). Model calls it instead of leaving the state.
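Both patterns can be layered onto a per-state tool list, roughly as follows. The escape tool names come from the list above; the rollback targets and the per-state tool map are assumptions chosen for illustration, and tools_for and apply_escape are hypothetical helpers.

```python
# Exposed in every state, in addition to that state's own tools.
ESCAPE_TOOLS = ["revise_pickup", "revise_dropoff", "restart", "end_call",
                "respond_freely"]

STATE_TOOLS = {
    "s0_greet": ["lookup_postcode"],
    "s2_quoting": ["estimate_fare"],
    "s4_have_name": ["book_ride"],
    "s5_booked": ["get_booking", "track_driver", "cancel_ride"],
}

# Where the orchestrator rolls back to for each transition tool (assumed targets).
ROLLBACK = {
    "revise_pickup": "s0_greet",
    "revise_dropoff": "s2_quoting",
    "restart": "s0_greet",
}


def tools_for(state: str) -> list[str]:
    """State-specific tools plus the escape hatches."""
    return STATE_TOOLS[state] + ESCAPE_TOOLS


def apply_escape(state: str, tool_name: str) -> str:
    """respond_freely and end_call keep the state; transition tools roll it back."""
    return ROLLBACK.get(tool_name, state)


print(tools_for("s4_have_name"))
print(apply_escape("s4_have_name", "revise_pickup"))   # s0_greet
print(apply_escape("s4_have_name", "respond_freely"))  # s4_have_name
```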

Anti-fabrication clause

Gating makes hallucinations harmless (no real booking happens) but doesn’t suppress the spoken claim. Pair with prompt wording:
NEVER quote a fare, distance, time, confirmation number, name, or ETA unless
those exact values came from a tool result in this conversation. If you
haven't seen a tool result, you do NOT have these values. Don't estimate
them. Don't guess. Don't say "around" a number.

Patterns by agent type

Customer support

s0:  [lookup_ticket]                                           ← always
s1:  [lookup_ticket, escalate_to_human (hold), close_ticket]   ← after lookup
any: [respond_freely, end_call]
Prompt focus: “Use lookup_ticket for any ticket question. Only escalate_to_human after checking the ticket. Don’t promise outcomes you can’t verify.”

Booking / reservations

s0:  [check_availability, cancel_reservation]                          ← always
s1a: [check_availability, create_reservation, cancel_reservation]      ← if available
s1b: [check_availability, add_to_waitlist, cancel_reservation]         ← if full
Prompt focus: “Confirm party size, date, time. Call check_availability. If open, offer it and book. If not, offer the next two times or the waitlist.” The next tool depends on the prior result. Don’t expose both create_reservation and add_to_waitlist simultaneously. The model picks the wrong one ~30% of the time.
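The branching can be expressed as a function of the check_availability result, so that the two commit tools are never exposed together. The result field available is an assumption about what your tool returns, and tools_after_availability is a hypothetical helper.

```python
S0_TOOLS = ["check_availability", "cancel_reservation"]
S1A_TOOLS = ["check_availability", "create_reservation", "cancel_reservation"]
S1B_TOOLS = ["check_availability", "add_to_waitlist", "cancel_reservation"]


def tools_after_availability(result: dict) -> list[str]:
    """Pick the next tool list from the check_availability result."""
    if result.get("error"):
        return list(S0_TOOLS)   # stay in s0 and let the agent retry
    if result.get("available"):
        return list(S1A_TOOLS)  # a slot is open: allow booking
    return list(S1B_TOOLS)      # full: allow the waitlist instead


print(tools_after_availability({"available": True}))
print(tools_after_availability({"available": False}))
print(tools_after_availability({"error": "timeout"}))
```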

Banking / account

s0:  [verify_identity (hold)]                                  ← gatekeeper
s1:  [get_balance, list_recent_transactions]                   ← after auth
s2:  [start_transfer, dispute_charge (hold), close_account]    ← actions
any: [end_call]
Prompt focus: “Before sharing any account info, call verify_identity. Never quote a balance or transaction you haven’t fetched. Never promise a dispute outcome; only the system can.” The anti-fabrication clause matters most here. A bank agent inventing a balance is a P0.

Debugging

Tool never fires

  • Description too vague. Name the user phrases that should trigger it.
  • System prompt missing “default to calling” wording.
  • Too many tools (>10). Drop or split via progressive reveal.
  • Add a few-shot example to the system prompt. This is the strongest signal.

Wrong arguments

  • Parameter description missing a format or example. Add one to the description (e.g. “ISO date such as 2026-04-30”), or set examples/pattern. See Parameter hints.
  • Free-text where you want fixed buckets. Use enum.
  • User says the value multiple ways. State the single normalised form you expect in the description.

Agent invents a result

Most common cause: the prompt asks the model to act on a tool result before the tool has actually been called. Two fixes, used together:
  1. Progressive reveal: gate the commit tool behind the read tool.
  2. Anti-fabrication clause in the prompt (see above).

Tool fires repeatedly

  • tool.result arriving while a reply is still in progress (your last_event is reply.started). Make sure your handler queues results and flushes them on reply.done.
  • Tool slower than timeout_seconds. Agent gets internal timeout, user tries again. Bump the timeout.

Tool fires when it shouldn’t

Description is too broad. Add explicit anti-triggers:
**Use this for**: weather questions.
**Do not call for**: general chit-chat, scheduling, or any non-weather topic.