Function tools your own code runs. Declare them inline, handle the tool.call, and return a tool.result without breaking turn-taking.
Declare tools inline in session.tools when the logic must run in your own code: local state, in-memory lookups, or anything you don’t want to expose as an HTTP endpoint. The agent emits a tool.call; you run the tool and reply with a tool.result when reply.done is the latest event you’ve received.
If your tool just calls an HTTP API, prefer a server-side HTTP tool: AssemblyAI makes the request for you and your client never handles the round trip.
import asyncio, json, websockets

URL = "wss://agents.assemblyai.com/v1/ws"

TOOLS = [{
    "type": "function",
    "name": "get_weather",
    "description": "Get current weather for any city. Use this whenever the user asks about weather, temperature, or conditions. Prefer calling this over guessing.",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string", "description": "City name (e.g. London)"}},
        "required": ["city"],
    },
}]

async def main():
    # Newer websockets releases name this keyword additional_headers instead of extra_headers.
    async with websockets.connect(URL, extra_headers={"Authorization": "Bearer YOUR_KEY"}) as ws:
        # Declare the tool inline when configuring the session.
        await ws.send(json.dumps({
            "type": "session.update",
            "session": {
                "system_prompt": "You are a weather assistant. Call get_weather for weather questions. When in doubt, call the tool.",
                "greeting": "Hi! Ask me about the weather.",
                "tools": TOOLS,
                "output": {"type": "audio", "voice": "ivy"},
            },
        }))

        last_event, pending = None, []

        async def flush_if_idle():
            # Send results only when reply.done is the latest event received.
            if last_event != "reply.done" or not pending:
                return
            for t in pending:
                await ws.send(json.dumps({"type": "tool.result", "call_id": t["call_id"], "result": json.dumps(t["result"])}))
            pending.clear()

        async for raw in ws:
            event = json.loads(raw)
            t = event.get("type")
            if t == "tool.call" and event["name"] == "get_weather":
                # Hard-coded result for the demo; run your real lookup here.
                pending.append({"call_id": event["call_id"], "result": {"temp_c": 22, "description": "Sunny"}})
                await flush_if_idle()
            elif t in ("reply.started", "input.speech.started"):
                last_event = t
            elif t == "reply.done":
                last_event = t
                if event.get("status") == "interrupted":
                    pending.clear()  # agent moved on, drop stale results
                else:
                    await flush_if_idle()

asyncio.run(main())
Function tools use a JSON Schema for parameters and carry "type": "function". HTTP tools use the same JSON-Schema parameters; description, execution_mode, and timeout_seconds work the same way for both.
{
  "type": "session.update",
  "session": {
    "tools": [
      {
        "type": "function",
        "name": "get_weather",
        "description": "Get current weather for any city. Use this whenever the user asks about weather.",
        "parameters": {
          "type": "object",
          "properties": {
            "location": {"type": "string", "description": "City name, e.g. London"}
          },
          "required": ["location"]
        },
        "execution_mode": "interactive",
        "timeout_seconds": 120
      }
    ]
  }
}
Lead with format (“E.164”, “ISO-8601 date”, “lowercase”).
Always include an example.
Use enum for fixed sets.
Mark a field as required only if the tool truly can’t function without it; otherwise the model interrogates the user for values it doesn’t need.
The same accuracy hints (enum, examples, pattern, format) can go on each property here.
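As an illustration of the hints above, here is a sketch of a parameters schema for a hypothetical book_ride tool. The tool name and every field in it are assumptions invented for this example; only the keywords (enum, examples, pattern, format) are standard JSON Schema, and the session.update envelope follows the shape shown earlier.

```python
import json

# Hypothetical tool: the name and all fields are invented for illustration.
BOOK_RIDE_TOOL = {
    "type": "function",
    "name": "book_ride",
    "description": "Book a ride. Use this whenever the user asks for a pickup.",
    "parameters": {
        "type": "object",
        "properties": {
            "phone": {
                "type": "string",
                # Lead with the format, then give an example.
                "description": "E.164 phone number, e.g. +442071234567",
                "pattern": r"^\+[1-9]\d{1,14}$",
                "examples": ["+442071234567"],
            },
            "pickup_date": {
                "type": "string",
                "format": "date",
                "description": "ISO-8601 date, e.g. 2025-01-31",
                "examples": ["2025-01-31"],
            },
            "vehicle": {
                "type": "string",
                # A fixed set of values is expressed with enum.
                "enum": ["standard", "xl", "accessible"],
                "description": "Lowercase vehicle class, e.g. standard",
            },
        },
        # vehicle is optional, so the model need not ask the user for it.
        "required": ["phone", "pickup_date"],
    },
}

# Serialize the same way as the session.update message shown earlier.
payload = json.dumps({"type": "session.update", "session": {"tools": [BOOK_RIDE_TOOL]}})
```

Leaving vehicle out of required reflects the guidance above: the tool can pick a default, so the agent should not have to ask for it.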
parameters is not validated at session.update time. Malformed schemas (missing type: "object", broken enum) are accepted silently and break tool calling at runtime. Validate locally.
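One way to validate locally is a small pre-flight check run before sending session.update. The sketch below is a hand-rolled, stdlib-only check for the two failure modes named above plus a mismatched required list; check_tool_schema is a hypothetical helper, not part of any SDK, and it is far from full JSON Schema validation. A dedicated JSON Schema validator library would catch more.

```python
def check_tool_schema(tool: dict) -> list[str]:
    """Return a list of problems found in tool["parameters"] (empty if none)."""
    params = tool.get("parameters")
    if not isinstance(params, dict):
        return ["parameters must be an object"]

    problems = []
    if params.get("type") != "object":
        problems.append('parameters is missing "type": "object"')

    props = params.get("properties", {})
    if not isinstance(props, dict):
        problems.append("properties must be an object")
        props = {}

    required = params.get("required", [])
    if not isinstance(required, list):
        problems.append("required must be a list")
        required = []
    for name in required:
        if name not in props:
            problems.append(f"required field {name!r} is not in properties")

    for name, spec in props.items():
        if not isinstance(spec, dict):
            problems.append(f"{name}: property definition must be an object")
            continue
        if "enum" in spec and (not isinstance(spec["enum"], list) or not spec["enum"]):
            problems.append(f"{name}: enum must be a non-empty list")

    return problems
```

Calling it on each entry in your tools list and refusing to connect when the returned list is non-empty turns a silent runtime failure into an error at startup.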
Send tool.result when reply.done is the latest event you’ve received. Not earlier (agent is still mid-transition-phrase), not later (a new turn has started).
last_event: str | None = None
pending_tools: list[dict] = []

async def flush_if_idle():
    if last_event != "reply.done" or not pending_tools:
        return
    for tool in pending_tools:
        await ws.send(json.dumps({
            "type": "tool.result",
            "call_id": tool["call_id"],
            "result": json.dumps(tool["result"]),  # JSON string
        }))
    pending_tools.clear()

# In your event loop:
if t == "tool.call":
    result = run_tool(event["name"], event["arguments"])
    pending_tools.append({"call_id": event["call_id"], "result": result})
    await flush_if_idle()  # may already be idle if reply.done fired first
elif t in ("reply.started", "input.speech.started"):
    last_event = t  # turn in flight, hold results
elif t == "reply.done":
    last_event = t
    if event.get("status") == "interrupted":
        pending_tools.clear()  # agent moved on, drop stale results
    else:
        await flush_if_idle()
Two non-obvious bits:
Call flush_if_idle() from the tool.call handler. Your tool may return after reply.done already fired.
Update last_event on reply.started / input.speech.started so results that become available mid-turn are held until that turn ends.
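To see the hold-then-flush behaviour without a live connection, the same logic can be wrapped in a small class and driven with hand-written events. This is only a sketch: ToolResultGate, fake_send, the call_id "c1", and the event order in demo() are all made up for illustration, and the real server may order events differently.

```python
import asyncio
import json


class ToolResultGate:
    """Holds tool results until reply.done is the latest event seen."""

    def __init__(self, send):
        self._send = send        # async callable that takes one JSON string
        self._last_event = None
        self._pending = []

    async def on_event(self, event: dict, result=None):
        kind = event.get("type")
        if kind == "tool.call":
            self._pending.append({"call_id": event["call_id"], "result": result})
            await self._flush_if_idle()   # sends now only if already idle
        elif kind in ("reply.started", "input.speech.started"):
            self._last_event = kind       # turn in flight, hold results
        elif kind == "reply.done":
            self._last_event = kind
            if event.get("status") == "interrupted":
                self._pending.clear()     # drop stale results
            else:
                await self._flush_if_idle()

    async def _flush_if_idle(self):
        if self._last_event != "reply.done" or not self._pending:
            return
        for item in self._pending:
            await self._send(json.dumps({
                "type": "tool.result",
                "call_id": item["call_id"],
                "result": json.dumps(item["result"]),
            }))
        self._pending.clear()


async def demo():
    sent = []

    async def fake_send(message: str):
        sent.append(json.loads(message))

    gate = ToolResultGate(fake_send)
    await gate.on_event({"type": "reply.started"})
    await gate.on_event({"type": "tool.call", "call_id": "c1"}, result={"temp_c": 22})
    held = len(sent)                      # nothing sent while the turn is in flight
    await gate.on_event({"type": "reply.done"})
    return held, sent


held, sent = asyncio.run(demo())
```

Keeping the state on an object rather than in module-level variables also avoids scoping surprises when the event loop lives inside a function.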
The error field is read verbatim by the model. Weak errors cause guessing loops; specific errors get clean recoveries.
Weak (agent re-asks for everything):
{ "error": "Lookup failed." }
Strong (agent re-asks only for the field that failed):
{ "error": "Could not resolve DROPOFF 'Central train station'. Pickup resolved ('SW1A 1AA'). Ask the user for a UK postcode for the dropoff." }
Patterns: name the failing field, say what did work so the agent doesn’t re-ask for it, tell the agent what to ask for next.
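The three patterns can be folded into a small helper so every tool reports failures the same way. This is a sketch only: build_tool_error and its argument names are hypothetical, and the wording simply mirrors the strong example above.

```python
def build_tool_error(failed_field: str, failed_value: str,
                     resolved: dict[str, str], ask_for: str) -> dict:
    """Build an error payload that names the failure, what worked, and the next step."""
    # 1. Name the failing field and the value that could not be used.
    parts = [f"Could not resolve {failed_field.upper()} {failed_value!r}."]
    # 2. Say what did work so the agent doesn't re-ask for it.
    for field, value in resolved.items():
        parts.append(f"{field.capitalize()} resolved ({value!r}).")
    # 3. Tell the agent what to ask for next.
    parts.append(f"Ask the user for {ask_for}.")
    return {"error": " ".join(parts)}


error = build_tool_error(
    failed_field="dropoff",
    failed_value="Central train station",
    resolved={"pickup": "SW1A 1AA"},
    ask_for="a UK postcode for the dropoff",
)
```

With these inputs the helper reproduces the strong error shown above word for word.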