| layout | default |
|---|---|
| title | Chapter 5: Function Calling |
| nav_order | 5 |
| parent | OpenAI Realtime Agents Tutorial |
Welcome to Chapter 5: Function Calling. In this part of the OpenAI Realtime Agents Tutorial: Voice-First AI Systems, you will first build a mental model of how a tool call flows through a realtime session, then move into concrete implementation details and practical production tradeoffs.
Function calling is where realtime agents move from conversation to action. It must be fast, safe, and auditable.
By the end of this chapter, you should be able to:
- implement a reliable tool-call lifecycle
- enforce schema and authorization checks before execution
- design robust error and timeout handling for realtime UX
- return structured outputs that improve downstream response quality

A tool call moves through five stages:

1. model emits tool request with arguments
2. gateway validates schema and authorization
3. tool executes with timeout and retry policy
4. structured result (or structured error) returns to session
5. assistant synthesizes user-facing response

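
The lifecycle stages above can be sketched as a single gateway function. This is a minimal sketch under stated assumptions, not the Realtime API or any SDK surface: `ToolRequest`, `Caller`, `ToolSpec`, `Envelope`, `withTimeout`, and `handleToolCall` are hypothetical names introduced only for illustration, and retries are left out to keep the flow readable.

```typescript
// All names in this block are hypothetical and exist only for illustration.
type ToolRequest = { name: string; args: unknown; callId: string };
type Caller = { userId: string; allowedTools: ReadonlySet<string> };
type Envelope =
  | { status: "ok"; data: unknown; trace_id: string }
  | { status: "error"; error_code: string; message: string; retryable: boolean };

type ToolSpec = {
  validate: (args: unknown) => boolean; // schema check on model-generated arguments
  execute: (args: unknown) => Promise<unknown>; // the side-effecting work
  timeoutMs: number; // per-call time budget
};

function toolError(error_code: string, message: string, retryable: boolean): Envelope {
  return { status: "error", error_code, message, retryable };
}

// Rejects with Error("timeout") if `work` has not settled within `ms`.
function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error("timeout")), ms);
    work.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (err) => { clearTimeout(timer); reject(err); },
    );
  });
}

async function handleToolCall(
  req: ToolRequest,
  caller: Caller,
  registry: Map<string, ToolSpec>,
): Promise<Envelope> {
  // Stage 2: only registered tools, valid arguments, and permitted callers proceed.
  const spec = registry.get(req.name);
  if (!spec) return toolError("UNKNOWN_TOOL", `No tool named ${req.name}`, false);
  if (!spec.validate(req.args)) return toolError("INVALID_ARGUMENTS", "Arguments failed validation", false);
  if (!caller.allowedTools.has(req.name)) return toolError("FORBIDDEN", "Caller may not use this tool", false);

  // Stages 3 and 4: execute under a time budget and always return an envelope.
  try {
    const data = await withTimeout(spec.execute(req.args), spec.timeoutMs);
    return { status: "ok", data, trace_id: req.callId };
  } catch (err) {
    const timedOut = err instanceof Error && err.message === "timeout";
    return toolError(timedOut ? "TIMEOUT" : "TOOL_FAILED", "Tool did not complete", timedOut);
  }
}
```

Because every path returns an envelope, the session (stage 5) never has to guess whether a call is still pending. In this sketch only timeouts are marked retryable; that is a design choice for the example, not a rule.
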
| Requirement | Purpose |
|---|---|
| strict argument validation | prevents malformed or unsafe calls |
| auth and policy checks | enforces user/tenant permissions |
| timeout budgeting | protects responsiveness |
| idempotency keys | reduces duplicate side effects on retries |
| structured logging | supports forensic debugging |
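
To make the idempotency-key row in the table above concrete, here is a minimal in-memory sketch. `IdempotencyCache` and `idempotencyKey` are hypothetical names, and the synchronous `Map` stands in for whatever shared store with expiry a real deployment would use.

```typescript
// Hypothetical in-memory guard; a production system would likely use a shared store with a TTL.
class IdempotencyCache<T> {
  private readonly results = new Map<string, T>();

  // Runs `effect` at most once per key. Later calls with the same key return
  // the stored result instead of repeating the side effect.
  run(key: string, effect: () => T): T {
    if (this.results.has(key)) {
      return this.results.get(key) as T;
    }
    const result = effect(); // if this throws, nothing is stored and a retry runs it again
    this.results.set(key, result);
    return result;
  }
}

// Derive the key from the call's identity so a retry of the same call maps to the same key.
function idempotencyKey(sessionId: string, callId: string): string {
  return `${sessionId}:${callId}`;
}

// Usage: the retry reuses the first result, so the side effect happens once.
let refundsIssued = 0;
const refundCache = new IdempotencyCache<string>();
const refundKey = idempotencyKey("sess-1", "call-42");
const firstAttempt = refundCache.run(refundKey, () => { refundsIssued += 1; return "refund-issued"; });
const retryAttempt = refundCache.run(refundKey, () => { refundsIssued += 1; return "refund-issued"; });
```

Only successful results are stored in this sketch, so a failed attempt can still be retried under the same key.
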

To keep the conversation responsive while tools run:

- acknowledge long-running tools immediately
- stream progress where possible
- provide deterministic fallback when tool backend is unavailable
- never leave the user without a completion/error state

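
The acknowledgement and terminal-state guidance above can be sketched as a small wrapper. This is an illustration under assumptions: `UxEvent` and `runWithAck` are hypothetical names, and `emit` stands in for whatever channel delivers status to the user.

```typescript
// Hypothetical event shapes for user-visible tool status.
type UxEvent =
  | { kind: "ack"; message: string }
  | { kind: "completed"; data: unknown }
  | { kind: "failed"; message: string };

async function runWithAck(
  work: () => Promise<unknown>,
  emit: (event: UxEvent) => void,
  ackAfterMs: number,
  fallbackMessage: string,
): Promise<void> {
  // If the tool is still running after `ackAfterMs`, tell the user it is in progress.
  const ackTimer = setTimeout(
    () => emit({ kind: "ack", message: "Working on it..." }),
    ackAfterMs,
  );

  // Map both outcomes to exactly one terminal event. The failure branch always
  // produces the same fallback message, which keeps the behavior deterministic.
  const terminal: UxEvent = await Promise.resolve()
    .then(work)
    .then(
      (data): UxEvent => ({ kind: "completed", data }),
      (): UxEvent => ({ kind: "failed", message: fallbackMessage }),
    );

  clearTimeout(ackTimer);
  emit(terminal);
}
```

A fast tool emits only `completed`, a slow one emits `ack` and then `completed`, and a failing one emits `failed`, so the user always reaches a completion or error state.
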

Return successful results in a typed envelope:

```json
{
  "status": "ok",
  "data": {"order_id": "123", "state": "shipped"},
  "confidence": 0.98,
  "trace_id": "tool-req-abc"
}
```

For errors, keep an explicit shape (`status`, `error_code`, `message`, `retryable`).

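
One way to keep both shapes honest is a discriminated union, sketched below. The field names come from the example and the error shape above; the type and helper names (`ToolOk`, `ToolError`, `ToolResult`, `ok`, `fail`, `summarize`) and the error code in the usage line are assumptions made for this illustration.

```typescript
// Field names follow the envelope above; type and function names are hypothetical.
type ToolOk<T> = {
  status: "ok";
  data: T;
  confidence?: number;
  trace_id: string;
};

type ToolError = {
  status: "error";
  error_code: string;
  message: string;
  retryable: boolean;
};

type ToolResult<T> = ToolOk<T> | ToolError;

function ok<T>(data: T, traceId: string, confidence?: number): ToolOk<T> {
  // Only include `confidence` when the tool actually reported one.
  return {
    status: "ok",
    data,
    ...(confidence === undefined ? {} : { confidence }),
    trace_id: traceId,
  };
}

function fail(errorCode: string, message: string, retryable: boolean): ToolError {
  return { status: "error", error_code: errorCode, message, retryable };
}

// Switching on `status` narrows the type, so each branch sees only its own fields.
function summarize<T>(result: ToolResult<T>): string {
  switch (result.status) {
    case "ok":
      return `ok:${JSON.stringify(result.data)}`;
    case "error":
      return `error:${result.error_code}${result.retryable ? " (retryable)" : ""}`;
  }
}

const shipped = ok({ order_id: "123", state: "shipped" }, "tool-req-abc", 0.98);
const outage = fail("BACKEND_UNAVAILABLE", "Order service is down", true);
```

With this shape, code that consumes a result cannot read `data` from an error or `error_code` from a success without the compiler objecting.
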

Avoid these common mistakes:

- unrestricted tool access from model-generated arguments
- free-form text outputs instead of typed result envelopes
- silent tool failures without user-visible recovery
- long retries that block turn transitions

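
The last item above, long retries that block turn transitions, can be addressed by capping retries with a total time budget rather than only an attempt count. The sketch below is one possible approach; `RetryOptions` and `retryWithinBudget` are hypothetical names, and the injectable `now` and `sleep` exist only so the behavior can be checked without real delays.

```typescript
// Hypothetical options; `now` and `sleep` are injectable for deterministic testing.
type RetryOptions = {
  maxAttempts: number;
  budgetMs: number; // total time allowed across all attempts
  backoffMs: number; // fixed delay between attempts
  now?: () => number;
  sleep?: (ms: number) => Promise<void>;
};

async function retryWithinBudget<T>(
  attempt: () => Promise<T>,
  opts: RetryOptions,
): Promise<T> {
  const now = opts.now ?? Date.now;
  const sleep =
    opts.sleep ?? ((ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms)));
  const deadline = now() + opts.budgetMs;
  let lastError: unknown = new Error("no attempts made");

  for (let i = 1; i <= opts.maxAttempts; i++) {
    try {
      return await attempt();
    } catch (err) {
      lastError = err;
    }
    // Stop when attempts are exhausted or another backoff would overrun the budget.
    const isLastAttempt = i === opts.maxAttempts;
    if (isLastAttempt || now() + opts.backoffMs >= deadline) break;
    await sleep(opts.backoffMs);
  }
  throw lastError;
}
```

With `budgetMs: 250` and `backoffMs: 100`, a tool that keeps failing instantly is tried three times and then the last error is rethrown, regardless of how large `maxAttempts` is. The caller can then turn that error into a structured error result instead of stalling the turn.
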
You now have a production-safe tool-calling blueprint for realtime agents with clear reliability and security controls.
Next: Chapter 6: Voice Output

The `handleHistoryAdded` function in `src/app/hooks/useHandleSessionHistory.ts` adds each new history item to the transcript:

```typescript
function handleHistoryAdded(item: any) {
  console.log("[handleHistoryAdded] ", item);
  if (!item || item.type !== 'message') return;

  const { itemId, role, content = [] } = item;
  if (itemId && role) {
    const isUser = role === "user";
    let text = extractMessageText(content);

    if (isUser && !text) {
      text = "[Transcribing...]";
    }

    // If the guardrail has been tripped, this message is a message that gets sent to the
    // assistant to correct it, so we add it as a breadcrumb instead of a message.
    const guardrailMessage = sketchilyDetectGuardrailMessage(text);
    if (guardrailMessage) {
      const failureDetails = JSON.parse(guardrailMessage);
      addTranscriptBreadcrumb('Output Guardrail Active', { details: failureDetails });
    } else {
      addTranscriptMessage(itemId, role, text);
    }
  }
}
```

It ignores items that are not messages, shows `[Transcribing...]` for user messages whose text has not arrived yet, and records guardrail correction messages as breadcrumbs rather than as transcript messages.

The `handleHistoryUpdated` function in `src/app/hooks/useHandleSessionHistory.ts` refreshes transcript entries when history items change:

```typescript
function handleHistoryUpdated(items: any[]) {
  console.log("[handleHistoryUpdated] ", items);
  items.forEach((item: any) => {
    if (!item || item.type !== 'message') return;

    const { itemId, content = [] } = item;
    const text = extractMessageText(content);

    if (text) {
      updateTranscriptMessage(itemId, text, false);
    }
  });
}
```

It walks the updated items, skips anything that is not a message, and calls `updateTranscriptMessage(itemId, text, false)` for each message that has non-empty text.

The `handleTranscriptionDelta` function in `src/app/hooks/useHandleSessionHistory.ts` handles streamed transcription fragments:

```typescript
function handleTranscriptionDelta(item: any) {
  const itemId = item.item_id;
  const deltaText = item.delta || "";
  if (itemId) {
    updateTranscriptMessage(itemId, deltaText, true);
  }
}
```

It forwards each delta to `updateTranscriptMessage` with the third argument set to `true`, in contrast to the `false` passed when a full text is available. A missing delta defaults to an empty string, and events without an `item_id` are ignored.

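
The excerpts pass `true` as the third argument to `updateTranscriptMessage` for deltas and `false` for full text, which suggests an append-versus-replace flag. The sketch below models that reading with a hypothetical `TranscriptStore`; it is an assumption about the semantics, not the repository's implementation.

```typescript
// Hypothetical stand-in for the app's transcript state; not the repository's code.
class TranscriptStore {
  private readonly messages = new Map<string, string>();

  // append = true concatenates a streamed delta onto the existing text;
  // append = false replaces the text with the value provided.
  updateMessage(itemId: string, text: string, append: boolean): void {
    const previous = this.messages.get(itemId) ?? "";
    this.messages.set(itemId, append ? previous + text : text);
  }

  get(itemId: string): string | undefined {
    return this.messages.get(itemId);
  }
}

// Usage: two deltas accumulate, then the final transcript overwrites them.
const store = new TranscriptStore();
store.updateMessage("item-1", "Hel", true);
store.updateMessage("item-1", "lo", true);
const streamedText = store.get("item-1");
store.updateMessage("item-1", "Hello.", false);
const finalText = store.get("item-1");
```

Under this reading, deltas keep the transcript moving while the user speaks, and the completed event has the last word on what is displayed.
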
The `handleTranscriptionCompleted` function in `src/app/hooks/useHandleSessionHistory.ts` finalizes a transcript item:

```typescript
function handleTranscriptionCompleted(item: any) {
  // History updates don't reliably end in a completed item,
  // so we need to handle finishing up when the transcription is completed.
  const itemId = item.item_id;
  const finalTranscript =
    !item.transcript || item.transcript === "\n"
      ? "[inaudible]"
      : item.transcript;
  if (itemId) {
    updateTranscriptMessage(itemId, finalTranscript, false);
    // Use the ref to get the latest transcriptItems
    const transcriptItem = transcriptItems.find((i) => i.itemId === itemId);
    updateTranscriptItem(itemId, { status: 'DONE' });

    // If guardrailResult still pending, mark PASS.
    if (transcriptItem?.guardrailResult?.status === 'IN_PROGRESS') {
      updateTranscriptItem(itemId, {
        guardrailResult: {
          status: 'DONE',
          category: 'NONE',
          rationale: '',
        },
      });
    }
  }
}

function handleGuardrailTripped(details: any, _agent: any, guardrail: any) {
  console.log("[guardrail tripped]", details, _agent, guardrail);
  const moderation = extractModeration(guardrail.result.output.outputInfo);
  // ... (the excerpt ends here)
```

It substitutes `[inaudible]` for an empty or newline-only transcript, marks the item `DONE`, and resolves a guardrail result that is still `IN_PROGRESS` to `DONE` with category `NONE`. The excerpt stops at the start of `handleGuardrailTripped`, which logs the event and extracts moderation details from the guardrail output.

```mermaid
flowchart TD
    A[handleHistoryAdded]
    B[handleHistoryUpdated]
    C[handleTranscriptionDelta]
    D[handleTranscriptionCompleted]
    E[handleGuardrailTripped]
    A --> B
    B --> C
    C --> D
    D --> E
```