layout	default
title	Chapter 4: Conversational AI
nav_order	4
parent	OpenAI Realtime Agents Tutorial

Chapter 4: Conversational AI

Welcome to Chapter 4: Conversational AI. In this part of OpenAI Realtime Agents Tutorial: Voice-First AI Systems, you will build an intuitive mental model first, then move into concrete implementation details and practical production tradeoffs.

Great realtime conversation design is about policy, pacing, and recoverability, not just response quality.

Learning Goals

By the end of this chapter, you should be able to:

define turn-management rules for voice interactions
structure prompt/policy layers to reduce regressions
maintain bounded context in long sessions
design graceful recovery paths for misunderstanding

Turn Management Principles

acknowledge quickly when operations may take time
keep spoken responses concise and easy to parse
ask one clarifying question at a time
confirm risky actions before tool execution

Policy Layering Model

Separate policy concerns so updates stay targeted:

interaction policy: tone, brevity, pacing
domain policy: workflow and business constraints
safety policy: prohibited actions/escalation triggers
tool policy: when and how external actions are allowed

Context Management in Long Sessions

summarize older turns into compact state
maintain explicit slot state (intent, entities, pending actions)
avoid replaying full transcript when compressed memory is sufficient
track unresolved tasks independently of raw transcript

Recovery Patterns

When confidence drops or user correction appears:

restate interpreted intent
ask for explicit confirmation
avoid irreversible side effects
fallback to human handoff where required

Conversational Quality Checks

Check	Why It Matters
interruption continuity	prevents broken conversation after barge-in
clarification rate	reveals understanding quality
task completion rate	measures practical utility
escalation correctness	protects user trust and safety

Evaluation Loop

Run weekly conversation evals using real transcripts:

sample high-friction sessions
classify failure category (policy, context, tool, latency)
apply smallest targeted change
rerun benchmark set before release

Source References

Summary

You now have a conversation-design framework that holds up under interruption, ambiguity, and production constraints.

Next: Chapter 5: Function Calling

Source Code Walkthrough

`src/app/hooks/useHandleSessionHistory.ts`

The handleAgentToolEnd function in src/app/hooks/useHandleSessionHistory.ts handles a key part of this chapter's functionality:

    );    
  }
  function handleAgentToolEnd(details: any, _agent: any, _functionCall: any, result: any) {
    const lastFunctionCall = extractFunctionCallByName(_functionCall.name, details?.context?.history);
    addTranscriptBreadcrumb(
      `function call result: ${lastFunctionCall?.name}`,
      maybeParseJson(result)
    );
  }

  function handleHistoryAdded(item: any) {
    console.log("[handleHistoryAdded] ", item);
    if (!item || item.type !== 'message') return;

    const { itemId, role, content = [] } = item;
    if (itemId && role) {
      const isUser = role === "user";
      let text = extractMessageText(content);

      if (isUser && !text) {
        text = "[Transcribing...]";
      }

      // If the guardrail has been tripped, this message is a message that gets sent to the 
      // assistant to correct it, so we add it as a breadcrumb instead of a message.
      const guardrailMessage = sketchilyDetectGuardrailMessage(text);
      if (guardrailMessage) {
        const failureDetails = JSON.parse(guardrailMessage);
        addTranscriptBreadcrumb('Output Guardrail Active', { details: failureDetails });
      } else {
        addTranscriptMessage(itemId, role, text);
      }

This function is important because it defines how OpenAI Realtime Agents Tutorial: Voice-First AI Systems implements the patterns covered in this chapter.

`src/app/hooks/useHandleSessionHistory.ts`

The handleHistoryAdded function in src/app/hooks/useHandleSessionHistory.ts handles a key part of this chapter's functionality:

  }

  function handleHistoryAdded(item: any) {
    console.log("[handleHistoryAdded] ", item);
    if (!item || item.type !== 'message') return;

    const { itemId, role, content = [] } = item;
    if (itemId && role) {
      const isUser = role === "user";
      let text = extractMessageText(content);

      if (isUser && !text) {
        text = "[Transcribing...]";
      }

      // If the guardrail has been tripped, this message is a message that gets sent to the 
      // assistant to correct it, so we add it as a breadcrumb instead of a message.
      const guardrailMessage = sketchilyDetectGuardrailMessage(text);
      if (guardrailMessage) {
        const failureDetails = JSON.parse(guardrailMessage);
        addTranscriptBreadcrumb('Output Guardrail Active', { details: failureDetails });
      } else {
        addTranscriptMessage(itemId, role, text);
      }
    }
  }

  function handleHistoryUpdated(items: any[]) {
    console.log("[handleHistoryUpdated] ", items);
    items.forEach((item: any) => {
      if (!item || item.type !== 'message') return;

This function is important because it defines how OpenAI Realtime Agents Tutorial: Voice-First AI Systems implements the patterns covered in this chapter.

`src/app/hooks/useHandleSessionHistory.ts`

The handleHistoryUpdated function in src/app/hooks/useHandleSessionHistory.ts handles a key part of this chapter's functionality:

  }

  function handleHistoryUpdated(items: any[]) {
    console.log("[handleHistoryUpdated] ", items);
    items.forEach((item: any) => {
      if (!item || item.type !== 'message') return;

      const { itemId, content = [] } = item;

      const text = extractMessageText(content);

      if (text) {
        updateTranscriptMessage(itemId, text, false);
      }
    });
  }

  function handleTranscriptionDelta(item: any) {
    const itemId = item.item_id;
    const deltaText = item.delta || "";
    if (itemId) {
      updateTranscriptMessage(itemId, deltaText, true);
    }
  }

  function handleTranscriptionCompleted(item: any) {
    // History updates don't reliably end in a completed item, 
    // so we need to handle finishing up when the transcription is completed.
    const itemId = item.item_id;
    const finalTranscript =
        !item.transcript || item.transcript === "\n"
        ? "[inaudible]"

This function is important because it defines how OpenAI Realtime Agents Tutorial: Voice-First AI Systems implements the patterns covered in this chapter.

`src/app/hooks/useHandleSessionHistory.ts`

The handleTranscriptionDelta function in src/app/hooks/useHandleSessionHistory.ts handles a key part of this chapter's functionality:

  }

  function handleTranscriptionDelta(item: any) {
    const itemId = item.item_id;
    const deltaText = item.delta || "";
    if (itemId) {
      updateTranscriptMessage(itemId, deltaText, true);
    }
  }

  function handleTranscriptionCompleted(item: any) {
    // History updates don't reliably end in a completed item, 
    // so we need to handle finishing up when the transcription is completed.
    const itemId = item.item_id;
    const finalTranscript =
        !item.transcript || item.transcript === "\n"
        ? "[inaudible]"
        : item.transcript;
    if (itemId) {
      updateTranscriptMessage(itemId, finalTranscript, false);
      // Use the ref to get the latest transcriptItems
      const transcriptItem = transcriptItems.find((i) => i.itemId === itemId);
      updateTranscriptItem(itemId, { status: 'DONE' });

      // If guardrailResult still pending, mark PASS.
      if (transcriptItem?.guardrailResult?.status === 'IN_PROGRESS') {
        updateTranscriptItem(itemId, {
          guardrailResult: {
            status: 'DONE',
            category: 'NONE',
            rationale: '',
          },

This function is important because it defines how OpenAI Realtime Agents Tutorial: Voice-First AI Systems implements the patterns covered in this chapter.

How These Components Connect

flowchart TD
    A[handleAgentToolEnd]
    B[handleHistoryAdded]
    C[handleHistoryUpdated]
    D[handleTranscriptionDelta]
    E[handleTranscriptionCompleted]
    A --> B
    B --> C
    C --> D
    D --> E

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Chapter 4: Conversational AI

Learning Goals

Turn Management Principles

Policy Layering Model

Context Management in Long Sessions

Recovery Patterns

Conversational Quality Checks

Evaluation Loop

Source References

Summary

Source Code Walkthrough

`src/app/hooks/useHandleSessionHistory.ts`

`src/app/hooks/useHandleSessionHistory.ts`

`src/app/hooks/useHandleSessionHistory.ts`

`src/app/hooks/useHandleSessionHistory.ts`

How These Components Connect

FilesExpand file tree

04-conversational-ai.md

Latest commit

History

04-conversational-ai.md

File metadata and controls

Chapter 4: Conversational AI

Learning Goals

Turn Management Principles

Policy Layering Model

Context Management in Long Sessions

Recovery Patterns

Conversational Quality Checks

Evaluation Loop

Source References

Summary

Source Code Walkthrough

src/app/hooks/useHandleSessionHistory.ts

src/app/hooks/useHandleSessionHistory.ts

src/app/hooks/useHandleSessionHistory.ts

src/app/hooks/useHandleSessionHistory.ts

How These Components Connect

`src/app/hooks/useHandleSessionHistory.ts`

`src/app/hooks/useHandleSessionHistory.ts`

`src/app/hooks/useHandleSessionHistory.ts`

`src/app/hooks/useHandleSessionHistory.ts`