chat(): duplicate TOOL_CALL_END (no preceding START) for server-executed tools breaks AG-UI verify #519

@AlemTuzlak

Summary

chat() emits a duplicate TOOL_CALL_END event (with no preceding TOOL_CALL_START) for every server-executed tool. The first TOOL_CALL_END comes from the adapter during streaming; the second comes from buildToolResultChunks() after server execution.

This violates the AG-UI streaming contract — @ag-ui/client's verifyEvents middleware (and any spec-strict consumer) treats a TOOL_CALL_END not preceded by a matching TOOL_CALL_START as a protocol error and rejects the stream.
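To make the violation concrete, here is a minimal sketch of the kind of check a spec-strict consumer performs. It is not the verifyEvents implementation: `ToolEvent` and `findOrphanEnd` are hypothetical names, and the event shape is reduced to the two fields that matter here.

```typescript
// Hypothetical minimal event shape; real AG-UI events carry more fields.
type ToolEvent = {
  type: "TOOL_CALL_START" | "TOOL_CALL_ARGS" | "TOOL_CALL_END" | "TOOL_CALL_RESULT";
  toolCallId: string;
};

// Returns a description of the first orphan TOOL_CALL_END, or null if none.
// Illustrative only: this is not how @ag-ui/client is actually written.
function findOrphanEnd(events: ToolEvent[]): string | null {
  const open = new Set<string>(); // tool calls that have started but not ended
  for (const ev of events) {
    if (ev.type === "TOOL_CALL_START") {
      open.add(ev.toolCallId);
    } else if (ev.type === "TOOL_CALL_END") {
      if (!open.has(ev.toolCallId)) {
        return `TOOL_CALL_END without matching TOOL_CALL_START: ${ev.toolCallId}`;
      }
      open.delete(ev.toolCallId);
    }
  }
  return null;
}

// Per-tool sequence described in this issue for one server-executed tool.
const observed: ToolEvent[] = [
  { type: "TOOL_CALL_START", toolCallId: "call_1" },
  { type: "TOOL_CALL_ARGS", toolCallId: "call_1" },
  { type: "TOOL_CALL_END", toolCallId: "call_1" }, // from the adapter
  { type: "TOOL_CALL_END", toolCallId: "call_1" }, // from buildToolResultChunks
  { type: "TOOL_CALL_RESULT", toolCallId: "call_1" },
];

console.log(findOrphanEnd(observed));
```

The first END closes the call opened by START, so the second END has nothing left to match and is reported as the violation.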

Reproducible on @tanstack/ai@0.14.0 (latest). I verified the same code path is on main.

Reproducer

import { chat, toolDefinition } from "@tanstack/ai";
import { openaiText } from "@tanstack/ai-openai";
import { z } from "zod";

const weatherTool = toolDefinition({
  name: "getWeather",
  description: "Get the weather for a city",
  inputSchema: z.object({ city: z.string() }),
}).server(async ({ city }) => ({ city, tempC: 21 }));

const stream = chat({
  adapter: openaiText("gpt-4o"),
  messages: [{ role: "user", content: "What's the weather in Paris?" }],
  tools: [weatherTool],
});

// Count events per toolCallId to see the duplicate END.
const counts = new Map<string, Record<string, number>>();
for await (const chunk of stream) {
  const id = (chunk as any).toolCallId;
  if (!id) continue;
  const c = counts.get(id) ?? {};
  c[chunk.type] = (c[chunk.type] ?? 0) + 1;
  counts.set(id, c);
}
console.log(counts);
// Observed:
// Map {
//   "<id>" => { TOOL_CALL_START: 1, TOOL_CALL_ARGS: N, TOOL_CALL_END: 2, TOOL_CALL_RESULT: 1 }
// }
// Expected: TOOL_CALL_END: 1

The same shape happens for undiscoveredLazyResults (see processToolCalls in chat/index.ts, around buildToolResultChunks(undiscoveredLazyResults, finishEvt)).

Root cause

In packages/typescript/ai/src/activities/chat/index.ts, buildToolResultChunks(results, finishEvent, argsMap?) always pushes a TOOL_CALL_END chunk, but only pushes TOOL_CALL_START + TOOL_CALL_ARGS when argsMap is provided:

https://github.com/TanStack/ai/blob/main/packages/typescript/ai/src/activities/chat/index.ts#L1198-L1240

private buildToolResultChunks(
  results: Array<ToolResult>,
  finishEvent: RunFinishedEvent,
  argsMap?: Map<string, string>,
): Array<StreamChunk> {
  const chunks: Array<StreamChunk> = []
  for (const result of results) {
    const content = JSON.stringify(result.result)

    if (argsMap) {
      chunks.push({ type: 'TOOL_CALL_START', /* ... */ })
      chunks.push({ type: 'TOOL_CALL_ARGS',  /* ... */ })
    }

    chunks.push({ type: 'TOOL_CALL_END',    /* ... */ })  // <-- always
    chunks.push({ type: 'TOOL_CALL_RESULT', /* ... */ })
    // ...
  }
}

The table below lists six call sites. The two continuation re-execution sites pass argsMap (and are spec-clean); the other four don't:

| Site | Path | argsMap | Result |
| --- | --- | --- | --- |
| index.ts ~L767 | checkForPendingToolCalls → undiscovered lazy | no | duplicate END |
| index.ts ~L847 / ~L874 | checkForPendingToolCalls → continuation re-execution | yes (added in #372) | OK |
| index.ts ~L924 | processToolCalls → undiscovered lazy | no | duplicate END |
| index.ts ~L1003 | processToolCalls → mixed approval / client + executed results | no | duplicate END |
| index.ts ~L1029 | processToolCalls → normal post-execution | no | duplicate END |

#372 (0.10.2, "Emit TOOL_CALL_START and TOOL_CALL_ARGS for pending tool calls during continuation re-executions") fixed the two continuation re-execution sites by threading argsMap through. The remaining sites were left as-is and still produce the orphan TOOL_CALL_END.

The duplicate happens inside iteration 1 of the agent loop (in the executeToolCalls cyclePhase), so agentLoopStrategy: maxIterations(1) does not work around it — shouldContinue() hardcodes if (this.cyclePhase === 'executeToolCalls') return true and the duplicate is emitted before any loop strategy is consulted.

Why it matters

Any AG-UI-spec-strict consumer rejects the stream. Concretely, CopilotKit's runtime pipes the chat() output through @ag-ui/client's verifyEvents middleware, which throws on TOOL_CALL_END without a matching TOOL_CALL_START. We're currently working around this in @copilotkit/runtime by stopping conversion at the first RUN_FINISHED (CopilotKit#4476), but that's a consumer-side guard and it discards real events from the second iteration of multi-turn agentic runs.

The same problem will surface for any other AG-UI-strict consumer (anything wired to @ag-ui/client's verifier, including AG-UI's own dev tooling).
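The cost of the consumer-side guard described above can be sketched as follows. This is not CopilotKit's code: `Chunk` and `takeThroughFirstRunFinished` are hypothetical names, the chunk ordering is an assumption based on the description in this issue, and an array stands in for the async stream to keep the example self-contained (the same check applies inside a `for await` loop).

```typescript
// Hypothetical minimal chunk shape for illustration.
type Chunk = { type: string; toolCallId?: string };

// Keeps chunks up to and including the first RUN_FINISHED, drops the rest.
// Whether the real guard forwards RUN_FINISHED itself is an assumption.
function takeThroughFirstRunFinished(chunks: Chunk[]): Chunk[] {
  const kept: Chunk[] = [];
  for (const chunk of chunks) {
    kept.push(chunk);
    if (chunk.type === "RUN_FINISHED") break;
  }
  return kept;
}

// Assumed ordering for one server-executed tool followed by a second iteration.
const stream: Chunk[] = [
  { type: "TOOL_CALL_START", toolCallId: "call_1" },
  { type: "TOOL_CALL_ARGS", toolCallId: "call_1" },
  { type: "TOOL_CALL_END", toolCallId: "call_1" },
  { type: "RUN_FINISHED" },
  { type: "TOOL_CALL_END", toolCallId: "call_1" },    // orphan END, dropped
  { type: "TOOL_CALL_RESULT", toolCallId: "call_1" }, // real event, also dropped
  { type: "TEXT_MESSAGE_CONTENT" },                   // second iteration, also dropped
  { type: "RUN_FINISHED" },
];

const forwarded = takeThroughFirstRunFinished(stream);
console.log(forwarded.map((c) => c.type));
```

The guard removes the orphan END, but the tool result and everything from the second iteration go with it, which is why a fix in chat() is preferable.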

Proposed fix

The post-execution TOOL_CALL_END is redundant — the adapter already emitted it during streaming for the same toolCallId. The post-execution phase only needs to emit the new information: TOOL_CALL_RESULT (and the assistant-message bookkeeping that already happens after).

Options, in order of preference:

  1. Drop the TOOL_CALL_END push from buildToolResultChunks. The adapter is already responsible for START / ARGS / END; buildToolResultChunks should only contribute TOOL_CALL_RESULT. This fixes every affected call site with one change and is semantically the most correct.
  2. Keep the current shape but emit a synthetic TOOL_CALL_START whenever TOOL_CALL_END is emitted — i.e. always pass an argsMap (build one from the executed ToolCall[] if the caller didn't). This keeps streams "self-contained" but adds duplicate START/ARGS events in the normal post-execution path, which is its own protocol-shape problem.
  3. Document that chat() output is not AG-UI-spec-conformant and require consumers to filter post-RUN_FINISHED events themselves.

Option 1 is what I'd ship. Happy to open a PR if you agree on the shape.
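A rough sketch of what Option 1 could look like for the paths that do not pass argsMap. The types are simplified stand-ins, not the library's real `ToolResult` and `StreamChunk`; the `toolCallId` and `content` field names and the `buildResultOnlyChunks` name are assumptions for illustration.

```typescript
// Hypothetical, simplified shapes; the real types carry more fields.
type ToolResult = { toolCallId: string; result: unknown };
type ResultChunk = { type: "TOOL_CALL_RESULT"; toolCallId: string; content: string };

// Emits only the new information produced by server execution.
// START / ARGS / END are assumed to have come from the adapter already.
function buildResultOnlyChunks(results: ToolResult[]): ResultChunk[] {
  return results.map((r): ResultChunk => ({
    type: "TOOL_CALL_RESULT",
    toolCallId: r.toolCallId,
    content: JSON.stringify(r.result), // same serialization as the excerpt above
  }));
}

const chunks = buildResultOnlyChunks([
  { toolCallId: "call_1", result: { city: "Paris", tempC: 21 } },
]);
console.log(chunks);
```

The continuation re-execution path, which emits its own START and ARGS via argsMap, would presumably still need a matching END; this sketch does not cover it.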

Versions

  • @tanstack/ai: 0.14.0
  • @tanstack/ai-openai: 0.8.2
  • Node: 20.x
  • Adapter: openaiText("gpt-4o")
