fix(sdk): stop chat.createSession wedging on stop and erroring on continuation boots (#3920)

ericallam · web-flow · commit 47834198fc81 · 2026-06-12T14:07:45.000+01:00
## Summary

Fixes two bugs that break chats built directly on
`chat.createSession()`:

1. **Stopping a generation wedged the run forever.** `turn.complete()`
bare-awaited the AI SDK's `totalUsage` promise, which never settles
after a stop-abort. The run stayed stuck inside the stopped turn (trace
shows a permanently partial `ai.streamText` span and no further `waiting
for next message`), so the chat could never take another message. Fixed
with the same 2s `Promise.race` guard that `chat.agent`'s turn loop
already uses; if the guard times out, the turn completes without
recording usage.

2. **Continuation runs invoked the model with an empty prompt.** The
first turn only waited for a message on `preload` boots. A continuation
run (spawned after a cancel, crash, or version upgrade) arrives with the
boot payload stripped, so the loop ran a turn with zero messages and
errored with `AI_InvalidPromptError: messages must not be empty`.
Message-less continuation boots now wait for the next session input
("waiting for first message (continuation)"), and `turn.continuation` is
preserved across the wait so user code can seed stored history off it.
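
The first-turn decision described in item 2 can be sketched as a few pure functions. This is a minimal illustration, not the SDK's code: `BootPayload` is a hypothetical, trimmed-down payload shape, and the helper names `shouldWaitForFirstMessage`, `firstMessageSpanName`, and `carryContinuation` are invented here. Only the `"preload"` trigger value, the `continuation` and `message` fields, and the two span names come from the change itself.

```typescript
// Hypothetical, trimmed-down payload shape; the real wire payload has more fields.
type BootPayload = {
  trigger?: string;
  continuation?: boolean;
  message?: unknown;
};

// A continuation boot whose payload carries no message to run a turn with.
function isMessagelessContinuationBoot(p: BootPayload): boolean {
  return p.continuation === true && !p.message;
}

// Only the first turn (turn === 0) waits before running; later turns
// are handled by the regular "wait for the next message" path.
function shouldWaitForFirstMessage(turn: number, p: BootPayload): boolean {
  return turn === 0 && (p.trigger === "preload" || isMessagelessContinuationBoot(p));
}

// Span name used while waiting, so traces distinguish the two boot kinds.
function firstMessageSpanName(p: BootPayload): string {
  return p.trigger === "preload"
    ? "waiting for first message"
    : "waiting for first message (continuation)";
}

// After the wait, the incoming message payload replaces the boot payload.
// The continuation flag is carried over unless the new payload sets it itself.
function carryContinuation(boot: BootPayload, next: BootPayload): BootPayload {
  if (isMessagelessContinuationBoot(boot) && next.continuation === undefined) {
    return { ...next, continuation: true };
  }
  return next;
}
```

Carrying the flag over is what lets user code still see `turn.continuation` on the first real turn and seed stored history from it.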

Both bugs were reproduced, and both fixes verified, end-to-end against a
live environment (stop followed by a next turn; cancel followed by a
continuation turn with seeded history), in addition to running the
existing unit suite.
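
The stop guard from item 1 follows a general pattern: race a promise that may never settle against a timer, and treat a timeout as "no value". A minimal standalone sketch of that pattern follows. `Usage` is a hypothetical stand-in for the AI SDK's `LanguageModelUsage`, and `usageWithin` is a helper name invented for this example; the actual change inlines the race in the turn loop with a fixed 2s timeout.

```typescript
// Hypothetical stand-in for the AI SDK's LanguageModelUsage; only one field is modeled.
type Usage = { totalTokens: number };

// Resolves to the usage if it arrives within `ms`, otherwise to `undefined`.
// A rejected usage promise is also treated as "no usage" (non-fatal).
async function usageWithin(
  totalUsage: PromiseLike<Usage>,
  ms: number
): Promise<Usage | undefined> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<undefined>((resolve) => {
    timer = setTimeout(() => resolve(undefined), ms);
  });
  try {
    // Whichever settles first wins; a hung `totalUsage` loses to the timer.
    return await Promise.race([totalUsage, timeout]);
  } catch {
    return undefined;
  } finally {
    // Unlike the inline guard in the diff, this sketch clears its timer.
    if (timer !== undefined) clearTimeout(timer);
  }
}
```

A caller would then update its per-turn and cumulative counters only when the result is defined, which is how the guarded turn loop can finish a stopped turn and go back to waiting for the next message.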
diff --git a/.changeset/create-session-stop-continuation.md b/.changeset/create-session-stop-continuation.md
@@ -0,0 +1,5 @@
+---
+"@trigger.dev/sdk": patch
+---
+
+Fix two `chat.createSession()` bugs: stopping a generation no longer wedges the run (the turn loop raced a `totalUsage` promise that never settles after a stop-abort), and continuation runs now wait for the next message instead of invoking the model with an empty prompt.
diff --git a/packages/trigger-sdk/src/v3/ai.ts b/packages/trigger-sdk/src/v3/ai.ts
@@ -8988,19 +8988,35 @@ function createChatSession(
         async next(): Promise<IteratorResult<ChatTurn>> {
           turn++;
 
-          // First turn: handle preload — wait for the first real message
-          if (turn === 0 && currentPayload.trigger === "preload") {
+          // First turn: wait when the boot payload carries no message.
+          // Preload boots wait for the first real message; continuation
+          // boots (fresh run via `ensureRunForSession` / end-and-continue)
+          // arrive with the sticky boot-payload fields stripped, so running
+          // a turn immediately would invoke the model with no user input.
+          const isMessagelessContinuationBoot =
+            currentPayload.continuation === true && !currentPayload.message;
+          if (turn === 0 && (currentPayload.trigger === "preload" || isMessagelessContinuationBoot)) {
             const result = await messagesInput.waitWithIdleTimeout({
               idleTimeoutInSeconds:
                 sessionIdleTimeoutOpt ?? currentPayload.idleTimeoutInSeconds ?? 30,
               timeout,
-              spanName: "waiting for first message",
+              spanName:
+                currentPayload.trigger === "preload"
+                  ? "waiting for first message"
+                  : "waiting for first message (continuation)",
             });
             if (!result.ok || runSignal.aborted) {
               stop.cleanup();
               return { done: true, value: undefined };
             }
+            const continuationBoot = isMessagelessContinuationBoot;
             currentPayload = result.output;
+            // Preserve the continuation flag — the wire payload of the next
+            // message doesn't carry it, and `turn.continuation` is how the
+            // user knows to seed history (e.g. `turn.setMessages(stored)`).
+            if (continuationBoot && currentPayload.continuation === undefined) {
+              currentPayload = { ...currentPayload, continuation: true };
+            }
           }
 
           // Subsequent turns: wait for the next message
@@ -9170,14 +9186,22 @@ function createChatSession(
                 }
               }
 
-              // Capture token usage from the streamText result
+              // Capture token usage from the streamText result. Race with a 2s
+              // timeout — on stop-abort the AI SDK's totalUsage promise can hang
+              // indefinitely, which would wedge the turn loop (same guard as
+              // chat.agent's turn loop).
               let turnUsage: LanguageModelUsage | undefined;
               if (typeof (source as any).totalUsage?.then === "function") {
                 try {
-                  const usage: LanguageModelUsage = await (source as any).totalUsage;
-                  turnUsage = usage;
-                  previousTurnUsage = usage;
-                  cumulativeUsage = addUsage(cumulativeUsage, usage);
+                  const usage = (await Promise.race([
+                    (source as any).totalUsage,
+                    new Promise<undefined>((r) => setTimeout(() => r(undefined), 2_000)),
+                  ])) as LanguageModelUsage | undefined;
+                  if (usage) {
+                    turnUsage = usage;
+                    previousTurnUsage = usage;
+                    cumulativeUsage = addUsage(cumulativeUsage, usage);
+                  }
                 } catch {
                   /* non-fatal */
                 }
