fix(sdk,core): head-start handover correctness and continuation boot latency (#3907)
## Summary
Three related fixes for `chat.headStart` and continuation boots, found
while investigating customer reports.
**1. `chat.headStart` now works with `hydrateMessages`.** The turn-0
handover splice only ran on the default accumulation path, so agents
registering `hydrateMessages` silently lost the warm route's step-1
response: pure-text turns fired `onTurnComplete` with no assistant
message (and an empty durable write), tool-call turns re-ran step 1 from
scratch under a fresh `messageId`, and the head-start user message never
reached the hydrate hook at all. The first-turn history now reaches
`hydrateMessages` as `incomingMessages`, and the splice runs after both
accumulation branches, deduplicated by the handover `messageId`.
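
A minimal sketch of the deduplicated splice described above, not the SDK's actual code. The `ChatMessage` shape, the `spliceHandover` and `accumulateFirstTurn` helpers, and the simplified synchronous `hydrateMessages` signature are all hypothetical and exist only to illustrate the idea: both accumulation branches feed one splice, and the handover `messageId` keeps the partial from being added twice.

```typescript
// Hypothetical shapes for illustration; the SDK's real types may differ.
interface ChatMessage {
  id: string;
  role: "user" | "assistant";
  parts: unknown[];
}

// Splice the warm route's step-1 partial into the accumulated history.
// Deduplicate on the handover messageId so the partial is never added twice.
function spliceHandover(
  accumulated: ChatMessage[],
  partial: ChatMessage,
): ChatMessage[] {
  const alreadyPresent = accumulated.some((m) => m.id === partial.id);
  return alreadyPresent ? accumulated : [...accumulated, partial];
}

// Both accumulation branches (default and hydrate) end in the same splice.
// The hydrate hook here is a simplified, synchronous stand-in.
function accumulateFirstTurn(
  incomingMessages: ChatMessage[],
  partial: ChatMessage,
  hydrateMessages?: (incoming: ChatMessage[]) => ChatMessage[],
): ChatMessage[] {
  const accumulated = hydrateMessages
    ? hydrateMessages(incomingMessages)
    : incomingMessages;
  return spliceHandover(accumulated, partial);
}
```

If a hydrate hook has already restored the partial from durable storage, the splice leaves the history unchanged; otherwise the partial is appended once, under its original `messageId`.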
**2. Reasoning parts survive the handover.** The synthesized partial
only mapped text and tool-call parts, so an extended-thinking model's
step-1 reasoning streamed to the browser but never reached durable
history. Reasoning parts now map through with provider metadata, so
Anthropic thinking signatures survive a UIMessage round trip on hydrate
replays.
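
The mapping change can be pictured roughly as follows. This is a sketch under assumptions: the `StepPart` and `HistoryPart` unions and the `toHistoryParts` helper are hypothetical names, and the real part shapes in the SDK may differ. The point is only that the reasoning branch exists and passes provider metadata through untouched.

```typescript
type ProviderMetadata = Record<string, unknown>;

// Hypothetical shape of the parts produced by the warm route's step 1.
type StepPart =
  | { type: "text"; text: string }
  | { type: "tool-call"; toolCallId: string; toolName: string; input: unknown }
  | { type: "reasoning"; text: string; providerMetadata?: ProviderMetadata };

// Hypothetical shape of the parts written to the synthesized partial.
type HistoryPart =
  | { type: "text"; text: string }
  | { type: "tool-call"; toolCallId: string; toolName: string; input: unknown }
  | { type: "reasoning"; text: string; providerMetadata?: ProviderMetadata };

// Map every step-1 part into the partial. Before the fix only the text and
// tool-call branches existed, so reasoning never reached durable history.
function toHistoryParts(stepParts: StepPart[]): HistoryPart[] {
  return stepParts.map((part): HistoryPart => {
    switch (part.type) {
      case "text":
        return { type: "text", text: part.text };
      case "tool-call":
        return {
          type: "tool-call",
          toolCallId: part.toolCallId,
          toolName: part.toolName,
          input: part.input,
        };
      case "reasoning":
        // Pass provider metadata through unchanged so signed reasoning
        // can be replayed to the provider later.
        return {
          type: "reasoning",
          text: part.text,
          providerMetadata: part.providerMetadata,
        };
    }
  });
}
```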
**3. Continuation boots no longer stall for ~10 seconds.** The `.in`
resume cursor was found by draining an SSE subscription that only closes
after its 5-second inactivity window, and the scan ran twice per boot.
The cursor now comes from a non-blocking records read of the latest
turn-complete header, and that read runs at most once per boot. The boot
reads also run concurrently, and chat snapshots carry the cursor so
subsequent boots skip the scan entirely. Measured locally on a
cancel-then-continue repro: pre-turn continuation latency dropped from
~11s to ~0.5s.
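
The cursor lookup order can be sketched as below. Everything named here is hypothetical: the `StreamRecord` and `ChatSnapshot` shapes, the `event` header key, the `resumeCursor` field, and both helpers. The sketch also leaves out the concurrent boot reads and covers only the lookup: prefer the cursor carried by the snapshot, otherwise scan the records once for the latest turn-complete header, and return `undefined` (the no-cursor fallback) when no record carries headers.

```typescript
// Hypothetical record and snapshot shapes for illustration.
interface StreamRecord {
  seq: number;
  headers?: Record<string, string>;
}

interface ChatSnapshot {
  resumeCursor?: number;
}

// Walk the records from newest to oldest and return the position of the
// latest turn-complete header, or undefined when none is found.
function findResumeCursor(records: StreamRecord[]): number | undefined {
  for (let i = records.length - 1; i >= 0; i--) {
    const record = records[i];
    if (record === undefined) continue;
    if (record.headers?.["event"] === "turn-complete") return record.seq;
  }
  return undefined;
}

// Prefer the cursor stored in the snapshot; read the records only when
// the snapshot predates the field or does not exist.
function resolveResumeCursor(
  snapshot: ChatSnapshot | undefined,
  readRecords: () => StreamRecord[],
): number | undefined {
  if (snapshot?.resumeCursor !== undefined) return snapshot.resumeCursor;
  return findResumeCursor(readRecords());
}
```

Because the snapshot check comes first, a boot from a recent snapshot never touches the records at all, which is the "skip the scan entirely" case.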
Every fix was verified red-green: new unit tests reproduced each failure
before the fix, and end-to-end smoke tests against a live local stack
covered both handover legs, reasoning persistence with extended thinking
(including a follow-up turn that round-trips the persisted signed
reasoning back to the provider), and the boot timing comparison.
## Rollout
SDK-only; no server change required. A new SDK against a server that
does not serialize record headers degrades to the existing no-cursor
fallback. Old SDKs ignore the new snapshot field, and new SDKs fall back
to the records scan on snapshots written before it existed.
Continuation chat boots no longer stall for around 10 seconds before the first turn. The `session.in` resume cursor is now found with a non-blocking records read instead of draining an SSE long-poll (which always waited out its full 5-second inactivity window, twice per boot). The boot reads now run concurrently, and chat snapshots carry the cursor so subsequent boots skip the scan entirely.
Fix `chat.headStart` when `hydrateMessages` is registered. The warm route's step-1 partial now reaches the agent's accumulator on the hydrate path, so `onTurnComplete` carries the full first turn (the head-start user message included), tool-call handovers resume from step 2 instead of re-running step 1, and the assistant `messageId` stays stable across the handover.
Preserve reasoning parts across the `chat.headStart` handover. Extended-thinking models' step-1 reasoning now lands in the durable session history (and `onTurnComplete`) under the same assistant `messageId`, with provider metadata intact so Anthropic thinking signatures survive replays.