
feat(phase8 #76 D8.4a): FE AI SDK-compatible stream transport + part reducer #1700

Merged: earayu merged 3 commits into `main` from `fe/d8.4a-stream-client` on Apr 25, 2026

@earayu earayu commented Apr 25, 2026

Summary

D8.4a first-cut. Replaces the legacy AgentRuntimeRedisStore SSE consumer with a fetch + ReadableStream transport that speaks the AI SDK v5 UI Message Stream Protocol, hooks the new client into chat-messages.tsx through a narrow legacy-snapshot-shim, and lays the parts + transientActivity + status/error/abort/resume seam that #77 (parts renderer) and #78 (interactive consent + elicitation UI) will plug into. Architect msg=bad0cd0f locked the 6 client-side contracts; this PR delivers each in a verifiable form (see below).

Module layout (web/src/features/agent-runtime/)

| File | Purpose |
| --- | --- |
| `types.ts` | Wire `StreamPart` typed union (mirrors `aperag/domains/agent_runtime/wire/parts.py:1`) + at-rest `AgentMessagePart` (text / tool / source-url / source-document / citation / tool-consent / elicitation) shaped to align with `@ai-sdk/react`'s `UIMessagePart`. |
| `stream-parser.ts` | SSE frame parser: `id:` + `data:` only, comments/heartbeats skipped, trailing partial frames carried. |
| `stream-client.ts` | Single-connection consumer: validates `x-vercel-ai-ui-message-stream: v1` response header, forwards `Last-Event-ID` on resume, terminates on `finish` / `error` / `abort` and on local `AbortSignal`. |
| `reducer.ts` | Collapses wire lifecycle parts (`tool-input-*` / `tool-output-available`) into consolidated tool parts; dedups by stable id (text-block id / toolCallId / sourceId / elicitationId / citation fingerprint); transient `data-activity` is replace-last only and never reaches the persistent parts list. |
| `use-agent-turn-stream.ts` | React hook with reconnect loop; surfaces `{ parts, transientActivity, status, errorText, lastSequence, abort }` (this is the contract #77 / #78 build on). |
| `api.ts` | Typed JSON wrappers for create / cancel / snapshot / artifact + consent / elicitation submit endpoints. |
| `legacy-snapshot-shim.ts` | TODO(#77 dongdong) projection back to `AgentTurnSnapshot { turn, timeline, artifacts }` so the existing card renders during the transition. Narrow by design; see boundary below. |
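
To make the `stream-parser.ts` row concrete, here is a minimal sketch of a frame parser with the behaviour described above (`id:` + `data:` only, comments skipped, trailing partial frame carried). It is not the code from this PR: `SseFrame`, `ParseResult` and `parseSseChunk` are hypothetical names, and ignoring every other SSE field is an assumption.

```typescript
// Hypothetical sketch; not the actual stream-parser.ts implementation.
interface SseFrame {
  id: string | undefined; // value of the `id:` field, if the frame had one
  data: string;           // all `data:` lines of the frame joined with "\n"
}

interface ParseResult {
  frames: SseFrame[];
  rest: string; // trailing partial frame, to be passed back in as `carry`
}

function parseSseChunk(carry: string, chunk: string): ParseResult {
  // Normalise CRLF so a frame always ends with a blank line ("\n\n").
  const buffer = (carry + chunk).replace(/\r\n/g, "\n");
  const blocks = buffer.split("\n\n");
  // The last element is "" when the buffer ended on a frame boundary,
  // otherwise it is an incomplete frame that must wait for more bytes.
  const rest = blocks.pop() ?? "";
  const frames: SseFrame[] = [];

  for (const block of blocks) {
    let id: string | undefined;
    const dataLines: string[] = [];
    for (const line of block.split("\n")) {
      if (line.startsWith(":")) continue; // comment or heartbeat
      if (line.startsWith("id:")) {
        id = line.slice(3).trim();
      } else if (line.startsWith("data:")) {
        // Drop the single optional space after the colon.
        dataLines.push(line.slice(5).replace(/^ /, ""));
      }
      // Other fields (event:, retry:) are ignored in this sketch.
    }
    // Blocks without any data line (for example a lone heartbeat) yield no frame.
    if (dataLines.length > 0) frames.push({ id, data: dataLines.join("\n") });
  }
  return { frames, rest };
}
```

A caller would keep `rest` between network reads and pass it back as `carry`, so a frame split across two chunks is only emitted once it is complete.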

Consumer rationale: AI SDK-compatible transport (not `useChat`)

Per architect lock msg=ed98280c + Weston msg=563157a8 + symphony msg=592de041 + dongdong msg=8883b85f, this PR ships an AI SDK-compatible stream transport rather than a `useChat` adoption. Why:

  • ApeRAG's turn lifecycle is two-phase: client must `POST /agent/chats/{cid}/turns` first to obtain a `stream_url`, then `GET` that URL to begin the SSE body. `useChat`'s default lifecycle is single-step POST+stream; bending it would mean writing a custom `ChatTransport` of comparable complexity to this consumer.
  • All 6 client-side contracts are at the wire-protocol level, orthogonal to the hook implementation.
  • `@ai-sdk/react@^2.0.0` + `ai@^5.0.0` are still added so #77 can lean on the SDK's `isTextUIPart` / `isToolUIPart` / `isDataUIPart` type guards directly when rendering.

(Captured for D10 reference: ApeRAG agent runtime SSE is two-phase — POST `/turns` → GET `stream_url`. Worth normalising in the D8 doc later if symphony agrees, follow-up of D8.0c.)
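
The two-phase lifecycle can be sketched as follows. This is an illustration under assumptions, not the PR's `api.ts`: `HttpCall`, `HttpResponse` and `startTurn` are hypothetical names, the `{ message }` request body is invented for the example, and the response is assumed to be JSON with a top-level `stream_url` string.

```typescript
// Minimal HTTP surface so the sketch does not depend on fetch or DOM types.
interface HttpResponse {
  status: number;
  json(): Promise<unknown>;
}

type HttpCall = (
  method: "GET" | "POST",
  url: string,
  body?: unknown,
) => Promise<HttpResponse>;

// Phase 1: POST the turn and read `stream_url` from the response.
// Phase 2: GET that URL; the caller then consumes the SSE body.
async function startTurn(
  http: HttpCall,
  chatId: string,
  message: string,
): Promise<HttpResponse> {
  const created = await http(
    "POST",
    `/agent/chats/${encodeURIComponent(chatId)}/turns`,
    { message }, // hypothetical request body
  );
  if (created.status >= 400) {
    throw new Error(`create turn failed with status ${created.status}`);
  }
  const payload = (await created.json()) as { stream_url?: unknown };
  if (typeof payload.stream_url !== "string") {
    throw new Error("create turn response has no stream_url");
  }
  return http("GET", payload.stream_url);
}
```

Injecting `http` keeps the two steps visible and testable; in a browser it would presumably be a thin wrapper around `fetch`.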

6 client-side contracts (architect msg=bad0cd0f) — all delivered

  1. AI SDK v5 typed parts surface — `StreamPart` (`web/src/features/agent-runtime/types.ts`) mirrors BE wire parts 1:1; index re-exports `AgentMessagePart` shaped to align with `@ai-sdk/react`'s `UIMessagePart` so the renderer can use SDK type guards.
  2. Header marker validation — `stream-client.ts` rejects any response that does not advertise `x-vercel-ai-ui-message-stream: v1` before dispatching any `onPart`. Header constants exported as `AI_SDK_V5_HEADER` / `AI_SDK_V5_HEADER_VALUE`.
  3. Resume / error / abort — Reconnect loop in `use-agent-turn-stream.ts` sends `Last-Event-ID` header (from the highest seen SSE `id:` field) + `after_sequence` query on every connection attempt. `error` part flips `status='failed'` and the connection terminates; `abort` part flips `status='aborted'`. Local `abort()` triggers an `AbortController.abort()` so in-flight `fetch` is cancelled. Reconnect bounded at 5 attempts before surfacing `failed`.
  4. Part-level dedup by stable identifier (architect msg=f35c5a3d Lock C) — `reducer.ts` keys by `text-block id` / `toolCallId` / `sourceId` / `elicitationId` per part type. Citations have no native id; we fingerprint `cited_text + JSON.stringify(location)` since the BE replays the identical payload byte-for-byte (envelope-atomic replay).
  5. Wire shape adoption — wrapped `{type, data: {...}}` for `data-citation` / `data-tool-consent` / `data-elicitation` / `data-activity` passes through unchanged. Outer keys camelCase (`toolCallId`, `errorText`, `sourceId`, `mediaType`, `inputTextDelta`, `argsPreview`, etc.); inner `data` payloads keep the BE Pydantic alias casing exactly.
  6. Transient `data-activity` is render-only — never appears in `parts`; lives on the separate `transientActivity` slot, replace-last on each new frame. Doc'd at `reducer.ts` (`case 'data-activity':`).
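
Contracts 4 and 6 can be illustrated with a small reducer. The sketch below is a simplification under assumptions: `WirePart` covers only three part types, `TurnState`, `citationFingerprint` and `reduce` are hypothetical names, and prefixing the dedup key with the part type is a choice made here rather than something stated in the PR.

```typescript
// Simplified, hypothetical shapes; the real unions live in types.ts.
type WirePart =
  | { type: "source-url"; sourceId: string; url: string }
  | { type: "data-citation"; data: { cited_text: string; location: unknown } }
  | { type: "data-activity"; data: { label: string } };

interface TurnState {
  parts: { key: string; part: WirePart }[];
  transientActivity: { label: string } | null;
}

// Citations have no native id, so the replayed payload itself is the key.
function citationFingerprint(data: { cited_text: string; location: unknown }): string {
  return data.cited_text + JSON.stringify(data.location);
}

function reduce(state: TurnState, part: WirePart): TurnState {
  if (part.type === "data-activity") {
    // Replace-last: the activity never enters the persistent parts list.
    return { ...state, transientActivity: part.data };
  }
  const id = part.type === "source-url" ? part.sourceId : citationFingerprint(part.data);
  const key = `${part.type}:${id}`; // namespacing by type is an assumption of this sketch
  if (state.parts.some((p) => p.key === key)) {
    return state; // replayed frame after a reconnect: already present
  }
  return { ...state, parts: [...state.parts, { key, part }] };
}
```

Because the key is derived only from the payload, replaying the same frames after a `Last-Event-ID` resume leaves `parts` unchanged.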

Tool-output failure shape (minor architect-canonical drift, please confirm)

BE `parts.py:184` emits `tool-output-available` with an optional `errorText` field set on failure. The strict AI SDK v5 spec splits this into `tool-output-available` (success) / `tool-output-error` (failure). The reducer normalises both shapes onto a single `AgentToolPart.state ∈ {output-available, output-error}` with `errorText`, so consumers don't need to care. Calling out for symphony's review — happy to amend either side.
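
A minimal sketch of that normalisation, assuming simplified shapes: `ToolOutputWire`, `ToolResult` and `normaliseToolOutput` are hypothetical names, the `output` field is assumed, and treating any non-null `errorText` as a failure is an assumption about the BE payload.

```typescript
// Both wire shapes the reducer is described as accepting (simplified).
type ToolOutputWire =
  | {
      type: "tool-output-available";
      toolCallId: string;
      output?: unknown;
      errorText?: string | null; // current BE shape: set on failure
    }
  | { type: "tool-output-error"; toolCallId: string; errorText: string };

interface ToolResult {
  toolCallId: string;
  state: "output-available" | "output-error";
  output: unknown;
  errorText: string | undefined;
}

function normaliseToolOutput(part: ToolOutputWire): ToolResult {
  if (part.type === "tool-output-error") {
    // Strict AI SDK v5 shape: failure is its own part type.
    return {
      toolCallId: part.toolCallId,
      state: "output-error",
      output: undefined,
      errorText: part.errorText,
    };
  }
  if (part.errorText != null) {
    // Current BE shape: failure rides on tool-output-available.
    return {
      toolCallId: part.toolCallId,
      state: "output-error",
      output: undefined,
      errorText: part.errorText,
    };
  }
  return {
    toolCallId: part.toolCallId,
    state: "output-available",
    output: part.output,
    errorText: undefined,
  };
}
```

With this in place a consumer only branches on `state`, regardless of which shape the backend emitted.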

`legacy-snapshot-shim.ts` boundary (per PM lock msg=ed98280c)

  • Covered: `streamingAnswer` (grouped per text-block id, joined with `\n\n` so multi-block streams read naturally), patched turn status from the live stream, timeline + artifacts pass through from `baselineSnapshot` only (i.e. the snapshot endpoint's read-only history).
  • NOT covered: rich activity inference, debug previews, reference bundle items rendered from new `AgentMessagePart`, error_summary translation, timeline merging from the new `parts` stream. Those belong to the D8.4b renderer (#77).
  • Zero callers outside `chat-messages.tsx`. The shim is scheduled for deletion as part of #77.
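
The `streamingAnswer` grouping described under "Covered" might look roughly like this. `TextChunk` and `buildStreamingAnswer` are hypothetical names, and the input is assumed to be an ordered list of text fragments tagged with their text-block id.

```typescript
// Hypothetical input: text fragments in arrival order, tagged by text-block id.
interface TextChunk {
  id: string;
  text: string;
}

function buildStreamingAnswer(chunks: TextChunk[]): string {
  // Map preserves insertion order, so blocks stay in first-seen order.
  const byBlock = new Map<string, string>();
  for (const chunk of chunks) {
    byBlock.set(chunk.id, (byBlock.get(chunk.id) ?? "") + chunk.text);
  }
  // Blocks are separated by a blank line so multi-block streams read naturally.
  return Array.from(byBlock.values()).join("\n\n");
}
```

For example, fragments `Hel` and `lo` for one block followed by `World` for another produce the two blocks separated by a blank line.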

Verification

  • `yarn lint` — clean (`✔ No ESLint warnings or errors`).
  • `tsc --noEmit` — clean for the touched files. Pre-existing main-branch errors in `chat-input.tsx` / `page.tsx` / `collection-form.tsx` / `collection-provider.tsx` are unrelated and untouched here.
  • Dev boot smoke (per `feedback_fe_runtime_gate.md`) — `yarn dev` boots in 2.3s on port 3010; `GET /`, `GET /auth/signin`, `GET /workspace/collections`, `GET /workspace` all return 200; no new runtime errors in the dev log (the `Invalid keys: status.queued (...)` i18n warnings are pre-existing on main).

Test plan

  • Symphony architect canonical drift quick check (esp. tool-output failure shape; citation fingerprint as dedup key)
  • Weston blocker-level minimal CR — focus on the 6 wire/protocol contracts being verifiable
  • dongdong scoped review: part union shape, dedup keys, reload/resume contract, #77 renderer seam stability
  • Manual smoke on a live agent chat once a fresh tenant is available (POST a turn → live text stream → tool call lifecycle → completion)
  • Confirm `Last-Event-ID` resume on a forced disconnect (DevTools → throttling → offline → online)

Hand-off seams for #77 / #78

  • #77 parts renderer → consumes `useAgentTurnStream(...) → { parts, transientActivity, status, errorText, ... }`. Will replace the `AgentTurnCard` consumer of `legacy-snapshot-shim`. Can delete `legacy-snapshot-shim.ts` once the new renderer ships.
  • #78 interactive consent / elicitation UI → filters `parts` for `type === 'data-tool-consent'` / `type === 'data-elicitation'` and calls `decideToolConsent(...)` / `submitElicitation(...)` from `api.ts`.

🤖 Generated with Claude Code

earayu and others added 3 commits April 25, 2026 22:26
…reducer

D8.4a first-cut. Replaces the legacy AgentRuntimeRedisStore SSE consumer
with a fetch+ReadableStream transport that speaks the AI SDK v5 UI
Message Stream Protocol. Hooks the new client into `chat-messages.tsx`
through a narrow `legacy-snapshot-shim` so `AgentTurnCard` keeps
rendering until the parts renderer (#77) ships.

Module layout (`web/src/features/agent-runtime/`):
* `types.ts` — wire `StreamPart` typed union (mirrors
  `aperag/domains/agent_runtime/wire/parts.py`) + at-rest
  `AgentMessagePart` (text / tool / source / citation / consent /
  elicitation) shaped to align with `@ai-sdk/react`'s `UIMessagePart`.
* `stream-parser.ts` — SSE frame parser (handles `id:` + `data:` only,
  ignores comments/heartbeats; carries trailing partial frames).
* `stream-client.ts` — single-connection consumer; validates
  `x-vercel-ai-ui-message-stream: v1` response header, forwards
  `Last-Event-ID` on resume, terminates on `finish` / `error` /
  `abort` and on local `AbortSignal`.
* `reducer.ts` — collapses lifecycle wire parts (`tool-input-*` /
  `tool-output-available`) into consolidated tool parts; dedups by
  stable id (text-block id / toolCallId / sourceId / elicitationId /
  citation fingerprint); transient `data-activity` is replace-last
  only and never reaches the persistent parts list.
* `use-agent-turn-stream.ts` — React hook with reconnect loop; surfaces
  `{ parts, transientActivity, status, errorText, lastSequence,
  abort }` to consumers (#77 / #78).
* `api.ts` — typed JSON wrappers for create/cancel/snapshot/artifact +
  consent/elicitation submit endpoints (#78 plug-in surface).
* `legacy-snapshot-shim.ts` — TODO(#77 dongdong) projection back to
  `AgentTurnSnapshot { turn, timeline, artifacts }` so the existing
  card renders during the transition. Boundary: streamingAnswer
  (grouped per text-block id), patched turn status; timeline +
  artifacts pass through from the baseline snapshot only.

Wire-protocol contracts (architect msg=bad0cd0f) — all verifiable in
the consumer:
1. AI SDK v5 typed parts surface (`StreamPart` mirrors BE; index
   re-exports SDK-aligned `AgentMessagePart` shapes).
2. Header marker — `x-vercel-ai-ui-message-stream: v1` checked before
   any `onPart` dispatch.
3. Resume / error / abort — `Last-Event-ID` header + `after_sequence`
   query on every reconnect; `error` part dispatched then connection
   terminates (no auto-retry on protocol failure beyond reconnect
   loop bounded at 5 attempts); `abort` part flips status and the
   `AbortController` cleans up.
4. Part-level dedup — by stable identifier per part type (architect
   msg=f35c5a3d Lock C); envelope-atomic replay tolerated.
5. Wire shape adoption — wrapped `{type, data:{...}}` for
   `data-citation/data-tool-consent/data-elicitation/data-activity`
   passes through unchanged; outer keys camelCase.
6. Transient `data-activity` — never persisted; surfaced on the
   separate `transientActivity` slot.
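
Contract 3 can be sketched as a bounded reconnect loop. This is an illustration, not the hook from this PR: `Connect`, `ConnectResult`, `buildRequest` and `runWithReconnect` are hypothetical names. It assumes the SSE `id:` value doubles as the `after_sequence` value, that the first attempt (with no id seen yet) sends neither, and that "5 attempts" means five reconnects after the initial connection.

```typescript
type ConnectResult =
  | { reason: "completed" }
  | { reason: "aborted" }
  | { reason: "failed" } // an `error` part arrived: terminal, no retry
  | { reason: "disconnected"; lastEventId: string | undefined }; // transport dropped

interface ConnectRequest {
  url: string;
  headers: Record<string, string>;
}

type Connect = (request: ConnectRequest) => Promise<ConnectResult>;

const MAX_RECONNECTS = 5;

function buildRequest(streamUrl: string, lastEventId: string | undefined): ConnectRequest {
  if (lastEventId === undefined) return { url: streamUrl, headers: {} };
  const separator = streamUrl.includes("?") ? "&" : "?";
  return {
    url: `${streamUrl}${separator}after_sequence=${encodeURIComponent(lastEventId)}`,
    headers: { "Last-Event-ID": lastEventId },
  };
}

async function runWithReconnect(
  streamUrl: string,
  connect: Connect,
): Promise<"completed" | "aborted" | "failed"> {
  let lastEventId: string | undefined;
  for (let reconnects = 0; reconnects <= MAX_RECONNECTS; reconnects++) {
    const result = await connect(buildRequest(streamUrl, lastEventId));
    if (result.reason !== "disconnected") return result.reason;
    // A connection that saw no frames keeps the previous resume point.
    lastEventId = result.lastEventId ?? lastEventId;
  }
  return "failed"; // reconnect budget exhausted
}
```

A production loop would likely add a delay between attempts; that is left out here to keep the resume logic in view.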

Two-phase lifecycle (ApeRAG-specific, captured for D10 reference):
client must POST `/agent/chats/{cid}/turns` first to obtain the
stream URL, then GET that URL to begin the SSE body. `useChat` is not
adopted because its single-step POST+stream lifecycle does not match.

`web/package.json`: adds `@ai-sdk/react@^2.0.0` + `ai@^5.0.0` for
typed parts surface (used today via re-exports; #77 will lean on
`isTextUIPart` / `isToolUIPart` / `isDataUIPart` directly).

Verified: `yarn lint` clean; `tsc --noEmit` clean for the touched
files (pre-existing main-branch errors in `chat-input.tsx` /
`page.tsx` / `collection-form.tsx` unrelated); `yarn dev` boots in
2.3s; GET `/`, `/auth/signin`, `/workspace/collections`, and
`/workspace` all return 200.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Per architect canonical decision (msg=2f9225f5) — strict AI SDK v5
spec splits tool failure into a separate `tool-output-error` part type
(`{toolCallId, errorText}`). BE migration tracked as task #89 (D8.0c+
hygiene fix-forward, owner @cuiwenbo). The reducer now accepts both
the current `tool-output-available + errorText` shape and the post-#89
`tool-output-error` shape so the FE rolls forward without coupling to
BE timing.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
… SDK-compatible parts

Weston msg=63a796f3 review identified two blockers within the locked
review boundary; both fixed in-PR.

## B1 — Terminal-driven completion (stream-client.ts)

Before: `consumeAgentStream()` returned `{reason:'completed'}` on
`reader.read()` `done`, regardless of whether a `finish` / `error` /
`abort` part had been dispatched. A clean mid-turn TCP close at the
HTTP layer would mark the turn completed instead of triggering the
reconnect loop, leaving #77 to render half-streamed parts as the
final message.

After: EOF without a terminal part returns `{reason:'error', error:
'stream closed before terminal frame'}` so the hook reconnects with
`Last-Event-ID` from the highest-seen `id:` field. Existing reconnect
budget (5 attempts) bounds persistent failures.
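
The B1 behaviour can be sketched as a read loop that only reports completion after a terminal part. This is not the actual `consumeAgentStream()`: `PartReader`, `StreamPartLike` and `consumeUntilTerminal` are hypothetical names, and the reader is assumed to yield already-parsed parts rather than raw bytes.

```typescript
interface StreamPartLike {
  type: string;
  errorText?: string;
}

interface ReadResult {
  done: boolean;
  value: StreamPartLike | undefined;
}

interface PartReader {
  read(): Promise<ReadResult>;
}

type ConsumeResult =
  | { reason: "completed" }
  | { reason: "aborted" }
  | { reason: "error"; error: string };

async function consumeUntilTerminal(
  reader: PartReader,
  onPart: (part: StreamPartLike) => void,
): Promise<ConsumeResult> {
  for (;;) {
    const { done, value } = await reader.read();
    if (done || value === undefined) {
      // EOF with no finish / error / abort seen: not a completion.
      return { reason: "error", error: "stream closed before terminal frame" };
    }
    onPart(value);
    if (value.type === "finish") return { reason: "completed" };
    if (value.type === "abort") return { reason: "aborted" };
    if (value.type === "error") {
      return { reason: "error", error: value.errorText ?? "error part received" };
    }
  }
}
```

The key point is that `done` alone never maps to `completed`; only an explicit `finish` part does.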

## B2 — SDK-compatible part union (types.ts + reducer.ts +
##      legacy-snapshot-shim.ts)

Before: `AgentMessagePart` used an ApeRAG-local `{kind: ...}`
discriminator. The PR claimed #77 could lean on `@ai-sdk/react`'s
`isTextUIPart` / `isToolUIPart` / `isDataUIPart` guards, but the SDK
guards branch on `type`, not `kind` — so the seam was nominally
SDK-aligned, factually divergent.

After: every part uses a `type:` discriminator that matches the SDK
exactly:
* `text` / `source-url` / `source-document` mirror the corresponding
  SDK `*UIPart` shapes structurally.
* Tool parts use `type: \`tool-${SafeToolName}\`` so the SDK's
  `isToolUIPart` `startsWith('tool-')` guard accepts them. `toolName`
  is also kept as a sibling field for direct render access.
* `data-citation` / `data-tool-consent` / `data-elicitation` use the
  SDK `DataUIPart` shape (`{type: 'data-${name}', id, data}`); `id`
  is the dedup key (citation fingerprint, toolCallId, elicitationId
  respectively).

A compile-time `_AgentMessagePartIsSDKCompatible` assertion in
`types.ts` enforces structural assignment to the SDK's
`TextUIPart` / `SourceUrlUIPart` / `SourceDocumentUIPart` /
`DataUIPart<ApeRAGUIDataTypes>` types — drift fails type-check.
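
A compile-time assertion of this kind can be sketched without the SDK. Everything below is a stand-in: `TextUIPartLike` and `DataUIPartLike` are local approximations of the SDK types (the real code is described as asserting against the SDK's own types), and `AssertAssignable`, `AgentTextPart`, `AgentCitationPart` and `looksLikeToolPart` are hypothetical names.

```typescript
// Local stand-ins for the SDK part types (assumption, not the SDK definitions).
interface TextUIPartLike {
  type: "text";
  text: string;
}
interface DataUIPartLike {
  type: `data-${string}`;
  id?: string;
  data: unknown;
}

// Simplified ApeRAG-side shapes for the sketch.
type AgentTextPart = { type: "text"; text: string };
type AgentCitationPart = {
  type: "data-citation";
  id: string; // dedup key (citation fingerprint)
  data: { cited_text: string };
};

// Resolves to `true` only when A is assignable to B, otherwise `never`.
// The tuple wrapping stops the conditional from distributing over unions.
type AssertAssignable<A, B> = [A] extends [B] ? true : never;

// If a shape drifts, the annotation becomes `never` and `= true` fails type-check.
const textIsCompatible: AssertAssignable<AgentTextPart, TextUIPartLike> = true;
const citationIsCompatible: AssertAssignable<AgentCitationPart, DataUIPartLike> = true;

// Mirrors the prefix check that the commit message attributes to `isToolUIPart`.
function looksLikeToolPart(part: { type: string }): boolean {
  return part.type.startsWith("tool-");
}
```

Renaming the discriminator back to `kind` in `AgentTextPart` would turn the first annotation into `never`, which is the drift the assertion is meant to catch.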

Reducer is rewritten to produce the new shapes; consent and
elicitation now correctly replace existing parts when their state
transitions (the previous `kind:` shape relied on an `update?` callback
that was a no-op for the consent/elicitation flow). `null` fields
from the wire are coerced to `undefined` to satisfy SDK shape
expectations.

`legacy-snapshot-shim.ts`: the top comment's claim of a "minimal
timeline (one entry per running tool call)" had drifted; the actual
code only passes through `baselineSnapshot.timeline` / `.artifacts`.
Comment realigned to actual coverage (per dongdong msg=f33e9039 minor).

Verified: `yarn lint` clean; `tsc --noEmit` clean for the touched
files (the SDK compatibility assertion compiles, proving structural
assignment); `yarn dev` boots in 3.5s on port 3011 with `GET /`,
`/auth/signin`, `/workspace/collections`, `/workspace` all 200.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@earayu earayu merged commit 63a9d52 into main Apr 25, 2026
4 checks passed
@earayu earayu deleted the fe/d8.4a-stream-client branch April 25, 2026 14:53