Commit af4912e
🤖 perf: smooth text streaming (kill cascade re-renders, model-aware reveal) (#3219)
## Summary
Streamed assistant text (and reasoning) was visibly jittery — periodic
catch-up jumps every few seconds, rate stuck at ~72 chars/sec regardless
of what the model emitted, and a sub-frame of work for the entire chat
list on every delta. This PR makes the cadence smooth in three ordered
fixes plus a TPS-display fix discovered during review: leaf-subscribe
the streaming-stats pill so it stops invalidating `WorkspaceState`,
replace the smoothing engine's hard-snap with a model-aware soft
catch-up, compact streaming parts on append, and floor the TPS
calculator's time span so a new stream's first deltas don't spike the
displayed rate.
## Background
The renderer has had a two-clock smoothing model (`SmoothTextEngine` +
`useSmoothStreamingText`) for a while, but several regressions defeated
it:
1. `WorkspaceState.streamingTokenCount` / `streamingTPS` were computed
inside the `getWorkspaceState` snapshot using `Date.now()`. Every
coalesced delta produced a new snapshot reference, which cascaded
`WorkspaceShell → ChatPane → MessageRenderer` through every row.
`useDeferredValue` was bypassed for the entire stream by
`shouldBypassDeferredMessages`, so reconciliation ran at the ingestion
rate.
2. `getAdaptiveRate(backlog)` ignored the model's actual emission rate.
With a fast model (~120 cps) and `BASE_CHARS_PER_SEC=72`, the visible
cursor fell behind by ~5 chars per ingestion cycle until backlog crossed
`MAX_VISUAL_LAG_CHARS=120`, at which point `enforceMaxVisualLag` snapped
`visible := full - 120` and zeroed the budget — that snap is exactly the
visible "catch-up jump".
3. `requestIdleCallback({ timeout: 100 })` was used for streaming
deltas. The smoothing engine should be the only pacing layer; idle
batching just feeds (2).
4. `handleStreamDelta` appended a fresh `{ type: "text" }` part per
chunk; `mergeAdjacentParts` re-merged on every render. For a 10k-char
reply that's tens of thousands of merges per turn.
5. `calculateTPS` divided by `now - firstDelta.timestamp`. With one
delta that span is typically a few milliseconds, so e.g. `50 tokens /
0.005s = 10000 t/s`. Phase 1's microtask cadence exposed this — where
the prior idle-callback batching used to mask it by sampling later — and
Phase 2 wired TPS into the smoothing engine, amplifying its visibility.
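To make the arithmetic in item 2 concrete, here is a minimal sketch of how long the old engine could run before its first hard snap. It assumes the reveal rate stays pinned at `BASE_CHARS_PER_SEC` and that the stream starts with an empty backlog; `secondsUntilSnap` is a hypothetical helper for illustration, not a function from the codebase.

```typescript
// Constants quoted in the Background section (pre-PR values).
const BASE_CHARS_PER_SEC = 72;
const OLD_MAX_VISUAL_LAG_CHARS = 120;

/**
 * Hypothetical helper: seconds until the backlog first reaches the old
 * hard-snap threshold, ASSUMING the reveal rate stays at the base rate
 * and the backlog starts at zero.
 */
function secondsUntilSnap(modelCharsPerSec: number): number {
  const lagGrowthPerSec = modelCharsPerSec - BASE_CHARS_PER_SEC;
  // A model at or below the base rate never builds up lag.
  if (lagGrowthPerSec <= 0) return Infinity;
  return OLD_MAX_VISUAL_LAG_CHARS / lagGrowthPerSec;
}

// A ~120 cps model gains 48 chars of lag per second.
console.log(secondsUntilSnap(120)); // 2.5
```

Under those assumptions a ~120 cps model hits the threshold after about 2.5 seconds, which is consistent with the "every few seconds" jumps described in the Summary.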
## Implementation
Four commits, ordered so each phase is verifiable in isolation:
**Phase 1 — leaf-subscribe streaming stats, microtask ingestion
(`775e9023c`)**
- Removed `streamingTokenCount` / `streamingTPS` from `WorkspaceState`.
- Added `WorkspaceStreamingStats` + `streamingStatsStore` (`MapStore`) +
`useWorkspaceStreamingStats(workspaceId)` leaf hook (mirrors the
existing `useWorkspaceStatsSnapshot` pattern at
`WorkspaceStore.ts:4127`).
- Replaced `scheduleIdleStateBump` with `scheduleStreamingStateBump` for
streaming delta types (`stream-delta`, `tool-call-delta`,
`reasoning-delta`). It coalesces on `queueMicrotask` instead of an idle
callback. `init-output` and `bash-output` keep the idle path
(terminal-style throughput).
- Wired `cancelPendingStreamingBump` into stream-end / stream-abort /
replay reset / `removeWorkspace`.
- `StreamingBarrier` now reads via the leaf hook.
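The microtask coalescing in Phase 1 can be sketched roughly as follows. This is an illustration of the pattern only: `CoalescedBump`, `request`, and `cancel` are hypothetical names, and the real `scheduleStreamingStateBump` / `cancelPendingStreamingBump` may be structured differently. The scheduler is injectable here purely so the behavior can be exercised without waiting on the real microtask queue.

```typescript
// Hypothetical sketch; only queueMicrotask is a real platform API here.
type Scheduler = (cb: () => void) => void;

class CoalescedBump {
  private pending = false;
  private generation = 0;

  constructor(
    private readonly onBump: () => void,
    // Defaults to the microtask queue; injectable so tests can drain manually.
    private readonly schedule: Scheduler = (cb) => queueMicrotask(cb),
  ) {}

  /** Request a bump; repeated calls before the drain collapse into one. */
  request(): void {
    if (this.pending) return;
    this.pending = true;
    const scheduledGeneration = this.generation;
    this.schedule(() => {
      // A cancel() issued after scheduling invalidates this callback.
      if (scheduledGeneration !== this.generation) return;
      this.pending = false;
      this.onBump();
    });
  }

  /** Drop any pending bump, e.g. on stream-end, stream-abort, or removal. */
  cancel(): void {
    this.generation += 1;
    this.pending = false;
  }
}
```

Any number of deltas arriving within one task produce a single bump when the microtask queue drains, and a cancel issued before the drain turns the already-queued callback into a no-op.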
**Phase 2 — model-aware smoothing engine, soft catch-up (`85fb141da`)**
- `SmoothTextEngine.update()` accepts an optional `liveCharsPerSec`.
`getAdaptiveRate(backlog, liveCps)` combines a steady-state floor
(`max(BASE, liveCps)`), a soft catch-up ramp that drains lag over
`SOFT_CATCHUP_DRAIN_MS` once it exceeds `SOFT_CATCHUP_LAG_CHARS=60`, and
the legacy backlog-pressure ramp (kept as upper bound).
- Replaced the hard-snap discontinuity with the soft ramp.
`MAX_VISUAL_LAG_CHARS` is now 1024 (was 120) — a defensive safety net
for paused-tab pathological bursts that normal streams never hit.
- Bumped `MIN_FRAME_CHARS` from 1 to 2 so reveals coalesce to ~30 Hz at
the BASE rate (half the markdown re-parse cost; humans can't see the
difference). Tail-end reveal still works because the gate is now
`min(MIN_FRAME_CHARS, backlog)`.
- `useSmoothStreamingText` and `TypewriterMarkdown` thread
`liveCharsPerSec` through; `TypewriterMarkdown` accepts a new
`workspaceId` prop, forwarded from `AssistantMessage` and
`ReasoningMessage` (via `MessageRenderer`).
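A minimal sketch of the Phase 2 rate calculation, under stated assumptions: the value of `SOFT_CATCHUP_DRAIN_MS` is not given above, so 500 ms is a placeholder; the ramp is assumed to drain only the lag in excess of the threshold; and the legacy backlog-pressure ramp is omitted because its formula is not described here. The real `getAdaptiveRate` may differ on all three points.

```typescript
const BASE_CHARS_PER_SEC = 72;      // from the PR description
const SOFT_CATCHUP_LAG_CHARS = 60;  // from the PR description
const MIN_FRAME_CHARS = 2;          // from the PR description
const SOFT_CATCHUP_DRAIN_MS = 500;  // ASSUMED value, not stated in the PR

/** Sketch of a model-aware reveal rate in chars/sec (legacy ramp omitted). */
function getAdaptiveRate(backlog: number, liveCps?: number): number {
  // Steady-state floor: never reveal slower than the model is emitting.
  const floor = Math.max(BASE_CHARS_PER_SEC, liveCps ?? 0);
  // Soft catch-up: spread the excess lag over the drain window (assumption).
  const excess = Math.max(0, backlog - SOFT_CATCHUP_LAG_CHARS);
  const catchUp = excess / (SOFT_CATCHUP_DRAIN_MS / 1000);
  return floor + catchUp;
}

/** Reveal gate: a short tail can still flush below MIN_FRAME_CHARS. */
function shouldReveal(budgetChars: number, backlog: number): boolean {
  return backlog > 0 && budgetChars >= Math.min(MIN_FRAME_CHARS, backlog);
}
```

Because the catch-up term is zero exactly at the threshold and grows linearly above it, the rate changes continuously as lag builds, which is the property that removes the visible jump.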
**Phase 3 — compact-on-append, clean prop surface (`0a945ed7b`)**
- `StreamingMessageAggregator.handleStreamDelta` /
`handleReasoningDelta` append into the previous adjacent text/reasoning
part in place. For a 10k-char reply this drops `parts.length` from
thousands to one and `mergeAdjacentParts` cost from O(N) to O(1).
Backend persistence (`partial.json`, `chat.jsonl`) is unaffected — those
writers live backend-side; this aggregator's `parts` is pure display
state.
- `TypewriterMarkdown`: replaced the `deltas: string[]` prop with
`content: string`. Call sites always passed a fresh `[content]` array
literal, whose new identity on every render defeated `React.memo`.
Removed the manual `React.memo` and the inner `useMemo` for the
streaming-context value (React Compiler handles both).
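The compact-on-append idea can be sketched as below. The `Part` union and `appendDelta` are simplified, hypothetical stand-ins for the aggregator's real types and handlers; the point is only that a delta extends the previous part when the kinds match, and starts a new part otherwise.

```typescript
// Simplified, hypothetical part shapes for illustration.
type Part =
  | { type: "text"; text: string }
  | { type: "reasoning"; text: string }
  | { type: "tool"; name: string };

/** Append a streamed delta, extending the last part in place when possible. */
function appendDelta(
  parts: Part[],
  kind: "text" | "reasoning",
  delta: string,
): void {
  const last = parts[parts.length - 1];
  if (last !== undefined && last.type !== "tool" && last.type === kind) {
    // Same kind as the previous part: grow it instead of pushing a new one.
    last.text += delta;
    return;
  }
  parts.push({ type: kind, text: delta });
}

const parts: Part[] = [];
for (const chunk of ["Hel", "lo, ", "world"]) appendDelta(parts, "text", chunk);
console.log(parts.length); // 1
```

A part of a different kind in between (for example a tool call) still breaks the run, so ordering between text, reasoning, and tool parts is preserved.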
**Phase 4 — TPS calculator floor + stream-error token cleanup
(`a476613be`)**
- `calculateTPS` now floors the divisor at `MIN_TPS_TIME_SPAN_MS =
1000`. With one delta the rate becomes `tokens / 1s` instead of `tokens
/ 0.005s`. The reported TPS smoothly ramps up over the first second of a
stream instead of spiking and "dropping abruptly". Slight
under-statement during the settling window is the trade-off — strictly
preferable to an order-of-magnitude over-statement.
- The `stream-error` branch in `applyWorkspaceChatEventToAggregator` now
calls `clearTokenState`, matching `stream-end` and `stream-abort`.
Without it, the errored message's `deltaHistory` entry leaks into a
follow-up stream's TPS calculation.
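A sketch of the floored calculation, assuming a simplified signature: `DeltaRecord` and the explicit `now` parameter are illustrative, and the real `calculateTPS` may window or round its result differently. The negative-span branch mirrors the clock-skew behavior listed under the new tests.

```typescript
// Hypothetical record shape for illustration.
interface DeltaRecord {
  tokens: number;
  timestamp: number; // ms since epoch
}

const MIN_TPS_TIME_SPAN_MS = 1000; // from the PR description

/** Tokens per second since the first delta, with the time span floored. */
function calculateTPS(deltas: readonly DeltaRecord[], now: number): number {
  const first = deltas[0];
  if (first === undefined) return 0;
  const rawSpanMs = now - first.timestamp;
  // Negative spans indicate clock skew; report nothing rather than garbage.
  if (rawSpanMs < 0) return 0;
  const spanMs = Math.max(rawSpanMs, MIN_TPS_TIME_SPAN_MS);
  const totalTokens = deltas.reduce((sum, d) => sum + d.tokens, 0);
  return totalTokens / (spanMs / 1000);
}

// One 50-token delta observed 5 ms after it arrived.
console.log(calculateTPS([{ tokens: 50, timestamp: 0 }], 5)); // 50
```

Without the floor the same input would report 50 / 0.005 s = 10000 t/s; once the raw span exceeds one second the floor no longer applies and the result is the plain average.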
## Validation
- `make typecheck` ✅
- `make lint` ✅
- Targeted streaming surface: 1009+ tests pass / 0 fail across
`SmoothTextEngine`, `useSmoothStreamingText`,
`StreamingMessageAggregator`, `applyWorkspaceChatEventToAggregator`,
`StreamingTPSCalculator`, `TypewriterMarkdown`, `ReasoningMessage`,
`StreamingBarrier{,View}`, `PinnedTodoList`, `WorkspaceStore`, plus the
broader `src/browser/utils/messages/`, `src/browser/features/Messages/`,
`src/browser/stores/`, and `src/browser/hooks/` suites.
- New behavioral tests:
  - `SmoothTextEngine.test.ts`: rate tracks `liveCharsPerSec`; soft
    catch-up engaged for 60–1024 char lags without snap; hard snap still
    fires above the safety threshold.
  - `StreamingTPSCalculator.test.ts`: 1s floor applied for tiny / zero
    spans; raw span used once it exceeds the floor; negative spans (clock
    skew) return 0.
  - `applyWorkspaceChatEventToAggregator.test.ts`: `stream-error` calls
    `clearTokenState`.
## Risks
Localized to the streaming display path; no protocol or persistence
changes.
- **Re-render shape (Phase 1).** Streaming deltas now bump
`WorkspaceState` once per microtask drain instead of once per
`requestIdleCallback`. Net effect under heavy load is *less* work
because per-delta TPS updates no longer invalidate the snapshot, but it
is a behavioral shift. Verified via the existing 106-test
`WorkspaceStore` suite plus targeted `StreamingBarrier` tests.
- **Smoothing engine constants (Phase 2).** `MAX_VISUAL_LAG_CHARS`
jumped 120 → 1024 and `MIN_FRAME_CHARS` 1 → 2. Existing test "caps
visual lag when incoming text jumps ahead" still passes against the new
soft-ramp behavior, and the new "hard-snaps when lag exceeds the safety
threshold" test confirms the safety net still functions.
- **Compact-on-append (Phase 3).** Touches the in-memory `parts` array
shape during streaming. The aggregator already had compaction at
stream-end (`compactMessageParts`); we're just doing it eagerly. No
on-disk format change. All `StreamingMessageAggregator` and
`applyWorkspaceChatEventToAggregator` tests pass.
- **TPS floor (Phase 4).** The reported rate during the first second of
a stream now under-counts versus the previous (mathematically broken)
value. Backend `sessionTimingService` also calls `calculateTPS`; same
floor applies there but the backend's window is broader so the visible
effect is smaller. No risk to persisted usage / cost calculations —
those use `usage.outputTokens / duration` from the API, not the
streaming TPS estimator.
---
_Generated with `mux` • Model: `anthropic:claude-opus-4-7` • Thinking:
`xhigh` • Cost: `$23.55`_
<!-- mux-attribution: model=anthropic:claude-opus-4-7 thinking=xhigh
costs=23.55 -->
1 parent 80ed51e commit af4912e
19 files changed
Lines changed: 616 additions & 85 deletions
File tree
- src
- browser
- components/PinnedTodoList
- features/Messages
- ChatBarrier
- hooks
- stores
- utils
- messages
- streaming
- common/utils/tokens
- constants