docs(api): document typed StallEventPayload on status.stall.typed SSE event (#4802)
Documents the new `status.stall.typed` SSE event and its typed `StallEventPayload` schema introduced by PR #4806 (Issue #4802 F-9).
Adds a Typed Stall Payload section to docs/api-reference.md covering wire format, field reference, bounded errorClass enum (6 values), backward compatibility with the legacy free-form `stall` event, channel-fanout fingerprint-defense rule, and migration recipe for typed-only consumers.
> **`approval_resolved`** — emitted after an approval or rejection via `POST /v1/sessions/:id/approval/approve` or `/reject`. Payload: `{ action: "approved" | "rejected", approvalId: string }`.

> **`status.stall.typed`** — Issue #4802 (F-9). Typed superset of the legacy free-form `stall` event. Carries a bounded `StallEventPayload` so renderers can subscribe via Zod schema instead of parsing free-form strings. Both events ship in parallel; consumers should migrate to `status.stall.typed`. See [Typed Stall Payload](#typed-stall-payload-statusstalltyped) below for the full schema.

Supports `Last-Event-ID` header for replay of missed events.

---

### Typed Stall Payload (`status.stall.typed`)
Issue #4802 (F-9) adds a typed alternative to the legacy free-form `stall` SSE event. The typed event ships a bounded `StallEventPayload` so dashboards, scripts, and channel integrations never have to parse free-form `detail` strings.

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `errorClass` | enum (see below) | yes | Bounded operational category; drives the dashboard pill label. Adding a new value is a schema PR. |
| `statusCode` | integer | only when `errorClass === 'transient_5xx'` | HTTP status code extracted from CC `stopReason` (e.g. `'529_overloaded'` → `529`). Scoped to upstream 5xx only; other categories reject it at the validator. |
| `lastErrorAt` | string (ISO 8601) | yes | Timestamp of the last detected error or activity signal. |
| `stallDurationMs` | integer | yes | Milliseconds since the stall was first detected. |
| `recoveryAttemptCount` | integer | yes | Current recovery attempt number (`0` if recovery not yet attempted). Reset on successful recovery or idle transition. |
| `recoveryMaxAttempts` | integer | yes | Server-side cap on recovery attempts for this stall event. |
| `recoveryDisabled` | boolean | yes | Per-session kill-switch state. `true` means an operator paused auto-recovery via the dashboard stall pill. See `recoveryDisabled` on `SessionInfo`. |

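The field rules above can be sketched as a small dependency-free check. This is only an illustration: the authoritative schema is the Zod schema in `src/stall-events.ts`, the name `validateStallPayload` is hypothetical, and the sketch covers only the fields listed in this table (a real payload may carry more). It also assumes `statusCode` is optional even for `transient_5xx`.

```typescript
// Hypothetical mirror of the documented enum; the real list is
// ERROR_CLASS_VALUES in src/stall-events.ts.
const ERROR_CLASS_VALUES = [
  "transient_5xx",
  "permission_timeout",
  "jsonl_stall",
  "thinking_stall",
  "unknown_stall",
  "extended_working",
] as const;

type ErrorClass = (typeof ERROR_CLASS_VALUES)[number];

// Only the fields shown in the table; not the full wire format.
interface StallEventPayloadSketch {
  errorClass: ErrorClass;
  statusCode?: number; // allowed only for "transient_5xx"
  lastErrorAt: string; // ISO 8601
  stallDurationMs: number;
  recoveryAttemptCount: number;
  recoveryMaxAttempts: number;
  recoveryDisabled: boolean;
}

function isInteger(v: unknown): v is number {
  return typeof v === "number" && isFinite(v) && Math.floor(v) === v;
}

// Returns a list of problems; an empty list means the payload passed.
function validateStallPayload(input: unknown): string[] {
  if (typeof input !== "object" || input === null) {
    return ["payload must be an object"];
  }
  const p = input as Record<string, unknown>;
  const errors: string[] = [];

  const errorClass = p.errorClass;
  const known =
    typeof errorClass === "string" &&
    (ERROR_CLASS_VALUES as readonly string[]).indexOf(errorClass) !== -1;
  if (!known) errors.push("errorClass is not a known value");

  // statusCode is scoped to upstream 5xx; other categories reject it.
  if (p.statusCode !== undefined) {
    if (errorClass !== "transient_5xx") {
      errors.push("statusCode is only allowed for transient_5xx");
    } else if (!isInteger(p.statusCode)) {
      errors.push("statusCode must be an integer");
    }
  }

  // Date.parse is a loose check; it accepts more than strict ISO 8601.
  const lastErrorAt = p.lastErrorAt;
  if (typeof lastErrorAt !== "string" || isNaN(Date.parse(lastErrorAt))) {
    errors.push("lastErrorAt must be a parseable timestamp string");
  }

  ["stallDurationMs", "recoveryAttemptCount", "recoveryMaxAttempts"].forEach(
    (key) => {
      if (!isInteger(p[key])) errors.push(key + " must be an integer");
    }
  );

  if (typeof p.recoveryDisabled !== "boolean") {
    errors.push("recoveryDisabled must be a boolean");
  }
  return errors;
}
```

A consumer could call `validateStallPayload(JSON.parse(data))` and drop or log the event when the returned list is non-empty. Unknown extra keys are deliberately ignored so the sketch does not reject fields it does not know about.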
**`errorClass` values** (bounded enum; `ERROR_CLASS_VALUES` in `src/stall-events.ts`):

| Value | Meaning |
|-------|---------|
| `transient_5xx` | Upstream 5xx (rate-limit, overloaded, service unavailable). Retry-eligible. Only category that carries `statusCode`. |
| `permission_timeout` | `permission_prompt` or `bash_approval` stalled past the timeout. |
| `jsonl_stall` | Session reported as "working" but no new JSONL bytes observed. |
| `thinking_stall` | Claude Code extended thinking past the stall threshold. |
| `unknown_stall` | Unknown stall state past the threshold. Also used as the mapping target for the legacy `extended` stall type until a dedicated enum lands. |
| `extended_working` | Session has been "working" for 3× the stall threshold (Claude Code internal loop). |

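To make the `statusCode` scoping concrete, here is a rough sketch of how a code might be pulled out of a `stopReason` string. The function name `extractStatusCode` is hypothetical, and the parsing rule (a leading three-digit code followed by an underscore or end of string) is an assumption generalized from the single documented example `'529_overloaded'`; the server's actual extraction logic may differ.

```typescript
// Hypothetical helper: returns the upstream 5xx code embedded at the start
// of a stopReason string, or undefined when there is none.
function extractStatusCode(stopReason: string): number | undefined {
  // Assumed shape: "<3 digits>" optionally followed by "_<label>".
  const match = /^(\d{3})(?:_|$)/.exec(stopReason);
  if (!match) return undefined;

  const code = parseInt(match[1], 10);
  // statusCode is scoped to upstream 5xx only, so anything else is dropped.
  return code >= 500 && code <= 599 ? code : undefined;
}
```

Under these assumptions, a stop reason without a leading 5xx code yields `undefined`, which matches the rule that only `transient_5xx` carries `statusCode`.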
**Backward compatibility:**

The legacy free-form `stall` event (`{ stallType, detail }`) continues to ship alongside `status.stall.typed`. Both events are emitted at every stall-detector site; consumers that need rich metadata should migrate to `status.stall.typed`. The free-form event is the `Path 2 fallback` for renderers that haven't wired the typed schema yet.

**Channel fanout (Telegram, Slack, Email):**

When `status.stall.typed` is forwarded to channel transports, `statusCode` is stripped via `toChannelFanoutPayload()`. HTTP status codes can fingerprint the upstream (for example, 530 versus 529 reveals the upstream API variant), and they add noise to operator notifications. Operator surfaces (dashboard, in-app tooltip, API consumers) receive the full payload including `statusCode`.

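The stripping rule can be illustrated with a short sketch. This is not the implementation of `toChannelFanoutPayload()`; the name `stripForChannelFanout` and its generic signature are hypothetical, and the sketch assumes `statusCode` is the only field removed.

```typescript
// Hypothetical stand-in for the fanout rule: return a copy of the payload
// without statusCode, leaving every other field untouched.
function stripForChannelFanout<T extends { statusCode?: number }>(
  payload: T
): Omit<T, "statusCode"> {
  // Destructure statusCode away; rest holds the remaining fields.
  const { statusCode: _dropped, ...rest } = payload;
  return rest;
}
```

Returning a copy rather than deleting the key in place keeps the original object intact, so the same payload can still be sent in full to operator surfaces.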
**Migration recipe** for consumers on the legacy `stall` event:

```bash
# Subscribe to typed events only (curl, filter on event name)
```

TypeScript consumers should validate `data` against the `StallEventPayload` schema from `src/stall-events.ts` (re-exported from `@aegis/sdk` in a future minor release).
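For TypeScript consumers, filtering on the event name might look like the sketch below. It assumes the event name travels in the standard SSE `event:` field and that `data` is a JSON document. The helper names `parseSseFrames` and `typedStallPayloads` are hypothetical, and the parser handles only the `event`, `data`, and `id` fields of a fully buffered stream.

```typescript
interface SseFrame {
  event: string;
  data: string;
  id?: string;
}

// Minimal SSE parser for an already-buffered string. Frames are separated
// by a blank line; lines starting with ":" are comments.
function parseSseFrames(raw: string): SseFrame[] {
  const frames: SseFrame[] = [];
  raw.split(/\r?\n\r?\n/).forEach((block) => {
    let event = "message"; // SSE default event name
    let id: string | undefined;
    const dataLines: string[] = [];

    block.split(/\r?\n/).forEach((line) => {
      if (line === "" || line.charAt(0) === ":") return;
      const colon = line.indexOf(":");
      const field = colon === -1 ? line : line.slice(0, colon);
      let value = colon === -1 ? "" : line.slice(colon + 1);
      // A single leading space after the colon is not part of the value.
      if (value.charAt(0) === " ") value = value.slice(1);

      if (field === "event") event = value;
      else if (field === "data") dataLines.push(value);
      else if (field === "id") id = value;
    });

    // Frames without data are not dispatched.
    if (dataLines.length > 0) {
      frames.push({ event, data: dataLines.join("\n"), id });
    }
  });
  return frames;
}

// Keep only status.stall.typed frames and decode their JSON payloads.
// Legacy free-form "stall" frames are ignored.
function typedStallPayloads(raw: string): unknown[] {
  return parseSseFrames(raw)
    .filter((frame) => frame.event === "status.stall.typed")
    .map((frame) => JSON.parse(frame.data));
}
```

Each decoded payload should still be validated before use. The `id` of the last frame seen can be sent back in the `Last-Event-ID` header on reconnect to replay missed events.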