Commit 79fba30

docs(api): document typed StallEventPayload on status.stall.typed SSE event (#4802)
Documents the new `status.stall.typed` SSE event and its typed `StallEventPayload` schema introduced by PR #4806 (Issue #4802 F-9). Adds a Typed Stall Payload section to docs/api-reference.md covering wire format, field reference, bounded errorClass enum (6 values), backward compatibility with the legacy free-form `stall` event, channel-fanout fingerprint-defense rule, and migration recipe for typed-only consumers.
1 parent b8eef9b commit 79fba30

1 file changed

Lines changed: 68 additions & 0 deletions


docs/api-reference.md

@@ -1444,10 +1444,78 @@ curl -N "http://localhost:9100/v1/sessions/abc123/events?token=$SSE_TOKEN"

> **`approval_resolved`**: emitted after an approval or rejection via `POST /v1/sessions/:id/approval/approve` or `/reject`. Payload: `{ action: "approved" | "rejected", approvalId: string }`.

> **`status.stall.typed`**: Issue #4802 (F-9). Typed superset of the legacy free-form `stall` event. It carries a bounded `StallEventPayload` so renderers can validate against a Zod schema instead of parsing free-form strings. Both events ship in parallel; consumers should migrate to `status.stall.typed`. See [Typed Stall Payload](#typed-stall-payload-statusstalltyped) below for the full schema.

Supports `Last-Event-ID` header for replay of missed events.

---

### Typed Stall Payload (`status.stall.typed`)

Issue #4802 (F-9) adds a typed alternative to the legacy free-form `stall` SSE event. The typed event ships a bounded `StallEventPayload` so dashboards, scripts, and channel integrations never have to parse free-form `detail` strings.

**Wire format** (per-session SSE, `/v1/sessions/:id/events`):

```json
{
  "event": "status.stall.typed",
  "sessionId": "abc123",
  "timestamp": "2026-06-22T17:42:18.000Z",
  "data": {
    "errorClass": "transient_5xx",
    "statusCode": 529,
    "lastErrorAt": "2026-06-22T17:42:15.000Z",
    "stallDurationMs": 12500,
    "recoveryAttemptCount": 1,
    "recoveryMaxAttempts": 3,
    "recoveryDisabled": false
  }
}
```


**Field reference:**

| Field | Type | Always present? | Description |
|-------|------|-----------------|-------------|
| `errorClass` | enum (see below) | yes | Bounded operational category that drives the dashboard pill label. Adding a new value is a schema PR. |
| `statusCode` | integer | only when `errorClass === 'transient_5xx'` | HTTP status code extracted from the Claude Code (CC) `stopReason` (e.g. `'529_overloaded'` → `529`). Scoped to upstream 5xx only; the validator rejects it for every other category. |
| `lastErrorAt` | string (ISO 8601) | yes | Timestamp of the last detected error or activity signal. |
| `stallDurationMs` | integer | yes | Milliseconds since the stall was first detected. |
| `recoveryAttemptCount` | integer | yes | Current recovery attempt number (`0` if recovery has not yet been attempted). Reset on successful recovery or idle transition. |
| `recoveryMaxAttempts` | integer | yes | Server-side cap on recovery attempts for this stall event. |
| `recoveryDisabled` | boolean | yes | Per-session kill-switch state. `true` means an operator paused auto-recovery via the dashboard stall pill. See `recoveryDisabled` on `SessionInfo`. |

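
The `statusCode` row describes a mapping from a `stopReason` string to a numeric code. A minimal sketch of that mapping follows, assuming `stopReason` values begin with a three-digit status followed by an underscore (as in the `'529_overloaded'` example). `extractStatusCode` is a hypothetical name for illustration, not the extractor used in the codebase.

```typescript
// Hypothetical helper, not the project's actual extractor.
// Assumption: stopReason starts with a three-digit HTTP status followed by "_".
function extractStatusCode(stopReason: string): number | undefined {
  const match = /^(\d{3})_/.exec(stopReason);
  if (!match) return undefined;
  const code = Number(match[1]);
  // The typed payload only carries upstream 5xx codes, so drop everything else.
  return code >= 500 && code <= 599 ? code : undefined;
}
```

Returning `undefined` for non-5xx input keeps the field absent rather than emitting a code the validator would reject.
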

**`errorClass` values** (bounded enum, `ERROR_CLASS_VALUES` in `src/stall-events.ts`):

| Value | Meaning |
|-------|---------|
| `transient_5xx` | Upstream 5xx (rate-limit, overloaded, service unavailable). Retry-eligible. Only category that carries `statusCode`. |
| `permission_timeout` | `permission_prompt` or `bash_approval` stalled past the timeout. |
| `jsonl_stall` | Session reported as "working" but no new JSONL bytes observed. |
| `thinking_stall` | Claude Code extended thinking past the stall threshold. |
| `unknown_stall` | Unknown stall state past the threshold. Also used as the mapping target for the legacy `extended` stall type until a dedicated enum lands. |
| `extended_working` | Session has been "working" for 3× the stall threshold (Claude Code internal loop). |

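
The field reference and enum above can be mirrored in a dependency-free runtime check. This is only a sketch: the real schema is the Zod definition in `src/stall-events.ts`, and the names `ERROR_CLASSES`, `StallPayloadSketch`, and `isStallPayload` are invented here. It also assumes `statusCode` is optional on `transient_5xx` and forbidden everywhere else.

```typescript
// Illustrative mirror of the documented schema; not the project's Zod schema.
const ERROR_CLASSES = [
  "transient_5xx",
  "permission_timeout",
  "jsonl_stall",
  "thinking_stall",
  "unknown_stall",
  "extended_working",
] as const;

type ErrorClass = (typeof ERROR_CLASSES)[number];

interface StallPayloadSketch {
  errorClass: ErrorClass;
  statusCode?: number; // assumed optional, and only legal on transient_5xx
  lastErrorAt: string;
  stallDurationMs: number;
  recoveryAttemptCount: number;
  recoveryMaxAttempts: number;
  recoveryDisabled: boolean;
}

function isInteger(v: unknown): v is number {
  return typeof v === "number" && isFinite(v) && Math.floor(v) === v;
}

function isStallPayload(value: unknown): value is StallPayloadSketch {
  if (typeof value !== "object" || value === null) return false;
  const p = value as Record<string, unknown>;
  if (typeof p.errorClass !== "string") return false;
  if ((ERROR_CLASSES as readonly string[]).indexOf(p.errorClass) === -1) return false;
  // statusCode is rejected on every category except transient_5xx.
  if (p.statusCode !== undefined) {
    if (p.errorClass !== "transient_5xx") return false;
    if (!isInteger(p.statusCode)) return false;
  }
  return (
    typeof p.lastErrorAt === "string" &&
    !isNaN(Date.parse(p.lastErrorAt)) && // loose timestamp check, not strict ISO 8601
    isInteger(p.stallDurationMs) &&
    isInteger(p.recoveryAttemptCount) &&
    isInteger(p.recoveryMaxAttempts) &&
    typeof p.recoveryDisabled === "boolean"
  );
}
```

A consumer could call `isStallPayload(event.data)` before rendering and ignore anything that fails, which keeps an unrecognized future `errorClass` from reaching the UI.
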

**Backward compatibility:**

The legacy free-form `stall` event (`{ stallType, detail }`) continues to ship alongside `status.stall.typed`. Both events are emitted at every stall-detector site; consumers that need rich metadata should migrate to `status.stall.typed`. The free-form event is the `Path 2 fallback` for renderers that haven't wired the typed schema yet.

**Channel fanout (Telegram, Slack, Email):**

When `status.stall.typed` is forwarded to channel transports, `statusCode` is stripped via `toChannelFanoutPayload()`. HTTP status codes can fingerprint the upstream (530 vs 529 reveals the upstream API variant) and add noise to operator notifications. Operator surfaces (dashboard, in-app tooltip, API consumers) receive the full payload including `statusCode`.

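
The stripping rule can be sketched as a pure function that copies the payload and drops `statusCode`. This is an illustration of the rule only. `stripForChannelFanout` and `TypedStallData` are hypothetical names, and the actual `toChannelFanoutPayload()` may differ in signature and behavior.

```typescript
// Hypothetical stand-in for the fanout rule; not the real toChannelFanoutPayload().
interface TypedStallData {
  errorClass: string;
  statusCode?: number;
  lastErrorAt: string;
  stallDurationMs: number;
  recoveryAttemptCount: number;
  recoveryMaxAttempts: number;
  recoveryDisabled: boolean;
}

type FanoutStallData = Omit<TypedStallData, "statusCode">;

function stripForChannelFanout(data: TypedStallData): FanoutStallData {
  // Copy first so the full payload sent to operator surfaces is left untouched.
  const copy: TypedStallData = { ...data };
  delete copy.statusCode;
  return copy;
}
```

Working on a copy matters because the same payload object may still be delivered, with `statusCode` intact, to the dashboard and API consumers.
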

**Migration recipe** for consumers on the legacy `stall` event:

```bash
# Subscribe to typed events only (curl, filter on event name).
# Assumes each streamed line is a bare JSON object; if the stream uses
# SSE "data:" framing, strip that prefix before piping to jq.
curl -N "http://localhost:9100/v1/sessions/abc123/events?token=$SSE_TOKEN" \
  | jq 'select(.event == "status.stall.typed")'
```

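
For consumers that are not shell scripts, the same filter might look like the sketch below. It assumes each streamed line is either a bare JSON envelope shaped like the wire-format example or the same JSON behind an SSE `data:` prefix; that framing is an assumption, not something this section confirms. `parseTypedStall` and `SseEnvelope` are hypothetical names.

```typescript
// Envelope shape taken from the wire-format example; `data` stays unknown until validated.
interface SseEnvelope {
  event: string;
  sessionId: string;
  timestamp: string;
  data: unknown;
}

// Returns the envelope for status.stall.typed lines, undefined for everything else
// (legacy `stall` events, other event types, comments, blank or non-JSON lines).
function parseTypedStall(line: string): SseEnvelope | undefined {
  const trimmed = line.trim();
  // Assumption: an optional SSE "data:" prefix may precede the JSON.
  const json = trimmed.indexOf("data:") === 0 ? trimmed.slice(5).trim() : trimmed;
  if (json === "") return undefined;
  let parsed: unknown;
  try {
    parsed = JSON.parse(json);
  } catch {
    return undefined;
  }
  if (typeof parsed !== "object" || parsed === null) return undefined;
  const envelope = parsed as Partial<SseEnvelope>;
  return envelope.event === "status.stall.typed" ? (envelope as SseEnvelope) : undefined;
}
```
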

TypeScript consumers should validate `data` against the `StallEventPayload` schema from `src/stall-events.ts` (planned for re-export from `@aegis/sdk` in a future minor release).

---

### Get Child Sessions

```
