|
| 1 | +# ADR-0012: Wire-input hardening — frame-shape guards, inbound limits, sanitized execute errors |
| 2 | + |
| 3 | +**Status**: Accepted |
| 4 | +**Date**: 2026-06-11 |
| 5 | +**Plan**: [005-wire-input-hardening.md](../../plans/005-wire-input-hardening.md) |
| 6 | + |
| 7 | +## Context |
| 8 | + |
| 9 | +The DO trusted every *decoded* frame's shape. MessagePack decoding was already |
| 10 | +guarded (`webSocketMessage` ignored undecodable bytes), but a frame that decoded |
| 11 | +to the wrong *shape* was dispatched as-is: a `mut` whose `txId` was an object |
| 12 | +could reach `lookupTx` and `sql.exec` bindings with an arbitrary value. |
| 13 | + |
| 14 | +Additionally, nothing capped `ops` length, per-socket subscriptions, or inbound |
| 15 | +frame size, so one authenticated client could force unbounded allocation. And a |
| 16 | +thrown `execute` error's raw message (SQLite constraint text, column names, etc.) |
| 17 | +was forwarded verbatim to the client — leaking internal schema detail. |
| 18 | + |
| 19 | +None of these are unauthenticated holes — the attacker is already an authed socket |
| 20 | +on the same DO — but the library's ethos is reject-don't-degrade, and these are |
| 21 | +the cheap mechanical layers of that. |
| 22 | + |
| 23 | +## Decisions |
| 24 | + |
| 25 | +### D1: Shape guard — drop and log, no reply |
| 26 | + |
| 27 | +A `wellFormed(v: unknown): v is ClientFrame` check runs after decode, before any |
| 28 | +SQL binding. A frame that fails the check is dropped with a server-side |
| 29 | +`console.error` (fail loud in logs) and **no client reply** — mirroring the |
| 30 | +existing "ignore undecodable frames" stance and extending its comment. |
| 31 | + |
| 32 | +The guard validates per variant of `ClientFrame` (enumerated from |
| 33 | +`src/wire/frames.ts`): required fields are checked for correct type and |
| 34 | +non-emptiness; optional fields treat `null` as absent (the client transport |
| 35 | +serialises absent fields as `null` in MessagePack rather than omitting them). |
| 36 | +`where`/`orderBy` fields are left `unknown` — the sql-compiler is their |
| 37 | +validator (it already `fail-loud`s via `UnsupportedPredicateError`). |
| 38 | + |
| 39 | +### D2: Three overrideable limits as `protected readonly` tunables |
| 40 | + |
| 41 | +Following the `tickMs`/`compactionEvery` field pattern (doc comment, `protected |
| 42 | +readonly`, override-able by subclasses at construction time): |
| 43 | + |
| 44 | +- **`maxOpsPerMutation = 128`**: checked at the top of `handleMut`. **Reject, |
| 45 | + don't truncate** — a partial apply would silently drop client writes. Sends |
| 46 | + `rejected` with `code: "LIMIT_EXCEEDED"`. |
| 47 | +- **`maxSubsPerSocket = 256`**: checked in `handleSub` before `subs.add`. A |
| 48 | + re-sub on an existing `subId` replaces the old entry |
| 49 | + (`SubscriptionRegistry.add` semantics) and does NOT count against the cap — |
| 50 | + only genuinely new sub IDs are counted. Over-limit → `reset` for the refused |
| 51 | + `subId` + `console.error`. (Reset is the existing "sub refused" signal.) |
| 52 | +- **`maxFrameBytes = 1_048_576`**: checked in `webSocketMessage` before decode |
| 53 | + (`typeof message === "string" ? message.length : message.byteLength`). |
| 54 | + Oversize → drop + `console.error`. Cloudflare caps WS messages at ~1 MiB |
| 55 | + anyway; this makes the bound explicit, testable, and overrideable. |
| 56 | + |
| 57 | +`SubscriptionRegistry.countFor(ws)` was added to expose the per-socket count |
| 58 | +without exposing the internal Map. |
| 59 | + |
| 60 | +### D3: Execute-error sanitization; authorize errors stay user-facing |
| 61 | + |
| 62 | +In `handleMut`, there are two catch sites: |
| 63 | + |
| 64 | +- **Authorize catch**: unchanged. Authorize errors are user-facing API |
| 65 | + (`README: "throw to deny"`). The error message passes through verbatim. |
| 66 | +- **Execute catch (transaction)**: the full error is logged server-side |
| 67 | + (`console.error`) and a **generic** `"mutation failed"` message with code |
| 68 | + `"EXECUTE_FAILED"` is sent to the client. SQLite constraint strings, column |
| 69 | + names, and programming-error text are internal detail — not client API surface. |
| 70 | + |
| 71 | +In `handleCall`, authorize and execute share one try/catch. Command authorize |
| 72 | +errors are not currently user-facing API in the same way mutation authorize is, so |
| 73 | +the entire catch is sanitized: log full detail server-side, send `"command failed"` |
| 74 | +with `"EXECUTE_FAILED"` to the client. |
| 75 | + |
| 76 | +**Compatibility note**: the client-visible error text for execute failures changed. |
| 77 | +Callers who matched on specific SQLite error strings or programming-error messages |
| 78 | +must update to the generic messages/codes. Authorize-path messages are unchanged. |
| 79 | + |
| 80 | +### D4: Dedup identity binding deferred |
| 81 | + |
| 82 | +`_sync_seen_tx` is keyed by `txId` alone; any authed socket presenting a |
| 83 | +guessed/leaked txId receives the stored receipt. Risk is low (txIds are |
| 84 | +client-random UUIDs) but the fix needs an identity-keying decision (`TUser` is |
| 85 | +author-defined and unserializable in general — likely a `protected |
| 86 | +dedupScope(user: TUser): string` hook). This is **explicitly deferred** pending |
| 87 | +a maintainer design decision. |
| 88 | + |
| 89 | +## Consequences |
| 90 | + |
| 91 | +- **Security**: arbitrary decoded values no longer reach SQL bindings; inbound |
| 92 | + resource exhaustion is bounded; internal schema detail does not leak to clients. |
| 93 | +- **Behavior change** (observable by consumers): execute-path `rejected` frames |
| 94 | + now carry `"mutation failed"`/`"command failed"` + `"EXECUTE_FAILED"` instead of |
| 95 | + the raw error message. Authorize-path messages are unchanged. |
| 96 | +- **Hibernation**: no idle timers were introduced. All new checks are synchronous |
| 97 | + and run on the existing `webSocketMessage` path. |
| 98 | +- **Extensibility**: all three limits are `protected readonly` — subclasses can |
| 99 | + override at construction time. `LimitsTestDO` in the test worker exercises this. |
| 100 | +- **Test coverage**: `tests/wire-hardening.test.ts` (5 tests) pins all four |
| 101 | + invariants; `tests/error-paths.test.ts` was updated to assert the new generic |
| 102 | + error text for execute failures. |
0 commit comments