| 1 | +# `alarm-restart-e2e` |
| 2 | + |
| 3 | +Reproducer for the runtime contract that motivates partyserver's |
| 4 | +`__ps_name` fallback record. Pins down behavior reported in |
| 5 | +[cloudflare/partykit#390](https://github.com/cloudflare/partykit/issues/390) |
| 6 | +across three Durable Objects in the same Worker: |
| 7 | + |
| 8 | +| DO | Class | Extends | |
| 9 | +| ------------ | --------------------------------- | -------------------------------------------------------------------------- | |
| 10 | +| `RawAlarm` | `RawAlarm` | `DurableObject` (no PartyServer) | |
| 11 | +| `StockAlarm` | `StockAlarm` (built from a mixin) | `Server` from `partyserver@0.5.3` (aliased as `partyserver-stock`) | |
| 12 | +| `FixedAlarm` | `FixedAlarm` (built from a mixin) | `Server` from this workspace's local `partyserver` (with the fallback fix) | |
| 13 | + |
| 14 | +Each DO records an observation (`{source, ctxIdName, storedPsName, |
| 15 | +partyName, partyNameError, at}`) to its own SQLite-backed storage on |
| 16 | +every entry through `fetch()` or `alarm()`. Observations accumulate |
| 17 | +across dev-server restarts. |
| 18 | + |
| 19 | +## Run the experiment |
| 20 | + |
| 21 | +```bash |
| 22 | +npm install |
| 23 | +npm run start |
| 24 | +``` |
| 25 | + |
| 26 | +In a second shell, schedule an alarm into a fresh room and observe: |
| 27 | + |
| 28 | +```bash |
| 29 | +ROOM="cold-strict-$(date +%s)" |
| 30 | + |
| 31 | +# Session A: schedule into a fresh room. This is the only entry into |
| 32 | +# the DO instances during session A. After this, the alarm record on |
| 33 | +# disk is what carries the DO across the restart. |
| 34 | +curl -s "http://localhost:5173/raw/$ROOM?schedule=45" |
| 35 | +curl -s "http://localhost:5173/parties/stock-alarm/$ROOM?schedule=45" |
| 36 | +curl -s "http://localhost:5173/parties/fixed-alarm/$ROOM?schedule=45" |
| 37 | +``` |
| 38 | + |
| 39 | +Then kill `vite dev` (Ctrl-C), restart it (`npm run start`), and |
| 40 | +**don't touch the room** until well past the 45-second mark. Then: |
| 41 | + |
| 42 | +```bash |
| 43 | +curl -s "http://localhost:5173/raw/$ROOM?snapshot=1" | jq |
| 44 | +curl -s -i "http://localhost:5173/parties/stock-alarm/$ROOM?snapshot=1" | head -n 12 |
| 45 | +curl -s "http://localhost:5173/parties/fixed-alarm/$ROOM?snapshot=1" | jq |
| 46 | +``` |
| 47 | + |
| 48 | +Observed behavior on `workerd@1.20260424.1`, |
| 49 | +`compatibility_date: "2026-01-28"`: |
| 50 | + |
| 51 | +- `RawAlarm`: alarm observation has no `ctxIdName` (i.e. `ctx.id.name` |
| 52 | + is `undefined`). Subsequent fetches via `idFromName(...)` ALSO see |
| 53 | + `ctx.id.name === undefined` for the lifetime of that DO instance — |
| 54 | + the instance is "born nameless" and stays that way. |
| 55 | + |
| 56 | +- `StockAlarm`: `Server.fetch` returns 500 with the "Cannot determine |
| 57 | + the name" error. Reproduces the failure reported in cloudflare/partykit#390. |
| 58 | + |
| 59 | +- `FixedAlarm`: `alarm()` runs successfully. `ctx.id.name` is |
| 60 | + `undefined` in the observation, but `this.name` resolves from the |
| 61 | + on-disk `__ps_name` record that PartyServer wrote during session |
| 62 | + A's fetch. `partyserver` recovers the name; the DO continues |
| 63 | + working normally. |
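
The `StockAlarm` outcome above can be checked mechanically from the HTTP status alone. The following is a minimal sketch, not part of the repo: `interpret_stock_status` is a hypothetical helper name, and the mapping from status to verdict is an assumption based only on the 500 response described above. It does not inspect the error body, so a 500 from some unrelated failure would also be reported as a reproduction.

```bash
# Hypothetical helper, not part of the repo: interpret the HTTP status
# returned by the StockAlarm snapshot request after the restart.
interpret_stock_status() {
  case "$1" in
    500) echo "reproduced: Server.fetch failed after the cold alarm wake-up" ;;
    2??) echo "not reproduced: the DO may have been warmed before the alarm fired" ;;
    *)   echo "inconclusive: unexpected status $1" ;;
  esac
}

# Usage against the running dev server (same URL as the snapshot step):
#   status=$(curl -s -o /dev/null -w '%{http_code}' \
#     "http://localhost:5173/parties/stock-alarm/$ROOM?snapshot=1")
#   interpret_stock_status "$status"
```

Run this only after the alarm deadline has passed, since the request itself is an entry into the DO.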
| 64 | + |
| 65 | +## Why three DOs |
| 66 | + |
| 67 | +`RawAlarm` pins down what workerd actually does, free of any |
| 68 | +framework. `StockAlarm` reproduces the user-reported bug under |
| 69 | +`partyserver@0.5.3`. `FixedAlarm` validates that the workspace fix |
| 70 | +restores normal operation under the same conditions. |
| 71 | + |
| 72 | +## Critical: don't warm the DOs before the alarm fires |
| 73 | + |
| 74 | +Any HTTP fetch or WebSocket message sent to a DO between session B startup |
| 75 | +(the restarted `vite dev`) and the alarm firing time will wake the DO via that entry |
| 76 | +point first. workerd captures `ctx.id.name` from the first entry |
| 77 | +point and that value persists for the instance's lifetime. So a |
| 78 | +pre-alarm fetch silently warms `ctx.id.name` and masks the bug. The |
| 79 | +critical window is from `vite dev` starting back up until the |
| 80 | +expected alarm fire time. Don't open the page in a browser, don't |
| 81 | +curl `?snapshot`, don't let any client reconnect to the room. Just |
| 82 | +wait. |
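
One way to "just wait" without guessing is to compute the remaining time locally, so nothing touches the dev server during the critical window. This is a minimal sketch, not part of the repo: `seconds_until_safe` is a hypothetical helper, `SCHEDULED_AT` is an assumed variable you would record yourself right after the session A `curl` calls, and the 15-second margin is an arbitrary choice.

```bash
# Hypothetical helper, not part of the repo.
# Prints how many seconds remain until the alarm deadline plus a safety
# margin, or 0 if that moment has already passed.
# Arguments: scheduled_at (epoch s), delay (s), margin (s), now (epoch s).
seconds_until_safe() {
  local scheduled_at=$1 delay=$2 margin=$3 now=$4
  local remaining=$(( scheduled_at + delay + margin - now ))
  if (( remaining > 0 )); then
    echo "$remaining"
  else
    echo 0
  fi
}

# Usage (assumes SCHEDULED_AT=$(date +%s) was recorded right after the
# session A curl calls, with ?schedule=45 as in the steps above):
#   sleep "$(seconds_until_safe "$SCHEDULED_AT" 45 15 "$(date +%s)")"
```

Because the helper only does arithmetic on timestamps, it cannot warm a DO by accident.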
| 83 | + |
| 84 | +The frontend `index.html` exists for manual exploration but is |
| 85 | +deliberately separate from the cold-DO experiment so a developer |
| 86 | +running the page won't accidentally warm a different room. To run |
| 87 | +the cold experiment, drive everything from `curl` against rooms the |
| 88 | +frontend isn't subscribed to. |