Commit 80b3b74
docs(scaling-dive): rewrite with cliff-finding + Qodo fixes
Captures everything learned since the first draft:
- The "250-author cliff" was a measurement artefact caused by per-IP
commitRateLimiting combined with a colocated harness. Fixed via the
etherpad-load-test#105 workflow patch. The real ceiling is ~350-400
authors on a 4-vCPU GitHub runner.
- apply_mean ballooning at the cliff isn't slow code; it's OS
preemption (7+ cores of work on 4 vCPU). Rearranging JS at the
application level can't fix it.
- Two changes hold up under the dive: fan-out serialization
+ NEW_CHANGES_BATCH (#7768, 70% p95 drop at 200 authors) and the
historicalAuthorData cache (#7769, neutral in the dive but a real
production fix for the thundering herd at join time).
- Four directions didn't pan out: WebSocket-only transport, heap
bump, message-level batching alone (#7766 closed), and
rebase-loop prefetch (#7770 closed). Each has a one-line cause
documented for the record.
- Engine.io transport-level packing (#7767) is the meatiest
untouched lever — sending multiple packets per WebSocket frame
the way polling already does via encodePayload.
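The measurement artefact in the first bullet can be sketched as follows.
This is a minimal illustration, not Etherpad's commitRateLimiting code:
the PerKeyRateLimiter class, the acceptedCommits helper, the limit of 10
commits per 1000 ms, and the IP addresses are all hypothetical. It also
assumes that a colocated harness means every simulated author reaches
the server from the same address.

```typescript
// Hypothetical fixed-window limiter keyed by client IP.
// Not Etherpad's implementation; names and numbers are illustrative.
class PerKeyRateLimiter {
  private readonly windows = new Map<string, { start: number; count: number }>();
  private readonly points: number;     // commits allowed per window
  private readonly durationMs: number; // window length in milliseconds

  constructor(points: number, durationMs: number) {
    this.points = points;
    this.durationMs = durationMs;
  }

  // Returns true if the commit is allowed, false if it is throttled.
  tryConsume(key: string, nowMs: number): boolean {
    const w = this.windows.get(key);
    if (w === undefined || nowMs - w.start >= this.durationMs) {
      // First commit for this key, or the previous window has expired.
      this.windows.set(key, { start: nowMs, count: 1 });
      return true;
    }
    if (w.count < this.points) {
      w.count += 1;
      return true;
    }
    return false;
  }
}

// Each author sends one commit at t=0; count how many get through.
function acceptedCommits(authors: number, ipFor: (author: number) => string): number {
  const limiter = new PerKeyRateLimiter(10, 1000);
  let accepted = 0;
  for (let a = 0; a < authors; a++) {
    if (limiter.tryConsume(ipFor(a), 0)) accepted += 1;
  }
  return accepted;
}

// Colocated harness (assumed): every author shares one source address.
const colocated = acceptedCommits(300, () => "127.0.0.1");
// Distributed clients: each author has its own address.
const distributed = acceptedCommits(300, (a) => `10.0.${Math.floor(a / 256)}.${a % 256}`);

console.log({ colocated, distributed }); // { colocated: 10, distributed: 300 }
```

With a shared key the limiter, not the server's capacity, sets the
ceiling, which is why a per-IP limit can look like a scaling cliff when
the load generator runs from a single address.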
Qodo-flagged corrections incorporated:
1. The new instruments are Histogram + Counter + Gauge, not
"three counters"; they are now labelled correctly.
2. The lever-3 line reference now points at updatePadClients
(lines 985-999), where NEW_CHANGES is actually emitted, not at
line 627 (handleSaveRevisionMessage).
3. Lever 3's results are written up against measured data, not
"deferred".
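Why correction 1 matters can be shown with a small sketch of the three
instrument kinds. The Counter, Gauge, and Histogram classes below are
hypothetical stand-ins written for illustration; they are not the
instruments added in this change, nor the API of any particular metrics
library, and the metric names and bucket bounds are made up.

```typescript
// Hypothetical minimal instruments; real metrics libraries differ in API.

// Counter: a running total that only ever increases.
class Counter {
  private total = 0;
  inc(by: number = 1): void {
    if (by < 0) throw new RangeError("a counter can only increase");
    this.total += by;
  }
  get value(): number {
    return this.total;
  }
}

// Gauge: a point-in-time value that can go up or down.
class Gauge {
  private current = 0;
  set(v: number): void {
    this.current = v;
  }
  get value(): number {
    return this.current;
  }
}

// Histogram: buckets observations so a distribution can be reported.
class Histogram {
  private readonly upperBounds: number[];
  private readonly bucketCounts: number[];
  private observations = 0;

  constructor(upperBounds: number[]) {
    this.upperBounds = [...upperBounds].sort((a, b) => a - b);
    // One extra bucket at the end for values above every bound.
    this.bucketCounts = new Array<number>(this.upperBounds.length + 1).fill(0);
  }

  observe(v: number): void {
    let i = this.upperBounds.findIndex((bound) => v <= bound);
    if (i === -1) i = this.upperBounds.length; // overflow bucket
    this.bucketCounts[i] = (this.bucketCounts[i] ?? 0) + 1;
    this.observations += 1;
  }

  get count(): number {
    return this.observations;
  }

  buckets(): number[] {
    return [...this.bucketCounts];
  }
}

// Illustrative usage with made-up metric names and values.
const applyMs = new Histogram([5, 50, 500]);
for (const ms of [2, 7, 40, 900]) applyMs.observe(ms);

const changesets = new Counter();
changesets.inc();
changesets.inc(3);

const connected = new Gauge();
connected.set(200);
connected.set(180);

console.log(applyMs.buckets(), changesets.value, connected.value);
// [ 1, 2, 0, 1 ] 4 180
```

Calling all three "counters" would suggest they can be read the same
way; only the Counter is a monotonic total, while the Gauge is a current
level and the Histogram is a distribution.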
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent 2645da2 commit 80b3b74
1 file changed
Lines changed: 140 additions & 72 deletions