spec-5.19 MG-B/MG-D: heap-ITL WAL delta v3 (close single-node write-tax gate)#11

Merged
sqlrush merged 3 commits into main from spec-5.19-stage5-acceptance on Jun 29, 2026

sqlrush commented Jun 29, 2026


Closes the spec-5.19 MG-B single-node write-tax hard gate (rule 8.B). With v2 (48B/record), CI nightly measured a write tax of 10.62%, above the 10% limit. An isolated A/B run that registered the delta 8B shorter passed the gate, confirming that dropping the always-Invalid 8B commit_scn is sufficient.

v3 = xl_heap_itl_delta_v3 (32B): keeps UBA, drops the always-Invalid write-time commit_scn. Redo dispatches v1/v2/v3 by block format_version (v1/v2 retained for backward replay; v3 reconstructs commit_scn=InvalidScn). All 8 write-path emit sites emit ACTIVE/LOCK only — verified no COMMITTED delta — so the drop is crash-recovery-lossless; the COMMITTED-requires-valid-SCN redo guard still fails closed. catversion bumped (fences old binary from v3 WAL).

Per mutating heap record: 8+40==48B -> 8+32==40B.
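
For readers who want the size arithmetic in compilable form, here is a minimal sketch. Only the totals (40B and 32B), the 8B commit_scn, and the undo_segment_head offsets (24 in v2, 16 in v3) come from this PR. Every `demo_` name and every other field is a hypothetical stand-in, not the real `xl_heap_itl_delta_v3` layout, and the PR does not say what the fixed 8B term holds.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical v2-style layout: 40 bytes, commit_scn at offset 16. */
typedef struct demo_itl_delta_v2
{
    uint16_t slot_no;            /* assumed field */
    uint16_t flags;              /* assumed field */
    uint32_t reserved;           /* assumed padding */
    uint64_t xid;                /* assumed field */
    uint64_t commit_scn;         /* always InvalidScn at write time */
    uint64_t undo_segment_head;  /* offset 24 in v2 (from the PR) */
    uint64_t undo_rest;          /* assumed remainder of the UBA */
} demo_itl_delta_v2;

/* Hypothetical v3-style layout: same fields with commit_scn elided. */
typedef struct demo_itl_delta_v3
{
    uint16_t slot_no;
    uint16_t flags;
    uint32_t reserved;
    uint64_t xid;
    uint64_t undo_segment_head;  /* offset 16 in v3 (from the PR) */
    uint64_t undo_rest;
} demo_itl_delta_v3;

/* The "8 +" term in the PR's per-record arithmetic. */
#define DEMO_FIXED_BYTES 8

/* Compile-time pins, in the spirit of the StaticAsserts the PR mentions. */
_Static_assert(sizeof(demo_itl_delta_v2) == 40, "v2 delta must be 40B");
_Static_assert(sizeof(demo_itl_delta_v3) == 32, "v3 delta must be 32B");
_Static_assert(offsetof(demo_itl_delta_v2, undo_segment_head) == 24,
               "v2 undo_segment_head offset");
_Static_assert(offsetof(demo_itl_delta_v3, undo_segment_head) == 16,
               "v3 undo_segment_head offset");

/* Bytes added to one mutating heap record for a given delta size. */
static size_t
demo_bytes_per_record(size_t delta_size)
{
    return DEMO_FIXED_BYTES + delta_size;
}
```

Under these assumptions `demo_bytes_per_record` gives 48 for v2 and 40 for v3, so the change removes 8 of 48 bytes per mutating heap record.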

Verification: local build StaticAsserts pass (v3==32B), cluster_unit test_cluster_itl_wal 33/33 + stage5 acceptance 6/6. Full nightly dispatched on this branch for the real t/328 perf number + crash-recovery TAP (8.A).

🤖 Generated with Claude Code

SqlRush added 3 commits June 29, 2026 20:51
…-logged median

- M3: two-node peer-online single-writer write-tax measurement (real
  ClusterPair, strict quorum + shared_data; node0 writes while node1 is
  in quorum).  REPORT ONLY: never asserts a threshold and never fails the
  single-node hard gate -- if the ClusterPair cannot boot/quorum/measure
  this run it passes with an explicit "unavailable" note.  Addresses the
  2-node write-path question without weakening the M1 single-node gate.
- M1: emit the measured median write tax via diag() (reaches the CI log
  even on PASS; note() is swallowed by non-verbose prove) so the gate's
  headroom is visible without re-running the shard verbose.

The HARD gate stays the single-node M1 tax <= 10% (rule 8.B).
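
A rough sketch of how a median-based tax check against the 10% limit could look. The tax formula below (throughput lost relative to a baseline, in percent) is an assumption; this PR does not define it. The real gate lives in the TAP test, not in C, and all `demo_` names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* qsort comparator for doubles, ascending. */
static int
demo_cmp_double(const void *a, const void *b)
{
    double x = *(const double *) a;
    double y = *(const double *) b;

    return (x > y) - (x < y);
}

/* Median of n samples (n must be > 0); sorts the array in place. */
static double
demo_median(double *v, size_t n)
{
    qsort(v, n, sizeof(double), demo_cmp_double);
    return (n % 2) ? v[n / 2] : (v[n / 2 - 1] + v[n / 2]) / 2.0;
}

/* ASSUMED definition: percent of baseline throughput lost. */
static double
demo_write_tax_pct(double baseline_tps, double measured_tps)
{
    return (baseline_tps - measured_tps) / baseline_tps * 100.0;
}

/* Hard gate from rule 8.B as stated in the PR: tax <= 10%. */
static int
demo_gate_passes(double tax_pct)
{
    return tax_pct <= 10.0;
}
```

With this shape, the 10.62% nightly figure fails `demo_gate_passes`, while any median at or below 10% passes.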
…valid commit_scn, 48->40B/record)

Closes the MG-B single-node write-tax blocker (rule 8.B).  CI nightly
measured the v2 (48B/record) tax at 10.62% > the 10% hard gate; an A/B
that registered the delta 8B shorter passed the gate, confirming the
8B drop is sufficient.

The write-time commit_scn (8B) is ALWAYS InvalidScn at every write-path
ITL emit site: heap_insert / multi_insert / delete / lock / lock-chain /
update old+new only ever stamp ITL_FLAG_ACTIVE / ITL_FLAG_LOCK_ONLY_ACTIVE
transitions (the slot is not yet committed; COMMITTED stamping happens via
the later commit-time / delayed-cleanout page mutation, FPI-logged, not via
a write-path delta).  Dropping an always-Invalid field is lossless.
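
The lossless argument can be checked with a small round-trip sketch: if a slot never carries COMMITTED or a valid SCN at emit time, dropping commit_scn on the wire and restoring InvalidScn on decode reproduces the slot exactly. The flag bit values, the InvalidScn encoding, and all `demo_` types are assumptions. Whether the real emit sites enforce this with a runtime check is not stated in this PR.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_INVALID_SCN               ((uint64_t) 0)  /* assumed encoding */
#define DEMO_ITL_FLAG_ACTIVE           0x01            /* assumed bit */
#define DEMO_ITL_FLAG_LOCK_ONLY_ACTIVE 0x02            /* assumed bit */
#define DEMO_ITL_FLAG_COMMITTED        0x04            /* assumed bit */

/* In-memory slot state (hypothetical subset of fields). */
typedef struct demo_itl_slot
{
    uint16_t flags;
    uint64_t xid;
    uint64_t commit_scn;
    uint64_t undo_segment_head;
} demo_itl_slot;

/* v3-style wire form: identical, minus commit_scn. */
typedef struct demo_itl_wire_v3
{
    uint16_t flags;
    uint64_t xid;
    uint64_t undo_segment_head;
} demo_itl_wire_v3;

/* Returns 0 on success, -1 if the slot cannot be represented without loss. */
static int
demo_encode_v3(const demo_itl_slot *slot, demo_itl_wire_v3 *out)
{
    if ((slot->flags & DEMO_ITL_FLAG_COMMITTED) ||
        slot->commit_scn != DEMO_INVALID_SCN)
        return -1;

    out->flags = slot->flags;
    out->xid = slot->xid;
    out->undo_segment_head = slot->undo_segment_head;
    return 0;
}

/* Decode restores the elided field as InvalidScn. */
static void
demo_decode_v3(const demo_itl_wire_v3 *in, demo_itl_slot *slot)
{
    slot->flags = in->flags;
    slot->xid = in->xid;
    slot->commit_scn = DEMO_INVALID_SCN;
    slot->undo_segment_head = in->undo_segment_head;
}

/* Field-by-field comparison (avoids comparing struct padding bytes). */
static int
demo_slots_equal(const demo_itl_slot *a, const demo_itl_slot *b)
{
    return a->flags == b->flags && a->xid == b->xid &&
           a->commit_scn == b->commit_scn &&
           a->undo_segment_head == b->undo_segment_head;
}
```

An ACTIVE or LOCK_ONLY_ACTIVE slot survives encode then decode unchanged, and a COMMITTED slot is refused at encode time rather than silently truncated.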

- heapam_xlog.h: new xl_heap_itl_delta_v3 (32B) + CLUSTER_ITL_DELTA_FORMAT_V3;
  keeps UBA (undo_segment_head moves 24->16), only commit_scn elided.
  StaticAsserts pin sizeof==32 and all field offsets.
- cluster_itl.c: redo dispatches v1/v2/v3 by block format_version; v3
  reconstructs commit_scn=InvalidScn.  v1/v2 branches retained for backward
  WAL replay.  consumed-bytes helper extended for v3.
- heapam.c: all 8 write-path emit sites switch v2->v3 (drop the commit_scn
  assignment; register sizeof(v3)=32).
- The COMMITTED-requires-valid-SCN redo guard still fires for any v3 delta
  that carries ITL_FLAG_COMMITTED -> fails closed (PANIC), so v3 can never
  silently install a committed slot with InvalidScn (8.A).
- catversion 202606330 -> 202606340: fences an old binary from replaying
  v3-format WAL (unknown format_version -> redo PANIC).
- Per mutating heap record: 8 + 40 == 48B -> 8 + 32 == 40B.
- Tests: test_cluster_itl_wal v3 layout (T30-T33); D8 L6 invariant -> 40B
  (v2 40B retained for backward replay); t/329 MG-D model -> 40B + decision
  framing (v3 GO part shipped; same-block coalesce remains a follow-up).
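
The redo-side behavior listed above (dispatch by format_version, v3 reconstructing commit_scn, and the fail-closed guard) can be sketched as follows. This is an illustration under assumptions, not the code in cluster_itl.c. The byte offsets for flags, the numeric format values, and all `demo_` names are hypothetical. v1 is omitted because its size is not given here, and error codes stand in for PANIC.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define DEMO_INVALID_SCN        ((uint64_t) 0)  /* assumed encoding */
#define DEMO_FLAG_ACTIVE        0x01            /* assumed bit */
#define DEMO_FLAG_COMMITTED     0x04            /* assumed bit */

enum { DEMO_FMT_V2 = 2, DEMO_FMT_V3 = 3 };      /* assumed values */

typedef enum demo_redo_result
{
    DEMO_REDO_OK = 0,
    DEMO_REDO_PANIC_UNKNOWN_FORMAT,        /* stands in for redo PANIC */
    DEMO_REDO_PANIC_SHORT_RECORD,
    DEMO_REDO_PANIC_COMMITTED_WITHOUT_SCN  /* the fail-closed guard */
} demo_redo_result;

typedef struct demo_slot
{
    uint16_t flags;
    uint64_t commit_scn;
    uint64_t undo_segment_head;
} demo_slot;

/* Consumed-bytes helper: delta size per format, 0 if unknown. */
static size_t
demo_delta_size(int format_version)
{
    switch (format_version)
    {
        case DEMO_FMT_V2: return 40;
        case DEMO_FMT_V3: return 32;
        default:          return 0;
    }
}

static demo_redo_result
demo_redo_delta(int format_version, const uint8_t *buf, size_t len,
                demo_slot *slot)
{
    size_t    need = demo_delta_size(format_version);
    demo_slot tmp;

    if (need == 0)
        return DEMO_REDO_PANIC_UNKNOWN_FORMAT;
    if (len < need)
        return DEMO_REDO_PANIC_SHORT_RECORD;

    memcpy(&tmp.flags, buf + 2, sizeof(tmp.flags));  /* assumed offset */
    if (format_version == DEMO_FMT_V2)
    {
        memcpy(&tmp.commit_scn, buf + 16, 8);
        memcpy(&tmp.undo_segment_head, buf + 24, 8);
    }
    else
    {
        tmp.commit_scn = DEMO_INVALID_SCN;           /* reconstructed */
        memcpy(&tmp.undo_segment_head, buf + 16, 8);
    }

    /* COMMITTED requires a valid SCN; refuse before touching *slot. */
    if ((tmp.flags & DEMO_FLAG_COMMITTED) &&
        tmp.commit_scn == DEMO_INVALID_SCN)
        return DEMO_REDO_PANIC_COMMITTED_WITHOUT_SCN;

    *slot = tmp;
    return DEMO_REDO_OK;
}
```

Because v3 always reconstructs InvalidScn, any v3 delta carrying the COMMITTED flag hits the guard, which matches the fail-closed behavior described for 8.A. The unknown-format branch mirrors what an old binary would see when handed v3 WAL.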
…ative/two-node TPS + tax + unavailable reason)

Output-visibility hardening for the MG-B report-only leg (the M1 single-node
gate already diag()s its median):

- M3 now diag()s the native single-node median tps, the two-node peer-online
  node0 median tps, and the two-node write tax % — captured in the CI log even
  on PASS / non-verbose prove, so the report-only numbers are never a silent
  black box.
- When the 2-node measurement is unavailable, M3 prints the SPECIFIC reason
  (ClusterPair boot failed + error / peers not connected+in_quorum within
  timeout / pgbench init failed / no valid rounds / native baseline missing)
  instead of a generic "unavailable".
- Still strictly REPORT ONLY: M3 never asserts a threshold and never fails the
  single-node hard gate.

No behavior change to the M1 single-node ≤10% hard gate.
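
The report-only contract for M3 can be modeled in a few lines: every outcome maps to a specific reason string, and the pass/fail result never depends on the outcome or the measured tax. This is a sketch of the contract only. The actual test is a TAP script, and the enum and function names here are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* One value per unavailable reason named in the commit message. */
typedef enum demo_m3_status
{
    DEMO_M3_MEASURED,
    DEMO_M3_BOOT_FAILED,
    DEMO_M3_NOT_IN_QUORUM,
    DEMO_M3_PGBENCH_INIT_FAILED,
    DEMO_M3_NO_VALID_ROUNDS,
    DEMO_M3_NO_NATIVE_BASELINE
} demo_m3_status;

/* Specific reason text instead of a generic "unavailable". */
static const char *
demo_m3_reason(demo_m3_status status)
{
    switch (status)
    {
        case DEMO_M3_MEASURED:            return "measured";
        case DEMO_M3_BOOT_FAILED:         return "ClusterPair boot failed";
        case DEMO_M3_NOT_IN_QUORUM:       return "peers not connected+in_quorum within timeout";
        case DEMO_M3_PGBENCH_INIT_FAILED: return "pgbench init failed";
        case DEMO_M3_NO_VALID_ROUNDS:     return "no valid rounds";
        case DEMO_M3_NO_NATIVE_BASELINE:  return "native baseline missing";
    }
    return "unknown";
}

/* Report-only: passes regardless of status or measured tax. */
static bool
demo_m3_passes(demo_m3_status status, double tax_pct)
{
    (void) status;
    (void) tax_pct;
    return true;
}
```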
sqlrush merged commit e4dfcb4 into main on Jun 29, 2026
5 checks passed