Commit b352a46
SqlRush
fix(cluster): spec-5.15 online-join Hardening v1.1 — half-publish proof, bootstrap epoch-proof, marker identity-grouping
Spec: spec-5.15-online-declared-node-join-membership.md (Hardening v1.1)
Three post-ship correctness findings on the online declared-node join path,
all fail-closed (behind cluster.online_join, default off):
- HF-1 / INV-J9 (half-publish window): the v1.0 joiner opened its write gate
on the durable COMMITTED join marker alone (note_self_admitted), so a
coordinator that crashed/stalled after the marker was majority-durable but
before the publish (epoch advance + survivor state=MEMBER) left the joiner
writable while every other node still saw it JOINING. New pure fn
cluster_reconfig_join_publish_proven(admitted_epoch): qvotec opens the gate
only after a majority of MEMBER survivors have advanced their durable
observed epoch to >= admitted_epoch (the publish actually propagated). A
marker-durable-but-unpublished state keeps the gate CLOSED; the joiner then
times out to 53R61 and restarts with a fresh incarnation.
- HF-2 / INV-J14 (bootstrap fail-open): the v1.0 joiner_self_tick used a
timing-grace heuristic — if no running peer was observed within a window it
declared cold-bootstrap and left the gate open forever, so a slow qvotec
could mis-see a rejoiner as a bootstrap and permanently fail-open. Replaced
with a positive epoch proof cluster_reconfig_bootstrap_quorum_at_initial()
(quorum of declared CSSD-alive AND no declared peer past EPOCH_INITIAL);
undecided keeps the gate CLOSED (fail-closed). shmem init defaults the gate
CLOSED when online_join is on (open when off — no behavior change off).
- HF-3 / INV-J13 (false majority): self-admit and the startup seed counted
"any COMMITTED marker" toward majority, so two minority writes from
different commit attempts (different coordinator/epoch) could aggregate.
ClusterJoinCommitMarker gains a per-attempt commit_nonce (CLUSTER_JCMK_VERSION
1->2; v1 markers fail-closed rejected); new cluster_join_marker_same_commit()
groups by full identity, so only a single commit present on a disk majority
is honored.
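
The three fail-closed predicates above can be sketched in C as follows. This is a minimal illustration only, not the code from this commit: the `Sketch*` types, the `sketch_*` function names, the field layout, and the value chosen for `SKETCH_EPOCH_INITIAL` are all hypothetical, and strict majority is assumed to mean "more than half".

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins; NOT the real SqlRush structs. */
#define SKETCH_EPOCH_INITIAL  1u /* assumed value, for illustration only */
#define SKETCH_MARKER_VERSION 2u /* v2 markers carry a commit_nonce */

typedef struct SketchPeer {
    bool     is_member;      /* survivor currently in state MEMBER */
    bool     cssd_alive;     /* declared peer seen alive */
    uint64_t observed_epoch; /* durable observed epoch */
} SketchPeer;

typedef struct SketchMarker {
    bool     committed;
    uint32_t version;
    uint32_t coordinator_id;
    uint64_t epoch;
    uint64_t commit_nonce;
} SketchMarker;

/* HF-1: open the gate only if a strict majority of MEMBER survivors
 * have a durable observed epoch >= admitted_epoch. */
static bool
sketch_publish_proven(const SketchPeer *peers, size_t n, uint64_t admitted_epoch)
{
    size_t members = 0, advanced = 0;

    for (size_t i = 0; i < n; i++) {
        if (!peers[i].is_member)
            continue;
        members++;
        if (peers[i].observed_epoch >= admitted_epoch)
            advanced++;
    }
    return members > 0 && advanced * 2 > members;
}

/* HF-2: cold bootstrap needs positive proof: a quorum of declared peers
 * alive AND no declared peer past the initial epoch. Anything else is
 * "undecided" and keeps the gate closed. */
static bool
sketch_bootstrap_quorum_at_initial(const SketchPeer *declared, size_t n)
{
    size_t alive = 0;

    for (size_t i = 0; i < n; i++) {
        if (declared[i].observed_epoch > SKETCH_EPOCH_INITIAL)
            return false;
        if (declared[i].cssd_alive)
            alive++;
    }
    return alive * 2 > n;
}

/* HF-3: two markers belong to the same commit only if their full
 * identity matches; old-version markers are rejected (fail-closed). */
static bool
sketch_same_commit(const SketchMarker *a, const SketchMarker *b)
{
    return a->committed && b->committed &&
           a->version == SKETCH_MARKER_VERSION &&
           b->version == SKETCH_MARKER_VERSION &&
           a->coordinator_id == b->coordinator_id &&
           a->epoch == b->epoch &&
           a->commit_nonce == b->commit_nonce;
}

/* True only if one single commit is present on a strict majority of
 * the n_disks marker slots; markers from different attempts never add up. */
static bool
sketch_marker_majority(const SketchMarker *m, size_t n_disks)
{
    for (size_t i = 0; i < n_disks; i++) {
        size_t same = 0;

        for (size_t j = 0; j < n_disks; j++)
            if (sketch_same_commit(&m[i], &m[j]))
                same++;
        if (same * 2 > n_disks)
            return true;
    }
    return false;
}
```

In this sketch, two COMMITTED markers with different coordinator, epoch, or nonce on a three-disk set never reach a majority, while the same marker on two of three disks does. All three functions are pure, so they can be unit-tested without shared memory or disk I/O.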
Test-infra fix the L8 e2e surfaced: the three reconfig-join injection fire
sites were never in the cluster_inject registry, so cluster.injection_points
rejected them as unknown and the half-publish injection never armed. Register
cluster-reconfig-join-commit-marker-durable (the one L8 needs) — total registry
139 -> 140; ripple the count baselines in t/015/017/018/020/021/022/023/024/030
and the cluster-reconfig-% count in cluster_regress reconfig_smoke (5 -> 6).
Tests: cluster_unit 131/131 (U16-U19 new: same_commit identity group, version
fail-closed, publish-proven member quorum, bootstrap epoch proof); t/315 18/18
incl. new L8 real crash-in-window e2e (coordinator paused post-marker-durable,
joiner stays 53R60 then 53R61, never half-published); cluster_regress 11/11;
PG 219/219; clang-format v18 clean. No catversion bump (marker is voting-disk
on-disk, self-versioned, not catalog).
1 parent 2056233 commit b352a46
21 files changed
Lines changed: 588 additions & 85 deletions
File tree
- src
- backend/cluster
- include/cluster
- test
- cluster_regress
- expected
- sql
- cluster_tap/t
- cluster_unit