Commit f2f4001

Author: SqlRush
fix(cluster): spec-5.16 Hardening v1.4 — 2 P1 join-remaster correctness (RF1 P0/8.A + RF2)
Post-ship adversarial review (r5) of the online-join GRD/PCM remaster found two real defects, both confirmed against the code and fixed in-spec (rule 8.A: visibility/lock correctness is fail-closed, no forward-link).

RF1 (P0/8.A) -- the JOIN fence bitmap accumulated across episodes and was never cleared, so a prior rejoiner (now a steady survivor) was wrongly excluded from the NEXT episode's re-declare barrier -> premature fence lift -> the joiner cold-serves a block the survivor still holds X on -> double-grant/false-visible. This overturns the v1.3 CF5 P2 "stale bit is benign, no clear needed" conclusion, which only reasoned about the over-fence direction and missed the opposite polarity of the same bitmap in the two barrier-exclusion sites (cluster_grd_block_view_rebuilt and the WAIT_CLUSTER re-declare ACK barrier).

Fix: replace join_pcm_fenced_member[words] with a per-node armed-epoch array join_pcm_fence_member_epoch[CLUSTER_MAX_NODES]. A node is a recipient of the current episode iff member_epoch[node] == fence_epoch; a stale stamp from a completed prior episode is < the current epoch and is no longer excluded. Per-node monotonic-max keeps the v1.3 concurrent-arm union (no under-fence) with no reset race.

RF2 -- qvotec peer-observe counted ANY committed-basis join marker toward the admission majority and merged max incarnation/epoch across distinct attempts, unlike the self-admit and startup-seed paths which require a same-commit majority (INV-J13). Under a half-publish/retry window distinct-attempt minority markers could aggregate into a false majority -> a survivor readmits a peer whose join never durably committed -> membership identity split (8.A).

Fix: extract the shared selector cluster_join_marker_select_majority() and use it in all three sites (peer-observe + self-admit + startup-seed), eliminating the third-site drift that caused this.

Tests: cluster_unit test_jr_u17_stale_recipient_not_excluded_next_episode (RF1) and test_marker_select_majority_groups_by_commit (RF2), both RED->GREEN.

Local gate green: cluster_unit (129 binaries) + cluster_regress 12/12 + cluster_tap t/325+t/326 (3-node no-double-grant) 35/35 + PG219 219/219 + clang-format v18 0-diff.

No wire / no catversion change; shmem GRD region only.

Spec: spec-5.16-online-join-grd-pcm-remaster.md (Hardening v1.4)
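To make RF1 concrete, here is a minimal single-threaded sketch of the two recipient tests. It is not the project's code: `BitmapFence`, `EpochFence`, `MAX_NODES` and every function name are hypothetical stand-ins, and the real implementation uses `pg_atomic_*` operations on shared memory. The sketch only illustrates why a bit that is never cleared keeps excluding a prior rejoiner, while an epoch stamp ages out by itself once the fence epoch moves on.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_NODES 8 /* hypothetical stand-in for CLUSTER_MAX_NODES */

/* v1.3 shape (assumed): one bit per node, OR-accumulated, never cleared. */
typedef struct BitmapFence {
    uint64_t fenced_bits;
} BitmapFence;

/* v1.4 shape (assumed): one armed-epoch stamp per node plus the fence epoch. */
typedef struct EpochFence {
    uint64_t fence_epoch;
    uint64_t member_epoch[MAX_NODES];
} EpochFence;

void bitmap_arm(BitmapFence *f, int node)
{
    f->fenced_bits |= UINT64_C(1) << node;
}

/* True when the barrier would skip this node. */
bool bitmap_excluded(const BitmapFence *f, int node)
{
    return ((f->fenced_bits >> node) & 1) != 0;
}

void epoch_arm(EpochFence *f, int node, uint64_t epoch)
{
    /* Monotonic max on both the node stamp and the fence epoch. */
    if (epoch > f->member_epoch[node])
        f->member_epoch[node] = epoch;
    if (epoch > f->fence_epoch)
        f->fence_epoch = epoch;
}

/* True only for a recipient of the CURRENT episode. */
bool epoch_excluded(const EpochFence *f, int node)
{
    return f->fence_epoch != 0 && f->member_epoch[node] == f->fence_epoch;
}
```

If node 1 is armed in epoch 5 and node 2 is armed later in epoch 7, `bitmap_excluded(&b, 1)` still returns true, so the barrier would skip a node that is now a steady survivor. `epoch_excluded(&e, 1)` returns false because the stamp 5 no longer equals the fence epoch 7, and only node 2 is excluded.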
1 parent 31ceb49 commit f2f4001

10 files changed

Lines changed: 283 additions & 111 deletions

src/backend/cluster/cluster_grd.c

Lines changed: 67 additions & 53 deletions
@@ -632,8 +632,8 @@ cluster_grd_shmem_init(void)
 
     /* spec-5.16 D2/D3b/D5 — online-join remaster fence + counters. */
     pg_atomic_init_u64(&cluster_grd_state->join_pcm_fence_epoch, 0);
-    for (i = 0; i < (CLUSTER_MAX_NODES + 63) / 64; i++)
-        pg_atomic_init_u64(&cluster_grd_state->join_pcm_fenced_member[i], 0);
+    for (i = 0; i < CLUSTER_MAX_NODES; i++)
+        pg_atomic_init_u64(&cluster_grd_state->join_pcm_fence_member_epoch[i], 0);
     pg_atomic_init_u32(&cluster_grd_state->recovery_direction, (uint32)GRD_REMASTER_DIR_NONE);
     pg_atomic_init_u64(&cluster_grd_state->join_remaster_started_count, 0);
     pg_atomic_init_u64(&cluster_grd_state->join_remaster_done_count, 0);
@@ -1221,46 +1221,55 @@ cluster_grd_master_map_recompute_for_membership(const uint8 *active_member, uint
  * Arm the joiner-home PCM block fence SYNCHRONOUSLY (NEVER from the LMON
  * tick — the async tick is structurally later than the 5.15 write-gate open,
  * so a fence armed there would leave a "MEMBER-writable but unfenced" window).
- * Sets join_pcm_fenced_member (the rejoining set) THEN raises join_pcm_fence_
- * epoch (a write barrier between, so a reader that sees the raised epoch sees
- * the member set). Monotonic-max: a later join re-arms higher; re-arm of
- * the same epoch is an idempotent no-op (INV-R12).
+ * Stamps each rejoining recipient's join_pcm_fence_member_epoch with THIS
+ * episode's epoch THEN raises join_pcm_fence_epoch (a write barrier between, so
+ * a reader that sees the raised epoch sees the recipient stamps). Monotonic-
+ * max: a later join re-arms higher; re-arm of the same epoch is an idempotent
+ * no-op (INV-R12).
  *
- * spec-5.16 Hardening v1.3 (Rule 8.A) — the member set is OR-accumulated, never
- * overwritten. Two arms race on the rejoining node: qvotec (note_self_admitted,
- * {self}) and the LMON tick ({evt.join_bitmap} for a multi-joiner episode). A
- * plain write would let the {self} arm zero a co-joiner's bit, UNDER-fencing its
- * home block -> cold-serve -> 8.A double-grant. OR makes concurrent arms
- * accumulate the union regardless of interleaving (no under-fence). A stale bit
- * carried from a completed prior episode only OVER-fences (the block waits for
- * this epoch's view_rebuilt barrier, then lifts) — benign liveness, never a
- * correctness fault, so no per-bit clear is required here.
+ * spec-5.16 Hardening v1.4 (Rule 8.A) — the recipient set is keyed PER NODE by
+ * the arming epoch, not an OR-accumulated bitmap. Two arms race on the
+ * rejoining node: qvotec (note_self_admitted, {self}) and the LMON tick
+ * ({evt.join_bitmap} for a multi-joiner episode). Both stamp the SAME epoch on
+ * their respective nodes, so they union with no lost update and no under-fence.
+ * Crucially, a node armed in a COMPLETED prior episode keeps its lower stamp,
+ * which is < the current join_pcm_fence_epoch, so the recipient test (used by
+ * active_for_shard AND the re-declare barriers) excludes it from THIS episode
+ * automatically. The previous v1.3 bitmap was never cleared, so a prior
+ * rejoiner — now a steady survivor that may hold X on the new joiner's home
+ * block — was wrongly skipped by the barrier -> premature fence lift -> cold-
+ * serve -> 8.A double-grant / false-visible (reviewer P1 #1). Per-node epoch
+ * keying fixes both the union (no under-fence) and the staleness (no cross-
+ * episode under-wait) with no reset race.
  */
 void
 cluster_grd_arm_join_pcm_fence(const uint8 *rejoining_set)
 {
     uint64 epoch;
     uint64 prev;
-    int w;
+    int node;
 
     if (cluster_grd_state == NULL || rejoining_set == NULL)
         return;
 
     epoch = cluster_epoch_get_current();
 
-    for (w = 0; w < (CLUSTER_MAX_NODES + 63) / 64; w++) {
-        uint64 word = 0;
-        int j;
-
-        for (j = 0; j < 8; j++) {
-            int byte_idx = w * 8 + j;
+    for (node = 0; node < CLUSTER_MAX_NODES; node++) {
+        uint64 cur;
 
-            if (byte_idx < CLUSTER_RECONFIG_DEAD_BITMAP_BYTES)
-                word |= ((uint64)rejoining_set[byte_idx]) << (8 * j);
+        if (node >= CLUSTER_RECONFIG_DEAD_BITMAP_BYTES * 8)
+            break; /* rejoining_set bitmap is exhausted */
+        if (((rejoining_set[node >> 3] >> (node & 7)) & 1) == 0)
+            continue;
+        /* monotonic-max per node — concurrent same-epoch arms union safely */
+        cur = pg_atomic_read_u64(&cluster_grd_state->join_pcm_fence_member_epoch[node]);
+        while (epoch > cur) {
+            if (pg_atomic_compare_exchange_u64(
+                    &cluster_grd_state->join_pcm_fence_member_epoch[node], &cur, epoch))
+                break;
         }
-        pg_atomic_fetch_or_u64(&cluster_grd_state->join_pcm_fenced_member[w], word);
     }
-    pg_write_barrier(); /* member set visible before the epoch is raised */
+    pg_write_barrier(); /* recipient stamps visible before the epoch is raised */
 
     prev = pg_atomic_read_u64(&cluster_grd_state->join_pcm_fence_epoch);
     while (epoch > prev) {
@@ -1286,18 +1295,19 @@ cluster_grd_join_remaster_in_progress(void)
 /*
  * cluster_grd_join_remaster_active_for_shard -- spec-5.16 D3 (INV-R8).
  *
- * True iff the block's STATIC PCM home (cluster_gcs_lookup_master_static) is
- * in the armed join_pcm_fenced_member set. Bound to online_join (the fence
- * epoch is armed by note_self_admitted / LMON P0-accept), INDEPENDENT of any
- * GRD master[] movement — so join_remaster_enabled=off still fences (r2 P1-①,
- * P1-A closure). false when the fence is not armed.
+ * True iff the block's STATIC PCM home (cluster_gcs_lookup_master_static) is a
+ * rejoining RECIPIENT of the CURRENT fence episode (member_epoch[home] ==
+ * join_pcm_fence_epoch). Bound to online_join (the fence epoch is armed by
+ * note_self_admitted / LMON P0-accept), INDEPENDENT of any GRD master[]
+ * movement — so join_remaster_enabled=off still fences (r2 P1-①, P1-A
+ * closure). false when the fence is not armed or the home is a steady member
+ * (incl. a prior rejoiner whose stamp is from a completed earlier episode).
  */
 bool
 cluster_grd_join_remaster_active_for_shard(BufferTag tag)
 {
     uint64 fence_epoch;
     int home;
-    int w, b;
 
     if (cluster_grd_state == NULL)
         return false;
@@ -1309,11 +1319,7 @@ cluster_grd_join_remaster_active_for_shard(BufferTag tag)
     home = cluster_gcs_lookup_master_static(tag);
     if (home < 0 || home >= CLUSTER_MAX_NODES)
         return false;
-    w = home >> 6;
-    b = home & 63;
-    return (pg_atomic_read_u64(&cluster_grd_state->join_pcm_fenced_member[w])
-            & (UINT64CONST(1) << b))
-           != 0;
+    return pg_atomic_read_u64(&cluster_grd_state->join_pcm_fence_member_epoch[home]) == fence_epoch;
 }
 
 /*
@@ -1340,19 +1346,21 @@ cluster_grd_join_remaster_active_for_shard(BufferTag tag)
  * gate on the joiner is the authoritative backstop (INV-R8/R14).
  */
 
-/* Test whether node_id is in the armed join_pcm_fenced_member (rejoining) set. */
+/*
+ * Test whether node_id is a rejoining RECIPIENT of the fence episode identified
+ * by ref_epoch (member_epoch[node] == ref_epoch). A stale stamp from a prior
+ * episode (< ref_epoch) returns false, so a now-steady survivor is correctly
+ * waited for by the re-declare barriers (Hardening v1.4, reviewer P1 #1).
+ */
 static inline bool
-join_fenced_member_test(int32 node_id)
+join_fence_is_recipient_for(int32 node_id, uint64 ref_epoch)
 {
-    int w, b;
-
     if (cluster_grd_state == NULL || node_id < 0 || node_id >= CLUSTER_MAX_NODES)
         return false;
-    w = node_id >> 6;
-    b = node_id & 63;
-    return (pg_atomic_read_u64(&cluster_grd_state->join_pcm_fenced_member[w])
-            & (UINT64CONST(1) << b))
-           != 0;
+    if (ref_epoch == 0)
+        return false;
+    return pg_atomic_read_u64(&cluster_grd_state->join_pcm_fence_member_epoch[node_id])
+           == ref_epoch;
 }
 
 bool
@@ -1379,10 +1387,12 @@ cluster_grd_block_view_rebuilt(BufferTag tag)
          * JOIN_COMMITTED event as a reconfig episode (it is published coordinator-
          * side only), so it never announces REDECLARE_DONE. The binding safety
          * condition is "every SURVIVOR finished re-declaring its held joiner-home
-         * blocks" — exclude the fenced set so view_rebuilt converges on the
-         * survivors' done (Hardening v1.1 + D8 fix).
+         * blocks" — exclude only THIS episode's recipients so view_rebuilt
+         * converges on the survivors' done (Hardening v1.1 + D8 fix). Keyed on
+         * fence_epoch so a prior rejoiner (now a steady survivor) is NOT skipped
+         * (Hardening v1.4, reviewer P1 #1: cross-episode under-wait -> 8.A).
          */
-        if (join_fenced_member_test(i))
+        if (join_fence_is_recipient_for(i, fence_epoch))
             continue;
         if (pg_atomic_read_u64(&cluster_grd_state->recovery_done_epoch[i]) < fence_epoch)
             return false;
@@ -2074,7 +2084,7 @@ cluster_grd_recovery_lmon_tick(void)
      * (P5-P7 below); for JOIN that rebuilds the joiner's PCM block view even
      * when join_remaster_enabled is off. The two scopes are independent:
      * grd_moved_shards = `affected` here (GRD, GUC-gated); pcm_fenced_home_set
-     * lives in join_pcm_fenced_member (PCM, online_join-gated, armed at P0).
+     * lives in join_pcm_fence_member_epoch (PCM, online_join-gated, armed at P0).
      */
     memset(affected, 0, sizeof(affected));
     if (pg_atomic_read_u32(&cluster_grd_state->recovery_direction)
@@ -2307,10 +2317,14 @@ cluster_grd_recovery_lmon_tick(void)
             * and never observes the JOIN_COMMITTED event as its own reconfig
             * episode (it is published coordinator-side only), so it never
             * announces REDECLARE_DONE. Waiting for it would wedge the survivor's
-            * barrier forever. The survivors (everyone outside the fenced set)
-            * ARE the re-declarers and must converge.
+            * barrier forever. The survivors (everyone outside THIS episode's
+            * recipient set) ARE the re-declarers and must converge. Keyed on
+            * episode_epoch so a prior rejoiner (now a steady survivor) is still
+            * waited for (Hardening v1.4, reviewer P1 #1: a stale cross-episode
+            * exclusion would skip a survivor holding X on the joiner's home
+            * block -> premature unfreeze -> 8.A double-grant).
             */
-            if (is_join && join_fenced_member_test(i))
+            if (is_join && join_fence_is_recipient_for(i, episode_epoch))
                 continue;
             if (pg_atomic_read_u64(&cluster_grd_state->recovery_done_epoch[i])
                 < episode_epoch) {

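The per-node arm loop in `cluster_grd_arm_join_pcm_fence` is a compare-and-swap monotonic maximum. The following is a portable sketch of that pattern using C11 `<stdatomic.h>` rather than PostgreSQL's `pg_atomic_*` API. The helper name `raise_to_max` is hypothetical, and the default sequentially consistent ordering used here is stronger than the explicit write barrier in the patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/*
 * Raise *slot to epoch unless it already holds a value >= epoch.
 * Returns the value the slot is known to hold afterwards, which is
 * always >= epoch.  Safe against concurrent callers: a lower epoch
 * can never overwrite a higher one.
 */
uint64_t raise_to_max(_Atomic uint64_t *slot, uint64_t epoch)
{
    uint64_t cur = atomic_load(slot);

    while (epoch > cur) {
        /*
         * On failure, cur is refreshed with the value another writer
         * stored (or left unchanged on a spurious failure), and the loop
         * condition decides whether a retry is still needed.
         */
        if (atomic_compare_exchange_weak(slot, &cur, epoch))
            return epoch;
    }
    return cur;
}
```

Two callers that stamp the same epoch on the same slot are harmless: whichever loses the race sees `cur == epoch` and leaves the loop without writing. A caller carrying an older epoch never enters the loop, which is why no reset step is needed between episodes.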
src/backend/cluster/cluster_membership.c

Lines changed: 35 additions & 0 deletions
@@ -291,6 +291,41 @@ cluster_join_marker_same_commit(const ClusterJoinCommitMarker *a, const ClusterJ
     return memcmp(a, b, offsetof(ClusterJoinCommitMarker, crc32c)) == 0;
 }
 
+/*
+ * INV-J13 majority selector (shared by self-admit, startup-seed and qvotec
+ * peer-observe). `markers` must already be committed-basis; O(n^2) over n disks
+ * (n <= CLUSTER_MAX_VOTING_DISKS, small). Returns the index of the first marker
+ * that is same_commit with >= `majority` of the array (i.e. a single commit
+ * attempt that actually reached a disk majority), or -1. A set of distinct-
+ * attempt minority markers therefore selects NOTHING — they never aggregate
+ * into a false majority (reviewer P1 #2 / P1-3).
+ */
+int
+cluster_join_marker_select_majority(const ClusterJoinCommitMarker *markers, int n, uint32 majority,
+                                    uint32 *out_agree)
+{
+    int a, b;
+
+    if (out_agree != NULL)
+        *out_agree = 0;
+    if (markers == NULL || n <= 0)
+        return -1;
+
+    for (a = 0; a < n; a++) {
+        uint32 same = 0;
+
+        for (b = 0; b < n; b++)
+            if (cluster_join_marker_same_commit(&markers[a], &markers[b]))
+                same++;
+        if (same >= majority) {
+            if (out_agree != NULL)
+                *out_agree = same;
+            return a;
+        }
+    }
+    return -1;
+}
+
 /*
  * Apply one durable marker to the admitted floor (INV-J7). Only a committed
  * basis raises the floor; record_admitted is monotonic so re-applying / lower

src/backend/cluster/cluster_qvotec.c

Lines changed: 27 additions & 31 deletions
@@ -879,17 +879,27 @@ qvotec_poll_once(void)
      * joins the re-declare barrier (without it the coordinator's barrier waits
      * forever for a non-participating survivor in >=3-node). Mirrors the self-
      * admission read below; one extra region-3 slot read per (disk × peer) per poll
-     * (qvotec already has the fds). Majority + COMMITTED-basis validated; a never-
-     * joined / undeclared slot fails the basis check and publishes nothing.
+     * (qvotec already has the fds). Same-commit majority + COMMITTED-basis
+     * validated (INV-J13); a never-joined / undeclared slot fails the basis check
+     * and publishes nothing.
+     *
+     * Hardening v1.4 (reviewer P1 #2) — this used to count ANY committed-basis
+     * marker toward the majority and take the max incarnation/epoch, UNLIKE the
+     * self-admit and startup-seed paths. Two distinct join attempts (different
+     * coordinator / epoch / nonce), each on a minority of disks, would then
+     * aggregate into a false majority -> the survivor readmits a peer whose join
+     * never durably committed (DEAD->MEMBER + observer JOIN event) -> 8.A. Now it
+     * mirrors them exactly: collect committed-basis markers, then require a
+     * same-commit majority via the shared cluster_join_marker_select_majority.
      */
     {
         uint32 node;
         uint32 majority = ((uint32)qvotec_n_disks / 2u) + 1u;
 
         for (node = 0; node < CLUSTER_MAX_NODES; node++) {
-            uint32 agree = 0;
-            uint64 best_incarnation = 0;
-            uint64 best_admitted_epoch = 0;
+            ClusterJoinCommitMarker committed[CLUSTER_MAX_VOTING_DISKS];
+            int n_committed = 0;
+            int win;
             int d;
 
             if ((int32)node == cluster_node_id)
@@ -908,15 +918,13 @@ qvotec_poll_once(void)
                 memcpy(&m, jslot.bytes, sizeof(m));
                 if (!cluster_join_marker_is_committed_basis(&m, (int32)node))
                     continue;
-                agree++;
-                if (m.admitted_incarnation > best_incarnation)
-                    best_incarnation = m.admitted_incarnation;
-                if (m.admitted_epoch > best_admitted_epoch)
-                    best_admitted_epoch = m.admitted_epoch;
+                committed[n_committed++] = m;
             }
-            if (agree >= majority && best_incarnation > 0)
-                cluster_reconfig_record_observed_committed_join((int32)node, best_incarnation,
-                                                                best_admitted_epoch);
+            win = cluster_join_marker_select_majority(committed, n_committed, majority, NULL);
+            if (win >= 0 && committed[win].admitted_incarnation > 0)
+                cluster_reconfig_record_observed_committed_join((int32)node,
+                                                                committed[win].admitted_incarnation,
+                                                                committed[win].admitted_epoch);
         }
     }
 
@@ -940,9 +948,8 @@ qvotec_poll_once(void)
         ClusterJoinCommitMarker self_markers[CLUSTER_MAX_VOTING_DISKS];
         int n_self = 0;
         uint32 majority = ((uint32)qvotec_n_disks / 2u) + 1u;
-        int d, a, b;
-        int win = -1;
-        uint32 win_agree = 0;
+        int d;
+        int win;
 
         for (d = 0; d < qvotec_n_disks; d++) {
             union {
@@ -962,23 +969,12 @@ qvotec_poll_once(void)
                 continue; /* a stale prior-incarnation admission — not us */
             self_markers[n_self++] = m;
         }
-        /* HF-3: find a single commit (nonce) present on >= majority disks. */
-        for (a = 0; a < n_self; a++) {
-            uint32 same = 0;
-
-            for (b = 0; b < n_self; b++)
-                if (cluster_join_marker_same_commit(&self_markers[a], &self_markers[b]))
-                    same++;
-            if (same >= majority) {
-                win = a;
-                win_agree = same;
-                break;
-            }
-        }
+        /* HF-3 / INV-J13: find a single commit (nonce) present on >= majority disks
+         * (shared selector — same logic as startup-seed and peer-observe). */
+        win = cluster_join_marker_select_majority(self_markers, n_self, majority, NULL);
 
         /* HF-1: gate-open requires the publish-proof too, not the marker alone. */
-        if (win >= 0 && win_agree >= majority
-            && cluster_reconfig_join_publish_proven(self_markers[win].admitted_epoch)) {
+        if (win >= 0 && cluster_reconfig_join_publish_proven(self_markers[win].admitted_epoch)) {
             cluster_reconfig_note_self_admitted(self_markers[win].admitted_epoch);
             /*
              * spec-5.16 (3-node rejoin) — the same durable COMMITTED join marker

src/backend/cluster/cluster_reconfig.c

Lines changed: 5 additions & 16 deletions
@@ -2383,9 +2383,9 @@ cluster_membership_seed_last_admitted_from_voting_disk(const int *fds, int n_dis
         } slot;
         ClusterJoinCommitMarker committed[CLUSTER_MAX_VOTING_DISKS];
         int n_committed = 0;
-        int win = -1;
+        int win;
         uint32 win_agree = 0;
-        int d, a, b;
+        int d;
 
         if (cluster_conf_lookup_node(s) == NULL)
             continue;
@@ -2408,21 +2408,10 @@ cluster_membership_seed_last_admitted_from_voting_disk(const int *fds, int n_dis
          * nonce), not "any COMMITTED marker". Two minority writes from
          * different attempts (different coordinator / epoch) must not aggregate.
          * Only a marker that actually reached a disk majority represents a real
-         * admission, so only it raises the monotonic floor (P1-3). O(disks^2),
-         * disks <= CLUSTER_MAX_VOTING_DISKS (small).
+         * admission, so only it raises the monotonic floor (P1-3). Shared
+         * selector — same logic as self-admit and qvotec peer-observe.
          */
-        for (a = 0; a < n_committed; a++) {
-            uint32 same = 0;
-
-            for (b = 0; b < n_committed; b++)
-                if (cluster_join_marker_same_commit(&committed[a], &committed[b]))
-                    same++;
-            if (same >= majority) {
-                win = a;
-                win_agree = same;
-                break;
-            }
-        }
+        win = cluster_join_marker_select_majority(committed, n_committed, majority, &win_agree);
 
         if (win >= 0 && committed[win].admitted_incarnation > 0) {
             uint64 win_incarnation = committed[win].admitted_incarnation;

src/include/cluster/cluster_gcs_block.h

Lines changed: 5 additions & 4 deletions
@@ -1010,10 +1010,11 @@ extern ClusterGcsBlockPhase cluster_gcs_block_phase_for_tag(BufferTag tag);
  * requester-side phase gate and the master-side envelope handler consume them).
  *
  * cluster_grd_join_remaster_active_for_shard: the block's STATIC PCM home
- *   (cluster_gcs_lookup_master_static) is in the armed join_pcm_fenced_member
- *   set (bound to online_join, INDEPENDENT of any GRD master[] movement —
- *   so join_remaster_enabled=off still fences, r2 P1-①). false when the
- *   fence is not armed (join_pcm_fence_epoch == 0).
+ *   (cluster_gcs_lookup_master_static) is a rejoining RECIPIENT of the current
+ *   fence episode (join_pcm_fence_member_epoch[home] == join_pcm_fence_epoch;
+ *   bound to online_join, INDEPENDENT of any GRD master[] movement — so
+ *   join_remaster_enabled=off still fences, r2 P1-①). false when the fence is
+ *   not armed (join_pcm_fence_epoch == 0) or the home is a steady member.
  * cluster_grd_block_view_rebuilt: the joiner-home view is rebuilt — i.e.
  *   EVERY declared member's recovery_done_epoch >= join_pcm_fence_epoch
  *   (Hardening v1.1: the all-members all_done barrier, NOT the joiner's own
