Commit df11d2b
feat(sqs): atomic dormancy gate-lift + cluster-wide htfifo capability gate (Phase 3.D PR 5b-3)
The §11 PR 2 dormancy gate (PartitionCount > 1 hard-rejected at
CreateQueue) was a placeholder while the data plane caught up. The
data plane is now in place (PR #731 + #732), so this PR atomically
lifts the dormancy gate and replaces it with the §8.5 capability
gate that polls every cluster peer's /sqs_health for the htfifo
capability.
What changes:
- Remove validatePartitionDormancyGate and the
htfifoTemporaryGateMessage constant from sqs_partitioning.go.
Both were marked "Removed in PR 5 in the same commit that wires
the data plane so the gate-and-lift land atomically" — that PR
is this one.
- Add (*SQSServer).validateHTFIFOCapability in
adapter/sqs_capability_gate.go, called from createQueueCore.
Two-stage fail-closed check on PartitionCount > 1:
1. Local: this binary must advertise htfifo
(htfifoCapabilityAdvertised). Refuses the create with
InvalidAttributeValue if not.
2. Peers: every entry in s.leaderSQS must report htfifo via
/sqs_health within the poller's per-peer timeout. Any
timeout, HTTP error, malformed body, or missing capability
blocks the create.
Vacuous on PartitionCount <= 1 and on empty leaderSQS (single-
node cluster — the local check is the whole cluster).
- collectSQSPeers helper returns leaderSQS values in deterministic
sorted order with empty/duplicate addresses filtered, so the
poller and operator-facing error messages never depend on Go map
iteration order.
- buildHTFIFOCapabilityRejection composes the rejection message
with each failing peer's address + reason (per-peer Error or
"missing capability") so an operator triaging a partial-rolling-
upgrade cluster does not need to re-run the poll out-of-band.
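
The deterministic ordering described for collectSQSPeers can be sketched roughly as follows. This is an illustration only: the function name collectPeers and the map[string]string shape of the leader map are assumptions, not the code from this commit.

```go
package main

import "sort"

// collectPeers is a hypothetical stand-in for collectSQSPeers. It returns the
// map's values with empty and duplicate addresses dropped, sorted so that
// callers never observe Go's randomized map iteration order.
func collectPeers(leaderSQS map[string]string) []string {
	seen := make(map[string]struct{}, len(leaderSQS))
	peers := make([]string, 0, len(leaderSQS))
	for _, addr := range leaderSQS {
		if addr == "" {
			continue // skip unset entries
		}
		if _, dup := seen[addr]; dup {
			continue // skip addresses already collected
		}
		seen[addr] = struct{}{}
		peers = append(peers, addr)
	}
	sort.Strings(peers)
	return peers
}
```

Sorting after deduplication keeps both the poll order and any operator-facing message stable across runs, which is the property the commit message calls out.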
CreateQueue control flow on PartitionCount > 1:
  schema validators (validatePartitionConfig, etc.)
    → validateHTFIFOCapability
        → htfifoCapabilityAdvertised check (local)
        → PollSQSHTFIFOCapability(ctx, collectSQSPeers(), …)
        → reject with InvalidAttributeValue on any failure
    → createQueueWithRetry
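
A minimal sketch of the two-stage fail-closed check in the flow above, under stated assumptions: peerResult, pollFunc, validateCapability, and errInvalidAttributeValue are hypothetical names, and the poll is abstracted as a plain function value (the real PollSQSHTFIFOCapability also takes a context, which is omitted here for brevity).

```go
package main

import (
	"errors"
	"fmt"
)

// peerResult is a hypothetical per-peer poll outcome.
type peerResult struct {
	Addr      string
	HasHTFIFO bool
	Err       error // timeout, HTTP error, malformed body, ...
}

// pollFunc is a hypothetical stand-in for the cluster poll helper.
type pollFunc func(peers []string) []peerResult

// errInvalidAttributeValue is a placeholder for the adapter's real error type.
var errInvalidAttributeValue = errors.New("InvalidAttributeValue")

// validateCapability sketches the gate: vacuous for PartitionCount <= 1,
// then a local check, then a fail-closed check of every peer.
func validateCapability(partitionCount int, localAdvertised bool, peers []string, poll pollFunc) error {
	if partitionCount <= 1 {
		return nil // legacy / single-partition queue: no poll at all
	}
	if !localAdvertised {
		return fmt.Errorf("%w: local node does not advertise htfifo", errInvalidAttributeValue)
	}
	if len(peers) == 0 {
		return nil // single-node cluster: the local check is the whole cluster
	}
	for _, r := range poll(peers) {
		if r.Err != nil {
			return fmt.Errorf("%w: peer %s: %v", errInvalidAttributeValue, r.Addr, r.Err)
		}
		if !r.HasHTFIFO {
			return fmt.Errorf("%w: peer %s: missing capability", errInvalidAttributeValue, r.Addr)
		}
	}
	return nil
}
```

Unlike this sketch, which returns on the first failing peer, the commit describes a rejection message that lists every failing peer.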
Caller audit: validateHTFIFOCapability has exactly one production
caller (createQueueCore in sqs_catalog.go); both the JSON handler
and the future query-protocol handler reach it through that one
path. SetQueueAttributes is unaffected because PartitionCount is
immutable post-create (validatePartitionImmutability).
Test changes:
- Delete TestValidatePartitionDormancyGate_RejectsAboveOne (the
function it tested is gone).
- Convert TestSQSServer_HTFIFO_DormancyGate_RejectsPartitionedCreate
into TestSQSServer_HTFIFO_CapabilityGate_AcceptsOnSingleNode —
the same wire payloads now SUCCEED because the local node
advertises htfifo and there are no peers to poll. Renamed
TestSQSServer_HTFIFO_DormancyGate_AllowsPartitionCountOne →
TestSQSServer_HTFIFO_CapabilityGate_AllowsPartitionCountOne for
consistency.
- Update comments on
TestSQSServer_HTFIFO_RejectsQueueScopedDedupOnPartitioned,
TestSQSServer_HTFIFO_RejectsNonPowerOfTwoPartitionCount,
TestSQSServer_HTFIFO_ImmutabilitySetQueueAttributesRejects,
mustCreateFIFOWithThroughputLimit, and the
installPartitionedMetaForTest helper to describe the new
capability-gate world.
New unit tests in sqs_capability_gate_test.go:
- TestValidateHTFIFOCapability_ShortCircuitsOnLegacyMeta:
PartitionCount in {0, 1} skips the poll entirely (proven by
wiring a peer that would FAIL the gate and verifying the
short-circuit path bypasses it).
- TestValidateHTFIFOCapability_AcceptsWhenAllPeersAdvertise:
happy path with two fake peers.
- TestValidateHTFIFOCapability_AcceptsOnEmptyPeerList: vacuous
case (single-node cluster).
- TestValidateHTFIFOCapability_RejectsWhenOnePeerLacksCapability:
rolling-upgrade fail-closed; offending peer's address surfaces
in the InvalidAttributeValue message.
- TestValidateHTFIFOCapability_RejectsWhenPeerUnreachable:
transient-network fail-closed.
- TestCollectSQSPeers_Deterministic: sort + dedup + empty-skip.
- TestBuildHTFIFOCapabilityRejection_ShapesOperatorMessage:
rejection-message shape pinned (advertising peers absent,
failing peers contribute "(reason)" suffix, defensive paths).
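
The message shape pinned by the last test could look something like the sketch below. The type peerStatus, the function buildRejection, and the exact wording of the message are all assumptions made for illustration; only the rules (advertising peers omitted, failing peers listed with a "(reason)" suffix, a defensive fallback) come from the description above.

```go
package main

import "strings"

// peerStatus is a hypothetical per-peer poll outcome.
type peerStatus struct {
	Addr      string
	HasHTFIFO bool
	Err       string // empty when the poll itself succeeded
}

// buildRejection is a hypothetical stand-in for buildHTFIFOCapabilityRejection.
// Peers that advertise the capability are left out; each failing peer
// contributes "addr (reason)".
func buildRejection(statuses []peerStatus) string {
	var failing []string
	for _, s := range statuses {
		switch {
		case s.Err != "":
			failing = append(failing, s.Addr+" ("+s.Err+")")
		case !s.HasHTFIFO:
			failing = append(failing, s.Addr+" (missing capability)")
		}
	}
	if len(failing) == 0 {
		// Defensive path: called although no peer failed.
		return "htfifo capability check failed"
	}
	return "peers lacking htfifo capability: " + strings.Join(failing, ", ")
}
```

Given one advertising peer, one timed-out peer, and one peer without the capability, only the last two appear in the message, in input order.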
Self-review (CLAUDE.md):
1. Data loss — None. The gate relaxes the previous blanket dormancy
reject only when the local binary and every polled peer advertise
htfifo; a partitioned create that fails either check is still
rejected before any meta is written. The
dormancy gate's invariant ("partitioned-shape meta cannot
land on a binary that does not handle the partitioned
keyspace") is preserved by the local htfifoCapabilityAdvertised
check and strengthened by the cluster-wide poll.
2. Concurrency / distributed failures — Poll runs concurrently
across peers via the existing PollSQSHTFIFOCapability helper
(covered by its own race tests). collectSQSPeers + sort are
pure / deterministic. The leaderSQS map is only mutated at
SQSServer construction (WithSQSLeaderMap), not at request
time, so no read/write races. Leader transitions during the
poll are handled by the existing proxyToLeader path that
gates createQueue before validateHTFIFOCapability runs.
3. Performance — Poll cost is O(peers) and only paid on
PartitionCount > 1 creates (rare control-plane operation).
Legacy / single-partition CreateQueue calls pay one
short-circuit branch. collectSQSPeers' sort is O(N log N)
on a small N (cluster size). No hot-path impact.
4. Data consistency — Schema validators (PartitionCount shape,
dedup-scope rule, perMessageGroupId rule) still run BEFORE
the capability gate inside parseAttributesIntoMeta, so an
invalid shape rejects with the schema's reason rather than
the gate's. SetQueueAttributes immutability remains the
guard for post-create partition-shape changes.
5. Test coverage — Gate function: 5 unit tests covering the
short-circuit, happy path, vacuous empty, rolling-upgrade,
and unreachable-peer classes. Helpers: 2 unit tests pinning
deterministic order and message shape. Wire-level: existing
HT-FIFO integration tests carry forward, with the dormancy-
reject test converted to the new accepts-on-single-node
happy path.
1 parent 637e543 commit df11d2b
7 files changed
Lines changed: 387 additions & 101 deletions