
Commit b4bf81c

test(sqs): tighten typed-nil regression + add genuine groupMutations test
Round 3 and round 4 review on PR #715 caught two test gaps. Both were real: the regressions the tests claim to catch were not actually caught as written.

Round 3: requireNilInterface used testify's require.Nil. require.Nil reflects through to the underlying pointer and treats a nil pointer wrapped in a non-nil interface as "nil". So if buildSQSPartitionResolver's return type were reverted to *adapter.SQSPartitionResolver (the typed-nil bug), the regression test would still pass, because require.Nil on the typed nil returns true. Switched to require.True(t, r == nil, msg): Go's `==` operator on an interface checks BOTH the type tag AND the value tag, so it returns true only for a true nil interface, which is the exact invariant the typed-nil fix produces.

Round 4: TestShardedCoordinator_DispatchHonoursPartitionResolver did not actually regress the groupMutations bypass. For a single-mutation batch, the test passes even if groupMutations bypasses the resolver, because rawLogs produces one pb.Request and router.Commit's groupRequests re-routes by the raw key: the router rescues the mis-routing the coordinator would have introduced. To genuinely regress the bypass, the test must dispatch TWO mutations belonging to TWO different partition groups. With the buggy groupMutations, both end up under the engine-default group, rawLogs produces one request, and router.Commit puts both mutations on whichever group claims Mutations[0].Key; the second group receives nothing.

Added TestShardedCoordinator_DispatchSplitsMutationsByResolverGroup:
- Engine routes everything to group 1.
- Resolver claims keyP0 → group 42, keyP1 → group 43.
- Dispatch with [Put keyP0, Put keyP1].
- Asserts BOTH g42 and g43 each receive exactly one request.
- Pre-fix: g43 receives ZERO (the bypass put both mutations under group 1's rawLog, then the router put them under g42 because of Mutations[0]).
- Post-fix: groupMutations splits via c.router.ResolveGroup, producing two separate requests, so each group gets its own.
Updated TestShardedCoordinator_DispatchHonoursPartitionResolver comment to reflect what it actually pins (WithPartitionResolver wiring + raw-key dispatch, not groupMutations).
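The post-fix grouping step described above (consult the resolver per mutation, falling back to the engine-default group) can be sketched in isolation. All names below (Mutation, PartitionResolver, mapResolver, and groupMutations's signature) are hypothetical stand-ins inferred from the commit message, not the real kv package:

```go
package main

import "fmt"

// Mutation is a hypothetical stand-in for a single keyed write.
type Mutation struct {
	Key   []byte
	Value []byte
}

// PartitionResolver claims specific keys for specific groups.
type PartitionResolver interface {
	ResolveGroup(key []byte) (uint64, bool)
}

// mapResolver is a stub resolver backed by a key→group map.
type mapResolver map[string]uint64

func (m mapResolver) ResolveGroup(key []byte) (uint64, bool) {
	g, ok := m[string(key)]
	return g, ok
}

// groupMutations consults the resolver per mutation instead of
// routing the whole batch under one engine-chosen group — the
// essence of the fix the commit describes.
func groupMutations(muts []Mutation, r PartitionResolver, defaultGroup uint64) map[uint64][]Mutation {
	out := make(map[uint64][]Mutation)
	for _, mut := range muts {
		g := defaultGroup
		if r != nil {
			if claimed, ok := r.ResolveGroup(mut.Key); ok {
				g = claimed
			}
		}
		out[g] = append(out[g], mut)
	}
	return out
}

func main() {
	r := mapResolver{"keyP0": 42, "keyP1": 43}
	groups := groupMutations([]Mutation{
		{Key: []byte("keyP0"), Value: []byte("v0")},
		{Key: []byte("keyP1"), Value: []byte("v1")},
	}, r, 1)
	fmt.Println(len(groups), len(groups[42]), len(groups[43])) // 2 1 1
}
```

With the buggy behavior (ignore the resolver and bundle every mutation under the default group), both mutations would land under group 1 and the map would hold a single entry, which is exactly the pre-fix shape the new two-mutation test regresses.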
1 parent 9cfc779 commit b4bf81c

2 files changed

Lines changed: 104 additions & 15 deletions


kv/sharded_coordinator_partition_test.go

Lines changed: 91 additions & 11 deletions
@@ -51,18 +51,18 @@ func (s *stubResolver) callKeys() [][]byte {
 	return out
 }
 
-// TestShardedCoordinator_DispatchHonoursPartitionResolver pins the
-// Gemini-HIGH fix: ShardedCoordinator's groupMutations path now
-// calls c.router.ResolveGroup, so a Dispatch whose key is claimed
-// by the partition resolver MUST land on the resolver's group, not
-// the engine's default group. Before the fix the coordinator
-// bypassed the resolver entirely and partitioned-FIFO traffic
-// silently mis-routed through 2PC.
+// TestShardedCoordinator_DispatchHonoursPartitionResolver pins
+// resolver wiring at the coordinator level: a Dispatch whose key
+// is claimed by the partition resolver must land on the
+// resolver's group, not the engine's default group.
 //
-// Two-group setup: engine routes everything to group 1; resolver
-// claims one specific key for group 42. Dispatch on that key must
-// hit group 42's recordingTransactional, leaving group 1's
-// recorder empty.
+// NOTE (round 4 review): for a single-mutation batch, this test
+// passes even if groupMutations bypasses the resolver — rawLogs
+// produces one request, and router.Commit's groupRequests
+// re-routes by raw key. This test pins the WithPartitionResolver
+// fluent wiring + raw-key dispatch, NOT the groupMutations bypass
+// regression. The 2-mutation test below is the actual regression
+// for groupMutations (codex P1 + gemini HIGH).
 func TestShardedCoordinator_DispatchHonoursPartitionResolver(t *testing.T) {
 	t.Parallel()
 	engine := distribution.NewEngine()
@@ -115,6 +115,86 @@ func TestShardedCoordinator_DispatchHonoursPartitionResolver(t *testing.T) {
 	require.Equal(t, []byte("!sqs|msg|data|p|partitioned-key"), calls[0])
 }
 
+// TestShardedCoordinator_DispatchSplitsMutationsByResolverGroup is
+// the genuine regression for the Gemini-HIGH groupMutations
+// bypass: a Dispatch with mutations belonging to TWO different
+// partitions must split into two requests, one per group.
+//
+// Pre-fix path (groupMutations called c.engine.GetRoute directly):
+// - groupMutations → engine route → both mutations bundled under
+//   the engine default group.
+// - rawLogs → ONE pb.Request with both mutations.
+// - router.Commit → groupRequests routes by Mutations[0].Key only
+//   → request goes to group A only; group B receives nothing.
+//
+// Post-fix path (groupMutations consults c.router.ResolveGroup):
+// - groupMutations → resolver → mut0→A, mut1→B; grouped split.
+// - rawLogs → TWO pb.Requests, one per group.
+// - Each group receives its own request.
+//
+// The assertion is that BOTH groups receive a request — pre-fix,
+// only one would. This is the test the round 4 review asked for.
+func TestShardedCoordinator_DispatchSplitsMutationsByResolverGroup(t *testing.T) {
+	t.Parallel()
+	engine := distribution.NewEngine()
+	engine.UpdateRoute([]byte(""), nil, 1)
+
+	g1 := &recordingTransactional{
+		responses: []*TransactionResponse{{CommitIndex: 1}},
+	}
+	g42 := &recordingTransactional{
+		responses: []*TransactionResponse{{CommitIndex: 1}},
+	}
+	g43 := &recordingTransactional{
+		responses: []*TransactionResponse{{CommitIndex: 1}},
+	}
+
+	coord := NewShardedCoordinator(engine, map[uint64]*ShardGroup{
+		1:  {Txn: g1, Store: store.NewMVCCStore()},
+		42: {Txn: g42, Store: store.NewMVCCStore()},
+		43: {Txn: g43, Store: store.NewMVCCStore()},
+	}, 1, NewHLC(), nil)
+
+	keyP0 := []byte("!sqs|msg|data|p|partition-0-key")
+	keyP1 := []byte("!sqs|msg|data|p|partition-1-key")
+	coord.WithPartitionResolver(&stubResolver{claim: map[string]uint64{
+		string(keyP0): 42,
+		string(keyP1): 43,
+	}})
+
+	resp, err := coord.Dispatch(context.Background(), &OperationGroup[OP]{
+		Elems: []*Elem[OP]{
+			{Op: Put, Key: keyP0, Value: []byte("v0")},
+			{Op: Put, Key: keyP1, Value: []byte("v1")},
+		},
+	})
+	require.NoError(t, err)
+	require.NotNil(t, resp)
+
+	g1.mu.Lock()
+	g42.mu.Lock()
+	g43.mu.Lock()
+	defer g1.mu.Unlock()
+	defer g42.mu.Unlock()
+	defer g43.mu.Unlock()
+
+	require.Zero(t, len(g1.requests),
+		"engine default group must not receive a request — both "+
			"keys are partitioned-resolver claims")
+	require.Equal(t, 1, len(g42.requests),
+		"partition-0 group must receive exactly one request "+
			"(pre-groupMutations-fix: would receive both mutations "+
			"in one request because router.Commit routes by "+
			"Mutations[0].Key only)")
+	require.Equal(t, 1, len(g43.requests),
+		"partition-1 group must receive exactly one request "+
			"(pre-groupMutations-fix: would receive ZERO requests "+
			"because both mutations were bundled under the "+
			"engine-route group). This is the genuine regression "+
			"for the Gemini HIGH bypass — the fix splits "+
			"mutations across groups via c.router.ResolveGroup.")
+}
+
 // TestShardedCoordinator_DispatchFallsThroughForUnclaimedKeys pins
 // the inverse: a key the resolver does NOT claim must continue to
 // route via the byte-range engine. Without this guard the resolver-

main_sqs_resolver_test.go

Lines changed: 13 additions & 4 deletions
@@ -61,11 +61,20 @@ func TestBuildSQSPartitionResolver_NonEmptyReturnsResolver(t *testing.T) {
 }
 
 // requireNilInterface accepts a kv.PartitionResolver and asserts
-// the interface value (NOT just the underlying pointer) is nil.
-// The function-parameter conversion forces a typed-nil pointer to
-// be wrapped into a non-nil interface, which is exactly the
+// the INTERFACE value is nil — both type AND value tags must be
+// nil. The function-parameter conversion forces a typed-nil
+// pointer to be wrapped into a non-nil interface, which is the
 // failure mode the regression test guards against.
+//
+// Uses Go's `==` operator on the interface, NOT testify's
+// require.Nil. require.Nil reflects through to the underlying
+// pointer and considers a nil pointer wrapped in a non-nil
+// interface as "nil", so it would pass even with the typed-nil
+// regression present. require.True(t, r == nil, ...) uses Go's
+// native interface comparison, which only returns true when the
+// interface's type tag is also nil — that is the exact invariant
+// this test was designed to catch (round 3 review on PR #715).
 func requireNilInterface(t *testing.T, r kv.PartitionResolver, msg string) {
 	t.Helper()
-	require.Nil(t, r, msg)
+	require.True(t, r == nil, msg)
 }
