Skip to content

Commit a5909eb

Browse files
committed
fix(redis): carve standalone SET behind separate sub-gate (codex P1)
PR #943 round-1 codex P1: flipping onePhaseTxnDedup to default-on exposed a pre-existing semantic divergence on the standalone SET path. txnContext.applySet (adapter/redis.go:2356) returns WRONGTYPE when the target key already holds a list/hash/set/zset/stream, while the legacy setLegacy → executeSet → replaceWithStringTxn path correctly deletes the collection's logical elements and writes the string. Real Redis allows SET to overwrite any prior type unconditionally; the default flip would have silently broken normal overwrite semantics for any default-configured server. This commit narrows the default flip — the standalone SET path now requires *two* gates to route through runTransactionWithDedup: onePhaseTxnDedup (the umbrella, default on) AND a new standaloneSetDedup sub-gate (default off, opt-in via WithStandaloneSetDedup or ELASTICKV_REDIS_ONEPHASE_DEDUP_SET=1). With the default config, the standalone SET branch falls through to setLegacy, preserving legacy overwrite semantics. MULTI/EXEC and list-push paths — the parent design's actual M4-validated workloads — keep the dedup default-on win unchanged. Bringing applySet to parity (collection-deletion + string write inside the dedup txn buffer) is a larger surgery tracked as a follow-up; until then the sub-gate stays default-off. Per the CLAUDE.md "add failing test then fix" convention: - TestRedis_SET_OverwritesList_UnderDefaultGate (new file adapter/redis_set_overwrite_default_test.go) seeds RPUSH k x, runs SET k v under default config, asserts OK + GET returns v. Fails on the pre-fix build (WRONGTYPE), passes on the fixed build. - Existing TestStandaloneSetDedup_* tests now set standaloneSetDedup: true alongside onePhaseTxnDedup: true to keep pinning the dedup-on routing they were designed to exercise. Also addresses gemini medium: "[#937]" broken reference-style link in the design doc replaced with an inline link. Verified: go vet ./... clean, go build ./... clean, golangci-lint run ./adapter/... 0 issues, TestRedis_SET_OverwritesList_UnderDefaultGate + TestStandaloneSetDedup_* + Dedup/OnePhase test family all pass.
1 parent a6af333 commit a5909eb

4 files changed

Lines changed: 148 additions & 10 deletions

File tree

adapter/redis.go

Lines changed: 43 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -251,23 +251,51 @@ type RedisServer struct {
251251
// to opt out — kept as a one-env-var operator rollback.
252252
onePhaseTxnDedup bool
253253

254+
// standaloneSetDedup gates whether the *standalone* SET command (not SET
255+
// inside MULTI/EXEC) routes through runTransactionWithDedup. Default off
256+
// because the dedup path's applySet does not match the legacy
257+
// executeSet/replaceWithStringTxn semantics for SET-over-collection: a
258+
// `SET k v` after `RPUSH k x` returns WRONGTYPE on the dedup path but
259+
// overwrites correctly on the legacy path (PR #943 round-1 codex P1).
260+
// Bringing applySet to parity (collection-deletion + string write inside
261+
// the dedup txn) is tracked as a follow-up; until that lands, the
262+
// standalone SET path stays on the legacy default-on-flip code path
263+
// regardless of onePhaseTxnDedup's value. Enable explicitly via
264+
// WithStandaloneSetDedup(true) or ELASTICKV_REDIS_ONEPHASE_DEDUP_SET=1
265+
// only when applySet's parity is verified for the workload at hand.
266+
standaloneSetDedup bool
267+
254268
route map[string]func(conn redcon.Conn, cmd redcon.Command)
255269
}
256270

257271
type RedisServerOption func(*RedisServer)
258272

259273
// WithOnePhaseTxnDedup enables (or disables) the option-2 one-phase
260-
// idempotency dedup on list-push, MULTI/EXEC, and standalone-write retries
274+
// idempotency dedup on list-push and MULTI/EXEC retries
261275
// (see RedisServer.onePhaseTxnDedup). On by default since the rollout
262276
// recorded in docs/design/2026_06_10_proposed_redis_onephase_dedup_default_on.md;
263277
// pass false to opt out from code, or set ELASTICKV_REDIS_ONEPHASE_DEDUP=0
264278
// to opt out from the environment. The constructor option trumps the env var.
279+
// Standalone SET requires the separate WithStandaloneSetDedup gate; see
280+
// RedisServer.standaloneSetDedup.
265281
func WithOnePhaseTxnDedup(enabled bool) RedisServerOption {
266282
return func(r *RedisServer) {
267283
r.onePhaseTxnDedup = enabled
268284
}
269285
}
270286

287+
// WithStandaloneSetDedup enables the option-2 dedup path on the *standalone*
288+
// SET command (not SET inside MULTI/EXEC). Off by default because the dedup
289+
// path's applySet does not yet match the legacy executeSet semantics for
290+
// SET-over-collection — see RedisServer.standaloneSetDedup. Enable only
291+
// after verifying applySet parity for the workload (no SET-over-list /
292+
// SET-over-hash / SET-over-set / SET-over-zset / SET-over-stream issued).
293+
func WithStandaloneSetDedup(enabled bool) RedisServerOption {
294+
return func(r *RedisServer) {
295+
r.standaloneSetDedup = enabled
296+
}
297+
}
298+
271299
func WithRedisActiveTimestampTracker(tracker *kv.ActiveTimestampTracker) RedisServerOption {
272300
return func(r *RedisServer) {
273301
r.readTracker = tracker
@@ -506,10 +534,13 @@ func NewRedisServer(listen net.Listener, redisAddr string, store store.MVCCStore
506534
// ELASTICKV_REDIS_ONEPHASE_DEDUP=0 opts out; the WithOnePhaseTxnDedup
507535
// constructor option still trumps the env var.
508536
onePhaseTxnDedup: os.Getenv("ELASTICKV_REDIS_ONEPHASE_DEDUP") != "0",
509-
baseCtx: baseCtx,
510-
baseCancel: baseCancel,
511-
streamWaiters: newKeyWaiterRegistry(),
512-
zsetWaiters: newKeyWaiterRegistry(),
537+
// standaloneSetDedup defaults off; see field comment for the
538+
// applySet-vs-executeSet parity gap that gates this separately.
539+
standaloneSetDedup: os.Getenv("ELASTICKV_REDIS_ONEPHASE_DEDUP_SET") == "1",
540+
baseCtx: baseCtx,
541+
baseCancel: baseCancel,
542+
streamWaiters: newKeyWaiterRegistry(),
543+
zsetWaiters: newKeyWaiterRegistry(),
513544
}
514545
r.relay.Bind(r.publishLocal)
515546

@@ -1137,7 +1168,13 @@ func (r *RedisServer) set(conn redcon.Conn, cmd redcon.Command) {
11371168
// SET there is exactly one element with the same redisResult shape as
11381169
// the standalone reply (resultString OK / resultNil for NX/XX miss /
11391170
// resultBulk for GET).
1140-
if r.onePhaseTxnDedup {
1171+
// Both gates must be on to route standalone SET through the dedup path.
1172+
// onePhaseTxnDedup covers the MULTI/EXEC and list-push retries that the
1173+
// parent design's M4 validated; standaloneSetDedup is a separate sub-gate
1174+
// (default off) because applySet diverges from executeSet on SET-over-
1175+
// collection — flipping onePhaseTxnDedup default-on without this guard
1176+
// would change normal Redis overwrite behaviour (PR #943 round-1 codex P1).
1177+
if r.onePhaseTxnDedup && r.standaloneSetDedup {
11411178
// Call runTransactionWithDedup directly instead of going through
11421179
// runTransaction. runTransaction re-checks the same
11431180
// r.onePhaseTxnDedup gate and routes here anyway; the indirection

adapter/redis_set_dedup_test.go

Lines changed: 18 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -28,7 +28,12 @@ func TestStandaloneSetDedup_LandedPriorAttempt_ReturnsOK(t *testing.T) {
2828
ctx := context.Background()
2929
st := store.NewMVCCStore()
3030
coord := newDedupTestCoordinator(st, 1, true) // attempt 1 lands then errors
31-
srv := &RedisServer{store: st, coordinator: coord, scriptCache: map[string]string{}, onePhaseTxnDedup: true}
31+
// Both gates: onePhaseTxnDedup is the umbrella; standaloneSetDedup is
32+
// the per-path sub-gate that opts the *standalone* SET branch into
33+
// the dedup routing. The sub-gate defaults off (PR #943 round-1 codex
34+
// P1 — applySet diverges from executeSet on SET-over-collection).
35+
// Tests that pin the dedup-on routing must explicitly enable both.
36+
srv := &RedisServer{store: st, coordinator: coord, scriptCache: map[string]string{}, onePhaseTxnDedup: true, standaloneSetDedup: true}
3237

3338
conn := &recordingConn{}
3439
cmd := redcon.Command{Args: [][]byte{[]byte(cmdSet), []byte("k"), []byte("v1")}}
@@ -61,7 +66,12 @@ func TestStandaloneSetDedup_NXMissReturnsNil(t *testing.T) {
6166
require.NoError(t, st.PutAt(ctx, redisStrKey([]byte("k")), encodeRedisStr([]byte("seed"), nil), 5, 0))
6267

6368
coord := newDedupTestCoordinator(st, 1, true) // attempt 1 lands then errors
64-
srv := &RedisServer{store: st, coordinator: coord, scriptCache: map[string]string{}, onePhaseTxnDedup: true}
69+
// Both gates: onePhaseTxnDedup is the umbrella; standaloneSetDedup is
70+
// the per-path sub-gate that opts the *standalone* SET branch into
71+
// the dedup routing. The sub-gate defaults off (PR #943 round-1 codex
72+
// P1 — applySet diverges from executeSet on SET-over-collection).
73+
// Tests that pin the dedup-on routing must explicitly enable both.
74+
srv := &RedisServer{store: st, coordinator: coord, scriptCache: map[string]string{}, onePhaseTxnDedup: true, standaloneSetDedup: true}
6575

6676
conn := &recordingConn{}
6777
// SET k v1 NX -- attempt 1 records resultNil because NX miss.
@@ -99,7 +109,12 @@ func TestStandaloneSetDedup_GETOptionReturnsOldBulk(t *testing.T) {
99109
require.NoError(t, st.PutAt(ctx, redisStrKey([]byte("k")), encodeRedisStr([]byte("prior"), nil), 5, 0))
100110

101111
coord := newDedupTestCoordinator(st, 1, true) // attempt 1 lands then errors
102-
srv := &RedisServer{store: st, coordinator: coord, scriptCache: map[string]string{}, onePhaseTxnDedup: true}
112+
// Both gates: onePhaseTxnDedup is the umbrella; standaloneSetDedup is
113+
// the per-path sub-gate that opts the *standalone* SET branch into
114+
// the dedup routing. The sub-gate defaults off (PR #943 round-1 codex
115+
// P1 — applySet diverges from executeSet on SET-over-collection).
116+
// Tests that pin the dedup-on routing must explicitly enable both.
117+
srv := &RedisServer{store: st, coordinator: coord, scriptCache: map[string]string{}, onePhaseTxnDedup: true, standaloneSetDedup: true}
103118

104119
conn := &recordingConn{}
105120
// SET k v1 GET -- attempt 1 records resultBulk("prior").
Lines changed: 54 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,54 @@
1+
package adapter
2+
3+
import (
4+
"context"
5+
"testing"
6+
7+
"github.com/redis/go-redis/v9"
8+
"github.com/stretchr/testify/require"
9+
)
10+
11+
// TestRedis_SET_OverwritesList_UnderDefaultGate locks in the legacy
12+
// SET-over-collection overwrite semantics under the new default-on dedup
13+
// gate landed in PR #943. The dedup path's applySet returns WRONGTYPE on
14+
// a SET that hits a key already holding a list/hash/set/zset/stream,
15+
// while the legacy setLegacy / executeSet / replaceWithStringTxn path
16+
// deletes the collection's logical elements and writes the string.
17+
// Codex flagged this as a P1 regression on PR #943 round-1: flipping
18+
// onePhaseTxnDedup default-on without a separate gate on the standalone
19+
// SET path would have changed normal Redis overwrite behaviour. The fix
20+
// is the standaloneSetDedup sub-gate, which defaults off — so
21+
// `SET k v` after `RPUSH k x` must still return OK and let the next
22+
// GET observe the string value, not WRONGTYPE.
23+
//
24+
// This is a regression test (CLAUDE.md self-review §5 + the
25+
// "when code review surfaces a defect, first add a failing test"
26+
// convention). If a future change re-enables standaloneSetDedup as
27+
// default-on without bringing applySet to parity with executeSet, this
28+
// test must fail.
29+
func TestRedis_SET_OverwritesList_UnderDefaultGate(t *testing.T) {
30+
t.Parallel()
31+
nodes, _, _ := createNode(t, 3)
32+
defer shutdown(nodes)
33+
34+
ctx := context.Background()
35+
rdb := redis.NewClient(&redis.Options{Addr: nodes[0].redisAddress})
36+
defer func() { _ = rdb.Close() }()
37+
38+
// Seed the key as a list — the encoding that the dedup path's
39+
// applySet hard-fails against.
40+
require.NoError(t, rdb.Do(ctx, "RPUSH", "setover:list", "elem-a").Err())
41+
42+
// Real Redis behaviour: SET unconditionally replaces the value,
43+
// dropping the previous type. The legacy path implements this; the
44+
// dedup path returns WRONGTYPE. With the default config
45+
// (onePhaseTxnDedup on, standaloneSetDedup off) we must take the
46+
// legacy path and observe OK + string value.
47+
res, err := rdb.Do(ctx, "SET", "setover:list", "replaced").Result()
48+
require.NoError(t, err, "SET must overwrite an existing list under default config")
49+
require.Equal(t, "OK", res)
50+
51+
got, err := rdb.Get(ctx, "setover:list").Result()
52+
require.NoError(t, err, "GET after SET-over-list must succeed")
53+
require.Equal(t, "replaced", got)
54+
}

docs/design/2026_06_10_proposed_redis_onephase_dedup_default_on.md

Lines changed: 33 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -80,7 +80,8 @@ profile that produced the parent design's trigger anomaly. M4 is met.
8080

8181
The control workflow (`jepsen-test-scheduled.yml`, gate off) continues
8282
to surface `:duplicate-elements` and the related real-time cycle
83-
anomalies as expected; the 2026-06-05 18:37 failure ([#937]) is the
83+
anomalies as expected; the 2026-06-05 18:37 failure
84+
([#937](https://github.com/bootjp/elastickv/issues/937)) is the
8485
most recent observation. The control's purpose ends when default-on
8586
lands — see *§Control workflow disposition* below.
8687

@@ -110,6 +111,37 @@ Comment and `WithOnePhaseTxnDedup`'s godoc are updated to describe the
110111
new default; the in-file pointer to R5 stays (now reframed as "the
111112
ordering constraint that authorized this default flip").
112113

114+
### M1a — Carve out standalone `SET` behind a separate sub-gate
115+
116+
PR #943 round-1 codex P1 flagged that `txnContext.applySet`
117+
(`adapter/redis.go:2356-2360`) returns `WRONGTYPE` when the existing
118+
key is a list/hash/set/zset/stream, while the legacy `setLegacy`
119+
`executeSet``replaceWithStringTxn` path correctly deletes the
120+
collection's logical elements and writes the string. Real Redis lets
121+
`SET k v` overwrite any existing type unconditionally; flipping
122+
`onePhaseTxnDedup` default-on therefore would have silently changed
123+
normal Redis overwrite behaviour for every standalone-`SET`-against-
124+
collection configuration.
125+
126+
This proposal does not bring `applySet` to parity (collection-deletion
127+
inside the dedup txn is a larger surgery, tracked as a separate
128+
follow-up). Instead it introduces a dedicated sub-gate
129+
`RedisServer.standaloneSetDedup` (default off, opt-in via
130+
`WithStandaloneSetDedup(true)` or
131+
`ELASTICKV_REDIS_ONEPHASE_DEDUP_SET=1`). The standalone-`SET` branch
132+
in `set()` now requires **both** `onePhaseTxnDedup` **and**
133+
`standaloneSetDedup` to be true before routing through
134+
`runTransactionWithDedup`; the previous behaviour is preserved as the
135+
default. MULTI/EXEC and list-push paths (the parent design's actual
136+
M4-validated workloads) are unaffected.
137+
138+
The new field, option, env var, and the regression test
139+
`TestRedis_SET_OverwritesList_UnderDefaultGate`
140+
(`adapter/redis_set_overwrite_default_test.go`) all land in this PR.
141+
The test seeds `RPUSH k x`, calls `SET k v`, asserts `OK` + a
142+
subsequent `GET` returns `v` — it fails on the pre-fix build
143+
(WRONGTYPE) and passes on the fixed build.
144+
113145
### M2 — Control workflow disposition
114146

115147
After default-on, `jepsen-test-scheduled.yml` would silently exercise

0 commit comments

Comments
 (0)