
Commit a6af333

feat(redis): flip one-phase txn dedup default-off to default-on
Implements the proposal in docs/design/2026_06_10_proposed_redis_onephase_dedup_default_on.md.

The Redis adapter's one-phase idempotency dedup (adapter.RedisServer.onePhaseTxnDedup) now defaults on. The ELASTICKV_REDIS_ONEPHASE_DEDUP env var inverts from opt-in (=1 enabled, anything else disabled) to opt-out (=0 disabled, anything else enabled). The WithOnePhaseTxnDedup constructor option still trumps the env var.

Authorization for the flip is the parent design's M4 7-day green criterion in the dedup-mode Jepsen workflow (.github/workflows/jepsen-test-scheduled-dedup.yml), met by 12 consecutive green runs over 10 days (2026-05-31 → 2026-06-10) on the stress profile that produced the parent design's trigger anomaly. The parent's R5 (FSM determinism across a rolling upgrade) is discharged: the probe code has shipped on every production node for months, so every node already understands the writer's emission once it is enabled.

To preserve legacy-path coverage during the post-flip observation window, the control workflow (.github/workflows/jepsen-test-scheduled.yml) now sets ELASTICKV_REDIS_ONEPHASE_DEDUP=0 explicitly at the job env level so its semantics survive the default change. Retirement of that workflow is a 30-day follow-up.

Closes #937. The failure that triggered this work is the control baseline's expected unprotected behaviour; with default-on it falls back into the path the dedup-mode workflow has exercised cleanly for 12 consecutive runs.

Verified:
- go vet ./...: clean
- go build ./...: clean
- go test -run 'Dedup|OnePhase|PrevCommit|Idempot' ./adapter/: pass
- go test -run 'TestRedis|TestList|TestSet|TestZSet|TestStream|TestExec|TestMulti' ./adapter/: pass (169s)
- golangci-lint run ./adapter/...: 0 issues
1 parent 71a8241 commit a6af333
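
The env var inversion described in the commit message can be illustrated with a small truth table. This is a minimal sketch, not the adapter's code: the helper names `legacyOptIn` and `defaultOnOptOut` are hypothetical, and the empty string stands in for an unset variable on the assumption that the value comes from `os.Getenv`, which returns `""` when the variable is not set.

```go
package main

import "fmt"

// legacyOptIn mirrors the pre-flip comparison (== "1"):
// only the exact value "1" enables dedup.
// Hypothetical helper name, used for illustration only.
func legacyOptIn(raw string) bool { return raw == "1" }

// defaultOnOptOut mirrors the post-flip comparison (!= "0"):
// only the exact value "0" disables dedup.
// Hypothetical helper name, used for illustration only.
func defaultOnOptOut(raw string) bool { return raw != "0" }

func main() {
	// "" models an unset variable; the other values are sample operator inputs.
	for _, raw := range []string{"", "0", "1", "true", "false"} {
		fmt.Printf("value=%q legacy=%t current=%t\n",
			raw, legacyOptIn(raw), defaultOnOptOut(raw))
	}
}
```

One consequence of the exact-match comparison is that a value such as `false` does not disable dedup under the new rule; only `0` does.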

2 files changed

Lines changed: 32 additions & 13 deletions

.github/workflows/jepsen-test-scheduled.yml

Lines changed: 13 additions & 0 deletions
@@ -39,6 +39,19 @@ jobs:
     runs-on: ubuntu-latest
     env:
       GOCACHE: /tmp/go-build
+      # Explicit dedup-OFF control baseline. The Redis adapter's
+      # onePhaseTxnDedup flipped to default-on in
+      # docs/design/2026_06_10_proposed_redis_onephase_dedup_default_on.md;
+      # this workflow is preserved as the legacy-path coverage so anomalies
+      # the dedup gate prevents (`:duplicate-elements`, `:future-read`,
+      # `:G-single-item-realtime`) continue to be measured against an
+      # unprotected build. Pair with the dedup-ON workflow
+      # (.github/workflows/jepsen-test-scheduled-dedup.yml) which sets
+      # ELASTICKV_REDIS_ONEPHASE_DEDUP=1 explicitly. Retirement of this
+      # workflow is a follow-up after 30 days of post-flip data; until
+      # then, do NOT remove this env var — without it the two workflows
+      # would exercise the same path under the new default.
+      ELASTICKV_REDIS_ONEPHASE_DEDUP: "0"
     steps:
       - uses: actions/checkout@v6
         with:

adapter/redis.go

Lines changed: 19 additions & 13 deletions
@@ -243,21 +243,25 @@ type RedisServer struct {
 	// retryable write error, list-push retries reuse the failed attempt's
 	// write set and carry prev_commit_ts so the FSM can dedup a commit that
 	// landed under leadership churn (see
-	// docs/design/2026_05_21_proposed_txn_secondary_idempotency.md). It
-	// MUST stay off until every node runs a probe-aware binary — see R5
-	// (FSM determinism across a rolling upgrade). Default off; enabled via
-	// WithOnePhaseTxnDedup / the ELASTICKV_REDIS_ONEPHASE_DEDUP env var
-	// after a full rollout.
+	// docs/design/2026_05_21_proposed_txn_secondary_idempotency.md). The
+	// FSM probe ships on every node in production, satisfying R5 (FSM
+	// determinism across a rolling upgrade), so the gate now defaults on
+	// per docs/design/2026_06_10_proposed_redis_onephase_dedup_default_on.md.
+	// Set ELASTICKV_REDIS_ONEPHASE_DEDUP=0 (or WithOnePhaseTxnDedup(false))
+	// to opt out — kept as a one-env-var operator rollback.
 	onePhaseTxnDedup bool
 
 	route map[string]func(conn redcon.Conn, cmd redcon.Command)
 }
 
 type RedisServerOption func(*RedisServer)
 
-// WithOnePhaseTxnDedup enables the option-2 one-phase idempotency dedup on
-// list-push retries (see RedisServer.onePhaseTxnDedup). Off by default;
-// enable only after the whole cluster runs a probe-aware binary.
+// WithOnePhaseTxnDedup enables (or disables) the option-2 one-phase
+// idempotency dedup on list-push, MULTI/EXEC, and standalone-write retries
+// (see RedisServer.onePhaseTxnDedup). On by default since the rollout
+// recorded in docs/design/2026_06_10_proposed_redis_onephase_dedup_default_on.md;
+// pass false to opt out from code, or set ELASTICKV_REDIS_ONEPHASE_DEDUP=0
+// to opt out from the environment. The constructor option trumps the env var.
 func WithOnePhaseTxnDedup(enabled bool) RedisServerOption {
 	return func(r *RedisServer) {
 		r.onePhaseTxnDedup = enabled
@@ -495,11 +499,13 @@ func NewRedisServer(listen net.Listener, redisAddr string, store store.MVCCStore
 		// getLuaPool, which honors luaPoolMaxIdle the same way.
 		luaPool: nil,
 		traceCommands: os.Getenv("ELASTICKV_REDIS_TRACE") == "1",
-		// onePhaseTxnDedup honors the documented opt-in env var; the
-		// WithOnePhaseTxnDedup option below can still override either way.
-		// Default off — see R5 in the design doc (the writer must not be
-		// enabled until the whole cluster runs a probe-aware binary).
-		onePhaseTxnDedup: os.Getenv("ELASTICKV_REDIS_ONEPHASE_DEDUP") == "1",
+		// onePhaseTxnDedup defaults on — the parent design's R5 rolling-upgrade
+		// constraint is discharged (FSM probe shipped on every node months ago,
+		// 12 consecutive green dedup-mode Jepsen runs 2026-05-31 → 2026-06-10).
+		// See docs/design/2026_06_10_proposed_redis_onephase_dedup_default_on.md.
+		// ELASTICKV_REDIS_ONEPHASE_DEDUP=0 opts out; the WithOnePhaseTxnDedup
+		// constructor option still trumps the env var.
+		onePhaseTxnDedup: os.Getenv("ELASTICKV_REDIS_ONEPHASE_DEDUP") != "0",
 		baseCtx: baseCtx,
 		baseCancel: baseCancel,
 		streamWaiters: newKeyWaiterRegistry(),
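
The precedence rule stated in the comments above (the constructor option trumps the env var) follows from the functional-options pattern: the field is seeded from the environment first, and options run afterwards. The sketch below shows that ordering in isolation. It is not the adapter's code: `server`, `option`, `withOnePhaseTxnDedup`, and `newServerWithEnv` are hypothetical stand-ins, the raw env value is passed as a parameter instead of being read with `os.Getenv`, and the loop that applies options after the struct literal is an assumption, since that part of `NewRedisServer` is outside the diff.

```go
package main

import "fmt"

// server is a hypothetical stand-in for adapter.RedisServer.
type server struct {
	onePhaseTxnDedup bool
}

// option is a hypothetical stand-in for adapter.RedisServerOption.
type option func(*server)

// withOnePhaseTxnDedup sets the flag explicitly, either way.
func withOnePhaseTxnDedup(enabled bool) option {
	return func(s *server) { s.onePhaseTxnDedup = enabled }
}

// newServerWithEnv seeds the flag from the raw env value (default-on,
// "0" opts out) and then applies options, so an option always wins.
// Assumption: the real constructor applies options after the literal.
func newServerWithEnv(raw string, opts ...option) *server {
	s := &server{onePhaseTxnDedup: raw != "0"}
	for _, opt := range opts {
		opt(s)
	}
	return s
}

func main() {
	// Env opts out, no option: dedup is off.
	fmt.Println(newServerWithEnv("0").onePhaseTxnDedup)
	// Env opts out, option forces it on: the option wins.
	fmt.Println(newServerWithEnv("0", withOnePhaseTxnDedup(true)).onePhaseTxnDedup)
	// Env unset (modelled as ""), no option: default-on applies.
	fmt.Println(newServerWithEnv("").onePhaseTxnDedup)
	// Env unset, option forces it off: the option wins again.
	fmt.Println(newServerWithEnv("", withOnePhaseTxnDedup(false)).onePhaseTxnDedup)
}
```

Under this ordering the env var acts as the operator-level rollback described in the field comment, while code that passes the option pins the behaviour regardless of the environment.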
