Commit f598868
perf(lease-read): store lease expiry as atomic.Int64 nanoseconds (zero-alloc extend) (#948)
Closes #554.
## Behavior change
`leaseState` no longer heap-allocates on the lease write path. The
expiry is now stored as a single `atomic.Int64` of monotonic-raw
nanoseconds and the invalidation generation as a single `atomic.Uint64`,
replacing the `atomic.Pointer[leaseSlot]` whose every successful
`extend` allocated a fresh `*leaseSlot`. That allocation scaled 1:1 with
Dispatch throughput (issue #554: ~50k allocs/s at 50k Dispatch/s) and was a
matching source of GC pressure.
All externally observable lease semantics are preserved:
- `valid(now)` boundary unchanged: `now.Before(expiry)` (strict /
exclusive at the expiry instant), zero-expiry treated as "no lease",
zero `now` fails closed.
- `extend` monotonicity unchanged: a shorter target never regresses a
longer live lease.
- Generation guard unchanged: an `extend` carrying a generation captured
before a racing `invalidate` is dropped, so a leader-loss callback
cannot be resurrected.
- `invalidate` still clears unconditionally and bumps the generation.
- Nil-receiver and zero-value `leaseState` behavior unchanged.
The method signatures (`valid`, `generation`, `extend`, `invalidate`)
are byte-for-byte identical, so the callers in `kv/coordinator.go` and
`kv/sharded_coordinator.go` are untouched.
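
The read-side boundary rules listed above can be illustrated with a minimal sketch. It is not the code from this PR: `leaseSketch` is a hypothetical stand-in for `leaseState`, plain `int64` nanoseconds stand in for monoclock readings, and treating a nil receiver as "invalid" is an assumption about the preserved nil-receiver behavior.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// leaseSketch is a hypothetical stand-in for leaseState; only the read
// path is modeled. Nanosecond values are plain int64s standing in for
// monotonic clock readings.
type leaseSketch struct {
	expiryNanos atomic.Int64 // 0 means "no lease"
}

// valid reports whether nowNanos falls strictly before the stored expiry.
func (l *leaseSketch) valid(nowNanos int64) bool {
	if l == nil || nowNanos == 0 {
		return false // assumed: nil receiver and zero "now" fail closed
	}
	expiry := l.expiryNanos.Load() // single atomic load, no lock
	if expiry == 0 {
		return false // zero expiry: no lease has been granted
	}
	return nowNanos < expiry // strict: invalid at the expiry instant itself
}

func main() {
	var l leaseSketch
	fmt.Println(l.valid(100)) // false: zero-value state holds no lease

	l.expiryNanos.Store(1_000)
	fmt.Println(l.valid(999))   // true: strictly before expiry
	fmt.Println(l.valid(1_000)) // false: exclusive at the expiry instant
	fmt.Println(l.valid(0))     // false: zero "now" fails closed

	var nilLease *leaseSketch
	fmt.Println(nilLease.valid(1)) // false: nil receiver
}
```

The check reduces to one atomic load and one integer comparison, which is why the read path can stay allocation-free and lock-free.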
## Clock-source decision (called out explicitly)
The issue offered two directions for the expiry value:
1. `time.Now().UnixNano()` (wall clock): zero alloc but **loses Go's
monotonic-clock comparison**. Because `now` is compared against a stored
absolute expiry, a forward NTP step prematurely expires the lease (safe),
while a backward step extends it past its true safety window
(**unsafe**).
2. A monotonic source (`runtime.nanotime` / `CLOCK_MONOTONIC_RAW`) —
zero alloc **and** step-immune, avoiding the caveat entirely.
This PR takes **option 2**, and in fact the repo already adopted
`CLOCK_MONOTONIC_RAW` via `internal/monoclock` (PR #551, which landed
after #554 was filed). `expiryNanos` stores `monoclock.Instant.Nanos()`
and `valid()` compares it against a caller-supplied `monoclock.Now()`.
Both endpoints share the same arbitrary monotonic-raw zero point, so the
lease-vs-safety-window comparison is immune to NTP rate adjustment
**and** wall-clock step events — strictly stronger than
`time.Now().UnixNano()` and with no allocation. The wall-clock
dependency the issue asked to document is captured in a code comment on
`leaseState` (constraint stated as an invariant, no PR/issue reference):
both the stored expiry and the `now` passed to `valid()` must originate
from monoclock; mixing in `time.Now()`-derived nanoseconds would
reintroduce the NTP-step hazard.
So the only outstanding part of #554 was the per-`extend` `*leaseSlot`
allocation, which this PR removes.
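
To make the step hazard concrete, here is a small simulation (not code from this PR). The helper `validAt` and its parameters are hypothetical: it models the check `now < expiry` for a clock that may have jumped by `step` nanoseconds between grant and check. A monotonic source corresponds to `step == 0`.

```go
package main

import "fmt"

// validAt models "now < expiry" for a lease granted at clock reading
// grant with duration dur, checked after elapsed real nanoseconds.
// step is the jump applied to the clock between grant and check:
// nonzero for a stepped wall clock, always 0 for a monotonic clock.
func validAt(grant, dur, elapsed, step int64) bool {
	expiry := grant + dur
	now := grant + elapsed + step
	return now < expiry
}

func main() {
	const grant, dur = 1_000_000, 500

	// 600 ns of real time have passed: the lease has truly expired.
	fmt.Println(validAt(grant, dur, 600, 0))    // false: monotonic clock agrees
	fmt.Println(validAt(grant, dur, 600, -200)) // true: backward step keeps a dead lease alive

	// 400 ns of real time have passed: the lease is truly live.
	fmt.Println(validAt(grant, dur, 400, 0))   // true: monotonic clock agrees
	fmt.Println(validAt(grant, dur, 400, 200)) // false: forward step expires it early
}
```

Only the second case is a safety problem, since it lets a lease outlive its window; the last case merely forces the slow path.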
## Concurrency design
`valid()` (the hot path, one per read) stays lock-free: a single atomic
load of `expiryNanos` and a comparison. `extend` and `invalidate` (one
per Dispatch / leadership change, not per read) serialize on a
writer-only `sync.Mutex`, so their two-field `(gen, expiry)` updates
appear atomic to each other without needing a 128-bit CAS and without
the post-write rollback dance a lock-free two-atomic scheme would
require. Because writers are mutually exclusive, an `extend` and an
`invalidate` can never interleave: the `extend` either runs fully before
the `invalidate` (and is then cleared) or fully after (and is dropped by
the generation guard). Readers never take the mutex.
This replaces the previous pointer-identity rollback machinery (the
clock-granularity-tie disambiguation): with serialized writers there is
no rollback, so the value-clobber hazard that machinery guarded cannot
occur. The test that pinned the pointer-identity invariant is replaced
by one pinning the same observable safety property
(`TestLeaseState_StaleExtendCannotClobberFreshLeaseSameExpiry`).
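
The writer serialization described in this section can be sketched as follows. This is an illustration under assumptions, not the PR's implementation: `leaseSketch`, its field names other than `writeMu` and `expiryNanos`, and the `int64`/`uint64` parameter types are all hypothetical.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// leaseSketch is a hypothetical model of the two-field lease state.
type leaseSketch struct {
	writeMu     sync.Mutex    // taken by extend/invalidate only, never by readers
	gen         atomic.Uint64 // bumped by every invalidate
	expiryNanos atomic.Int64  // 0 means "no lease"
}

func (l *leaseSketch) generation() uint64 { return l.gen.Load() }

// valid is the lock-free read path: one atomic load and a comparison.
func (l *leaseSketch) valid(now int64) bool {
	e := l.expiryNanos.Load()
	return now != 0 && e != 0 && now < e
}

// extend installs until unless an invalidate ran after expectedGen was
// captured, or a longer lease is already live.
func (l *leaseSketch) extend(until int64, expectedGen uint64) {
	l.writeMu.Lock()
	defer l.writeMu.Unlock()
	if l.gen.Load() != expectedGen {
		return // stale generation: drop, so a lost leadership is not resurrected
	}
	if until > l.expiryNanos.Load() {
		l.expiryNanos.Store(until) // monotonic: never shorten a live lease
	}
}

// invalidate clears the lease unconditionally and bumps the generation.
func (l *leaseSketch) invalidate() {
	l.writeMu.Lock()
	defer l.writeMu.Unlock()
	l.gen.Add(1)
	l.expiryNanos.Store(0)
}

func main() {
	var l leaseSketch
	stale := l.generation()
	l.extend(1_000, stale)
	l.extend(800, stale)      // shorter target: ignored
	fmt.Println(l.valid(900)) // true

	l.invalidate()
	l.extend(1_000, stale)    // generation captured before invalidate: dropped
	fmt.Println(l.valid(900)) // false

	l.extend(1_000, l.generation()) // fresh lease under the new generation
	l.extend(1_000, stale)          // stale extend with the same expiry: still a no-op
	fmt.Println(l.valid(900))       // true
}
```

Because both writers hold the mutex for their whole update, no rollback step exists in this sketch, so a stale `extend` has nothing it could undo on top of a fresh lease.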
## Risk
- Adds a `sync.Mutex` on the lease write paths. These run once per
Dispatch (already gated by Raft consensus) / once per leadership change,
not on the per-read hot path. The critical section is two atomic stores.
Lock contention is negligible relative to the consensus round-trip; the
read hot path is unaffected.
- The clock-source choice is the load-bearing safety decision. monoclock
(CLOCK_MONOTONIC_RAW) preserves the existing step-immune behavior; no
wall-clock regression is introduced.
## Test evidence
`go test -race -run 'TestLeaseState|TestIsLeadershipLossError' ./kv/`:
```
ok github.com/bootjp/elastickv/kv 1.094s
```
Full package, `go test -race ./kv/...`:
```
ok github.com/bootjp/elastickv/kv 10.721s
```
`golangci-lint --config=.golangci.yaml run ./kv/...`:
```
0 issues.
```
Benchmark, `go test -run='^$' -bench='BenchmarkLeaseState' -benchmem
./kv/`:
```
goos: darwin
goarch: arm64
pkg: github.com/bootjp/elastickv/kv
cpu: Apple M1 Max
BenchmarkLeaseStateExtend-10 79673116 14.65 ns/op 0 B/op 0 allocs/op
BenchmarkLeaseStateValid-10 1000000000 0.5511 ns/op 0 B/op 0 allocs/op
```
Before this change the same `extend` benchmark measured `23.08 ns/op 16
B/op 1 allocs/op`. The acceptance criterion (0 allocs/op) is met, and
`extend` also takes ~37% less time per op (23.08 ns to 14.65 ns).
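
The allocation difference between the two storage shapes can be reproduced outside the repo with `testing.AllocsPerRun`. The sketch below is not the PR's benchmark: `slot`, `extendPtr`, and `extendInt` are hypothetical stand-ins for the old pointer-swap write and the new integer store, and the 16-byte `slot` layout is an assumption chosen to resemble the reported `16 B/op`.

```go
package main

import (
	"fmt"
	"sync/atomic"
	"testing"
)

// slot is a hypothetical stand-in for the old heap-allocated lease record.
type slot struct {
	expiryNanos int64
	gen         uint64
}

var (
	ptrState atomic.Pointer[slot] // old shape: swap in a fresh pointer per extend
	intState atomic.Int64         // new shape: overwrite one integer in place
)

// extendPtr publishes a new record; the record escapes to the heap.
func extendPtr(until int64) { ptrState.Store(&slot{expiryNanos: until}) }

// extendInt overwrites the stored value without allocating.
func extendInt(until int64) { intState.Store(until) }

// measure returns the average allocations per call for each shape.
func measure() (ptrAllocs, intAllocs float64) {
	var next int64
	ptrAllocs = testing.AllocsPerRun(1000, func() { next++; extendPtr(next) })
	intAllocs = testing.AllocsPerRun(1000, func() { next++; extendInt(next) })
	return ptrAllocs, intAllocs
}

func main() {
	p, i := measure()
	fmt.Printf("pointer swap: %.0f allocs/op, int64 store: %.0f allocs/op\n", p, i)
}
```

This only compares the raw stores; it leaves out the generation guard and the writer mutex, so it says nothing about the ns/op figures above.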
Per-PR Jepsen CI covers the lease-read window (DynamoDB / Redis
workloads), which is the integration-level safety net for this path.
## Self-review
1. **Data loss** — No persistence path touched. The lease is
leader-local, in-memory, advisory fast-path state; it gates whether a
read takes the fast or slow (LinearizableRead) path but never affects
what is written or committed. A lost/cleared lease only downgrades to
the slow path (safe). No FSM, Raft, Pebble, or snapshot semantics
change.
2. **Concurrency / distributed failures** — Writers serialize on
`writeMu`; readers are lock-free single-atomic. Eliminates the
extend/invalidate interleave the old design handled via pointer-identity
rollback (the interleave can no longer occur). Generation guard
preserved, so leader-loss invalidation still beats a racing stale
extend. Verified with `go test -race` on the full `kv` package,
including the concurrent extend/read test.
3. **Performance** — Removes the per-`extend` 16 B heap allocation (0
allocs/op, confirmed by benchmark); read path stays lock-free at 0.55
ns/op. New writer mutex critical section is two atomic stores, off the
read hot path and dwarfed by the consensus round-trip that precedes
every extend.
4. **Data consistency** — Lease validity boundary, monotonic-extend
rule, zero/sentinel handling, and the monotonic-raw clock source are all
preserved, so the lease-read freshness bound is unchanged. Reads still
go through the leader-issued read pipeline; nothing here bypasses
`HLC.Next()` or the read-timestamp path.
5. **Test coverage** — All existing lease tests retained and passing.
Replaced the implementation-internal pointer-identity test with
`TestLeaseState_StaleExtendCannotClobberFreshLeaseSameExpiry` (same
observable invariant against the new internals) and
`TestLeaseState_StaleExtendAfterInvalidateIsNoop`. Added
`BenchmarkLeaseStateExtend` (the acceptance criterion, 0 allocs/op) and
`BenchmarkLeaseStateValid`.

2 files changed
Lines changed: 241 additions & 163 deletions