Skip to content

Commit 32d8b93

Browse files
authored
feat(keyviz): fan-out §9.1 (group, term) dedupe (Phase 2-C+ PR-3c) (#822)
## Summary Replace the Phase 2-C **§4.2 row-level max-merge** for writes with the parent design's canonical **§9.1 per-cell `(group, term)` dedupe + sum-across-terms**. This is the headline Phase 2-C+ behavior change: under a mid-window leader flip, the cluster heatmap now recovers the TRUE write count instead of being conservatively bounded by the larger leader's slice. ## Why Per `docs/admin_ui_key_visualizer_design.md` §9.1: > "For write samples the authoritative identity is `(raftGroupID, leaderTerm)` — by Raft invariants at most one leader exists per term per group — so the admin binary collapses write samples to one value per `(bucketID, raftGroupID, leaderTerm, windowStart)` key. ... Across distinct `leaderTerm` values for the same group and window, values are summed because each term's leader only observed its own term's writes." PR-3a (#709) gave the sampler the `GroupID + LeaderTerm` fields. PR-3b (#720) wired them through the proto + JSON wire as parallel arrays and started the `main.go` publisher. This PR consumes them. ## Algorithm (per cell, writes series) 1. For each source's contribution at this cell, classify by `(group, term)`. 2. Within the same `(group, term)`: keep the larger value (Raft invariant — agreement is the steady state); raise `conflict=true` on disagreement (multiple "leaders" reporting writes for the same term is a real divergence). 3. **Sum across distinct `(group, term)` entries** — each term's leader only observed its own term's writes, so summing is exact. 4. **Fail-closed fallback**: if ANY contribution has `group=0` or `term=0` (legacy peer or a publisher that hasn't fired yet), fall back to the §4.2 max-merge for the WHOLE cell. Summing across an unknown-term entry would risk double-counting overlapping windows. Reads / read_bytes unchanged — local serves are independent and never need per-leader dedupe (sum-across-nodes is exact). Write_bytes follows writes. ## Structural change `mergeRowInto` no longer writes through into `dst.Values` incrementally. Each `(bucket, cell)` accumulates state in a `cellMergeAcc`; after all source rows have been processed, `resolveRowMergeAcc` materialises one final `KeyVizRow` per bucket. The accumulator carries: - `sum` / `lastGroup` / `lastTerm`: read-path running sum + best-effort identity. - `byTerm[(group, term)] → max value`: canonical write-path table. - `termConflict[(group, term)] → bool`: per-key disagreement flag. - `hasUnknownTerm`: flips fallback path on first unknown contribution. - `fallbackMax`, `fallbackSeenNonZeroValue`, `fallbackNonZeroDistinct`: fallback max-merge state, maintained for every non-zero contribution so a cell that later flips into the fallback path has the right running max + disagreement signal. The dispatch is gated by `mergeUsesGroupTermDedupe(series)` — `true` for writes/write_bytes, `false` for reads/read_bytes. ## Caller audit (semantic change to writes value) - `mergeKeyVizMatrices` → only `KeyVizFanout.Run` (same package). - `KeyVizFanout.Run` → only `KeyVizHandler.ServeHTTP` (same package). - HTTP `/admin/api/v1/keyviz/matrix` → SPA at `web/admin/src/pages/KeyViz.tsx` treats `values[]` as opaque numbers and reads `row.conflict` as a coarse hatch signal. No max/sum assumption, no consumer-side change required. - gRPC `AdminServer.GetKeyVizMatrix` returns single-node and does not invoke `mergeKeyVizMatrices` — unaffected. No structural caller change. The numeric meaning of `values[]` for writes shifts from "max across nodes" to "§9.1 sum-across-terms", which is the documented intentional behavior of PR-3c. ## Out of scope - **Per-cell `Conflicts []bool` wire format**: parent design §9.1 says "the cell-level flag will land with the leaderTerm-based merge". Deferred to a future PR-3d so this PR's diff is contained to the merge algorithm itself. The row-level `Conflict bool` is now driven by any-cell-saw-conflict, matching the existing SPA hatch contract. - **SPA per-cell hatching**: requires the per-cell wire change above. ## Test plan - [x] `go test -race -count=1 ./internal/admin/... ./keyviz/...` — pass. - [x] `golangci-lint run` (full project) — 0 issues. - [x] New tests: - `TestMergeKeyVizMatricesGroupTermSumAcrossTerms`: ex-leader (term=42, value=30) + new-leader (term=43, value=50) on the SAME cell merge to 80 (sum), recovering the pre-transfer slice §4.2 max would have lost. - `TestMergeKeyVizMatricesGroupTermConflictWithinSameTerm`: two sources reporting different non-zero values for the SAME `(group, term)` raise `conflict=true` (Raft invariant violation). - `TestMergeKeyVizMatricesGroupTermFallbackOnUnknownTerm`: modern (group=7, term=42, value=30) + legacy (no arrays, value=50) fall back to max=50, not sum=80 — the fail-closed guard preserves PR-3b's mixed-version safety. - [x] Existing PR-3b tests (`TestMergeKeyVizMatricesPerCellIdentityMatchesValueOwner`, `TestMergeKeyVizMatricesLegacyPeerWinnerResetsIdentity`, the §4.2-shape `TestMergeKeyVizMatricesWritesMax*` tests with empty arrays which now hit the fallback path) all continue to pass. - [ ] Jepsen — N/A (admin-side merge algorithm, no replication/MVCC/OCC impact). ## Five-lens self-review 1. **Data loss** — admin-side merge only; no on-disk format change. The fail-closed fallback preserves PR-3b's mixed-version safety. A cluster running mixed PR-3b/PR-3c binaries gets max-merge (PR-3b semantics) for any cell touched by an old binary, and §9.1 sum elsewhere. 2. **Concurrency** — fan-out merge runs single-goroutine after parallel peer calls return; no new shared state. 3. **Performance** — per-cell map allocation only when ≥1 source contributes a known-term write to that cell (lazy `c.byTerm == nil` check). Cells without known-term contributions (legacy fallback) skip map allocation entirely. The accumulator-based two-pass merge is O(sources × rows × columns), same asymptotic as the previous incremental merge. 4. **Data consistency** — §9.1 algorithm is the canonical reference. Tests pin the three nontrivial cases: sum across distinct terms (recovers the leader-flip slice), conflict within same term (Raft invariant violation), unknown-term fallback (mixed-version safety). 5. **Test coverage** — algorithm pinned with three new table-driven cases plus the existing PR-3b regression guards. <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * **Improvements** * More accurate merge behavior for write paths with finer-grained grouping and robust fallback when contributor identity is unknown, reducing incorrect aggregates and surfacing row-level conflicts reliably. * **Documentation** * Clarified conflict semantics to explain when row-level conflict is set and future per-cell conflict plans. * **Tests** * Added unit tests covering merge summing across terms, conflict detection, and unknown-identity fallback. <!-- review_stack_entry_start --> [![Review Change Stack](https://storage.googleapis.com/coderabbit_public_assets/review-stack-in-coderabbit-ui.svg)](https://app.coderabbit.ai/change-stack/bootjp/elastickv/pull/822?utm_source=github_walkthrough&utm_medium=github&utm_campaign=change_stack) <!-- review_stack_entry_end --> <!-- end of auto-generated comment: release notes by coderabbit.ai -->
2 parents ab8eac8 + 48376d3 commit 32d8b93

3 files changed

Lines changed: 409 additions & 106 deletions

File tree

0 commit comments

Comments
 (0)