Commit c1aa23b
feat(keyviz): fan-out §9.1 (group, term) dedupe (Phase 2-C+ PR-3c)
Replaces the Phase 2-C §4.2 row-level max-merge for writes with
the parent design's canonical §9.1 merge: collapse to one value
per (BucketID, RaftGroupID, LeaderTerm, column), then SUM across
distinct LeaderTerm values for the same (group, column). The
TRUE write count is recovered under a mid-window leader flip
instead of being conservatively bounded by the larger leader's
slice (the §4.2 trade-off: it never over-counts, but it understates
by the ex-leader's pre-transfer increment).
Algorithm (per cell, writes):
  1. Per (group, term), keep the max value seen across all
     sources (Raft invariant: at most one leader per term per
     group; agreement is the steady state).
  2. If two sources disagree within the same (group, term, cell),
     keep the larger AND surface conflict=true. That is a real
     divergence (multiple "leaders" reporting writes for the
     same term).
  3. Sum across distinct (group, term) entries for the cell.
     Each term's leader only observed its own term's writes, so
     summing is exact.
  4. Fail-closed: if ANY contribution has group=0 or term=0
     (unknown identity: a legacy peer or a publisher that hasn't
     fired yet), fall back to the §4.2 max-merge for the WHOLE
     cell. Summing across an unknown-term entry would risk
     double-counting overlapping windows.
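
The four steps above can be sketched as a single function over one cell's contributions. This is a minimal illustration, not the code in this commit: the `contribution` and `groupTerm` types and the `mergeWriteCell` name are hypothetical, and two behaviours are assumptions: a disagreement is flagged only when both values are non-zero, and in the fail-closed fallback the flag reflects only conflicts seen among known-identity entries.

```go
package main

import "fmt"

// contribution is one source's report for a single write cell.
// Hypothetical type for illustration; not taken from the commit.
type contribution struct {
	Group uint64 // RaftGroupID; 0 means unknown identity
	Term  uint64 // LeaderTerm; 0 means unknown identity
	Value uint64
}

type groupTerm struct{ group, term uint64 }

// mergeWriteCell applies steps 1-4 to the contributions for one cell.
func mergeWriteCell(cs []contribution) (value uint64, conflict bool) {
	perTerm := make(map[groupTerm]uint64)
	var maxAll uint64 // running max over everything, used by the fallback
	unknown := false
	for _, c := range cs {
		if c.Value > maxAll {
			maxAll = c.Value
		}
		if c.Group == 0 || c.Term == 0 {
			unknown = true // step 4: the whole cell must fall back
			continue
		}
		k := groupTerm{c.Group, c.Term}
		prev, seen := perTerm[k]
		// Step 2 (assumption: only non-zero disagreement counts).
		if seen && prev != 0 && c.Value != 0 && prev != c.Value {
			conflict = true
		}
		// Step 1: keep the max per (group, term).
		if !seen || c.Value > prev {
			perTerm[k] = c.Value
		}
	}
	if unknown {
		return maxAll, conflict // step 4: max-merge, never a sum
	}
	for _, v := range perTerm {
		value += v // step 3: sum across distinct (group, term)
	}
	return value, conflict
}

func main() {
	// Leader flip: term 42 saw 30 writes, term 43 saw 50.
	fmt.Println(mergeWriteCell([]contribution{{7, 42, 30}, {7, 43, 50}})) // 80 false
	// Same term, different values: keep the larger and flag it.
	fmt.Println(mergeWriteCell([]contribution{{7, 42, 30}, {7, 42, 50}})) // 50 true
	// One legacy contribution forces the max-merge fallback.
	fmt.Println(mergeWriteCell([]contribution{{7, 42, 30}, {0, 0, 50}})) // 50 false
}
```

The three calls in `main` mirror the scenarios the new tests in this commit describe.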
Reads stay on the legacy sum-across-nodes path — local serves
are independent and never need per-leader dedupe. Bytes
counters follow the same write/read split.
Structural change: mergeRowInto no longer writes through into
dst.Values incrementally. Each (bucket, cell) accumulates state
in a cellMergeAcc; after every source row has been processed,
resolveRowMergeAcc materialises a final KeyVizRow per bucket.
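
To illustrate why the merge is now two-phase, here is a rough sketch of an accumulate-then-resolve loop. It is not the commit's implementation: `cellAcc` stands in for cellMergeAcc, `mergeRows` for the mergeRowInto / resolveRowMergeAcc pair, and the `sourceRow` and `mergedRow` shapes (one group and term per source row, a string bucket key) are simplifying assumptions. The point it shows is that a legacy row arriving last can still flip a cell to the max-merge fallback, which an incremental write-through could not undo.

```go
package main

import "fmt"

type termKey struct{ group, term uint64 }

// cellAcc is a hypothetical stand-in for the commit's cellMergeAcc.
type cellAcc struct {
	perTerm  map[termKey]uint64 // max value per known (group, term)
	maxAll   uint64             // max over every contribution, for the fallback
	unknown  bool               // some contribution had group=0 or term=0
	conflict bool               // same (group, term) reported different non-zero values
}

func (a *cellAcc) add(group, term, v uint64) {
	if v > a.maxAll {
		a.maxAll = v
	}
	if group == 0 || term == 0 {
		a.unknown = true
		return
	}
	if a.perTerm == nil {
		a.perTerm = make(map[termKey]uint64)
	}
	k := termKey{group, term}
	prev, seen := a.perTerm[k]
	if seen && prev != 0 && v != 0 && prev != v {
		a.conflict = true
	}
	if v > prev {
		a.perTerm[k] = v
	}
}

// resolve is only meaningful once every source has been added.
func (a *cellAcc) resolve() uint64 {
	if a.unknown {
		return a.maxAll
	}
	var sum uint64
	for _, v := range a.perTerm {
		sum += v
	}
	return sum
}

// sourceRow and mergedRow are simplified, hypothetical row shapes.
type sourceRow struct {
	Bucket      string
	Group, Term uint64
	Writes      []uint64
}

type mergedRow struct {
	Bucket   string
	Writes   []uint64
	Conflict bool
}

// mergeRows accumulates every source row, then materialises one row per bucket.
func mergeRows(srcs []sourceRow) []mergedRow {
	accs := make(map[string][]cellAcc)
	var order []string // first-seen bucket order, for stable output
	for _, s := range srcs {
		cells, ok := accs[s.Bucket]
		if !ok {
			order = append(order, s.Bucket)
		}
		for len(cells) < len(s.Writes) {
			cells = append(cells, cellAcc{})
		}
		for i, v := range s.Writes {
			cells[i].add(s.Group, s.Term, v)
		}
		accs[s.Bucket] = cells
	}
	out := make([]mergedRow, 0, len(order))
	for _, b := range order {
		r := mergedRow{Bucket: b}
		for i := range accs[b] {
			c := &accs[b][i]
			r.Writes = append(r.Writes, c.resolve())
			r.Conflict = r.Conflict || c.conflict // row flag: any cell saw a conflict
		}
		out = append(out, r)
	}
	return out
}

func main() {
	rows := mergeRows([]sourceRow{
		{Bucket: "b1", Group: 7, Term: 42, Writes: []uint64{30, 5}},
		{Bucket: "b1", Group: 7, Term: 43, Writes: []uint64{50, 0}},
		{Bucket: "b2", Group: 7, Term: 42, Writes: []uint64{30}},
		{Bucket: "b2", Writes: []uint64{50}}, // legacy peer, arrives last
	})
	fmt.Printf("%+v\n", rows)
}
```

In this sketch bucket b1 resolves to writes [80 5] (summed across terms) and bucket b2 to [50] (fallback to max), regardless of the order in which the source rows are fed in.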
Caller audit (semantic change to writes value):
  - mergeKeyVizMatrices → only KeyVizFanout.Run
  - KeyVizFanout.Run → only KeyVizHandler.ServeHTTP
  - HTTP /admin/api/v1/keyviz/matrix → SPA at
    web/admin/src/pages/KeyViz.tsx treats values[] as opaque
    numbers and reads row.conflict as a coarse hatch signal. It
    makes no max/sum assumption, so no consumer-side change is
    required.
  - gRPC AdminServer.GetKeyVizMatrix returns single-node data and
    does not invoke mergeKeyVizMatrices, so it is unaffected.
Tests:
  - TestMergeKeyVizMatricesGroupTermSumAcrossTerms: ex-leader
    (term=42, value=30) + new-leader (term=43, value=50) on the
    SAME cell merge to 80 (sum), recovering the pre-transfer
    slice the §4.2 max would have lost.
  - TestMergeKeyVizMatricesGroupTermConflictWithinSameTerm: two
    sources reporting different non-zero values for the SAME
    (group, term) raise conflict=true (Raft invariant
    violation).
  - TestMergeKeyVizMatricesGroupTermFallbackOnUnknownTerm:
    modern (group=7, term=42, value=30) + legacy (no arrays,
    value=50) fall back to max=50, not sum=80; the fail-closed
    guard preserves PR-3b's mixed-version safety.
The Phase 2-C+ §4.2 → §9.1 transition is the headline
behavior change of PR-3c. Per-cell conflict promotion (parent
design §9.1: "the cell-level flag will land with the
leaderTerm-based merge") stays deferred to a future PR-3d so
this PR's diff is contained to the merge algorithm itself.
The row-level Conflict bool is now driven by any-cell-saw-conflict,
matching the existing SPA hatch contract.
Dropped dead code: maxMerge, sumMerge, mergeFnFor,
mergeCellFn. The new accumulator-based path replaces the
incremental mergeFn dispatch.
1 parent fde4ee3 commit c1aa23b
3 files changed
Lines changed: 359 additions & 106 deletions