Skip to content

feat(keyviz): leader-term wire format + main.go ticker (Phase 2-C+ PR-3b)#720

Merged
bootjp merged 3 commits intomainfrom
feat/keyviz-leader-term-wire
May 1, 2026
Merged

feat(keyviz): leader-term wire format + main.go ticker (Phase 2-C+ PR-3b)#720
bootjp merged 3 commits intomainfrom
feat/keyviz-leader-term-wire

Conversation

@bootjp
Copy link
Copy Markdown
Owner

@bootjp bootjp commented Apr 30, 2026

Summary

Build the end-to-end pipeline that takes the sampler-side RaftGroupID + LeaderTerm fields shipped in PR #709 (Phase 2-C+ PR-3a) and surfaces them on the wire (proto + JSON + gRPC) so the fan-out aggregator (PR-3c, next) has data to dedupe by (group, term) instead of the conservative §4.2 max-merge.

Why

Per docs/design/2026_04_27_proposed_keyviz_cluster_fanout.md §4.2, the cluster fan-out today max-merges write samples per row — under stable leadership this is exact, but under a leadership flip it understates the true window total by at most one leader's pre-transfer slice. §9.1's canonical fix is to dedupe by (bucket_id, raft_group_id, leader_term, column) and sum across terms; that needs raft_group_id + leader_term on every row of the wire format. PR #709 added the fields to MatrixRow; this PR plumbs them through.

Changes

Wire-format additions (forward-compatible):

  • proto/admin.proto: KeyVizRow.raft_group_id = 13 and leader_term = 14. Old SPAs ignore the new fields; old servers don't emit them — the zero value is the documented "term not tracked" fallback that triggers the legacy max-merge.
  • internal/admin/keyviz_handler.go: JSON KeyVizRow gains RaftGroupID and LeaderTerm (omitempty); newKeyVizRowFrom copies mr.RaftGroupID / mr.LeaderTerm.
  • internal/admin/keyviz_fanout.go: mergeRowInto seeds the destination row with RaftGroupID + LeaderTerm on first-seen (PR-3c will switch to a per-cell merge).
  • adapter/admin_grpc.go: newKeyVizRowFrom copies the same two fields onto the proto KeyVizRow.

main.go ticker:

  • startKeyVizLeaderTermPublisher launches a goroutine that polls each runtime.engine.Status().Term every Step (matching the flush cadence so each column observes a fresh term). Skips entirely when the sampler is nil or no runtimes are wired.
  • publishLeaderTerms / publishLeaderTermsFromSnapshots split into two helpers so unit tests can exercise publication without standing up a full raftengine.Engine fake.
  • The ticker publishes once immediately so the very first flush column carries a non-zero term.

Scope (PR-3b only)

  • Wire format emission: ✓ this PR
  • Ticker that publishes terms via SetLeaderTerm: ✓ this PR
  • Aggregator consumption (per-cell (group, term) merge replacing max-merge): deferred to PR-3c
  • Conflict per-cell promotion: deferred to PR-3c
  • SPA changes: none required — the wire-format extension is forward-compatible

With this PR alone, the new fields are visible on every response but the aggregator still uses §4.2 max-merge → no behavior change for SPA consumers.

Test plan

  • go test -race -count=1 ./keyviz/... ./internal/admin/... — passes
  • New test coverage:
    • TestKeyVizHandlerStampsRaftIdentity (JSON path)
    • TestMergeKeyVizMatricesPreservesRaftIdentity (fan-out merge)
    • TestGetKeyVizMatrixStampsRaftIdentity (gRPC path)
    • TestPublishLeaderTermsFromSnapshotsStampsRows (end-to-end through sampler)
    • TestPublishLeaderTermsFromSnapshotsNilSafe (nil-receiver contract)
    • TestStartKeyVizLeaderTermPublisherSkipsWhenSamplerNil / ...WhenNoRuntimes (errgroup lifecycle)
  • go build ./... — clean
  • golangci-lint run — 0 issues
  • Jepsen — N/A (admin-side wire extension, no replication/MVCC/OCC impact)

Five-lens self-review

  1. Data loss — wire-format extension only; legacy max-merge fallback preserves today's behavior when LeaderTerm=0.
  2. Concurrency — ticker is a single goroutine; SetLeaderTerm takes groupTermsMu (Phase 2-C+ PR-3a contract); no new shared state.
  3. PerformanceObserve hot path is unchanged; ticker adds N × Status() calls per Step where N is the runtime count (typically ≤ a few groups), and Status() is a non-blocking read.
  4. Data consistencyRaftGroupID + LeaderTerm round-trip through every wire path verified by table-driven tests.
  5. Test coverage — each layer (JSON, fan-out, gRPC, publisher) has its own focused regression guard; end-to-end test pins the sampler→column propagation.

Summary by CodeRabbit

  • New Features
    • Keyviz matrix now includes Raft leadership metadata with group ID and leader term fields in responses.
    • Leadership term information is automatically tracked and included in key visualization output.

…-3b)

PR #709 (Phase 2-C+ PR-3a) shipped the sampler-side
SetLeaderTerm API and per-row RaftGroupID + LeaderTerm
fields, but no caller was wiring them. This PR builds the
end-to-end pipeline so the fan-out aggregator (PR-3c, next)
has data to act on.

Wire-format additions (forward-compatible):

- proto/admin.proto: KeyVizRow gains
  `uint64 raft_group_id = 13` and `uint64 leader_term = 14`.
  Old SPAs ignore the new fields; old servers don't emit
  them (the field's zero value is the documented "term not
  tracked" fallback so the legacy max-merge applies).
- internal/admin/keyviz_handler.go (JSON path):
  KeyVizRow gains RaftGroupID and LeaderTerm with
  omitempty tags. newKeyVizRowFrom copies
  MatrixRow.RaftGroupID + LeaderTerm into the JSON row.
- internal/admin/keyviz_fanout.go: mergeRowInto seeds
  the destination row with RaftGroupID + LeaderTerm when
  it is first seen for a bucket. PR-3c will switch to a
  per-cell (group, term) merge; this PR ensures the
  fields survive the conservative max-merge so PR-3c has
  something to dedupe.
- adapter/admin_grpc.go (proto path): newKeyVizRowFrom
  copies the same two fields onto the proto KeyVizRow.

main.go ticker (publishes terms at the flush cadence):

- startKeyVizLeaderTermPublisher launches a goroutine that
  polls each runtime.engine.Status().Term every Step
  (matching the flush cadence so each column observes a
  fresh term). Skips entirely when the sampler is nil
  (keyviz disabled) or when no runtimes are wired.
- publishLeaderTerms / publishLeaderTermsFromSnapshots
  split into two helpers so unit tests can exercise the
  publication step without faking the wide raftengine.Engine
  interface — the snapshot-based helper is nil-receiver
  safe for the keyviz-disabled path.
- The ticker publishes once immediately so the very first
  flush column carries a non-zero term (without this, the
  first column built between startup and the first ticker
  fire would inherit LeaderTerm=0).

Tests:

- TestKeyVizHandlerStampsRaftIdentity: JSON path —
  MatrixRow.RaftGroupID + LeaderTerm propagate to
  raft_group_id + leader_term JSON fields.
- TestMergeKeyVizMatricesPreservesRaftIdentity: fan-out
  path — mergeRowInto preserves RaftGroupID + LeaderTerm.
- TestGetKeyVizMatrixStampsRaftIdentity: gRPC path —
  matrixToProto stamps proto fields 13 and 14.
- TestPublishLeaderTermsFromSnapshotsStampsRows: end-to-end
  through the sampler — published snapshot surfaces on
  every subsequent flushed row.
- TestPublishLeaderTermsFromSnapshotsNilSafe: nil-receiver
  contract.
- TestStartKeyVizLeaderTermPublisherSkipsWhenSamplerNil /
  ...WhenNoRuntimes: errgroup closes cleanly when the
  publisher has nothing to publish.

Five-lens self-review:

1. Data loss — wire format extension only; legacy max-merge
   fallback preserves today's behavior when LeaderTerm=0.
2. Concurrency — ticker is single goroutine; SetLeaderTerm
   takes groupTermsMu (Phase 2-C+ PR-3a contract); no new
   shared state.
3. Performance — Observe hot path is unchanged; ticker
   adds N×Status() calls per Step where N is the runtime
   count (typically ≤ a few groups), and Status() is a
   non-blocking read.
4. Data consistency — RaftGroupID + LeaderTerm round-trip
   through every wire path (JSON handler, fan-out merge,
   gRPC) verified by table-driven tests.
5. Test coverage — each layer (JSON, fan-out, gRPC,
   publisher) has its own focused regression guard;
   end-to-end test pins the sampler→column propagation.

Not in this PR (deferred to PR-3c):

- §9.1 canonical (bucket_id, raft_group_id, leader_term,
  column) merge replacing §4.2 max-merge.
- Conflict promotion from per-row to per-cell.
- SPA changes — wire-format addition is forward-compatible
  so the existing SPA renders correctly without changes.
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Apr 30, 2026

Warning

Rate limit exceeded

@bootjp has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 37 minutes and 5 seconds before requesting another review.

To keep reviews running without waiting, you can enable usage-based add-on for your organization. This allows additional reviews beyond the hourly cap. Account admins can enable it under billing.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 3b83f209-5599-4a17-bfc3-77a9a5ad4754

📥 Commits

Reviewing files that changed from the base of the PR and between c34c8bd and c3f0ce6.

⛔ Files ignored due to path filters (1)
  • proto/admin.pb.go is excluded by !**/*.pb.go
📒 Files selected for processing (10)
  • adapter/admin_grpc.go
  • adapter/admin_grpc_keyviz_test.go
  • internal/admin/keyviz_fanout.go
  • internal/admin/keyviz_fanout_test.go
  • internal/admin/keyviz_handler.go
  • internal/admin/keyviz_handler_test.go
  • main.go
  • main_admin.go
  • multiraft_runtime.go
  • proto/admin.proto
📝 Walkthrough

Walkthrough

This change propagates Raft leadership metadata (raft_group_id and leader_term) through the KeyViz data pipeline. A background publisher reads engine term status, the handler and adapter copy these fields through the data model, and merge logic preserves them during aggregation.

Changes

Cohort / File(s) Summary
Proto Definition
proto/admin.proto
Added raft_group_id and leader_term uint64 fields to KeyVizRow message for Raft identity tracking.
Handler Layer
internal/admin/keyviz_handler.go, internal/admin/keyviz_handler_test.go
Added RaftGroupID and LeaderTerm fields to KeyVizRow struct with JSON serialization; handler now populates these from source MatrixRow during row construction.
Adapter/gRPC Layer
adapter/admin_grpc.go, adapter/admin_grpc_keyviz_test.go
gRPC adapter copies RaftGroupID and LeaderTerm from source MatrixRow to proto KeyVizRow output; test validates proto field values match input identity values or remain zero.
Fanout/Merge Layer
internal/admin/keyviz_fanout.go, internal/admin/keyviz_fanout_test.go
Merge aggregation now preserves RaftGroupID and LeaderTerm when creating destination rows; test validates merged output retains identity fields.
Publisher/Main
main.go, main_keyviz_test.go
Added background publisher that periodically reads engine Status().Term for each Raft group and writes to sampler via SetLeaderTerm; publisher starts only when sampler is enabled and runtimes are provided. Includes snapshot abstraction for testing.

Sequence Diagram

sequenceDiagram
    participant Engine
    participant Publisher as Leader Term Publisher
    participant Sampler
    participant Handler
    participant Adapter

    loop Every flush cadence
        Publisher->>Engine: engine.Status().Term
        Engine-->>Publisher: term value
        Publisher->>Sampler: SetLeaderTerm(groupID, term)
        Sampler->>Sampler: store in MatrixRow
    end
    
    Sampler->>Sampler: Flush columns
    Sampler->>Handler: MatrixRow with RaftGroupID/LeaderTerm
    Handler->>Handler: newKeyVizRowFrom() copies fields
    Handler->>Handler: Merge/fanout aggregation preserves fields
    Handler->>Adapter: MatrixRow to gRPC KeyVizRow
    Adapter->>Adapter: Copy RaftGroupID/LeaderTerm to proto
    Adapter-->>Handler: pb.KeyVizRow response
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

Possibly related PRs

Poem

🐰 A rabbit hops through fields of raft,
Collecting leader terms, left and right,
Each row now knows whose paw was daft,
Building matrices, pixel-bright! ✨

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarizes the main changes: adding leader-term wire format support and a main.go ticker for publishing leader terms, which are the two key objectives of this PR.
Docstring Coverage ✅ Passed Docstring coverage is 80.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/keyviz-leader-term-wire

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
Review rate limit: 0/1 reviews remaining, refill in 37 minutes and 5 seconds.

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces Raft identity tracking (Raft Group ID and Leader Term) into the KeyViz pipeline to support future per-term deduplication. Changes include updates to internal data structures, proto definitions, and the fan-out merge logic, along with a new background publisher to poll Raft terms. Feedback highlights that row-level identity is insufficient for the stated goal of per-cell deduplication because leadership can change within a window. Additionally, the current merge logic may create inconsistencies between the reported term and the data until full per-cell logic is implemented.

Comment thread proto/admin.proto Outdated
Comment on lines +115 to +116
uint64 raft_group_id = 13;
uint64 leader_term = 14;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The raft_group_id and leader_term fields are defined at the row level, but a KeyVizRow contains a sequence of values representing multiple time columns. Since leadership can change within the requested window, a single row-level identity is insufficient for the per-cell deduplication goal mentioned in the PR description (which specifies (bucket_id, raft_group_id, leader_term, column) as the dedupe key). The current implementation in the pivot logic (adapter/admin_grpc.go and internal/admin/keyviz_handler.go) only captures the identity from the first column where a route appears, which will lead to incorrect deduplication in the aggregator for subsequent columns if a term change occurred during the window.

Comment thread internal/admin/keyviz_fanout.go Outdated
Comment on lines +528 to +529
RaftGroupID: row.RaftGroupID,
LeaderTerm: row.LeaderTerm,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

In mergeRowInto, the destination row's RaftGroupID and LeaderTerm are seeded from the first source row encountered for a given BucketID. When performing a maxMerge (used for writes) across nodes that might be in different terms, the resulting merged row could carry the identity of the first node processed but the value of a different node (the one with the maximum value). This creates an inconsistency between the reported term and the data in the merged output until the per-cell deduplication logic is fully implemented in Phase 2-C+.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: c34c8bd1ac

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread main.go Outdated
Comment on lines +1562 to +1565
if rt == nil || rt.engine == nil {
continue
}
snaps = append(snaps, groupTermSnapshot{groupID: rt.spec.id, term: rt.engine.Status().Term})
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Capture engine once before nil-checking in term publisher

publishLeaderTerms reads rt.engine twice (rt.engine == nil and then rt.engine.Status()), which is unsafe because rt.Close() sets engine to nil during cleanup. On startup-error paths that return before joining all goroutines (for example when setupAdminService fails), this goroutine can race with teardown and the second read can become nil after the check, causing a panic or race-detector failure. Store rt.engine in a local variable once, check that, and use that same reference for Status().

Useful? React with 👍 / 👎.

Three review items, two of them resolved by the same
semantic change (per-cell wire format):

- Gemini HIGH (proto/admin.proto:116) — row-level scalar
  raft_group_id / leader_term insufficient for the
  per-cell dedupe goal. The pivot in admin_grpc.go and
  keyviz_handler.go only captures the FIRST column's
  identity, so subsequent columns under a mid-window
  leader flip get the wrong term and the aggregator's
  (bucket, group, term, column) dedupe key breaks.
- Gemini MEDIUM (keyviz_fanout.go:529) — first-seen
  identity vs max-merge value can mismatch when the
  source whose value won did not match the source whose
  identity was seeded first.

Both resolved by switching wire format to per-cell
parallel arrays:

  // proto/admin.proto KeyVizRow:
  repeated uint64 raft_group_ids = 13;
  repeated uint64 leader_terms = 14;

  // internal/admin JSON KeyVizRow:
  RaftGroupIDs []uint64 `json:"raft_group_ids,omitempty"`
  LeaderTerms  []uint64 `json:"leader_terms,omitempty"`

The pivot loops in admin_grpc.go and keyviz_handler.go
now stamp row.RaftGroupIDs[j] = mr.RaftGroupID per
column (parallel to row.Values[j] = pick(mr)). The
mergeRowInto in keyviz_fanout.go stamps the destination
cell's identity from whichever source contributed the
value kept at that cell — for maxMerge that's the
larger source; for sumMerge that's best-effort
last-touched (sum has no single owner, but in the
steady state both sources agree on the term, so the
distinction doesn't matter).

- Codex P2 (main.go:1565) — publishLeaderTerms reads
  rt.engine twice (nil-check + Status() call) which
  races rt.Close() setting rt.engine = nil during
  startup-error teardown. Snapshotted into a local once.

Caller audit (per loop instructions): the only
production consumers of KeyVizRow.RaftGroupID/LeaderTerm
were the pivot/merge paths in adapter/admin_grpc.go and
internal/admin/keyviz_{handler,fanout}.go, all updated
in this commit. No external consumer reads these fields.
The PR-3c aggregator that will USE the per-cell dedupe
hasn't been written yet, so there are no callers
expecting the old scalar shape.

Test updates:

- TestKeyVizHandlerStampsRaftIdentity: extended to a
  two-column scenario with a mid-window flip; asserts
  parallel arrays match (route 1: groups=[7,7],
  terms=[42,43]; route 2 absent in col1: zero pair).
- TestGetKeyVizMatrixStampsRaftIdentity: same shape on
  the gRPC path against pb.KeyVizRow.RaftGroupIds /
  LeaderTerms.
- TestMergeKeyVizMatricesPreservesRaftIdentity: source
  rows now use parallel-array shape; pinned identity
  survives mergeRowInto.
- TestMergeKeyVizMatricesPerCellIdentityMatchesValueOwner
  (NEW): leadership flip across two cells with each
  leader winning one cell; asserts merged identity at
  each cell matches the value's owner (Gemini MEDIUM
  regression guard).

Sampler tests at the package level are unchanged because
keyviz.MatrixRow still carries scalar RaftGroupID /
LeaderTerm — the per-cell representation only emerges
at the pivot where multiple per-column MatrixRows are
folded into one row-major KeyVizRow.
@bootjp
Copy link
Copy Markdown
Owner Author

bootjp commented May 1, 2026

@claude review

Round-1 follow-ups (d7e5c43a) addressing 3 review items:

  • Gemini HIGH (proto/admin.proto:116) + Gemini MEDIUM (keyviz_fanout.go:529) — both resolved by switching the wire format from row-level scalars to per-cell parallel arrays (repeated uint64 raft_group_ids = 13; repeated uint64 leader_terms = 14; parallel to values[]). The pivot loops in adapter/admin_grpc.go and internal/admin/keyviz_handler.go now stamp row.RaftGroupIDs[j] = mr.RaftGroupID per column. mergeRowInto stamps the destination cell's identity from whichever source contributed the value kept at that cell — for maxMerge that's the larger source; for sumMerge it's best-effort last-touched (sum has no single owner; steady state has agreeing terms).

  • Codex P2 (main.go:1565)publishLeaderTerms snapshotted rt.engine into a local once before nil-check + Status() call, eliminating the race against rt.Close() setting rt.engine = nil during startup-error teardown.

Caller audit (per loop instructions, semantic change requires audit): the only production consumers of KeyVizRow.RaftGroupID/LeaderTerm were the pivot/merge paths I'm updating. grep -rn 'KeyVizRow.*\\.RaftGroupId\\b' returns nothing outside proto/admin.pb.go (generated). No external consumer reads these fields. The PR-3c aggregator that will USE the per-cell dedupe hasn't been written yet, so there are no callers expecting the old scalar shape.

New test: TestMergeKeyVizMatricesPerCellIdentityMatchesValueOwner drives a leadership flip across two cells with each leader winning one cell, asserts merged identity at each cell matches the value's owner (Gemini MEDIUM regression guard). Existing TestKeyVizHandlerStampsRaftIdentity and TestGetKeyVizMatrixStampsRaftIdentity extended to two-column scenarios with mid-window flips.

go test -race -count=1 ./keyviz/... ./internal/admin/... — pass; golangci-lint run (full project) — 0 issues.

@claude
Copy link
Copy Markdown

claude Bot commented May 1, 2026

Claude finished @bootjp's task in 4m 57s —— View job


Round-1 review: d7e5c43a

  • Gather context and understand request
  • Read changed files
  • Review proto changes (per-cell parallel arrays)
  • Review handler/adapter pivot loops
  • Review merge logic (per-cell identity)
  • Review main.go race fix
  • Review new tests

Verdict: Approved with one note

The three reviewer items are correctly addressed. The per-cell parallel-array design is the right call and the implementation is consistent across all three layers (proto → JSON → gRPC → fanout merge). Below is the detailed lens-by-lens assessment.


Proto / wire format (proto/admin.proto)

repeated uint64 raft_group_ids = 13 / repeated uint64 leader_terms = 14 is the correct shape for the stated goal. Field numbers 13/14 are reused from the first commit of this PR (which never hit main), so there is no deployed code with the old scalar encoding — the wire-type change from varint (scalar) to length-delimited packed (repeated) is safe in this branch-only context.

The invariant documented in the comment — "Both slices are either empty or len == len(values); the server never emits a partial-length slice" — is correctly maintained: every pivot allocates with make([]uint64, numCols), same as Values. ✓


Pivot loops (keyviz_handler.go, adapter/admin_grpc.go)

Both pivotKeyVizColumns and matrixToProto stamp RaftGroupIDs[j] / LeaderTerms[j] inside the column loop, where j is bounded to [0, numCols-1] by the range over cols. Index access is always in bounds. ✓

The key correctness property (Gemini HIGH): a route that appears across N columns now captures the per-column term, so a mid-window leader flip produces [42, 43] rather than [42, 42] (or the old scalar, which only captured the first column). TestKeyVizHandlerStampsRaftIdentity and TestGetKeyVizMatrixStampsRaftIdentity both drive the two-column flip scenario. ✓


Fanout merge (internal/admin/keyviz_fanout.go)

The identity-attribution logic:

if next == row.Values[j] && j < len(row.RaftGroupIDs) {
    dst.RaftGroupIDs[idx] = row.RaftGroupIDs[j]
}

Tracing through all maxMerge cases:

prev incoming next next == incoming action
0 X X stamp incoming identity
X 0 X ✗ (X≠0) keep existing identity (prev won) ✓
A>B B A ✗ (A≠B) keep existing identity (prev won) ✓
B A>B A stamp incoming identity ✓
X X≠0 X stamp incoming identity (tie; either is acceptable, documented) ✓

One minor behavioural note (not a bug): when prev == 0 and incoming == 0, maxMerge returns (0, false) and next == row.Values[j] holds, so identity is stamped for a zero-value cell from whichever source happens to be processed last. Zero-value cells carry no contribution for the Phase 2-C+ aggregator, so the spurious stamp is harmless — but it means RaftGroupIDs[j] / LeaderTerms[j] may be non-zero even when Values[j] == 0. The current test suite covers non-zero winner scenarios; a test with an all-zero cell (two sources both report 0 for a column) would confirm this is intentional rather than an oversight.

For sumMerge (reads): identity is stamped only when prev == 0 (first non-zero contributor), which matches the documented "best-effort last-touched" semantics. Subsequent contributors to the sum leave the identity from the first. ✓

The j < len(row.RaftGroupIDs) guard correctly handles legacy peers that send nil arrays: zero-length check means no identity is stamped, leaving dst.RaftGroupIDs[idx] at 0 (the "term not tracked" fallback). ✓

TestMergeKeyVizMatricesPerCellIdentityMatchesValueOwner is an effective regression guard for the Gemini MEDIUM scenario. ✓


Codex P2 – publishLeaderTerms race fix (main.go)

The fix:

engine := rt.engine      // snapshot once
if engine == nil {
    continue
}
snaps = append(snaps, ..., term: engine.Status().Term)

What it fixes: eliminates the TOCTOU window between the nil-check read and the Status() read; Close() can no longer set rt.engine = nil between the two and cause a nil-pointer dereference or panic. ✓

What remains: the initial read engine := rt.engine is still unsynchronized with rt.engine = nil in Close(). If they execute simultaneously the race detector can still flag it. That said:

  1. This is the same unguarded read pattern already used in raftMonitorRuntimes, pebbleMonitorSources, dispatchMonitorSources, and elsewhere — not newly introduced here.
  2. In practice the goroutine lifecycle prevents the race: the publisher runs inside the errgroup, and rt.Close() is in a cleanup.Add callback that fires after eg.Wait().
  3. The scenario Codex described ("setupAdminService fails before joining all goroutines") would need to be audited separately if a full fix were wanted — likely requiring sync/atomic.Pointer or a mutex around rt.engine.

For now this is an improvement, not a regression, and consistent with the existing codebase idiom.


Test coverage

Test Layer Scenario covered
TestKeyVizHandlerStampsRaftIdentity JSON Two-column flip (terms 42→43), absent route in col1 → zero identity
TestGetKeyVizMatrixStampsRaftIdentity gRPC Same scenario on the proto path
TestMergeKeyVizMatricesPreservesRaftIdentity fanout Steady-state: identity survives merge
TestMergeKeyVizMatricesPerCellIdentityMatchesValueOwner fanout Gemini MEDIUM: each leader wins one cell; identity tracks value owner
TestPublishLeaderTermsFromSnapshotsStampsRows publisher End-to-end sampler stamping
TestPublishLeaderTermsFromSnapshotsNilSafe publisher Nil sampler contract
TestStartKeyVizLeaderTermPublisherSkipsWhenSamplerNil/...NoRuntimes publisher Goroutine lifecycle

All layers are covered. The only gap worth noting (low priority) is a case where two sources both report zero for a cell, to confirm the zero-cell identity stamping behaviour is intentional.


Summary

The Gemini HIGH and MEDIUM issues are correctly resolved by the per-cell parallel-array design. The Codex P2 fix correctly eliminates the double-read TOCTOU issue (consistent with existing codebase patterns for the deeper race). The wire format contract, pivot logic, merge attribution, bounds guards, and tests are all sound. No blocking issues.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: d7e5c43aff

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread internal/admin/keyviz_fanout.go Outdated
Comment on lines +559 to +563
if next == row.Values[j] && j < len(row.RaftGroupIDs) {
dst.RaftGroupIDs[idx] = row.RaftGroupIDs[j]
}
if next == row.Values[j] && j < len(row.LeaderTerms) {
dst.LeaderTerms[idx] = row.LeaderTerms[j]
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Clear raft identity when a winning source lacks metadata

When mergeRowInto picks the incoming value (next == row.Values[j]) from a legacy peer that does not send raft_group_ids/leader_terms (empty slices), the len guards skip assignment and leave whatever non-zero identity was previously in dst. In a mixed-version cluster, that mislabels an unknown-term winning cell as a known term from another source, which breaks the documented zero-sentinel fallback and can corrupt later per-term dedupe/summing. The merge should explicitly reset dst.RaftGroupIDs[idx] and dst.LeaderTerms[idx] to 0 whenever the incoming value wins but metadata is absent.

Useful? React with 👍 / 👎.

Comment thread main.go Outdated
Comment on lines +1573 to +1574
engine := rt.engine
if engine == nil {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Synchronize access to runtime engine in term publisher

publishLeaderTerms still performs an unsynchronized read of rt.engine (engine := rt.engine) while raftGroupRuntime.Close mutates the same field to nil during shutdown. The new publisher goroutine runs on a ticker, and run() cleanup can call rt.Close() after cancellation on error paths before goroutines are fully joined, so this introduces a real read/write race on the interface field even though double-read nil checks were removed. Use locking/atomic access (or stop the publisher before closing runtimes) to avoid race-detector failures and undefined behavior.

Useful? React with 👍 / 👎.

Two P2 items from Codex's second pass on `d7e5c43a`:

- Codex P2 #1 (keyviz_fanout.go) — fail-closed semantic
  for legacy peer winners. The previous code used a
  `j < len(row.RaftGroupIDs)` guard that SKIPPED the
  assignment when the incoming source had empty arrays
  (legacy peer in mixed-version cluster). That left the
  stale identity from a previous higher-versioned source
  in `dst.RaftGroupIDs[idx]`, mislabeling an unknown-term
  winning cell as a known term and breaking the per-cell
  dedupe/summing in PR-3c. Fixed by EXPLICITLY RESETTING
  to 0 (the documented "term not tracked" sentinel) when
  the incoming wins but its metadata array is shorter
  than j. Caller audit: only mergeKeyVizMatrices calls
  mergeRowInto; no external consumer relies on the
  stale-identity behaviour.

- Codex P2 #2 (main.go publishLeaderTerms) — race
  between the publisher's read of rt.engine and
  rt.Close()'s write. Even after round-1's local-snapshot
  fix, the initial read remained unsynchronized with
  Close()'s write. Fixed structurally by adding
  engineMu sync.RWMutex to raftGroupRuntime, exposing
  snapshotEngine() that takes the read lock, and updating
  Close() to take the write lock around clearing the
  field.

  Caller audit (per loop instruction — semantic change to
  field access pattern): grepped all `rt.engine` /
  `r.engine` reads:
  - main.go:858, 1104, 1108, 1115 — startup wiring,
    single-threaded BEFORE any goroutine launches; safe
    to read directly.
  - main_admin.go:828, 831, 837 (newClusterInfoSource) —
    HTTP/gRPC handler closure, runs concurrent with
    rt.Close(). SAME race class as the publisher.
    Migrated to snapshotEngine() for consistency.
  - multiraft_runtime.go:Close() — now takes engineMu
    write lock.

  No other readers exist outside startup.

Test: TestMergeKeyVizMatricesLegacyPeerWinnerResetsIdentity
drives a mixed-version cluster scenario (modern source
reports 30 with (group=7, term=42); legacy source reports
50 with empty arrays). Legacy wins maxMerge with value=50;
asserts dst.RaftGroupIDs / LeaderTerms reset to 0 instead
of inheriting modern's identity.

go test -race ./keyviz/... ./internal/admin/... — pass.
golangci-lint run (full project) — 0 issues.
@bootjp
Copy link
Copy Markdown
Owner Author

bootjp commented May 1, 2026

@claude review

Round-2 follow-ups (c3f0ce66) addressing 2 Codex P2 items on d7e5c43a:

  • Codex P2 Configure Renovate #1 (keyviz_fanout.go) — fail-closed for legacy peer winners: previous j < len(row.RaftGroupIDs) guard SKIPPED assignment when the incoming had empty arrays, leaving stale identity from a previous higher-versioned source. Fixed: explicitly reset to 0 (the documented "term not tracked" sentinel) when the incoming wins but its metadata array is shorter than j.

  • Codex P2 Update dependency go to v1.21.4 #2 (main.go publishLeaderTerms) — race with rt.Close(): even after round-1's local snapshot, the initial read of rt.engine was unsynchronized with Close()'s write. Fixed structurally by adding engineMu sync.RWMutex to raftGroupRuntime, exposing snapshotEngine() accessor (read lock) and updating Close() to clear the field under the write lock.

Caller audit (per loop instruction — both fixes are semantic changes):

  • mergeRowInto reset semantics: only caller is mergeKeyVizMatrices (same file). No external consumer of the stale-identity behavior.
  • raftGroupRuntime.engine field access: grepped all rt.engine / r.engine reads:
    • main.go:858, 1104, 1108, 1115 — startup wiring, single-threaded BEFORE goroutines launch. Safe to keep direct field reads.
    • main_admin.go:828, 831, 837 (newClusterInfoSource) — HTTP/gRPC handler closure, runs concurrent with rt.Close(). Same race class as the publisher. Migrated to snapshotEngine() for consistency.
    • multiraft_runtime.go:Close() — now takes engineMu.Lock() around clearing the field.

New test: TestMergeKeyVizMatricesLegacyPeerWinnerResetsIdentity drives mixed-version cluster (modern reports 30 with identity, legacy reports 50 with empty arrays). Legacy wins maxMerge → asserts dst.RaftGroupIDs/LeaderTerms reset to 0 instead of inheriting modern's identity.

go test -race -count=1 ./keyviz/... ./internal/admin/... — pass; golangci-lint run (full project) — 0 issues.

@claude
Copy link
Copy Markdown

claude Bot commented May 1, 2026

Claude finished @bootjp's task in 2m 10s —— View job


Round-2 review: c3f0ce66

  • Gather context and understand request
  • Read internal/admin/keyviz_fanout.go — legacy-peer fail-closed fix
  • Read internal/admin/keyviz_fanout_test.go — new TestMergeKeyVizMatricesLegacyPeerWinnerResetsIdentity
  • Read multiraft_runtime.goengineMu + snapshotEngine() + Close() locking
  • Read main.go — publisher race fix
  • Read main_admin.gonewClusterInfoSource migration to snapshotEngine()

Verdict: Approved

Both Codex P2 items are correctly resolved. The fixes are structural and complete, and the caller audit in the commit message is accurate. Detailed assessment follows.


Codex P2 #1keyviz_fanout.go fail-closed fix

The restructured logic:

if next == row.Values[j] {
    if j < len(row.RaftGroupIDs) {
        dst.RaftGroupIDs[idx] = row.RaftGroupIDs[j]
    } else {
        dst.RaftGroupIDs[idx] = 0      // ← reset, not skip
    }
    if j < len(row.LeaderTerms) {
        dst.LeaderTerms[idx] = row.LeaderTerms[j]
    } else {
        dst.LeaderTerms[idx] = 0
    }
}

Compared to round-1's guard-skips (j < len(...) { stamp } with no else), the explicit else { = 0 } is semantically significant: when a legacy peer wins, it cannot silently inherit a modern peer's term. The sentinel correctly propagates "this cell's term is unknown". ✓

All four ordering cases for maxMerge:

First source Second source next Identity at dst
modern (30, group=7) legacy (50, nil arrays) 50 0 — reset ✓
legacy (50, nil arrays) modern (30, group=7) 50 0 — skip update (50 ≠ 30) ✓
modern (50, group=7) legacy (30, nil arrays) 50 7 — modern keeps (50 ≠ 30) ✓
modern (30, group=7) modern2 (50, group=9) 50 9 — winner stamped ✓

For sumMerge (reads): the next == row.Values[j] condition holds only when prev == 0 (first non-zero contributor wins the identity assignment). If that first contributor is a legacy peer with empty arrays, identity is correctly reset to 0. Subsequent contributors don't overwrite identity, consistent with the documented "best-effort last-touched" semantics. ✓

TestMergeKeyVizMatricesLegacyPeerWinnerResetsIdentity correctly pins the primary mixed-version scenario (modern=30/group=7/term=42 processed first, legacy=50/nil processed second → value=50, identity=0). ✓

Minor note, informational only: the existing round-1 observation about all-zero cells (both sources report 0, identity written from the second source via 0 == 0) still applies. With the reset fix, a legacy peer is the second source over a zero cell → identity correctly resets to 0. A modern peer as second → non-zero identity stamped for a zero cell. This is harmless since zero-value cells carry no contribution for PR-3c's aggregator, and the behavior is consistent with the "either identity is acceptable in a tie" note in the comments.


Codex P2 #2multiraft_runtime.go / main.go / main_admin.go race fix

Close() pattern (releasing lock before calling engine.Close()):

r.engineMu.Lock()
engine := r.engine
r.engine = nil
r.engineMu.Unlock()
if engine != nil {
    engine.Close()   // outside the lock
}

Snapshot-under-lock + call-outside-lock is the correct idiom: it avoids holding engineMu while engine.Close() runs (potential lock inversion or deadlock if Close() acquires other locks), while ensuring the nil-write is atomic from the perspective of concurrent readers. ✓

snapshotEngine() correctness:

func (r *raftGroupRuntime) snapshotEngine() raftengine.Engine {
    if r == nil { return nil }
    r.engineMu.RLock()
    defer r.engineMu.RUnlock()
    return r.engine
}

Nil-receiver safe; returns a stable snapshot under RLock. After return, a concurrent Close() can set r.engine = nil but the caller holds the snapshot reference. The engine's own concurrency (e.g., Status() racing engine.Close()) is a pre-existing concern at the raftengine.Engine level and outside the scope of this fix. ✓

Caller audit verification — grep confirms the five distinct access sites:

  • main.go:858, 1104, 1108, 1115 — all within the startup loop that runs single-threaded before the errgroup is launched; these pass rt.engine directly into constructors that store the reference (not re-read). Safe to leave as direct reads. ✓
  • main_admin.go (newClusterInfoSource) — HTTP/gRPC handler closure captures the runtime slice and re-reads the engine on each request; migrated to snapshotEngine(). ✓
  • multiraft_runtime.go:Close() — now takes write lock. ✓

No missed call sites. The audit is complete and the migration boundary (startup direct reads vs. long-lived goroutine guarded reads) is sound.

Test coverage for the race fix: No new unit test for the locking behavior specifically, which is appropriate — a timing-sensitive test would be flaky. Running with -race flag across the existing test suite is the correct guard, and the commit message confirms it passes.


No blocking issues. Ready to merge.

@bootjp bootjp merged commit ebb7cc6 into main May 1, 2026
9 checks passed
@bootjp bootjp deleted the feat/keyviz-leader-term-wire branch May 1, 2026 18:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant