Skip to content

Commit c788a96

Browse files
committed
docs: address PR #694 round-28 (Codex P1 #2 stale divisor)
Codex round-27 P1 #2 on `e7b0c132` (still applicable to `44d7cd45`): PR-B row claimed RegisterRoute is unconditionally legacy-slot-only (round-26 fix), but later in the same row said "MaxTrackedRoutes coarsening divides by slotsPerRoute (= len(AllLabels)+1 when the flag is on, 1 when off)". Implementer setting --keyvizLabelsEnabled=true on PR-B binary would compute divisor=6 with only 1 slot per route, expanding tracked-route capacity 6× beyond MaxTrackedRoutes. Fixed by aligning the §7 PR-B row's divisor description with round-26's unconditional model: - "MaxTrackedRoutes coarsening check in PR-B uses slotsPerRoute as the literal 1 — same unconditionality reason as the RegisterRoute body above; the flag-gated conditional lands in PR-C+D+E, not PR-B" - "PR-B with slotsPerRoute=1 keeps the cap counting routes (1 slot per route × N routes ≤ MaxTrackedRoutes), behavior identical to today regardless of any flag value" Also tightened the secondary "flag-default-false" reference in the L0 chain failure-mode discussion: the legacy-only behavior comes from the unconditional code structure, not from the flag default. Updated to "With PR-B unconditionally pre-creating only the legacy slot (regardless of flag, per round-26 P2)". Codex round-27 P1 #1 (PR-B labeled-siblings claim in §9 Q2) was already addressed in round-27 commit 44d7cd4.
1 parent 44d7cd4 commit c788a96

1 file changed

Lines changed: 1 addition & 1 deletion

File tree

docs/design/2026_04_28_proposed_keyviz_adapter_labels.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -629,7 +629,7 @@ Virtual-bucket rows in the heatmap render as
629629
| PR | Scope |
630630
|---|---|
631631
| **PR-A** | Land this design doc. |
632-
| **PR-B** | Sampler API extension: extend the `Sampler` interface declaration (`keyviz/sampler.go:74`) AND the `MemSampler.Observe` implementation to `Observe(routeID uint64, op Op, keyLen, valueLen int, label Label)` (Claude bot round-8 minor: surfaces immediately as a compile error if any mock implementer is missed, but naming it keeps the scope complete). `routeSlot.Label`, slot-key type widens to `slotKey{uint64, Label}`. **PR-B introduces the `--keyvizLabelsEnabled` startup flag** (default `false`) on the server binary — CLI parsing and the `KeyVizLabelsEnabled bool` option on `ShardedCoordinatorOpts` / `MemSamplerOpts` ship in PR-B, so the flag exists from PR-B onward and PR-B's behavior-neutrality claim does not depend on PR-C+D+E (Codex round-25 P2). **`RegisterRoute` in PR-B always pre-creates only the legacy empty-label slot, regardless of the flag** (Codex round-26 P2 — earlier round-25 wording said pre-creation is "gated on the flag" in PR-B, which would have allowed an operator to enable the flag in PR-B and start allocating `len(AllLabels)+1` slots before any of the PR-C+D+E adapter-wiring or wire-format machinery existed). The `RegisterRoute` flag-honoring code lands in **PR-C+D+E**, not PR-B; in PR-B the flag is **dormant** for slot pre-creation (it exists for option plumbing only). Per-route slot count in PR-B is therefore unconditionally 1, regardless of the flag, and PR-B is fully behavior-neutral. **PR-B creates `keyviz/labels.go`** with the `slotKey` struct, the `type Label string` definition, the `LabelLegacy`/`LabelDynamo`/`LabelRedis`/etc. **typed constants**, and `AllLabels` as a **fixed-size array** `[5]Label{LabelDynamo, ..., LabelRawKV}` (round-23 — array form prevents importing-package mutation; see §9 Q2). The constants must exist at compile time so `RemoveRoute`'s `allLabelsWithLegacy()` teardown loop compiles in PR-B; `RegisterRoute`'s flag-honoring pre-creation loop is **not yet present in PR-B** (it lands in PR-C+D+E per the previous paragraph), so PR-B's `RegisterRoute` body is a single legacy-slot create call regardless of the flag. With PR-B unconditionally pre-creating only the legacy slot, per-route slot count stays at 1 (Claude bot round-6 + round-23 array correction + Codex round-26 P2 unconditionality). `MatrixRow.Label` field present. **L0 label copy** (Claude bot round-8 moderate): update `appendDrainedRow` (`keyviz/sampler.go:828`) to include `Label: slot.Label` in the `MatrixRow` struct literal — `slot.Label` is set once at `RegisterRoute` time and immutable, so read it directly rather than threading through `snapshotMeta`. This is the first link in the propagation chain: `slot.Label` → `MatrixRow.Label` (L0, here) → `KeyVizRow.Label` via `newKeyVizRowFrom` (L1, PR-D+E) → merged `KeyVizRow.Label` via `mergeRowInto` (L2, PR-D+E). Missing L0 silently truncates the chain and leaves `MatrixRow.Label = ""` even after PR-C+D+E wires the adapters to pass labels; L1 and L2 then faithfully copy the empty string and the feature collapses. With PR-B's flag-default-false (only legacy slots pre-created), `slot.Label` is `""` so the emitted row is `""`-labeled (identical to today's behavior), but the wiring is in place for PR-C+D+E to flip the flag and start emitting labels. `reclaimRetiredSlot` dedupes on `(RouteID, Label)` (forward-prep; only `Label=""` exists). The `MaxTrackedRoutes` coarsening check (`keyviz/sampler.go:416`) divides by `slotsPerRoute` (= `len(AllLabels)+1` when the flag is on, `1` when off; round-17 + round-23) so the cap continues to count **routes**, not slots; with the flag default to false in PR-B the divisor is 1 and behavior is identical to today (Claude round-24 fix to the prior "AllLabels empty in PR-B" framing — round-23 made AllLabels populated from PR-B, so the divisor=1 reason is the flag, not the array length). Update existing tests **and the coordinator call sites in `kv/sharded_coordinator.go`** (the only non-test caller of `Sampler.Observe`) to pass `label = ""`. |
632+
| **PR-B** | Sampler API extension: extend the `Sampler` interface declaration (`keyviz/sampler.go:74`) AND the `MemSampler.Observe` implementation to `Observe(routeID uint64, op Op, keyLen, valueLen int, label Label)` (Claude bot round-8 minor: surfaces immediately as a compile error if any mock implementer is missed, but naming it keeps the scope complete). `routeSlot.Label`, slot-key type widens to `slotKey{uint64, Label}`. **PR-B introduces the `--keyvizLabelsEnabled` startup flag** (default `false`) on the server binary — CLI parsing and the `KeyVizLabelsEnabled bool` option on `ShardedCoordinatorOpts` / `MemSamplerOpts` ship in PR-B, so the flag exists from PR-B onward and PR-B's behavior-neutrality claim does not depend on PR-C+D+E (Codex round-25 P2). **`RegisterRoute` in PR-B always pre-creates only the legacy empty-label slot, regardless of the flag** (Codex round-26 P2 — earlier round-25 wording said pre-creation is "gated on the flag" in PR-B, which would have allowed an operator to enable the flag in PR-B and start allocating `len(AllLabels)+1` slots before any of the PR-C+D+E adapter-wiring or wire-format machinery existed). The `RegisterRoute` flag-honoring code lands in **PR-C+D+E**, not PR-B; in PR-B the flag is **dormant** for slot pre-creation (it exists for option plumbing only). Per-route slot count in PR-B is therefore unconditionally 1, regardless of the flag, and PR-B is fully behavior-neutral. **PR-B creates `keyviz/labels.go`** with the `slotKey` struct, the `type Label string` definition, the `LabelLegacy`/`LabelDynamo`/`LabelRedis`/etc. **typed constants**, and `AllLabels` as a **fixed-size array** `[5]Label{LabelDynamo, ..., LabelRawKV}` (round-23 — array form prevents importing-package mutation; see §9 Q2). The constants must exist at compile time so `RemoveRoute`'s `allLabelsWithLegacy()` teardown loop compiles in PR-B; `RegisterRoute`'s flag-honoring pre-creation loop is **not yet present in PR-B** (it lands in PR-C+D+E per the previous paragraph), so PR-B's `RegisterRoute` body is a single legacy-slot create call regardless of the flag. With PR-B unconditionally pre-creating only the legacy slot, per-route slot count stays at 1 (Claude bot round-6 + round-23 array correction + Codex round-26 P2 unconditionality). `MatrixRow.Label` field present. **L0 label copy** (Claude bot round-8 moderate): update `appendDrainedRow` (`keyviz/sampler.go:828`) to include `Label: slot.Label` in the `MatrixRow` struct literal — `slot.Label` is set once at `RegisterRoute` time and immutable, so read it directly rather than threading through `snapshotMeta`. This is the first link in the propagation chain: `slot.Label` → `MatrixRow.Label` (L0, here) → `KeyVizRow.Label` via `newKeyVizRowFrom` (L1, PR-D+E) → merged `KeyVizRow.Label` via `mergeRowInto` (L2, PR-D+E). Missing L0 silently truncates the chain and leaves `MatrixRow.Label = ""` even after PR-C+D+E wires the adapters to pass labels; L1 and L2 then faithfully copy the empty string and the feature collapses. With PR-B unconditionally pre-creating only the legacy slot (regardless of flag, per round-26 P2), `slot.Label` is `""` so the emitted row is `""`-labeled (identical to today's behavior), but the type-widening + L0 wiring is in place for PR-C+D+E to land the flag-honoring `RegisterRoute` loop and start emitting labels. `reclaimRetiredSlot` dedupes on `(RouteID, Label)` (forward-prep; only `Label=""` exists). The `MaxTrackedRoutes` coarsening check (`keyviz/sampler.go:416`) in PR-B uses `slotsPerRoute` as the **literal `1`** — same unconditionality reason as the `RegisterRoute` body above (Codex round-26 + round-27 P1: the flag-gated `slotsPerRoute := 1; if KeyVizLabelsEnabled { slotsPerRoute = len(AllLabels)+1 }` lands in PR-C+D+E, not PR-B). PR-B with `slotsPerRoute = 1` keeps the cap counting routes (1 slot per route × N routes ≤ MaxTrackedRoutes), behavior identical to today regardless of any flag value. Update existing tests **and the coordinator call sites in `kv/sharded_coordinator.go`** (the only non-test caller of `Sampler.Observe`) to pass `label = ""`. |
633633
| **PR-C+D+E** | **Bundled — must ship together.** PR-C (adapter wiring only — `AllLabels` constants are already declared in PR-B as part of the `[5]Label` array, see §9 Q2) and PR-D+E (wire format + pivot-key widening) are not separately shippable: once an adapter calls `Observe(..., LabelDynamo)`, the sampler emits multiple `MatrixRow` per route in a single column (one per label); without the pivot-key widening from PR-D+E, `pivotKeyVizColumns` and `matrixToProto` collapse those rows back into a single `RouteID`-keyed entry where each labeled row overwrites the previous one — non-deterministic data loss in the intermediate state. Earlier drafts framed PR-C and PR-D+E as separately shippable; that was wrong (Codex round-10 P1). The bundle covers: <br>**Adapter wiring (was PR-C)**: each adapter sets its label at the `ShardedCoordinator.Observe…` dispatch entry. The five canonical constants (`LabelDynamo`, `LabelRedis`, `LabelS3`, `LabelSQS`, `LabelRawKV`) are already declared in `keyviz/labels.go` from PR-B as part of the `[5]Label` array (round-23 — see §9 Q2 and the PR-B row above); PR-C+D+E only wires the adapters to pass them, no `AllLabels` modification needed (Claude round-24 fix to the prior "populate empty slice" framing). Extend `RegisterRoute` to pre-create the labeled siblings (one slot per `AllLabels` member) **only when `--keyvizLabelsEnabled=true`**; with the flag off, `RegisterRoute` continues to pre-create just the legacy empty-label slot, identical to today's behavior (Codex round-15-4th-pass P1). **MaxTrackedRoutes is unchanged in semantics** — the §4.1.1 coarsening check divides slot count by `slotsPerRoute` (`len(AllLabels)+1` when the flag is on, `1` when off), so the existing operator-set cap continues to mean "individually-tracked routes" exactly as before; **operators do not need to raise `--keyvizMaxTrackedRoutes`** (an earlier draft told operators to scale the cap by `× (len(labels)+1)` — that contradicted Option A; Codex round-6). Memory growth: `MaxTrackedRoutes × (len(AllLabels) + 1)` slots when labels are enabled, `MaxTrackedRoutes × 1` when off (today's baseline); documented but not capped. Emit `slog.Warn` from inside `RegisterRoute` when a route coarsens. <br>**Wire-format extension (was PR-D+E)**: proto + JSON `bucket_id` composite + optional `label` field, plus the SPA `route:N / label` rendering AND the **five** code changes across the three response paths: <br>**Single-node JSON path (`internal/admin/keyviz_handler.go`)**: <br>(a) `pivotKeyVizColumns` `rowsByID` map AND `order` slice both widen from `uint64` to the composite `BucketID string` — widening only the map without the `order` slice is a compile error; <br>(b) `newKeyVizRowFrom` (`keyviz_handler.go:368`) copies `mr.Label → row.Label` via an explicit `string(mr.Label)` cast (`MatrixRow.Label` is the typed `keyviz.Label`; `KeyVizRow.Label` is plain `string` for wire-format flexibility — Claude bot round-10 minor) — **first-level** Label copy, affects single-node and cluster deployments alike; <br>(c) `bucketIDFor` (`keyviz_handler.go:383`) returns the composite `"route:<id>:<label>"` when `mr.Label != ""`, falling back to the legacy `"route:<id>"` for empty labels — without this `BucketID` is non-unique and `applyKeyVizRowBudget` / `sortKeyVizRowsByStart` lose their deterministic tiebreak; <br>**Fan-out JSON path (`internal/admin/keyviz_fanout.go`)**: <br>(d) `mergeRowInto` (`keyviz_fanout.go:509`) adds `dst.Label = row.Label` — **second-level** Label copy, only the cluster fan-out path touches this; <br>**gRPC path (`adapter/admin_grpc.go`)**: <br>(e) `matrixToProto` (`admin_grpc.go:599`) and the per-row conversion it drives: (e1) **widen `rowsByID` (line 603) and `order` (line 604) from `uint64` to the composite `BucketID string` key** — same widening as item (a); without it `(routeID=1, label="dynamo")` and `(routeID=1, label="redis")` collapse to the same map entry; (e2) copy `MatrixRow.Label → KeyVizRow.label` (proto field 4) via `string(mr.Label)` cast (same typed→untyped reasoning as item (b)); (e3) emit composite `bucket_id` (`"route:<id>:<label>"`). Without (e1)–(e3), `GetKeyVizMatrix` gRPC clients receive collapsed unlabeled rows even though HTTP/SPA responses now show per-label rows. <br>All five copies are required; missing any one leaves a flavour of deployment with empty labels. Splitting the bundle into separate PR-C and PR-D+E was the original framing but is now rejected (see opening paragraph of this row). <br>**Operator-controlled rollout gate (rolling-upgrade safety)**: a normal rolling upgrade temporarily mixes nodes that emit legacy `route:<id>` rows with nodes that emit labeled `route:<id>:<label>` rows. The fan-out aggregator keys strictly by `BucketID` in `mergeRowInto`, so those rows do **not** merge — operators would see fragmented unlabeled-plus-labeled data per route until every node converges. **`--keyvizLabelsEnabled` is introduced in PR-B** (default `false`) so PR-B's behavior-neutrality claim (`RegisterRoute` skips labeled-sibling pre-creation, slot count stays at 1) holds without any PR-C+D+E dependency. PR-C+D+E adds the **coordinator override path**: when the flag is false, the **`ShardedCoordinator` overrides the adapter-supplied label to `keyviz.LabelLegacy` at the single `sampler.Observe(...)` call site** in `kv/sharded_coordinator.go` (one `if !s.keyvizLabelsEnabled { label = keyviz.LabelLegacy }` guard, not a 5-file flag-read duplication across adapters; Claude bot round-12 moderate). The override path only matters in PR-C+D+E because that is when adapters first start passing non-empty labels; in PR-B all adapters still pass `label = ""` so the override is a no-op. (Codex round-25 P2 caught the prior wording that said "PR-C+D+E adds the flag" — it implied PR-B couldn't enforce its own gating, when in fact PR-B introduces the flag for pre-creation gating and PR-C+D+E only adds the override path on top.) The bundled binary is therefore safe to roll out one node at a time — every node, mixed or fully upgraded, emits the legacy format. <br>**Flag-flip activation**: once the fleet is fully on the new binary, the operator flips `--keyvizLabelsEnabled=true`. The flag is a startup-only `flag.Bool` (no live-toggle / config-reload path; Claude bot round-12 minor); changing it requires a process restart. There are two restart strategies: (a) **simultaneous restart of all nodes** — KeyViz heatmap is briefly unavailable during the restart window but no mixed-format fragmentation occurs; (b) **rolling restart** — for the duration of the restart (typically minutes), the heatmap shows a transient mixed view because the legacy `route:N` rows from not-yet-restarted nodes don't merge with `route:N:label` rows from restarted nodes. Since KeyViz is a monitoring view (not a consistency-sensitive system), the rolling-restart fragmentation is acceptable and clears as the final node restarts; operators who want zero fragmentation should use the simultaneous restart. (Claude bot round-12 moderate.) <br>**Flag also gates pre-allocation**: `--keyvizLabelsEnabled` is **both** the traffic-routing toggle (override label to `LabelLegacy` at the coordinator) **and** the slot-pre-creation toggle. When `false`, `RegisterRoute` pre-creates only the legacy empty-label slot (`len(next.slots) += 1` per route, identical to today's behavior); labeled siblings are not allocated. When `true`, `RegisterRoute` pre-creates the legacy slot **and** one labeled sibling per `AllLabels` member (`len(next.slots) += len(AllLabels)+1` per route). Memory therefore stays at the today's level (`MaxTrackedRoutes × 1`) for clusters that deploy the bundled binary but leave the flag off, and grows to `MaxTrackedRoutes × (len(AllLabels)+1)` only when the operator opts in. The flag-flip activation requires a process restart (see preceding paragraph), and `RegisterRoute` re-runs at startup, so the new pre-creation regime is in effect immediately after the restart — there is no live re-allocation path needed. **An earlier draft separated memory and traffic-routing into two regimes ("memory-vs-flag separation", round-12) where labeled slots were pre-created regardless of the flag; that was wrong (Codex round-15-4th-pass P1) — it imposed a multi-x memory penalty on clusters that never enable labels, contradicting the §2.1 minimal-penalty goal.** (Codex round-11 P2 originated this rollout gate.) |
634634

635635
PR-B is independent of the bundled PR-C+D+E: PR-B widens types

0 commit comments

Comments
 (0)