You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Round-26 Codex P2 on `788b6858`:
The round-25 PR-B row said "RegisterRoute only pre-creates
the legacy empty-label slot when --keyvizLabelsEnabled=false
(the default) — labeled-sibling pre-creation is gated on the
flag (round-17)". This implied that an operator setting
--keyvizLabelsEnabled=true in PR-B would cause RegisterRoute
to pre-create len(AllLabels)+1 slots, reintroducing the
early memory/route-budget regression PR-C+D+E tries to
defer.
Per Codex round-26 P2, PR-B should be unambiguously
legacy-slot-only regardless of the flag. The flag-honoring
RegisterRoute code lands in PR-C+D+E, not PR-B.
Updates:
- §7 PR-B row: clarified that PR-B introduces the flag for
CLI parsing + Opts plumbing only; RegisterRoute in PR-B
unconditionally pre-creates only the legacy slot
regardless of the flag value. The flag-honoring loop
lands in PR-C+D+E. Per-route slot count is unconditionally
1 in PR-B.
- §7 closing paragraph: same clarification — PR-B is
legacy-slots-only because the flag-honoring
pre-creation loop is in PR-C+D+E, not because the flag
is default-false.
- §4.1.1 PR-B/PR-C+D+E body pseudocode (sampler.go:416
slotsPerRoute calculation): split into two phases —
PR-B body stops at `slotsPerRoute := 1` (literal, no
flag check), PR-C+D+E extends to the flag-gated
conditional branch. Comment updated accordingly.
The flag still exists from PR-B (per round-25 P2 — PR-B
needs it on Opts for downstream wiring), but PR-B treats
it as dormant for slot pre-creation. PR-C+D+E activates
the flag's pre-creation effect in tandem with the
adapter-wiring + wire-format changes.
@@ -611,16 +619,22 @@ Virtual-bucket rows in the heatmap render as
611
619
| PR | Scope |
612
620
|---|---|
613
621
|**PR-A**| Land this design doc. |
614
-
| **PR-B** | Sampler API extension: extend the `Sampler` interface declaration (`keyviz/sampler.go:74`) AND the `MemSampler.Observe` implementation to `Observe(routeID uint64, op Op, keyLen, valueLen int, label Label)` (Claude bot round-8 minor: surfaces immediately as a compile error if any mock implementer is missed, but naming it keeps the scope complete). `routeSlot.Label`, slot-key type widens to `slotKey{uint64, Label}`. **PR-B introduces the `--keyvizLabelsEnabled` startup flag** (default `false`) on the server binary, so the flag exists from PR-B onward and PR-B's behavior-neutrality claim does not depend on PR-C+D+E (Codex round-25 P2). **`RegisterRoute` only pre-creates the legacy empty-label slot when `--keyvizLabelsEnabled=false` (the default)** — labeled-sibling pre-creation is gated on the flag (round-17), so PR-B with the default flag is a behavior-neutral refactor (per-route slot count is unchanged at 1). **PR-B creates `keyviz/labels.go`** with the `slotKey` struct, the `type Label string` definition, the `LabelLegacy`/`LabelDynamo`/`LabelRedis`/etc. **typed constants**, and `AllLabels` as a **fixed-size array** `[5]Label{LabelDynamo, ..., LabelRawKV}` (round-23 — array form prevents importing-package mutation; see §9 Q2). The constants must exist at compile time so `RemoveRoute`'s `allLabelsWithLegacy()` teardown loop and `RegisterRoute`'s pre-creation loop both compile; with the flag default to false, `RegisterRoute` skips the labeled-sibling loop body so per-route slot count stays at 1 (Claude bot round-6 + round-23 array correction). `MatrixRow.Label` field present. **L0 label copy** (Claude bot round-8 moderate): update `appendDrainedRow` (`keyviz/sampler.go:828`) to include `Label: slot.Label` in the `MatrixRow` struct literal — `slot.Label` is set once at `RegisterRoute` time and immutable, so read it directly rather than threading through `snapshotMeta`. This is the first link in the propagation chain: `slot.Label` → `MatrixRow.Label` (L0, here) → `KeyVizRow.Label` via `newKeyVizRowFrom` (L1, PR-D+E) → merged `KeyVizRow.Label` via `mergeRowInto` (L2, PR-D+E). Missing L0 silently truncates the chain and leaves `MatrixRow.Label = ""` even after PR-C+D+E wires the adapters to pass labels; L1 and L2 then faithfully copy the empty string and the feature collapses. With PR-B's flag-default-false (only legacy slots pre-created), `slot.Label` is `""` so the emitted row is `""`-labeled (identical to today's behavior), but the wiring is in place for PR-C+D+E to flip the flag and start emitting labels. `reclaimRetiredSlot` dedupes on `(RouteID, Label)` (forward-prep; only `Label=""` exists). The `MaxTrackedRoutes` coarsening check (`keyviz/sampler.go:416`) divides by `slotsPerRoute` (= `len(AllLabels)+1` when the flag is on, `1` when off; round-17 + round-23) so the cap continues to count **routes**, not slots; with the flag default to false in PR-B the divisor is 1 and behavior is identical to today (Claude round-24 fix to the prior "AllLabels empty in PR-B" framing — round-23 made AllLabels populated from PR-B, so the divisor=1 reason is the flag, not the array length). Update existing tests **and the coordinator call sites in `kv/sharded_coordinator.go`** (the only non-test caller of `Sampler.Observe`) to pass `label = ""`. |
622
+
| **PR-B** | Sampler API extension: extend the `Sampler` interface declaration (`keyviz/sampler.go:74`) AND the `MemSampler.Observe` implementation to `Observe(routeID uint64, op Op, keyLen, valueLen int, label Label)` (Claude bot round-8 minor: surfaces immediately as a compile error if any mock implementer is missed, but naming it keeps the scope complete). `routeSlot.Label`, slot-key type widens to `slotKey{uint64, Label}`. **PR-B introduces the `--keyvizLabelsEnabled` startup flag** (default `false`) on the server binary — CLI parsing and the `KeyVizLabelsEnabled bool` option on `ShardedCoordinatorOpts` / `MemSamplerOpts` ship in PR-B, so the flag exists from PR-B onward and PR-B's behavior-neutrality claim does not depend on PR-C+D+E (Codex round-25 P2). **`RegisterRoute` in PR-B always pre-creates only the legacy empty-label slot, regardless of the flag** (Codex round-26 P2 — earlier round-25 wording said pre-creation is "gated on the flag" in PR-B, which would have allowed an operator to enable the flag in PR-B and start allocating `len(AllLabels)+1` slots before any of the PR-C+D+E adapter-wiring or wire-format machinery existed). The `RegisterRoute` flag-honoring code lands in **PR-C+D+E**, not PR-B; in PR-B the flag is **dormant** for slot pre-creation (it exists for option plumbing only). Per-route slot count in PR-B is therefore unconditionally 1, regardless of the flag, and PR-B is fully behavior-neutral. **PR-B creates `keyviz/labels.go`** with the `slotKey` struct, the `type Label string` definition, the `LabelLegacy`/`LabelDynamo`/`LabelRedis`/etc. **typed constants**, and `AllLabels` as a **fixed-size array** `[5]Label{LabelDynamo, ..., LabelRawKV}` (round-23 — array form prevents importing-package mutation; see §9 Q2). The constants must exist at compile time so `RemoveRoute`'s `allLabelsWithLegacy()` teardown loop compiles in PR-B; `RegisterRoute`'s flag-honoring pre-creation loop is **not yet present in PR-B** (it lands in PR-C+D+E per the previous paragraph), so PR-B's `RegisterRoute` body is a single legacy-slot create call regardless of the flag. With PR-B unconditionally pre-creating only the legacy slot, per-route slot count stays at 1 (Claude bot round-6 + round-23 array correction + Codex round-26 P2 unconditionality). `MatrixRow.Label` field present. **L0 label copy** (Claude bot round-8 moderate): update `appendDrainedRow` (`keyviz/sampler.go:828`) to include `Label: slot.Label` in the `MatrixRow` struct literal — `slot.Label` is set once at `RegisterRoute` time and immutable, so read it directly rather than threading through `snapshotMeta`. This is the first link in the propagation chain: `slot.Label` → `MatrixRow.Label` (L0, here) → `KeyVizRow.Label` via `newKeyVizRowFrom` (L1, PR-D+E) → merged `KeyVizRow.Label` via `mergeRowInto` (L2, PR-D+E). Missing L0 silently truncates the chain and leaves `MatrixRow.Label = ""` even after PR-C+D+E wires the adapters to pass labels; L1 and L2 then faithfully copy the empty string and the feature collapses. With PR-B's flag-default-false (only legacy slots pre-created), `slot.Label` is `""` so the emitted row is `""`-labeled (identical to today's behavior), but the wiring is in place for PR-C+D+E to flip the flag and start emitting labels. `reclaimRetiredSlot` dedupes on `(RouteID, Label)` (forward-prep; only `Label=""` exists). The `MaxTrackedRoutes` coarsening check (`keyviz/sampler.go:416`) divides by `slotsPerRoute` (= `len(AllLabels)+1` when the flag is on, `1` when off; round-17 + round-23) so the cap continues to count **routes**, not slots; with the flag default to false in PR-B the divisor is 1 and behavior is identical to today (Claude round-24 fix to the prior "AllLabels empty in PR-B" framing — round-23 made AllLabels populated from PR-B, so the divisor=1 reason is the flag, not the array length). Update existing tests **and the coordinator call sites in `kv/sharded_coordinator.go`** (the only non-test caller of `Sampler.Observe`) to pass `label = ""`. |
615
623
| **PR-C+D+E** | **Bundled — must ship together.** PR-C (adapter wiring only — `AllLabels` constants are already declared in PR-B as part of the `[5]Label` array, see §9 Q2) and PR-D+E (wire format + pivot-key widening) are not separately shippable: once an adapter calls `Observe(..., LabelDynamo)`, the sampler emits multiple `MatrixRow` per route in a single column (one per label); without the pivot-key widening from PR-D+E, `pivotKeyVizColumns` and `matrixToProto` collapse those rows back into a single `RouteID`-keyed entry where each labeled row overwrites the previous one — non-deterministic data loss in the intermediate state. Earlier drafts framed PR-C and PR-D+E as separately shippable; that was wrong (Codex round-10 P1). The bundle covers: <br>**Adapter wiring (was PR-C)**: each adapter sets its label at the `ShardedCoordinator.Observe…` dispatch entry. The five canonical constants (`LabelDynamo`, `LabelRedis`, `LabelS3`, `LabelSQS`, `LabelRawKV`) are already declared in `keyviz/labels.go` from PR-B as part of the `[5]Label` array (round-23 — see §9 Q2 and the PR-B row above); PR-C+D+E only wires the adapters to pass them, no `AllLabels` modification needed (Claude round-24 fix to the prior "populate empty slice" framing). Extend `RegisterRoute` to pre-create the labeled siblings (one slot per `AllLabels` member) **only when `--keyvizLabelsEnabled=true`**; with the flag off, `RegisterRoute` continues to pre-create just the legacy empty-label slot, identical to today's behavior (Codex round-15-4th-pass P1). **MaxTrackedRoutes is unchanged in semantics** — the §4.1.1 coarsening check divides slot count by `slotsPerRoute` (`len(AllLabels)+1` when the flag is on, `1` when off), so the existing operator-set cap continues to mean "individually-tracked routes" exactly as before; **operators do not need to raise `--keyvizMaxTrackedRoutes`** (an earlier draft told operators to scale the cap by `× (len(labels)+1)` — that contradicted Option A; Codex round-6). Memory growth: `MaxTrackedRoutes × (len(AllLabels) + 1)` slots when labels are enabled, `MaxTrackedRoutes × 1` when off (today's baseline); documented but not capped. Emit `slog.Warn` from inside `RegisterRoute` when a route coarsens. <br>**Wire-format extension (was PR-D+E)**: proto + JSON `bucket_id` composite + optional `label` field, plus the SPA `route:N / label` rendering AND the **five** code changes across the three response paths: <br>**Single-node JSON path (`internal/admin/keyviz_handler.go`)**: <br>(a) `pivotKeyVizColumns` `rowsByID` map AND `order` slice both widen from `uint64` to the composite `BucketID string` — widening only the map without the `order` slice is a compile error; <br>(b) `newKeyVizRowFrom` (`keyviz_handler.go:368`) copies `mr.Label → row.Label` via an explicit `string(mr.Label)` cast (`MatrixRow.Label` is the typed `keyviz.Label`; `KeyVizRow.Label` is plain `string` for wire-format flexibility — Claude bot round-10 minor) — **first-level** Label copy, affects single-node and cluster deployments alike; <br>(c) `bucketIDFor` (`keyviz_handler.go:383`) returns the composite `"route:<id>:<label>"` when `mr.Label != ""`, falling back to the legacy `"route:<id>"` for empty labels — without this `BucketID` is non-unique and `applyKeyVizRowBudget` / `sortKeyVizRowsByStart` lose their deterministic tiebreak; <br>**Fan-out JSON path (`internal/admin/keyviz_fanout.go`)**: <br>(d) `mergeRowInto` (`keyviz_fanout.go:509`) adds `dst.Label = row.Label` — **second-level** Label copy, only the cluster fan-out path touches this; <br>**gRPC path (`adapter/admin_grpc.go`)**: <br>(e) `matrixToProto` (`admin_grpc.go:599`) and the per-row conversion it drives: (e1) **widen `rowsByID` (line 603) and `order` (line 604) from `uint64` to the composite `BucketID string` key** — same widening as item (a); without it `(routeID=1, label="dynamo")` and `(routeID=1, label="redis")` collapse to the same map entry; (e2) copy `MatrixRow.Label → KeyVizRow.label` (proto field 4) via `string(mr.Label)` cast (same typed→untyped reasoning as item (b)); (e3) emit composite `bucket_id` (`"route:<id>:<label>"`). Without (e1)–(e3), `GetKeyVizMatrix` gRPC clients receive collapsed unlabeled rows even though HTTP/SPA responses now show per-label rows. <br>All five copies are required; missing any one leaves a flavour of deployment with empty labels. Splitting the bundle into separate PR-C and PR-D+E was the original framing but is now rejected (see opening paragraph of this row). <br>**Operator-controlled rollout gate (rolling-upgrade safety)**: a normal rolling upgrade temporarily mixes nodes that emit legacy `route:<id>` rows with nodes that emit labeled `route:<id>:<label>` rows. The fan-out aggregator keys strictly by `BucketID` in `mergeRowInto`, so those rows do **not** merge — operators would see fragmented unlabeled-plus-labeled data per route until every node converges. **`--keyvizLabelsEnabled` is introduced in PR-B** (default `false`) so PR-B's behavior-neutrality claim (`RegisterRoute` skips labeled-sibling pre-creation, slot count stays at 1) holds without any PR-C+D+E dependency. PR-C+D+E adds the **coordinator override path**: when the flag is false, the **`ShardedCoordinator` overrides the adapter-supplied label to `keyviz.LabelLegacy` at the single `sampler.Observe(...)` call site** in `kv/sharded_coordinator.go` (one `if !s.keyvizLabelsEnabled { label = keyviz.LabelLegacy }` guard, not a 5-file flag-read duplication across adapters; Claude bot round-12 moderate). The override path only matters in PR-C+D+E because that is when adapters first start passing non-empty labels; in PR-B all adapters still pass `label = ""` so the override is a no-op. (Codex round-25 P2 caught the prior wording that said "PR-C+D+E adds the flag" — it implied PR-B couldn't enforce its own gating, when in fact PR-B introduces the flag for pre-creation gating and PR-C+D+E only adds the override path on top.) The bundled binary is therefore safe to roll out one node at a time — every node, mixed or fully upgraded, emits the legacy format. <br>**Flag-flip activation**: once the fleet is fully on the new binary, the operator flips `--keyvizLabelsEnabled=true`. The flag is a startup-only `flag.Bool` (no live-toggle / config-reload path; Claude bot round-12 minor); changing it requires a process restart. There are two restart strategies: (a) **simultaneous restart of all nodes** — KeyViz heatmap is briefly unavailable during the restart window but no mixed-format fragmentation occurs; (b) **rolling restart** — for the duration of the restart (typically minutes), the heatmap shows a transient mixed view because the legacy `route:N` rows from not-yet-restarted nodes don't merge with `route:N:label` rows from restarted nodes. Since KeyViz is a monitoring view (not a consistency-sensitive system), the rolling-restart fragmentation is acceptable and clears as the final node restarts; operators who want zero fragmentation should use the simultaneous restart. (Claude bot round-12 moderate.) <br>**Flag also gates pre-allocation**: `--keyvizLabelsEnabled` is **both** the traffic-routing toggle (override label to `LabelLegacy` at the coordinator) **and** the slot-pre-creation toggle. When `false`, `RegisterRoute` pre-creates only the legacy empty-label slot (`len(next.slots) += 1` per route, identical to today's behavior); labeled siblings are not allocated. When `true`, `RegisterRoute` pre-creates the legacy slot **and** one labeled sibling per `AllLabels` member (`len(next.slots) += len(AllLabels)+1` per route). Memory therefore stays at the today's level (`MaxTrackedRoutes × 1`) for clusters that deploy the bundled binary but leave the flag off, and grows to `MaxTrackedRoutes × (len(AllLabels)+1)` only when the operator opts in. The flag-flip activation requires a process restart (see preceding paragraph), and `RegisterRoute` re-runs at startup, so the new pre-creation regime is in effect immediately after the restart — there is no live re-allocation path needed. **An earlier draft separated memory and traffic-routing into two regimes ("memory-vs-flag separation", round-12) where labeled slots were pre-created regardless of the flag; that was wrong (Codex round-15-4th-pass P1) — it imposed a multi-x memory penalty on clusters that never enable labels, contradicting the §2.1 minimal-penalty goal.** (Codex round-11 P2 originated this rollout gate.) |
616
624
617
625
PR-B is independent of the bundled PR-C+D+E: PR-B widens types
618
626
and prepares the slot machinery without changing observable
619
-
behavior (legacy slots only, `AllLabels` is the populated
620
-
`[5]Label` array but `--keyvizLabelsEnabled` defaults to false
621
-
so `RegisterRoute` skips the labeled-sibling loop and the
622
-
`MaxTrackedRoutes` divisor stays at 1; round-23 array form, see
623
-
§9 Q2). The bundled PR-C+D+E then ships everything
627
+
behavior (legacy slots only — `RegisterRoute` in PR-B
628
+
unconditionally pre-creates only the legacy slot regardless of
629
+
`--keyvizLabelsEnabled`, since the flag-honoring pre-creation
630
+
loop lands in PR-C+D+E, not PR-B; per Codex round-26 P2 — see
631
+
the PR-B row above). `AllLabels` is the populated `[5]Label`
632
+
array from PR-B (round-23, see §9 Q2), but its only consumers in
633
+
PR-B are `RemoveRoute`'s `allLabelsWithLegacy()` teardown loop
634
+
(over an empty live-slot set under the unconditional PR-B
635
+
behavior) and the `TestAllLabels…` regression guards in §8.
636
+
`MaxTrackedRoutes` divisor stays at 1 in PR-B because per-route
637
+
slot count is 1. The bundled PR-C+D+E then ships everything
624
638
operator-visible — adapter wiring (constants are already
625
639
declared in `keyviz/labels.go` from PR-B; PR-C+D+E only wires
626
640
adapters to pass them at the dispatch entry, no change to
0 commit comments