Commit edaaa64

feat(admin): SPA degraded banner + conflict hatching for fan-out (Phase 2-C PR-2) (#687)
## Summary

Phase 2-C **PR-2**, per the design doc landed in #685 §6: wire the fan-out `Fanout` / `conflict` shapes shipped by PR #686 into the admin SPA so operators can see when a peer failed and which rows are soft due to a leadership flip.

Backend is unchanged. PR #686's `KeyVizMatrix.Fanout` block is currently emitted on every fan-out request but the SPA was discarding it.

## What this PR adds

| Piece | Behavior |
|---|---|
| **Header counter** | `cluster view (N of M nodes)` in the per-matrix metadata strip. Hidden when `fanout` is absent (single-node mode, no behavior change). |
| **FanoutBanner** | Degraded-mode card above the heatmap when `responded < expected`. Lists every failed node + error string. Hidden when the cluster is healthy. |
| **ConflictOverlay** | SVG hatch over rows whose `conflict === true`. Layered inside the scroll container so the hatch tracks horizontal scroll (same idiom as `TimeAxis`). 4 px diagonal stripes, alpha 0.45, inheriting `currentColor` so it works against both light and dark themes. |
| **RowDetail pill** | Hovering a conflicting row reveals a `conflict` pill with a hover-tooltip explaining the leadership-flip semantics. |

## Five-lens self-review

1. **Data loss**: n/a; SPA-only change reading existing fields.
2. **Concurrency / distributed**: n/a; single browser tab consuming an existing API.
3. **Performance**: Banner: zero-cost when healthy (`null` returned). Hatch: SVG with at most 1024 `<rect>` elements (matches `KeyVizRow` budget); empirically negligible. Overlay only renders when at least one row has `conflict === true`.
4. **Data consistency**: UI surfaces existing server signals; semantics come from PR #685 design §4.2 and #686 implementation. Not relitigated here.
5. **Test coverage**: `tsc -b --noEmit` and `vite build` clean. UI behavior (banner visibility, hatch presence, header text) is hard to assert in `tsc` alone; manual verification documented below.
## Test plan

- [x] `npm run lint` (`tsc -b --noEmit`): clean
- [x] `npm run build` (Vite): clean, output writes to `internal/admin/dist`
- [ ] Manual: `make run` with `--keyvizEnabled --keyvizFanoutNodes=127.0.0.1:8080`. Open `/keyviz`; header reads `cluster view (1 of 1 nodes)`, no banner, no hatch.
- [ ] Manual: configure `--keyvizFanoutNodes=127.0.0.1:8080,127.0.0.1:9999` (port 9999 unused). Header reads `1 of 2 nodes`, the failed-node banner appears with `connection refused` for `127.0.0.1:9999`.
- [ ] Manual: smoke a leadership flip mid-window (or fake `conflict=true` server-side via `/admin/api/v1/keyviz/matrix?...`); the affected row renders with diagonal hatch overlay; hovering reveals the `conflict` pill.

## Out of scope

- **Per-cell hatching**: Phase 2-C+ once the proto extension lands and `conflict` becomes per-cell instead of per-row.
- **Per-node `generated_at` skew indicator**: design §9 Q3, deferred.
- **Banner auto-dismiss / sound notification**: out of scope; an operator who has the page open will see the banner. CI alert pages are a separate system.

Closes the SPA half of Phase 2-C; the proto extension follows in PR-3.

<!-- This is an auto-generated comment: release notes by coderabbit.ai -->

## Summary by CodeRabbit

* **New Features**
  * KeyViz now displays cluster fan-out status with node availability and responsiveness counts.
  * Added visual indicators for rows experiencing conflicts during leadership transitions.
  * Row details now include conflict status labels with informative tooltips.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
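
The header text and banner visibility exercised by the manual checks above can be sketched as a pure function over the `fanout` block. This is only an illustration: the two interfaces mirror the ones added to `web/admin/src/api/client.ts` in this commit, while `summarizeFanout` and `FanoutSummary` are hypothetical names that do not exist in the change.

```typescript
// Mirrors KeyVizFanoutNode / KeyVizFanoutResult from this commit.
interface KeyVizFanoutNode {
  node: string;
  ok: boolean;
  error?: string;
}

interface KeyVizFanoutResult {
  nodes: KeyVizFanoutNode[];
  responded: number;
  expected: number;
}

// Hypothetical summary shape, introduced only for this sketch.
interface FanoutSummary {
  headerText: string | null; // null when fan-out is disabled (single-node mode)
  degraded: boolean; // true when the banner would be shown
  failed: KeyVizFanoutNode[]; // nodes the banner would list
}

// Hypothetical helper: derives what the header and banner display.
function summarizeFanout(fanout?: KeyVizFanoutResult): FanoutSummary {
  if (!fanout) return { headerText: null, degraded: false, failed: [] };
  const degraded = fanout.responded < fanout.expected;
  return {
    headerText: `cluster view (${fanout.responded} of ${fanout.expected} nodes)`,
    degraded,
    failed: degraded ? fanout.nodes.filter((n) => !n.ok) : [],
  };
}

// Second manual scenario from the test plan: one peer on an unused port.
const degradedExample = summarizeFanout({
  nodes: [
    { node: "127.0.0.1:8080", ok: true },
    { node: "127.0.0.1:9999", ok: false, error: "connection refused" },
  ],
  responded: 1,
  expected: 2,
});
console.log(degradedExample.headerText); // prints "cluster view (1 of 2 nodes)"
```

Keeping the derivation separate from the JSX would make the "banner hidden when healthy" rule assertable without a browser, which is the gap the self-review notes under test coverage.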
2 parents 4f3d061 + a3b3be7 commit edaaa64

2 files changed

Lines changed: 151 additions & 3 deletions

web/admin/src/api/client.ts

Lines changed: 28 additions & 0 deletions
@@ -231,13 +231,41 @@ export interface KeyVizRow {
   route_ids_truncated?: boolean;
   route_count: number;
   values: number[];
+  // Phase 2-C row-level conflict flag (always present on the wire,
+  // defaults to false). True when ≥2 nodes reported a non-zero
+  // value for the same cell — typically a leadership flip mid-
+  // window. Per design 4.2 / PR-1, this is a row-level signal; it
+  // moves to per-cell when the proto extension lands in 2-C+.
+  conflict?: boolean;
+}
+
+// Per-node entry in the KeyVizMatrix.fanout block. OK=true means
+// the node returned a parseable matrix; OK=false carries the
+// reason (timeout, refused, 5xx body, JSON decode failure). Self
+// always reports ok=true since the local matrix is computed in-
+// process.
+export interface KeyVizFanoutNode {
+  node: string;
+  ok: boolean;
+  error?: string;
+}
+
+// Per-response fan-out summary attached to KeyVizMatrix.fanout
+// when the operator configured --keyvizFanoutNodes. Absent when
+// fan-out is disabled. nodes is ordered self-first, then in
+// --keyvizFanoutNodes order. Responded counts ok=true entries.
+export interface KeyVizFanoutResult {
+  nodes: KeyVizFanoutNode[];
+  responded: number;
+  expected: number;
 }
 
 export interface KeyVizMatrix {
   column_unix_ms: number[];
   rows: KeyVizRow[];
   series: KeyVizSeries;
   generated_at: string;
+  fanout?: KeyVizFanoutResult;
 }
 
 export interface KeyVizParams {

web/admin/src/pages/KeyViz.tsx

Lines changed: 123 additions & 3 deletions
@@ -1,5 +1,5 @@
 import { useEffect, useMemo, useRef, useState } from "react";
-import type { KeyVizMatrix, KeyVizRow, KeyVizSeries } from "../api/client";
+import type { KeyVizFanoutResult, KeyVizMatrix, KeyVizRow, KeyVizSeries } from "../api/client";
 import { api } from "../api/client";
 import { ramp } from "../lib/colorRamp";
 import { formatApiError, useApiQuery } from "../lib/useApi";
@@ -56,6 +56,8 @@ export function KeyVizPage() {
         </div>
       </header>
 
+      {matrix.data?.fanout && <FanoutBanner fanout={matrix.data.fanout} />}
+
       <section className="card">
         {matrix.loading && !matrix.data && (
           <div className="text-sm text-muted">Loading…</div>
@@ -196,6 +198,14 @@ function Heatmap({ matrix }: HeatmapProps) {
             {matrix.rows.length} rows × {matrix.column_unix_ms.length} columns ·
             series <code className="font-mono">{matrix.series}</code> · max ={" "}
             {maxValue.toLocaleString()}
+            {matrix.fanout && (
+              <>
+                {" · "}
+                <span>
+                  cluster view ({matrix.fanout.responded} of {matrix.fanout.expected} nodes)
+                </span>
+              </>
+            )}
           </span>
           <span>{new Date(matrix.generated_at).toLocaleString()}</span>
         </div>
@@ -209,13 +219,14 @@ function Heatmap({ matrix }: HeatmapProps) {
           // canvas as the user scrolls horizontally. Putting it outside
           // would freeze the labels under the left edge whenever the
           // canvas overflows.
-          <div className="overflow-auto border border-border rounded">
+          <div className="overflow-auto border border-border rounded relative">
             <canvas
               ref={canvasRef}
               onMouseMove={onMove}
               onMouseLeave={onLeave}
               style={{ display: "block", width, height }}
             />
+            <ConflictOverlay rows={matrix.rows} cellH={cellH} width={width} />
             <TimeAxis columnUnixMs={matrix.column_unix_ms} cellW={cellW} />
           </div>
         )}
@@ -226,6 +237,107 @@ function Heatmap({ matrix }: HeatmapProps) {
   );
 }
 
+// FanoutBanner renders the degraded-mode strip above the heatmap when
+// at least one peer failed to respond. Hidden when responded ===
+// expected so a healthy cluster keeps the page clean. Lists every
+// failed node with its error so operators can debug without checking
+// per-node logs.
+interface FanoutBannerProps {
+  fanout: KeyVizFanoutResult;
+}
+
+function FanoutBanner({ fanout }: FanoutBannerProps) {
+  if (fanout.responded >= fanout.expected) return null;
+  const failed = fanout.nodes.filter((n) => !n.ok);
+  return (
+    <div className="card border-danger/40">
+      <div className="text-sm font-semibold text-danger mb-2">
+        Cluster view degraded — {fanout.responded} of {fanout.expected} nodes responded
+      </div>
+      <ul className="text-xs text-muted space-y-0.5">
+        {failed.map((n) => (
+          <li key={n.node} className="font-mono">
+            <span className="text-danger">×</span> {n.node}
+            {n.error && <span className="text-muted">{n.error}</span>}
+          </li>
+        ))}
+      </ul>
+      <p className="text-xs text-muted mt-2">
+        The heatmap reflects the {fanout.responded} responding node{fanout.responded === 1 ? "" : "s"} only;
+        traffic served by the failed peers is not shown until they recover.
+      </p>
+    </div>
+  );
+}
+
+// ConflictOverlay layers a thin striped pattern over rows whose
+// merge produced disagreeing per-node values (Phase 2-C row-level
+// conflict flag). Per design 4.2, this is the SPA's signal that the
+// row's totals are best-effort dedup during a leadership flip and
+// may understate the true window. The overlay is an SVG layered
+// inside the scroll container so it tracks the canvas's scroll
+// position (same idiom as TimeAxis).
+//
+// Patterns rather than colour because the underlying canvas already
+// uses colour for intensity; a hatch communicates "soft data here"
+// without competing with the heatmap signal. SVG sized to (width,
+// rows.length × cellH) mirrors the canvas exactly.
+interface ConflictOverlayProps {
+  rows: KeyVizRow[];
+  cellH: number;
+  width: number;
+}
+
+function ConflictOverlay({ rows, cellH, width }: ConflictOverlayProps) {
+  const conflictRows = useMemo(() => {
+    const out: number[] = [];
+    for (let i = 0; i < rows.length; i++) {
+      if (rows[i].conflict) out.push(i);
+    }
+    return out;
+  }, [rows]);
+  if (conflictRows.length === 0) return null;
+  const totalH = rows.length * cellH;
+  return (
+    <svg
+      width={width}
+      height={totalH}
+      style={{
+        position: "absolute",
+        top: 0,
+        left: 0,
+        pointerEvents: "none",
+      }}
+      aria-hidden="true"
+    >
+      <defs>
+        {/* Diagonal hatch — 4px stride, 1px lines. The stroke uses
+            currentColor so the overlay inherits the text colour and
+            stays visible against both light and dark themes. */}
+        <pattern
+          id="keyviz-conflict-hatch"
+          width={4}
+          height={4}
+          patternUnits="userSpaceOnUse"
+          patternTransform="rotate(45)"
+        >
+          <line x1={0} y1={0} x2={0} y2={4} stroke="currentColor" strokeWidth={1} opacity={0.45} />
+        </pattern>
+      </defs>
+      {conflictRows.map((i) => (
+        <rect
+          key={i}
+          x={0}
+          y={i * cellH}
+          width={width}
+          height={cellH}
+          fill="url(#keyviz-conflict-hatch)"
+        />
+      ))}
+    </svg>
+  );
+}
+
 interface TimeAxisProps {
   columnUnixMs: number[];
   cellW: number;
@@ -284,10 +396,18 @@ function RowDetail({ row, index }: RowDetailProps) {
   const total = row.values.reduce((a, b) => a + b, 0);
   return (
     <div className="card text-sm">
-      <div className="flex items-center gap-2 mb-2">
+      <div className="flex items-center gap-2 mb-2 flex-wrap">
         <span className="text-xs text-muted">Row {index}</span>
         <span className="font-mono">{row.bucket_id}</span>
         {row.aggregate && <span className="pill-muted text-xs">aggregate</span>}
+        {row.conflict && (
+          <span
+            className="pill-muted text-xs"
+            title="Two or more nodes reported a non-zero value for the same cell — typical during a leadership flip mid-window. Displayed totals may understate the true count by up to one leader's pre-transfer slice."
+          >
+            conflict
+          </span>
+        )}
       </div>
       <dl className="grid grid-cols-2 gap-x-4 gap-y-1 text-xs">
         <dt className="text-muted">Start</dt>
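
The geometry that `ConflictOverlay` computes in the hunk above (one full-width rectangle per conflicting row, offset by the row index times `cellH`) can be sketched without React or SVG. This is an assumption-labelled illustration: `conflictRects`, `RowLike`, and `HatchRect` are hypothetical names that are not part of the commit, and the 6 px cell height is an arbitrary example value.

```typescript
// Minimal stand-in for KeyVizRow: only the field the overlay reads.
interface RowLike {
  conflict?: boolean;
}

// Hypothetical rectangle shape, matching the <rect> attributes in the diff.
interface HatchRect {
  x: number;
  y: number;
  width: number;
  height: number;
}

// Hypothetical helper: returns one rectangle per row whose conflict flag is
// true, positioned the same way the component positions its <rect> elements.
function conflictRects(rows: RowLike[], cellH: number, width: number): HatchRect[] {
  const rects: HatchRect[] = [];
  rows.forEach((row, i) => {
    if (row.conflict) {
      rects.push({ x: 0, y: i * cellH, width, height: cellH });
    }
  });
  return rects;
}

// Rows 1 and 3 are flagged; a missing flag is treated as "no conflict".
const exampleRects = conflictRects(
  [{}, { conflict: true }, { conflict: false }, { conflict: true }],
  6,
  100,
);
console.log(exampleRects.map((r) => r.y)); // prints [ 6, 18 ]
```

An empty result corresponds to the component's early `return null`, so a matrix with no conflicting rows adds no SVG to the page.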
