console: push cluster filter down in replica utilization history query by jubrad · Pull Request #37323 · MaterializeInc/materialize

jubrad · 2026-06-27T02:57:59Z

The cluster-detail page recomputes the whole-fleet replica-utilization rollup on every load and only filters to the viewed cluster at the very end. Because the cluster_id predicate enters at the final join, the optimizer can't push it into the shared replica_utilization_history_binned CTE that the five Top-1 argmax CTEs all read — so the metrics aggregate and all five Top-1 passes run over every replica in the deployment before the last step discards other clusters.

This pushes the filter to the front:

- filter replica_history to the cluster in both UNION branches, so the predicate reaches the source reads instead of the final join;
- restrict the offline-event scan to that replica set (a full status-history scan becomes a lookup);
- drop the redundant replica_history re-join in the binned CTE (the rollup is already derived from it, and it could fan out on a replica that ever had two sizes).
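
The shape of the first two changes can be sketched on a toy schema. This is a minimal illustration, not the console's actual query: the tables `live_replicas`, `dropped_replicas`, and `metrics` and their columns are hypothetical stand-ins, and SQLite is used only to make the sketch runnable. It says nothing about how Materialize's optimizer plans either form.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, simplified schema; not the real Materialize catalog tables.
conn.executescript("""
    CREATE TABLE live_replicas    (replica_id TEXT, cluster_id TEXT);
    CREATE TABLE dropped_replicas (replica_id TEXT, cluster_id TEXT);
    CREATE TABLE metrics          (replica_id TEXT, bin INTEGER, cpu REAL);
    INSERT INTO live_replicas    VALUES ('r1', 'c1'), ('r2', 'c2');
    INSERT INTO dropped_replicas VALUES ('r3', 'c1'), ('r4', 'c3');
    INSERT INTO metrics VALUES
        ('r1', 0, 0.25), ('r1', 0, 0.75), ('r2', 0, 0.5),
        ('r3', 0, 0.5),  ('r4', 0, 1.0);
""")

# Before: the rollup covers every replica; the cluster filter is applied last.
LATE_FILTER = """
WITH replica_history AS (
    SELECT replica_id, cluster_id FROM live_replicas
    UNION
    SELECT replica_id, cluster_id FROM dropped_replicas
),
binned AS (
    SELECT m.replica_id, m.bin, MAX(m.cpu) AS max_cpu
    FROM metrics m
    GROUP BY m.replica_id, m.bin
)
SELECT b.replica_id, b.bin, b.max_cpu
FROM binned b
JOIN replica_history h ON h.replica_id = b.replica_id
WHERE h.cluster_id = ?
ORDER BY b.replica_id, b.bin
"""

# After: both UNION branches are filtered, and the rollup only reads
# metrics for the replicas that survive that filter.
EARLY_FILTER = """
WITH replica_history AS (
    SELECT replica_id, cluster_id FROM live_replicas WHERE cluster_id = ?
    UNION
    SELECT replica_id, cluster_id FROM dropped_replicas WHERE cluster_id = ?
),
binned AS (
    SELECT m.replica_id, m.bin, MAX(m.cpu) AS max_cpu
    FROM metrics m
    WHERE m.replica_id IN (SELECT replica_id FROM replica_history)
    GROUP BY m.replica_id, m.bin
)
SELECT replica_id, bin, max_cpu FROM binned ORDER BY replica_id, bin
"""

late = conn.execute(LATE_FILTER, ("c1",)).fetchall()
early = conn.execute(EARLY_FILTER, ("c1", "c1")).fetchall()
print(late == early, early)
```

Both forms return the same rows for cluster `c1`; the difference is that the second never aggregates metrics for `r2` or `r4`.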

Output is unchanged — verified row-for-row against the original query on synthetic fleets. Only execution differs: EXPLAIN shows the cluster_id filter pushed down to the replica_history source reads, with the five Top-1 passes now scoped to one cluster.
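
A row-for-row check of this kind can be sketched as a multiset comparison, so that duplicated rows are reported as well as missing ones. The helper `diff_rows` and the toy tables below are hypothetical and are not the harness used for this PR. The two-size replica case is included only to show what the check reports when a join fans out; it is not a result from the PR.

```python
import sqlite3
from collections import Counter

def diff_rows(conn, old_sql, old_params, new_sql, new_params):
    """Return (rows only in old, rows only in new), counting duplicates."""
    old = Counter(conn.execute(old_sql, old_params).fetchall())
    new = Counter(conn.execute(new_sql, new_params).fetchall())
    return old - new, new - old

conn = sqlite3.connect(":memory:")
# Hypothetical toy data: r1 has two history rows because it was resized once.
conn.executescript("""
    CREATE TABLE replica_history (replica_id TEXT, cluster_id TEXT, size TEXT);
    CREATE TABLE binned (replica_id TEXT, bin INTEGER, max_cpu REAL);
    INSERT INTO replica_history VALUES
        ('r1', 'c1', 'small'), ('r1', 'c1', 'large'), ('r2', 'c2', 'small');
    INSERT INTO binned VALUES ('r1', 0, 0.75), ('r2', 0, 0.5);
""")

# Joining back to the history emits one output row per matching history row.
WITH_REJOIN = """
SELECT b.replica_id, b.bin, b.max_cpu
FROM binned b
JOIN replica_history h ON h.replica_id = b.replica_id
WHERE h.cluster_id = ?
"""

# A semi-join keeps each binned row at most once.
WITHOUT_REJOIN = """
SELECT b.replica_id, b.bin, b.max_cpu
FROM binned b
WHERE b.replica_id IN (
    SELECT replica_id FROM replica_history WHERE cluster_id = ?
)
"""

c1 = diff_rows(conn, WITH_REJOIN, ("c1",), WITHOUT_REJOIN, ("c1",))
c2 = diff_rows(conn, WITH_REJOIN, ("c2",), WITHOUT_REJOIN, ("c2",))
print("c1:", c1)
print("c2:", c2)
```

Two empty counters mean the queries agree row for row. A plain set comparison would miss the extra copy of the `r1` row that the re-join produces in this toy data.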

Console-only; no catalog or index changes. On a synthetic single-worker fleet (matching mz_catalog_server) the per-cluster page-load p50 drops ~8–15× and, unlike before, no longer grows with total fleet size:

| clusters | old p50 (1 viewer) | new p50 (1 viewer) | speedup |
|---|---|---|---|
| 25 | 3.7 s | 0.48 s | 7.7× |
| 100 | 14.8 s | 1.2 s | 12.4× |
| 200 | 29.3 s | 2.2 s | 13.0× |

The gain widens with concurrent viewers and with fleet size. In production the win is larger still: these synthetic tables are unindexed, whereas mz_cluster_replica_metrics_history / mz_cluster_replica_status_history are indexed on replica_id, so the pushed-down replica set turns the residual full scans into lookups.

The cluster-detail page recomputes the whole-fleet utilization rollup on every load and only filters to the viewed cluster at the very end. Because the cluster predicate enters at the final join, the optimizer can't push it into the shared replica_utilization_history_binned CTE that the five Top-1 argmax CTEs all read, so the metrics aggregate and all five Top-1 passes run over every replica in the deployment before the last step discards other clusters.

Three changes scope the heavy work to the requested cluster(s):

- filter replica_history to the cluster up front (both UNION branches), so the cluster predicate reaches the source reads instead of the final join;
- restrict the offline-event scan to that replica set, turning a full status-history scan into a lookup;
- drop the redundant replica_history re-join in the binned CTE (the rollup is already derived from it, and it could fan out on a replica that ever had two sizes).

Output is unchanged (verified row-for-row on synthetic fleets); only execution differs. On a synthetic fleet the per-cluster page-load latency drops ~8-15x and, unlike before, no longer grows with total fleet size.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

jubrad force-pushed the console-replica-utilization-pushdown branch from bfef2b5 to f55904e Compare June 27, 2026 03:31

jubrad wants to merge 1 commit into MaterializeInc:main from jubrad:console-replica-utilization-pushdown
