Downsample and throttle timeseries plots by SimonHeybrock · Pull Request #942 · scipp/esslivedata

SimonHeybrock · 2026-05-26T08:20:28Z

Summary

Closes #940.

Long-running timeseries plots are laggy because every Kafka delta rebuilds the full hv.Curve from the entire buffered history and ships it through pipe.send. The timeseries plotter now exposes three time-based knobs — Period, Recent Window, Floor Period — that together bound the displayed point count and throttle plot updates to the chosen period.

Defaults (Period 1 s, Recent Window 1 h, Floor Period 5 min) keep a multi-day 1 Hz run at roughly 4 000 points indefinitely, regardless of buffer size.
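
The point budget can be checked with simple arithmetic. The sketch below is not code from this PR: `displayed_points` is a hypothetical helper that assumes one kept sample per Period bucket inside the Recent Window and one per Floor Period bucket for everything older, and it ignores the quantization of the window cutoff.

```python
def displayed_points(run_seconds, period=1.0, recent=3600.0, floor_period=300.0):
    """Rough estimate of displayed points for a run of the given length.

    Assumption (not taken from the PR code): one kept sample per period
    bucket in the recent window, one per floor-period bucket for older data.
    A floor period of zero is treated as "show the recent window only".
    """
    recent_span = min(run_seconds, recent)
    older_span = max(run_seconds - recent, 0.0)
    recent_points = int(recent_span // period)
    older_points = int(older_span // floor_period) if floor_period > 0 else 0
    return recent_points + older_points


DAY = 24 * 3600
one_day = displayed_points(DAY)          # 3600 recent + 276 older
three_days = displayed_points(3 * DAY)   # 3600 recent + 852 older
window_only = displayed_points(3 * DAY, floor_period=0.0)
print(one_day, three_days, window_only)
```

Under these assumptions the floor band adds 288 points per day at the default 5 min Floor Period, so the total grows slowly compared with the 86 400 raw samples a 1 Hz source produces per day.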

Setting Floor Period to zero yields a moving-window display, for example the last hour.

Example

[Image: plot showing the transition from the downsampled floor rate to the downsampled rate in the recent window]

Design notes

- Downsampling lives at the plotter, not the extractor. The subscription stays a plain full-history pull; per-plot config can change without re-subscribing.
- The throttle short-circuits compute() when no new data has crossed Period, which means no autoscaler run, no hv.Curve build, no _set_cached_state, no presenter dirty bit, and crucially no pipe.send / Bokeh patch / WebSocket flush / browser repaint.
- The extractor's _to_local_datetime O(N) work is not skipped; that would need a gate in DataService. The skipped portion is the dominant cost (~100 ms pipe.send at N=100 k per the issue's measurements).
- Knobs are time-based, not point-based: users reason in time, not in budgets. max_points was considered and dropped.
- Bucket boundaries are anchored to the epoch (fixed time grid), so kept samples don't slide as new data arrives. The recent-window cutoff is quantized to the floor period; actual recent length runs from recent_seconds to recent_seconds + floor_period_seconds (soft lower bound).
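
The two-band, epoch-anchored scheme described in these notes can be sketched in plain NumPy. This is an illustration only, not the PR's scipp implementation: `downsample_keep_mask` is a hypothetical name, timestamps are assumed to be sorted int64 nanoseconds, and keeping the first sample of each bucket is an assumption the PR text does not confirm.

```python
import numpy as np

SEC = 1_000_000_000  # nanoseconds per second


def downsample_keep_mask(t_ns, period_ns, recent_ns, floor_period_ns):
    """Boolean keep-mask for sorted int64 nanosecond timestamps."""
    t_ns = np.asarray(t_ns, dtype=np.int64)
    if t_ns.size == 0:
        return np.zeros(0, dtype=bool)
    cutoff = t_ns[-1] - recent_ns
    if floor_period_ns > 0:
        # Quantize the cutoff down to the floor grid, so band membership only
        # changes when the cutoff crosses a floor-period boundary. The recent
        # band then spans between recent_ns and recent_ns + floor_period_ns.
        cutoff = (cutoff // floor_period_ns) * floor_period_ns
    in_recent = t_ns >= cutoff
    # Per-sample period: fine inside the recent band, coarse outside. The
    # max(..., 1) only avoids division by zero when the floor band is disabled.
    per_sample_period = np.where(in_recent, period_ns, max(floor_period_ns, 1))
    bucket = t_ns // per_sample_period  # bucket ids on a fixed, epoch-anchored grid
    # Keep the first sample of every bucket (assumption) and of every band.
    keep = np.ones(t_ns.size, dtype=bool)
    keep[1:] = (bucket[1:] != bucket[:-1]) | (in_recent[1:] != in_recent[:-1])
    if floor_period_ns <= 0:
        keep &= in_recent  # floor period zero: moving window only
    return keep


# Three hours of 1 Hz data starting at the epoch, with the default knobs.
t = np.arange(3 * 3600, dtype=np.int64) * SEC
mask = downsample_keep_mask(t, 1 * SEC, 3600 * SEC, 300 * SEC)
kept = int(mask.sum())
window_only = int(downsample_keep_mask(t, 1 * SEC, 3600 * SEC, 0).sum())
print(kept, window_only)
```

In this example the quantized cutoff lands at 6900 s, so the recent band holds 3900 samples (within the stated range of one hour to one hour plus one floor period) and the floor band contributes 23 more.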

🤖 Generated with Claude Code

Long-running timeseries (~100k points after a day at 1 Hz) make the dashboard sluggish because the full hv.Curve is rebuilt and shipped through pipe.send on every tick. The timeseries plotter now exposes three time-based knobs - Period, Recent Window, Floor Period - that together bound point count and throttle plot updates to the chosen period. Defaults keep a multi-day 1 Hz run near ~4000 points indefinitely. Downsampling happens at the plotter (not the extractor) so the subscription remains a simple full-history pull and the per-plot config can change without re-subscribing. The throttle short-circuits compute() when no new data has crossed Period, which skips the autoscaler, hv.Curve build, pipe.send, and downstream browser repaint.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
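
The throttle can be pictured as an early return in front of the expensive work. The class below is a hypothetical stand-in, not the PR's LinePlotter, and it assumes one plausible reading of "no new data has crossed Period": at least one Period of data time has elapsed since the last rendered sample.

```python
class ThrottledPlotter:
    """Hypothetical stand-in for a plotter whose compute() is throttled."""

    def __init__(self, period_ns):
        self.period_ns = period_ns
        self._last_rendered_ns = None
        self.render_count = 0

    def compute(self, latest_time_ns):
        """Return True if a render happened, False if it was short-circuited."""
        if (
            self._last_rendered_ns is not None
            and latest_time_ns - self._last_rendered_ns < self.period_ns
        ):
            # Short-circuit: nothing downstream runs (in the real plotter this
            # is where curve building and pipe.send would be skipped).
            return False
        self._last_rendered_ns = latest_time_ns
        self.render_count += 1  # stands in for the expensive render path
        return True


# Five seconds of 10 Hz updates with a 1 s period.
plotter = ThrottledPlotter(period_ns=1_000_000_000)
for i in range(50):
    plotter.compute(i * 100_000_000)
print(plotter.render_count)
```

Of the 50 updates only those at 0 s, 1 s, 2 s, 3 s and 4 s reach the render path; the rest return immediately.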

Plan doc moved into code: the "where logic lives" rationale becomes a docstring on LinePlotter.from_timeseries_params, and the throttle semantics (what compute() skips on a short-circuit) become a docstring on the compute() override. The downsample_timeseries module docstring no longer references the deleted plan.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

FullHistoryExtractor guarantees a datetime64 time coord by the time data reaches LinePlotter.compute(), so the int64-with-time-unit fallback in _to_int64_ns and _latest_time_ns was dead code that silently masked any upstream regression. The helper _to_int64_ns is removed and its cast inlined; _latest_time_ns now assumes datetime64 directly.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

FullHistoryExtractor guarantees a non-empty datetime64 time coord on every DataArray it produces, so the per-key dim/coord/size guards in _latest_time_ns were redundant.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

The previous bucket-first scheme anchored buckets at the band's first sample and force-included the very last index as a special case to keep the curve's tail aligned with the lag indicator. Anchoring buckets at the band's last sample instead makes bucket 0 contain the latest by construction, so the special case disappears.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

scipp.DataArray supports `data[dim, np_indices]` directly, which slices the data values, dim-aligned coords, and variances in one step and leaves scalar coords untouched. The hand-rolled `_select_indices` helper that rebuilt the DataArray field-by-field reinvented this, so it is removed.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Replaces the numpy datetime64->int64 cast and per-band index arithmetic with scipp operations: `.to(unit='ns')` for the coord, scipp arithmetic on datetime/timedelta variables for the bucket math, and a single boolean keep-mask via `sc.where` selecting per-sample band reference and period. The final selection uses scipp boolean indexing rather than fancy-int indexing of concatenated index arrays.

Benchmarked on 1k-1M point inputs with typical dashboard parameters (recent=1h@1s, floor=5min):

- n>=10k: 1.2-1.4x faster (1M points: 13ms -> 9ms)
- n=1k: slower in absolute terms (~200us overhead) but still sub-ms

Outputs are identical to the previous implementation at all sizes.

The previous design end-anchored buckets at each band's latest sample. That made bucket IDs of all existing samples shift by +1 every tick, so although the keep *pattern* was preserved, the actual kept samples slid one position forward with every update.

Anchoring buckets to the epoch instead gives a fixed time grid: kept samples sit on absolute time-quanta and don't move as new samples arrive. The recent-band cutoff is quantized to the floor period so band membership is stable between quantum crossings; the actual recent length is now `recent_seconds` to `recent_seconds + floor_period_seconds` (soft lower bound). At each crossing one floor period of samples retires from the recent band as a batch.

Side benefits: the `latest_older` lookup and per-band reference time disappear. At 1M points the function drops from ~10 ms to ~4.6 ms (2.3x); cumulative ~2.8x over the original numpy implementation.
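
The difference between the two anchoring schemes shows up in a small NumPy experiment. This is a sketch, not the PR's code: the helper names are hypothetical, timestamps are plain integer seconds, and the first sample of each bucket is the one kept (an assumption).

```python
import numpy as np


def _first_per_bucket(t, bucket):
    """Return the set of timestamps that start a new bucket."""
    keep = np.ones(t.size, dtype=bool)
    keep[1:] = bucket[1:] != bucket[:-1]
    return set(t[keep].tolist())


def kept_end_anchored(t, period):
    # Buckets are counted backwards from the latest sample.
    return _first_per_bucket(t, (t[-1] - t) // period)


def kept_epoch_anchored(t, period):
    # Buckets sit on a fixed grid that does not depend on the latest sample.
    return _first_per_bucket(t, t // period)


before = np.arange(20)  # samples at t = 0..19 s
after = np.arange(21)   # one new sample arrives at t = 20 s

end_before = kept_end_anchored(before, 5)
end_after = kept_end_anchored(after, 5)
epoch_before = kept_epoch_anchored(before, 5)
epoch_after = kept_epoch_anchored(after, 5)

print(sorted(end_before), sorted(end_after))
print(sorted(epoch_before), sorted(epoch_after))
```

With end anchoring a single new sample replaces most of the kept set, whereas with epoch anchoring the previously kept samples all survive and only the new bucket's sample is appended.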

SimonHeybrock and others added 6 commits May 26, 2026 08:19

Trust the timeseries data contract in _latest_time_ns

3a4c0a8


This was referenced May 26, 2026

Gate Plotter.compute on active-consumer interest #944

Closed

Gate plotter.compute on viewer interest #946

Open

SimonHeybrock added 2 commits May 27, 2026 08:18

SimonHeybrock marked this pull request as ready for review May 27, 2026 08:55

SimonHeybrock requested a review from nvaytet May 27, 2026 08:55

Merge branch 'main' into 940-timeseries-downsample

f3b2430

SimonHeybrock wants to merge 9 commits into main from 940-timeseries-downsample
