Gate plotter.compute on viewer interest#946
Open
SimonHeybrock wants to merge 4 commits into
Open
Conversation
Alternative shape to #944: same lazy-compute semantics, but the gate lives on LayerStateMachine instead of inside Plotter. Plotter stays a pure compute artifact -- no set_active, no compute/_build split, no internal locking. Every Plotter subclass, static/correlation special-cases, and ~30 plotter test fixtures are left untouched. LayerStateMachine gains a token set, pending input stash, and gate lock. stash_pending() and set_active() return a ComputeTask when a build should run now; callers (PlotOrchestrator._run_compute and the new activate_layer entry point) invoke the task outside any lock. The 0->1 token transition flushes synchronously on the polling thread so the same poll pass sees fresh has_cached_state. Plotter replacement via job_started clears the stash so old input cannot be flushed through a new plotter. The polling loop in plot_grid_tabs now drives orchestrator.activate_layer(layer_id, session_layer, is_active) once per (grid, layer); orphan and shutdown paths release tokens. Test migration is one helper update (add_cell_with_layer activates the new layer) plus two inline activate_layer calls in tests that bypass the helper. Plotter unit tests are unchanged. 8 new gate tests cover stash-without-token, 0->1 flush, intermediate-update collapse, multi-token ref counting, plotter replacement, and pre-job_started stash behaviour.
The gate spans two threads by design: stash_pending runs on the bg ingestion thread, set_active on the per-session polling thread. The 0→1 flush deliberately runs on the polling thread so the same poll pass's component rebuild observes fresh has_cached_state.
Attach a weakref.finalize when a token is first acquired, so the gate releases it if the caller (typically a SessionLayer) is garbage-collected without an explicit set_active(False). Explicit release remains the fast path; the finalizer is belt-and-braces against missed cleanup (e.g., a session disposed without PlotGridTabs.shutdown running) and removes the id()-reuse footgun where a stale int could be discarded by an unrelated caller.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Motivation
plotter.compute()runs on every Kafka delta for every layer, regardless of whether anyone is watching. Hidden tabs (PanelTabs(dynamic=True)), backgrounded sessions, and a running backend with no browser attached all still pay the full build cost on every update. For expensive plotters this can consume significant CPU, especially when many plot tabs are used. The motivating trigger for this change was #942, which adds downsampling with significant cost to timeseries plotting.Approach
Add an interest gate on
LayerStateMachine. A layer's plotter only builds when at least one viewer holds an interest token; otherwise the latest Kafka input is stashed and collapsed (last-writer-wins) until a viewer arrives. On a 0→1 token transition with pending input, the build runs synchronously so the same polling pass's component rebuild sees freshhas_cached_state.Tokens are ref-counted via
LayerStateMachine.set_active(token, active). The polling loop drivesPlotOrchestrator.activate_layer(layer_id, session_layer, is_active)per (grid, layer): the currently-visible grid's layers hold tokens; everything else is released. Cleanup of dropped sessions is automatic viaweakref.finalize, so a session disposed withoutPlotGridTabs.shutdownrunning cannot leak interest.Visibility is orthogonal to the layer FSM. It is per-(session, layer) — interest is held by SessionLayer instances, not by the layer itself — so it cannot fit into the layer-singular
WAITING_FOR_JOB → … → READYchain. The gate lives onLayerStateMachinebecause that object already owns the plotter reference and is shared across sessions, but it is decoupled from the lifecycle states.Threading
stash_pendingruns on the bg ingestion thread (data-subscriber callback).set_activeruns on the per-session polling thread (Bokeh main).ComputeTaskwhen a build is due; the caller invokestask.run()on its own thread, outside the gate lock.has_cached_stateand renders without waiting another tick.Notes on the design
ComputeTaskis a frozen dataclass returned from gate operations so callers run the build outside the lock._run_compute(Kafka delta) andactivate_layer(tab activation) both funnel through_dispatch_compute_taskfor uniform error reporting and state transitions.job_startedclears the stash so old input cannot be flushed through a new plotter.weakref.finalizeauto-releases tokens if the owning SessionLayer is garbage-collected. Explicitset_active(False)remains the fast path; the finalizer is belt-and-braces against missed cleanup, and it removes theid()-reuse footgun where a stale int could be discarded by an unrelated caller.Test plan
LayerStateMachine: stash-without-token, 0→1 flush, only-once flush, multi-token ref counting, intermediate-update collapse, unknown-token release, pending-cleared-on-plotter-replacement, no-flush-before-job_started, gc auto-release.dummyinstrument: hidden-tab plots stop computing; tab switches still show fresh state.Relationship to #944
#944 implements the same runtime semantics — gating compute on viewer interest with a synchronous 0→1 flush — but places the gate inside
Plotteritself (set_active,_pending,_dirty,_active_tokens,_build_lock, acompute→_buildrename). That couples a layer-orchestration concern to every plotter subclass:CorrelationHistogramPlotterhas to forwardset_active,StaticPlotter.set_activeis a no-op stub, six subclasses inherit a_build/computesplit they don't need, and ~30 Plotter test sites need fixtures for token acquisition.This PR puts the gate on
LayerStateMachine— which is already per-layer, shared across sessions, and owns both the plotter reference and the lifecycle. Plotter and its subclasses are untouched; no new abstractions in the plotting layer.set_active,_build)Benchmarks from #944 (tab-switch latency, hidden-tab CPU) apply unchanged — the build path itself is identical.