Discussion: v3.1 'Live Tag Pipeline' architectural exploration — file-major pipeline + appendData buffer + reactive Companion #151

@HanSur94

Context

While exploring the same performance problem PR #130 ultimately solved (live-tick CPU cost at scale), a longer-running branch took a different architectural path: rebuilding the Tag → Pipeline → Companion data flow as event-driven from the ground up, instead of optimizing the existing scan-based path.

Main shipped PR #130 ("cut live-tick ~75%") before that branch could land, and the two architectures diverged too far to merge cleanly (the branch is ~44 commits, main moved ~250 commits in the same window). This issue captures what was explored so the ideas aren't lost — pieces of it may still be worth porting onto the current architecture, separately.

PR #150 lands the one cleanly-portable piece (stress fleets in the demo). This issue is the discussion artifact for the rest.

What the v3.1 branch built (3 phases)

Phase 1024 — Tag-side event foundation

Tag.appendData(newX, newY) with amortized O(1) buffer growth, paired with a DataChanged event carrying a delta payload (StartIdx, EndIdx, NewX, NewY) rather than just signaling "data mutated":

  • XBuffer_/YBuffer_ private fields, doubling capacity strategy
  • Length_ logical-length pointer
  • Deadband property (numeric tags only) — skip incoming Y deltas within ± Deadband of last stored Y
  • Reentrancy guard (Tag:reentrancyForbidden thrown if a listener tries to recurse into appendData)
  • One-shot buffer-overflow warning at 1M samples
  • subscribeDataChanged(callback) helper that captures the listener handle into Tag's own list with ObjectBeingDestroyed cleanup
  • Capacity and ListenerCount Dependent props for observability
  • updateData(X, Y) rewritten as Length_=0; appendData(X, Y) so existing callers see identical end-state behavior
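
To make the buffer strategy above concrete, here is a minimal sketch of doubling-capacity growth with a logical-length pointer. It is not the branch's code: it uses a plain struct instead of private fields on a handle class, and the names newBuffer, appendSamples, and demoAppendBuffer are hypothetical. The delta struct mirrors the StartIdx/EndIdx part of the DataChanged payload.

```matlab
function demoAppendBuffer()
    % Save as demoAppendBuffer.m and run it. All names here are illustrative.
    buf = newBuffer(4);
    for k = 1:10
        [buf, delta] = appendSamples(buf, k, k^2);
    end
    fprintf('length=%d capacity=%d\n', buf.len, numel(buf.x));   % length=10 capacity=16
    fprintf('last delta: %d..%d\n', delta.StartIdx, delta.EndIdx); % last delta: 10..10

    % updateData-style replacement: reset the logical length, then append.
    buf.len = 0;
    buf = appendSamples(buf, [1 2 3], [7 8 9]);
    fprintf('after reset: length=%d capacity=%d\n', buf.len, numel(buf.x));
end

function buf = newBuffer(capacity)
    % Preallocated storage; len is the logical length (Length_ in the issue).
    buf.x = zeros(1, capacity);
    buf.y = zeros(1, capacity);
    buf.len = 0;
end

function [buf, delta] = appendSamples(buf, newX, newY)
    n = numel(newX);
    needed = buf.len + n;
    cap = numel(buf.x);
    if needed > cap
        % Double until the new samples fit, so reallocation happens rarely.
        newCap = max(cap, 1);
        while newCap < needed
            newCap = newCap * 2;
        end
        buf.x(newCap) = 0;   % assigning past the end zero-pads up to newCap
        buf.y(newCap) = 0;
    end
    idx = buf.len + (1:n);
    buf.x(idx) = newX;
    buf.y(idx) = newY;
    delta.StartIdx = buf.len + 1;   % first index written by this call
    delta.EndIdx = needed;          % last index written by this call
    buf.len = needed;
end
```

Resetting only the logical length keeps the allocated capacity, which is why the rewritten updateData can reuse the same append path.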

Phase 1025 — Pipeline rewrite + demo migration

LiveTagPipeline inverted from tag-major to file-major:

  • Files grouped by RawSource.file (rebuilt only when TagRegistry.Version increments)
  • 1 stat + 1 parse per file → N dispatches via tag.appendData
  • TickComplete event (one per tick) with tagsUpdated cellstr payload
  • Per-tick observability: LastFileStatCount, LastDispatchCount, LastTickDurationMs
  • Listener-budget guard: one-shot warning when callback time exceeds 50% of tick period
  • Persistence decoupled: hot path enqueues into dirtyQueue_; separate flushTimer_ (default 60s) drains via flushNow(); stop() flushes before tearing down timers
  • Demo writer migrated to file-only — single source of truth via .dat files; no in-process tag mutation
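
The file-major inversion can be sketched roughly as follows. This is an illustration under assumptions, not the LiveTagPipeline code: the tag table, parseFileStub, and groupTagsByFile are hypothetical stand-ins, and the dispatch step only prints where the branch would call tag.appendData.

```matlab
function demoFileMajorTick()
    % Save as demoFileMajorTick.m and run it. All names here are illustrative.
    % Hypothetical tag table: each tag names the file its samples come from.
    tags = struct( ...
        'name', {'pump1.flow', 'pump1.pressure', 'pump2.flow'}, ...
        'file', {'pump1.dat',  'pump1.dat',      'pump2.dat'});

    groups = groupTagsByFile(tags);   % would be rebuilt only on registry change
    files = keys(groups);
    parseCount = 0;
    dispatchCount = 0;
    for k = 1:numel(files)
        parsed = parseFileStub(files{k});   % one parse per file
        parseCount = parseCount + 1;
        for t = groups(files{k})            % N dispatches for that file
            % In the branch this step would be tag.appendData(newX, newY).
            fprintf('%s <- %d samples from %s\n', ...
                tags(t).name, numel(parsed.x), files{k});
            dispatchCount = dispatchCount + 1;
        end
    end
    fprintf('parses: %d, dispatches: %d\n', parseCount, dispatchCount);
end

function groups = groupTagsByFile(tags)
    % Map from file name to a row vector of tag indices.
    groups = containers.Map('KeyType', 'char', 'ValueType', 'any');
    for t = 1:numel(tags)
        f = tags(t).file;
        if isKey(groups, f)
            groups(f) = [groups(f), t];
        else
            groups(f) = t;
        end
    end
end

function parsed = parseFileStub(~)
    % Stand-in for reading a .dat file; returns fixed samples.
    parsed.x = 1:3;
    parsed.y = [10 11 12];
end
```

With three tags spread over two files, the sketch reports 2 parses and 3 dispatches, which is the K-versus-N split the per-tick counters are meant to expose.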

Phase 1026 — Reactive subscription rewiring

  • DashboardWidget base class: TagListeners_ private cell + idempotent subscribeTagDataChanged_ / cleanupTagListeners_ helpers; cleanup in delete() before parent destruction
  • DashboardEngine.wireListeners leak fix (the existing addlistener handle was being discarded — Phase 1026 captured it into the widget)
  • DashboardEngine + FastSenseCompanion accept 'Pipeline' NV pair, subscribe to TickComplete
  • DashboardEngine.onLiveTick reduced to global state only — per-widget refresh from DataChanged listeners; EventTimelineWidget kept on heartbeat (its source is EventStoreObj, not a Tag)
  • InspectorPane per-state listeners (single tag = 1 listener; multitag = N idx-capturing closures); cleanup on every state transition
  • FastSenseCompanion.LiveTimer_ and scanLiveTagUpdates_ deleted; live-updates log fed from pipeline.TickComplete (one batch insert per tick)
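
Both the Tag-side subscribeDataChanged helper and the widget-side helpers rest on the same idea: keep the handle that addlistener returns so the listener can be deleted later. A minimal MATLAB-only sketch of that pattern follows. TagListenerSketch and cleanupListeners are hypothetical names, and the class acts as its own event source purely so the example fits in one file.

```matlab
classdef TagListenerSketch < handle
    % Save as TagListenerSketch.m, then run TagListenerSketch.demo().
    % Hypothetical stand-in for the Tag class; not the branch's code.

    events
        DataChanged
    end

    properties (Access = private)
        Listeners_ = {};   % captured event.listener handles
    end

    properties (Dependent)
        ListenerCount
    end

    methods
        function lh = subscribeDataChanged(obj, callback)
            % Keep the handle instead of discarding it, so it can be removed.
            lh = addlistener(obj, 'DataChanged', callback);
            obj.Listeners_{end + 1} = lh;
        end

        function cleanupListeners(obj)
            % Safe to call repeatedly: the list is emptied after the first call.
            for k = 1:numel(obj.Listeners_)
                lh = obj.Listeners_{k};
                if isvalid(lh)
                    delete(lh);
                end
            end
            obj.Listeners_ = {};
        end

        function n = get.ListenerCount(obj)
            n = numel(obj.Listeners_);
        end

        function delete(obj)
            % Release listeners before the object itself is destroyed.
            obj.cleanupListeners();
        end
    end

    methods (Static)
        function demo()
            tag = TagListenerSketch();
            tag.subscribeDataChanged(@(~, ~) disp('DataChanged received'));
            assert(tag.ListenerCount == 1);
            notify(tag, 'DataChanged');   % callback fires
            tag.cleanupListeners();
            tag.cleanupListeners();       % second call is a no-op
            assert(tag.ListenerCount == 0);
            notify(tag, 'DataChanged');   % no listener left, nothing printed
        end
    end
end
```

A listener created with addlistener lives as long as its source object, so a discarded handle cannot be removed individually; storing it is what makes per-widget cleanup possible.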

Benchmark numbers (MATLAB R2025b, Apple M3)

Wall-clock of pipeline.tickOnce() across stress modes, 15 ticks each, with 1.05s pacing between ticks to clear dir().datenum second-resolution mtime gating:

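| Mode     | K (files) | N (tags) | mean tick (ms) | p95 (ms) | samples/sec | N/K  |
|----------|-----------|----------|----------------|----------|-------------|------|
| 'off'    | 10        | 10       | 40             | 81       | 249         | 1.0  |
| 'small'  | 15        | 60       | 39             | 63       | 1,540       | 4.0  |
| 'medium' | 16        | 160      | 70             | 142      | 2,301       | 10.0 |
| 'large'  | 36        | 360      | 127            | 188      | 2,840       | 10.0 |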

Reading: 36× more tags (10 → 360) at only ~3× tick cost (40 → 127 ms), because tick cost scales with K (files) not N (tags). A naive 1-file-per-tag scaling would extrapolate to ~1,440 ms/tick at 360 tags — past the 1s tick budget.
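
The arithmetic behind that reading, reproduced from the benchmark table as a short script. The naive figure assumes tick cost grows linearly with the number of files when every tag has its own file; that assumption is an extrapolation, not a measurement.

```matlab
% Values copied from the benchmark table ('off' and 'large' rows).
nTagsOff   = 10;   nTagsLarge = 360;
tickOffMs  = 40;   tickLargeMs = 127;

tagRatio  = nTagsLarge / nTagsOff;      % 36
costRatio = tickLargeMs / tickOffMs;    % 3.175
naiveMs   = tickOffMs * tagRatio;       % 1440, assuming one file per tag

fprintf('tags x%d, tick cost x%.3f, naive extrapolation %d ms\n', ...
    tagRatio, costRatio, naiveMs);
```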

Why it isn't merging as a milestone

Merging would have meant working through 50+ conflict-resolution decisions, each touching an architectural choice that main had already settled differently. The cost-benefit did not favor pushing through them.

Pieces that could still be useful (per-PR candidates)

If any of these sound interesting, happy to extract them as standalone PRs against current main:

  1. Tag.Deadband property — skip dispatch when incoming Y is within ± deadband of last stored. Useful for noisy sensors and could reduce widget refresh churn on main's architecture too. Self-contained ~30 lines.
  2. TagRegistry.Version counter — read-only uint64 bumped on register/unregister/clear. Lets downstream caches invalidate cheaply. Self-contained ~20 lines.
  3. bench_appendData.m — existing micro-bench for tag-side buffer growth (the v3.1 version is tied to appendData, but the pattern would adapt to updateData).
  4. bench_pipeline.m + stress fleets in demo (the latter is now PR #150, "Demo: opt-in stress fleets for scale-testing the industrial plant"). The bench would need rewriting against main's pipeline architecture.
  5. One-shot listener-budget guard in LiveTagPipeline — warn once per pipeline lifetime when listener phase exceeds 50% of tick period. Useful regardless of architecture; helps users diagnose slow callbacks.
  6. Reentrancy guard on Tag.updateData — throws Tag:reentrancyForbidden if a DataChanged listener calls updateData on the same tag. Prevents a class of buffer-corruption bugs.
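
As an illustration of item 1, a deadband filter can be sketched as below. It assumes each incoming sample is compared against the last Y that was actually stored, and that a difference exactly equal to the deadband is skipped; the branch may treat the boundary differently. The names deadbandMask and demoDeadband are hypothetical.

```matlab
function demoDeadband()
    % Save as demoDeadband.m and run it. All names here are illustrative.
    lastStoredY = 10;
    newY = [10.2 10.6 10.7 11.5];
    keep = deadbandMask(lastStoredY, newY, 0.5);
    fprintf('keep mask: %s\n', mat2str(double(keep)));   % keep mask: [0 1 0 1]
    fprintf('stored values: %s\n', mat2str(newY(keep))); % stored values: [10.6 11.5]
end

function keep = deadbandMask(lastY, newY, deadband)
    % True where a sample moves more than deadband away from the last kept Y.
    keep = false(size(newY));
    ref = lastY;   % pass [] when nothing has been stored yet
    for i = 1:numel(newY)
        if isempty(ref) || abs(newY(i) - ref) > deadband
            keep(i) = true;
            ref = newY(i);   % the reference moves only when a sample is kept
        end
    end
end
```

Because the reference only advances on kept samples, slow drift still gets recorded once it accumulates past the deadband, while jitter around a stable value is dropped.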

Test counts (on the v3.1 branch as a snapshot)

  • Phase 1024 alone: 189/189 pass
  • Phase 1025 (cumulative): 217/217 pass
  • Phase 1026 (cumulative): 278/278 pass
  • With stress fleets added: 287/287 pass

(All on MATLAB R2025b. Some tests gate on assumeTrue(~isOctave) for MATLAB-only event observation paths.)

What would help me

  • Are any of items 1–6 worth extracting as small PRs?
  • Are there places in the current architecture where the file-major scan pattern would still be a win (e.g., a future high-N-tag deployment)?
  • Should the v3.1 branch be pruned, or kept as a reference for future architectural work?

cc maintainer when you have time — happy to follow whichever direction makes sense.

🤖 Generated with Claude Code
