Optimize waveform render loop and per-line string utility hot paths #12069

Merged
niksedk merged 1 commit into main from perf/hot-path-fixes
Jul 2, 2026
niksedk commented Jul 2, 2026

Member

Performance pass over the AudioVisualizer render path, the MainViewModel position timer, and the libse string utilities called per subtitle line. No caching was added, as requested; these are algorithmic fixes (binary search, early exit, single-pass rewrites, hoisted invariants, static compiled regexes).

All changes verified with BenchmarkDotNet (Apple M4, .NET 10). Each old/new pair was asserted to produce identical output before timing.

Waveform / render loop (runs at ~60 fps during playback)

  • GetShotChangeIndex: linear scan over all shot changes per frame → binary search on the sorted list: 1013 ns → 7.6 ns (134×) with 2000 shot changes.
  • DrawShotChanges: returns immediately when there are no shot changes (the common case — it previously still rebuilt the paragraph position sets every frame); binary-searches the first visible entry and breaks past the right edge instead of computing X positions for every shot change in the movie.
  • DrawParagraph: selection check was List.Contains per visible paragraph (O(selection) each — with select-all on a large subtitle that's millions of compares per frame); now a HashSet rebuilt once per render.
  • BuildWaveFormFancy: WaveformColor/WaveformSelectedColor/WaveformFancyHighColor StyledProperty reads hoisted out of the per-pixel loop (they went through Avalonia's value store once per pixel column while scrolling/zooming).
  • HitTestParagraph: resolves WavePeaks.SampleRate * ZoomFactor once per pointer move instead of twice per paragraph.
  • Timeline FormattedText cache gets the same 8000-entry cap as the paragraph text caches (it grew unbounded when scrubbing in frame mode).
  • MainViewModel 50 ms position timer: the current-line scan now stops at the first line starting after the playhead (the buffer is sorted by start time) instead of always scanning the entire subtitle.
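
The lookup pattern behind the GetShotChangeIndex and DrawShotChanges items can be sketched roughly as follows. This is a minimal illustration, not the actual AudioVisualizer code: the class name, method names, and the use of seconds as `double` values are all assumptions made for the example. It shows a lower-bound binary search on a sorted list, followed by a loop that stops as soon as it passes the right edge of the visible window.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration; names and types are assumptions,
// not the real AudioVisualizer members.
public static class ShotChangeSearch
{
    // Returns the index of the first element >= value in an ascending list,
    // or sorted.Count if every element is smaller.
    public static int LowerBound(IReadOnlyList<double> sorted, double value)
    {
        int lo = 0, hi = sorted.Count;
        while (lo < hi)
        {
            int mid = lo + (hi - lo) / 2;
            if (sorted[mid] < value)
            {
                lo = mid + 1;
            }
            else
            {
                hi = mid;
            }
        }

        return lo;
    }

    // Collects only the entries inside [startSeconds, endSeconds].
    public static List<double> Visible(IReadOnlyList<double> sorted, double startSeconds, double endSeconds)
    {
        var result = new List<double>();
        if (sorted.Count == 0)
        {
            return result; // nothing to draw, so return before doing any work
        }

        for (int i = LowerBound(sorted, startSeconds); i < sorted.Count; i++)
        {
            if (sorted[i] > endSeconds)
            {
                break; // past the right edge: later entries cannot be visible
            }

            result.Add(sorted[i]);
        }

        return result;
    }
}
```

With a sorted list, the work per frame depends on the number of visible entries plus a logarithmic search, rather than on the total number of shot changes in the movie.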

String utilities (called per line from casing fixes, batch convert, multiple replace, auto-break, grid repaints)

| Method | Old | New | Speedup | Allocation change |
| --- | --- | --- | --- | --- |
| HtmlUtil.RemoveColorTags | 1882 ns | 282 ns | 6.7× | −78% |
| HtmlUtil.RemoveFontName | 2170 ns | 358 ns | 6.1× | −79% |
| HtmlUtil.FixUpperTags | 548 ns | 64 ns | 8.6× | −86% |
| RegexUtils.ReplaceNewLineSafe (\n-only text) | 182 ns | 73 ns | 2.5× | −85% |
| StringExtensions.CountWords | 184 ns | 134 ns | 1.4× | −82% |
| StringExtensions.RemoveAndSaveTags (ASSA) | 359 ns | 198 ns | 1.8× | −39% |

  • RemoveColorTags/RemoveFontName: the regex was re-parsed on every call (uncompiled new Regex(...)); now static compiled. Also dropped the rest-of-string Substring per match.
  • FixUpperTags: rescan-from-zero loop (8 IndexOf passes + Remove+Insert full copies per tag) → single forward pass over a char buffer.
  • ReplaceNewLineSafe/CountNewLineSafe: skip the two SplitToLines+Join round-trips when the text contains no \r/U+2028 (always true after the first rule in a multiple-replace run).
  • RemoveAndSaveTags: input.Substring(index) (rest of string) was allocated for every '<', '{', or '\' character just for a StartsWith check; it now uses AsSpan. The plain-HTML micro-case is ~13% slower on M4 but allocates 30% less and no longer degrades quadratically with line length; the ASSA case is 1.8× faster.
  • CalcFactory.MakeCalculator (hit per grid cell repaint via CPS): dropped the LINQ closure and per-call fallback new CalcAll(); returns the existing shared instances.
  • Utilities.CanBreak: dropped the per-call ToArray() copy of the cached no-break list.
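
Two of the techniques in the list above, a static compiled regex and a span-based prefix check, could look roughly like this. It is a sketch under stated assumptions: the class name, the method names, and the regex pattern are invented for the example and are not the actual HtmlUtil or StringExtensions code.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; not the real libse implementation.
public static class StringHotPaths
{
    // Constructed once per process instead of on every call.
    // The pattern is illustrative only (an assumption, not the libse pattern).
    private static readonly Regex ColorAttribute =
        new Regex(@"\s+color=[""']?[#\w]+[""']?", RegexOptions.Compiled | RegexOptions.IgnoreCase);

    // Strips a color attribute using the shared regex instance.
    public static string RemoveColorAttribute(string text) =>
        ColorAttribute.Replace(text, string.Empty);

    // Checks for a prefix at a given index without allocating the rest of
    // the string, which input.Substring(index).StartsWith(prefix) would do.
    public static bool StartsWithAt(string input, int index, string prefix) =>
        input.AsSpan(index).StartsWith(prefix.AsSpan(), StringComparison.Ordinal);
}
```

For example, `StringHotPaths.StartsWithAt(@"a{\i1}b", 1, @"{\")` checks for an ASSA override tag at index 1 without copying the tail of the line.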

Not addressed (fix would require caching, excluded by request): the spectrogram path rebuilding + converting a full bitmap every frame, and the position timer's per-tick copy+sort of all subtitles.

All 1003 tests pass (libse 485, UI 512, libuilogic 6).

🤖 Generated with Claude Code

Verified with BenchmarkDotNet (Apple M4, .NET 10, old vs new, identical
outputs asserted before timing):

- AudioVisualizer.GetShotChangeIndex: linear scan over all shot changes
  per frame -> binary search: 1013 ns -> 7.6 ns (134x)
- DrawShotChanges: skip entirely with no shot changes (the common case,
  previously still rebuilt paragraph position sets), binary search the
  first visible entry and stop past the right edge
- DrawParagraph selection check: List.Contains per visible paragraph
  (O(selection) each) -> HashSet rebuilt once per render
- BuildWaveFormFancy: hoist WaveformColor/WaveformSelectedColor/
  WaveformFancyHighColor StyledProperty reads out of the per-pixel loop
- HitTestParagraph: resolve WavePeaks/ZoomFactor once per pointer move
  instead of per paragraph edge
- Timeline FormattedText cache: same 8000-entry cap as the paragraph
  caches (grew unbounded when scrubbing in frame mode)
- MainViewModel position timer: stop the per-tick current-line scan at
  the first line starting after the playhead (list is sorted)

- HtmlUtil.RemoveColorTags/RemoveFontName: per-call uncompiled Regex ->
  static compiled: 1882 -> 282 ns (6.7x) / 2170 -> 358 ns (6.1x)
- HtmlUtil.FixUpperTags: rescan-from-zero with 8 IndexOf passes and two
  full copies per tag -> single forward pass: 548 -> 64 ns (8.6x)
- RegexUtils.ReplaceNewLineSafe/CountNewLineSafe: skip the SplitToLines
  + Join round-trips when the text is already \n-only: 182 -> 73 ns
- StringExtensions.CountWords: char walk instead of Split allocations:
  184 -> 134 ns, 82% less allocated
- StringExtensions.RemoveAndSaveTags: O(n^2) Substring(index) per tag
  char -> AsSpan: ASSA lines 359 -> 198 ns, ~30% less allocated
- CalcFactory.MakeCalculator: drop LINQ closure + fallback allocation
- Utilities.CanBreak: drop the per-call ToArray copy of the cached
  no-break list

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
niksedk merged commit 8239a4f into main Jul 2, 2026
1 of 3 checks passed
niksedk deleted the perf/hot-path-fixes branch July 2, 2026 12:01