
Benchmarks: add end-to-end scrolling performance suite and CI performance gate #5294

@tig

Description


Summary

The existing benchmark suite covers layout (DimAuto), text processing, console drivers, and memory profiling — but has no end-to-end scrolling/rendering benchmarks, and benchmarks are not run in CI. Scrolling is the most user-visible performance surface for data-heavy views.

This issue proposes two additions:

  1. End-to-end scrolling benchmarks using BenchmarkDotNet
  2. A CI performance gate that catches egregious regressions and celebrates improvements

1. End-to-end scrolling benchmarks

Views to benchmark

| View | Content setup | Why |
| --- | --- | --- |
| Baseline View subclass | Empty view with large ContentSize, no rendering logic | Isolates framework overhead (viewport math, layout, draw dispatch) from view-specific work |
| TextView | 1K / 5K / 10K lines of ~80-char text | The most complex rendering path: line tracking, word wrap, tab expansion, selection |
| TableView | 100 / 1K / 10K rows × 10 columns | Cell rendering, column alignment, header drawing |
| ListView | 1K / 10K / 100K items | Item rendering with mark glyphs, selection roles |
| CharMap | Default Unicode ranges | Dense grid rendering, glyph width calculation |

Benchmark scenarios

For each view, measure the full input → layout → draw pipeline:

Vertical scrolling:

  • Arrow ↓ to bottom — inject Key.CursorDown + LayoutAndDraw after every keypress until the caret/selection reaches the last row. Measures per-keystroke cost.
  • Arrow ↑ to top — same in reverse from the bottom.
  • PageDown to bottom / PageUp to top — viewport-sized jumps. Fewer iterations but each rebuilds a full page.

Horizontal scrolling (TextView, TableView):

  • Arrow → to end — char-by-char or column-by-column traversal through a line/row wider than the viewport.
  • Arrow ← to start — reverse.
  • Home/End oscillation — alternating Home/End on a long line/row. Worst-case horizontal viewport churn.

Caret/selection movement with scroll counting:

  • Same as above but count how many keystrokes trigger a viewport shift (i.e. the caret hits a viewport boundary). Reports both total time and scroll-event count.
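A scenario like "Arrow ↓ to bottom" could be expressed roughly as follows. This is a sketch, not final API: `BenchmarkHarness`, `InjectKey`, and `MakeTextView` are hypothetical names standing in for the harness proposed below, while `[Params]`, `Key.CursorDown`, and `LayoutAndDraw (true)` come from the issue text.

```csharp
using BenchmarkDotNet.Attributes;

public class TextViewScrollBenchmark
{
    private BenchmarkHarness _harness;

    // Document sizes from the table above.
    [Params (1_000, 5_000, 10_000)]
    public int Lines { get; set; }

    [GlobalSetup]
    public void Setup () { _harness = BenchmarkHarness.Create (MakeTextView (Lines)); }

    [GlobalCleanup]
    public void Cleanup () { _harness.Dispose (); }

    [Benchmark]
    public void ArrowDownToBottom ()
    {
        for (int i = 0; i < Lines; i++)
        {
            _harness.InjectKey (Key.CursorDown); // Direct injection, no ANSI encoding
            _harness.App.LayoutAndDraw (true);   // Force a full render per keystroke
        }
    }

    // Hypothetical helper: builds a TextView with `lines` rows of ~80-char text.
    private static View MakeTextView (int lines) => throw new NotImplementedException ();
}
```

The per-keystroke `LayoutAndDraw (true)` is the point: BenchmarkDotNet then reports the amortized cost of one full input → layout → draw cycle, parameterized by document size.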

Implementation approach

  • Create a lightweight BenchmarkHarness that wraps Application.Create() + ANSI driver + the view under test. Synchronous setup/teardown for BenchmarkDotNet compatibility.
  • Use InputInjectionMode.Direct for key injection (bypasses ANSI encoding, same as integration tests).
  • Call App.LayoutAndDraw(true) after each keypress to force a full render cycle.
  • Parameterize by document/data size using [Params].
  • Place in Tests/Benchmarks/Scrolling/ alongside existing benchmark categories.
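The harness itself could look like the sketch below. `Application.Create()`, the ANSI driver, and `InputInjectionMode.Direct` are taken from the bullet points above; the member names, how the view is rooted, and the disposal order are illustrative.

```csharp
public sealed class BenchmarkHarness : IDisposable
{
    public Application App { get; private set; }
    public View ViewUnderTest { get; private set; }

    public static BenchmarkHarness Create (View viewUnderTest)
    {
        var harness = new BenchmarkHarness ();
        harness.App = Application.Create (); // synchronous setup, no Run loop
        // ... select the ANSI driver and InputInjectionMode.Direct here ...
        harness.ViewUnderTest = viewUnderTest;
        harness.App.TopLevel.Add (viewUnderTest); // hypothetical; whatever roots the view
        harness.App.LayoutAndDraw (true);         // first full layout outside measurement
        return harness;
    }

    public void InjectKey (Key key)
    {
        // Direct injection, bypassing ANSI encoding (same path as integration tests).
    }

    public void Dispose () => App?.Dispose (); // deterministic teardown per iteration
}
```

Keeping setup/teardown synchronous matters because BenchmarkDotNet drives `[GlobalSetup]`/`[GlobalCleanup]` synchronously; anything that requires a running main loop would have to be pumped manually.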

Baseline view

// Minimal View subclass that sets ContentSize but does no drawing.
// Isolates framework scrolling overhead from view-specific rendering.
private sealed class EmptyScrollView : View
{
    public EmptyScrollView (int contentWidth, int contentHeight)
    {
        ContentSize = new (contentWidth, contentHeight);
    }
}

2. CI performance gate

Layer 1: Performance smoke tests (xUnit)

Add Stopwatch-based tests to the parallelizable unit tests with generous thresholds (~50–100x typical). These run on every CI build and only catch catastrophic regressions.

Example tests:

  • Build a 10K-row TableView and render one viewport — assert < 200 ms
  • Scroll a 1K-line TextView top-to-bottom via arrow keys — assert < 5 s
  • Render a 100K-item ListView viewport — assert < 100 ms

Thresholds should be generous enough never to flake on slow CI runners, yet tight enough to catch an accidental O(n²) regression.
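A Layer-1 smoke test could look like this sketch. The `BenchmarkHarness` and `MakeTableView` helpers are hypothetical (reusing the harness proposed above); the 200 ms budget is deliberately ~50–100x a typical run so it only trips on catastrophic regressions.

```csharp
using System.Diagnostics;
using Xunit;

public class ScrollingPerfSmokeTests
{
    [Fact]
    public void TableView_10K_Rows_Renders_One_Viewport_Under_200ms ()
    {
        // MakeTableView is a hypothetical helper building 10K rows x 10 columns.
        using var harness = BenchmarkHarness.Create (MakeTableView (rows: 10_000, columns: 10));

        var sw = Stopwatch.StartNew ();
        harness.App.LayoutAndDraw (true); // render exactly one viewport
        sw.Stop ();

        Assert.True (sw.ElapsedMilliseconds < 200,
                     $"Viewport render took {sw.ElapsedMilliseconds} ms (budget: 200 ms)");
    }
}
```

Because these run in the parallelizable unit-test suite, the assertion message should include the measured time so a rare failure on a loaded runner is diagnosable from the CI log alone.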

Layer 2: Baseline comparison (CI step)

  • Store baseline benchmark results in a JSON file (e.g. Tests/Benchmarks/baseline.json)
  • Add a CI step (Linux only) that runs a focused subset of the scrolling benchmarks (ShortRun, ~30-60s)
  • Compare results to the baseline
  • Fail if any benchmark exceeds 3x the baseline (egregious regression)
  • Celebrate 🎉 in the GitHub step summary if any benchmark drops below 0.8x the baseline (nice improvement)
  • Post a markdown comparison table to the step summary
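A minimal shape for baseline.json could be a map from benchmark id to mean time. The ids and numbers below are placeholders, not measured values:

```json
{
  "schemaVersion": 1,
  "results": {
    "TextViewScroll.ArrowDownToBottom(Lines: 1000)": { "meanMs": 40.0 },
    "TableViewScroll.PageDownToBottom(Rows: 10000)": { "meanMs": 15.0 }
  }
}
```

The CI step then fails when a benchmark's new mean exceeds 3x its stored meanMs and celebrates when it drops below 0.8x. Since the gate is relative to these numbers, the baseline must be captured on the same class of runner it is compared against.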

Updating the baseline

After a deliberate performance change, re-run the focused benchmarks and update baseline.json with the new numbers. The thresholds are relative, so the baseline is the source of truth.


Out of scope

  • Pixel-level rendering benchmarks (not applicable to TUI)
  • Network/IO benchmarks
  • Memory profiling (already covered by memory / scenarios commands)

References

  • Tests/Benchmarks/README.md — existing benchmark docs
  • Tests/AppTestHelpers/AppTestHelper.Input.cs — input injection infrastructure
  • Tests/Benchmarks/ViewBase/ViewMemoryBenchmark.cs — existing view profiling pattern
