Build: Improve Bench Robustness & Reporting by jlukic · Pull Request #181 · Semantic-Org/Semantic-Next

jlukic · 2026-05-05T14:50:27Z

This PR improves tachometer bench coverage and makes the tachometer report on PRs easier to use. It adds a description for every test and reports wins and losses between commits within a PR. It also adds missing benchmarks that round out the suite.

Changes

Adds new benchmark suites covering reactivity, the renderer, and the compiler
Improves robustness and resilience of CI scripts, particularly when editing comments from existing bench runs
Adds a glossary at the end of the bench comment that describes each bench
Adds win/loss/drift columns to intra-PR perf reporting
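
The comment-editing change is about updating an existing bench comment rather than posting a fresh one on every run. A minimal sketch of one common way to make that idempotent, assuming the comment carries a hidden HTML marker (the marker string, the helper names, and the dict shapes are hypothetical and not taken from this PR's scripts; it also assumes a higher comment id means a newer comment):

```python
from typing import Optional

# Hypothetical marker; the real CI scripts may identify their comment differently.
MARKER = "<!-- bench-report -->"


def find_existing_comment(comments: list[dict], marker: str = MARKER) -> Optional[dict]:
    """Return the newest comment whose body carries the marker, if any."""
    matches = [c for c in comments if marker in (c.get("body") or "")]
    # Prefer the highest id so a stale duplicate never wins.
    return max(matches, key=lambda c: c["id"]) if matches else None


def plan_comment_action(comments: list[dict], report_md: str, marker: str = MARKER) -> dict:
    """Decide whether to create a comment, edit the existing one, or do nothing."""
    body = f"{marker}\n{report_md}"
    existing = find_existing_comment(comments, marker)
    if existing is None:
        return {"action": "create", "body": body}
    if existing.get("body") == body:
        # Nothing changed, so avoid a redundant edit.
        return {"action": "skip", "comment_id": existing["id"]}
    return {"action": "update", "comment_id": existing["id"], "body": body}
```

In the GitHub REST API, a "create" result would map to `POST /repos/{owner}/{repo}/issues/{issue_number}/comments` and an "update" result to `PATCH /repos/{owner}/{repo}/issues/comments/{comment_id}`. Keeping the decision in a pure function makes it testable without network access.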

Risk

0/10: CI-only changes; the blast radius is limited to CI runs.

How to Test

Confirm that the performance bot reports results correctly on the next PR that changes packages/

vercel · 2026-05-05T14:50:31Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
semantic-next	Ready	Preview, Comment	May 5, 2026 8:09pm

1 Skipped Deployment

Project	Deployment	Actions	Updated (UTC)
mcp	Ignored	Preview	May 5, 2026 8:09pm

semantic-performance-bot · 2026-05-05T15:10:12Z

⚪ No Meaningful Change for `cce2347` on Benchmark Suite 📊

Base: main · Action: #25383773434 · Raw: bench-report.json

Harness/Build: Bench coverage plan and metric purposes

Note

This PR did not move any measured metrics.

✅ 0 faster · ❌ 0 slower · 🔍 10 unsure · ⚪ 30 no change

⚪ No Change (30)

Metrics where this PR measured within ±2% of main — no meaningful performance change detected.

metric	Change
`add-20`	-0.1% – +0.0%
`bulk-add-500`	-0.7% – +0.4%
`clear-completed-250`	-1.9% – +0.7%
`create-10k`	-0.3% – +0.4%
`create-1k`	-0.4% – +1.6%
`edit-cycle-5`	-0.4% – +0.1%
`edit-start-10`	-1.9% – +0.6%
`filter-cycle-20`	-0.4% – +0.7%
`hydrate-each-100`	-1.1% – +0.6%
`hydrate-each-100-mount`	-0.7% – +0.6%
`hydrate-helper-100-mount`	-0.8% – +0.5%
`remove-10-middle`	-1.3% – +0.4%
`remove-5-front`	-1.7% – +0.9%
`remove-first-10`	-1.0% – +0.9%
`remove-last-10`	-0.1% – +0.4%
`remove-middle-10`	+0.1% – +0.7%
`remove-row-back-10`	-1.2% – +0.3%
`remove-row-front-20`	-1.0% – +0.3%
`remove-row-middle-20`	-2.0% – +0.5%
`select-40`	-1.3% – +0.4%
`signal-computed-chain-10x60k`	-1.1% – +0.1%
`signal-reactive-fanout-500x1200`	-0.6% – +1.5%
`signal-reactive-list-replace-1000x1000`	-1.6% – +1.1%
`signal-reactive-set-property-by-id-200`	-1.7% – +0.2%
`swap-rows-20`	-1.1% – +0.3%
`toggle-10`	-0.7% – +0.1%
`toggle-all-20`	-0.0% – +0.2%
`toggle-first-10`	-0.0% – +0.3%
`toggle-last-10`	+0.0% – +0.6%
`toggle-middle-10`	-0.1% – +0.4%
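
The buckets in this report can be read as a rule applied to each metric's confidence interval. A minimal sketch of one such rule, assuming a metric only counts as faster or slower when its whole interval clears the ±2% floor and that a negative change means less time (both are assumptions that happen to match the rows shown here, not the bot's confirmed logic; `classify` and `summarize` are hypothetical names):

```python
from collections import Counter

FLOOR_PCT = 2.0  # resolution floor, as reported in the comment footer


def classify(low: float, high: float, floor: float = FLOOR_PCT) -> str:
    """Classify a confidence interval of percent change versus main."""
    if low > high:
        raise ValueError("low must not exceed high")
    if -floor <= low and high <= floor:
        return "no change"  # whole interval sits inside the floor
    if high < -floor:
        return "faster"     # assumed: negative change means less time
    if low > floor:
        return "slower"
    return "unsure"         # interval straddles the floor boundary


def summarize(intervals: dict[str, tuple[float, float]]) -> str:
    """Build a one-line tally similar to the report header."""
    counts = Counter(classify(lo, hi) for lo, hi in intervals.values())
    return (f"{counts['faster']} faster · {counts['slower']} slower · "
            f"{counts['unsure']} unsure · {counts['no change']} no change")
```

Under this rule, `classify(-2.0, 0.5)` (the `remove-row-middle-20` row) lands in "no change", while `classify(-1.7, 3.1)` (the `append-1k` row) lands in "unsure".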

🔍 Unsure (10)

Inconclusive (8)

The measured difference is small, and our sampling couldn't confidently place it above or below zero. Running more samples in a future run might settle these metrics.

metric	Change	Expected Noise
`append-1k`	-1.7% – +3.1%	±1%
`clear-10k`	-3.7% – +2.5%	±1%
`replace-1k`	-1.3% – +2.2%	±2%
`signal-reactive-list-filter-1000x300`	-2.7% – -0.0%	±1%
`signal-reactive-multi-read-5x160k`	-3.0% – +0.7%	±1%
`signal-reactive-push-2000x20`	-2.6% – +0.3%	±1%
`signal-reactive-set-index-300`	-3.7% – +0.5%	±1%
`update-10th-10`	-0.9% – +2.5%	±1%
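
The note that more samples might settle these metrics follows from how a confidence interval narrows: its half-width shrinks roughly with the square root of the sample count. A rough sketch using a normal approximation on a single mean (the 3% standard deviation used in the usage note is an illustrative assumption, not a value measured in this run, and the real report compares two runs, which widens the interval further):

```python
import math

Z_95 = 1.96  # normal-approximation critical value for a 95% interval


def half_width(stddev_pct: float, samples: int, z: float = Z_95) -> float:
    """Approximate half-width of a confidence interval on the mean, in percent."""
    if samples < 1:
        raise ValueError("samples must be at least 1")
    return z * stddev_pct / math.sqrt(samples)


def samples_needed(stddev_pct: float, target_half_width_pct: float, z: float = Z_95) -> int:
    """Smallest sample count whose half-width is at or below the target."""
    if target_half_width_pct <= 0:
        raise ValueError("target must be positive")
    return math.ceil((z * stddev_pct / target_half_width_pct) ** 2)
```

With an assumed 3% standard deviation, going from 50 samples to 200 halves the half-width, which is why a metric that straddles the floor at one sample size can resolve at a larger one.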

Too Fast to Measure Precisely (2)

On benches this short, system jitter (scheduling, GC, JIT) masks sub-4% changes; larger deltas still resolve cleanly.

metric	Change	Test Time	Expected Noise
`hydrate-helper-100-state-change`	-4.9% – +12.7%	~6ms	±25%
`remove-5-back`	-2.4% – +0.4%	~76ms	±2%

Sample size: 50 · Resolution floor: ±2% · Timeout: 3min · Wall-clock: 17m57s

semantic-performance-bot · 2026-05-05T16:35:02Z

⚪ No Meaningful Change for `28afa94` on Benchmark Suite 📊

Base: main · Action: #25399508379 · Raw: bench-report.json

Build: Improve Bench Robustness & Reporting

Note

This PR did not move any measured metrics.

✅ 0 faster · ❌ 0 slower · 🔍 5 unsure · ⚪ 58 no change

⚪ No Change (58)

Metrics where this PR measured within ±2% of main — no meaningful performance change detected.

metric	Change
`active-indicator-200`	0.0% – 0.0%
`active-indicator-nested-200`	-0.0% – +0.0%
`add-20`	-0.1% – +0.1%
`bulk-add-500`	-0.6% – +0.5%
`clear-completed-250`	-1.5% – +0.2%
`create-10k`	-0.5% – +0.5%
`create-1k`	-0.6% – +1.6%
`edit-cycle-5`	-1.0% – +0.6%
`edit-start-10`	-1.3% – +1.5%
`filter-cycle-20`	-1.2% – +1.1%
`hydrate-each-100`	-0.6% – +1.8%
`hydrate-each-100-mount`	-0.8% – +1.0%
`hydrate-helper-100-mount`	-1.2% – +1.3%
`hydrate-helper-100-state-change`	-1.1% – +0.6%
`micro-build-html-string-10k`	-0.4% – +1.8%
`micro-compiler-ast-walk-5k`	-1.6% – +1.8%
`micro-compiler-parse-cold-complex-200`	-0.7% – +1.1%
`micro-compiler-parse-cold-normal-500`	-1.9% – +0.7%
`micro-compiler-snippet-args-5k`	-0.8% – +0.7%
`micro-dom-walker-1000x15`	-1.3% – +1.4%
`micro-expr-js-10k`	-1.6% – +1.0%
`micro-expr-lisp-50k`	-0.7% – +1.3%
`micro-expr-simple-100k`	-1.9% – -0.2%
`reaction-coalesce-200x100`	-1.1% – +0.3%
`reaction-dep-diff-30k`	-0.9% – +1.3%
`reaction-flush-noop-5m`	-1.9% – +0.3%
`remove-10-middle`	-1.5% – +0.4%
`remove-5-back`	-1.2% – +1.2%
`remove-first-10`	-1.4% – +1.5%
`remove-last-10`	-0.3% – +0.2%
`remove-middle-10`	-0.7% – +0.4%
`remove-row-back-10`	-0.9% – +0.2%
`remove-row-front-20`	-0.4% – +0.8%
`remove-row-middle-20`	-1.1% – +0.9%
`rename-50`	-0.1% – +0.1%
`select-40`	-0.1% – +1.1%
`signal-computed-chain-10x60k`	-0.5% – +0.6%
`signal-reactive-fanout-500x1200`	-0.8% – +1.1%
`signal-reactive-list-filter-1000x300`	-1.6% – +0.9%
`signal-reactive-list-replace-1000x1000`	-1.5% – +0.8%
`signal-reactive-multi-read-5x160k`	-0.6% – +0.9%
`signal-reactive-set-index-300`	-0.6% – +1.3%
`signal-reactive-set-property-by-id-200`	-1.0% – +0.3%
`signal-set-same-10m`	-1.2% – +0.7%
`signal-sub-unsub-100k`	-0.0% – +1.2%
`snippet-args-per-key-100`	-0.0% – 0.0%
`snippet-in-subtemplate-100`	0.0% – 0.0%
`stable-ref-mutate-500`	0.0% – 0.0%
`subtemplate-data-blob-100`	-0.0% – +0.0%
`subtemplate-reactive-data-100`	-0.4% – +0.1%
`subtemplate-shorthand-props-100`	-0.0% – +0.1%
`swap-rows-20`	-0.9% – +0.5%
`toggle-10`	-0.7% – +0.2%
`toggle-all-20`	-0.1% – +0.2%
`toggle-first-10`	-0.2% – +0.1%
`toggle-last-10`	-0.3% – +0.4%
`toggle-middle-10`	-0.1% – +0.7%
`update-10th-10`	-1.3% – +1.5%

🔍 Unsure (5)

Inconclusive (3)

The measured difference is small, and our sampling couldn't confidently place it above or below zero. Running more samples in a future run might settle these metrics.

metric	Change	Expected Noise
`append-1k`	-0.4% – +4.3%	±1%
`clear-10k`	-2.1% – +5.6%	±1%
`signal-reactive-push-2000x20`	-0.3% – +2.3%	±1%

Too Fast to Measure Precisely (2)

On benches this short, system jitter (scheduling, GC, JIT) masks sub-4% changes; larger deltas still resolve cleanly.

metric	Change	Test Time	Expected Noise
`remove-5-front`	-2.5% – -0.2%	~87ms	±2%
`replace-1k`	+0.3% – +2.4%	~100ms	±2%

Sample size: 50 · Resolution floor: ±2% · Timeout: 3min · Wall-clock: 16m48s

Workspace is gitignored, so any tracked file linking into it is dead by definition. Inline the voice rules that were parked behind the link, and drop the calibration-log-artifact pointer from the sessions table — the substantive findings already follow.

Workload measured regex throughput more than reactive dispatch, and the FGR plan it was speculatively designed for doesn't ship expression-eval memoization. Drop pre-merge rather than ship a metric that won't move under the work it's measuring.

snippet-in-subtemplate-100: 25 cards each invoking 4 inner snippets, mutate parent prop's source. Tests dataDep pollution from receivesData: true subtemplate into inner snippet bodies. Distinct from top-level snippet-args-per-key-100.

active-indicator-nested-200: 5×10×4 nav-menu shape with external currentUrl helper. First bench exercising cross-layer isolation across three nested each blocks.

rename-50: pure setProperty(id, 'title') flood on the existing todoItem subtemplate, no editingId co-fires. Tightens the partial coverage in edit-cycle-5.
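
Descriptions like these are the kind of text a glossary at the end of the bench comment would carry. A minimal sketch of rendering such a glossary, assuming purposes are available as a name-to-description mapping (the `render_glossary` helper and the output layout are hypothetical; the reporter's actual format is not shown on this page):

```python
def render_glossary(purposes: dict[str, str], reported: list[str]) -> str:
    """Render a glossary for the reported metrics, sorted by name.

    Metrics without a description get a placeholder so a missing purpose
    is visible rather than silently dropped.
    """
    lines = ["Glossary", ""]
    for name in sorted(set(reported)):
        description = purposes.get(name, "").strip() or "(no description)"
        lines.append(f"`{name}`: {description}")
    return "\n".join(lines)


# Descriptions abbreviated from the commit message above.
purposes = {
    "rename-50": "pure setProperty(id, 'title') flood on the existing todoItem subtemplate",
    "snippet-in-subtemplate-100": "25 cards each invoking 4 inner snippets, mutate parent prop's source",
}
print(render_glossary(purposes, ["rename-50", "snippet-in-subtemplate-100", "add-20"]))
```

Listing undescribed metrics explicitly is a design choice in this sketch: it turns a forgotten purpose comment into something a reviewer can spot in the rendered report.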

jlukic added 8 commits May 5, 2026 09:45

Harness: Bench coverage expansion plan

33a7c96

Harness: Bench coverage plan — apply reviewer feedback

9c14295

Harness: Bench reporter — purpose extraction and glossary rendering

af6217a

Build: Bench todo — purpose comments

ce14047

Build: Bench krausest — purpose comments

03f8a6a

Build: Bench hydrate — purpose comments

0bcd069

Build: Bench signal — purpose comments

a2ece5f

Build: Bench purpose comments — voice cleanup

1ce1470

github-actions Bot added the Docs label (Modifies documentation) May 5, 2026

Harness: Roadmap — add PR link for bench coverage

cce2347

vercel Bot deployed to Preview – semantic-next May 5, 2026 14:55 View deployment

jlukic added 3 commits May 5, 2026 11:25

Test: Add snippet/subtemplate arg-source propagation tests

f83afc2

Test: Unbury snippet/reactiveData granularity invariants

7eb7bc9

Build: Bench template-reactivity — fine-grained reactivity workload + tachometer config

066bd10

github-actions Bot added the Tests label (Modifies tests) May 5, 2026

vercel Bot deployed to Preview – semantic-next May 5, 2026 15:40 View deployment

jlukic added 3 commits May 5, 2026 11:46

Build: Bench hydrate-helper state-change — amplify to 10 cycles

727f9e6

Build: Bench signal — micro-signal and scheduler hot-path coverage

0979ec6

Build: Bench renderer micros — expression evaluator, buildHTMLString, dom walker

2a2196b

vercel Bot deployed to Preview – semantic-next May 5, 2026 16:09 View deployment

Build: Bench compiler micros — parse, AST walk, snippet args

9784b8b

vercel Bot deployed to Preview – semantic-next May 5, 2026 16:18 View deployment

Build: Bench compiler + renderer micros — normal-component headline + semantic class names

762838e

vercel Bot deployed to Preview – semantic-next May 5, 2026 16:39 View deployment

Build: Bench review pass — purpose voice, kebab metric name, reporter index, declared compiler devDep, dead-link fixes

0c844ec

vercel Bot deployed to Preview – semantic-next May 5, 2026 18:21 View deployment

Build: Bench review pass — stale-refactor cleanup, comment hygiene, edit-last fix

4f4a1f2

github-actions Bot added the CI label (modifies continuous integration) May 5, 2026

vercel Bot deployed to Preview – semantic-next May 5, 2026 18:49 View deployment

jlukic added 3 commits May 5, 2026 14:53

Harness: Bench coverage expansion — archive plan, sweep ROADMAP

63fe3d8

Harness: Bench peak attribution — archive plan, sweep ROADMAP

d84e85e

Harness: Plan cross-refs — sibling path now both in archive

0ef31a1

jlukic changed the title ~~Harness/Build: Bench coverage plan and metric purposes~~ Build: Bench Coverage Expansion May 5, 2026

vercel Bot deployed to Preview – semantic-next May 5, 2026 18:59 View deployment

jlukic changed the title ~~Build: Bench Coverage Expansion~~ Build: Improve Bench Robustness & Reporting May 5, 2026

vercel Bot deployed to Preview – semantic-next May 5, 2026 19:22 View deployment

jlukic added 2 commits May 5, 2026 15:36

Build: Bench renderer — clarify DCE-safety and bench-internals coupling

97fed94

vercel Bot deployed to Preview – semantic-next May 5, 2026 19:39 View deployment

vercel Bot deployed to Preview – semantic-next May 5, 2026 20:09 View deployment

jlukic merged commit 1ba5e85 into main May 5, 2026
21 checks passed

jlukic deleted the feat/bench-coverage-expansion branch May 5, 2026 20:56

jlukic merged 26 commits into main from feat/bench-coverage-expansion
