Build: Improve Bench Robustness & Reporting #181
⚪ No Meaningful Change for
| metric | Change |
|---|---|
| add-20 | -0.1% – +0.0% |
| bulk-add-500 | -0.7% – +0.4% |
| clear-completed-250 | -1.9% – +0.7% |
| create-10k | -0.3% – +0.4% |
| create-1k | -0.4% – +1.6% |
| edit-cycle-5 | -0.4% – +0.1% |
| edit-start-10 | -1.9% – +0.6% |
| filter-cycle-20 | -0.4% – +0.7% |
| hydrate-each-100 | -1.1% – +0.6% |
| hydrate-each-100-mount | -0.7% – +0.6% |
| hydrate-helper-100-mount | -0.8% – +0.5% |
| remove-10-middle | -1.3% – +0.4% |
| remove-5-front | -1.7% – +0.9% |
| remove-first-10 | -1.0% – +0.9% |
| remove-last-10 | -0.1% – +0.4% |
| remove-middle-10 | +0.1% – +0.7% |
| remove-row-back-10 | -1.2% – +0.3% |
| remove-row-front-20 | -1.0% – +0.3% |
| remove-row-middle-20 | -2.0% – +0.5% |
| select-40 | -1.3% – +0.4% |
| signal-computed-chain-10x60k | -1.1% – +0.1% |
| signal-reactive-fanout-500x1200 | -0.6% – +1.5% |
| signal-reactive-list-replace-1000x1000 | -1.6% – +1.1% |
| signal-reactive-set-property-by-id-200 | -1.7% – +0.2% |
| swap-rows-20 | -1.1% – +0.3% |
| toggle-10 | -0.7% – +0.1% |
| toggle-all-20 | -0.0% – +0.2% |
| toggle-first-10 | -0.0% – +0.3% |
| toggle-last-10 | +0.0% – +0.6% |
| toggle-middle-10 | -0.1% – +0.4% |
🔍 Unsure (10)
Inconclusive (8)
The measured difference is small, and our sampling couldn't confidently place it above or below zero. Running more samples in a future run might settle these metrics.
| metric | Change | Expected Noise |
|---|---|---|
| append-1k | -1.7% – +3.1% | ±1% |
| clear-10k | -3.7% – +2.5% | ±1% |
| replace-1k | -1.3% – +2.2% | ±2% |
| signal-reactive-list-filter-1000x300 | -2.7% – -0.0% | ±1% |
| signal-reactive-multi-read-5x160k | -3.0% – +0.7% | ±1% |
| signal-reactive-push-2000x20 | -2.6% – +0.3% | ±1% |
| signal-reactive-set-index-300 | -3.7% – +0.5% | ±1% |
| update-10th-10 | -0.9% – +2.5% | ±1% |
Too Fast to Measure Precisely (2)
On benches this short, system jitter (scheduling, GC, JIT) masks sub-4% changes; larger deltas still resolve cleanly.
| metric | Change | Test Time | Expected Noise |
|---|---|---|---|
| hydrate-helper-100-state-change | -4.9% – +12.7% | ~6ms | ±25% |
| remove-5-back | -2.4% – +0.4% | ~76ms | ±2% |
Sample size: 50 · Resolution floor: ±2% · Timeout: 3min · Wall-clock: 17m57s
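One way to read the buckets in this report is as a classification over each metric's change interval against the resolution floor. The sketch below is only an illustration of that reading: the `Bucket` names, the `Interval` type, and the `classify` function are hypothetical, the rules are inferred from the report's wording rather than taken from the bench tooling, and the "Too Fast to Measure Precisely" bucket is deliberately not modeled because its threshold is not stated here.

```typescript
// Sketch only: bucket names and rules are assumptions inferred from the
// report's wording, not the bench tooling's actual logic.
type Bucket =
  | "no-meaningful-change"
  | "inconclusive"
  | "above-zero"
  | "below-zero";

interface Interval {
  low: number;  // lower bound of the measured change, in percent
  high: number; // upper bound of the measured change, in percent
}

function classify(ci: Interval, floorPct: number): Bucket {
  // The whole interval sits inside the resolution floor: nothing to report.
  if (ci.low >= -floorPct && ci.high <= floorPct) {
    return "no-meaningful-change";
  }
  // The interval straddles (or touches) zero: direction cannot be settled.
  if (ci.low <= 0 && ci.high >= 0) {
    return "inconclusive";
  }
  // Otherwise the interval lies entirely on one side of zero.
  return ci.low > 0 ? "above-zero" : "below-zero";
}

// Example with the ±2% floor quoted in the summary line.
console.log(classify({ low: -1.7, high: 3.1 }, 2));
```

Under these assumed rules, an interval such as -2.0% – +0.5% stays inside the floor, while -1.7% – +3.1% exceeds it on one side and still includes zero, which matches how those two rows are grouped above.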
⚪ No Meaningful Change for
| metric | Change | Expected Noise |
|---|---|---|
| append-1k | -0.4% – +4.3% | ±1% |
| clear-10k | -2.1% – +5.6% | ±1% |
| signal-reactive-push-2000x20 | -0.3% – +2.3% | ±1% |
Too Fast to Measure Precisely (2)
On benches this short, system jitter (scheduling, GC, JIT) masks sub-4% changes; larger deltas still resolve cleanly.
| metric | Change | Test Time | Expected Noise |
|---|---|---|---|
| remove-5-front | -2.5% – -0.2% | ~87ms | ±2% |
| replace-1k | +0.3% – +2.4% | ~100ms | ±2% |
Sample size: 50 · Resolution floor: ±2% · Timeout: 3min · Wall-clock: 16m48s
… semantic class names
… index, declared compiler devDep, dead-link fixes
The workspace is gitignored, so any tracked file linking into it is a dead link by definition. Inline the voice rules that were parked behind the link, and drop the calibration-log-artifact pointer from the sessions table; the substantive findings already follow.
Workload measured regex throughput more than reactive dispatch, and the FGR plan it was speculatively designed for doesn't ship expression-eval memoization. Drop pre-merge rather than ship a metric that won't move under the work it's measuring.
snippet-in-subtemplate-100: 25 cards each invoking 4 inner snippets, mutate parent prop's source. Tests dataDep pollution from receivesData: true subtemplate into inner snippet bodies. Distinct from top-level snippet-args-per-key-100.

active-indicator-nested-200: 5×10×4 nav-menu shape with external currentUrl helper. First bench exercising cross-layer isolation across three nested each blocks.

rename-50: pure setProperty(id, 'title') flood on the existing todoItem subtemplate, no editingId co-fires. Tightens the partial coverage in edit-cycle-5.
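To make the 5×10×4 shape concrete, here is a minimal sketch of a fixture with three nested levels yielding 200 leaf links, plus an external helper that compares each link against a current URL. Every name here (`NavSection`, `NavGroup`, `NavLink`, `buildNav`, `countLinks`, `isActive`) and the URL scheme are hypothetical illustrations, not the identifiers or data used by the actual bench.

```typescript
// Hypothetical fixture builder; names and URL scheme are illustrative only.
interface NavLink { id: string; url: string; }
interface NavGroup { id: string; links: NavLink[]; }
interface NavSection { id: string; groups: NavGroup[]; }

// Builds sections x groups x links leaf entries (5 x 10 x 4 = 200 by default).
function buildNav(sections = 5, groups = 10, links = 4): NavSection[] {
  return Array.from({ length: sections }, (_, s) => ({
    id: `s${s}`,
    groups: Array.from({ length: groups }, (_, g) => ({
      id: `s${s}-g${g}`,
      links: Array.from({ length: links }, (_, l) => ({
        id: `s${s}-g${g}-l${l}`,
        url: `/s${s}/g${g}/l${l}`,
      })),
    })),
  }));
}

// Counts leaf links across all three nested levels.
function countLinks(nav: NavSection[]): number {
  let total = 0;
  for (const section of nav) {
    for (const group of section.groups) {
      total += group.links.length;
    }
  }
  return total;
}

// Stand-in for an external "current URL" helper: a link is active on exact match.
function isActive(link: NavLink, currentUrl: string): boolean {
  return link.url === currentUrl;
}

const nav = buildNav();
console.log(countLinks(nav));
```

With a shape like this, changing the current URL should flip the active state of at most two links (the old and the new one), which is the kind of isolation a three-level nested bench is positioned to check.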
This PR improves tachometer bench coverage and makes tachometer results easier to use on PRs. It adds descriptions for all tests and reports wins and losses between commits within a PR. It also adds missing benchmarks that round out the suite.
Changes
Risk
0/10 - CI-only changes; the blast radius is limited to CI runs.
How to Test
packages/