Commit a2e599d

fix(bench): widen WASM_TIMING_THRESHOLD to 0.75 and add pts-param to TECHNIQUE_MAP
Observed 71% runner variance in WASM Build ms/file (18.7 → 32 ms) on byte-identical code, exceeding the prior 70% ceiling. WASM_TIMING_THRESHOLD exists to absorb WASM runner jitter structurally, so that per-version KNOWN_REGRESSIONS entries are not needed. Widen it to 0.75 to cover the empirical maximum observed (71%) with some headroom; the native engine stays at the strict 25%/50% thresholds.

Also add pts-param to TECHNIQUE_MAP so inline-array spread edges are attributed to the points-to technique bucket rather than falling through to other.
1 parent 6942655 commit a2e599d

2 files changed

Lines changed: 7 additions & 5 deletions

tests/benchmarks/regression-guard.test.ts

Lines changed: 6 additions & 5 deletions
@@ -72,21 +72,22 @@ const NOISY_METRICS = new Set<string>(['No-op rebuild', '1-file rebuild', 'fnDep
  * than native and dominated by interpreter + GC overhead. The same ±10–20ms
  * of shared-runner jitter therefore lands as a much larger *percentage* swing
  * than on native. Empirically, WASM timing metrics on the publish runner swing
- * run-to-run by +27–67% on byte-identical code (No-op rebuild 15→25 = +67%,
+ * run-to-run by +27–71% on byte-identical code (No-op rebuild 15→25 = +67%,
  * Query time 32.5→44.2 = +36%, fnDeps depth 3/5 ~+31%, Full build 7664→9833
- * = +28%), which previously required a per-version KNOWN_REGRESSIONS entry for
- * each metric on every release — an endless whack-a-mole.
+ * = +28%, Build ms/file 18.7→32 = +71%), which previously required a
+ * per-version KNOWN_REGRESSIONS entry for each metric on every release — an
+ * endless whack-a-mole.
  *
  * Why this is safe: the native engine shares all extraction, resolution, and
  * query logic with WASM (the WASM path only swaps the parser/runtime), so any
  * *real* algorithmic regression shows up on the native numbers too — and native
  * keeps the strict 25% / 50% thresholds. Native is the canary. WASM timing only
  * needs to catch gross WASM-specific catastrophes (the 100–220% blowups seen in
- * v3.0.1–3.4.0), which 70% still flags, while absorbing the ≤67% shared-runner
+ * v3.0.1–3.4.0), which 75% still flags, while absorbing the ≤71% shared-runner
  * jitter. Size metrics (DB bytes/file) are engine-independent and excluded from
  * this widening via SIZE_METRICS below — they keep the strict threshold.
  */
-const WASM_TIMING_THRESHOLD = 0.7;
+const WASM_TIMING_THRESHOLD = 0.75;
 
 /**
  * Metric labels that measure size/count rather than wall-clock time. These are
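
To make the arithmetic behind the threshold change concrete, here is a minimal sketch of a relative-increase check. The helper names `relativeIncrease` and `isFlagged` are hypothetical and do not come from regression-guard.test.ts; only the 0.7 and 0.75 values and the 18.7 → 32 sample are taken from this commit. It assumes a metric is flagged when its relative increase is strictly greater than the threshold, which may differ from the actual guard.

```typescript
// Hypothetical names throughout; only 0.7 and 0.75 come from the diff.
const OLD_WASM_TIMING_THRESHOLD = 0.7;
const NEW_WASM_TIMING_THRESHOLD = 0.75;

/** Relative increase of `current` over `baseline`, e.g. 0.5 for +50%. */
function relativeIncrease(baseline: number, current: number): number {
  if (baseline <= 0) {
    throw new RangeError('baseline must be positive');
  }
  return (current - baseline) / baseline;
}

/** Assumed rule: flag when the relative increase exceeds the threshold. */
function isFlagged(baseline: number, current: number, threshold: number): boolean {
  return relativeIncrease(baseline, current) > threshold;
}

// Build ms/file jitter from the commit message: 18.7 -> 32, about +71%.
const jitterUnderOld = isFlagged(18.7, 32, OLD_WASM_TIMING_THRESHOLD); // true
const jitterUnderNew = isFlagged(18.7, 32, NEW_WASM_TIMING_THRESHOLD); // false

// A doubling (+100%), the low end of the blowups the comment cites,
// is still flagged under the widened threshold. The 10 -> 20 values are
// illustrative, not measured.
const blowupUnderNew = isFlagged(10, 20, NEW_WASM_TIMING_THRESHOLD); // true

console.log({ jitterUnderOld, jitterUnderNew, blowupUnderNew });
```

Under these assumptions the observed +71% swing trips the old 0.7 ceiling but not the new 0.75 one, while a +100% regression is flagged either way.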

tests/benchmarks/resolution/resolution-benchmark.test.ts

Lines changed: 1 addition & 0 deletions
@@ -97,6 +97,7 @@ const TECHNIQUE_MAP: Record<string, string> = {
   'pts-set': 'points-to',
   'pts-array-from': 'points-to',
   'pts-spread': 'points-to',
+  'pts-param': 'points-to',
   'define-property': 'ts-native',
 };
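
The commit message says unmapped tags fall through to other. A minimal sketch of how such a lookup might behave is shown here; `techniqueBucket` and `countByBucket` are hypothetical helpers, the map is only the excerpt visible in this hunk, and the real benchmark may attribute edges differently.

```typescript
// Excerpt of the map as shown in the diff; the real map has more entries.
const TECHNIQUE_MAP: Record<string, string> = {
  'pts-set': 'points-to',
  'pts-array-from': 'points-to',
  'pts-spread': 'points-to',
  'pts-param': 'points-to',
  'define-property': 'ts-native',
};

// Hypothetical lookup: tags missing from the map land in 'other'.
// The own-property check avoids matching inherited keys such as 'constructor'.
function techniqueBucket(tag: string): string {
  return Object.prototype.hasOwnProperty.call(TECHNIQUE_MAP, tag)
    ? TECHNIQUE_MAP[tag]
    : 'other';
}

// Hypothetical tally of edges per technique bucket.
function countByBucket(tags: string[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const tag of tags) {
    const bucket = techniqueBucket(tag);
    counts.set(bucket, (counts.get(bucket) ?? 0) + 1);
  }
  return counts;
}

// 'unknown-tag' is a made-up tag used to show the fallback.
const counts = countByBucket(['pts-param', 'pts-spread', 'unknown-tag']);
console.log(Object.fromEntries(counts));
```

Without the `'pts-param'` entry, the first tag in this example would be counted under other instead of points-to, which is the misattribution the commit describes.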
