Skip to content

Commit f56cec6

Browse files
avrabeclaude
andauthored
test(vcr-ra): non-vacuous --relocatable gate for const-CSE — proves it fires + stays correct on gale's path (#242) (#522)
The const-CSE no-regression fix (#519) ships with an honest gap: its `--relocatable` gate was INERT. gale's gust_mix 90→92 B regression was on `--relocatable`, which routes through `select_direct()` where only the post-hoc `apply_const_cse` runs — but the `const_cse.wat` arithmetic corpus never makes the direct selector emit the redundant same-value-in-two-registers shape that pass dedups, so the gate gave zero positive evidence on the exact path the bug lived on. This adds `const_cse_direct.wat`: single-param, pure-register, reloc-free shapes (a >8-bit const reused across several independent sub-expressions summed at the end) that DO make the direct selector emit the redundant `movw`, so post-hoc `apply_const_cse` fires on `--relocatable` (r1 44→40, r2 38→34). The differential now runs a NON-VACUOUS direct-path gate that asserts: (a) CSE actually fires on >=1 function — fails if the gate goes blind; (b) no function grows (the no-regression property on gale's path); (c) every result is bit-identical to wasmtime under unicorn (correctness of post-hoc CSE on the direct selector's output). This is the positive evidence on gale's exact path that #519 could not provide. Behavior-frozen: new fixture + harness only, no codegen change — frozen anchors (control_step 0x00210A55, flight_algo 0x07FDF307) and the const_cse flag-off golden are untouched (frozen gate 3/3, golden 2/2). The full differential passes (exit 0); flag-off byte-identical. Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
1 parent 0a23a09 commit f56cec6

2 files changed

Lines changed: 105 additions & 29 deletions

File tree

scripts/repro/const_cse_differential.py

Lines changed: 65 additions & 29 deletions
Original file line numberDiff line numberDiff line change
@@ -28,6 +28,11 @@
2828
shapes (see the .wat header): large/small/negative/mixed consts, reuse ACROSS an
2929
if/else (cache must reset), and a 12-live-local function that forces real spills.
3030
31+
It also runs a DIRECT-PATH (`--relocatable`) gate on `const_cse_direct.wat` —
32+
the path gale's gust_mix regression lived on, where only the post-hoc pass
33+
runs — asserting CSE actually fires there, stays bit-identical to wasmtime,
34+
and never grows a function (non-vacuous: fails if the fixture stops firing).
35+
3136
Run (needs wasmtime + unicorn + pyelftools):
3237
SYNTH=./target/debug/synth python scripts/repro/const_cse_differential.py
3338
Exits nonzero on any value mismatch OR any function that grew under const-CSE.
@@ -49,25 +54,31 @@
4954
)
5055

5156
WAT = Path(__file__).with_name("const_cse.wat")
57+
# Direct-path (`--relocatable`) trigger fixtures — shapes that make the DIRECT
58+
# selector emit the redundant const materialization `apply_const_cse` dedups, so
59+
# the direct-path gate actually exercises gale's path (single-param, reloc-free →
60+
# runnable under unicorn). See its header.
61+
WAT_DIRECT = Path(__file__).with_name("const_cse_direct.wat")
62+
DIRECT_EXPORTS = ["r1", "r2", "rneg"]
5263
SYNTH = os.environ.get("SYNTH", "./target/release/synth")
5364
CODE, STK, RET = 0x100000, 0x900000, 0x300000
5465
EXPORTS = ["large3", "small3", "neg", "mixed", "ctrl", "spill12"]
5566
INPUTS = [5, 0, -3, 1000, -100000, 7]
5667

5768

58-
def wasmtime_run(fn, arg):
69+
def wasmtime_run(fn, arg, wat=WAT):
5970
engine = wasmtime.Engine()
60-
module = wasmtime.Module.from_file(engine, str(WAT))
71+
module = wasmtime.Module.from_file(engine, str(wat))
6172
store = wasmtime.Store(engine)
6273
inst = wasmtime.Instance(store, module, [])
6374
return inst.exports(store)[fn](store, arg)
6475

6576

66-
def compile_optimized(out, cse_on, relocatable=False):
77+
def compile_optimized(out, cse_on, relocatable=False, wat=WAT):
6778
env = {"PATH": "/usr/bin:/bin"}
6879
if cse_on:
6980
env["SYNTH_CONST_CSE"] = "1"
70-
cmd = [SYNTH, "compile", str(WAT), "-o", out, "-b", "arm",
81+
cmd = [SYNTH, "compile", str(wat), "-o", out, "-b", "arm",
7182
"--target", "cortex-m4", "--all-exports"]
7283
if relocatable:
7384
cmd.append("--relocatable")
@@ -165,34 +176,59 @@ def main():
165176
print(f"OK {fn}: {o} -> {n} B ({tag})")
166177
fails += grew
167178

168-
# --relocatable NO-REGRESSION GATE (#242). gale's gust_mix 90→92 B regression
169-
# was on `--relocatable`, which routes through `select_direct()` — there the
170-
# INLINE const cache never runs, so the post-hoc `apply_const_cse` acts ALONE.
171-
# This is the precise path of the bug; the optimized-path gate above does not
172-
# exercise it (inline aliasing dominates there). We compile the corpus on the
173-
# direct path flag-on vs flag-off and assert no function grows. NOTE: post-hoc
174-
# CSE is currently INERT on this corpus (the direct selector does not emit the
175-
# redundant same-value-in-two-registers shape these arithmetic fixtures would
176-
# need) — so this gate is, for now, a tripwire that will light up the moment a
177-
# fixture that DOES trigger it (e.g. gale's gust_mix Q8 clamp mixer) is added.
178-
print("\n--- per-function no-regression gate, --relocatable / direct path (#242) ---")
179-
roff, ron = "/tmp/const_cse_reloc_off.o", "/tmp/const_cse_reloc_on.o"
180-
compile_optimized(roff, cse_on=False, relocatable=True)
181-
compile_optimized(ron, cse_on=True, relocatable=True)
182-
_, _, _, roff_sizes = load(roff)
183-
_, _, _, ron_sizes = load(ron)
184-
rgrew, rfired = 0, 0
185-
for fn in sorted(set(roff_sizes) & set(ron_sizes)):
186-
o, n = roff_sizes[fn], ron_sizes[fn]
179+
# --relocatable DIRECT-PATH GATE (#242) — the precise path of gale's gust_mix
180+
# 90→92 B regression. `--relocatable` routes through `select_direct()`, where
181+
# the INLINE const cache never runs, so the post-hoc `apply_const_cse` acts
182+
# ALONE; the optimized-path gate above does not exercise this (inline aliasing
183+
# dominates there). The main `const_cse.wat` corpus is INERT on this path (its
184+
# shapes don't make the direct selector emit the redundant same-value-in-two-
185+
# registers materialization apply_const_cse dedups), so it gave no positive
186+
# evidence. `const_cse_direct.wat` DOES trigger it. This gate is NON-VACUOUS:
187+
# it asserts (a) CSE actually fires on ≥1 function (else the gate has gone
188+
# blind — fail), (b) no function grows (the no-regression property on gale's
189+
# path), and (c) every result is bit-identical to wasmtime under unicorn
190+
# (correctness of post-hoc CSE on the direct selector's output).
191+
print("\n--- direct-path (--relocatable) gate, const_cse_direct.wat (#242) ---")
192+
doff, don = "/tmp/const_cse_direct_off.o", "/tmp/const_cse_direct_on.o"
193+
compile_optimized(doff, cse_on=False, relocatable=True, wat=WAT_DIRECT)
194+
compile_optimized(don, cse_on=True, relocatable=True, wat=WAT_DIRECT)
195+
d_off_code, d_off_base, _, d_off_sizes = load(doff)
196+
d_on_code, d_on_base, d_on_syms, d_on_sizes = load(don)
197+
dgrew, dfired = 0, 0
198+
for fn in DIRECT_EXPORTS:
199+
o, n = d_off_sizes.get(fn), d_on_sizes.get(fn)
200+
if o is None or n is None:
201+
continue
187202
if n > o:
188-
rgrew += 1
203+
dgrew += 1
189204
print(f"REGRESS {fn}: {o} -> {n} B (+{n - o}) — const-CSE grew a function")
190205
elif n < o:
191-
rfired += 1
192-
print(f"OK {fn}: {o} -> {n} B (-{o - n}) — CSE fired")
193-
if rfired == 0:
194-
print(f"(post-hoc CSE inert on this corpus — {len(roff_sizes)} fns, none changed)")
195-
fails += rgrew
206+
dfired += 1
207+
print(f"OK {fn}: {o} -> {n} B (-{o - n}) — CSE fired (direct path)")
208+
else:
209+
print(f"-- {fn}: {o} B (inert)")
210+
# correctness vs wasmtime on the CSE-on direct-path object (reloc-free → runs)
211+
for fn in DIRECT_EXPORTS:
212+
faddr = d_on_syms.get(fn)
213+
if faddr is None:
214+
continue
215+
for arg in INPUTS:
216+
exp = wasmtime_run(fn, arg, wat=WAT_DIRECT)
217+
got = unicorn_run(d_on_code, d_on_base, faddr, arg)
218+
if not (isinstance(got, int) and got == exp):
219+
fails += 1
220+
print(f"FAIL {fn} cse-on(direct)({arg}) = {got} (wasmtime {exp})")
221+
fails += dgrew
222+
if dfired == 0:
223+
# Non-vacuity guard: if no function shrank, the fixture no longer triggers
224+
# post-hoc CSE on the direct path — the gate has gone blind. Fail loudly so
225+
# it is fixed rather than silently passing.
226+
fails += 1
227+
print("FAIL: direct-path gate is VACUOUS — no function exercised post-hoc "
228+
"CSE (fixture no longer triggers it). Fix const_cse_direct.wat.")
229+
else:
230+
print(f"direct-path gate non-vacuous: CSE fired on {dfired} function(s), "
231+
f"all results match wasmtime, none grew")
196232

197233
print("\n--- instruction-count delta, optimized path (information) ---")
198234
off_n, on_n = insn_count(off_code), insn_count(on_code)

scripts/repro/const_cse_direct.wat

Lines changed: 40 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,40 @@
1+
;; VCR-RA const-CSE (#242) — DIRECT-PATH (`--relocatable`) trigger fixtures.
2+
;;
3+
;; gale's gust_mix 90->92 B regression was on `--relocatable`, which routes
4+
;; through the direct selector (`select_direct`) — there the optimized path's
5+
;; INLINE const cache never runs, so the post-hoc `liveness::apply_const_cse`
6+
;; acts ALONE. The main `const_cse.wat` corpus is INERT on that path (its
7+
;; arithmetic shapes don't make the direct selector materialize the same value
8+
;; into two distinct live registers in one straight-line segment), so it gave no
9+
;; positive evidence the no-regression fix actually engages gale's path.
10+
;;
11+
;; These shapes DO make the direct selector emit the redundant materialization
12+
;; that `apply_const_cse` then dedups: a >8-bit const (needs `movw`) reused across
13+
;; several independent sub-expressions whose results stay live to a final sum.
14+
;; All are single-param, pure-register (no memory) so they link with zero
15+
;; relocations and run directly under unicorn (R0=arg) for a correctness check vs
16+
;; wasmtime. CSE fires (each shrinks) on `--relocatable`, so the differential's
17+
;; direct-path gate is non-vacuous: it proves post-hoc CSE engages gale's path,
18+
;; stays correct, and never grows a function.
19+
(module
20+
(memory 1)
21+
22+
;; one const (30011) reused by four independent ops, all summed.
23+
(func (export "r1") (param i32) (result i32)
24+
(i32.add
25+
(i32.add (i32.mul (local.get 0) (i32.const 30011)) (i32.xor (local.get 0) (i32.const 30011)))
26+
(i32.add (i32.and (local.get 0) (i32.const 30011)) (i32.or (local.get 0) (i32.const 30011)))))
27+
28+
;; same const (50021) reused by three ops, summed — different op mix.
29+
(func (export "r2") (param i32) (result i32)
30+
(i32.add
31+
(i32.add (i32.mul (local.get 0) (i32.const 50021)) (i32.sub (local.get 0) (i32.const 50021)))
32+
(i32.xor (local.get 0) (i32.const 50021))))
33+
34+
;; NEGATIVE const (-30013) — exercises the movw/movt reconstruction path in
35+
;; const_materialization (keyed by u32 bit-pattern).
36+
(func (export "rneg") (param i32) (result i32)
37+
(i32.add
38+
(i32.add (i32.mul (local.get 0) (i32.const -30013)) (i32.xor (local.get 0) (i32.const -30013)))
39+
(i32.and (local.get 0) (i32.const -30013))))
40+
)

0 commit comments

Comments
 (0)