Commit bf85d94
feat(opt): eliminate_dead_locals — drop write-only locals (-0.86% on gale)
New pass that removes locals declared by a function but never read
by any LocalGet anywhere in the function body. Targets the gale
"default-then-override" pattern: rustc/LLVM materializes an EINVAL
default at function entry, then every reachable path overwrites
it before return. The default's local.set becomes pure dead store.
Key property: "zero reads anywhere" is path-INSENSITIVE. Unlike full
liveness (Pick #3), this rule is sound regardless of BrIf/BrTable/
early-Return control flow. So the pass DOES NOT need the
has_dataflow_unsafe_control_flow guard that gates simplify_locals
and coalesce_locals on every kernel-style early-exit function.
v0.5.0's simplify_locals had zero effect on the gale workload by
construction; this pass picks up where it refused to act.
Algorithm:
  1. Recursive read-count scan over the instruction tree.
  2. Dead set = { idx | idx >= param_count && reads(idx) == 0 }.
  3. Neutralize writes:
       LocalSet dead → Drop     (preserves [T] -> [] stack effect)
       LocalTee dead → removed  (Tee's [T] -> [T] passes through)
  4. Pack-down remap: dense indices, reuse remap_instructions.
  5. Z3 translation validation; revert on rejection.
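
Steps 1 to 4 can be sketched on a toy instruction tree. This is a minimal
illustration, not the pass itself: the Instr enum, count_reads, rewrite, and
eliminate_dead_locals_sketch are hypothetical names, the real pass reuses
remap_instructions for step 4, and the Z3 validation of step 5 is omitted.

```rust
use std::collections::HashMap;

/// Toy instruction tree (hypothetical stand-in for the real IR).
#[derive(Debug, Clone, PartialEq)]
enum Instr {
    I32Const(i32),
    LocalGet(u32),
    LocalSet(u32),
    LocalTee(u32),
    Drop,
    Block(Vec<Instr>),
    Return,
}

/// Step 1: recursive read-count scan. Only LocalGet counts as a read.
fn count_reads(body: &[Instr], reads: &mut HashMap<u32, usize>) {
    for ins in body {
        match ins {
            Instr::LocalGet(i) => *reads.entry(*i).or_insert(0) += 1,
            Instr::Block(inner) => count_reads(inner, reads),
            _ => {}
        }
    }
}

/// Steps 3 and 4: neutralize writes to dead locals (remap entry is None)
/// and renumber every surviving local to its packed index.
fn rewrite(body: &[Instr], remap: &[Option<u32>]) -> Vec<Instr> {
    let mut out = Vec::new();
    for ins in body {
        match ins {
            Instr::LocalGet(i) => {
                // A dead local has zero reads, so this lookup always succeeds.
                out.push(Instr::LocalGet(remap[*i as usize].expect("read of dead local")));
            }
            Instr::LocalSet(i) => match remap[*i as usize] {
                Some(n) => out.push(Instr::LocalSet(n)),
                None => out.push(Instr::Drop), // keep the [T] -> [] effect
            },
            Instr::LocalTee(i) => {
                // A dead tee is simply omitted: the value stays on the stack.
                if let Some(n) = remap[*i as usize] {
                    out.push(Instr::LocalTee(n));
                }
            }
            Instr::Block(inner) => out.push(Instr::Block(rewrite(inner, remap))),
            other => out.push(other.clone()),
        }
    }
    out
}

/// Returns (number of surviving non-param locals, rewritten body).
fn eliminate_dead_locals_sketch(
    param_count: u32,
    local_count: u32,
    body: &[Instr],
) -> (u32, Vec<Instr>) {
    let mut reads = HashMap::new();
    count_reads(body, &mut reads);

    // Step 2: a local is dead if it is not a param and is never read.
    let mut remap = Vec::new();
    let mut next = 0u32;
    for idx in 0..param_count + local_count {
        let dead = idx >= param_count && reads.get(&idx).copied().unwrap_or(0) == 0;
        if dead {
            remap.push(None);
        } else {
            remap.push(Some(next));
            next += 1;
        }
    }
    (next - param_count, rewrite(body, &remap))
}
```

Because the dead set is computed from a whole-function read count, the sketch
never inspects control flow, which mirrors why no
has_dataflow_unsafe_control_flow guard is needed.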
Stack-effect rationale for the asymmetric LocalSet/LocalTee handling:
  LocalSet idx : [T] -> []   so Drop is the substitute
  LocalTee idx : [T] -> [T]  so removing it leaves the stack passing through
Confusing these would corrupt the stack: replacing LocalTee with
Drop would consume a value that downstream consumers expected to
remain.
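
The asymmetry can be checked with a small stack-depth simulation. This is an
illustrative sketch only; Op, effect, and final_depth are hypothetical names,
and operand types are ignored so that only the stack height is tracked.

```rust
/// Minimal opcode set for the demonstration (hypothetical, untyped).
#[derive(Clone, Copy)]
enum Op {
    Const,
    LocalSet,
    LocalTee,
    Drop,
}

/// (values popped, values pushed) for each opcode.
fn effect(op: Op) -> (usize, usize) {
    match op {
        Op::Const => (0, 1),
        Op::LocalSet => (1, 0),
        Op::LocalTee => (1, 1),
        Op::Drop => (1, 0),
    }
}

/// Final stack depth, or None if the sequence underflows the stack.
fn final_depth(ops: &[Op]) -> Option<usize> {
    let mut depth = 0usize;
    for &op in ops {
        let (pops, pushes) = effect(op);
        depth = depth.checked_sub(pops)? + pushes;
    }
    Some(depth)
}
```

For the sequence const; local.tee dead; local.set live, deleting the tee
leaves const; local.set, which still balances. Substituting Drop for the tee
gives const; drop; local.set, and the trailing local.set finds an empty stack.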
Measurement on gale_ffi (code section size, delta vs. baseline):
  baseline:             811 bytes
  v0.5.0 (regression):  862 bytes (+6.3%)
  v0.6.0 PR-A (CSE):    808 bytes (-0.4%)
  v0.6.0 PR-B (this):   804 bytes (-0.86%)
PR-A and PR-B are independent and stack: PR-A fixes the regression,
PR-B adds an optimization that wasm-opt performs and LOOM previously
skipped on early-exit code.
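
The percentages are plain relative deltas against the 811-byte baseline. A
small helper (percent_delta is a hypothetical name, not part of the
repository) reproduces them:

```rust
/// Relative size change of `measured` against `baseline`, in percent.
/// Negative values mean the code section shrank.
fn percent_delta(baseline: u32, measured: u32) -> f64 {
    (measured as f64 - baseline as f64) / baseline as f64 * 100.0
}
```

With the figures above, percent_delta(811, 804) is about -0.86,
percent_delta(811, 808) about -0.37, and percent_delta(811, 862) about +6.29.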
Visual confirmation on gale_bitarray_alloc_validate:
  before: (local i32) ; i32.const -22 ; local.set 3 ; ...
  after:  (no locals) ; i32.const -22 ; drop ; ...
The leftover `const; drop` is dead code that vacuum could in principle
eliminate, but vacuum runs before this pass. A const+drop peephole
in vacuum is a follow-up (~5 LOC).
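
One possible shape for that follow-up peephole, shown on a flat toy
instruction list. This is a sketch under assumptions: Instr and
fold_const_drop are hypothetical names, and vacuum's actual structure is not
shown in this commit.

```rust
/// Toy flat instruction list (hypothetical stand-in for the real IR).
#[derive(Debug, Clone, PartialEq)]
enum Instr {
    I32Const(i32),
    Drop,
    LocalGet(u32),
    Return,
}

/// Removes every `const; drop` pair. A constant has no side effects, so
/// pushing it and immediately dropping it is a no-op.
fn fold_const_drop(body: &[Instr]) -> Vec<Instr> {
    let mut out: Vec<Instr> = Vec::with_capacity(body.len());
    for ins in body {
        let cancels = *ins == Instr::Drop
            && matches!(out.last(), Some(Instr::I32Const(_)));
        if cancels {
            // The last emitted instruction produced the value being dropped.
            out.pop();
        } else {
            out.push(ins.clone());
        }
    }
    out
}
```

Using the output vector as a stack also folds nested cases such as
const; const; drop; drop in a single pass.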
Tests (5 new): basic_write_only, preserves_used_locals,
localtee_neutralization, packs_indices, skips_params.
Pick #2 from v0.6.0 wasm-opt-gap research agent's plan (narrowed to
the path-insensitive subset; full liveness is Pick #3).
Trace: REQ-3, REQ-141
parent d354424, commit bf85d94
3 files changed
Lines changed: 409 additions & 2 deletions