Commit bf85d94

feat(opt): eliminate_dead_locals — drop write-only locals (-0.86% on gale)
New pass that removes locals declared by a function but never read by any LocalGet anywhere in the function body.

Targets the gale "default-then-override" pattern: rustc/LLVM materializes an EINVAL default at function entry, then every reachable path overwrites it before return. The default's local.set becomes pure dead store.

Key property: "zero reads anywhere" is path-INSENSITIVE. Unlike full liveness (Pick #3), this rule is sound regardless of BrIf/BrTable/early-Return control flow. So the pass DOES NOT need the has_dataflow_unsafe_control_flow guard that gates simplify_locals and coalesce_locals on every kernel-style early-exit function. v0.5.0's simplify_locals had zero effect on the gale workload by construction; this pass picks up where it refused to act.

Algorithm:
  1. Recursive read-count scan over the instruction tree.
  2. Dead set = { idx | idx >= param_count && reads(idx) == 0 }.
  3. Neutralize writes:
       LocalSet dead → Drop     (preserves [T] -> [] stack effect)
       LocalTee dead → removed  (Tee's [T] -> [T] passes through)
  4. Pack-down remap: dense indices, reuse remap_instructions.
  5. Z3 translation validation; revert on rejection.

Stack-effect rationale for the asymmetric LocalSet/LocalTee handling:
  LocalSet idx : [T] -> []    so Drop is the substitute
  LocalTee idx : [T] -> [T]   so removing leaves stack passing through
Confusing these would corrupt the stack: replacing LocalTee with Drop would consume a value that downstream consumers expected to remain.

Measurement on gale_ffi:
  baseline:             code section 811 bytes
  v0.5.0 (regression):  code section 862 bytes (+6.3%)
  v0.6.0 PR-A (CSE):    code section 808 bytes (-0.4%)
  v0.6.0 PR-B (this):   code section 804 bytes (-0.86%, vs baseline)

PR-A and PR-B are independent and stack: PR-A fixes the regression, PR-B exposes a new optimization wasm-opt does that LOOM previously skipped on early-exit code.

Visual confirmation on gale_bitarray_alloc_validate:
  before: (local i32) ; i32.const -22 ; local.set 3 ; ...
  after:  (no locals) ; i32.const -22 ; drop ; ...

The leftover `const; drop` is dead code that vacuum could in principle eliminate, but vacuum runs before this pass. A const+drop peephole in vacuum is a follow-up (~5 LOC).

Tests (5 new): basic_write_only, preserves_used_locals, localtee_neutralization, packs_indices, skips_params.

Pick #2 from v0.6.0 wasm-opt-gap research agent's plan (narrowed to the path-insensitive subset; full liveness is Pick #3).

Trace: REQ-3, REQ-14
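For readers who want to see steps 1 through 4 of the algorithm in one place, here is a minimal sketch over a toy instruction type. It is not the code from this commit: the `Instr` enum, `count_reads`, `rewrite`, and `eliminate_dead_locals_sketch` are hypothetical names invented for illustration, local types are ignored, and the Z3 translation-validation step is omitted entirely.

```rust
// Toy instruction tree (an assumption; the real LOOM IR is not shown here).
#[derive(Debug, Clone, PartialEq)]
enum Instr {
    I32Const(i32),
    LocalGet(u32),
    LocalSet(u32),
    LocalTee(u32),
    Drop,
    Return,
    Block(Vec<Instr>),
}

// Step 1: recursive read-count scan. Only LocalGet counts as a read.
fn count_reads(body: &[Instr], reads: &mut [u32]) {
    for instr in body {
        match instr {
            Instr::LocalGet(idx) => reads[*idx as usize] += 1,
            Instr::Block(inner) => count_reads(inner, reads),
            _ => {}
        }
    }
}

// Steps 3 and 4: neutralize writes to dead locals and renumber live ones.
// `remap[idx]` is None for a dead local, Some(new_idx) otherwise.
fn rewrite(body: &[Instr], remap: &[Option<u32>]) -> Vec<Instr> {
    let mut out = Vec::with_capacity(body.len());
    for instr in body {
        match instr {
            Instr::LocalGet(idx) => {
                let new = remap[*idx as usize].expect("a read local is never dead");
                out.push(Instr::LocalGet(new));
            }
            // LocalSet is [T] -> [], so Drop keeps the stack effect.
            Instr::LocalSet(idx) => match remap[*idx as usize] {
                Some(new) => out.push(Instr::LocalSet(new)),
                None => out.push(Instr::Drop),
            },
            // LocalTee is [T] -> [T], so a dead tee is simply omitted.
            Instr::LocalTee(idx) => {
                if let Some(new) = remap[*idx as usize] {
                    out.push(Instr::LocalTee(new));
                }
            }
            Instr::Block(inner) => out.push(Instr::Block(rewrite(inner, remap))),
            other => out.push(other.clone()),
        }
    }
    out
}

// Returns (number of declared locals kept, rewritten body).
fn eliminate_dead_locals_sketch(
    param_count: u32,
    local_count: u32,
    body: &[Instr],
) -> (u32, Vec<Instr>) {
    let mut reads = vec![0u32; (param_count + local_count) as usize];
    count_reads(body, &mut reads);

    // Step 2: dead set = non-param indices with zero reads.
    // Params are always kept because they are part of the signature.
    let mut remap = Vec::with_capacity(reads.len());
    let mut next = 0u32;
    for (idx, &n) in reads.iter().enumerate() {
        if (idx as u32) < param_count || n > 0 {
            remap.push(Some(next));
            next += 1;
        } else {
            remap.push(None);
        }
    }
    (next - param_count, rewrite(body, &remap))
}
```

Applied to a three-parameter function whose body is `i32.const -22 ; local.set 3 ; local.get 0 ; return`, the sketch keeps zero declared locals and turns the `local.set 3` into `drop`, matching the before/after shape shown for gale_bitarray_alloc_validate. Because the scan never consults control flow, the same result holds whether or not the body contains branches.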
1 parent d354424 commit bf85d94

3 files changed

Lines changed: 409 additions & 2 deletions

.claude/settings.local.json

Lines changed: 5 additions & 1 deletion
@@ -51,7 +51,11 @@
 "Bash(Z3_SYS_Z3_HEADER=/opt/homebrew/include/z3.h LIBRARY_PATH=/opt/homebrew/lib cargo test --release --lib -- test_cse_phase4_keeps_small_constants --nocapture)",
 "Bash(Z3_SYS_Z3_HEADER=/opt/homebrew/include/z3.h LIBRARY_PATH=/opt/homebrew/lib cargo test --release --lib -- test_cse_phase4 --nocapture)",
 "Bash(grep -E \"Failed$|^error\" /private/tmp/claude-501/-Users-r-git-pulseengine-loom/c4560ac2-258a-4b13-acc5-e9e2a47f62e5/tasks/bafdrty57.output)",
-"Bash(cargo fmt *)"
+"Bash(cargo fmt *)",
+"Bash(wasm-tools print *)",
+"Bash(git pull *)",
+"Bash(Z3_SYS_Z3_HEADER=/opt/homebrew/include/z3.h LIBRARY_PATH=/opt/homebrew/lib cargo test --release --lib -- test_eliminate_dead_locals --nocapture)",
+"Bash(git commit *)"
 ],
 "deny": [],
 "ask": []

loom-cli/src/main.rs

Lines changed: 10 additions & 1 deletion
@@ -50,7 +50,7 @@ enum Commands {
         attestation: bool,

         /// Select specific optimization passes (comma-separated)
-        /// Available: inline,precompute,constant-folding,cse,advanced,branches,dce,merge-blocks,vacuum,simplify-locals
+        /// Available: inline,precompute,constant-folding,cse,advanced,branches,dce,merge-blocks,vacuum,simplify-locals,dead-locals
         /// Example: --passes inline,constant-folding,dce
         /// Default: all passes
         #[arg(long, value_delimiter = ',')]
@@ -490,6 +490,15 @@ fn optimize_command(
         track_pass("simplify-locals", before, after);
     }

+    if should_run("dead-locals") {
+        println!(" Running: dead-locals");
+        let before = count_instructions(&module);
+        loom_core::optimize::eliminate_dead_locals(&mut module)
+            .context("Dead-local elimination failed")?;
+        let after = count_instructions(&module);
+        track_pass("dead-locals", before, after);
+    }
+
     stats.optimization_time_ms = start_opt.elapsed().as_millis();
     println!("✓ Optimized in {} ms", stats.optimization_time_ms);
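The hunk above gates the new pass behind `should_run("dead-locals")`, but the definition of `should_run` is not part of this diff. The following is a minimal sketch of what such a filter might look like, assuming it follows the documented behaviour of `--passes` (comma-separated names, with all passes running by default). Both functions here are hypothetical stand-ins written for illustration, not the CLI's actual code.

```rust
/// Hypothetical pass filter: `None` models "no --passes flag given",
/// which the help text documents as "Default: all passes".
fn should_run(selected: Option<&[&str]>, pass: &str) -> bool {
    match selected {
        None => true,
        Some(names) => names.iter().any(|name| *name == pass),
    }
}

/// Hypothetical helper that splits a raw `--passes` value on commas,
/// trimming whitespace and skipping empty entries.
fn parse_passes(raw: &str) -> Vec<&str> {
    raw.split(',')
        .map(str::trim)
        .filter(|name| !name.is_empty())
        .collect()
}
```

Under this reading, `--passes inline,constant-folding,dce` would skip the new pass, while omitting the flag would run it along with everything else.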
