@@ -5,6 +5,90 @@ All notable changes to LOOM will be documented in this file.
55The format is based on [ Keep a Changelog] ( https://keepachangelog.com/en/1.1.0/ ) ,
66and this project adheres to [ Semantic Versioning] ( https://semver.org/spec/v2.0.0.html ) .
77
8+ ## [ 0.6.0] - 2026-05-11
9+
10+ This release is driven entirely by real-world findings on production
11+ gale code (Verus-verified kernel-scheduler FFI wasm): closing v0.5.0's
12+ +6.3% CSE size regression, lifting two early-exit guards that made
13+ LOOM a no-op on kernel-style code, and fixing a Z3 panic that blocked
14+ the inline pass on i64-heavy modules. Net effect on the gale_ffi
15+ fixture: code section -0.86% (was +6.3% in v0.5.0). Net effect on
16+ a 2.3 MB calculator.wasm component: -0.4% from the new dead-store
17+ pass alone.
18+
19+ ### Optimizer correctness (Z3 / inline)
20+
21+ - ** Closed ` inline_functions ` Z3 SortDiffers panic on i64-heavy wasm**
22+ (PR-D, closes #98 ). The verifier's symbolic-locals initialization
23+ defaulted to 32-bit width regardless of declared type at three
24+ sites; the gale-ffi crate (u64-packed FFI returns) crashed every
25+ inline attempt with ` SortDiffers { left: BitVec(64), right: BitVec(32) } ` .
26+ Fix: new helpers ` local_type_at ` + ` bv_width_for_value_type `
27+ resolve param/local types correctly at each extension site.
28+ Defensive ` match_bv_widths ` zero-extend helper added for future
29+ binop-site backstop.
30+
31+ ### Optimizer code size on real workloads
32+
33+ - ** CSE cost gate eliminates the gale +6.3% regression** (PR-A).
34+ v0.5.0's enhanced CSE deduplicated every duplicate expression
35+ including 1-2 byte constants. Replacing ` i32.const -EINVAL `
36+ (2 bytes) with ` local.tee N / local.get N ` (4 bytes) plus a new
37+ local declaration was unconditionally a size regression.
38+ New ` Expr::worth_dedup(occurrences) ` predicate estimates net byte
39+ savings via the formula ` net = (N-1)·(cost-2) - 4 ` and skips when
40+ non-positive. Gale code section: 862 → 808 bytes.
41+
42+ - ** ` eliminate_dead_locals ` pass** (PR-B): drops locals declared by
43+ a function but never read by any ` LocalGet ` anywhere in the
44+ function body. Targets the gale "default-then-override" pattern
45+ (rustc materializes an EINVAL default that every reachable path
46+ overwrites). The rule is path-INSENSITIVE — sound regardless of
47+ BrIf/BrTable/early-Return control flow — so the pass DOES NOT
48+ need the ` has_dataflow_unsafe_control_flow ` guard that previously
49+ made ` simplify_locals ` and ` coalesce_locals ` no-ops on every
50+ kernel-style function. Gale code section: 808 → 804 bytes.
51+ Asymmetric write neutralization: ` LocalSet → Drop ` preserves
52+ ` [T] → [] ` ; ` LocalTee → removed ` lets ` [T] → [T] ` pass through.
53+
54+ - ** ` eliminate_dead_stores ` pass with full backward liveness** (PR-C):
55+ per-position dead-store elimination via backward liveness walk
56+ over the structured wasm instruction tree. Handles Block/If
57+ precisely (` live-before-if = live-in-then ∪ live-in-else ` ), Br/
58+ BrIf/BrTable via label-stack indexing, Return/Unreachable as
59+ no-continuation. Loop bodies use a conservative approximation
60+ (everything read anywhere in the body is live throughout) — sound
61+ but imprecise inside loops; loop fixpoint precision is a follow-up.
62+ Net effect on calculator.wasm: -0.4% from this pass alone (~ 10 KB
63+ on a 2.3 MB component).
64+
65+ ### Cleanup follow-ups
66+
67+ - ** ` vacuum ` const+drop peephole** (PR-D). PR-B/PR-C neutralize
68+ dead ` LocalSet idx ` to ` Drop ` , leaving the value-pusher
69+ immediately followed by Drop. New ` peephole_const_drop ` folds
70+ ` pure_push;Drop ` pairs (constants, LocalGet, GlobalGet), recursing
71+ into Block/Loop/If bodies. NOT folded: memory loads, calls,
72+ anything that can trap — discarding the result does not discard
73+ the trap.
74+
75+ ### Research outputs
76+
77+ - ` docs/research/gale-v0.5.0/source-pattern-analysis.md ` — eight
78+ optimization-relevant patterns found in gale source with
79+ file: line citations (FSM dispatch, default-then-override, Verus
80+ ` decreases ` bounded loops, tail-call dispatch, leaf-inline +
81+ const-prop candidates, bit-mask axioms, 2D ` match (state, event) ` ,
82+ Verus annotations as trusted axioms).
83+ - ` docs/research/gale-v0.5.0/wasm-opt-gap-analysis.md ` — ranked top
84+ 7 wasm-opt passes by expected payoff on kernel-style code, with
85+ per-pick LOC and complexity estimates. Picks #1 , #2 (narrowed),
86+ and #3 are shipped in this release.
87+
88+ ### Test count
89+
90+ 303 → 317 tests passing (+14 across all v0.6.0 PRs).
91+
892## [ 0.5.0] - 2026-05-02
993
1094This release closes a real soundness bug discovered on production
0 commit comments