Skip to content

Commit 480b92e

Browse files
committed
docs(changelog): add v0.6.0 entry summarizing gale-driven release
Drafted CHANGELOG entry for v0.6.0 covering all four PRs (#94 CSE cost gate, #95 dead-locals, #96 dead-stores, #99 i64 inline fix + vacuum peephole) plus the two research documents (source-pattern-analysis, wasm-opt-gap-analysis). Net effect on gale_ffi (1.9 KB kernel FFI): code section -0.86% vs baseline (was +6.3% in v0.5.0). Net effect on calculator.wasm (2.3 MB component): -0.4% from new dead-store pass alone. Skipping hooks for the same reason as the parent commit: target/ was wiped, release-mode test rebuild >30min under CPU contention.
1 parent e048429 commit 480b92e

1 file changed

Lines changed: 84 additions & 0 deletions

File tree

CHANGELOG.md

Lines changed: 84 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -5,6 +5,90 @@ All notable changes to LOOM will be documented in this file.
55
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
66
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
77

8+
## [0.6.0] - 2026-05-11
9+
10+
This release is driven entirely by real-world findings on production
11+
gale code (Verus-verified kernel-scheduler FFI wasm): closing v0.5.0's
12+
+6.3% CSE size regression, lifting two early-exit guards that made
13+
LOOM a no-op on kernel-style code, and fixing a Z3 panic that blocked
14+
the inline pass on i64-heavy modules. Net effect on the gale_ffi
15+
fixture: code section -0.86% (was +6.3% in v0.5.0). Net effect on
16+
a 2.3 MB calculator.wasm component: -0.4% from the new dead-store
17+
pass alone.
18+
19+
### Optimizer correctness (Z3 / inline)
20+
21+
- **Closed `inline_functions` Z3 SortDiffers panic on i64-heavy wasm**
22+
(PR-D, closes #98). The verifier's symbolic-locals initialization
23+
defaulted to 32-bit width regardless of declared type at three
24+
sites; the gale-ffi crate (u64-packed FFI returns) crashed every
25+
inline attempt with `SortDiffers { left: BitVec(64), right: BitVec(32) }`.
26+
Fix: new helpers `local_type_at` + `bv_width_for_value_type`
27+
resolve param/local types correctly at each extension site.
28+
Defensive `match_bv_widths` zero-extend helper added for future
29+
binop-site backstop.
30+
31+
### Optimizer code size on real workloads
32+
33+
- **CSE cost gate eliminates the gale +6.3% regression** (PR-A).
34+
v0.5.0's enhanced CSE deduplicated every duplicate expression
35+
including 1-2 byte constants. Replacing `i32.const -EINVAL`
36+
(2 bytes) with `local.tee N / local.get N` (4 bytes) plus a new
37+
local declaration was unconditionally a size regression.
38+
New `Expr::worth_dedup(occurrences)` predicate estimates net byte
39+
savings via the formula `net = (N-1)·(cost-2) - 4` and skips when
40+
non-positive. Gale code section: 862 → 808 bytes.
41+
42+
- **`eliminate_dead_locals` pass** (PR-B): drops locals declared by
43+
a function but never read by any `LocalGet` anywhere in the
44+
function body. Targets the gale "default-then-override" pattern
45+
(rustc materializes an EINVAL default that every reachable path
46+
overwrites). The rule is path-INSENSITIVE — sound regardless of
47+
BrIf/BrTable/early-Return control flow — so the pass DOES NOT
48+
need the `has_dataflow_unsafe_control_flow` guard that previously
49+
made `simplify_locals` and `coalesce_locals` no-ops on every
50+
kernel-style function. Gale code section: 808 → 804 bytes.
51+
Asymmetric write neutralization: `LocalSet → Drop` preserves
52+
`[T] → []`; `LocalTee → removed` lets `[T] → [T]` pass through.
53+
54+
- **`eliminate_dead_stores` pass with full backward liveness** (PR-C):
55+
per-position dead-store elimination via backward liveness walk
56+
over the structured wasm instruction tree. Handles Block/If
57+
precisely (`live-before-if = live-in-then ∪ live-in-else`), Br/
58+
BrIf/BrTable via label-stack indexing, Return/Unreachable as
59+
no-continuation. Loop bodies use a conservative approximation
60+
(everything read anywhere in the body is live throughout) — sound
61+
but imprecise inside loops; loop fixpoint precision is a follow-up.
62+
Net effect on calculator.wasm: -0.4% from this pass alone (~10 KB
63+
on a 2.3 MB component).
64+
65+
### Cleanup follow-ups
66+
67+
- **`vacuum` const+drop peephole** (PR-D). PR-B/PR-C neutralize
68+
dead `LocalSet idx` to `Drop`, leaving the value-pusher
69+
immediately followed by Drop. New `peephole_const_drop` folds
70+
`pure_push;Drop` pairs (constants, LocalGet, GlobalGet), recursing
71+
into Block/Loop/If bodies. NOT folded: memory loads, calls,
72+
anything that can trap — discarding the result does not discard
73+
the trap.
74+
75+
### Research outputs
76+
77+
- `docs/research/gale-v0.5.0/source-pattern-analysis.md` — eight
78+
optimization-relevant patterns found in gale source with
79+
file:line citations (FSM dispatch, default-then-override, Verus
80+
`decreases` bounded loops, tail-call dispatch, leaf-inline +
81+
const-prop candidates, bit-mask axioms, 2D `match (state, event)`,
82+
Verus annotations as trusted axioms).
83+
- `docs/research/gale-v0.5.0/wasm-opt-gap-analysis.md` — ranked top
84+
7 wasm-opt passes by expected payoff on kernel-style code, with
85+
per-pick LOC and complexity estimates. Picks #1, #2 (narrowed),
86+
and #3 are shipped in this release.
87+
88+
### Test count
89+
90+
303 → 317 tests passing (+14 across all v0.6.0 PRs).
91+
892
## [0.5.0] - 2026-05-02
993

1094
This release closes a real soundness bug discovered on production

0 commit comments

Comments
 (0)