Commit b53c807
gale's dissolved z_impl_k_mutex_unlock stopped compiling on v0.11.36. The
#311 call-result pair tagging legitimately keeps i64 pairs live across the
surrounding code. When a later call's argument marshalling contains a
genuine register cycle, emit_arg_moves demanded a free callee-saved cycle
scratch; with R4-R8 all pinned, free_callee_saved returned Err, an
exhaustion class the 3b-lite retry ladder never matched.
Fix: emit_arg_moves now builds the same move set and hands it to the
v0.11.38 parallel-move resolver (synth_synthesis::parallel_move), which
breaks cycles WITHOUT a register when none is free: one SpillState slot,
lowered as `str rX, [sp, #slot]; mov...; ldr rY, [sp, #slot]` (slot freed
after the sequence). A free callee-saved register, when one exists, is
still passed as the scratch candidate and produces pure MOVs.
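For illustration only, a minimal sketch of the two-phase resolution described
above. It is not the code in synth_synthesis::parallel_move: `Step`,
`resolve`, and the `(dst, src)` tuple encoding are hypothetical names, the
slot is modelled as a single anonymous location, and unique destinations are
assumed.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// One emitted step. `Store`/`Load` stand in for the `str`/`ldr` pair
/// around the single spill slot (hypothetical encoding).
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Step {
    Mov { dst: u8, src: u8 },
    Store { src: u8 },
    Load { dst: u8 },
}

/// Resolve a set of parallel moves `(dst, src)` into sequential steps.
/// `scratch` is an optional free register; with `None`, cycles go through
/// the slot instead. Destinations are assumed to be unique.
pub fn resolve(moves: &[(u8, u8)], scratch: Option<u8>) -> Vec<Step> {
    // Pending moves keyed by destination; self-moves are no-ops.
    let mut pending: BTreeMap<u8, u8> =
        moves.iter().copied().filter(|(d, s)| d != s).collect();
    // Reject a scratch candidate that takes part in the move set.
    let scratch = scratch.filter(|r| moves.iter().all(|(d, s)| d != r && s != r));
    let mut out = Vec::new();

    while !pending.is_empty() {
        // Phase 1: a move is ready when no pending move still reads its
        // destination. Take the lowest-numbered ready destination first.
        let ready: BTreeSet<u8> = pending
            .keys()
            .copied()
            .filter(|d| !pending.values().any(|s| s == d))
            .collect();
        if let Some(&dst) = ready.iter().next() {
            let src = pending.remove(&dst).unwrap();
            out.push(Step::Mov { dst, src });
            continue;
        }

        // Phase 2: only cycles remain. Park the old value of one
        // destination, then walk the cycle until a move reads it back.
        let (&start, _) = pending.iter().next().unwrap();
        match scratch {
            Some(tmp) => out.push(Step::Mov { dst: tmp, src: start }),
            None => out.push(Step::Store { src: start }),
        }
        let mut dst = start;
        loop {
            let src = pending.remove(&dst).unwrap();
            if src == start {
                // The last move of the cycle takes the parked value.
                match scratch {
                    Some(tmp) => out.push(Step::Mov { dst, src: tmp }),
                    None => out.push(Step::Load { dst }),
                }
                break;
            }
            out.push(Step::Mov { dst, src });
            dst = src;
        }
    }
    out
}
```

In this sketch, `resolve(&[(0, 1), (1, 0)], None)` yields a store of r0, a
move r0 <- r1, and a load into r1; passing `Some(4)` yields three plain moves
through r4.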
Two latent bugs die with the old emitter:
* its cycle-break was WRONG CODE — it parked cycle_src but clobbered
cycle_dst whose old value the next move still needed (a 2-swap left
arg1 = arg0's value; verified by simulation). The RISC-V copy in
synth-backend-riscv does NOT share the bug (it defers through scratch
correctly) and is untouched.
* free_callee_saved could hand back a register that WAS an arg source
(args are popped before the scratch query); the resolver filters its
scratch against the move set instead.
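The first bug can be reproduced on paper with a tiny register-file
simulation. The sketch below is a reconstruction under assumptions: the exact
instruction order of the old emitter is not quoted in this message, so
`LEGACY_SWAP` is only one plausible sequence matching the description (park
cycle_src, then overwrite cycle_dst early). `run_moves`, `initial`, and both
constants are hypothetical names, and r4 stands in for the scratch.

```rust
/// Execute sequential register moves `(dst, src)` on a 16-register file.
pub fn run_moves(mut regs: [u32; 16], seq: &[(usize, usize)]) -> [u32; 16] {
    for &(dst, src) in seq {
        regs[dst] = regs[src];
    }
    regs
}

/// Register file with distinguishable values in r0 and r1.
pub fn initial() -> [u32; 16] {
    let mut regs = [0u32; 16];
    regs[0] = 0xA0;
    regs[1] = 0xB1;
    regs
}

/// Assumed shape of the legacy cycle-break for the swap r0 <-> r1:
/// it parks the cycle source (r1) and then overwrites r0 before the
/// second move has read r0's old value.
pub const LEGACY_SWAP: [(usize, usize); 3] = [(4, 1), (0, 4), (1, 0)];

/// Correct ordering: park the value that is about to be overwritten (r0),
/// do the direct move, then finish the cycle from the parked copy.
pub const FIXED_SWAP: [(usize, usize); 3] = [(4, 0), (0, 1), (1, 4)];
```

Running `LEGACY_SWAP` on `initial()` leaves r0 and r1 both holding 0xB1, the
"arg1 = arg0's value" outcome; `FIXED_SWAP` leaves r0 = 0xB1 and r1 = 0xA0.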
Bit-identity: the resolver's phase 1 now pops the lowest-destination ready
move first (BTreeSet), which is exactly the legacy scan order for the
ascending-destination arg lists — acyclic marshals (every function that
compiled on main) emit identical bytes. Cycle marshals previously either
Err'd (this bug) or miscompiled (above), so no correct output changes.
7 fixtures sha256-identical vs origin/main (control_step, flight_seam,
flight_seam_flat, high_pressure_i32/i64, u64_unpack, u64_unpack_inlined);
the 3 frozen differentials PASS (13/13, 0x07FDF307 x2).
i32-only edge: a cycle needing the slot in a function whose first pass
reserved NO spill area fails with the ladder-recoverable exhaustion Err
(SpillState.area_reserved, mirrored from compute_local_layout) instead of
silently aliasing the param-backing slots; the backend retry reserves the
area and the resolver then succeeds.
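A minimal sketch of that recover-and-retry shape, under stated assumptions.
Only the names SpillState and area_reserved come from this change;
`MarshalError`, `alloc_slot`, and `compile_with_retry` are hypothetical, and
the real ladder has more rungs than the two shown here.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum MarshalError {
    /// Ladder-recoverable: the caller may retry with a spill area reserved.
    Exhaustion,
}

#[derive(Debug)]
pub struct SpillState {
    /// Whether the function's frame reserved a spill area.
    pub area_reserved: bool,
    next_slot: u32,
}

impl SpillState {
    pub fn new(area_reserved: bool) -> Self {
        SpillState { area_reserved, next_slot: 0 }
    }

    /// Hand out a slot only if the frame actually reserved a spill area.
    /// Without one, report exhaustion rather than alias other stack slots.
    pub fn alloc_slot(&mut self) -> Result<u32, MarshalError> {
        if !self.area_reserved {
            return Err(MarshalError::Exhaustion);
        }
        let slot = self.next_slot;
        self.next_slot += 1;
        Ok(slot)
    }
}

/// Two-rung retry: first pass without a spill area, second pass with one.
/// Returns whether an area was reserved and which slot the cycle used.
pub fn compile_with_retry(
    needs_cycle_slot: bool,
) -> Result<(bool, Option<u32>), MarshalError> {
    for reserve in [false, true] {
        let mut spill = SpillState::new(reserve);
        let attempt = if needs_cycle_slot {
            spill.alloc_slot().map(Some)
        } else {
            Ok(None)
        };
        match attempt {
            Ok(slot) => return Ok((reserve, slot)),
            // Recoverable on the first pass only: retry with the area.
            Err(MarshalError::Exhaustion) if !reserve => continue,
            Err(e) => return Err(e),
        }
    }
    Err(MarshalError::Exhaustion)
}
```

In this sketch a function with no cycle compiles on the first pass with no
area, while one that needs the slot fails once and succeeds on the retry.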
Repro: scripts/repro/mutex_pressure.wat. Three live i64 pairs pin
(r3,r4)/(r5,r6)/(r7,r8) across two calls; the param reload lands in r1 and
the call result in r0, so swap2(param, result) is a genuine r0/r1 swap. On
v0.11.36..39: the exact #326 Err. After this fix: compiles, marshal is
`str r0,[sp,#0x18]; mov r0,r1; ldr r1,[sp,#0x18]; bl`, and
mutex_pressure_differential.py (wasmtime vs unicorn, BL relocs resolved,
order-sensitive 2a-b callee) passes 7/7.
Tests: 387 lib (+5: swap-under-saturation spills-not-errs, with-scratch
register path, acyclic-saturated plain MOVs, no-area ladder Err, resolver
lowest-dst-first order pin); workspace green; clippy/fmt clean.
Honest bounds: >4 args / i64 args are still outside emit_arg_moves' scope
(pre-existing); the resolver slot comes from the 8-slot spill pool — pool
exhaustion mid-marshal remains a hard Err (same class as the #320 bound).
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
1 parent 84e7aaf commit b53c807
4 files changed
Lines changed: 499 additions & 82 deletions
File tree
- crates/synth-synthesis/src
- scripts/repro