
Commit b53c807

avrabe and claude authored
fix(selector): arg-move cycles break via the parallel-move resolver — no callee-saved demand (#326) (#327)
gale's dissolved z_impl_k_mutex_unlock stopped compiling on v0.11.36: the #311 call-result pair tagging legitimately keeps i64 pairs live across the surrounding code, and when a later call's argument marshalling contains a genuine register cycle, emit_arg_moves demanded a free callee-saved cycle scratch (free_callee_saved -> Err) with R4-R8 all pinned, an exhaustion class the 3b-lite retry ladder never matched.

Fix: emit_arg_moves now builds the same move set and hands it to the v0.11.38 parallel-move resolver (synth_synthesis::parallel_move), which breaks cycles WITHOUT a register when none is free: one SpillState slot, lowered as `str rX, [sp, #slot]; mov...; ldr rY, [sp, #slot]` (slot freed after the sequence). A free callee-saved register, when one exists, is still passed as the scratch candidate and produces pure MOVs.

Two latent bugs die with the old emitter:

* its cycle-break was WRONG CODE: it parked cycle_src but clobbered cycle_dst whose old value the next move still needed (a 2-swap left arg1 = arg0's value; verified by simulation). The RISC-V copy in synth-backend-riscv does NOT share the bug (it defers through scratch correctly) and is untouched.
* free_callee_saved could hand back a register that WAS an arg source (args are popped before the scratch query); the resolver filters its scratch against the move set instead.

Bit-identity: the resolver's phase 1 now pops the lowest-destination ready move first (BTreeSet), which is exactly the legacy scan order for the ascending-destination arg lists, so acyclic marshals (every function that compiled on main) emit identical bytes. Cycle marshals previously either Err'd (this bug) or miscompiled (above), so no correct output changes.

7 fixtures sha256-identical vs origin/main (control_step, flight_seam, flight_seam_flat, high_pressure_i32/i64, u64_unpack, u64_unpack_inlined); the 3 frozen differentials PASS (13/13, 0x07FDF307 x2).
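
The two-phase resolution described above can be sketched roughly as follows. This is a minimal illustration, not the code in synth_synthesis::parallel_move: the names `Op`, `Src`, `resolve` and `simulate` are hypothetical, registers are plain numbers, destinations are assumed unique, and a `BTreeMap` keyed by destination stands in for the BTreeSet ordering the message mentions.

```rust
use std::collections::BTreeMap;

type Reg = u8; // hypothetical encoding: 0 = r0, 1 = r1, ...

/// Where a pending move reads from: a register or a parked spill slot.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Src {
    Reg(Reg),
    Slot(u32),
}

/// Lowered instructions of the marshal sequence.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Op {
    Mov { dst: Reg, src: Reg },  // mov dst, src
    Str { src: Reg, slot: u32 }, // str src, [sp, #slot]
    Ldr { dst: Reg, slot: u32 }, // ldr dst, [sp, #slot]
}

/// Sequentialize simultaneous moves `(dst, src)`.
/// `scratch` is only a candidate; `slot` is used when no scratch survives.
fn resolve(moves: &[(Reg, Reg)], scratch: Option<Reg>, slot: u32) -> Vec<Op> {
    // Pending moves keyed by destination; self-moves are no-ops and dropped.
    let mut pending: BTreeMap<Reg, Src> = moves
        .iter()
        .filter(|(d, s)| d != s)
        .map(|&(d, s)| (d, Src::Reg(s)))
        .collect();
    // Reject a scratch that appears anywhere in the move set.
    let scratch = scratch.filter(|r| moves.iter().all(|&(d, s)| d != *r && s != *r));
    let mut out = Vec::new();
    while !pending.is_empty() {
        // Phase 1: the lowest destination that no pending move still reads.
        let ready = pending
            .keys()
            .copied()
            .find(|d| !pending.values().any(|s| *s == Src::Reg(*d)));
        if let Some(dst) = ready {
            match pending.remove(&dst).unwrap() {
                Src::Reg(src) => out.push(Op::Mov { dst, src }),
                Src::Slot(at) => out.push(Op::Ldr { dst, slot: at }),
            }
            continue;
        }
        // Phase 2: only cycles remain. Park the old value of the lowest
        // destination, then redirect its readers to the parked copy.
        let victim = *pending.keys().next().unwrap();
        let parked = match scratch {
            Some(tmp) => {
                out.push(Op::Mov { dst: tmp, src: victim });
                Src::Reg(tmp)
            }
            None => {
                out.push(Op::Str { src: victim, slot });
                Src::Slot(slot)
            }
        };
        for s in pending.values_mut() {
            if *s == Src::Reg(victim) {
                *s = parked;
            }
        }
    }
    out
}

/// Tiny register-file simulation for checking a lowered sequence.
fn simulate(ops: &[Op], regs: &mut [u32; 16]) {
    let mut slots = BTreeMap::new();
    for op in ops {
        match *op {
            Op::Mov { dst, src } => regs[dst as usize] = regs[src as usize],
            Op::Str { src, slot } => {
                slots.insert(slot, regs[src as usize]);
            }
            Op::Ldr { dst, slot } => regs[dst as usize] = slots[&slot],
        }
    }
}
```

Called as `resolve(&[(0, 1), (1, 0)], None, 0x18)`, the sketch parks r0 in the slot, moves r1 into r0, and reloads r1 from the slot, which has the same str/mov/ldr shape as the lowering described above. Passing a candidate scratch that is itself an argument source (for example `Some(1)`) falls back to the slot, because the candidate is filtered against the move set first.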
i32-only edge: a cycle needing the slot in a function whose first pass reserved NO spill area fails with the ladder-recoverable exhaustion Err (SpillState.area_reserved, mirrored from compute_local_layout) instead of silently aliasing the param-backing slots; the backend retry reserves the area and the resolver then succeeds.

Repro: scripts/repro/mutex_pressure.wat. Three live i64 pairs pin (r3,r4)/(r5,r6)/(r7,r8) across two calls, param reload lands r1, call result lands r0, swap2(param, result) is a genuine r0/r1 swap. On v0.11.36..39: the exact #326 Err. After this fix: compiles, marshal is `str r0,[sp,#0x18]; mov r0,r1; ldr r1,[sp,#0x18]; bl`, and mutex_pressure_differential.py (wasmtime vs unicorn, BL relocs resolved, order-sensitive 2a-b callee) passes 7/7.

Tests: 387 lib (+5: swap-under-saturation spills-not-errs, with-scratch register path, acyclic-saturated plain MOVs, no-area ladder Err, resolver lowest-dst-first order pin); workspace green; clippy/fmt clean.

Honest bounds: >4 args / i64 args are still outside emit_arg_moves' scope (pre-existing); the resolver slot comes from the 8-slot spill pool, so pool exhaustion mid-marshal remains a hard Err (same class as the #320 bound).

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
1 parent 84e7aaf commit b53c807

4 files changed

Lines changed: 499 additions & 82 deletions
