Commit 861d4c0
VCR-ORACLE-001. The optimized ARM path (`ir_to_arm`) resolves branch
displacements by summing a hand-maintained byte-size *estimator* over the
instruction stream — a mirror of the Thumb-2 encoder, kept by hand only
because synth-synthesis cannot depend on synth-backend (the encoder lives
downstream). When the mirror drifts, a forward branch spanning the drifting op
lands at the wrong byte (the #483-class miscompile). This is the structural
cause behind #498.
- Extract the inline `instr_byte_size`/`reg_num` closures from `ir_to_arm` to a
module-level `pub fn estimate_arm_byte_size(op: &ArmOp) -> usize`
(logic-identical; whitespace-normalized diff of the match body is empty).
Frozen byte gate + 59 synthesis tests confirm the optimized path is
bit-identical.
- Add `crates/synth-backend/tests/estimator_encoder_agreement.rs` (synth-backend
CAN see both the estimator and the real encoder): for every op the optimized
path emits, at the operand shapes it emits them in, assert
`estimate_arm_byte_size(op) == ArmEncoder::encode(op).len()` OR a documented
`KNOWN_GAP` pinned to its exact measured (est, enc) pair. A no-wildcard
`coverage()` match over all 220 `ArmOp` variants is a compile-time tripwire:
a new variant won't compile until consciously classified OnPath/OffPath. (It
forces classification, NOT an agreement case — an OnPath variant with no
`cases()` entry still passes vacuously; adding it is a documented manual step.)
Scope: a gap-documenting REGRESSION GUARD, NOT a #498 fix — correcting the
estimator is byte-changing codegen (separately gated). Findings the oracle
records (correct + extend #498's report):
- #498's claim that `Cmp` high-reg drifts is FALSE: 16-bit CMP (T2, 0x45xx)
encodes high regs → 2 bytes. The real high-reg drifts are `Cmn`/`Adds`/`Subs`
(no 16-bit high-reg / flag-setting form) → 4, est 2.
- `Popcnt` is absent from the estimator entirely (`_ => 2`) but the encoder
expands it to 86 bytes — an 84-byte hole, the largest single drift.
- `I64DivU/RemU/DivS/RemS`, `I64Popcnt`, `I64Extend32S` over-estimate.
- far `BOffset`/`BCondOffset` need the 4-byte form but the estimator sizes the
pre-resolution 0-offset placeholder as 2 (single-pass chicken-and-egg).
- `Mov` small-negative imm: encoder's signed `imm <= 255` test emits a
wrong-value 2-byte `MOVS #(imm&0xFF)` — a latent encoder bug, surfaced here.
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
1 parent 31db4a0 commit 861d4c0
3 files changed
Lines changed: 1272 additions & 147 deletions
File tree
- crates
- synth-backend/tests
- synth-synthesis/src
0 commit comments