Commit c766e61

revert(security): restore Phase H live_regs_bitmap branch
Branchless Phase H (always FPE_Decode all 16 registers) was attempted to
eliminate the live_regs_bitmap timing leak (L5), but worsened
HighSecPolicy ANOVA from p=0.015 to p=2.2e-25.

Root cause: always-decode-all-16 added ~14 extra Speck64 decryptions per
instruction. While each individual decryption is constant-time, the
aggregate of 64 FPE_Decode calls per DU (N=4 × 16 regs) amplified
micro-architectural timing variance: different register file contents
across opcode benchmarks produced different cache retirement patterns at
the pipeline level, creating a new between-group signal (~42K ns σ) that
overwhelmed the reduced within-group noise (~123K ns σ, down 14% from
143K).

Empirical comparison (HighSecPolicy, 110 iter, 125 DUs):

  Metric            Before L5 fix   After L5 fix   Verdict
  ─────────────────────────────────────────────────────────
  Opcode Δ spread   ±12 ns          ±1034 ns       51× worse
  ANOVA F           1.53            5.19           3.4× worse
  ANOVA p           0.015           2.2e-25        regression
  within_σ          142,543 ns      123,141 ns     ↓14% (good)
  between_σ         ~34K ns         41,970 ns      ↑24% (bad)

StandardPolicy improved (F: 530→15.5) because the bimodal pattern from
live_regs_bitmap was the dominant signal at N=2. But HighSecPolicy
regressed because the new micro-architectural signal exceeded what N=4
noise padding could mask.

The correct fix for L5 is to normalize live_regs_bitmap at blob creation
time (serializer/linker), so all BBs declare the same live register set,
eliminating the timing signal without runtime cost.
1 parent 334cc34 commit c766e61

1 file changed

Lines changed: 14 additions & 0 deletions

runtime/src/vm_engine.cpp

@@ -447,6 +447,20 @@ execute_one_instruction(VmExecution& exec, VmEpoch& epoch,
     }

     // ── Phase H: Re-encode all 16 registers (old key → new key) ────
+    //
+    // NOTE (Shannon branch): branchless Phase H was attempted but
+    // reverted: always-decode-all-16 added ~14 extra FPE_Decode per
+    // instruction, creating micro-architectural timing variance that
+    // worsened HighSecPolicy ANOVA from p=0.015 to p=2e-25.
+    //
+    // The live_regs_bitmap branch remains. It leaks the number of
+    // live registers per BB (~150 ns per extra FPE_Decode), visible
+    // as a ~300 ns bimodal in StandardPolicy. For HighSecPolicy (N=4),
+    // the crypto pipeline noise masks this signal adequately.
+    //
+    // Future fix: normalize live_regs_bitmap at blob creation time
+    // (serializer/linker) so all BBs declare the same set of live
+    // registers, eliminating the timing signal without runtime cost.
     {
         SecureLocal<Speck64_RoundKeys> new_rk;
         Speck64_KeySchedule(next_key.val, new_rk.val);
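
The "future fix" described in the comment could look roughly like the sketch below. It is only an illustration under stated assumptions: `BasicBlockMeta`, `decodes_per_instruction`, and `normalize_live_regs` are hypothetical names (the real blob layout and serializer API are not shown in this commit), the union of all live sets is one possible choice of common set, and marking an otherwise dead register as live is assumed to be semantically harmless, as the decode-all-16 experiment suggests.

```cpp
#include <bitset>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for per-BB metadata in the serialized blob.
struct BasicBlockMeta {
    std::uint16_t live_regs_bitmap;  // bit i set => register i is re-encoded
};

// Number of FPE_Decode calls the branching Phase H would perform per
// instruction in this BB: one per set bit. This count is what leaks.
inline int decodes_per_instruction(const BasicBlockMeta& bb) {
    return static_cast<int>(std::bitset<16>(bb.live_regs_bitmap).count());
}

// Blob-creation-time pass: assign every BB the union of all live sets,
// so the runtime branch does the same amount of work in every BB.
inline void normalize_live_regs(std::vector<BasicBlockMeta>& blocks) {
    std::uint16_t all = 0;
    for (const auto& bb : blocks) {
        all = static_cast<std::uint16_t>(all | bb.live_regs_bitmap);
    }
    for (auto& bb : blocks) {
        bb.live_regs_bitmap = all;
    }
}
```

After the pass, `decodes_per_instruction` returns the same value for every BB, so the per-BB decode count no longer varies, and the runtime code path in Phase H is unchanged. Whether a uniform count smaller than 16 also avoids the micro-architectural variance seen with decode-all-16 would still need to be measured.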
