Commit c766e61
revert(security): restore Phase H live_regs_bitmap branch
Branchless Phase H (always FPE_Decode all 16 registers) was attempted
to eliminate the live_regs_bitmap timing leak (L5), but worsened
HighSecPolicy ANOVA from p=0.015 to p=2.2e-25.
Root cause: always-decode-all-16 added ~14 extra Speck64 decryptions
per instruction. While each individual decryption is constant-time,
the aggregate of 64 FPE_Decode calls per DU (N=4 × 16 regs) amplified
micro-architectural timing variance — different register file contents
across opcode benchmarks produced different cache retirement patterns
at the pipeline level, creating a new between-group signal (~42K ns σ)
that overwhelmed the reduced within-group noise (~123K ns σ, down 14%
from 143K).
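The call counts quoted above can be checked with a small sketch. It assumes live_regs_bitmap is a 16-bit mask with one bit per register; the function names are hypothetical and are not taken from the changed file. A bitmap with two live registers is used only because it reproduces the "~14 extra" figure.

```python
NUM_REGS = 16  # register count stated in the commit message


def decode_calls_branching(live_regs_bitmap: int) -> int:
    """Decodes per instruction when only live registers are decoded."""
    return bin(live_regs_bitmap & 0xFFFF).count("1")


def decode_calls_branchless(live_regs_bitmap: int) -> int:
    """Decodes per instruction when all 16 registers are always decoded."""
    return NUM_REGS


def calls_per_du(calls_per_instruction: int, n: int) -> int:
    """Total decode calls per DU for a policy with multiplier N = n."""
    return calls_per_instruction * n


# Hypothetical instruction with two live registers (bits 0 and 1).
bitmap = 0b0000_0000_0000_0011
extra = decode_calls_branchless(bitmap) - decode_calls_branching(bitmap)
total = calls_per_du(decode_calls_branchless(bitmap), n=4)
```

Here `extra` is the per-instruction overhead of the branchless variant and `total` is the per-DU call count under HighSecPolicy (N=4).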
Empirical comparison (HighSecPolicy, 110 iter, 125 DUs):
Metric            Before L5 fix   After L5 fix   Verdict
─────────────────────────────────────────────────────────
Opcode Δ spread   ±12 ns          ±1034 ns       51× worse
ANOVA F           1.53            5.19           3.4× worse
ANOVA p           0.015           2.2e-25        regression
within_σ          142,543 ns      123,141 ns     ↓14% (good)
between_σ         ~34K ns         41,970 ns      ↑24% (bad)
StandardPolicy improved (F: 530→15.5) because the bimodal pattern from
live_regs_bitmap was the dominant signal at N=2. But HighSecPolicy
regressed because the new micro-architectural signal exceeded what
N=4 noise padding could mask.
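For readers unfamiliar with the statistic, the following is a minimal sketch of a one-way ANOVA F computation, assuming each group holds the timing samples for one opcode benchmark. It is not the benchmark harness used for this commit, and it does not reproduce the figures above. It only shows why F rises when between-group variance grows relative to within-group variance.

```python
from statistics import fmean


def anova_f(groups):
    """One-way ANOVA F statistic for a list of sample groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    all_samples = [x for g in groups for x in g]
    grand_mean = fmean(all_samples)
    group_means = [fmean(g) for g in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variation of the samples around their own group mean.
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within


# Synthetic timings (ns) for three hypothetical opcode groups.
f_stat = anova_f([[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [3.0, 4.0, 5.0]])
```

Shifting the group means further apart while keeping each group's spread fixed increases `f_stat`, which is the pattern the HighSecPolicy regression shows.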
The correct fix for L5 is to normalize live_regs_bitmap at blob
creation time (serializer/linker), so all BBs declare the same live
register set, eliminating the timing signal without runtime cost.
1 parent 334cc34 commit c766e61
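One way the proposed blob-time normalization could look is sketched below. It assumes each basic block carries a 16-bit live_regs_bitmap and that declaring the union of all live sets on every block is acceptable. Both the union choice and the function name are assumptions, not the serializer's actual interface.

```python
from functools import reduce
from operator import or_


def normalize_live_regs(bitmaps):
    """Return one bitmap per basic block, all set to the union of the inputs."""
    union = reduce(or_, bitmaps, 0) & 0xFFFF  # keep only the 16 register bits
    return [union] * len(bitmaps)


# Three hypothetical basic blocks with different live register sets.
normalized = normalize_live_regs([0b0011, 0b0100, 0b1000])
```

After this pass every block declares the same set, so the number of decodes no longer varies from block to block.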
1 file changed
Lines changed: 14 additions & 0 deletions
@@ -447,6 +447,20 @@ (hunk content not captured; 14 lines added as new lines 450-463)