fix(permutation): make bitwise_permute constant-time#165
Merged
Conversation
9bdec01 to
60fe1e9
Compare
28a092f to
2e89ac6
Compare
6 tasks
tobyhede
approved these changes
May 11, 2026
Contributor
tobyhede
left a comment
There was a problem hiding this comment.
LGTM.
Adding some unit tests would improve, but not a deal breaker
The previous `bitwise_permute` used `BitArray::get_unchecked(secret_index)` to read bits at secret-derived positions. Even though all sized bit arrays here fit in a single cache line (≤16 bytes), the secret-indexed load is still a microarchitectural hazard — port pressure and other intra-cache-line side channels can leak the index on some CPUs. Reimplement as: unpack to one byte per bit (MSB-first), permute through the now-constant-time `permute_array` primitive, repack. This routes the bit-level operation through the same scan-and-mask machinery that already protects element-wise permutation. Secret intermediates (`bytes`, `bits`, `permuted`) are wrapped in `Zeroizing<_>` for unwind safety, matching the pattern in elementwise. Drops the direct `bitvec` usage in this file; the `BitArray` machinery is no longer needed. Bench results extended to cover bitwise_permute. Overhead: 28×–199× across N=8..128, in line with the elementwise result.
Addresses Toby's review feedback on PR #165 — the existing `bitwise_permute_case` only asserts `output != input`, which is too weak to catch most plausible regressions in the bit-pack/unpack glue. Adds four targeted invariant checks: - `zero_in_zero_out`: 0 -> 0 across every supported width. Catches stray bits introduced by sign/endian bugs in the unpack/repack path. - `all_ones_invariant`: !0 is a fixed point of every bit permutation (all bit positions hold 1, so no movement is observable). Catches missing-bit bugs at high/low width boundaries. - `preserves_hamming_weight`: popcount(permute(x)) == popcount(x). The strongest single property check available here — any bug that drops, duplicates, or mis-positions a bit changes the count. Also justifies the `unsafe new_unchecked` in the NonZero impls (popcount > 0 in => popcount > 0 out). - `is_deterministic`: cheap guard against future nondeterminism (hash-seeds, uninit-memory branches, etc.) creeping into the path. Existing `permute_array` correctness is already covered in elementwise tests, so these new tests focus on the bit-shuffle glue specific to `bitwise.rs` rather than retesting the underlying primitive.
2e89ac6 to
1d247ad
Compare
coderdan
added a commit
that referenced
this pull request
May 11, 2026
The wasm32 backend (#163), permutation cache-timing fixes (#164, #165), and digest 0.11 migration (#161) accumulated enough surface change since 0.1.0-pre4.2 to warrant a minor-version bump rather than another point release in the 0.1.0-pre4.x series — `0.2.0-pre` opens a new pre-release track that better signals the scope to downstream consumers.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Stacked on #164 — review/merge that one first.
Summary
bitwise_permutepreviously usedBitArray::get_unchecked(secret_index)to fetch bits at key-derived positions. The bit array fits in a single cache line (≤16 bytes) so it's not exploitable by classic cache-line probing, but it's still a microarchitectural hazard — port pressure, store-buffer effects, and other intra-cache-line side channels can leak secret indices on some CPUs.permute_arrayprimitive → repack to bytes. This routes the bit-level operation through the same scan-and-mask machinery introduced in fix(permutation): close cache-timing side-channel in permute_array #164.bytes,bits,permuted) wrapped inZeroizing<_>for unwind safety, matching the elementwise pattern.bitvecusage inbitwise.rs; theBitArraymachinery is no longer needed there.Performance impact
Apple silicon (arm64),
--release, 500K iterations, key bytes pre-extracted via the publicPermuteAPI for the "old" reference:In line with the elementwise overhead — bit-level
bitwise_permutereusespermute_arrayinternally, so the cost profile matches. Same SIMD-shuffle follow-up (tbl/pshufb) applies if these sizes show up in a hot path.Verification
cargo test -p vitaminc-permutation— 6 unit + 3 doctests pass (round-trip, key inversion, bitwise correctness across u8/u16/u32/u64/u128 + NonZero variants)cargo fmt --check— cleanpermute_array, no secret-indexed loads remain inbitwise.rsTest plan
bitwise_permute_caseexercises N=8/16/32/64/128 across all int and NonZero variants