Wnaf Optimizations #15

Open
42Pupusas wants to merge 2 commits into RustCrypto:main from 42Pupusas:wnaf-optimizations

@42Pupusas

This is the base work for integrating optimizations discussed here with the proper shared types.

  • Fix wnaf_table to allocate 2^(w-2) entries instead of 2^(w-1): For window size w, wNAF digits have max magnitude 2^(w-1) - 1, so the table is indexed by |digit| / 2 with max index 2^(w-2) - 1. The previous 2^(w-1) allocation computed twice as many odd multiples as needed, wasting point additions during table setup. For w=5 this halves table construction from 16 to 8 entries.
  • Add WnafScalar::from_le_bytes: Constructs wNAF digits from a raw little-endian byte slice, enabling callers to pass pre-decomposed scalars shorter than the full field representation. This is needed for endomorphism-based (GLV) scalar multiplication where a 256-bit scalar is split into two ~128-bit halves — using 17-byte inputs instead of 32-byte produces ~half the wNAF digits and ~half the doublings in the evaluation loop.
  • Add WnafBase::multiscalar_mul_array: Fixed-size array variant of multiscalar_mul that avoids the two collect::<Vec<_>>() heap allocations the iterator-based version requires to build the slice-of-slices for wnaf_multi_exp.
  • Pre-size Vec allocations: WnafScalar::new, from_le_bytes, and WnafBase::new now use Vec::with_capacity sized to the exact output length, avoiding potential reallocation during wnaf_form/wnaf_table.
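
As a rough illustration of the first two points, the sketch below recodes a little-endian byte slice into wNAF digits and evaluates it against a table of only 2^(w-2) odd multiples. It is not the group crate's code: `wnaf_digits`, `odd_multiples`, and `wnaf_mul` are hypothetical names, and plain `i128` integers stand in for curve points so the result can be checked against ordinary multiplication.

```rust
/// Recode little-endian `bytes` into width-`window` NAF digits, least
/// significant first. Simplified sketch, not the group crate's `wnaf_form`.
fn wnaf_digits(bytes: &[u8], window: usize) -> Vec<i64> {
    assert!((2..=16).contains(&window));
    let bit_len = bytes.len() * 8;
    let width = 1i64 << window;
    // Bit `i` of the scalar; positions past the end read as zero.
    let bit = |i: usize| -> i64 {
        if i < bit_len { ((bytes[i / 8] >> (i % 8)) & 1) as i64 } else { 0 }
    };
    let mut digits = Vec::with_capacity(bit_len + 2 * window);
    let (mut pos, mut carry) = (0usize, 0i64);
    while pos < bit_len || carry != 0 {
        // Value of the next `window` bits plus the incoming carry.
        let val = carry + (0..window).map(|j| bit(pos + j) << j).sum::<i64>();
        if val & 1 == 0 {
            // Even: emit 0 and keep the carry unchanged.
            digits.push(0);
            pos += 1;
        } else {
            // Odd: emit a digit in (-2^(w-1), 2^(w-1)), then w-1 zeros.
            carry = (val >= width / 2) as i64;
            digits.push(val - carry * width);
            digits.extend(std::iter::repeat(0).take(window - 1));
            pos += window;
        }
    }
    digits
}

/// Table of odd multiples 1*base, 3*base, 5*base, ... with 2^(w-2) entries.
/// Integers stand in for group elements (an assumption of this sketch).
fn odd_multiples(base: i128, window: usize) -> Vec<i128> {
    let len = 1usize << (window - 2);
    let mut table = Vec::with_capacity(len);
    let mut cur = base;
    for _ in 0..len {
        table.push(cur);
        cur += base + base;
    }
    table
}

/// Double-and-add over the digits, most significant first.
/// A nonzero digit d selects table[|d| / 2] and adds or subtracts it.
fn wnaf_mul(table: &[i128], digits: &[i64]) -> i128 {
    let mut acc = 0i128;
    for &d in digits.iter().rev() {
        acc += acc; // one doubling per digit
        if d != 0 {
            let entry = table[(d.unsigned_abs() / 2) as usize];
            if d > 0 { acc += entry } else { acc -= entry }
        }
    }
    acc
}

fn main() {
    let window = 5;
    let scalar: u64 = 0xDEAD_BEEF_0123_4567;
    let digits = wnaf_digits(&scalar.to_le_bytes(), window);
    let table = odd_multiples(7, window);
    // Eight entries are enough for w = 5: no digit indexes past the end.
    assert_eq!(table.len(), 8);
    assert_eq!(wnaf_mul(&table, &digits), 7 * scalar as i128);

    // Shorter input bytes mean fewer digits, hence fewer doublings.
    let short = wnaf_digits(&[0xFF; 17], window).len();
    let full = wnaf_digits(&[0xFF; 32], window).len();
    println!("17-byte input: {short} digits, 32-byte input: {full} digits");
}
```

Because the evaluation loop performs one doubling per digit, the digit count printed for the 17-byte input is also its doubling count, which is where the saving for GLV half-scalars comes from.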

Verification

  • cargo test — passes (group crate)
  • Downstream testing against RustCrypto/elliptic-curves with [patch.crates-io] pointing at this branch: 211 tests pass across k256 (90), p256 (39), p384 (34), p521 (26), sm2 (22).
  • Benchmarked k256 Schnorr verify (cargo bench --bench schnorr): the table size fix alone recovered a ~10% regression incurred by using the group crate's wNAF instead of a hand-rolled implementation; combined with multiscalar_mul_array and pre-sizing, total performance matches the hand-rolled baseline (~52 µs).
  • Profiled with perf record + perf report to confirm the table fix roughly halved the time spent in wnaf_table (14.6% → 8.4% of total verify time).
  • DHAT heap profiling confirmed multiscalar_mul_array eliminates 2 of 10 per-call heap allocations
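
For context on the allocation count, here is a minimal sketch of why a fixed-size array variant can skip the `collect` step. It assumes nothing about the group crate's internals: `Digits`, `views_from_iter`, `views_from_array`, and `longest` are hypothetical stand-ins, with `longest` playing the role of any function that takes a slice-of-slices.

```rust
use std::array;

/// Hypothetical stand-in for a precomputed wNAF scalar.
struct Digits {
    wnaf: Vec<i64>,
}

/// Iterator-based: the input count is unknown at compile time, so the
/// borrowed slices are gathered into a heap-allocated Vec.
fn views_from_iter<'a, I>(scalars: I) -> Vec<&'a [i64]>
where
    I: IntoIterator<Item = &'a Digits>,
{
    scalars.into_iter().map(|s| s.wnaf.as_slice()).collect()
}

/// Array-based: N is a const generic, so the slice-of-slices is a
/// fixed-size array and needs no heap allocation of its own.
fn views_from_array<'a, const N: usize>(scalars: &'a [Digits; N]) -> [&'a [i64]; N] {
    array::from_fn(|i| scalars[i].wnaf.as_slice())
}

/// Stand-in consumer that accepts a slice-of-slices.
fn longest(views: &[&[i64]]) -> usize {
    views.iter().map(|v| v.len()).max().unwrap_or(0)
}

fn main() {
    let scalars = [
        Digits { wnaf: vec![1, 0, 0, -3] },
        Digits { wnaf: vec![0, 5] },
    ];
    let heap = views_from_iter(&scalars);
    let stack = views_from_array(&scalars);
    // Both paths hand the consumer the same borrowed views.
    assert_eq!(heap.as_slice(), stack.as_slice());
    println!("longest digit string: {}", longest(&stack));
}
```

The iterator-based path needs one such `Vec` for the scalars and one for the bases, which matches the two allocations the array variant removes.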

@tarcieri I can add the DHAT profiling harness if you're interested, but it adds a dev-dependency, so I did not include it.

42Pupusas and others added 2 commits May 14, 2026 10:40
For window size w, wNAF digits have max magnitude 2^(w-1) - 1.
The table is indexed by |digit| / 2, so the maximum index is
(2^(w-1) - 1) / 2 = 2^(w-2) - 1, requiring 2^(w-2) entries.

The previous 2^(w-1) allocation computed twice as many odd
multiples as needed, wasting point additions during table setup.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- WnafBase::multiscalar_mul_array: accepts fixed-size arrays instead of
  iterators, avoiding the two collect() heap allocations in
  multiscalar_mul.
- WnafScalar::new/from_le_bytes: use Vec::with_capacity to avoid
  reallocation during wnaf_form.
- WnafBase::new: use Vec::with_capacity for exact table sizing.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
42Pupusas added a commit to 42Pupusas/elliptic-curves that referenced this pull request May 14, 2026
Replace custom wNAF implementation (wnaf_128, build_odd_multiples,
WnafSlot, wnaf_ladder) with the group crate's WnafBase/WnafScalar
types and WnafBase::multiscalar_mul_array.

A new WnafScalar::from_le_bytes constructor accepts short (128-bit)
GLV half-scalars, producing ~half the wNAF digits and ~half the
doublings in the evaluation loop. multiscalar_mul_array avoids the
two collect() heap allocations of the iterator-based multiscalar_mul.

Depends on RustCrypto/group#15 for the group
crate changes (wnaf_table size fix, from_le_bytes, multiscalar_mul_array,
pre-sized Vec allocations).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>