lowram: Per-row t0/t1 computation in keygen #1030

Open
mkannwischer wants to merge 3 commits into main from keygen-buffer-sharing-stream-a

@mkannwischer
Contributor

Replace mld_compute_t0_t1_tr_from_sk_components with the per-row
mld_compute_t0k_t1k. Both keygen and pk_from_sk now process one
row at a time, packing t1[k] into pk and t0[k] into sk immediately,
which eliminates the full polyveck allocations for t0 and t1.
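
A minimal sketch of the per-row flow, assuming toy dimensions. `toy_compute_row` and `toy_keygen_rows` are hypothetical stand-ins rather than the code in this PR, the row values are arbitrary, and copying into `pk_t1` / `sk_t0` stands in for packing. Only the power-of-two split with D = 13 follows ML-DSA's Power2Round.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_N 4  /* coefficients per polynomial (256 in ML-DSA) */
#define TOY_K 2  /* number of rows (4, 6 or 8 in ML-DSA) */
#define TOY_D 13 /* dropped bits, as in ML-DSA's Power2Round */

typedef struct { int32_t c[TOY_N]; } toy_poly;

/* Split r >= 0 into (r1, r0) with r = r1 * 2^D + r0 and
 * -2^(D-1) < r0 <= 2^(D-1). Returns r1 and stores r0. */
static int32_t toy_power2round(int32_t r, int32_t *r0)
{
  int32_t r1;
  assert(r >= 0);
  r1 = (r + (1 << (TOY_D - 1)) - 1) >> TOY_D;
  *r0 = r - (r1 << TOY_D);
  return r1;
}

/* Hypothetical stand-in for computing row k of t = A*s1 + s2. */
static void toy_compute_row(toy_poly *t, unsigned k)
{
  unsigned i;
  for (i = 0; i < TOY_N; i++)
    t->c[i] = (int32_t)(1000u * (k + 1u) + 4097u * i);
}

/* Process one row at a time: only two scratch polynomials are live,
 * instead of two full vectors of TOY_K polynomials. */
static void toy_keygen_rows(int32_t pk_t1[TOY_K][TOY_N],
                            int32_t sk_t0[TOY_K][TOY_N])
{
  toy_poly t0k, t1k;
  unsigned k, i;
  for (k = 0; k < TOY_K; k++)
  {
    toy_compute_row(&t1k, k);
    for (i = 0; i < TOY_N; i++)
      t1k.c[i] = toy_power2round(t1k.c[i], &t0k.c[i]);
    /* "Pack" row k immediately; the scratch is reused for row k+1. */
    for (i = 0; i < TOY_N; i++)
    {
      pk_t1[k][i] = t1k.c[i];
      sk_t0[k][i] = t0k.c[i];
    }
  }
}
```

The scratch footprint in this sketch is independent of the number of rows, which is the property the per-row restructuring is after.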

To express this cleanly, refactor mld_polyvecl_pointwise_acc_montgomery
to take (mat, k, v) instead of (u, v) and move it into polyvec_lazy
with eager and lazy variants. Add per-row pack helpers
mld_pack_sk_t0 / mld_pack_pk_t1, and split mld_pack_sk_rho_key_tr_s2_t0
to drop the t0 packing.
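
As an illustration of what a per-row pack helper can look like, the sketch below writes row k of t1 at its offset in the public key. It assumes the ML-DSA layout pk = rho || t1 with a 32-byte seed and 320 bytes per packed t1 polynomial (256 coefficients of 10 bits). The `toy_` names and the helper's signature are assumptions; the PR text does not show the real signature of mld_pack_pk_t1.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_N 256
#define TOY_K 4                    /* rows, ML-DSA-44 sized */
#define TOY_SEEDBYTES 32           /* length of rho */
#define TOY_POLYT1_PACKEDBYTES 320 /* 256 coefficients * 10 bits / 8 */

typedef struct { int32_t coeffs[TOY_N]; } toy_poly;

/* Pack 10-bit coefficients: every 4 coefficients become 5 bytes. */
static void toy_polyt1_pack(uint8_t *r, const toy_poly *a)
{
  unsigned i;
  for (i = 0; i < TOY_N / 4; i++)
  {
    uint32_t c0 = (uint32_t)a->coeffs[4 * i + 0];
    uint32_t c1 = (uint32_t)a->coeffs[4 * i + 1];
    uint32_t c2 = (uint32_t)a->coeffs[4 * i + 2];
    uint32_t c3 = (uint32_t)a->coeffs[4 * i + 3];
    r[5 * i + 0] = (uint8_t)(c0 >> 0);
    r[5 * i + 1] = (uint8_t)((c0 >> 8) | (c1 << 2));
    r[5 * i + 2] = (uint8_t)((c1 >> 6) | (c2 << 4));
    r[5 * i + 3] = (uint8_t)((c2 >> 4) | (c3 << 6));
    r[5 * i + 4] = (uint8_t)(c3 >> 2);
  }
}

/* Hypothetical per-row helper: pack t1[k] directly into pk. */
static void toy_pack_pk_t1(uint8_t *pk, const toy_poly *t1k, unsigned k)
{
  assert(k < TOY_K);
  toy_polyt1_pack(pk + TOY_SEEDBYTES + k * TOY_POLYT1_PACKEDBYTES, t1k);
}
```

Because each row lands at a fixed offset, the caller can pack a row as soon as it is computed and never needs the whole t1 vector in memory.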

Drop now-dead mld_polyveck_add, mld_polyveck_pack_t0,
mld_polyveck_power2round and mld_pack_pk along with their CBMC proofs.

Update the affected CBMC proofs.

Replace the row-level matrix buffer (mld_polyvecl) with a single-poly
buffer in REDUCE_RAM mode. In the lazy path, matrix elements A[k][l]
are sampled on demand one at a time, and the matrix-vector product
accumulates element-by-element instead of row-by-row.
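
The element-by-element accumulation can be sketched as follows, under several assumptions: toy dimensions, a trivial deterministic `toy_sample` standing in for the seed-based expansion of A[k][l], and plain modular reduction in place of Montgomery arithmetic. All `toy_` names are hypothetical. Both variants take (mat, k, v), and the lazy one only ever holds one matrix element.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_N 4
#define TOY_K 2
#define TOY_L 3
#define TOY_Q 8380417 /* the ML-DSA modulus */

typedef struct { int32_t c[TOY_N]; } toy_poly;
typedef struct { toy_poly a[TOY_K][TOY_L]; } toy_mat_eager; /* full matrix */
typedef struct { uint32_t rho; toy_poly buf; } toy_mat_lazy; /* seed + 1 poly */

/* Hypothetical stand-in for sampling A[k][l] from the seed rho. */
static void toy_sample(toy_poly *a, uint32_t rho, unsigned k, unsigned l)
{
  unsigned i;
  assert(k < TOY_K && l < TOY_L);
  for (i = 0; i < TOY_N; i++)
    a->c[i] = (int32_t)((rho + 1000u * k + 100u * l + i) % (uint32_t)TOY_Q);
}

/* w += a * b coefficient-wise mod Q (plain reduction, not Montgomery). */
static void toy_mul_acc(toy_poly *w, const toy_poly *a, const toy_poly *b)
{
  unsigned i;
  for (i = 0; i < TOY_N; i++)
    w->c[i] = (int32_t)(((int64_t)w->c[i] + (int64_t)a->c[i] * b->c[i]) % TOY_Q);
}

static void toy_expand_eager(toy_mat_eager *m, uint32_t rho)
{
  unsigned k, l;
  for (k = 0; k < TOY_K; k++)
    for (l = 0; l < TOY_L; l++)
      toy_sample(&m->a[k][l], rho, k, l);
}

/* Eager: row k is read from the stored matrix. */
static void toy_row_acc_eager(toy_poly *w, const toy_mat_eager *m, unsigned k,
                              const toy_poly v[TOY_L])
{
  unsigned l;
  memset(w, 0, sizeof *w);
  for (l = 0; l < TOY_L; l++)
    toy_mul_acc(w, &m->a[k][l], &v[l]);
}

/* Lazy: each A[k][l] is regenerated into the single buffer, used once,
 * and overwritten by the next element. */
static void toy_row_acc_lazy(toy_poly *w, toy_mat_lazy *m, unsigned k,
                             const toy_poly v[TOY_L])
{
  unsigned l;
  memset(w, 0, sizeof *w);
  for (l = 0; l < TOY_L; l++)
  {
    toy_sample(&m->buf, m->rho, k, l);
    toy_mul_acc(w, &m->buf, &v[l]);
  }
}
```

The trade-off in this sketch is the usual one: the lazy variant re-samples every element each time a row is needed, in exchange for storing one polynomial instead of K x L.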

Restructure polymat into eager/lazy variants following the same pattern
as s1hat/s2hat/t0hat:
- mld_polymat_eager: stores full K x L matrix
- mld_polymat_lazy: stores rho + single poly_buffer + tmp
- mld_polyvec_matrix_expand_eager/_lazy: separate implementations
- mld_polyvec_matrix_pointwise_montgomery_eager/_lazy: separate
  implementations with CBMC contracts only on the eager variants
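
A rough picture of the two layouts, assuming ML-DSA-44 dimensions (K = L = 4, 256 coefficients of 32 bits each). The `toy_` types and the `TOY_REDUCE_RAM` switch are hypothetical illustrations of the pattern, not the definitions in this PR.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_N 256
#define TOY_K 4
#define TOY_L 4
#define TOY_SEEDBYTES 32

typedef struct { int32_t coeffs[TOY_N]; } toy_poly;

/* Eager: the full K x L matrix is stored. */
typedef struct { toy_poly a[TOY_K][TOY_L]; } toy_polymat_eager;

/* Lazy: only the seed, one element buffer and one temporary. */
typedef struct
{
  uint8_t rho[TOY_SEEDBYTES];
  toy_poly poly_buffer;
  toy_poly tmp;
} toy_polymat_lazy;

/* Callers use one type name; a build-time switch picks the variant. */
#if defined(TOY_REDUCE_RAM)
typedef toy_polymat_lazy toy_polymat;
#else
typedef toy_polymat_eager toy_polymat;
#endif

/* Bytes saved by the lazy layout in this sketch. */
static size_t toy_polymat_bytes_saved(void)
{
  return sizeof(toy_polymat_eager) - sizeof(toy_polymat_lazy);
}
```

With 4-byte coefficients the eager struct in this sketch holds 16 polynomials and the lazy one holds 2 polynomials plus the 32-byte seed.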

Move all polymat-related code from polyvec.h/polyvec.c into
polyvec_lazy.h/polyvec_lazy.c.

Signed-off-by: Matthias J. Kannwischer <matthias@zerorisc.com>

Reuse t0 as the accumulator in mld_compute_t0_t1_tr_from_sk_components,
and have the caller provide s1 already in NTT form, removing two
allocations (s1hat and t) from the helper.

In mld_sign_keypair_internal, share the s1 and t1 buffers via a union
since s1hat is consumed before t1 is produced. Pack s1 into the secret
key before the in-place NTT so the original coefficients are preserved.
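
The ordering constraint can be sketched as below, assuming toy dimensions. `toy_transform_inplace` is a hypothetical stand-in for the NTT (it only scales coefficients), the row computation is a placeholder sum, and plain copies stand in for packing. The point is the sequence: pack s1, transform in place, consume s1hat, then reuse the same storage for t1.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_N 4
#define TOY_K 2
#define TOY_L 3

typedef struct { int32_t c[TOY_N]; } toy_poly;
typedef struct { toy_poly vec[TOY_L]; } toy_polyvecl;
typedef struct { toy_poly vec[TOY_K]; } toy_polyveck;

/* s1 (later s1hat) and t1 are never live at the same time. */
typedef union
{
  toy_polyvecl s1;
  toy_polyveck t1;
} toy_s1_t1;

/* Hypothetical stand-in for the in-place NTT: destroys the original. */
static void toy_transform_inplace(toy_polyvecl *v)
{
  unsigned l, i;
  for (l = 0; l < TOY_L; l++)
    for (i = 0; i < TOY_N; i++)
      v->vec[l].c[i] *= 3;
}

static void toy_keygen(int32_t sk_s1[TOY_L][TOY_N], int32_t pk_t1[TOY_K][TOY_N])
{
  toy_s1_t1 u;
  int32_t acc[TOY_K];
  unsigned k, l, i;

  /* Placeholder for sampling s1. */
  for (l = 0; l < TOY_L; l++)
    for (i = 0; i < TOY_N; i++)
      u.s1.vec[l].c[i] = (int32_t)(10u * l + i);

  /* 1. Pack s1 while it is still in coefficient form. */
  for (l = 0; l < TOY_L; l++)
    for (i = 0; i < TOY_N; i++)
      sk_s1[l][i] = u.s1.vec[l].c[i];

  /* 2. Transform in place; the original coefficients are gone now. */
  toy_transform_inplace(&u.s1);

  /* 3. Consume s1hat completely before touching t1. */
  for (k = 0; k < TOY_K; k++)
  {
    acc[k] = 0;
    for (l = 0; l < TOY_L; l++)
      for (i = 0; i < TOY_N; i++)
        acc[k] += (int32_t)(k + 1u) * u.s1.vec[l].c[i];
  }

  /* 4. s1hat is dead: reuse the same bytes for t1. */
  for (k = 0; k < TOY_K; k++)
    for (i = 0; i < TOY_N; i++)
      u.t1.vec[k].c[i] = acc[k] + (int32_t)i;

  for (k = 0; k < TOY_K; k++)
    for (i = 0; i < TOY_N; i++)
      pk_t1[k][i] = u.t1.vec[k].c[i];
}
```

Swapping steps 1 and 2 in this sketch would pack the transformed values instead of the original s1, which is the failure the commit message guards against.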

Split mld_pack_sk into mld_pack_sk_s1 and mld_pack_sk_rho_key_tr_s2_t0
to support packing s1 independently before the NTT.

Signed-off-by: Matthias J. Kannwischer <matthias@zerorisc.com>
mkannwischer force-pushed the keygen-buffer-sharing-stream-a branch from 6d4e630 to 62a104b on April 8, 2026 07:29
mkannwischer force-pushed the keygen-buffer-sharing-stream-a branch from 62a104b to c9509d5 on April 8, 2026 07:42
@oqs-bot
Contributor

oqs-bot commented Apr 8, 2026

CBMC Results (ML-DSA-44)

⚠️ Attention Required

| Proof | Status | Current | Previous | Change |
|---|---|---|---|---|
| sign_pk_from_sk | ⚠️ | 171s | 6s | +2750% |

Full Results (182 proofs)
Proof Status Current Previous Change
**TOTAL** 2074s 2061s +0.6%
polyvecl_pointwise_acc_montgomery_c 268s 350s -23%
sign_verify_internal 201s 211s -5%
sign_pk_from_sk ⚠️ 171s 6s +2750%
poly_pointwise_montgomery_c 153s 148s +3%
rej_uniform_native 144s 142s +1%
mld_attempt_signature_generation 104s 105s -1%
mld_invntt_layer 74s 88s -16%
mld_ct_memcmp 71s 75s -5%
mld_ntt_layer 36s 54s -33%
polyvecl_pointwise_acc_montgomery_eager 23s - new
sign_signature_internal 23s 20s +15%
polyvec_matrix_expand_eager 22s - new
rej_uniform 20s 23s -13%
fqmul 19s 20s -5%
poly_uniform_eta_4x 19s 16s +19%
poly_chknorm_c 18s 19s -5%
sign_keypair_internal 16s 4s +300%
poly_uniform_4x 15s 14s +7%
polyeta_unpack 15s 16s -6%
poly_add 14s 11s +27%
keccakf1600x4_permute_native 13s 11s +18%
polyt0_unpack 13s 13s +0%
rej_uniform_c 13s 15s -13%
mld_ntt_butterfly_block 12s 13s -8%
polymat_permute_bitrev_to_custom 12s 28s -57%
polyz_unpack_c 10s 12s -17%
keccak_absorb_once_x4 9s 11s -18%
keccakf1600_permute_native 9s 7s +29%
mld_check_pct 9s 12s -25%
polyveck_chknorm 9s 4s +125%
keccakf1600_permute 8s 11s -27%
poly_invntt_tomont_c 8s 7s +14%
polyveck_caddq 8s 5s +60%
sign 8s 6s +33%
compute_t0k_t1k 7s - new
keccak_absorb 7s 7s +0%
poly_use_hint_c 7s 4s +75%
polyvec_matrix_expand_eager_serial 7s - new
polyveck_reduce 7s 4s +75%
mld_compute_pack_z 6s 5s +20%
mld_polyvecl_permute_bitrev_to_custom_native 6s 7s -14%
pack_sig_c 6s 5s +20%
poly_power2round 6s 5s +20%
polyveck_use_hint 6s 5s +20%
unpack_sig 6s 2s +200%
caddq 5s 4s +25%
keccak_squeezeblocks_x4 5s 7s -29%
make_hint 5s 2s +150%
mld_h 5s 4s +25%
mld_prepare_domain_separation_prefix 5s 4s +25%
mld_sample_s1_s2_serial 5s 8s -38%
poly_caddq_c 5s 7s -29%
poly_chknorm_native 5s 4s +25%
polyt1_unpack 5s 3s +67%
polyvec_matrix_pointwise_montgomery_eager 5s - new
polyveck_decompose 5s 7s -29%
polyveck_invntt_tomont 5s 6s -17%
polyveck_ntt 5s 5s +0%
polyvecl_chknorm 5s 3s +67%
polyvecl_pack_eta 5s 2s +150%
power2round 5s 4s +25%
reduce32 5s 1s +400%
rej_eta_native 5s 5s +0%
sign_signature_extmu 5s 3s +67%
sign_signature_pre_hash_shake256 5s 6s -17%
sign_verify_pre_hash_internal 5s 6s -17%
unpack_sk 5s 9s -44%
use_hint 5s 4s +25%
intt_native_x86_64 4s 2s +100%
keccakf1600_xor_bytes (big endian) 4s 3s +33%
mld_ct_cmask_nonzero_u32 4s 3s +33%
mld_ct_cmask_nonzero_u8 4s 2s +100%
mld_keccakf1600_extract_bytes 4s 3s +33%
pack_sig_h_poly 4s 2s +100%
pack_sk_rho_key_tr_s2 4s - new
pointwise_native_x86_64 4s 3s +33%
poly_caddq 4s 3s +33%
poly_decompose_c 4s 3s +33%
poly_decompose_native 4s 4s +0%
poly_invntt_tomont_native 4s 4s +0%
poly_ntt 4s 4s +0%
poly_pointwise_montgomery 4s 6s -33%
poly_shiftl 4s 3s +33%
poly_uniform 4s 4s +0%
poly_uniform_gamma1_4x 4s 3s +33%
polyt1_pack 4s 3s +33%
polyveck_pointwise_poly_montgomery 4s 5s -20%
polyveck_sub 4s 5s -20%
polyveck_unpack_eta 4s 3s +33%
polyvecl_ntt 4s 4s +0%
shake128x4_absorb_once 4s 3s +33%
shake256_finalize 4s 2s +100%
sign_keypair 4s 3s +33%
sign_verify 4s 5s -20%
sign_verify_extmu 4s 4s +0%
sys_check_capability 4s 3s +33%
unpack_hints 4s 5s -20%
decompose 3s 3s +0%
keccak_squeeze 3s 2s +50%
keccakf1600_extract_bytes (big endian) 3s 2s +50%
keccakf1600_xor_bytes 3s 1s +200%
keccakf1600x4_permute 3s 4s -25%
mld_ct_abs_i32 3s 1s +200%
mld_ct_get_optblocker_u32 3s 3s +0%
mld_ct_sel_int32 3s 2s +50%
mld_sample_s1_s2 3s 5s -40%
mld_value_barrier_i64 3s 3s +0%
mld_value_barrier_u32 3s 2s +50%
ntt_native_x86_64 3s 3s +0%
pack_pk_t1 3s - new
pack_sig_z 3s 3s +0%
poly_caddq_native_aarch64 3s 3s +0%
poly_challenge 3s 4s -25%
poly_chknorm 3s 5s -40%
poly_chknorm_native_aarch64 3s 2s +50%
poly_decompose 3s 2s +50%
poly_make_hint 3s 3s +0%
poly_ntt_c 3s 2s +50%
poly_ntt_native 3s 4s -25%
poly_sub 3s 3s +0%
poly_uniform_eta 3s 4s -25%
poly_uniform_gamma1 3s 3s +0%
poly_use_hint_native 3s 3s +0%
polyt0_pack 3s 7s -57%
polyveck_unpack_t0 3s 2s +50%
polyvecl_uniform_gamma1 3s 4s -25%
polyvecl_uniform_gamma1_serial 3s 3s +0%
polyz_unpack 3s 2s +50%
polyz_unpack_native 3s 3s +0%
rej_eta 3s 5s -40%
rej_eta_c 3s 5s -40%
shake128_finalize 3s 2s +50%
shake256 3s 2s +50%
shake256_release 3s 2s +50%
shake256_squeeze 3s 2s +50%
shake256x4_absorb_once 3s 3s +0%
shake256x4_squeezeblocks 3s 3s +0%
sign_open 3s 2s +50%
sign_signature_pre_hash_internal 3s 6s -50%
fqscale 2s 5s -60%
keccak_f1600_x1_native_aarch64 2s 2s +0%
keccak_f1600_x4_native_aarch64_v84a 2s 2s +0%
keccak_f1600_x4_native_aarch64_v8a_scalar_hybrid 2s 3s -33%
keccak_f1600_x4_native_aarch64_v8a_v84a_scalar_hybrid 2s 2s +0%
keccak_init 2s 2s +0%
keccakf1600x4_extract_bytes 2s 4s -50%
keccakf1600x4_xor_bytes 2s 2s +0%
mld_ct_get_optblocker_i64 2s 2s +0%
montgomery_reduce 2s 3s -33%
ntt_native_aarch64 2s 2s +0%
pack_sk_s1 2s - new
pack_sk_t0 2s - new
pointwise_native_aarch64 2s 3s -33%
poly_caddq_native 2s 3s -33%
poly_invntt_tomont 2s 3s -33%
poly_pointwise_montgomery_native 2s 2s +0%
poly_reduce 2s 3s -33%
poly_use_hint 2s 2s +0%
polyeta_pack 2s 2s +0%
polyveck_pack_eta 2s 2s +0%
polyveck_shiftl 2s 5s -60%
polyvecl_permute_bitrev_to_custom 2s 2s +0%
polyvecl_unpack_z 2s 5s -60%
polyw1_pack 2s 3s -33%
polyz_pack 2s 2s +0%
shake128_absorb 2s 2s +0%
shake128_init 2s 2s +0%
shake128_release 2s 4s -50%
shake128_squeeze 2s 2s +0%
shake128x4_squeezeblocks 2s 4s -50%
shake256_absorb 2s 4s -50%
shake256_init 2s 3s -33%
sign_signature 2s 4s -50%
sign_verify_pre_hash_shake256 2s 6s -67%
unpack_pk 2s 3s -33%
keccak_f1600_x1_native_aarch64_v84a 1s 2s -50%
keccak_finalize 1s 4s -75%
mld_ct_cmask_neg_i32 1s 2s -50%
mld_ct_get_optblocker_u8 1s 2s -50%
mld_value_barrier_u8 1s 3s -67%
polyveck_pack_w1 1s 3s -67%
polyvecl_unpack_eta 1s 3s -67%

@oqs-bot
Contributor

oqs-bot commented Apr 8, 2026

CBMC Results (ML-DSA-87)

⚠️ Attention Required

| Proof | Status | Current | Previous | Change |
|---|---|---|---|---|
| mld_attempt_signature_generation | ⚠️ | 163s | 97s | +68% |
| polymat_permute_bitrev_to_custom | ⚠️ | 158s | 25s | +532% |
| polyveck_decompose | ⚠️ | 22s | 13s | +69% |
| sign_keypair_internal | ⚠️ | 27s | 4s | +575% |
| sign_pk_from_sk | ⚠️ | 51s | 8s | +538% |
| sign_verify_internal | ⚠️ | 293s | 144s | +103% |

Full Results (182 proofs)
Proof Status Current Previous Change
**TOTAL** 2619s 2234s +17.2%
sign_verify_internal ⚠️ 293s 144s +103%
polyvecl_pointwise_acc_montgomery_c 274s 194s +41%
polyvec_matrix_expand_eager 183s - new
mld_attempt_signature_generation ⚠️ 163s 97s +68%
polymat_permute_bitrev_to_custom ⚠️ 158s 25s +532%
poly_pointwise_montgomery_c 141s 155s -9%
rej_uniform_native 141s 144s -2%
polyvec_matrix_expand_eager_serial 87s - new
mld_ct_memcmp 74s 73s +1%
polyvecl_pointwise_acc_montgomery_eager 69s - new
mld_invntt_layer 68s 95s -28%
mld_ntt_layer 54s 54s +0%
sign_pk_from_sk ⚠️ 51s 8s +538%
sign_signature_internal 39s 40s -3%
sign_keypair_internal ⚠️ 27s 4s +575%
polyveck_decompose ⚠️ 22s 13s +69%
fqmul 21s 19s +11%
rej_uniform 20s 19s +5%
poly_chknorm_c 18s 20s -10%
keccakf1600x4_permute_native 17s 13s +31%
polyeta_unpack 17s 18s -6%
poly_uniform_eta_4x 16s 15s +7%
poly_uniform_4x 15s 16s -6%
polyt0_unpack 14s 15s -7%
rej_uniform_c 14s 13s +8%
mld_check_pct 13s 13s +0%
poly_add 12s 13s -8%
polyveck_reduce 12s 7s +71%
mld_ntt_butterfly_block 11s 15s -27%
keccak_absorb_once_x4 10s 11s -9%
poly_invntt_tomont_c 10s 6s +67%
polyveck_pointwise_poly_montgomery 10s 5s +100%
polyz_unpack_c 10s 10s +0%
mld_polyvecl_permute_bitrev_to_custom_native 9s 13s -31%
polyvec_matrix_pointwise_montgomery_eager 9s - new
polyveck_ntt 9s 9s +0%
keccak_absorb 8s 7s +14%
keccakf1600_permute_native 8s 8s +0%
polyveck_caddq 8s 5s +60%
polyveck_invntt_tomont 8s 9s -11%
polyveck_shiftl 8s 9s -11%
unpack_sk 8s 9s -11%
keccak_squeezeblocks_x4 7s 7s +0%
keccakf1600_permute 7s 9s -22%
poly_challenge 7s 5s +40%
poly_decompose_c 7s 9s -22%
polyveck_use_hint 7s 12s -42%
polyvecl_ntt 7s 7s +0%
sign 7s 7s +0%
mld_sample_s1_s2 6s 5s +20%
mld_sample_s1_s2_serial 6s 5s +20%
poly_uniform_eta 6s 6s +0%
polyveck_sub 6s 7s -14%
polyvecl_chknorm 6s 9s -33%
shake256 6s 2s +200%
sign_signature_extmu 6s 4s +50%
mld_compute_pack_z 5s 7s -29%
mld_prepare_domain_separation_prefix 5s 2s +150%
pack_sig_c 5s 7s -29%
pointwise_native_x86_64 5s 3s +67%
poly_caddq_native 5s 5s +0%
poly_power2round 5s 4s +25%
poly_uniform_gamma1_4x 5s 5s +0%
polyt0_pack 5s 2s +150%
sign_open 5s 7s -29%
unpack_sig 5s 5s +0%
compute_t0k_t1k 4s - new
keccak_f1600_x1_native_aarch64 4s 2s +100%
mld_h 4s 5s -20%
ntt_native_aarch64 4s 4s +0%
ntt_native_x86_64 4s 3s +33%
pack_pk_t1 4s - new
poly_caddq_c 4s 6s -33%
poly_ntt 4s 4s +0%
poly_pointwise_montgomery 4s 3s +33%
poly_uniform 4s 4s +0%
poly_uniform_gamma1 4s 3s +33%
poly_use_hint_c 4s 5s -20%
poly_use_hint_native 4s 4s +0%
polyveck_chknorm 4s 7s -43%
polyveck_unpack_t0 4s 6s -33%
polyvecl_uniform_gamma1 4s 6s -33%
polyz_unpack_native 4s 3s +33%
shake128_finalize 4s 2s +100%
shake128_squeeze 4s 3s +33%
sign_keypair 4s 2s +100%
sign_signature_pre_hash_shake256 4s 5s -20%
sign_verify_extmu 4s 2s +100%
sign_verify_pre_hash_shake256 4s 3s +33%
unpack_hints 4s 6s -33%
unpack_pk 4s 4s +0%
use_hint 4s 2s +100%
decompose 3s 4s -25%
intt_native_x86_64 3s 3s +0%
keccak_f1600_x1_native_aarch64_v84a 3s 1s +200%
keccak_f1600_x4_native_aarch64_v8a_scalar_hybrid 3s 3s +0%
keccak_finalize 3s 3s +0%
keccak_init 3s 2s +50%
keccakf1600x4_extract_bytes 3s 3s +0%
make_hint 3s 4s -25%
mld_ct_abs_i32 3s 2s +50%
mld_ct_cmask_nonzero_u32 3s 2s +50%
mld_ct_cmask_nonzero_u8 3s 2s +50%
mld_ct_get_optblocker_i64 3s 3s +0%
mld_ct_get_optblocker_u32 3s 3s +0%
mld_ct_sel_int32 3s 1s +200%
mld_value_barrier_u32 3s 1s +200%
pack_sig_h_poly 3s 2s +50%
pack_sig_z 3s 2s +50%
pointwise_native_aarch64 3s 2s +50%
poly_caddq_native_aarch64 3s 4s -25%
poly_chknorm_native 3s 4s -25%
poly_chknorm_native_aarch64 3s 6s -50%
poly_decompose 3s 2s +50%
poly_invntt_tomont_native 3s 5s -40%
poly_ntt_c 3s 3s +0%
poly_ntt_native 3s 1s +200%
poly_reduce 3s 2s +50%
poly_use_hint 3s 4s -25%
polyeta_pack 3s 3s +0%
polyt1_unpack 3s 3s +0%
polyveck_pack_eta 3s 4s -25%
polyveck_pack_w1 3s 5s -40%
polyvecl_unpack_eta 3s 3s +0%
polyvecl_unpack_z 3s 3s +0%
polyw1_pack 3s 2s +50%
polyz_pack 3s 3s +0%
power2round 3s 5s -40%
reduce32 3s 4s -25%
rej_eta 3s 5s -40%
rej_eta_c 3s 5s -40%
shake128_absorb 3s 4s -25%
shake128_release 3s 2s +50%
shake128x4_squeezeblocks 3s 2s +50%
shake256x4_squeezeblocks 3s 3s +0%
sign_signature_pre_hash_internal 3s 4s -25%
sign_verify 3s 3s +0%
sign_verify_pre_hash_internal 3s 5s -40%
caddq 2s 3s -33%
fqscale 2s 1s +100%
keccak_f1600_x4_native_aarch64_v84a 2s 4s -50%
keccak_f1600_x4_native_aarch64_v8a_v84a_scalar_hybrid 2s 3s -33%
keccakf1600x4_permute 2s 4s -50%
mld_keccakf1600_extract_bytes 2s 5s -60%
mld_value_barrier_i64 2s 3s -33%
mld_value_barrier_u8 2s 1s +100%
montgomery_reduce 2s 4s -50%
pack_sk_rho_key_tr_s2 2s - new
pack_sk_s1 2s - new
pack_sk_t0 2s - new
poly_caddq 2s 4s -50%
poly_chknorm 2s 2s +0%
poly_decompose_native 2s 3s -33%
poly_invntt_tomont 2s 3s -33%
poly_make_hint 2s 4s -50%
poly_pointwise_montgomery_native 2s 3s -33%
poly_sub 2s 4s -50%
polyt1_pack 2s 2s +0%
polyveck_unpack_eta 2s 3s -33%
polyvecl_pack_eta 2s 6s -67%
polyvecl_permute_bitrev_to_custom 2s 3s -33%
polyvecl_uniform_gamma1_serial 2s 3s -33%
rej_eta_native 2s 4s -50%
shake128_init 2s 2s +0%
shake128x4_absorb_once 2s 3s -33%
shake256_absorb 2s 3s -33%
shake256_init 2s 4s -50%
shake256_squeeze 2s 3s -33%
shake256x4_absorb_once 2s 2s +0%
sign_signature 2s 4s -50%
sys_check_capability 2s 3s -33%
keccak_squeeze 1s 3s -67%
keccakf1600_extract_bytes (big endian) 1s 8s -88%
keccakf1600_xor_bytes 1s 3s -67%
keccakf1600_xor_bytes (big endian) 1s 4s -75%
keccakf1600x4_xor_bytes 1s 3s -67%
mld_ct_cmask_neg_i32 1s 2s -50%
mld_ct_get_optblocker_u8 1s 2s -50%
poly_shiftl 1s 3s -67%
polyz_unpack 1s 1s +0%
shake256_finalize 1s 3s -67%
shake256_release 1s 2s -50%

@oqs-bot
Contributor

oqs-bot commented Apr 8, 2026

CBMC Results (ML-DSA-65)

⚠️ Attention Required

| Proof | Status | Current | Previous | Change |
|---|---|---|---|---|
| **TOTAL** | ⚠️ | 3637s | 2418s | +50.4% |
| polyvecl_pointwise_acc_montgomery_c | ⚠️ | 1556s | 295s | +427% |
| sign_keypair_internal | ⚠️ | 22s | 7s | +214% |
| sign_pk_from_sk | ⚠️ | 41s | 9s | +356% |

Full Results (182 proofs)
Proof Status Current Previous Change
**TOTAL** ⚠️ 3637s 2418s +50.4%
polyvecl_pointwise_acc_montgomery_c ⚠️ 1556s 295s +427%
sign_verify_internal 295s 289s +2%
poly_pointwise_montgomery_c 168s 164s +2%
rej_uniform_native 151s 154s -2%
mld_attempt_signature_generation 150s 110s +36%
polyvec_matrix_expand_eager 108s - new
mld_ct_memcmp 77s 80s -4%
mld_invntt_layer 73s 98s -26%
mld_ntt_layer 56s 56s +0%
polyvec_matrix_expand_eager_serial 46s - new
polyvecl_pointwise_acc_montgomery_eager 42s - new
sign_pk_from_sk ⚠️ 41s 9s +356%
sign_signature_internal 36s 30s +20%
fqmul 22s 20s +10%
rej_uniform 22s 22s +0%
sign_keypair_internal ⚠️ 22s 7s +214%
poly_chknorm_c 21s 21s +0%
poly_uniform_eta_4x 19s 19s +0%
poly_uniform_4x 15s 17s -12%
polyt0_unpack 15s 17s -12%
rej_uniform_c 15s 15s +0%
mld_polyvecl_permute_bitrev_to_custom_native 14s 10s +40%
polymat_permute_bitrev_to_custom 14s 34s -59%
polyveck_decompose 14s 18s -22%
keccakf1600x4_permute_native 13s 14s -7%
mld_ntt_butterfly_block 12s 12s +0%
poly_add 12s 10s +20%
poly_invntt_tomont_c 11s 6s +83%
polyveck_shiftl 11s 9s +22%
keccakf1600_permute 10s 9s +11%
polyvec_matrix_pointwise_montgomery_eager 10s - new
keccak_absorb_once_x4 9s 11s -18%
mld_check_pct 9s 14s -36%
poly_decompose_c 9s 9s +0%
poly_power2round 9s 4s +125%
polyveck_pointwise_poly_montgomery 9s 8s +12%
polyvecl_chknorm 9s 3s +200%
polyveck_caddq 8s 8s +0%
polyveck_sub 8s 6s +33%
unpack_sk 8s 9s -11%
keccak_absorb 7s 8s -12%
keccak_f1600_x4_native_aarch64_v8a_v84a_scalar_hybrid 7s 3s +133%
keccakf1600_permute_native 7s 8s -12%
mld_sample_s1_s2 7s 5s +40%
polyvecl_ntt 7s 8s -12%
sign_keypair 7s 4s +75%
compute_t0k_t1k 6s - new
fqscale 6s 3s +100%
mld_compute_pack_z 6s 6s +0%
mld_h 6s 5s +20%
mld_sample_s1_s2_serial 6s 3s +100%
poly_challenge 6s 2s +200%
poly_uniform_eta 6s 6s +0%
polyveck_invntt_tomont 6s 7s -14%
polyveck_ntt 6s 9s -33%
polyveck_reduce 6s 5s +20%
polyveck_use_hint 6s 24s -75%
rej_eta_native 6s 6s +0%
sign 6s 9s -33%
keccak_squeezeblocks_x4 5s 7s -29%
make_hint 5s 6s -17%
poly_caddq_c 5s 5s +0%
poly_make_hint 5s 3s +67%
poly_uniform 5s 4s +25%
poly_use_hint_c 5s 5s +0%
polyeta_pack 5s 4s +25%
polyeta_unpack 5s 8s -38%
rej_eta_c 5s 4s +25%
shake128x4_absorb_once 5s 3s +67%
sign_signature_pre_hash_internal 5s 5s +0%
sign_signature_pre_hash_shake256 5s 3s +67%
sign_verify_extmu 5s 4s +25%
unpack_pk 5s 2s +150%
intt_native_x86_64 4s 4s +0%
keccak_f1600_x1_native_aarch64_v84a 4s 2s +100%
mld_ct_cmask_nonzero_u8 4s 4s +0%
mld_ct_get_optblocker_i64 4s 2s +100%
mld_keccakf1600_extract_bytes 4s 2s +100%
pack_sk_rho_key_tr_s2 4s - new
poly_caddq_native_aarch64 4s 3s +33%
poly_decompose_native 4s 3s +33%
poly_ntt 4s 4s +0%
poly_ntt_c 4s 4s +0%
poly_reduce 4s 3s +33%
poly_uniform_gamma1 4s 2s +100%
polyt1_unpack 4s 3s +33%
polyveck_chknorm 4s 10s -60%
polyveck_pack_eta 4s 3s +33%
shake256_absorb 4s 3s +33%
shake256_release 4s 3s +33%
sign_open 4s 5s -20%
sign_signature_extmu 4s 5s -20%
sign_verify_pre_hash_shake256 4s 4s +0%
unpack_hints 4s 7s -43%
caddq 3s 4s -25%
decompose 3s 4s -25%
keccak_f1600_x1_native_aarch64 3s 4s -25%
keccak_init 3s 2s +50%
keccakf1600_xor_bytes (big endian) 3s 2s +50%
keccakf1600x4_extract_bytes 3s 2s +50%
keccakf1600x4_permute 3s 3s +0%
mld_ct_abs_i32 3s 1s +200%
mld_ct_cmask_neg_i32 3s 3s +0%
mld_ct_get_optblocker_u32 3s 2s +50%
mld_prepare_domain_separation_prefix 3s 6s -50%
mld_value_barrier_i64 3s 3s +0%
mld_value_barrier_u32 3s 3s +0%
montgomery_reduce 3s 4s -25%
ntt_native_x86_64 3s 4s -25%
pack_pk_t1 3s - new
pack_sig_h_poly 3s 5s -40%
pack_sig_z 3s 3s +0%
pointwise_native_aarch64 3s 3s +0%
pointwise_native_x86_64 3s 5s -40%
poly_caddq_native 3s 3s +0%
poly_decompose 3s 2s +50%
poly_invntt_tomont 3s 3s +0%
poly_use_hint_native 3s 3s +0%
polyveck_pack_w1 3s 5s -40%
polyveck_unpack_eta 3s 5s -40%
polyvecl_pack_eta 3s 3s +0%
polyvecl_uniform_gamma1_serial 3s 5s -40%
polyvecl_unpack_eta 3s 3s +0%
polyvecl_unpack_z 3s 4s -25%
polyw1_pack 3s 4s -25%
polyz_pack 3s 3s +0%
polyz_unpack 3s 2s +50%
polyz_unpack_c 3s 4s -25%
power2round 3s 3s +0%
reduce32 3s 2s +50%
shake128_squeeze 3s 3s +0%
shake128x4_squeezeblocks 3s 2s +50%
shake256_squeeze 3s 4s -25%
shake256x4_absorb_once 3s 2s +50%
shake256x4_squeezeblocks 3s 2s +50%
sign_signature 3s 4s -25%
sign_verify_pre_hash_internal 3s 4s -25%
keccak_f1600_x4_native_aarch64_v84a 2s 3s -33%
keccak_f1600_x4_native_aarch64_v8a_scalar_hybrid 2s 2s +0%
keccak_finalize 2s 1s +100%
keccakf1600_extract_bytes (big endian) 2s 3s -33%
keccakf1600_xor_bytes 2s 2s +0%
mld_ct_cmask_nonzero_u32 2s 2s +0%
mld_ct_get_optblocker_u8 2s 3s -33%
mld_ct_sel_int32 2s 1s +100%
mld_value_barrier_u8 2s 4s -50%
ntt_native_aarch64 2s 4s -50%
pack_sig_c 2s 2s +0%
pack_sk_s1 2s - new
pack_sk_t0 2s - new
poly_chknorm 2s 2s +0%
poly_chknorm_native 2s 4s -50%
poly_chknorm_native_aarch64 2s 3s -33%
poly_invntt_tomont_native 2s 3s -33%
poly_ntt_native 2s 3s -33%
poly_pointwise_montgomery_native 2s 4s -50%
poly_shiftl 2s 2s +0%
poly_sub 2s 2s +0%
poly_uniform_gamma1_4x 2s 5s -60%
poly_use_hint 2s 3s -33%
polyt0_pack 2s 2s +0%
polyt1_pack 2s 2s +0%
polyveck_unpack_t0 2s 4s -50%
polyvecl_permute_bitrev_to_custom 2s 3s -33%
polyvecl_uniform_gamma1 2s 3s -33%
polyz_unpack_native 2s 1s +100%
rej_eta 2s 4s -50%
shake128_absorb 2s 2s +0%
shake128_finalize 2s 2s +0%
shake128_init 2s 3s -33%
shake128_release 2s 4s -50%
shake256 2s 3s -33%
shake256_finalize 2s 2s +0%
shake256_init 2s 1s +100%
sign_verify 2s 6s -67%
unpack_sig 2s 3s -33%
use_hint 2s 5s -60%
keccak_squeeze 1s 2s -50%
keccakf1600x4_xor_bytes 1s 3s -67%
poly_caddq 1s 4s -75%
poly_pointwise_montgomery 1s 2s -50%
sys_check_capability 1s 4s -75%

mkannwischer marked this pull request as ready for review on April 8, 2026 09:16
mkannwischer requested a review from a team as a code owner on April 8, 2026 09:16
@mkannwischer
Contributor Author

CBMC proof performance will still need investigation, but I'll do that after the other PRs have been merged.
