lowram: Per-row t0/t1 computation in keygen by mkannwischer · Pull Request #1030 · pq-code-package/mldsa-native

mkannwischer · 2026-04-08T07:23:03Z

Replace mld_compute_t0_t1_tr_from_sk_components with the per-row
mld_compute_t0k_t1k. Both keygen and pk_from_sk now process one
row at a time, packing t1[k] into pk and t0[k] into sk immediately,
which eliminates the full polyveck allocations for t0 and t1.
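
For readers unfamiliar with the per-row idea, here is a minimal sketch of it on toy sizes. All `toy_*` names and sizes are hypothetical and are not the PR's code. Only the power2round split with D = 13 follows FIPS 204. In the PR, row k of t is itself computed on the fly; the sketch takes t as input to stay short.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_N 4    /* hypothetical toy size; ML-DSA polynomials have 256 coefficients */
#define TOY_K 2    /* hypothetical number of rows */
#define MLDSA_D 13 /* number of low bits split off from t (FIPS 204) */

typedef struct { int32_t coeffs[TOY_N]; } toy_poly;

/* Split a in [0, q) as a = a1 * 2^D + a0 with -2^(D-1) < a0 <= 2^(D-1). */
static void toy_power2round(int32_t *a1, int32_t *a0, int32_t a)
{
  assert(a >= 0);
  *a1 = (a + (1 << (MLDSA_D - 1)) - 1) >> MLDSA_D;
  *a0 = a - (*a1 << MLDSA_D);
}

/* Stand-in for a per-row pack helper: copy row k into its slot in the key. */
static void toy_pack_row(int32_t *key, const toy_poly *row, unsigned k)
{
  unsigned i;
  for (i = 0; i < TOY_N; i++)
    key[k * TOY_N + i] = row->coeffs[i];
}

/* Split t row by row. Only one row of t1 and one row of t0 are live at
 * any time, instead of two full K-row vectors. */
static void toy_split_rows(int32_t *pk_t1, int32_t *sk_t0,
                           const toy_poly t[TOY_K])
{
  toy_poly t1k, t0k; /* single-row scratch buffers */
  unsigned k, i;
  for (k = 0; k < TOY_K; k++)
  {
    for (i = 0; i < TOY_N; i++)
      toy_power2round(&t1k.coeffs[i], &t0k.coeffs[i], t[k].coeffs[i]);
    toy_pack_row(pk_t1, &t1k, k); /* t1[k] goes straight into pk */
    toy_pack_row(sk_t0, &t0k, k); /* t0[k] goes straight into sk */
  }
}
```

The scratch space is two polynomials regardless of K, which is the point of the change.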

To express this cleanly, refactor mld_polyvecl_pointwise_acc_montgomery
to take (mat, k, v) instead of (u, v) and move it into polyvec_lazy
with eager and lazy variants. Add per-row pack helpers
mld_pack_sk_t0 / mld_pack_pk_t1, and split mld_pack_sk_rho_key_tr_s2_t0
to drop the t0 packing.
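
Per-row packing works because each row of t1 and t0 has a fixed-size slot in the encoded key. The sketch below computes those slot offsets from the FIPS 204 key encodings (pk = rho || t1, sk = rho || K || tr || s1 || s2 || t0). The `toy_*` names are hypothetical, and this is not necessarily how `mld_pack_pk_t1` / `mld_pack_sk_t0` compute their offsets.

```c
#include <assert.h>
#include <stddef.h>

/* Byte sizes from the FIPS 204 key encodings. */
#define SEED_BYTES 32
#define TR_BYTES 64
#define POLYT1_BYTES 320 /* 256 coefficients * 10 bits */
#define POLYT0_BYTES 416 /* 256 coefficients * 13 bits */

/* Hypothetical parameter record: dimensions and packed size of an eta poly. */
typedef struct { unsigned k, l, polyeta_bytes; } toy_params;

static const toy_params TOY_MLDSA44 = {4, 4, 96};
static const toy_params TOY_MLDSA65 = {6, 5, 128};
static const toy_params TOY_MLDSA87 = {8, 7, 96};

/* Offset of packed t1[row] inside pk; row == k yields the total pk size. */
static size_t toy_pk_t1_offset(const toy_params *p, unsigned row)
{
  assert(row <= p->k);
  return SEED_BYTES + (size_t)row * POLYT1_BYTES;
}

/* Offset of packed t0[row] inside sk; row == k yields the total sk size. */
static size_t toy_sk_t0_offset(const toy_params *p, unsigned row)
{
  assert(row <= p->k);
  return 2 * SEED_BYTES + TR_BYTES
         + (size_t)(p->l + p->k) * p->polyeta_bytes
         + (size_t)row * POLYT0_BYTES;
}
```

Since the offsets depend only on the row index and the parameter set, a row can be written as soon as it is computed, with no need to hold the other rows.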

Drop now-dead mld_polyveck_add, mld_polyveck_pack_t0,
mld_polyveck_power2round and mld_pack_pk along with their CBMC proofs.

Update the affected CBMC proofs.

Depends on lowram: Stream matrix A element-by-element to reduce memory #1019
Depends on Lowram: Share buffers with non-overlapping lifetimes in keygen #1011
Hoisted out from PoC: Reduce large struct allocations to <= 13/17/21 KiB for ML-DSA-44/65/87 #1005

Replace the row-level matrix buffer (mld_polyvecl) with a single-poly buffer in REDUCE_RAM mode. In the lazy path, matrix elements A[k][l] are sampled on demand one at a time, and the matrix-vector product accumulates element-by-element instead of row-by-row.

Restructure polymat into eager/lazy variants following the same pattern as s1hat/s2hat/t0hat:
- mld_polymat_eager: stores full K x L matrix
- mld_polymat_lazy: stores rho + single poly_buffer + tmp
- mld_polyvec_matrix_expand_eager/_lazy: separate implementations
- mld_polyvec_matrix_pointwise_montgomery_eager/_lazy: separate implementations with CBMC contracts only on the eager variants

Move all polymat-related code from polyvec.h/polyvec.c into polyvec_lazy.h/polyvec_lazy.c.

Signed-off-by: Matthias J. Kannwischer <matthias@zerorisc.com>
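
The eager/lazy split described in this commit can be sketched as follows. This is a toy under stated assumptions, not the PR's implementation: the `toy_*` names, the sizes, and the deterministic `toy_sample_a` (standing in for expanding A[k][l] from rho) are all hypothetical. Plain modular multiplication replaces Montgomery arithmetic, and the lazy struct omits the `tmp` field mentioned above.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_N 4 /* hypothetical toy sizes */
#define TOY_K 2
#define TOY_L 3
#define TOY_Q 8380417

typedef struct { int32_t coeffs[TOY_N]; } toy_poly;

/* Eager: the full K x L matrix is held in memory. */
typedef struct { toy_poly m[TOY_K][TOY_L]; } toy_polymat_eager;
/* Lazy: only the seed and one polynomial buffer are held. */
typedef struct { uint32_t rho; toy_poly buf; } toy_polymat_lazy;

/* Hypothetical stand-in for sampling A[k][l] from the seed rho. */
static void toy_sample_a(toy_poly *a, uint32_t rho, unsigned k, unsigned l)
{
  unsigned i;
  for (i = 0; i < TOY_N; i++)
    a->coeffs[i] = (int32_t)((rho + 7u * k + 13u * l + 31u * i) % TOY_Q);
}

static void toy_expand_eager(toy_polymat_eager *mat, uint32_t rho)
{
  unsigned k, l;
  for (k = 0; k < TOY_K; k++)
    for (l = 0; l < TOY_L; l++)
      toy_sample_a(&mat->m[k][l], rho, k, l);
}

/* w += a * v, coefficient-wise modulo q (inputs assumed in [0, q)). */
static void toy_acc(toy_poly *w, const toy_poly *a, const toy_poly *v)
{
  unsigned i;
  for (i = 0; i < TOY_N; i++)
    w->coeffs[i] = (int32_t)((w->coeffs[i]
                   + (int64_t)a->coeffs[i] * v->coeffs[i]) % TOY_Q);
}

/* Row k of A*v using the stored matrix. */
static void toy_row_acc_eager(toy_poly *w, const toy_polymat_eager *mat,
                              unsigned k, const toy_poly v[TOY_L])
{
  unsigned l;
  toy_poly zero = {{0}};
  assert(k < TOY_K);
  *w = zero;
  for (l = 0; l < TOY_L; l++)
    toy_acc(w, &mat->m[k][l], &v[l]);
}

/* Row k of A*v, sampling each A[k][l] into the single buffer on demand. */
static void toy_row_acc_lazy(toy_poly *w, toy_polymat_lazy *mat,
                             unsigned k, const toy_poly v[TOY_L])
{
  unsigned l;
  toy_poly zero = {{0}};
  assert(k < TOY_K);
  *w = zero;
  for (l = 0; l < TOY_L; l++)
  {
    toy_sample_a(&mat->buf, mat->rho, k, l); /* overwrite previous element */
    toy_acc(w, &mat->buf, &v[l]);
  }
}
```

Both variants take `(mat, k, v)` and produce the same row; the lazy one trades repeated sampling for a matrix footprint of one polynomial.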

Reuse t0 as the accumulator in mld_compute_t0_t1_tr_from_sk_components, and have the caller provide s1 already in NTT form, removing two allocations (s1hat and t) from the helper.

In mld_sign_keypair_internal, share the s1 and t1 buffers via a union since s1hat is consumed before t1 is produced. Pack s1 into the secret key before the in-place NTT so the original coefficients are preserved.

Split mld_pack_sk into mld_pack_sk_s1 and mld_pack_sk_rho_key_tr_s2_t0 to support packing s1 independently before the NTT.

Signed-off-by: Matthias J. Kannwischer <matthias@zerorisc.com>
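
The buffer sharing and the pack-before-NTT ordering from this commit can be illustrated with a small sketch. Everything here is hypothetical (`toy_*` names, sizes, and the placeholder transform used in place of the NTT); it only shows why the ordering matters when two values with non-overlapping lifetimes share one union.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_N 4 /* hypothetical toy sizes */
#define TOY_L 2
#define TOY_K 3

typedef struct { int32_t coeffs[TOY_N]; } toy_poly;
typedef struct { toy_poly vec[TOY_L]; } toy_polyvecl;
typedef struct { toy_poly vec[TOY_K]; } toy_polyveck;

/* s1 (later s1hat) is dead before t1 is produced, so they share storage. */
typedef union { toy_polyvecl s1; toy_polyveck t1; } toy_s1_t1_buf;

/* Placeholder for the in-place NTT: any transform overwriting its input. */
static void toy_transform_inplace(toy_polyvecl *v)
{
  unsigned l, i;
  for (l = 0; l < TOY_L; l++)
    for (i = 0; i < TOY_N; i++)
      v->vec[l].coeffs[i] = 2 * v->vec[l].coeffs[i] + 1;
}

/* Stand-in for packing s1 into the secret key. */
static void toy_pack_sk_s1(int32_t *sk_s1, const toy_polyvecl *s1)
{
  memcpy(sk_s1, s1, sizeof *s1);
}

static void toy_keygen(int32_t *sk_s1, int32_t *t1_out,
                       const toy_polyvecl *s1_in)
{
  toy_s1_t1_buf u;
  int32_t sum = 0;
  unsigned l, k, i;

  u.s1 = *s1_in;
  toy_pack_sk_s1(sk_s1, &u.s1);  /* 1. pack while coefficients are original */
  toy_transform_inplace(&u.s1);  /* 2. overwrite s1 with its transform */
  for (l = 0; l < TOY_L; l++)    /* 3. consume the transformed s1 */
    for (i = 0; i < TOY_N; i++)
      sum += u.s1.vec[l].coeffs[i];
  for (k = 0; k < TOY_K; k++)    /* 4. s1 is dead; reuse the storage as t1 */
    for (i = 0; i < TOY_N; i++)
      u.t1.vec[k].coeffs[i] = sum + (int32_t)k;
  memcpy(t1_out, &u.t1, sizeof u.t1);
}
```

Swapping steps 1 and 2 would pack the transformed coefficients instead of the original s1, which is the hazard the commit message calls out.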

Replace mld_compute_t0_t1_tr_from_sk_components with the per-row mld_compute_t0k_t1k. Both keygen and pk_from_sk now process one row at a time, packing t1[k] into pk and t0[k] into sk immediately, which eliminates the full polyveck allocations for t0 and t1.

To express this cleanly, refactor mld_polyvecl_pointwise_acc_montgomery to take (mat, k, v) instead of (u, v) and move it into polyvec_lazy with eager and lazy variants. Add per-row pack helpers mld_pack_sk_t0 / mld_pack_pk_t1, and split mld_pack_sk_rho_key_tr_s2_t0 to drop the t0 packing.

Drop now-dead mld_polyveck_add, mld_polyveck_pack_t0, mld_polyveck_power2round and mld_pack_pk along with their CBMC proofs.

Update the affected CBMC proofs.

Signed-off-by: Matthias J. Kannwischer <matthias@zerorisc.com>

oqs-bot · 2026-04-08T08:29:17Z

CBMC Results (ML-DSA-44)

⚠️ Attention Required

Proof	Status	Current	Previous	Change
`sign_pk_from_sk`	⚠️	171s	6s	+2750%

Full Results (182 proofs)

Proof	Status	Current	Previous	Change
`TOTAL`	✅	2074s	2061s	+0.6%
`polyvecl_pointwise_acc_montgomery_c`	✅	268s	350s	-23%
`sign_verify_internal`	✅	201s	211s	-5%
`sign_pk_from_sk`	⚠️	171s	6s	+2750%
`poly_pointwise_montgomery_c`	✅	153s	148s	+3%
`rej_uniform_native`	✅	144s	142s	+1%
`mld_attempt_signature_generation`	✅	104s	105s	-1%
`mld_invntt_layer`	✅	74s	88s	-16%
`mld_ct_memcmp`	✅	71s	75s	-5%
`mld_ntt_layer`	✅	36s	54s	-33%
`polyvecl_pointwise_acc_montgomery_eager`	✅	23s	-	new
`sign_signature_internal`	✅	23s	20s	+15%
`polyvec_matrix_expand_eager`	✅	22s	-	new
`rej_uniform`	✅	20s	23s	-13%
`fqmul`	✅	19s	20s	-5%
`poly_uniform_eta_4x`	✅	19s	16s	+19%
`poly_chknorm_c`	✅	18s	19s	-5%
`sign_keypair_internal`	✅	16s	4s	+300%
`poly_uniform_4x`	✅	15s	14s	+7%
`polyeta_unpack`	✅	15s	16s	-6%
`poly_add`	✅	14s	11s	+27%
`keccakf1600x4_permute_native`	✅	13s	11s	+18%
`polyt0_unpack`	✅	13s	13s	+0%
`rej_uniform_c`	✅	13s	15s	-13%
`mld_ntt_butterfly_block`	✅	12s	13s	-8%
`polymat_permute_bitrev_to_custom`	✅	12s	28s	-57%
`polyz_unpack_c`	✅	10s	12s	-17%
`keccak_absorb_once_x4`	✅	9s	11s	-18%
`keccakf1600_permute_native`	✅	9s	7s	+29%
`mld_check_pct`	✅	9s	12s	-25%
`polyveck_chknorm`	✅	9s	4s	+125%
`keccakf1600_permute`	✅	8s	11s	-27%
`poly_invntt_tomont_c`	✅	8s	7s	+14%
`polyveck_caddq`	✅	8s	5s	+60%
`sign`	✅	8s	6s	+33%
`compute_t0k_t1k`	✅	7s	-	new
`keccak_absorb`	✅	7s	7s	+0%
`poly_use_hint_c`	✅	7s	4s	+75%
`polyvec_matrix_expand_eager_serial`	✅	7s	-	new
`polyveck_reduce`	✅	7s	4s	+75%
`mld_compute_pack_z`	✅	6s	5s	+20%
`mld_polyvecl_permute_bitrev_to_custom_native`	✅	6s	7s	-14%
`pack_sig_c`	✅	6s	5s	+20%
`poly_power2round`	✅	6s	5s	+20%
`polyveck_use_hint`	✅	6s	5s	+20%
`unpack_sig`	✅	6s	2s	+200%
`caddq`	✅	5s	4s	+25%
`keccak_squeezeblocks_x4`	✅	5s	7s	-29%
`make_hint`	✅	5s	2s	+150%
`mld_h`	✅	5s	4s	+25%
`mld_prepare_domain_separation_prefix`	✅	5s	4s	+25%
`mld_sample_s1_s2_serial`	✅	5s	8s	-38%
`poly_caddq_c`	✅	5s	7s	-29%
`poly_chknorm_native`	✅	5s	4s	+25%
`polyt1_unpack`	✅	5s	3s	+67%
`polyvec_matrix_pointwise_montgomery_eager`	✅	5s	-	new
`polyveck_decompose`	✅	5s	7s	-29%
`polyveck_invntt_tomont`	✅	5s	6s	-17%
`polyveck_ntt`	✅	5s	5s	+0%
`polyvecl_chknorm`	✅	5s	3s	+67%
`polyvecl_pack_eta`	✅	5s	2s	+150%
`power2round`	✅	5s	4s	+25%
`reduce32`	✅	5s	1s	+400%
`rej_eta_native`	✅	5s	5s	+0%
`sign_signature_extmu`	✅	5s	3s	+67%
`sign_signature_pre_hash_shake256`	✅	5s	6s	-17%
`sign_verify_pre_hash_internal`	✅	5s	6s	-17%
`unpack_sk`	✅	5s	9s	-44%
`use_hint`	✅	5s	4s	+25%
`intt_native_x86_64`	✅	4s	2s	+100%
`keccakf1600_xor_bytes (big endian)`	✅	4s	3s	+33%
`mld_ct_cmask_nonzero_u32`	✅	4s	3s	+33%
`mld_ct_cmask_nonzero_u8`	✅	4s	2s	+100%
`mld_keccakf1600_extract_bytes`	✅	4s	3s	+33%
`pack_sig_h_poly`	✅	4s	2s	+100%
`pack_sk_rho_key_tr_s2`	✅	4s	-	new
`pointwise_native_x86_64`	✅	4s	3s	+33%
`poly_caddq`	✅	4s	3s	+33%
`poly_decompose_c`	✅	4s	3s	+33%
`poly_decompose_native`	✅	4s	4s	+0%
`poly_invntt_tomont_native`	✅	4s	4s	+0%
`poly_ntt`	✅	4s	4s	+0%
`poly_pointwise_montgomery`	✅	4s	6s	-33%
`poly_shiftl`	✅	4s	3s	+33%
`poly_uniform`	✅	4s	4s	+0%
`poly_uniform_gamma1_4x`	✅	4s	3s	+33%
`polyt1_pack`	✅	4s	3s	+33%
`polyveck_pointwise_poly_montgomery`	✅	4s	5s	-20%
`polyveck_sub`	✅	4s	5s	-20%
`polyveck_unpack_eta`	✅	4s	3s	+33%
`polyvecl_ntt`	✅	4s	4s	+0%
`shake128x4_absorb_once`	✅	4s	3s	+33%
`shake256_finalize`	✅	4s	2s	+100%
`sign_keypair`	✅	4s	3s	+33%
`sign_verify`	✅	4s	5s	-20%
`sign_verify_extmu`	✅	4s	4s	+0%
`sys_check_capability`	✅	4s	3s	+33%
`unpack_hints`	✅	4s	5s	-20%
`decompose`	✅	3s	3s	+0%
`keccak_squeeze`	✅	3s	2s	+50%
`keccakf1600_extract_bytes (big endian)`	✅	3s	2s	+50%
`keccakf1600_xor_bytes`	✅	3s	1s	+200%
`keccakf1600x4_permute`	✅	3s	4s	-25%
`mld_ct_abs_i32`	✅	3s	1s	+200%
`mld_ct_get_optblocker_u32`	✅	3s	3s	+0%
`mld_ct_sel_int32`	✅	3s	2s	+50%
`mld_sample_s1_s2`	✅	3s	5s	-40%
`mld_value_barrier_i64`	✅	3s	3s	+0%
`mld_value_barrier_u32`	✅	3s	2s	+50%
`ntt_native_x86_64`	✅	3s	3s	+0%
`pack_pk_t1`	✅	3s	-	new
`pack_sig_z`	✅	3s	3s	+0%
`poly_caddq_native_aarch64`	✅	3s	3s	+0%
`poly_challenge`	✅	3s	4s	-25%
`poly_chknorm`	✅	3s	5s	-40%
`poly_chknorm_native_aarch64`	✅	3s	2s	+50%
`poly_decompose`	✅	3s	2s	+50%
`poly_make_hint`	✅	3s	3s	+0%
`poly_ntt_c`	✅	3s	2s	+50%
`poly_ntt_native`	✅	3s	4s	-25%
`poly_sub`	✅	3s	3s	+0%
`poly_uniform_eta`	✅	3s	4s	-25%
`poly_uniform_gamma1`	✅	3s	3s	+0%
`poly_use_hint_native`	✅	3s	3s	+0%
`polyt0_pack`	✅	3s	7s	-57%
`polyveck_unpack_t0`	✅	3s	2s	+50%
`polyvecl_uniform_gamma1`	✅	3s	4s	-25%
`polyvecl_uniform_gamma1_serial`	✅	3s	3s	+0%
`polyz_unpack`	✅	3s	2s	+50%
`polyz_unpack_native`	✅	3s	3s	+0%
`rej_eta`	✅	3s	5s	-40%
`rej_eta_c`	✅	3s	5s	-40%
`shake128_finalize`	✅	3s	2s	+50%
`shake256`	✅	3s	2s	+50%
`shake256_release`	✅	3s	2s	+50%
`shake256_squeeze`	✅	3s	2s	+50%
`shake256x4_absorb_once`	✅	3s	3s	+0%
`shake256x4_squeezeblocks`	✅	3s	3s	+0%
`sign_open`	✅	3s	2s	+50%
`sign_signature_pre_hash_internal`	✅	3s	6s	-50%
`fqscale`	✅	2s	5s	-60%
`keccak_f1600_x1_native_aarch64`	✅	2s	2s	+0%
`keccak_f1600_x4_native_aarch64_v84a`	✅	2s	2s	+0%
`keccak_f1600_x4_native_aarch64_v8a_scalar_hybrid`	✅	2s	3s	-33%
`keccak_f1600_x4_native_aarch64_v8a_v84a_scalar_hybrid`	✅	2s	2s	+0%
`keccak_init`	✅	2s	2s	+0%
`keccakf1600x4_extract_bytes`	✅	2s	4s	-50%
`keccakf1600x4_xor_bytes`	✅	2s	2s	+0%
`mld_ct_get_optblocker_i64`	✅	2s	2s	+0%
`montgomery_reduce`	✅	2s	3s	-33%
`ntt_native_aarch64`	✅	2s	2s	+0%
`pack_sk_s1`	✅	2s	-	new
`pack_sk_t0`	✅	2s	-	new
`pointwise_native_aarch64`	✅	2s	3s	-33%
`poly_caddq_native`	✅	2s	3s	-33%
`poly_invntt_tomont`	✅	2s	3s	-33%
`poly_pointwise_montgomery_native`	✅	2s	2s	+0%
`poly_reduce`	✅	2s	3s	-33%
`poly_use_hint`	✅	2s	2s	+0%
`polyeta_pack`	✅	2s	2s	+0%
`polyveck_pack_eta`	✅	2s	2s	+0%
`polyveck_shiftl`	✅	2s	5s	-60%
`polyvecl_permute_bitrev_to_custom`	✅	2s	2s	+0%
`polyvecl_unpack_z`	✅	2s	5s	-60%
`polyw1_pack`	✅	2s	3s	-33%
`polyz_pack`	✅	2s	2s	+0%
`shake128_absorb`	✅	2s	2s	+0%
`shake128_init`	✅	2s	2s	+0%
`shake128_release`	✅	2s	4s	-50%
`shake128_squeeze`	✅	2s	2s	+0%
`shake128x4_squeezeblocks`	✅	2s	4s	-50%
`shake256_absorb`	✅	2s	4s	-50%
`shake256_init`	✅	2s	3s	-33%
`sign_signature`	✅	2s	4s	-50%
`sign_verify_pre_hash_shake256`	✅	2s	6s	-67%
`unpack_pk`	✅	2s	3s	-33%
`keccak_f1600_x1_native_aarch64_v84a`	✅	1s	2s	-50%
`keccak_finalize`	✅	1s	4s	-75%
`mld_ct_cmask_neg_i32`	✅	1s	2s	-50%
`mld_ct_get_optblocker_u8`	✅	1s	2s	-50%
`mld_value_barrier_u8`	✅	1s	3s	-67%
`polyveck_pack_w1`	✅	1s	3s	-67%
`polyvecl_unpack_eta`	✅	1s	3s	-67%

oqs-bot · 2026-04-08T08:30:06Z

CBMC Results (ML-DSA-87)

⚠️ Attention Required

Proof	Status	Current	Previous	Change
`mld_attempt_signature_generation`	⚠️	163s	97s	+68%
`polymat_permute_bitrev_to_custom`	⚠️	158s	25s	+532%
`polyveck_decompose`	⚠️	22s	13s	+69%
`sign_keypair_internal`	⚠️	27s	4s	+575%
`sign_pk_from_sk`	⚠️	51s	8s	+538%
`sign_verify_internal`	⚠️	293s	144s	+103%

Full Results (182 proofs)

Proof	Status	Current	Previous	Change
`TOTAL`	✅	2619s	2234s	+17.2%
`sign_verify_internal`	⚠️	293s	144s	+103%
`polyvecl_pointwise_acc_montgomery_c`	✅	274s	194s	+41%
`polyvec_matrix_expand_eager`	✅	183s	-	new
`mld_attempt_signature_generation`	⚠️	163s	97s	+68%
`polymat_permute_bitrev_to_custom`	⚠️	158s	25s	+532%
`poly_pointwise_montgomery_c`	✅	141s	155s	-9%
`rej_uniform_native`	✅	141s	144s	-2%
`polyvec_matrix_expand_eager_serial`	✅	87s	-	new
`mld_ct_memcmp`	✅	74s	73s	+1%
`polyvecl_pointwise_acc_montgomery_eager`	✅	69s	-	new
`mld_invntt_layer`	✅	68s	95s	-28%
`mld_ntt_layer`	✅	54s	54s	+0%
`sign_pk_from_sk`	⚠️	51s	8s	+538%
`sign_signature_internal`	✅	39s	40s	-3%
`sign_keypair_internal`	⚠️	27s	4s	+575%
`polyveck_decompose`	⚠️	22s	13s	+69%
`fqmul`	✅	21s	19s	+11%
`rej_uniform`	✅	20s	19s	+5%
`poly_chknorm_c`	✅	18s	20s	-10%
`keccakf1600x4_permute_native`	✅	17s	13s	+31%
`polyeta_unpack`	✅	17s	18s	-6%
`poly_uniform_eta_4x`	✅	16s	15s	+7%
`poly_uniform_4x`	✅	15s	16s	-6%
`polyt0_unpack`	✅	14s	15s	-7%
`rej_uniform_c`	✅	14s	13s	+8%
`mld_check_pct`	✅	13s	13s	+0%
`poly_add`	✅	12s	13s	-8%
`polyveck_reduce`	✅	12s	7s	+71%
`mld_ntt_butterfly_block`	✅	11s	15s	-27%
`keccak_absorb_once_x4`	✅	10s	11s	-9%
`poly_invntt_tomont_c`	✅	10s	6s	+67%
`polyveck_pointwise_poly_montgomery`	✅	10s	5s	+100%
`polyz_unpack_c`	✅	10s	10s	+0%
`mld_polyvecl_permute_bitrev_to_custom_native`	✅	9s	13s	-31%
`polyvec_matrix_pointwise_montgomery_eager`	✅	9s	-	new
`polyveck_ntt`	✅	9s	9s	+0%
`keccak_absorb`	✅	8s	7s	+14%
`keccakf1600_permute_native`	✅	8s	8s	+0%
`polyveck_caddq`	✅	8s	5s	+60%
`polyveck_invntt_tomont`	✅	8s	9s	-11%
`polyveck_shiftl`	✅	8s	9s	-11%
`unpack_sk`	✅	8s	9s	-11%
`keccak_squeezeblocks_x4`	✅	7s	7s	+0%
`keccakf1600_permute`	✅	7s	9s	-22%
`poly_challenge`	✅	7s	5s	+40%
`poly_decompose_c`	✅	7s	9s	-22%
`polyveck_use_hint`	✅	7s	12s	-42%
`polyvecl_ntt`	✅	7s	7s	+0%
`sign`	✅	7s	7s	+0%
`mld_sample_s1_s2`	✅	6s	5s	+20%
`mld_sample_s1_s2_serial`	✅	6s	5s	+20%
`poly_uniform_eta`	✅	6s	6s	+0%
`polyveck_sub`	✅	6s	7s	-14%
`polyvecl_chknorm`	✅	6s	9s	-33%
`shake256`	✅	6s	2s	+200%
`sign_signature_extmu`	✅	6s	4s	+50%
`mld_compute_pack_z`	✅	5s	7s	-29%
`mld_prepare_domain_separation_prefix`	✅	5s	2s	+150%
`pack_sig_c`	✅	5s	7s	-29%
`pointwise_native_x86_64`	✅	5s	3s	+67%
`poly_caddq_native`	✅	5s	5s	+0%
`poly_power2round`	✅	5s	4s	+25%
`poly_uniform_gamma1_4x`	✅	5s	5s	+0%
`polyt0_pack`	✅	5s	2s	+150%
`sign_open`	✅	5s	7s	-29%
`unpack_sig`	✅	5s	5s	+0%
`compute_t0k_t1k`	✅	4s	-	new
`keccak_f1600_x1_native_aarch64`	✅	4s	2s	+100%
`mld_h`	✅	4s	5s	-20%
`ntt_native_aarch64`	✅	4s	4s	+0%
`ntt_native_x86_64`	✅	4s	3s	+33%
`pack_pk_t1`	✅	4s	-	new
`poly_caddq_c`	✅	4s	6s	-33%
`poly_ntt`	✅	4s	4s	+0%
`poly_pointwise_montgomery`	✅	4s	3s	+33%
`poly_uniform`	✅	4s	4s	+0%
`poly_uniform_gamma1`	✅	4s	3s	+33%
`poly_use_hint_c`	✅	4s	5s	-20%
`poly_use_hint_native`	✅	4s	4s	+0%
`polyveck_chknorm`	✅	4s	7s	-43%
`polyveck_unpack_t0`	✅	4s	6s	-33%
`polyvecl_uniform_gamma1`	✅	4s	6s	-33%
`polyz_unpack_native`	✅	4s	3s	+33%
`shake128_finalize`	✅	4s	2s	+100%
`shake128_squeeze`	✅	4s	3s	+33%
`sign_keypair`	✅	4s	2s	+100%
`sign_signature_pre_hash_shake256`	✅	4s	5s	-20%
`sign_verify_extmu`	✅	4s	2s	+100%
`sign_verify_pre_hash_shake256`	✅	4s	3s	+33%
`unpack_hints`	✅	4s	6s	-33%
`unpack_pk`	✅	4s	4s	+0%
`use_hint`	✅	4s	2s	+100%
`decompose`	✅	3s	4s	-25%
`intt_native_x86_64`	✅	3s	3s	+0%
`keccak_f1600_x1_native_aarch64_v84a`	✅	3s	1s	+200%
`keccak_f1600_x4_native_aarch64_v8a_scalar_hybrid`	✅	3s	3s	+0%
`keccak_finalize`	✅	3s	3s	+0%
`keccak_init`	✅	3s	2s	+50%
`keccakf1600x4_extract_bytes`	✅	3s	3s	+0%
`make_hint`	✅	3s	4s	-25%
`mld_ct_abs_i32`	✅	3s	2s	+50%
`mld_ct_cmask_nonzero_u32`	✅	3s	2s	+50%
`mld_ct_cmask_nonzero_u8`	✅	3s	2s	+50%
`mld_ct_get_optblocker_i64`	✅	3s	3s	+0%
`mld_ct_get_optblocker_u32`	✅	3s	3s	+0%
`mld_ct_sel_int32`	✅	3s	1s	+200%
`mld_value_barrier_u32`	✅	3s	1s	+200%
`pack_sig_h_poly`	✅	3s	2s	+50%
`pack_sig_z`	✅	3s	2s	+50%
`pointwise_native_aarch64`	✅	3s	2s	+50%
`poly_caddq_native_aarch64`	✅	3s	4s	-25%
`poly_chknorm_native`	✅	3s	4s	-25%
`poly_chknorm_native_aarch64`	✅	3s	6s	-50%
`poly_decompose`	✅	3s	2s	+50%
`poly_invntt_tomont_native`	✅	3s	5s	-40%
`poly_ntt_c`	✅	3s	3s	+0%
`poly_ntt_native`	✅	3s	1s	+200%
`poly_reduce`	✅	3s	2s	+50%
`poly_use_hint`	✅	3s	4s	-25%
`polyeta_pack`	✅	3s	3s	+0%
`polyt1_unpack`	✅	3s	3s	+0%
`polyveck_pack_eta`	✅	3s	4s	-25%
`polyveck_pack_w1`	✅	3s	5s	-40%
`polyvecl_unpack_eta`	✅	3s	3s	+0%
`polyvecl_unpack_z`	✅	3s	3s	+0%
`polyw1_pack`	✅	3s	2s	+50%
`polyz_pack`	✅	3s	3s	+0%
`power2round`	✅	3s	5s	-40%
`reduce32`	✅	3s	4s	-25%
`rej_eta`	✅	3s	5s	-40%
`rej_eta_c`	✅	3s	5s	-40%
`shake128_absorb`	✅	3s	4s	-25%
`shake128_release`	✅	3s	2s	+50%
`shake128x4_squeezeblocks`	✅	3s	2s	+50%
`shake256x4_squeezeblocks`	✅	3s	3s	+0%
`sign_signature_pre_hash_internal`	✅	3s	4s	-25%
`sign_verify`	✅	3s	3s	+0%
`sign_verify_pre_hash_internal`	✅	3s	5s	-40%
`caddq`	✅	2s	3s	-33%
`fqscale`	✅	2s	1s	+100%
`keccak_f1600_x4_native_aarch64_v84a`	✅	2s	4s	-50%
`keccak_f1600_x4_native_aarch64_v8a_v84a_scalar_hybrid`	✅	2s	3s	-33%
`keccakf1600x4_permute`	✅	2s	4s	-50%
`mld_keccakf1600_extract_bytes`	✅	2s	5s	-60%
`mld_value_barrier_i64`	✅	2s	3s	-33%
`mld_value_barrier_u8`	✅	2s	1s	+100%
`montgomery_reduce`	✅	2s	4s	-50%
`pack_sk_rho_key_tr_s2`	✅	2s	-	new
`pack_sk_s1`	✅	2s	-	new
`pack_sk_t0`	✅	2s	-	new
`poly_caddq`	✅	2s	4s	-50%
`poly_chknorm`	✅	2s	2s	+0%
`poly_decompose_native`	✅	2s	3s	-33%
`poly_invntt_tomont`	✅	2s	3s	-33%
`poly_make_hint`	✅	2s	4s	-50%
`poly_pointwise_montgomery_native`	✅	2s	3s	-33%
`poly_sub`	✅	2s	4s	-50%
`polyt1_pack`	✅	2s	2s	+0%
`polyveck_unpack_eta`	✅	2s	3s	-33%
`polyvecl_pack_eta`	✅	2s	6s	-67%
`polyvecl_permute_bitrev_to_custom`	✅	2s	3s	-33%
`polyvecl_uniform_gamma1_serial`	✅	2s	3s	-33%
`rej_eta_native`	✅	2s	4s	-50%
`shake128_init`	✅	2s	2s	+0%
`shake128x4_absorb_once`	✅	2s	3s	-33%
`shake256_absorb`	✅	2s	3s	-33%
`shake256_init`	✅	2s	4s	-50%
`shake256_squeeze`	✅	2s	3s	-33%
`shake256x4_absorb_once`	✅	2s	2s	+0%
`sign_signature`	✅	2s	4s	-50%
`sys_check_capability`	✅	2s	3s	-33%
`keccak_squeeze`	✅	1s	3s	-67%
`keccakf1600_extract_bytes (big endian)`	✅	1s	8s	-88%
`keccakf1600_xor_bytes`	✅	1s	3s	-67%
`keccakf1600_xor_bytes (big endian)`	✅	1s	4s	-75%
`keccakf1600x4_xor_bytes`	✅	1s	3s	-67%
`mld_ct_cmask_neg_i32`	✅	1s	2s	-50%
`mld_ct_get_optblocker_u8`	✅	1s	2s	-50%
`poly_shiftl`	✅	1s	3s	-67%
`polyz_unpack`	✅	1s	1s	+0%
`shake256_finalize`	✅	1s	3s	-67%
`shake256_release`	✅	1s	2s	-50%

oqs-bot · 2026-04-08T08:50:50Z

CBMC Results (ML-DSA-65)

⚠️ Attention Required

Proof	Status	Current	Previous	Change
`TOTAL`	⚠️	3637s	2418s	+50.4%
`polyvecl_pointwise_acc_montgomery_c`	⚠️	1556s	295s	+427%
`sign_keypair_internal`	⚠️	22s	7s	+214%
`sign_pk_from_sk`	⚠️	41s	9s	+356%

Full Results (182 proofs)

Proof	Status	Current	Previous	Change
`TOTAL`	⚠️	3637s	2418s	+50.4%
`polyvecl_pointwise_acc_montgomery_c`	⚠️	1556s	295s	+427%
`sign_verify_internal`	✅	295s	289s	+2%
`poly_pointwise_montgomery_c`	✅	168s	164s	+2%
`rej_uniform_native`	✅	151s	154s	-2%
`mld_attempt_signature_generation`	✅	150s	110s	+36%
`polyvec_matrix_expand_eager`	✅	108s	-	new
`mld_ct_memcmp`	✅	77s	80s	-4%
`mld_invntt_layer`	✅	73s	98s	-26%
`mld_ntt_layer`	✅	56s	56s	+0%
`polyvec_matrix_expand_eager_serial`	✅	46s	-	new
`polyvecl_pointwise_acc_montgomery_eager`	✅	42s	-	new
`sign_pk_from_sk`	⚠️	41s	9s	+356%
`sign_signature_internal`	✅	36s	30s	+20%
`fqmul`	✅	22s	20s	+10%
`rej_uniform`	✅	22s	22s	+0%
`sign_keypair_internal`	⚠️	22s	7s	+214%
`poly_chknorm_c`	✅	21s	21s	+0%
`poly_uniform_eta_4x`	✅	19s	19s	+0%
`poly_uniform_4x`	✅	15s	17s	-12%
`polyt0_unpack`	✅	15s	17s	-12%
`rej_uniform_c`	✅	15s	15s	+0%
`mld_polyvecl_permute_bitrev_to_custom_native`	✅	14s	10s	+40%
`polymat_permute_bitrev_to_custom`	✅	14s	34s	-59%
`polyveck_decompose`	✅	14s	18s	-22%
`keccakf1600x4_permute_native`	✅	13s	14s	-7%
`mld_ntt_butterfly_block`	✅	12s	12s	+0%
`poly_add`	✅	12s	10s	+20%
`poly_invntt_tomont_c`	✅	11s	6s	+83%
`polyveck_shiftl`	✅	11s	9s	+22%
`keccakf1600_permute`	✅	10s	9s	+11%
`polyvec_matrix_pointwise_montgomery_eager`	✅	10s	-	new
`keccak_absorb_once_x4`	✅	9s	11s	-18%
`mld_check_pct`	✅	9s	14s	-36%
`poly_decompose_c`	✅	9s	9s	+0%
`poly_power2round`	✅	9s	4s	+125%
`polyveck_pointwise_poly_montgomery`	✅	9s	8s	+12%
`polyvecl_chknorm`	✅	9s	3s	+200%
`polyveck_caddq`	✅	8s	8s	+0%
`polyveck_sub`	✅	8s	6s	+33%
`unpack_sk`	✅	8s	9s	-11%
`keccak_absorb`	✅	7s	8s	-12%
`keccak_f1600_x4_native_aarch64_v8a_v84a_scalar_hybrid`	✅	7s	3s	+133%
`keccakf1600_permute_native`	✅	7s	8s	-12%
`mld_sample_s1_s2`	✅	7s	5s	+40%
`polyvecl_ntt`	✅	7s	8s	-12%
`sign_keypair`	✅	7s	4s	+75%
`compute_t0k_t1k`	✅	6s	-	new
`fqscale`	✅	6s	3s	+100%
`mld_compute_pack_z`	✅	6s	6s	+0%
`mld_h`	✅	6s	5s	+20%
`mld_sample_s1_s2_serial`	✅	6s	3s	+100%
`poly_challenge`	✅	6s	2s	+200%
`poly_uniform_eta`	✅	6s	6s	+0%
`polyveck_invntt_tomont`	✅	6s	7s	-14%
`polyveck_ntt`	✅	6s	9s	-33%
`polyveck_reduce`	✅	6s	5s	+20%
`polyveck_use_hint`	✅	6s	24s	-75%
`rej_eta_native`	✅	6s	6s	+0%
`sign`	✅	6s	9s	-33%
`keccak_squeezeblocks_x4`	✅	5s	7s	-29%
`make_hint`	✅	5s	6s	-17%
`poly_caddq_c`	✅	5s	5s	+0%
`poly_make_hint`	✅	5s	3s	+67%
`poly_uniform`	✅	5s	4s	+25%
`poly_use_hint_c`	✅	5s	5s	+0%
`polyeta_pack`	✅	5s	4s	+25%
`polyeta_unpack`	✅	5s	8s	-38%
`rej_eta_c`	✅	5s	4s	+25%
`shake128x4_absorb_once`	✅	5s	3s	+67%
`sign_signature_pre_hash_internal`	✅	5s	5s	+0%
`sign_signature_pre_hash_shake256`	✅	5s	3s	+67%
`sign_verify_extmu`	✅	5s	4s	+25%
`unpack_pk`	✅	5s	2s	+150%
`intt_native_x86_64`	✅	4s	4s	+0%
`keccak_f1600_x1_native_aarch64_v84a`	✅	4s	2s	+100%
`mld_ct_cmask_nonzero_u8`	✅	4s	4s	+0%
`mld_ct_get_optblocker_i64`	✅	4s	2s	+100%
`mld_keccakf1600_extract_bytes`	✅	4s	2s	+100%
`pack_sk_rho_key_tr_s2`	✅	4s	-	new
`poly_caddq_native_aarch64`	✅	4s	3s	+33%
`poly_decompose_native`	✅	4s	3s	+33%
`poly_ntt`	✅	4s	4s	+0%
`poly_ntt_c`	✅	4s	4s	+0%
`poly_reduce`	✅	4s	3s	+33%
`poly_uniform_gamma1`	✅	4s	2s	+100%
`polyt1_unpack`	✅	4s	3s	+33%
`polyveck_chknorm`	✅	4s	10s	-60%
`polyveck_pack_eta`	✅	4s	3s	+33%
`shake256_absorb`	✅	4s	3s	+33%
`shake256_release`	✅	4s	3s	+33%
`sign_open`	✅	4s	5s	-20%
`sign_signature_extmu`	✅	4s	5s	-20%
`sign_verify_pre_hash_shake256`	✅	4s	4s	+0%
`unpack_hints`	✅	4s	7s	-43%
`caddq`	✅	3s	4s	-25%
`decompose`	✅	3s	4s	-25%
`keccak_f1600_x1_native_aarch64`	✅	3s	4s	-25%
`keccak_init`	✅	3s	2s	+50%
`keccakf1600_xor_bytes (big endian)`	✅	3s	2s	+50%
`keccakf1600x4_extract_bytes`	✅	3s	2s	+50%
`keccakf1600x4_permute`	✅	3s	3s	+0%
`mld_ct_abs_i32`	✅	3s	1s	+200%
`mld_ct_cmask_neg_i32`	✅	3s	3s	+0%
`mld_ct_get_optblocker_u32`	✅	3s	2s	+50%
`mld_prepare_domain_separation_prefix`	✅	3s	6s	-50%
`mld_value_barrier_i64`	✅	3s	3s	+0%
`mld_value_barrier_u32`	✅	3s	3s	+0%
`montgomery_reduce`	✅	3s	4s	-25%
`ntt_native_x86_64`	✅	3s	4s	-25%
`pack_pk_t1`	✅	3s	-	new
`pack_sig_h_poly`	✅	3s	5s	-40%
`pack_sig_z`	✅	3s	3s	+0%
`pointwise_native_aarch64`	✅	3s	3s	+0%
`pointwise_native_x86_64`	✅	3s	5s	-40%
`poly_caddq_native`	✅	3s	3s	+0%
`poly_decompose`	✅	3s	2s	+50%
`poly_invntt_tomont`	✅	3s	3s	+0%
`poly_use_hint_native`	✅	3s	3s	+0%
`polyveck_pack_w1`	✅	3s	5s	-40%
`polyveck_unpack_eta`	✅	3s	5s	-40%
`polyvecl_pack_eta`	✅	3s	3s	+0%
`polyvecl_uniform_gamma1_serial`	✅	3s	5s	-40%
`polyvecl_unpack_eta`	✅	3s	3s	+0%
`polyvecl_unpack_z`	✅	3s	4s	-25%
`polyw1_pack`	✅	3s	4s	-25%
`polyz_pack`	✅	3s	3s	+0%
`polyz_unpack`	✅	3s	2s	+50%
`polyz_unpack_c`	✅	3s	4s	-25%
`power2round`	✅	3s	3s	+0%
`reduce32`	✅	3s	2s	+50%
`shake128_squeeze`	✅	3s	3s	+0%
`shake128x4_squeezeblocks`	✅	3s	2s	+50%
`shake256_squeeze`	✅	3s	4s	-25%
`shake256x4_absorb_once`	✅	3s	2s	+50%
`shake256x4_squeezeblocks`	✅	3s	2s	+50%
`sign_signature`	✅	3s	4s	-25%
`sign_verify_pre_hash_internal`	✅	3s	4s	-25%
`keccak_f1600_x4_native_aarch64_v84a`	✅	2s	3s	-33%
`keccak_f1600_x4_native_aarch64_v8a_scalar_hybrid`	✅	2s	2s	+0%
`keccak_finalize`	✅	2s	1s	+100%
`keccakf1600_extract_bytes (big endian)`	✅	2s	3s	-33%
`keccakf1600_xor_bytes`	✅	2s	2s	+0%
`mld_ct_cmask_nonzero_u32`	✅	2s	2s	+0%
`mld_ct_get_optblocker_u8`	✅	2s	3s	-33%
`mld_ct_sel_int32`	✅	2s	1s	+100%
`mld_value_barrier_u8`	✅	2s	4s	-50%
`ntt_native_aarch64`	✅	2s	4s	-50%
`pack_sig_c`	✅	2s	2s	+0%
`pack_sk_s1`	✅	2s	-	new
`pack_sk_t0`	✅	2s	-	new
`poly_chknorm`	✅	2s	2s	+0%
`poly_chknorm_native`	✅	2s	4s	-50%
`poly_chknorm_native_aarch64`	✅	2s	3s	-33%
`poly_invntt_tomont_native`	✅	2s	3s	-33%
`poly_ntt_native`	✅	2s	3s	-33%
`poly_pointwise_montgomery_native`	✅	2s	4s	-50%
`poly_shiftl`	✅	2s	2s	+0%
`poly_sub`	✅	2s	2s	+0%
`poly_uniform_gamma1_4x`	✅	2s	5s	-60%
`poly_use_hint`	✅	2s	3s	-33%
`polyt0_pack`	✅	2s	2s	+0%
`polyt1_pack`	✅	2s	2s	+0%
`polyveck_unpack_t0`	✅	2s	4s	-50%
`polyvecl_permute_bitrev_to_custom`	✅	2s	3s	-33%
`polyvecl_uniform_gamma1`	✅	2s	3s	-33%
`polyz_unpack_native`	✅	2s	1s	+100%
`rej_eta`	✅	2s	4s	-50%
`shake128_absorb`	✅	2s	2s	+0%
`shake128_finalize`	✅	2s	2s	+0%
`shake128_init`	✅	2s	3s	-33%
`shake128_release`	✅	2s	4s	-50%
`shake256`	✅	2s	3s	-33%
`shake256_finalize`	✅	2s	2s	+0%
`shake256_init`	✅	2s	1s	+100%
`sign_verify`	✅	2s	6s	-67%
`unpack_sig`	✅	2s	3s	-33%
`use_hint`	✅	2s	5s	-60%
`keccak_squeeze`	✅	1s	2s	-50%
`keccakf1600x4_xor_bytes`	✅	1s	3s	-67%
`poly_caddq`	✅	1s	4s	-75%
`poly_pointwise_montgomery`	✅	1s	2s	-50%
`sys_check_capability`	✅	1s	4s	-75%

mkannwischer · 2026-04-08T09:16:44Z

CBMC proof performance will still need investigation, but I'll do that after the other PRs have been merged.

mkannwischer added the low-ram label Apr 8, 2026

mkannwischer added 2 commits April 8, 2026 15:24

mkannwischer force-pushed the keygen-buffer-sharing-stream-a branch from 6d4e630 to 62a104b Compare April 8, 2026 07:29

mkannwischer force-pushed the keygen-buffer-sharing-stream-a branch from 62a104b to c9509d5 Compare April 8, 2026 07:42

mkannwischer assigned hanno-becker Apr 8, 2026

mkannwischer marked this pull request as ready for review April 8, 2026 09:16

mkannwischer requested a review from a team as a code owner April 8, 2026 09:16

mkannwischer wants to merge 3 commits into main from keygen-buffer-sharing-stream-a
