Use heap allocation + valgrind in backend unit test #1633

Merged
mkannwischer merged 4 commits into main from unit_valgrind on Apr 19, 2026

@hanno-becker
Contributor

hanno-becker force-pushed the unit_valgrind branch 4 times, most recently from 88001da to d97ce09 March 19, 2026 21:03
@oqs-bot
Contributor

oqs-bot commented Mar 19, 2026

CBMC Results (ML-KEM-512)

Full Results (191 proofs)
Proof Current Previous Change
**TOTAL** 1241s 1263s -1.7%
mlk_indcpa_keypair_derand 229s 233s -2%
mlk_indcpa_enc 155s 163s -5%
mlk_rej_uniform_c 106s 115s -8%
mlk_polyvec_basemul_acc_montgomery_cached_c 47s 49s -4%
mlk_poly_rej_uniform 29s 30s -3%
mlk_keccak_squeezeblocks_x4 25s 26s -4%
mlk_ntt_layer 25s 28s -11%
poly_ntt_native 24s 23s +4%
mlk_poly_reduce_native 20s 20s +0%
mlk_polyvec_add 19s 17s +12%
keccakf1600x4_permute_native_x4 18s 19s -5%
mlk_fqmul 18s 15s +20%
mlk_indcpa_dec 16s 15s +7%
mlk_poly_decompress_d10_native 14s 15s -7%
mlk_poly_decompress_d4_native 13s 13s +0%
mlk_keccak_absorb_once_x4 8s 6s +33%
mlk_keccak_squeezeblocks 8s 9s -11%
mlk_ntt_butterfly_block 8s 7s +14%
mlk_polymat_permute_bitrev_to_custom 8s 7s +14%
polyvec_basemul_acc_montgomery_cached_native 8s 7s +14%
mlk_poly_frombytes_native 7s 8s -12%
mlk_poly_frommsg 7s 6s +17%
mlk_poly_ntt 6s 4s +50%
mlk_poly_rej_uniform_x4 6s 6s +0%
poly_compress_d10_native_x86_64 6s 2s +200%
kem_dec 5s 3s +67%
mlk_invntt_layer 5s 5s +0%
mlk_keccak_squeeze_once 5s 6s -17%
mlk_poly_compress_d10 5s 4s +25%
mlk_poly_compress_d10_c 5s 3s +67%
mlk_poly_compress_d4_c 5s 2s +150%
mlk_shake256 5s 1s +400%
intt_native_aarch64 4s 3s +33%
kem_check_pk 4s 3s +33%
mlk_check_pct 4s 4s +0%
mlk_ct_cmask_neg_i16 4s 1s +300%
mlk_ct_get_optblocker_u32 4s 3s +33%
mlk_ct_sel_int16 4s 1s +300%
mlk_enc_getnoise_eta1_eta2 4s 2s +100%
mlk_keccak_absorb_once 4s 3s +33%
mlk_keccakf1600_extract_bytes (big endian) 4s 2s +100%
mlk_keccakf1600x4_extract_bytes_c 4s 3s +33%
mlk_poly_cbd_eta1 4s 3s +33%
mlk_poly_cbd_eta2 4s 5s -20%
mlk_poly_compress_d11 4s 2s +100%
mlk_polyvec_ntt 4s 2s +100%
mlk_polyvec_permute_bitrev_to_custom_native 4s 4s +0%
mlk_polyvec_reduce 4s 3s +33%
mlk_scalar_compress_d1 4s 3s +33%
mlk_scalar_decompress_d11 4s 4s +0%
poly_decompress_d4_native_x86_64 4s 5s -20%
poly_frombytes_native_x86_64 4s 5s -20%
keccak_f1600_x4_native_aarch64_v8a_scalar_hybrid 3s 2s +50%
keccakf1600x4_extract_bytes_native 3s 2s +50%
kem_enc 3s 3s +0%
kem_keypair 3s 2s +50%
kem_keypair_derand 3s 2s +50%
mlk_ct_cmask_nonzero_u16 3s 1s +200%
mlk_ct_cmov_zero 3s 2s +50%
mlk_keccakf1600_permute 3s 3s +0%
mlk_keccakf1600x4_extract_bytes 3s 1s +200%
mlk_poly_compress_d11_native 3s 2s +50%
mlk_poly_compress_d5_c 3s 2s +50%
mlk_poly_compress_du 3s 3s +0%
mlk_poly_decompress_d10 3s 3s +0%
mlk_poly_decompress_d11_c 3s 3s +0%
mlk_poly_decompress_d11_native 3s 1s +200%
mlk_poly_decompress_d4_c 3s 1s +200%
mlk_poly_mulcache_compute_c 3s 2s +50%
mlk_poly_mulcache_compute_native 3s 3s +0%
mlk_poly_ntt_c 3s 5s -40%
mlk_poly_reduce 3s 2s +50%
mlk_poly_tobytes_c 3s 3s +0%
mlk_poly_tomont_c 3s 4s -25%
mlk_poly_tomont_native 3s 2s +50%
mlk_poly_tomsg 3s 2s +50%
mlk_polyvec_compress_du 3s 2s +50%
mlk_polyvec_frombytes 3s 3s +0%
mlk_polyvec_mulcache_compute 3s 5s -40%
mlk_polyvec_permute_bitrev_to_custom 3s 1s +200%
mlk_polyvec_tobytes 3s 2s +50%
mlk_shake128_squeezeblocks 3s 3s +0%
mlk_shake128x4_absorb_once 3s 1s +200%
mlk_shake256x4 3s 5s -40%
ntt_native_aarch64 3s 4s -25%
ntt_native_x86_64 3s 3s +0%
poly_compress_d4_native_x86_64 3s 2s +50%
poly_decompress_d10_native_x86_64 3s 3s +0%
poly_decompress_d11_native_x86_64 3s 1s +200%
poly_mulcache_compute_native_aarch64 3s 2s +50%
poly_mulcache_compute_native_x86_64 3s 2s +50%
poly_reduce_native_x86_64 3s 4s -25%
poly_tobytes_native_aarch64 3s 3s +0%
polyvec_basemul_acc_montgomery_cached_k3_native_aarch64 3s 2s +50%
polyvec_basemul_acc_montgomery_cached_k3_native_x86_64 3s 1s +200%
polyvec_basemul_acc_montgomery_cached_k4_native_aarch64 3s 3s +0%
rej_uniform_native 3s 2s +50%
rej_uniform_native_aarch64 3s 2s +50%
intt_native_x86_64 2s 1s +100%
keccak_f1600_x1_native_aarch64 2s 3s -33%
keccak_f1600_x1_native_aarch64_v84a 2s 4s -50%
keccak_f1600_x4_native_aarch64_v84a 2s 1s +100%
keccak_f1600_x4_native_aarch64_v8a_v84a_scalar_hybrid 2s 2s +0%
kem_check_sk 2s 3s -33%
kem_enc_derand 2s 2s +0%
mlk_barrett_reduce 2s 1s +100%
mlk_ct_get_optblocker_i32 2s 2s +0%
mlk_ct_memcmp 2s 4s -50%
mlk_ct_sel_uint8 2s 2s +0%
mlk_gen_matrix 2s 2s +0%
mlk_keccakf1600_permute_c 2s 5s -60%
mlk_keccakf1600_xor_bytes 2s 3s -33%
mlk_keccakf1600_xor_bytes (big endian) 2s 1s +100%
mlk_keccakf1600x4_xor_bytes_c 2s 3s -33%
mlk_keypair_getnoise_eta1 2s 3s -33%
mlk_poly_add 2s 3s -33%
mlk_poly_compress_d11_c 2s 2s +0%
mlk_poly_compress_d4_native 2s 3s -33%
mlk_poly_compress_d5 2s 2s +0%
mlk_poly_compress_d5_native 2s 2s +0%
mlk_poly_compress_dv 2s 2s +0%
mlk_poly_decompress_d10_c 2s 1s +100%
mlk_poly_decompress_d11 2s 1s +100%
mlk_poly_decompress_d4 2s 2s +0%
mlk_poly_decompress_d5 2s 2s +0%
mlk_poly_decompress_d5_native 2s 1s +100%
mlk_poly_decompress_du 2s 1s +100%
mlk_poly_frombytes 2s 1s +100%
mlk_poly_frombytes_c 2s 3s -33%
mlk_poly_getnoise_eta1122_4x 2s 4s -50%
mlk_poly_getnoise_eta1_4x 2s 3s -33%
mlk_poly_getnoise_eta1_4x_native 2s 3s -33%
mlk_poly_getnoise_eta2 2s 1s +100%
mlk_poly_invntt_tomont_c 2s 3s -33%
mlk_poly_mulcache_compute 2s 2s +0%
mlk_poly_reduce_c 2s 3s -33%
mlk_poly_sub 2s 2s +0%
mlk_poly_tobytes 2s 3s -33%
mlk_poly_tomont 2s 2s +0%
mlk_polyvec_basemul_acc_montgomery_cached 2s 1s +100%
mlk_polyvec_invntt_tomont 2s 2s +0%
mlk_scalar_compress_d10 2s 2s +0%
mlk_scalar_compress_d11 2s 3s -33%
mlk_scalar_compress_d5 2s 1s +100%
mlk_scalar_decompress_d10 2s 2s +0%
mlk_scalar_decompress_d5 2s 5s -60%
mlk_sha3_512 2s 1s +100%
mlk_shake128_absorb_once 2s 2s +0%
mlk_shake128x4_squeezeblocks 2s 2s +0%
mlk_value_barrier_i32 2s 2s +0%
mlk_value_barrier_u32 2s 2s +0%
mlk_value_barrier_u8 2s 3s -33%
nttunpack_native_x86_64 2s 4s -50%
poly_compress_d11_native_x86_64 2s 3s -33%
poly_compress_d5_native_x86_64 2s 3s -33%
poly_decompress_d5_native_x86_64 2s 2s +0%
poly_getnoise_eta1122_4x_native 2s 3s -33%
poly_tomont_native_x86_64 2s 3s -33%
polyvec_basemul_acc_montgomery_cached_k2_native_aarch64 2s 2s +0%
polyvec_basemul_acc_montgomery_cached_k2_native_x86_64 2s 1s +100%
polyvec_basemul_acc_montgomery_cached_k4_native_x86_64 2s 2s +0%
rej_uniform_native_x86_64 2s 4s -50%
sys_check_capability 2s 1s +100%
keccak_f1600_x4_native_avx2 1s 2s -50%
keccakf1600_permute_native 1s 2s -50%
keccakf1600x4_xor_bytes_native 1s 3s -67%
mlk_ct_cmask_nonzero_u8 1s 2s -50%
mlk_ct_get_optblocker_u8 1s 3s -67%
mlk_gen_matrix_serial 1s 4s -75%
mlk_keccakf1600_extract_bytes 1s 3s -67%
mlk_keccakf1600x4_permute 1s 2s -50%
mlk_keccakf1600x4_xor_bytes 1s 2s -50%
mlk_matvec_mul 1s 4s -75%
mlk_montgomery_reduce 1s 2s -50%
mlk_poly_compress_d10_native 1s 2s -50%
mlk_poly_compress_d4 1s 2s -50%
mlk_poly_decompress_d5_c 1s 4s -75%
mlk_poly_decompress_dv 1s 2s -50%
mlk_poly_invntt_tomont 1s 2s -50%
mlk_poly_tobytes_native 1s 3s -67%
mlk_polyvec_decompress_du 1s 3s -67%
mlk_polyvec_tomont 1s 4s -75%
mlk_rej_uniform 1s 2s -50%
mlk_scalar_compress_d4 1s 3s -67%
mlk_scalar_decompress_d4 1s 1s +0%
mlk_scalar_signed_to_unsigned_q 1s 5s -80%
mlk_sha3_256 1s 2s -50%
poly_invntt_tomont_native 1s 1s +0%
poly_reduce_native_aarch64 1s 2s -50%
poly_tobytes_native_x86_64 1s 5s -80%
poly_tomont_native_aarch64 1s 1s +0%

@oqs-bot
Contributor

oqs-bot commented Mar 19, 2026

CBMC Results (ML-KEM-1024)

Full Results (191 proofs)
Proof Current Previous Change
**TOTAL** 1295s 1443s -10.3%
mlk_indcpa_enc 152s 164s -7%
mlk_rej_uniform_c 141s 174s -19%
mlk_indcpa_keypair_derand 132s 144s -8%
mlk_polyvec_basemul_acc_montgomery_cached_c 82s 105s -22%
mlk_ntt_layer 36s 41s -12%
polyvec_basemul_acc_montgomery_cached_native 34s 39s -13%
mlk_poly_rej_uniform 33s 35s -6%
mlk_keccak_squeezeblocks_x4 27s 30s -10%
poly_ntt_native 27s 35s -23%
mlk_polyvec_add 26s 29s -10%
mlk_poly_reduce_native 23s 24s -4%
keccakf1600x4_permute_native_x4 21s 18s +17%
mlk_fqmul 18s 19s -5%
mlk_poly_decompress_d5_native 16s 18s -11%
mlk_poly_decompress_d11_native 15s 18s -17%
mlk_polymat_permute_bitrev_to_custom 12s 12s +0%
mlk_poly_frombytes_native 11s 12s -8%
mlk_indcpa_dec 9s 13s -31%
mlk_poly_frommsg 9s 14s -36%
mlk_keccak_absorb_once_x4 7s 6s +17%
mlk_ntt_butterfly_block 7s 7s +0%
mlk_poly_ntt 7s 7s +0%
mlk_shake256x4 7s 3s +133%
poly_decompress_d5_native_x86_64 7s 6s +17%
poly_frombytes_native_x86_64 7s 9s -22%
kem_dec 6s 6s +0%
mlk_gen_matrix 6s 6s +0%
mlk_keccak_squeeze_once 6s 6s +0%
mlk_keccak_squeezeblocks 6s 7s -14%
mlk_poly_rej_uniform_x4 6s 7s -14%
mlk_polyvec_mulcache_compute 6s 3s +100%
mlk_polyvec_permute_bitrev_to_custom_native 6s 3s +100%
mlk_gen_matrix_serial 5s 6s -17%
mlk_invntt_layer 5s 6s -17%
mlk_poly_tobytes 5s 4s +25%
mlk_poly_tomsg 5s 4s +25%
mlk_polyvec_ntt 5s 3s +67%
mlk_rej_uniform 5s 2s +150%
poly_decompress_d11_native_x86_64 5s 6s -17%
polyvec_basemul_acc_montgomery_cached_k2_native_aarch64 5s 3s +67%
intt_native_x86_64 4s 3s +33%
keccakf1600_permute_native 4s 1s +300%
kem_enc_derand 4s 3s +33%
mlk_enc_getnoise_eta1_eta2 4s 4s +0%
mlk_keccak_absorb_once 4s 4s +0%
mlk_keccakf1600_permute_c 4s 3s +33%
mlk_keccakf1600x4_xor_bytes 4s 2s +100%
mlk_montgomery_reduce 4s 4s +0%
mlk_poly_compress_d11_c 4s 8s -50%
mlk_poly_compress_d5_native 4s 2s +100%
mlk_poly_decompress_d11_c 4s 2s +100%
mlk_poly_frombytes_c 4s 3s +33%
mlk_poly_getnoise_eta1_4x 4s 3s +33%
mlk_poly_getnoise_eta1_4x_native 4s 4s +0%
mlk_poly_mulcache_compute_c 4s 3s +33%
mlk_poly_reduce_c 4s 3s +33%
poly_tomont_native_x86_64 4s 1s +300%
polyvec_basemul_acc_montgomery_cached_k3_native_x86_64 4s 2s +100%
keccak_f1600_x1_native_aarch64 3s 2s +50%
keccakf1600x4_xor_bytes_native 3s 1s +200%
kem_check_sk 3s 4s -25%
kem_keypair 3s 5s -40%
mlk_barrett_reduce 3s 2s +50%
mlk_keccakf1600_extract_bytes 3s 2s +50%
mlk_keccakf1600_permute 3s 1s +200%
mlk_keccakf1600x4_permute 3s 6s -50%
mlk_poly_add 3s 2s +50%
mlk_poly_compress_d4_native 3s 2s +50%
mlk_poly_compress_d5 3s 2s +50%
mlk_poly_compress_dv 3s 3s +0%
mlk_poly_decompress_d4 3s 2s +50%
mlk_poly_decompress_d4_c 3s 2s +50%
mlk_poly_decompress_d5_c 3s 2s +50%
mlk_poly_decompress_du 3s 3s +0%
mlk_poly_frombytes 3s 2s +50%
mlk_poly_invntt_tomont 3s 2s +50%
mlk_poly_ntt_c 3s 3s +0%
mlk_poly_tobytes_native 3s 2s +50%
mlk_poly_tomont_native 3s 4s -25%
mlk_polyvec_basemul_acc_montgomery_cached 3s 4s -25%
mlk_polyvec_compress_du 3s 2s +50%
mlk_polyvec_decompress_du 3s 5s -40%
mlk_polyvec_tomont 3s 2s +50%
mlk_scalar_compress_d10 3s 3s +0%
mlk_scalar_decompress_d11 3s 4s -25%
mlk_scalar_decompress_d5 3s 3s +0%
mlk_sha3_512 3s 4s -25%
mlk_shake128x4_squeezeblocks 3s 2s +50%
mlk_shake256 3s 3s +0%
ntt_native_aarch64 3s 4s -25%
ntt_native_x86_64 3s 2s +50%
poly_decompress_d4_native_x86_64 3s 4s -25%
poly_getnoise_eta1122_4x_native 3s 2s +50%
poly_invntt_tomont_native 3s 5s -40%
poly_reduce_native_aarch64 3s 4s -25%
poly_tomont_native_aarch64 3s 5s -40%
polyvec_basemul_acc_montgomery_cached_k3_native_aarch64 3s 3s +0%
rej_uniform_native 3s 4s -25%
rej_uniform_native_aarch64 3s 3s +0%
rej_uniform_native_x86_64 3s 3s +0%
sys_check_capability 3s 2s +50%
intt_native_aarch64 2s 3s -33%
keccak_f1600_x4_native_aarch64_v84a 2s 2s +0%
keccak_f1600_x4_native_aarch64_v8a_v84a_scalar_hybrid 2s 2s +0%
keccakf1600x4_extract_bytes_native 2s 2s +0%
mlk_check_pct 2s 2s +0%
mlk_ct_cmask_neg_i16 2s 3s -33%
mlk_ct_cmask_nonzero_u16 2s 3s -33%
mlk_ct_cmov_zero 2s 1s +100%
mlk_ct_get_optblocker_i32 2s 1s +100%
mlk_ct_get_optblocker_u32 2s 2s +0%
mlk_ct_memcmp 2s 2s +0%
mlk_ct_sel_int16 2s 2s +0%
mlk_ct_sel_uint8 2s 2s +0%
mlk_keccakf1600_xor_bytes 2s 2s +0%
mlk_keccakf1600_xor_bytes (big endian) 2s 1s +100%
mlk_keccakf1600x4_extract_bytes 2s 3s -33%
mlk_keccakf1600x4_xor_bytes_c 2s 2s +0%
mlk_poly_compress_d10_c 2s 2s +0%
mlk_poly_compress_d10_native 2s 3s -33%
mlk_poly_compress_d11_native 2s 1s +100%
mlk_poly_compress_d4 2s 2s +0%
mlk_poly_compress_d5_c 2s 2s +0%
mlk_poly_compress_du 2s 3s -33%
mlk_poly_decompress_d10 2s 6s -67%
mlk_poly_decompress_d10_c 2s 2s +0%
mlk_poly_decompress_d10_native 2s 2s +0%
mlk_poly_decompress_d4_native 2s 2s +0%
mlk_poly_getnoise_eta2 2s 2s +0%
mlk_poly_invntt_tomont_c 2s 3s -33%
mlk_poly_mulcache_compute 2s 1s +100%
mlk_poly_mulcache_compute_native 2s 2s +0%
mlk_poly_reduce 2s 1s +100%
mlk_poly_tomont 2s 2s +0%
mlk_polyvec_frombytes 2s 2s +0%
mlk_polyvec_invntt_tomont 2s 1s +100%
mlk_polyvec_permute_bitrev_to_custom 2s 3s -33%
mlk_polyvec_tobytes 2s 3s -33%
mlk_scalar_compress_d11 2s 2s +0%
mlk_scalar_compress_d4 2s 2s +0%
mlk_scalar_compress_d5 2s 5s -60%
mlk_scalar_decompress_d4 2s 2s +0%
mlk_scalar_signed_to_unsigned_q 2s 2s +0%
mlk_sha3_256 2s 1s +100%
mlk_shake128_squeezeblocks 2s 1s +100%
mlk_shake128x4_absorb_once 2s 3s -33%
nttunpack_native_x86_64 2s 2s +0%
poly_compress_d10_native_x86_64 2s 1s +100%
poly_compress_d11_native_x86_64 2s 3s -33%
poly_compress_d5_native_x86_64 2s 3s -33%
poly_decompress_d10_native_x86_64 2s 2s +0%
poly_mulcache_compute_native_aarch64 2s 2s +0%
poly_reduce_native_x86_64 2s 4s -50%
poly_tobytes_native_aarch64 2s 1s +100%
poly_tobytes_native_x86_64 2s 4s -50%
polyvec_basemul_acc_montgomery_cached_k2_native_x86_64 2s 2s +0%
polyvec_basemul_acc_montgomery_cached_k4_native_x86_64 2s 3s -33%
keccak_f1600_x1_native_aarch64_v84a 1s 3s -67%
keccak_f1600_x4_native_aarch64_v8a_scalar_hybrid 1s 2s -50%
keccak_f1600_x4_native_avx2 1s 3s -67%
kem_check_pk 1s 3s -67%
kem_enc 1s 3s -67%
kem_keypair_derand 1s 3s -67%
mlk_ct_cmask_nonzero_u8 1s 2s -50%
mlk_ct_get_optblocker_u8 1s 3s -67%
mlk_keccakf1600_extract_bytes (big endian) 1s 1s +0%
mlk_keccakf1600x4_extract_bytes_c 1s 3s -67%
mlk_keypair_getnoise_eta1 1s 3s -67%
mlk_matvec_mul 1s 3s -67%
mlk_poly_cbd_eta1 1s 2s -50%
mlk_poly_cbd_eta2 1s 3s -67%
mlk_poly_compress_d10 1s 3s -67%
mlk_poly_compress_d11 1s 2s -50%
mlk_poly_compress_d4_c 1s 3s -67%
mlk_poly_decompress_d11 1s 2s -50%
mlk_poly_decompress_d5 1s 1s +0%
mlk_poly_decompress_dv 1s 3s -67%
mlk_poly_getnoise_eta1122_4x 1s 1s +0%
mlk_poly_sub 1s 2s -50%
mlk_poly_tobytes_c 1s 1s +0%
mlk_poly_tomont_c 1s 3s -67%
mlk_polyvec_reduce 1s 2s -50%
mlk_scalar_compress_d1 1s 2s -50%
mlk_scalar_decompress_d10 1s 2s -50%
mlk_shake128_absorb_once 1s 1s +0%
mlk_value_barrier_i32 1s 4s -75%
mlk_value_barrier_u32 1s 4s -75%
mlk_value_barrier_u8 1s 4s -75%
poly_compress_d4_native_x86_64 1s 1s +0%
poly_mulcache_compute_native_x86_64 1s 3s -67%
polyvec_basemul_acc_montgomery_cached_k4_native_aarch64 1s 1s +0%

@oqs-bot
Contributor

oqs-bot commented Mar 19, 2026

CBMC Results (ML-KEM-768)

Full Results (191 proofs)
Proof Current Previous Change
**TOTAL** 1337s 1246s +7.3%
mlk_indcpa_keypair_derand 199s 188s +6%
mlk_indcpa_enc 181s 154s +18%
mlk_rej_uniform_c 143s 134s +7%
mlk_polyvec_basemul_acc_montgomery_cached_c 45s 41s +10%
mlk_ntt_layer 38s 31s +23%
mlk_poly_rej_uniform 34s 31s +10%
mlk_polyvec_add 29s 29s +0%
poly_ntt_native 28s 26s +8%
mlk_keccak_squeezeblocks_x4 27s 28s -4%
mlk_poly_reduce_native 22s 22s +0%
keccakf1600x4_permute_native_x4 19s 18s +6%
polyvec_basemul_acc_montgomery_cached_native 17s 16s +6%
mlk_fqmul 16s 16s +0%
mlk_poly_decompress_d10_native 15s 16s -6%
mlk_poly_decompress_d4_native 15s 14s +7%
mlk_indcpa_dec 14s 14s +0%
mlk_poly_frombytes_native 10s 9s +11%
mlk_poly_frommsg 10s 9s +11%
mlk_ntt_butterfly_block 8s 9s -11%
mlk_poly_rej_uniform_x4 8s 7s +14%
mlk_invntt_layer 7s 7s +0%
mlk_keccak_squeezeblocks 7s 8s -12%
mlk_poly_ntt 7s 7s +0%
mlk_polymat_permute_bitrev_to_custom 7s 5s +40%
mlk_keccak_absorb_once 6s 5s +20%
mlk_keccak_absorb_once_x4 6s 7s -14%
poly_frombytes_native_x86_64 6s 3s +100%
kem_check_sk 5s 3s +67%
mlk_gen_matrix 5s 3s +67%
mlk_keccak_squeeze_once 5s 6s -17%
mlk_poly_compress_d5_c 5s 1s +400%
mlk_poly_frombytes 5s 2s +150%
mlk_poly_getnoise_eta1_4x 5s 4s +25%
mlk_poly_ntt_c 5s 3s +67%
poly_decompress_d10_native_x86_64 5s 5s +0%
rej_uniform_native_x86_64 5s 3s +67%
keccak_f1600_x4_native_aarch64_v8a_scalar_hybrid 4s 2s +100%
kem_check_pk 4s 3s +33%
kem_dec 4s 4s +0%
mlk_ct_cmov_zero 4s 1s +300%
mlk_keccakf1600_permute_c 4s 4s +0%
mlk_poly_invntt_tomont 4s 1s +300%
mlk_poly_invntt_tomont_c 4s 1s +300%
mlk_poly_mulcache_compute_native 4s 3s +33%
mlk_polyvec_permute_bitrev_to_custom 4s 2s +100%
mlk_polyvec_tobytes 4s 3s +33%
mlk_shake128x4_squeezeblocks 4s 1s +300%
mlk_shake256 4s 2s +100%
mlk_value_barrier_u8 4s 2s +100%
poly_decompress_d4_native_x86_64 4s 4s +0%
poly_tobytes_native_aarch64 4s 3s +33%
poly_tobytes_native_x86_64 4s 3s +33%
intt_native_aarch64 3s 2s +50%
intt_native_x86_64 3s 3s +0%
keccak_f1600_x1_native_aarch64 3s 1s +200%
keccak_f1600_x4_native_aarch64_v8a_v84a_scalar_hybrid 3s 1s +200%
kem_enc 3s 1s +200%
kem_enc_derand 3s 4s -25%
kem_keypair_derand 3s 2s +50%
mlk_gen_matrix_serial 3s 4s -25%
mlk_keccakf1600_extract_bytes (big endian) 3s 1s +200%
mlk_keccakf1600_permute 3s 2s +50%
mlk_keccakf1600_xor_bytes 3s 3s +0%
mlk_keccakf1600x4_extract_bytes_c 3s 2s +50%
mlk_poly_add 3s 3s +0%
mlk_poly_cbd_eta2 3s 3s +0%
mlk_poly_compress_d10 3s 3s +0%
mlk_poly_compress_d11_c 3s 3s +0%
mlk_poly_compress_d4 3s 3s +0%
mlk_poly_compress_d4_c 3s 3s +0%
mlk_poly_compress_d5_native 3s 2s +50%
mlk_poly_compress_du 3s 1s +200%
mlk_poly_decompress_d11_native 3s 4s -25%
mlk_poly_decompress_d4 3s 3s +0%
mlk_poly_getnoise_eta1122_4x 3s 3s +0%
mlk_poly_getnoise_eta1_4x_native 3s 2s +50%
mlk_poly_mulcache_compute_c 3s 5s -40%
mlk_poly_tobytes 3s 3s +0%
mlk_poly_tomont_c 3s 2s +50%
mlk_polyvec_compress_du 3s 1s +200%
mlk_polyvec_frombytes 3s 3s +0%
mlk_polyvec_permute_bitrev_to_custom_native 3s 2s +50%
mlk_polyvec_reduce 3s 2s +50%
mlk_scalar_compress_d4 3s 5s -40%
mlk_scalar_decompress_d11 3s 2s +50%
mlk_scalar_signed_to_unsigned_q 3s 2s +50%
mlk_sha3_256 3s 1s +200%
mlk_sha3_512 3s 2s +50%
mlk_shake128x4_absorb_once 3s 3s +0%
ntt_native_x86_64 3s 1s +200%
poly_compress_d11_native_x86_64 3s 2s +50%
poly_compress_d4_native_x86_64 3s 3s +0%
poly_compress_d5_native_x86_64 3s 2s +50%
poly_getnoise_eta1122_4x_native 3s 2s +50%
poly_invntt_tomont_native 3s 2s +50%
poly_tomont_native_x86_64 3s 2s +50%
polyvec_basemul_acc_montgomery_cached_k2_native_aarch64 3s 2s +50%
polyvec_basemul_acc_montgomery_cached_k3_native_aarch64 3s 3s +0%
polyvec_basemul_acc_montgomery_cached_k4_native_aarch64 3s 2s +50%
polyvec_basemul_acc_montgomery_cached_k4_native_x86_64 3s 2s +50%
rej_uniform_native 3s 2s +50%
sys_check_capability 3s 2s +50%
keccak_f1600_x1_native_aarch64_v84a 2s 2s +0%
keccak_f1600_x4_native_aarch64_v84a 2s 1s +100%
keccakf1600_permute_native 2s 2s +0%
keccakf1600x4_xor_bytes_native 2s 2s +0%
kem_keypair 2s 2s +0%
mlk_barrett_reduce 2s 4s -50%
mlk_ct_cmask_neg_i16 2s 2s +0%
mlk_ct_cmask_nonzero_u8 2s 2s +0%
mlk_ct_get_optblocker_u32 2s 2s +0%
mlk_ct_memcmp 2s 2s +0%
mlk_ct_sel_int16 2s 2s +0%
mlk_enc_getnoise_eta1_eta2 2s 3s -33%
mlk_keccakf1600_xor_bytes (big endian) 2s 5s -60%
mlk_keccakf1600x4_extract_bytes 2s 2s +0%
mlk_keccakf1600x4_permute 2s 1s +100%
mlk_keccakf1600x4_xor_bytes 2s 3s -33%
mlk_keccakf1600x4_xor_bytes_c 2s 1s +100%
mlk_keypair_getnoise_eta1 2s 2s +0%
mlk_matvec_mul 2s 2s +0%
mlk_montgomery_reduce 2s 3s -33%
mlk_poly_cbd_eta1 2s 2s +0%
mlk_poly_compress_d10_c 2s 2s +0%
mlk_poly_compress_d10_native 2s 1s +100%
mlk_poly_compress_d11 2s 3s -33%
mlk_poly_compress_d11_native 2s 2s +0%
mlk_poly_compress_d5 2s 3s -33%
mlk_poly_compress_dv 2s 5s -60%
mlk_poly_decompress_d10 2s 2s +0%
mlk_poly_decompress_d11 2s 2s +0%
mlk_poly_decompress_d11_c 2s 2s +0%
mlk_poly_decompress_d4_c 2s 2s +0%
mlk_poly_decompress_d5_c 2s 1s +100%
mlk_poly_decompress_d5_native 2s 1s +100%
mlk_poly_frombytes_c 2s 3s -33%
mlk_poly_mulcache_compute 2s 5s -60%
mlk_poly_reduce 2s 1s +100%
mlk_poly_reduce_c 2s 2s +0%
mlk_poly_tomont 2s 3s -33%
mlk_poly_tomont_native 2s 2s +0%
mlk_poly_tomsg 2s 5s -60%
mlk_polyvec_basemul_acc_montgomery_cached 2s 3s -33%
mlk_polyvec_decompress_du 2s 2s +0%
mlk_polyvec_mulcache_compute 2s 3s -33%
mlk_polyvec_tomont 2s 2s +0%
mlk_rej_uniform 2s 3s -33%
mlk_scalar_compress_d1 2s 3s -33%
mlk_scalar_compress_d10 2s 2s +0%
mlk_scalar_compress_d5 2s 4s -50%
mlk_scalar_decompress_d10 2s 4s -50%
mlk_scalar_decompress_d4 2s 1s +100%
mlk_scalar_decompress_d5 2s 2s +0%
mlk_shake128_absorb_once 2s 1s +100%
mlk_shake128_squeezeblocks 2s 3s -33%
mlk_shake256x4 2s 5s -60%
mlk_value_barrier_i32 2s 1s +100%
ntt_native_aarch64 2s 2s +0%
nttunpack_native_x86_64 2s 2s +0%
poly_compress_d10_native_x86_64 2s 1s +100%
poly_decompress_d11_native_x86_64 2s 2s +0%
poly_mulcache_compute_native_aarch64 2s 1s +100%
poly_mulcache_compute_native_x86_64 2s 2s +0%
poly_reduce_native_x86_64 2s 1s +100%
poly_tomont_native_aarch64 2s 3s -33%
polyvec_basemul_acc_montgomery_cached_k2_native_x86_64 2s 2s +0%
polyvec_basemul_acc_montgomery_cached_k3_native_x86_64 2s 1s +100%
keccak_f1600_x4_native_avx2 1s 2s -50%
keccakf1600x4_extract_bytes_native 1s 1s +0%
mlk_check_pct 1s 2s -50%
mlk_ct_cmask_nonzero_u16 1s 1s +0%
mlk_ct_get_optblocker_i32 1s 3s -67%
mlk_ct_get_optblocker_u8 1s 1s +0%
mlk_ct_sel_uint8 1s 3s -67%
mlk_keccakf1600_extract_bytes 1s 1s +0%
mlk_poly_compress_d4_native 1s 2s -50%
mlk_poly_decompress_d10_c 1s 4s -75%
mlk_poly_decompress_d5 1s 1s +0%
mlk_poly_decompress_du 1s 2s -50%
mlk_poly_decompress_dv 1s 3s -67%
mlk_poly_getnoise_eta2 1s 1s +0%
mlk_poly_sub 1s 3s -67%
mlk_poly_tobytes_c 1s 3s -67%
mlk_poly_tobytes_native 1s 1s +0%
mlk_polyvec_invntt_tomont 1s 3s -67%
mlk_polyvec_ntt 1s 3s -67%
mlk_scalar_compress_d11 1s 2s -50%
mlk_value_barrier_u32 1s 2s -50%
poly_decompress_d5_native_x86_64 1s 4s -75%
poly_reduce_native_aarch64 1s 3s -67%
rej_uniform_native_aarch64 1s 3s -67%

hanno-becker force-pushed the unit_valgrind branch 2 times, most recently from 79a002b to 72e929a March 20, 2026 05:42
hanno-becker force-pushed the unit_valgrind branch 3 times, most recently from 6ffbf41 to 15fb7d5 April 18, 2026 20:05
hanno-becker marked this pull request as ready for review April 19, 2026 04:42
hanno-becker requested a review from a team as a code owner April 19, 2026 04:42
hanno-becker force-pushed the unit_valgrind branch 4 times, most recently from 34c12f9 to 3401242 April 19, 2026 10:25
Replace aligned_alloc + MLK_ALIGN_UP with posix_memalign in
custom_heap_alloc_config.h. Unlike aligned_alloc, posix_memalign
does not require the size to be a multiple of the alignment,
removing the need for MLK_ALIGN_UP rounding. This ensures that
allocations are exact-sized, allowing memory-safety tests like
valgrind and ASan to detect overflows at precise buffer boundaries.

On Windows, where posix_memalign is not available, we use
_aligned_malloc instead. This, too, does not require the size
to be a multiple of the alignment.

Signed-off-by: Hanno Becker <beckphan@amazon.co.uk>
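
The allocation strategy described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the contents of custom_heap_alloc_config.h: the helper names `example_aligned_alloc` and `example_aligned_free` are hypothetical, and only `posix_memalign`, `_aligned_malloc`, `_aligned_free` and `free` are real library calls.

```c
/* Request the POSIX declarations (posix_memalign) before any system header. */
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#if defined(_WIN32)
#include <malloc.h>
#endif

/* Hypothetical helper: allocate exactly `size` bytes aligned to `align`.
 * `align` must be a power of two and a multiple of sizeof(void *).
 * `size` need not be a multiple of `align`, so no rounding up is done. */
static void *example_aligned_alloc(size_t align, size_t size)
{
  assert(align != 0 && (align & (align - 1)) == 0);
#if defined(_WIN32)
  /* Note the argument order: size first, then alignment. */
  return _aligned_malloc(size, align);
#else
  void *p = NULL;
  if (posix_memalign(&p, align, size) != 0)
  {
    return NULL;
  }
  return p;
#endif
}

/* Hypothetical helper: release memory from example_aligned_alloc. */
static void example_aligned_free(void *p)
{
#if defined(_WIN32)
  _aligned_free(p);
#else
  free(p);
#endif
}
```

Because the block is exactly `size` bytes long, a memory checker can flag an access at the first byte past the requested length rather than at the end of a rounded-up allocation.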
Replace all stack-allocated buffers in test_unit.c with heap
allocations via MLK_ALLOC/MLK_FREE, using the custom_heap_alloc_config.
This enables valgrind to detect buffer overflows in assembly backends,
which operate on these buffers.

Build the unit test objects with custom_heap_alloc_config.h by adding
the appropriate -DMLK_CONFIG_FILE, -std=c11, and -D_GNU_SOURCE flags
in components.mk.

Signed-off-by: Hanno Becker <beckphan@amazon.co.uk>
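
As a rough sketch of the stack-to-heap change described above: the macros `EXAMPLE_ALLOC` and `EXAMPLE_FREE` below are hypothetical stand-ins for MLK_ALLOC/MLK_FREE (whose real signatures are not shown in this conversation), and `example_double` is a toy routine standing in for a backend function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define EXAMPLE_N 256 /* assumed number of 16-bit coefficients per buffer */

/* Hypothetical stand-ins for the project's allocation macros. */
#define EXAMPLE_ALLOC(ptr, type, count) \
  ((ptr) = (type *)malloc(sizeof(type) * (count)))
#define EXAMPLE_FREE(ptr) \
  do                      \
  {                       \
    free(ptr);            \
    (ptr) = NULL;         \
  } while (0)

/* Toy routine under test: doubles every coefficient in place. */
static void example_double(int16_t *r, size_t n)
{
  size_t i;
  assert(r != NULL);
  for (i = 0; i < n; i++)
  {
    r[i] = (int16_t)(2 * r[i]);
  }
}

/* Returns 0 on success, 1 on failure. */
static int example_test_double(void)
{
  int16_t *buf = NULL;
  size_t i;
  int rc = 0;

  /* Previously this would have been `int16_t buf[EXAMPLE_N];` on the stack.
   * Memcheck does not bounds-check stack arrays, so an access just past the
   * end of such a buffer would go unnoticed. */
  EXAMPLE_ALLOC(buf, int16_t, EXAMPLE_N);
  if (buf == NULL)
  {
    return 1;
  }

  for (i = 0; i < EXAMPLE_N; i++)
  {
    buf[i] = (int16_t)i;
  }
  example_double(buf, EXAMPLE_N);
  for (i = 0; i < EXAMPLE_N; i++)
  {
    if (buf[i] != (int16_t)(2 * i))
    {
      rc = 1;
    }
  }

  EXAMPLE_FREE(buf);
  return rc;
}
```

The test logic is unchanged by the move to the heap. What changes is that the buffer becomes a distinct heap block whose bounds valgrind tracks.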
Add native-vs-C consistency tests for previously untested backends:
- mlk_rej_uniform_native: compare against mlk_rej_uniform_c
- mlk_poly_compress_d{4,5,10,11}_native: compare against C reference
- mlk_poly_decompress_d{4,5,10,11}_native: compare against C reference

These tests call the assembly backends directly with heap-allocated
buffers, enabling valgrind to detect buffer overflows. In particular,
the rej_uniform test would have caught the 4-byte overread in the
AVX2 rejection sampling fixed in commit f10b801.

Previous ad-hoc tests for detecting overflow in (de)compression
routines are now subsumed by the unit tests, and removed.

Signed-off-by: Hanno Becker <beckphan@amazon.co.uk>
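
The shape of such a native-vs-C consistency test might look like the sketch below. It does not use the real ML-KEM routines: `example_xor_ref` and `example_xor_opt` are hypothetical stand-ins for a C reference and an optimized backend, and the buffer length and trial count are arbitrary assumptions.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define EXAMPLE_LEN 504   /* assumed buffer length in bytes */
#define EXAMPLE_ITERS 100 /* assumed number of random trials */

/* Stand-in for the portable C reference: byte-wise XOR of a into r. */
static void example_xor_ref(uint8_t *r, const uint8_t *a, size_t n)
{
  size_t i;
  for (i = 0; i < n; i++)
  {
    r[i] ^= a[i];
  }
}

/* Stand-in for an optimized backend: 8 bytes at a time, then the tail. */
static void example_xor_opt(uint8_t *r, const uint8_t *a, size_t n)
{
  size_t i = 0;
  for (; i + 8 <= n; i += 8)
  {
    uint64_t x, y;
    memcpy(&x, r + i, sizeof(x));
    memcpy(&y, a + i, sizeof(y));
    x ^= y;
    memcpy(r + i, &x, sizeof(x));
  }
  for (; i < n; i++)
  {
    r[i] ^= a[i];
  }
}

/* Returns 0 if both implementations agree on every trial, 1 otherwise. */
static int example_consistency_test(void)
{
  /* Exact-size heap buffers, so a memory checker sees their true bounds. */
  uint8_t *in = malloc(EXAMPLE_LEN);
  uint8_t *out_ref = malloc(EXAMPLE_LEN);
  uint8_t *out_opt = malloc(EXAMPLE_LEN);
  int rc = 0;
  int iter;
  size_t i;

  assert(in != NULL && out_ref != NULL && out_opt != NULL);
  srand(1); /* fixed seed keeps failures reproducible */

  for (iter = 0; iter < EXAMPLE_ITERS && rc == 0; iter++)
  {
    for (i = 0; i < EXAMPLE_LEN; i++)
    {
      in[i] = (uint8_t)(rand() & 0xFF);
      out_ref[i] = (uint8_t)(rand() & 0xFF);
    }
    memcpy(out_opt, out_ref, EXAMPLE_LEN); /* identical starting state */

    example_xor_ref(out_ref, in, EXAMPLE_LEN);
    example_xor_opt(out_opt, in, EXAMPLE_LEN);

    if (memcmp(out_ref, out_opt, EXAMPLE_LEN) != 0)
    {
      rc = 1;
    }
  }

  free(in);
  free(out_ref);
  free(out_opt);
  return rc;
}
```

A test of this form checks functional agreement on its own. Run under valgrind, the same calls also expose out-of-bounds accesses by the optimized routine, even when they do not change the output.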
Add unit_valgrind job to ci.yml that runs the unit tests under
valgrind on x86_64 and aarch64 runners. This catches buffer
overflows in hand-written assembly that ASan cannot detect, since
ASan only instruments compiler-generated code.

Signed-off-by: Hanno Becker <beckphan@amazon.co.uk>
Contributor

mkannwischer left a comment

Thanks @hanno-becker!

For future reference, this is what it looks like when a memory overread is detected:

  INFO  > Unit Test          ML-KEM-512  (native opt):     EXEC_WRAPPER=valgrind --error-exitcode=1 make run_unit_512 -j4
  ERROR > Unit Test          ML-KEM-512  (native opt):     'EXEC_WRAPPER=valgrind --error-exitcode=1 make run_unit_512 -j4' failed with with 2
  ERROR > Unit Test          ML-KEM-512  (native opt):     ==25679== Memcheck, a memory error detector
  ==25679== Copyright (C) 2002-2024, and GNU GPL'd, by Julian Seward et al.
  ==25679== Using Valgrind-3.26.0 and LibVEX; rerun with -h for copyright info
  ==25679== Command: test/build/mlkem512/bin/test_unit512
  ==25679== 
  ==25679== Invalid read of size 16
  ==25679==    at 0x40088FB: ??? (in /home/runner/work/mlkem-native/mlkem-native/test/build/mlkem512/bin/test_unit512)
  ==25679==    by 0x4001649: main (in /home/runner/work/mlkem-native/mlkem-native/test/build/mlkem512/bin/test_unit512)
  ==25679==  Address 0x59000cc is 492 bytes inside a block of size 504 alloc'd
  ==25679==    at 0x486035B: posix_memalign (in /nix/store/9i082ffnccn11i398mfcpmgwg2hxzl1m-valgrind-3.26.0/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
  ==25679==    by 0x40021CB: mlk_posix_memalign.constprop.0 (in /home/runner/work/mlkem-native/mlkem-native/test/build/mlkem512/bin/test_unit512)
  ==25679==    by 0x4001601: main (in /home/runner/work/mlkem-native/mlkem-native/test/build/mlkem512/bin/test_unit512)
  ==25679== 
  ==25679== 
  ==25679== HEAP SUMMARY:
  ==25679==     in use at exit: 0 bytes in 0 blocks
  ==25679==   total heap usage: 307,223 allocs, 307,223 frees, 171,631,592 bytes allocated
  ==25679== 
  ==25679== All heap blocks were freed -- no leaks are possible
  ==25679== 
  ==25679== For lists of detected and suppressed errors, rerun with: -s
  ==25679== ERROR SUMMARY: 509 errors from 1 contexts (suppressed: 0 from 0)
  make: *** [Makefile:75: run_unit_512] Error 1
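
To read the report above: the 16-byte load starts 492 bytes into a 504-byte block, so it covers offsets 492 to 507 and its last 4 bytes lie past the end of the allocation. That would be consistent with the 4-byte overread mentioned in the commit message, although the log itself does not name the routine. The arithmetic can be written as a small helper; `example_overrun_bytes` is a hypothetical name used only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes accessed beyond the end of a heap block, given the three numbers in
 * a Memcheck-style report: an access of `access_size` bytes that starts
 * `offset` bytes inside a block of `block_size` bytes.
 * Returns 0 if the access stays within the block. */
static size_t example_overrun_bytes(size_t offset, size_t access_size,
                                    size_t block_size)
{
  size_t end;
  assert(offset <= block_size);
  end = offset + access_size; /* one past the last byte accessed */
  return end > block_size ? end - block_size : 0;
}
```

For the report above, `example_overrun_bytes(492, 16, 504)` evaluates to 4.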

mkannwischer merged commit 29c2685 into main Apr 19, 2026
391 checks passed
mkannwischer deleted the unit_valgrind branch April 19, 2026 12:10