Add HOL Light pointwise-acc multiplication proofs for AArch64 and x86_64 #1010

Open
jakemas wants to merge 1 commit into main from jakemas/hol-light-pointwise-acc

Conversation

@jakemas
Contributor

@jakemas jakemas commented Apr 1, 2026

Resolves:

Summary

Dependencies

@jakemas jakemas requested a review from a team as a code owner April 1, 2026 21:23
@jakemas jakemas marked this pull request as draft April 1, 2026 21:26
@jakemas jakemas force-pushed the jakemas/hol-light-pointwise-acc branch 5 times, most recently from c96a621 to b9b5908 Compare April 1, 2026 22:05
@oqs-bot
Contributor

oqs-bot commented Apr 1, 2026

CBMC Results (ML-DSA-44)

Full Results (186 proofs)
Proof Status Current Previous Change
**TOTAL** 2082s 2039s +2.1%
polyvecl_pointwise_acc_montgomery_c 346s 336s +3%
sign_verify_internal 208s 213s -2%
poly_pointwise_montgomery_c 146s 148s -1%
rej_uniform_native 145s 139s +4%
mld_attempt_signature_generation 108s 103s +5%
mld_invntt_layer 86s 85s +1%
mld_ct_memcmp 78s 72s +8%
mld_ntt_layer 55s 52s +6%
polymat_permute_bitrev_to_custom 29s 26s +12%
poly_chknorm_c 23s 20s +15%
polyvec_matrix_expand 21s 23s -9%
polyvec_matrix_expand_serial 21s 21s +0%
rej_uniform 21s 20s +5%
sign_signature_internal 21s 19s +11%
poly_uniform_eta_4x 20s 17s +18%
fqmul 18s 20s -10%
polyeta_unpack 17s 16s +6%
rej_uniform_c 16s 15s +7%
poly_uniform_4x 15s 16s -6%
keccakf1600x4_permute_native 14s 13s +8%
polyt0_unpack 14s 14s +0%
mld_ntt_butterfly_block 13s 11s +18%
mld_compute_t0_t1_tr_from_sk_components 12s 11s +9%
polyz_unpack_c 12s 12s +0%
keccak_absorb_once_x4 10s 9s +11%
polyvec_matrix_pointwise_montgomery 10s 6s +67%
unpack_sk 10s 7s +43%
poly_add 9s 10s -10%
keccak_squeezeblocks_x4 8s 6s +33%
keccakf1600_permute 8s 8s +0%
mld_check_pct 8s 12s -33%
mld_polyvecl_permute_bitrev_to_custom_native 8s 8s +0%
sign 8s 8s +0%
keccakf1600_permute_native 7s 9s -22%
pointwise_acc_native_aarch64 7s - new
poly_caddq_c 7s 5s +40%
poly_chknorm_native_aarch64 7s 4s +75%
polyveck_decompose 7s 6s +17%
polyveck_use_hint 7s 7s +0%
shake256x4_absorb_once 7s 4s +75%
sign_open 7s 3s +133%
sign_pk_from_sk 7s 8s -12%
keccak_absorb 6s 5s +20%
mld_h 6s 4s +50%
poly_decompose 6s 6s +0%
poly_use_hint_c 6s 4s +50%
polyveck_invntt_tomont 6s 3s +100%
polyveck_reduce 6s 4s +50%
polyveck_sub 6s 4s +50%
polyvecl_chknorm 6s 5s +20%
polyvecl_ntt 6s 6s +0%
sign_keypair 6s 6s +0%
sign_keypair_internal 6s 4s +50%
intt_native_x86_64 5s 3s +67%
keccak_f1600_x4_native_aarch64_v8a_scalar_hybrid 5s 1s +400%
mld_ct_cmask_nonzero_u8 5s 3s +67%
mld_sample_s1_s2_serial 5s 4s +25%
pointwise_acc_native_x86_64 5s - new
pointwise_native_x86_64 5s 2s +150%
poly_chknorm 5s 2s +150%
poly_invntt_tomont_c 5s 6s -17%
poly_uniform 5s 4s +25%
polyveck_add 5s 8s -38%
polyveck_caddq 5s 3s +67%
polyveck_power2round 5s 6s -17%
polyveck_shiftl 5s 4s +25%
polyveck_unpack_t0 5s 3s +67%
polyvecl_permute_bitrev_to_custom 5s 5s +0%
rej_eta_native 5s 6s -17%
sign_signature_pre_hash_shake256 5s 4s +25%
unpack_hints 5s 6s -17%
caddq 4s 2s +100%
decompose 4s 2s +100%
keccak_f1600_x1_native_aarch64_v84a 4s 3s +33%
keccakf1600_extract_bytes (big endian) 4s 5s -20%
keccakf1600x4_xor_bytes 4s 3s +33%
mld_compute_pack_z 4s 7s -43%
mld_ct_cmask_nonzero_u32 4s 2s +100%
mld_ct_get_optblocker_u8 4s 1s +300%
montgomery_reduce 4s 2s +100%
pack_sig_h_poly 4s 5s -20%
pointwise_native_aarch64 4s 2s +100%
poly_caddq 4s 3s +33%
poly_challenge 4s 7s -43%
poly_chknorm_native 4s 2s +100%
poly_invntt_tomont_native 4s 2s +100%
poly_pointwise_montgomery 4s 3s +33%
poly_shiftl 4s 3s +33%
poly_uniform_eta 4s 5s -20%
poly_uniform_gamma1 4s 3s +33%
poly_use_hint_native 4s 6s -33%
polyveck_chknorm 4s 3s +33%
polyveck_pack_eta 4s 4s +0%
polyveck_unpack_eta 4s 4s +0%
polyvecl_pack_eta 4s 4s +0%
polyvecl_unpack_z 4s 3s +33%
polyz_unpack_native 4s 3s +33%
rej_eta_c 4s 5s -20%
sign_signature 4s 5s -20%
sign_verify_pre_hash_internal 4s 6s -33%
sys_check_capability 4s 4s +0%
keccak_finalize 3s 4s -25%
keccakf1600_xor_bytes 3s 2s +50%
keccakf1600_xor_bytes (big endian) 3s 2s +50%
make_hint 3s 1s +200%
mld_prepare_domain_separation_prefix 3s 6s -50%
mld_sample_s1_s2 3s 6s -50%
mld_value_barrier_u8 3s 2s +50%
ntt_native_aarch64 3s 5s -40%
ntt_native_x86_64 3s 4s -25%
pack_pk 3s 4s -25%
pack_sig_z 3s 5s -40%
poly_caddq_native_aarch64 3s 4s -25%
poly_decompose_native 3s 3s +0%
poly_invntt_tomont 3s 4s -25%
poly_make_hint 3s 5s -40%
poly_pointwise_montgomery_native 3s 3s +0%
poly_power2round 3s 2s +50%
poly_sub 3s 3s +0%
poly_uniform_gamma1_4x 3s 4s -25%
polyt0_pack 3s 7s -57%
polyt1_pack 3s 4s -25%
polyt1_unpack 3s 5s -40%
polyveck_ntt 3s 6s -50%
polyveck_pack_t0 3s 3s +0%
polyvecl_pointwise_acc_montgomery_native 3s 4s -25%
polyvecl_uniform_gamma1 3s 3s +0%
polyvecl_uniform_gamma1_serial 3s 3s +0%
polyw1_pack 3s 2s +50%
polyz_pack 3s 2s +50%
polyz_unpack 3s 2s +50%
reduce32 3s 2s +50%
rej_eta 3s 4s -25%
shake128_absorb 3s 3s +0%
shake128_finalize 3s 3s +0%
shake128x4_absorb_once 3s 4s -25%
shake256_finalize 3s 3s +0%
shake256_release 3s 3s +0%
shake256_squeeze 3s 3s +0%
shake256x4_squeezeblocks 3s 1s +200%
sign_signature_pre_hash_internal 3s 2s +50%
sign_verify 3s 6s -50%
sign_verify_extmu 3s 4s -25%
unpack_sig 3s 3s +0%
fqscale 2s 3s -33%
keccak_f1600_x4_native_aarch64_v84a 2s 2s +0%
keccak_f1600_x4_native_aarch64_v8a_v84a_scalar_hybrid 2s 4s -50%
keccak_init 2s 2s +0%
keccak_squeeze 2s 6s -67%
keccakf1600x4_extract_bytes 2s 3s -33%
mld_ct_abs_i32 2s 3s -33%
mld_keccakf1600_extract_bytes 2s 3s -33%
mld_value_barrier_i64 2s 2s +0%
mld_value_barrier_u32 2s 1s +100%
pack_sig_c 2s 3s -33%
pack_sk 2s 3s -33%
poly_caddq_native 2s 3s -33%
poly_decompose_c 2s 3s -33%
poly_ntt 2s 2s +0%
poly_ntt_native 2s 1s +100%
poly_reduce 2s 3s -33%
poly_use_hint 2s 3s -33%
polyeta_pack 2s 2s +0%
polyveck_pack_w1 2s 3s -33%
polyveck_pointwise_poly_montgomery 2s 3s -33%
polyvecl_pointwise_acc_montgomery 2s 3s -33%
polyvecl_unpack_eta 2s 3s -33%
power2round 2s 5s -60%
shake128_init 2s 5s -60%
shake128_release 2s 5s -60%
shake128x4_squeezeblocks 2s 2s +0%
shake256 2s 2s +0%
shake256_absorb 2s 2s +0%
shake256_init 2s 2s +0%
sign_signature_extmu 2s 6s -67%
sign_verify_pre_hash_shake256 2s 3s -33%
unpack_pk 2s 2s +0%
use_hint 2s 2s +0%
keccak_f1600_x1_native_aarch64 1s 2s -50%
keccakf1600x4_permute 1s 2s -50%
mld_ct_cmask_neg_i32 1s 1s +0%
mld_ct_get_optblocker_i64 1s 2s -50%
mld_ct_get_optblocker_u32 1s 2s -50%
mld_ct_sel_int32 1s 2s -50%
poly_ntt_c 1s 2s -50%
shake128_squeeze 1s 3s -67%

@oqs-bot
Contributor

oqs-bot commented Apr 1, 2026

CBMC Results (ML-DSA-65)

Full Results (186 proofs)
Proof Status Current Previous Change
**TOTAL** 2180s 2329s -6.4%
sign_verify_internal 266s 286s -7%
polyvecl_pointwise_acc_montgomery_c 243s 280s -13%
rej_uniform_native 140s 148s -5%
poly_pointwise_montgomery_c 133s 156s -15%
polyvec_matrix_expand 104s 107s -3%
mld_attempt_signature_generation 98s 107s -8%
mld_invntt_layer 86s 94s -9%
polyvec_matrix_expand_serial 81s 81s +0%
mld_ct_memcmp 68s 79s -14%
mld_ntt_layer 52s 55s -5%
polymat_permute_bitrev_to_custom 33s 37s -11%
mld_compute_t0_t1_tr_from_sk_components 27s 28s -4%
sign_signature_internal 27s 25s +8%
polyveck_use_hint 23s 22s +5%
poly_chknorm_c 22s 19s +16%
fqmul 20s 18s +11%
rej_uniform 20s 22s -9%
poly_uniform_4x 15s 15s +0%
poly_uniform_eta_4x 14s 16s -12%
polyveck_decompose 14s 16s -12%
polyt0_unpack 13s 15s -13%
polyveck_power2round 13s 12s +8%
rej_uniform_c 13s 15s -13%
keccakf1600x4_permute_native 12s 13s -8%
mld_check_pct 12s 13s -8%
mld_ntt_butterfly_block 12s 13s -8%
polyvec_matrix_pointwise_montgomery 12s 15s -20%
poly_add 11s 10s +10%
keccak_absorb_once_x4 9s 10s -10%
keccakf1600_permute_native 9s 8s +12%
mld_polyvecl_permute_bitrev_to_custom_native 9s 8s +12%
poly_decompose_c 9s 7s +29%
poly_invntt_tomont_c 9s 7s +29%
polyveck_ntt 9s 7s +29%
unpack_sk 9s 9s +0%
keccak_absorb 8s 5s +60%
keccakf1600_permute 8s 10s -20%
polyveck_add 8s 10s -20%
polyveck_chknorm 8s 8s +0%
polyveck_sub 8s 8s +0%
mld_sample_s1_s2_serial 7s 3s +133%
polyveck_caddq 7s 11s -36%
sign 7s 7s +0%
pointwise_acc_native_x86_64 6s - new
pointwise_native_x86_64 6s 1s +500%
poly_uniform_eta 6s 3s +100%
polyveck_invntt_tomont 6s 9s -33%
polyveck_pack_eta 6s 3s +100%
polyveck_pointwise_poly_montgomery 6s 6s +0%
polyveck_reduce 6s 5s +20%
polyvecl_ntt 6s 5s +20%
sign_verify_pre_hash_internal 6s 6s +0%
keccak_squeezeblocks_x4 5s 6s -17%
mld_h 5s 5s +0%
mld_prepare_domain_separation_prefix 5s 6s -17%
pack_pk 5s 3s +67%
poly_caddq_c 5s 5s +0%
poly_chknorm_native_aarch64 5s 4s +25%
poly_power2round 5s 7s -29%
poly_uniform 5s 5s +0%
poly_use_hint_c 5s 4s +25%
polyeta_unpack 5s 6s -17%
polyt1_unpack 5s 5s +0%
polyveck_pack_t0 5s 4s +25%
polyveck_shiftl 5s 7s -29%
polyvecl_permute_bitrev_to_custom 5s 4s +25%
polyvecl_unpack_z 5s 2s +150%
shake128_squeeze 5s 2s +150%
sign_open 5s 4s +25%
sign_pk_from_sk 5s 9s -44%
sign_verify 5s 2s +150%
sign_verify_pre_hash_shake256 5s 5s +0%
decompose 4s 4s +0%
keccakf1600_extract_bytes (big endian) 4s 4s +0%
mld_compute_pack_z 4s 6s -33%
mld_ct_get_optblocker_u32 4s 2s +100%
mld_sample_s1_s2 4s 7s -43%
montgomery_reduce 4s 5s -20%
ntt_native_x86_64 4s 5s -20%
pack_sig_h_poly 4s 4s +0%
pack_sig_z 4s 2s +100%
pointwise_acc_native_aarch64 4s - new
poly_chknorm 4s 2s +100%
poly_invntt_tomont_native 4s 3s +33%
poly_ntt 4s 4s +0%
poly_ntt_native 4s 3s +33%
poly_use_hint_native 4s 4s +0%
polyeta_pack 4s 4s +0%
polyt0_pack 4s 3s +33%
polyveck_unpack_t0 4s 2s +100%
polyvecl_pointwise_acc_montgomery_native 4s 3s +33%
polyvecl_uniform_gamma1 4s 3s +33%
shake128x4_absorb_once 4s 3s +33%
shake256_release 4s 3s +33%
sign_keypair_internal 4s 6s -33%
sign_signature_pre_hash_shake256 4s 5s -20%
fqscale 3s 4s -25%
intt_native_x86_64 3s 2s +50%
keccak_f1600_x1_native_aarch64_v84a 3s 3s +0%
keccak_f1600_x4_native_aarch64_v84a 3s 3s +0%
keccak_finalize 3s 1s +200%
keccakf1600_xor_bytes 3s 2s +50%
keccakf1600_xor_bytes (big endian) 3s 3s +0%
keccakf1600x4_extract_bytes 3s 1s +200%
mld_ct_abs_i32 3s 2s +50%
mld_ct_cmask_nonzero_u32 3s 3s +0%
pack_sig_c 3s 2s +50%
pack_sk 3s 4s -25%
pointwise_native_aarch64 3s 3s +0%
poly_caddq 3s 2s +50%
poly_caddq_native 3s 2s +50%
poly_caddq_native_aarch64 3s 5s -40%
poly_challenge 3s 6s -50%
poly_chknorm_native 3s 3s +0%
poly_invntt_tomont 3s 6s -50%
poly_make_hint 3s 3s +0%
poly_ntt_c 3s 2s +50%
poly_pointwise_montgomery 3s 6s -50%
poly_reduce 3s 2s +50%
poly_sub 3s 3s +0%
poly_uniform_gamma1 3s 5s -40%
poly_uniform_gamma1_4x 3s 5s -40%
poly_use_hint 3s 1s +200%
polyveck_pack_w1 3s 3s +0%
polyveck_unpack_eta 3s 2s +50%
polyvecl_pointwise_acc_montgomery 3s 4s -25%
polyvecl_uniform_gamma1_serial 3s 2s +50%
polyvecl_unpack_eta 3s 3s +0%
polyw1_pack 3s 3s +0%
polyz_unpack_native 3s 5s -40%
power2round 3s 2s +50%
reduce32 3s 3s +0%
rej_eta_c 3s 3s +0%
rej_eta_native 3s 5s -40%
shake128x4_squeezeblocks 3s 3s +0%
shake256 3s 2s +50%
shake256x4_absorb_once 3s 1s +200%
shake256x4_squeezeblocks 3s 3s +0%
sign_keypair 3s 2s +50%
sign_signature_extmu 3s 4s -25%
sign_signature_pre_hash_internal 3s 7s -57%
sign_verify_extmu 3s 2s +50%
unpack_hints 3s 5s -40%
unpack_sig 3s 2s +50%
use_hint 3s 3s +0%
keccak_f1600_x4_native_aarch64_v8a_scalar_hybrid 2s 1s +100%
keccak_f1600_x4_native_aarch64_v8a_v84a_scalar_hybrid 2s 4s -50%
keccak_squeeze 2s 6s -67%
keccakf1600x4_permute 2s 5s -60%
keccakf1600x4_xor_bytes 2s 3s -33%
make_hint 2s 2s +0%
mld_ct_cmask_neg_i32 2s 4s -50%
mld_ct_cmask_nonzero_u8 2s 2s +0%
mld_ct_get_optblocker_i64 2s 2s +0%
mld_ct_get_optblocker_u8 2s 2s +0%
mld_ct_sel_int32 2s 4s -50%
mld_keccakf1600_extract_bytes 2s 1s +100%
mld_value_barrier_u32 2s 1s +100%
mld_value_barrier_u8 2s 4s -50%
ntt_native_aarch64 2s 3s -33%
poly_decompose 2s 3s -33%
poly_decompose_native 2s 4s -50%
poly_pointwise_montgomery_native 2s 2s +0%
poly_shiftl 2s 2s +0%
polyt1_pack 2s 3s -33%
polyvecl_chknorm 2s 5s -60%
polyvecl_pack_eta 2s 3s -33%
polyz_pack 2s 2s +0%
polyz_unpack 2s 4s -50%
polyz_unpack_c 2s 2s +0%
rej_eta 2s 4s -50%
shake128_finalize 2s 1s +100%
shake128_release 2s 3s -33%
shake256_absorb 2s 3s -33%
shake256_finalize 2s 2s +0%
shake256_squeeze 2s 5s -60%
sign_signature 2s 4s -50%
sys_check_capability 2s 3s -33%
unpack_pk 2s 4s -50%
caddq 1s 2s -50%
keccak_f1600_x1_native_aarch64 1s 2s -50%
keccak_init 1s 2s -50%
mld_value_barrier_i64 1s 4s -75%
shake128_absorb 1s 4s -75%
shake128_init 1s 2s -50%
shake256_init 1s 2s -50%

@oqs-bot
Contributor

oqs-bot commented Apr 1, 2026

CBMC Results (ML-DSA-87)

Full Results (186 proofs)
Proof Status Current Previous Change
**TOTAL** 2264s 2411s -6.1%
polyvec_matrix_expand 237s 255s -7%
polyvecl_pointwise_acc_montgomery_c 191s 218s -12%
poly_pointwise_montgomery_c 166s 186s -11%
sign_verify_internal 145s 147s -1%
rej_uniform_native 141s 155s -9%
mld_invntt_layer 101s 104s -3%
mld_attempt_signature_generation 98s 96s +2%
polyvec_matrix_expand_serial 85s 88s -3%
mld_ct_memcmp 79s 89s -11%
mld_ntt_layer 56s 59s -5%
sign_signature_internal 40s 41s -2%
polymat_permute_bitrev_to_custom 29s 27s +7%
mld_compute_t0_t1_tr_from_sk_components 25s 25s +0%
rej_uniform 22s 24s -8%
fqmul 20s 23s -13%
poly_chknorm_c 19s 23s -17%
polyeta_unpack 18s 19s -5%
poly_uniform_4x 17s 17s +0%
keccakf1600x4_permute_native 15s 14s +7%
poly_uniform_eta_4x 15s 16s -6%
polyt0_unpack 15s 20s -25%
polyveck_power2round 15s 20s -25%
rej_uniform_c 15s 19s -21%
mld_polyvecl_permute_bitrev_to_custom_native 14s 14s +0%
mld_ntt_butterfly_block 13s 16s -19%
polyveck_use_hint 13s 12s +8%
poly_add 12s 11s +9%
sign_pk_from_sk 12s 7s +71%
mld_check_pct 11s 17s -35%
polyveck_add 11s 11s +0%
polyveck_decompose 11s 11s +0%
keccak_absorb_once_x4 10s 12s -17%
pointwise_acc_native_x86_64 10s - new
polyvec_matrix_pointwise_montgomery 10s 10s +0%
polyveck_invntt_tomont 10s 10s +0%
unpack_sk 10s 8s +25%
keccakf1600_permute_native 9s 9s +0%
mld_compute_pack_z 9s 10s -10%
polyveck_ntt 9s 8s +12%
polyz_unpack_c 9s 12s -25%
keccak_squeezeblocks_x4 8s 9s -11%
keccakf1600_permute 8s 6s +33%
pointwise_acc_native_aarch64 8s - new
poly_invntt_tomont_c 8s 7s +14%
polyveck_reduce 8s 8s +0%
polyveck_shiftl 8s 8s +0%
polyvecl_ntt 8s 9s -11%
keccak_absorb 7s 7s +0%
mld_sample_s1_s2 7s 5s +40%
poly_decompose_c 7s 7s +0%
polyveck_chknorm 7s 7s +0%
mld_sample_s1_s2_serial 6s 7s -14%
poly_uniform_eta 6s 7s -14%
polyeta_pack 6s 5s +20%
polyveck_caddq 6s 8s -25%
polyveck_pointwise_poly_montgomery 6s 6s +0%
polyvecl_uniform_gamma1_serial 6s 2s +200%
sign 6s 7s -14%
sign_keypair 6s 3s +100%
sign_open 6s 5s +20%
sign_signature 6s 5s +20%
sign_verify_extmu 6s 4s +50%
unpack_hints 6s 6s +0%
mld_h 5s 3s +67%
mld_prepare_domain_separation_prefix 5s 4s +25%
ntt_native_aarch64 5s 4s +25%
poly_caddq_c 5s 7s -29%
poly_caddq_native 5s 2s +150%
poly_power2round 5s 6s -17%
poly_uniform 5s 5s +0%
polyveck_sub 5s 7s -29%
polyvecl_chknorm 5s 8s -38%
sign_keypair_internal 5s 6s -17%
sign_signature_pre_hash_internal 5s 4s +25%
sign_verify_pre_hash_internal 5s 2s +150%
unpack_pk 5s 5s +0%
mld_ct_abs_i32 4s 3s +33%
mld_ct_cmask_nonzero_u32 4s 3s +33%
mld_ct_get_optblocker_i64 4s 4s +0%
pointwise_native_aarch64 4s 4s +0%
poly_caddq_native_aarch64 4s 5s -20%
poly_chknorm_native_aarch64 4s 2s +100%
poly_ntt 4s 4s +0%
poly_sub 4s 3s +33%
polyt0_pack 4s 5s -20%
polyveck_pack_w1 4s 4s +0%
polyveck_unpack_t0 4s 5s -20%
polyvecl_pointwise_acc_montgomery 4s 4s +0%
polyvecl_pointwise_acc_montgomery_native 4s 3s +33%
polyvecl_unpack_eta 4s 5s -20%
polyw1_pack 4s 3s +33%
polyz_unpack_native 4s 2s +100%
shake128_release 4s 3s +33%
shake128_squeeze 4s 2s +100%
shake256_finalize 4s 6s -33%
shake256_release 4s 3s +33%
sign_signature_extmu 4s 6s -33%
sign_signature_pre_hash_shake256 4s 4s +0%
sign_verify_pre_hash_shake256 4s 7s -43%
unpack_sig 4s 3s +33%
caddq 3s 2s +50%
intt_native_x86_64 3s 1s +200%
keccak_f1600_x1_native_aarch64 3s 1s +200%
keccak_f1600_x1_native_aarch64_v84a 3s 2s +50%
keccak_f1600_x4_native_aarch64_v84a 3s 1s +200%
keccak_f1600_x4_native_aarch64_v8a_v84a_scalar_hybrid 3s 1s +200%
keccak_finalize 3s 2s +50%
keccak_squeeze 3s 3s +0%
keccakf1600_xor_bytes 3s 2s +50%
keccakf1600_xor_bytes (big endian) 3s 3s +0%
mld_ct_get_optblocker_u32 3s 3s +0%
mld_ct_get_optblocker_u8 3s 2s +50%
montgomery_reduce 3s 3s +0%
pack_sig_c 3s 2s +50%
pack_sig_h_poly 3s 4s -25%
pack_sk 3s 6s -50%
pointwise_native_x86_64 3s 2s +50%
poly_challenge 3s 5s -40%
poly_chknorm_native 3s 4s -25%
poly_decompose_native 3s 7s -57%
poly_invntt_tomont 3s 4s -25%
poly_invntt_tomont_native 3s 3s +0%
poly_ntt_c 3s 6s -50%
poly_pointwise_montgomery 3s 1s +200%
poly_reduce 3s 4s -25%
poly_use_hint 3s 3s +0%
polyvecl_permute_bitrev_to_custom 3s 4s -25%
polyvecl_uniform_gamma1 3s 2s +50%
polyz_pack 3s 2s +50%
power2round 3s 2s +50%
reduce32 3s 2s +50%
rej_eta_c 3s 3s +0%
rej_eta_native 3s 4s -25%
shake128_absorb 3s 3s +0%
shake128_finalize 3s 2s +50%
shake128x4_squeezeblocks 3s 2s +50%
shake256_absorb 3s 4s -25%
shake256_init 3s 1s +200%
shake256x4_absorb_once 3s 2s +50%
shake256x4_squeezeblocks 3s 2s +50%
sign_verify 3s 6s -50%
sys_check_capability 3s 3s +0%
decompose 2s 2s +0%
fqscale 2s 3s -33%
keccak_f1600_x4_native_aarch64_v8a_scalar_hybrid 2s 1s +100%
keccak_init 2s 3s -33%
keccakf1600x4_extract_bytes 2s 2s +0%
keccakf1600x4_permute 2s 2s +0%
keccakf1600x4_xor_bytes 2s 3s -33%
make_hint 2s 4s -50%
mld_ct_cmask_neg_i32 2s 2s +0%
mld_ct_sel_int32 2s 2s +0%
mld_value_barrier_i64 2s 2s +0%
mld_value_barrier_u32 2s 1s +100%
mld_value_barrier_u8 2s 3s -33%
ntt_native_x86_64 2s 3s -33%
pack_sig_z 2s 4s -50%
poly_caddq 2s 4s -50%
poly_chknorm 2s 1s +100%
poly_make_hint 2s 6s -67%
poly_ntt_native 2s 3s -33%
poly_pointwise_montgomery_native 2s 4s -50%
poly_shiftl 2s 5s -60%
poly_uniform_gamma1_4x 2s 2s +0%
polyveck_pack_eta 2s 4s -50%
polyveck_pack_t0 2s 4s -50%
polyveck_unpack_eta 2s 6s -67%
polyvecl_pack_eta 2s 3s -33%
polyvecl_unpack_z 2s 5s -60%
polyz_unpack 2s 4s -50%
shake256 2s 3s -33%
use_hint 2s 2s +0%
keccakf1600_extract_bytes (big endian) 1s 2s -50%
mld_ct_cmask_nonzero_u8 1s 2s -50%
mld_keccakf1600_extract_bytes 1s 2s -50%
pack_pk 1s 4s -75%
poly_decompose 1s 4s -75%
poly_uniform_gamma1 1s 2s -50%
poly_use_hint_c 1s 2s -50%
poly_use_hint_native 1s 3s -67%
polyt1_pack 1s 3s -67%
polyt1_unpack 1s 3s -67%
rej_eta 1s 3s -67%
shake128_init 1s 4s -75%
shake128x4_absorb_once 1s 2s -50%
shake256_squeeze 1s 4s -75%

@jakemas jakemas marked this pull request as ready for review April 2, 2026 01:08
@mkannwischer mkannwischer self-assigned this Apr 2, 2026
Contributor

@mkannwischer mkannwischer left a comment

Thanks @jakemas - the HOL-Light proofs look good to me. I checked the spec and everything makes sense to me.
The CBMC contracts, however, do not match that - please update.

Do you have an intuition why the proof for 44 is much slower than the one for 65? That does not seem to make sense to me. Would be good to improve this, but that could also be done in a follow-up PR.

Comment on lines +161 to +162
requires(array_abs_bound(a, 0, 4 * MLDSA_N, 75423753))
requires(array_abs_bound(b, 0, 4 * MLDSA_N, 75423753))
Contributor

This does not match the HOL-Light pre-conditions:

               (!i. i < 1024 ==> abs(ival(x i)) <= &8380416) /\
               (!i. i < 1024 ==> abs(ival(y i)) <= &75423752) /\
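
The mismatch can be sketched in plain C. This is only an illustration: `abs_bound_exclusive` is a hypothetical stand-in for `array_abs_bound`, and it assumes the contract bound is exclusive (`|a[i]| < bound`), which is how the inclusive HOL Light limits `8380416` and `75423752` would line up with contract bounds of `8380417` and `75423753`. `MLDSA_Q` is the ML-DSA modulus, and `75423752` equals `9 * MLDSA_Q - 1`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MLDSA_Q 8380417 /* ML-DSA modulus q */

/* Hypothetical stand-in for array_abs_bound(a, lo, hi, bound).
 * Assumption: the bound is exclusive, i.e. |a[i]| < bound for lo <= i < hi. */
static int abs_bound_exclusive(const int32_t *a, size_t lo, size_t hi,
                               int32_t bound)
{
  size_t i;
  assert(bound > 0);
  for (i = lo; i < hi; i++)
  {
    if (a[i] <= -bound || a[i] >= bound)
    {
      return 0;
    }
  }
  return 1;
}

/* HOL Light style precondition: abs(x[i]) <= limit (inclusive).
 * Expressed through the exclusive helper by adding one to the limit. */
static int abs_le_inclusive(const int32_t *a, size_t n, int32_t limit)
{
  return abs_bound_exclusive(a, 0, n, limit + 1);
}
```

Under these assumptions, an input coefficient equal to `8380417` passes a contract bound of `75423753` but violates the HOL Light precondition on `x`, so the contract on the first operand would be weaker than what the proof assumes.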

Contributor Author

Thank you, fixed!

Comment on lines +136 to +137
requires(array_abs_bound(a, 0, 4 * MLDSA_N, 75423753))
requires(array_abs_bound(b, 0, 4 * MLDSA_N, 75423753))
Contributor

same in the x86 backend

Contributor Author

Thank you, fixed!

@jakemas jakemas force-pushed the jakemas/hol-light-pointwise-acc branch 2 times, most recently from ba6edd2 to 87d0267 Compare April 2, 2026 19:56
@jakemas
Contributor Author

jakemas commented Apr 2, 2026

> Do you have an intuition why the proof for 44 is much slower than the one for 65? That does not seem to make sense to me. Would be good to improve this, but that could also be done in a follow-up PR.

Good catch. The bounds phase (proving congruence + boundedness for each of the 256 output coefficients) was using a slow approach for l4 that repeatedly calls find_term to search the goal term for ival(word_mul ...) subterms and then proves each rewrite via ARITH_TAC. This is quadratic in the goal term size.

I found this when developing the l7 proofs and optimized the approach there (as they were too slow without it), pre-computing all the ival_mul rewrite theorems into an array upfront then indexing directly, but didn't backport it to the smaller sizes. I've now applied the same optimization to the l4 proofs (both architectures) and also to the aarch64 l5/l7 which were also unoptimized. Should see a significant speedup.
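
The cost difference between the two strategies can be sketched with a C analogy. This is not the HOL Light proof script: the "goal" here is just an array of integer keys standing in for the 256 per-coefficient subterms, and both function names are hypothetical. The first function rescans the goal from the start for every coefficient; the second builds an index table in one pass and then looks entries up directly.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_COEFFS 256 /* one entry per output coefficient */

/* Slow pattern: for every key k, scan the goal from the start until found.
 * Returns the number of comparisons performed.
 * Assumption: goal holds each key 0..n-1 exactly once. */
static size_t lookup_by_rescanning(const int *goal, size_t n)
{
  size_t steps = 0, k, j;
  for (k = 0; k < n; k++)
  {
    for (j = 0; j < n; j++)
    {
      steps++;
      if (goal[j] == (int)k)
      {
        break;
      }
    }
  }
  return steps;
}

/* Fast pattern: one pass fills table[key] = position, then each lookup
 * is a direct index. Returns the number of elementary steps. */
static size_t lookup_by_precomputed_table(const int *goal, size_t n,
                                          size_t *table)
{
  size_t steps = 0, j, k;
  for (j = 0; j < n; j++)
  {
    assert(goal[j] >= 0 && (size_t)goal[j] < n);
    table[goal[j]] = j;
    steps++;
  }
  for (k = 0; k < n; k++)
  {
    assert(goal[table[k]] == (int)k);
    steps++;
  }
  return steps;
}
```

With the keys stored in order, the rescanning version performs 1 + 2 + ... + 256 comparisons, while the table version performs 256 steps to build the table and 256 direct lookups.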

Times improved 3-4x for L4:

| Proof | Before | After | Speedup |
|---|---|---|---|
| aarch64 l4 | 2h21m | 42m | 3.4x |
| aarch64 l5 | 49m | 47m | ~same |
| aarch64 l7 | 1h29m | 1h29m | ~same |
| x86_64 l4 | 2h23m | 35m | 4.1x |
| x86_64 l5 | 47m | 49m | ~same |
| x86_64 l7 | 1h32m | ~1h31m | ~same |

Comment on lines +160 to +167
/* check-magic: off */
requires(array_abs_bound(a, 0, 4 * MLDSA_N, 8380417))
requires(array_abs_bound(b, 0, 4 * MLDSA_N, 75423753))
/* check-magic: on */
assigns(memory_slice(r, sizeof(int32_t) * MLDSA_N))
/* check-magic: off */
ensures(array_abs_bound(r, 0, MLDSA_N, 8380417))
/* check-magic: on */
Contributor

nit: turning off check-magic once instead of twice would be less distracting.

Contributor

This is also present in #1006. Maybe you can still change that in both before we merge.

Contributor Author

Thank you, fixed in both this and #1006

mkannwischer
mkannwischer previously approved these changes Apr 3, 2026
Contributor

@mkannwischer mkannwischer left a comment

Thanks @jakemas - this looks great to me. Thank you for the speed improvements for the L4 proof - that is greatly appreciated.
I think this is ready to be merged.

@hanno-becker, could you please double check the HOL-Light specs?

@jakemas jakemas force-pushed the jakemas/hol-light-pointwise-acc branch from 87d0267 to 21c0cbc Compare April 3, 2026 17:29
@hanno-becker hanno-becker force-pushed the jakemas/hol-light-pointwise-acc branch from 21c0cbc to 554a957 Compare April 5, 2026 05:10
Contributor

@hanno-becker hanno-becker left a comment

Thanks a lot @jakemas, this is great work 🎉

Unless I'm missing something, we still need CBMC proofs for the native versions of pointwise_acc_montgomery?

@mkannwischer mkannwischer dismissed their stale review April 5, 2026 05:23

Hanno is right: The CBMC proofs still need to be added.

@jakemas jakemas force-pushed the jakemas/hol-light-pointwise-acc branch 2 times, most recently from 8afaabe to 779cf15 Compare April 7, 2026 23:46
@jakemas
Contributor Author

jakemas commented Apr 8, 2026

OK, the x86 CBMC proofs are in; I'm working on an issue with the Arm ones.
Working theory:

The root cause is the aarch64 asm contract uses flat int32_t * with
array_abs_bound(a, 0, L*MLDSA_N, ...), but the native wrapper casts
from int32_t u[L][MLDSA_N] to (const int32_t *)u. Z3 struggles to
prove the 2D-to-flat cast preserves the bounds.
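
The relationship the solver is being asked to establish can be sketched at runtime in C. This is an illustration under assumptions, not the project's wrapper: `VEC_L`, `POLY_N`, and all three function names are hypothetical, the bound is taken to be exclusive, and the flat view relies on the usual contiguous layout of a 2D array, as the cast in the wrapper does. The sketch only shows that a row-wise bound and the flat bound over `VEC_L * POLY_N` elements describe the same memory; it does not reproduce the Z3 behaviour.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VEC_L 4    /* hypothetical vector length (l = 4) */
#define POLY_N 256 /* coefficients per polynomial */

/* Flat view: |a[i]| < bound for all 0 <= i < len (exclusive, an assumption). */
static int flat_abs_bound(const int32_t *a, size_t len, int32_t bound)
{
  size_t i;
  for (i = 0; i < len; i++)
  {
    if (a[i] <= -bound || a[i] >= bound)
    {
      return 0;
    }
  }
  return 1;
}

/* 2D view: the same property stated row by row. */
static int rows_abs_bound(int32_t u[VEC_L][POLY_N], int32_t bound)
{
  size_t i, j;
  for (i = 0; i < VEC_L; i++)
  {
    for (j = 0; j < POLY_N; j++)
    {
      if (u[i][j] <= -bound || u[i][j] >= bound)
      {
        return 0;
      }
    }
  }
  return 1;
}

/* What the wrapper's cast amounts to: reinterpret u as one flat array. */
static int wrapper_view_abs_bound(int32_t u[VEC_L][POLY_N], int32_t bound)
{
  return flat_abs_bound((const int32_t *)u, (size_t)VEC_L * POLY_N, bound);
}
```

Element `u[i][j]` appears at flat index `i * POLY_N + j`, so the two checks agree on every input; the proof obligation is to make that correspondence visible to the solver.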

@jakemas jakemas force-pushed the jakemas/hol-light-pointwise-acc branch 2 times, most recently from 8814c18 to b464b88 Compare April 8, 2026 06:42
@jakemas
Contributor Author

jakemas commented Apr 8, 2026

CBMC proofs added. Rebasing now.

@jakemas jakemas force-pushed the jakemas/hol-light-pointwise-acc branch from b464b88 to 79f0d01 Compare April 8, 2026 07:03
Port the ML-DSA pointwise multiplication-accumulation (l=4,5,7) and
their HOL Light proofs of correctness from s2n-bignum to mldsa-native,
for both AArch64 (NEON) and x86_64 (AVX2). Includes constant-time and
memory safety proofs for both architectures.

Ported from s2n-bignum PR #373.

Signed-off-by: Jake Massimo <jakemas@amazon.com>
Signed-off-by: Ubuntu <ubuntu@ip-172-31-29-57.us-west-2.compute.internal>
@jakemas jakemas force-pushed the jakemas/hol-light-pointwise-acc branch from 79f0d01 to 3b06637 Compare April 8, 2026 16:41

4 participants