RFC: Add incremental encaps API to support ML-KEM Braid #1619

Draft
mkannwischer wants to merge 2 commits into main from incremental-enc-api

@mkannwischer

Contributor

This PR splits ML-KEM encapsulation into two phases (mlk_kem_enc_derand_u / mlk_kem_enc_v) to support protocols like Braid that need to interleave other operations between computing the u- and v-components of the ciphertext. The first phase requires only the public seed and H(pk), not the full public key vector.

Internally, K-PKE.Encrypt is refactored into mlk_indcpa_enc_u + mlk_indcpa_enc_v. The non-incremental KEM path calls mlk_indcpa_enc directly to avoid serialization overhead. The intermediate noise polynomial epp is serialized as 4-bit nibbles (128 bytes), primarily so that no precondition on the allowed values is required.
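
For readers unfamiliar with the nibble encoding mentioned above, here is a minimal sketch of how 256 small signed coefficients could be packed into 128 bytes. It is not the code from this PR: the names `sketch_serialize_epp`, `sketch_deserialize_epp`, `SKETCH_N` and `SKETCH_EPP_BYTES` are hypothetical, and the two's-complement nibble layout (low nibble first) is an assumption. The point it illustrates is that every possible byte decodes to two coefficients in [-8, 7], so the decoder needs no validity check on its input, which is one plausible reading of "no precondition on the allowed values". In ML-KEM the noise term this presumably corresponds to has coefficients in [-2, 2], which fits comfortably in 4 bits.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names, for illustration only. */
#define SKETCH_N 256                    /* coefficients per polynomial */
#define SKETCH_EPP_BYTES (SKETCH_N / 2) /* two 4-bit nibbles per byte = 128 */

/* Pack 256 signed coefficients into 128 bytes, low nibble first.
 * Only the low 4 bits of each coefficient are kept, so the round trip
 * is exact for coefficients in [-8, 7] (assumed layout, not the PR's). */
static void sketch_serialize_epp(uint8_t out[SKETCH_EPP_BYTES],
                                 const int16_t coeffs[SKETCH_N])
{
  size_t i;
  for (i = 0; i < SKETCH_EPP_BYTES; i++)
  {
    /* Converting to unsigned first makes the masking well-defined
     * for negative values (conversion is modulo 2^16). */
    uint8_t lo = (uint8_t)((uint16_t)coeffs[2 * i] & 0xF);
    uint8_t hi = (uint8_t)((uint16_t)coeffs[2 * i + 1] & 0xF);
    out[i] = (uint8_t)(lo | (hi << 4));
  }
}

/* Unpack 128 bytes into 256 coefficients, sign-extending each nibble.
 * Any input byte is accepted; every output lies in [-8, 7]. */
static void sketch_deserialize_epp(int16_t coeffs[SKETCH_N],
                                   const uint8_t in[SKETCH_EPP_BYTES])
{
  size_t i;
  for (i = 0; i < SKETCH_EPP_BYTES; i++)
  {
    /* (n ^ 8) - 8 maps a nibble n in 0..15 to its 4-bit
     * two's-complement value in -8..7. */
    coeffs[2 * i] = (int16_t)(((in[i] & 0xF) ^ 8) - 8);
    coeffs[2 * i + 1] = (int16_t)(((in[i] >> 4) ^ 8) - 8);
  }
}
```

A tighter packing (for example 3 bits per coefficient) would save space but would leave some bit patterns outside the intended range, which is the kind of precondition the nibble format avoids.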

mkannwischer force-pushed the incremental-enc-api branch 2 times, most recently from 325ab51 to 285fc8a on March 12, 2026 05:37
mkannwischer added the benchmark label (this PR should be benchmarked in CI) on Mar 12, 2026

@oqs-bot oqs-bot left a comment

Intel Xeon 4th gen (c7i)

Benchmark suite Current: 856b540 Previous: c0fb232 Ratio
ML-KEM-512 keypair 11667 cycles 11774 cycles 0.99
ML-KEM-512 encaps 13401 cycles 13356 cycles 1.00
ML-KEM-512 decaps 17333 cycles 17522 cycles 0.99
ML-KEM-768 keypair 20339 cycles 20211 cycles 1.01
ML-KEM-768 encaps 21438 cycles 21480 cycles 1.00
ML-KEM-768 decaps 27521 cycles 27490 cycles 1.00
ML-KEM-1024 keypair 28756 cycles 28747 cycles 1.00
ML-KEM-1024 encaps 30828 cycles 30705 cycles 1.00
ML-KEM-1024 decaps 38764 cycles 38459 cycles 1.01

This comment was automatically generated by workflow using github-action-benchmark.

@oqs-bot oqs-bot left a comment

ppc64le (POWER10) benchmarks

Benchmark suite Current: 856b540 Previous: c0fb232 Ratio
ML-KEM-512 keypair 59376 cycles 59560 cycles 1.00
ML-KEM-512 encaps 72055 cycles 72057 cycles 1.00
ML-KEM-512 decaps 91812 cycles 91947 cycles 1.00
ML-KEM-768 keypair 98208 cycles 98659 cycles 1.00
ML-KEM-768 encaps 114736 cycles 115076 cycles 1.00
ML-KEM-768 decaps 140432 cycles 140831 cycles 1.00
ML-KEM-1024 keypair 148862 cycles 148847 cycles 1.00
ML-KEM-1024 encaps 167902 cycles 167928 cycles 1.00
ML-KEM-1024 decaps 198941 cycles 199093 cycles 1.00

This comment was automatically generated by workflow using github-action-benchmark.

@oqs-bot oqs-bot left a comment

AMD EPYC 3rd gen (c6a)

Benchmark suite Current: 856b540 Previous: c0fb232 Ratio
ML-KEM-512 keypair 13939 cycles 13907 cycles 1.00
ML-KEM-512 encaps 15689 cycles 15691 cycles 1.00
ML-KEM-512 decaps 21157 cycles 21253 cycles 1.00
ML-KEM-768 keypair 23701 cycles 23709 cycles 1.00
ML-KEM-768 encaps 25099 cycles 25155 cycles 1.00
ML-KEM-768 decaps 33133 cycles 33007 cycles 1.00
ML-KEM-1024 keypair 33205 cycles 33204 cycles 1.00
ML-KEM-1024 encaps 35665 cycles 35641 cycles 1.00
ML-KEM-1024 decaps 46453 cycles 46195 cycles 1.01

This comment was automatically generated by workflow using github-action-benchmark.

@oqs-bot oqs-bot left a comment

⚠️ Performance Alert ⚠️

Possible performance regression was detected for benchmark 'AMD EPYC 3rd gen (c6a)'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.03.

Benchmark suite Current: a4e4e31 Previous: 2bf8e59 Ratio
ML-KEM-512 encaps 16707 cycles 15974 cycles 1.05
ML-KEM-768 decaps 35711 cycles 33345 cycles 1.07
ML-KEM-1024 decaps 50650 cycles 46735 cycles 1.08

This comment was automatically generated by workflow using github-action-benchmark.

@oqs-bot oqs-bot left a comment

Intel Xeon 4th gen (c7i) (no-opt)

Benchmark suite Current: 856b540 Previous: c0fb232 Ratio
ML-KEM-512 keypair 28423 cycles 28218 cycles 1.01
ML-KEM-512 encaps 35312 cycles 36635 cycles 0.96
ML-KEM-512 decaps 45241 cycles 45192 cycles 1.00
ML-KEM-768 keypair 46322 cycles 46296 cycles 1.00
ML-KEM-768 encaps 55233 cycles 55812 cycles 0.99
ML-KEM-768 decaps 69681 cycles 69913 cycles 1.00
ML-KEM-1024 keypair 70870 cycles 70293 cycles 1.01
ML-KEM-1024 encaps 83960 cycles 82553 cycles 1.02
ML-KEM-1024 decaps 101882 cycles 98932 cycles 1.03

This comment was automatically generated by workflow using github-action-benchmark.

@oqs-bot oqs-bot left a comment

AMD EPYC 4th gen (c7a)

Benchmark suite Current: 856b540 Previous: c0fb232 Ratio
ML-KEM-512 keypair 12697 cycles 12706 cycles 1.00
ML-KEM-512 encaps 14226 cycles 14177 cycles 1.00
ML-KEM-512 decaps 19050 cycles 19036 cycles 1.00
ML-KEM-768 keypair 21894 cycles 21905 cycles 1.00
ML-KEM-768 encaps 22989 cycles 22946 cycles 1.00
ML-KEM-768 decaps 30055 cycles 29897 cycles 1.01
ML-KEM-1024 keypair 30714 cycles 30697 cycles 1.00
ML-KEM-1024 encaps 32722 cycles 32787 cycles 1.00
ML-KEM-1024 decaps 42327 cycles 42190 cycles 1.00

This comment was automatically generated by workflow using github-action-benchmark.

@oqs-bot oqs-bot left a comment

⚠️ Performance Alert ⚠️

Possible performance regression was detected for benchmark 'AMD EPYC 4th gen (c7a)'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.03.

Benchmark suite Current: a4e4e31 Previous: 2bf8e59 Ratio
ML-KEM-512 keypair 13236 cycles 12779 cycles 1.04
ML-KEM-512 encaps 15642 cycles 14273 cycles 1.10
ML-KEM-768 decaps 32957 cycles 30058 cycles 1.10
ML-KEM-1024 keypair 34340 cycles 32987 cycles 1.04
ML-KEM-1024 decaps 47071 cycles 42393 cycles 1.11

This comment was automatically generated by workflow using github-action-benchmark.

@oqs-bot oqs-bot left a comment

Intel Xeon 3rd gen (c6i)

Benchmark suite Current: 856b540 Previous: c0fb232 Ratio
ML-KEM-512 keypair 17471 cycles 17431 cycles 1.00
ML-KEM-512 encaps 19845 cycles 19836 cycles 1.00
ML-KEM-512 decaps 26406 cycles 26354 cycles 1.00
ML-KEM-768 keypair 29863 cycles 29796 cycles 1.00
ML-KEM-768 encaps 31769 cycles 31052 cycles 1.02
ML-KEM-768 decaps 41439 cycles 41419 cycles 1.00
ML-KEM-1024 keypair 42329 cycles 42318 cycles 1.00
ML-KEM-1024 encaps 45595 cycles 45892 cycles 0.99
ML-KEM-1024 decaps 59304 cycles 61098 cycles 0.97

This comment was automatically generated by workflow using github-action-benchmark.

@oqs-bot oqs-bot left a comment

⚠️ Performance Alert ⚠️

Possible performance regression was detected for benchmark 'Intel Xeon 3rd gen (c6i)'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.03.

Benchmark suite Current: a4e4e31 Previous: 2bf8e59 Ratio
ML-KEM-512 encaps 20660 cycles 19953 cycles 1.04
ML-KEM-768 keypair 32264 cycles 31153 cycles 1.04
ML-KEM-1024 decaps 61128 cycles 58193 cycles 1.05

This comment was automatically generated by workflow using github-action-benchmark.

@oqs-bot oqs-bot left a comment

AMD EPYC 3rd gen (c6a) (no-opt)

Benchmark suite Current: 856b540 Previous: c0fb232 Ratio
ML-KEM-512 keypair 40231 cycles 40276 cycles 1.00
ML-KEM-512 encaps 48480 cycles 48441 cycles 1.00
ML-KEM-512 decaps 62705 cycles 62607 cycles 1.00
ML-KEM-768 keypair 63832 cycles 63754 cycles 1.00
ML-KEM-768 encaps 74842 cycles 75005 cycles 1.00
ML-KEM-768 decaps 93488 cycles 93641 cycles 1.00
ML-KEM-1024 keypair 95299 cycles 95232 cycles 1.00
ML-KEM-1024 encaps 109171 cycles 109421 cycles 1.00
ML-KEM-1024 decaps 132011 cycles 132194 cycles 1.00

This comment was automatically generated by workflow using github-action-benchmark.

@oqs-bot oqs-bot left a comment

AMD EPYC 4th gen (c7a) (no-opt)

Benchmark suite Current: 856b540 Previous: c0fb232 Ratio
ML-KEM-512 keypair 36582 cycles 36601 cycles 1.00
ML-KEM-512 encaps 43100 cycles 43070 cycles 1.00
ML-KEM-512 decaps 55713 cycles 55708 cycles 1.00
ML-KEM-768 keypair 58695 cycles 58652 cycles 1.00
ML-KEM-768 encaps 67682 cycles 67635 cycles 1.00
ML-KEM-768 decaps 84507 cycles 84425 cycles 1.00
ML-KEM-1024 keypair 89091 cycles 88991 cycles 1.00
ML-KEM-1024 encaps 99378 cycles 99229 cycles 1.00
ML-KEM-1024 decaps 121053 cycles 120563 cycles 1.00

This comment was automatically generated by workflow using github-action-benchmark.

@oqs-bot oqs-bot left a comment

Arm Cortex-A76 (Raspberry Pi 5) benchmarks

Benchmark suite Current: a4e4e31 Previous: 2bf8e59 Ratio
ML-KEM-512 keypair 28285 cycles 28220 cycles 1.00
ML-KEM-512 encaps 34092 cycles 34106 cycles 1.00
ML-KEM-512 decaps 44329 cycles 44333 cycles 1.00
ML-KEM-768 keypair 47645 cycles 47614 cycles 1.00
ML-KEM-768 encaps 53834 cycles 53939 cycles 1.00
ML-KEM-768 decaps 68301 cycles 68365 cycles 1.00
ML-KEM-1024 keypair 70227 cycles 70253 cycles 1.00
ML-KEM-1024 encaps 78707 cycles 78729 cycles 1.00
ML-KEM-1024 decaps 98290 cycles 98443 cycles 1.00

This comment was automatically generated by workflow using github-action-benchmark.

@oqs-bot oqs-bot left a comment

Graviton4

Benchmark suite Current: 856b540 Previous: c0fb232 Ratio
ML-KEM-512 keypair 17676 cycles 17646 cycles 1.00
ML-KEM-512 encaps 20593 cycles 20606 cycles 1.00
ML-KEM-512 decaps 27028 cycles 27084 cycles 1.00
ML-KEM-768 keypair 29923 cycles 29905 cycles 1.00
ML-KEM-768 encaps 32788 cycles 32773 cycles 1.00
ML-KEM-768 decaps 41939 cycles 41963 cycles 1.00
ML-KEM-1024 keypair 43711 cycles 43739 cycles 1.00
ML-KEM-1024 encaps 48758 cycles 48736 cycles 1.00
ML-KEM-1024 decaps 61406 cycles 61382 cycles 1.00

This comment was automatically generated by workflow using github-action-benchmark.

@oqs-bot oqs-bot left a comment

Intel Xeon 3rd gen (c6i) (no-opt)

Benchmark suite Current: 856b540 Previous: c0fb232 Ratio
ML-KEM-512 keypair 45684 cycles 45722 cycles 1.00
ML-KEM-512 encaps 54598 cycles 54423 cycles 1.00
ML-KEM-512 decaps 69928 cycles 69779 cycles 1.00
ML-KEM-768 keypair 73225 cycles 74154 cycles 0.99
ML-KEM-768 encaps 86160 cycles 86032 cycles 1.00
ML-KEM-768 decaps 106234 cycles 106582 cycles 1.00
ML-KEM-1024 keypair 112133 cycles 112073 cycles 1.00
ML-KEM-1024 encaps 124870 cycles 124711 cycles 1.00
ML-KEM-1024 decaps 150839 cycles 150591 cycles 1.00

This comment was automatically generated by workflow using github-action-benchmark.

@oqs-bot oqs-bot left a comment

Graviton4 (no-opt)

Benchmark suite Current: 856b540 Previous: c0fb232 Ratio
ML-KEM-512 keypair 35448 cycles 35408 cycles 1.00
ML-KEM-512 encaps 41305 cycles 40111 cycles 1.03
ML-KEM-512 decaps 51288 cycles 51135 cycles 1.00
ML-KEM-768 keypair 56738 cycles 56671 cycles 1.00
ML-KEM-768 encaps 64836 cycles 65149 cycles 1.00
ML-KEM-768 decaps 79062 cycles 79291 cycles 1.00
ML-KEM-1024 keypair 88013 cycles 87860 cycles 1.00
ML-KEM-1024 encaps 97113 cycles 96876 cycles 1.00
ML-KEM-1024 decaps 116135 cycles 115825 cycles 1.00

This comment was automatically generated by workflow using github-action-benchmark.

@oqs-bot oqs-bot left a comment

Graviton3

Benchmark suite Current: 856b540 Previous: c0fb232 Ratio
ML-KEM-512 keypair 18674 cycles 18640 cycles 1.00
ML-KEM-512 encaps 21835 cycles 21878 cycles 1.00
ML-KEM-512 decaps 28794 cycles 28869 cycles 1.00
ML-KEM-768 keypair 31593 cycles 31542 cycles 1.00
ML-KEM-768 encaps 34796 cycles 34773 cycles 1.00
ML-KEM-768 decaps 44735 cycles 44779 cycles 1.00
ML-KEM-1024 keypair 46064 cycles 46077 cycles 1.00
ML-KEM-1024 encaps 51462 cycles 51494 cycles 1.00
ML-KEM-1024 decaps 65067 cycles 65017 cycles 1.00

This comment was automatically generated by workflow using github-action-benchmark.

@oqs-bot oqs-bot left a comment

Graviton2

Benchmark suite Current: 856b540 Previous: c0fb232 Ratio
ML-KEM-512 keypair 28337 cycles 28270 cycles 1.00
ML-KEM-512 encaps 34209 cycles 34120 cycles 1.00
ML-KEM-512 decaps 44538 cycles 44375 cycles 1.00
ML-KEM-768 keypair 47612 cycles 47674 cycles 1.00
ML-KEM-768 encaps 53936 cycles 53909 cycles 1.00
ML-KEM-768 decaps 68333 cycles 68363 cycles 1.00
ML-KEM-1024 keypair 70349 cycles 70257 cycles 1.00
ML-KEM-1024 encaps 78617 cycles 78760 cycles 1.00
ML-KEM-1024 decaps 98461 cycles 98451 cycles 1.00

This comment was automatically generated by workflow using github-action-benchmark.

@oqs-bot oqs-bot left a comment

Graviton3 (no-opt)

Benchmark suite Current: 856b540 Previous: c0fb232 Ratio
ML-KEM-512 keypair 38934 cycles 38890 cycles 1.00
ML-KEM-512 encaps 46774 cycles 44600 cycles 1.05
ML-KEM-512 decaps 56788 cycles 56685 cycles 1.00
ML-KEM-768 keypair 62284 cycles 62295 cycles 1.00
ML-KEM-768 encaps 71210 cycles 72323 cycles 0.98
ML-KEM-768 decaps 86947 cycles 87695 cycles 0.99
ML-KEM-1024 keypair 96359 cycles 96156 cycles 1.00
ML-KEM-1024 encaps 106402 cycles 106137 cycles 1.00
ML-KEM-1024 decaps 126922 cycles 126582 cycles 1.00

This comment was automatically generated by workflow using github-action-benchmark.

@oqs-bot oqs-bot left a comment

Graviton2 (no-opt)

Benchmark suite Current: 856b540 Previous: c0fb232 Ratio
ML-KEM-512 keypair 59254 cycles 59136 cycles 1.00
ML-KEM-512 encaps 69196 cycles 68627 cycles 1.01
ML-KEM-512 decaps 87340 cycles 87348 cycles 1.00
ML-KEM-768 keypair 95410 cycles 95336 cycles 1.00
ML-KEM-768 encaps 110535 cycles 109885 cycles 1.01
ML-KEM-768 decaps 134324 cycles 134360 cycles 1.00
ML-KEM-1024 keypair 145962 cycles 147936 cycles 0.99
ML-KEM-1024 encaps 161958 cycles 163772 cycles 0.99
ML-KEM-1024 decaps 193999 cycles 195429 cycles 0.99

This comment was automatically generated by workflow using github-action-benchmark.

@oqs-bot

oqs-bot commented Mar 12, 2026

CBMC Results (ML-KEM-512)

Full Results (198 proofs)
Proof Status Current Previous Change
**TOTAL** 1357s 1490s -8.9%
mlk_indcpa_keypair_derand 247s 254s -3%
mlk_poly_rej_uniform 132s 153s -14%
mlk_rej_uniform_c 125s 140s -11%
mlk_indcpa_enc_u 61s - new
mlk_polyvec_basemul_acc_montgomery_cached_c 50s 59s -15%
poly_ntt_native 40s 41s -2%
mlk_poly_reduce_native 35s 38s -8%
mlk_ntt_layer 33s 37s -11%
mlk_keccak_squeezeblocks_x4 25s 27s -7%
mlk_indcpa_dec 17s 18s -6%
keccakf1600x4_permute_native_x4 16s 16s +0%
mlk_fqmul 15s 16s -6%
mlk_poly_decompress_d10_native 15s 15s +0%
mlk_poly_decompress_d4_native 13s 15s -13%
mlk_polyvec_add 13s 11s +18%
mlk_enc_derand_u 11s - new
mlk_indcpa_enc_v 11s - new
mlk_poly_frommsg 11s 12s -8%
mlk_keccak_squeezeblocks 9s 8s +12%
mlk_poly_ntt 9s 7s +29%
mlk_keccakf1600_permute_c 8s 3s +167%
mlk_keccak_squeeze_once 7s 7s +0%
mlk_ntt_butterfly_block 7s 8s -12%
mlk_poly_compress_d10_native 7s 4s +75%
mlk_poly_frombytes_native 7s 9s -22%
polyvec_basemul_acc_montgomery_cached_native 7s 5s +40%
rej_uniform_native_x86_64 7s 5s +40%
mlk_ct_cmask_nonzero_u8 6s 1s +500%
mlk_poly_cbd_eta2 6s 4s +50%
kem_dec 5s 5s +0%
mlk_enc_v 5s - new
mlk_poly_add 5s 4s +25%
mlk_poly_compress_d10_c 5s 4s +25%
mlk_poly_decompress_d4_c 5s 3s +67%
mlk_poly_ntt_c 5s 4s +25%
mlk_poly_rej_uniform_x4 5s 6s -17%
mlk_polyvec_mulcache_compute 5s 1s +400%
mlk_serialize_epp 5s - new
mlk_shake256x4 5s 3s +67%
nttunpack_native_x86_64 5s 4s +25%
poly_decompress_d4_native_x86_64 5s 5s +0%
poly_decompress_d5_native_x86_64 5s 2s +150%
kem_enc 4s 2s +100%
mlk_deserialize_epp 4s - new
mlk_indcpa_enc 4s 171s -98%
mlk_invntt_layer 4s 6s -33%
mlk_keccak_absorb_once 4s 4s +0%
mlk_keccak_absorb_once_x4 4s 7s -43%
mlk_keccakf1600x4_extract_bytes_c 4s 3s +33%
mlk_poly_decompress_d10 4s 4s +0%
mlk_poly_decompress_dv 4s 3s +33%
mlk_poly_getnoise_eta1_4x 4s 2s +100%
mlk_scalar_decompress_d4 4s 1s +300%
poly_compress_d10_native_x86_64 4s 1s +300%
poly_decompress_d10_native_x86_64 4s 4s +0%
poly_decompress_d11_native_x86_64 4s 3s +33%
poly_frombytes_native_x86_64 4s 6s -33%
rej_uniform_native 4s 3s +33%
sys_check_capability 4s 1s +300%
keccak_f1600_x4_native_aarch64_v84a 3s 3s +0%
keccakf1600_permute_native 3s 2s +50%
mlk_ct_cmov_zero 3s 3s +0%
mlk_ct_sel_uint8 3s 2s +50%
mlk_gen_matrix_serial 3s 2s +50%
mlk_keccakf1600_extract_bytes (big endian) 3s 2s +50%
mlk_keccakf1600_xor_bytes (big endian) 3s 1s +200%
mlk_keccakf1600x4_xor_bytes_c 3s 1s +200%
mlk_poly_compress_d11_native 3s 2s +50%
mlk_poly_compress_d5 3s 3s +0%
mlk_poly_compress_d5_c 3s 1s +200%
mlk_poly_decompress_d5_c 3s 2s +50%
mlk_poly_frombytes_c 3s 1s +200%
mlk_poly_getnoise_eta1122_4x 3s 4s -25%
mlk_poly_reduce 3s 3s +0%
mlk_poly_tomsg 3s 2s +50%
mlk_polyvec_basemul_acc_montgomery_cached 3s 1s +200%
mlk_polyvec_permute_bitrev_to_custom 3s 2s +50%
mlk_polyvec_permute_bitrev_to_custom_native 3s 2s +50%
mlk_scalar_compress_d1 3s 1s +200%
mlk_scalar_compress_d10 3s 1s +200%
mlk_scalar_compress_d4 3s 3s +0%
mlk_scalar_compress_d5 3s 2s +50%
mlk_scalar_decompress_d10 3s 4s -25%
mlk_scalar_decompress_d11 3s 1s +200%
mlk_sha3_256 3s 5s -40%
mlk_sha3_512 3s 1s +200%
mlk_shake128x4_squeezeblocks 3s 1s +200%
mlk_value_barrier_i32 3s 3s +0%
poly_compress_d5_native_x86_64 3s 3s +0%
poly_reduce_native_aarch64 3s 2s +50%
poly_tobytes_native_aarch64 3s 2s +50%
poly_tomont_native_x86_64 3s 3s +0%
polyvec_basemul_acc_montgomery_cached_k4_native_aarch64 3s 2s +50%
intt_native_aarch64 2s 3s -33%
intt_native_x86_64 2s 3s -33%
keccak_f1600_x4_native_aarch64_v8a_scalar_hybrid 2s 3s -33%
keccakf1600x4_extract_bytes_native 2s 3s -33%
kem_check_pk 2s 3s -33%
kem_check_sk 2s 5s -60%
kem_enc_derand 2s 4s -50%
kem_keypair 2s 2s +0%
kem_keypair_derand 2s 3s -33%
mlk_barrett_reduce 2s 1s +100%
mlk_check_pct 2s 2s +0%
mlk_ct_cmask_neg_i16 2s 2s +0%
mlk_ct_sel_int16 2s 3s -33%
mlk_deserialize_polyvec_16le 2s - new
mlk_gen_matrix 2s 3s -33%
mlk_keccakf1600_xor_bytes 2s 3s -33%
mlk_keccakf1600x4_permute 2s 2s +0%
mlk_keccakf1600x4_xor_bytes 2s 1s +100%
mlk_montgomery_reduce 2s 2s +0%
mlk_poly_compress_d11 2s 2s +0%
mlk_poly_compress_d4 2s 1s +100%
mlk_poly_compress_dv 2s 2s +0%
mlk_poly_decompress_d10_c 2s 3s -33%
mlk_poly_decompress_d11_c 2s 3s -33%
mlk_poly_decompress_d11_native 2s 2s +0%
mlk_poly_decompress_d4 2s 2s +0%
mlk_poly_decompress_d5 2s 1s +100%
mlk_poly_decompress_d5_native 2s 1s +100%
mlk_poly_getnoise_eta1_4x_native 2s 4s -50%
mlk_poly_getnoise_eta2 2s 3s -33%
mlk_poly_invntt_tomont 2s 2s +0%
mlk_poly_invntt_tomont_c 2s 2s +0%
mlk_poly_mulcache_compute_c 2s 4s -50%
mlk_poly_reduce_c 2s 2s +0%
mlk_poly_sub 2s 2s +0%
mlk_poly_tobytes_c 2s 2s +0%
mlk_poly_tobytes_native 2s 1s +100%
mlk_poly_tomont 2s 2s +0%
mlk_poly_tomont_native 2s 2s +0%
mlk_polymat_permute_bitrev_to_custom 2s 2s +0%
mlk_polyvec_compress_du 2s 2s +0%
mlk_polyvec_decompress_du 2s 2s +0%
mlk_polyvec_frombytes 2s 4s -50%
mlk_polyvec_invntt_tomont 2s 1s +100%
mlk_polyvec_reduce 2s 3s -33%
mlk_polyvec_tobytes 2s 2s +0%
mlk_scalar_compress_d11 2s 1s +100%
mlk_scalar_decompress_d5 2s 2s +0%
mlk_scalar_signed_to_unsigned_q 2s 5s -60%
mlk_serialize_polyvec_16le 2s - new
mlk_shake128_absorb_once 2s 2s +0%
mlk_shake128_squeezeblocks 2s 2s +0%
mlk_shake128x4_absorb_once 2s 1s +100%
mlk_value_barrier_u32 2s 2s +0%
mlk_value_barrier_u8 2s 2s +0%
ntt_native_aarch64 2s 2s +0%
poly_compress_d4_native_x86_64 2s 3s -33%
poly_getnoise_eta1122_4x_native 2s 3s -33%
poly_invntt_tomont_native 2s 3s -33%
poly_mulcache_compute_native_aarch64 2s 4s -50%
poly_mulcache_compute_native_x86_64 2s 2s +0%
poly_reduce_native_x86_64 2s 1s +100%
poly_tomont_native_aarch64 2s 1s +100%
polyvec_basemul_acc_montgomery_cached_k2_native_aarch64 2s 2s +0%
polyvec_basemul_acc_montgomery_cached_k2_native_x86_64 2s 3s -33%
polyvec_basemul_acc_montgomery_cached_k4_native_x86_64 2s 2s +0%
rej_uniform_native_aarch64 2s 2s +0%
keccak_f1600_x1_native_aarch64 1s 3s -67%
keccak_f1600_x1_native_aarch64_v84a 1s 2s -50%
keccak_f1600_x4_native_aarch64_v8a_v84a_scalar_hybrid 1s 5s -80%
keccak_f1600_x4_native_avx2 1s 1s +0%
keccakf1600x4_xor_bytes_native 1s 2s -50%
mlk_ct_cmask_nonzero_u16 1s 3s -67%
mlk_ct_get_optblocker_i32 1s 1s +0%
mlk_ct_get_optblocker_u32 1s 2s -50%
mlk_ct_get_optblocker_u8 1s 2s -50%
mlk_ct_memcmp 1s 2s -50%
mlk_keccakf1600_extract_bytes 1s 2s -50%
mlk_keccakf1600_permute 1s 1s +0%
mlk_keccakf1600x4_extract_bytes 1s 2s -50%
mlk_keypair_getnoise_eta1 1s 2s -50%
mlk_matvec_mul 1s 2s -50%
mlk_poly_cbd_eta1 1s 2s -50%
mlk_poly_compress_d10 1s 2s -50%
mlk_poly_compress_d11_c 1s 1s +0%
mlk_poly_compress_d4_c 1s 3s -67%
mlk_poly_compress_d4_native 1s 3s -67%
mlk_poly_compress_d5_native 1s 4s -75%
mlk_poly_compress_du 1s 2s -50%
mlk_poly_decompress_d11 1s 3s -67%
mlk_poly_decompress_du 1s 4s -75%
mlk_poly_frombytes 1s 2s -50%
mlk_poly_mulcache_compute 1s 2s -50%
mlk_poly_mulcache_compute_native 1s 3s -67%
mlk_poly_tobytes 1s 2s -50%
mlk_poly_tomont_c 1s 1s +0%
mlk_polyvec_ntt 1s 1s +0%
mlk_polyvec_tomont 1s 1s +0%
mlk_rej_uniform 1s 2s -50%
mlk_shake256 1s 4s -75%
ntt_native_x86_64 1s 5s -80%
poly_compress_d11_native_x86_64 1s 2s -50%
poly_tobytes_native_x86_64 1s 3s -67%
polyvec_basemul_acc_montgomery_cached_k3_native_aarch64 1s 1s +0%
polyvec_basemul_acc_montgomery_cached_k3_native_x86_64 1s 4s -75%

@oqs-bot

oqs-bot commented Mar 12, 2026

CBMC Results (ML-KEM-768)

Full Results (198 proofs)
Proof Status Current Previous Change
**TOTAL** 1436s 1400s +2.6%
mlk_indcpa_keypair_derand 206s 185s +11%
mlk_poly_rej_uniform 173s 145s +19%
mlk_rej_uniform_c 146s 127s +15%
mlk_polyvec_basemul_acc_montgomery_cached_c 54s 47s +15%
poly_ntt_native 50s 38s +32%
mlk_ntt_layer 44s 30s +47%
mlk_indcpa_enc_u 41s - new
mlk_poly_reduce_native 41s 35s +17%
mlk_keccak_squeezeblocks_x4 30s 24s +25%
mlk_fqmul 21s 17s +24%
mlk_poly_decompress_d4_native 19s 13s +46%
polyvec_basemul_acc_montgomery_cached_native 18s 20s -10%
keccakf1600x4_permute_native_x4 17s 18s -6%
mlk_poly_decompress_d10_native 17s 16s +6%
mlk_indcpa_dec 16s 12s +33%
mlk_poly_frommsg 14s 9s +56%
mlk_indcpa_enc_v 13s - new
mlk_poly_frombytes_native 12s 9s +33%
mlk_polyvec_add 10s 9s +11%
mlk_ntt_butterfly_block 9s 8s +12%
mlk_invntt_layer 8s 6s +33%
mlk_keccak_squeeze_once 8s 8s +0%
mlk_keccak_squeezeblocks 8s 9s -11%
mlk_keccak_absorb_once_x4 7s 7s +0%
mlk_poly_rej_uniform_x4 7s 4s +75%
keccakf1600x4_extract_bytes_native 6s 3s +100%
mlk_enc_v 6s - new
mlk_gen_matrix_serial 6s 3s +100%
mlk_keccak_absorb_once 6s 6s +0%
mlk_keccakf1600_permute_c 6s 5s +20%
mlk_poly_ntt 6s 8s -25%
poly_decompress_d10_native_x86_64 6s 5s +20%
rej_uniform_native 6s 4s +50%
rej_uniform_native_x86_64 6s 7s -14%
mlk_enc_derand_u 5s - new
poly_decompress_d4_native_x86_64 5s 5s +0%
polyvec_basemul_acc_montgomery_cached_k4_native_aarch64 5s 3s +67%
kem_dec 4s 5s -20%
mlk_gen_matrix 4s 3s +33%
mlk_keccakf1600x4_extract_bytes_c 4s 2s +100%
mlk_poly_compress_d10_c 4s 2s +100%
mlk_poly_compress_d4_native 4s 1s +300%
mlk_poly_compress_dv 4s 2s +100%
mlk_poly_decompress_d5_c 4s 2s +100%
mlk_poly_decompress_du 4s 2s +100%
mlk_poly_decompress_dv 4s 3s +33%
mlk_poly_invntt_tomont_c 4s 3s +33%
mlk_poly_ntt_c 4s 3s +33%
mlk_poly_tobytes_native 4s 3s +33%
mlk_poly_tomont_c 4s 2s +100%
mlk_polyvec_frombytes 4s 2s +100%
mlk_rej_uniform 4s 3s +33%
mlk_scalar_compress_d11 4s 3s +33%
mlk_shake128_squeezeblocks 4s 2s +100%
mlk_shake128x4_squeezeblocks 4s 1s +300%
poly_compress_d5_native_x86_64 4s 4s +0%
poly_frombytes_native_x86_64 4s 4s +0%
poly_mulcache_compute_native_x86_64 4s 2s +100%
polyvec_basemul_acc_montgomery_cached_k3_native_x86_64 4s 2s +100%
keccak_f1600_x1_native_aarch64_v84a 3s 2s +50%
keccak_f1600_x4_native_aarch64_v8a_scalar_hybrid 3s 2s +50%
keccakf1600x4_xor_bytes_native 3s 3s +0%
kem_check_pk 3s 4s -25%
kem_check_sk 3s 1s +200%
kem_enc_derand 3s 3s +0%
kem_keypair 3s 3s +0%
kem_keypair_derand 3s 3s +0%
mlk_check_pct 3s 3s +0%
mlk_ct_cmov_zero 3s 3s +0%
mlk_indcpa_enc 3s 173s -98%
mlk_keccakf1600_extract_bytes (big endian) 3s 2s +50%
mlk_keccakf1600_xor_bytes (big endian) 3s 4s -25%
mlk_keccakf1600x4_permute 3s 1s +200%
mlk_poly_add 3s 3s +0%
mlk_poly_cbd_eta1 3s 3s +0%
mlk_poly_cbd_eta2 3s 1s +200%
mlk_poly_compress_d11_native 3s 1s +200%
mlk_poly_compress_d4_c 3s 4s -25%
mlk_poly_compress_d5_native 3s 3s +0%
mlk_poly_decompress_d11_native 3s 3s +0%
mlk_poly_decompress_d4 3s 3s +0%
mlk_poly_frombytes 3s 2s +50%
mlk_poly_frombytes_c 3s 4s -25%
mlk_poly_getnoise_eta1_4x 3s 3s +0%
mlk_poly_invntt_tomont 3s 2s +50%
mlk_poly_mulcache_compute_c 3s 4s -25%
mlk_poly_sub 3s 4s -25%
mlk_poly_tomsg 3s 3s +0%
mlk_polymat_permute_bitrev_to_custom 3s 2s +50%
mlk_polyvec_compress_du 3s 1s +200%
mlk_polyvec_permute_bitrev_to_custom 3s 2s +50%
mlk_scalar_decompress_d11 3s 1s +200%
mlk_scalar_signed_to_unsigned_q 3s 6s -50%
mlk_serialize_epp 3s - new
mlk_serialize_polyvec_16le 3s - new
mlk_shake256 3s 3s +0%
mlk_shake256x4 3s 3s +0%
mlk_value_barrier_u8 3s 3s +0%
ntt_native_aarch64 3s 3s +0%
nttunpack_native_x86_64 3s 4s -25%
poly_invntt_tomont_native 3s 3s +0%
poly_mulcache_compute_native_aarch64 3s 2s +50%
poly_tomont_native_x86_64 3s 3s +0%
polyvec_basemul_acc_montgomery_cached_k2_native_aarch64 3s 2s +50%
polyvec_basemul_acc_montgomery_cached_k3_native_aarch64 3s 3s +0%
polyvec_basemul_acc_montgomery_cached_k4_native_x86_64 3s 4s -25%
intt_native_aarch64 2s 1s +100%
intt_native_x86_64 2s 3s -33%
keccak_f1600_x1_native_aarch64 2s 1s +100%
keccak_f1600_x4_native_aarch64_v8a_v84a_scalar_hybrid 2s 3s -33%
keccak_f1600_x4_native_avx2 2s 2s +0%
keccakf1600_permute_native 2s 1s +100%
kem_enc 2s 1s +100%
mlk_ct_cmask_nonzero_u16 2s 3s -33%
mlk_ct_get_optblocker_i32 2s 3s -33%
mlk_ct_get_optblocker_u32 2s 2s +0%
mlk_ct_get_optblocker_u8 2s 2s +0%
mlk_deserialize_polyvec_16le 2s - new
mlk_keccakf1600x4_extract_bytes 2s 2s +0%
mlk_keypair_getnoise_eta1 2s 2s +0%
mlk_matvec_mul 2s 4s -50%
mlk_poly_compress_d10_native 2s 4s -50%
mlk_poly_compress_d11_c 2s 2s +0%
mlk_poly_compress_d5 2s 3s -33%
mlk_poly_compress_du 2s 2s +0%
mlk_poly_decompress_d11 2s 1s +100%
mlk_poly_decompress_d11_c 2s 2s +0%
mlk_poly_decompress_d4_c 2s 4s -50%
mlk_poly_decompress_d5_native 2s 2s +0%
mlk_poly_getnoise_eta1_4x_native 2s 2s +0%
mlk_poly_getnoise_eta2 2s 3s -33%
mlk_poly_mulcache_compute_native 2s 2s +0%
mlk_poly_reduce_c 2s 2s +0%
mlk_poly_tobytes 2s 3s -33%
mlk_poly_tomont 2s 1s +100%
mlk_poly_tomont_native 2s 1s +100%
mlk_polyvec_decompress_du 2s 3s -33%
mlk_polyvec_invntt_tomont 2s 1s +100%
mlk_polyvec_mulcache_compute 2s 3s -33%
mlk_polyvec_ntt 2s 2s +0%
mlk_polyvec_reduce 2s 2s +0%
mlk_polyvec_tomont 2s 2s +0%
mlk_scalar_compress_d1 2s 2s +0%
mlk_scalar_compress_d10 2s 1s +100%
mlk_scalar_compress_d5 2s 3s -33%
mlk_scalar_decompress_d10 2s 4s -50%
mlk_scalar_decompress_d4 2s 1s +100%
mlk_shake128x4_absorb_once 2s 3s -33%
mlk_value_barrier_i32 2s 3s -33%
mlk_value_barrier_u32 2s 3s -33%
ntt_native_x86_64 2s 2s +0%
poly_compress_d10_native_x86_64 2s 1s +100%
poly_compress_d11_native_x86_64 2s 4s -50%
poly_compress_d4_native_x86_64 2s 2s +0%
poly_decompress_d5_native_x86_64 2s 2s +0%
poly_getnoise_eta1122_4x_native 2s 3s -33%
poly_reduce_native_aarch64 2s 3s -33%
poly_reduce_native_x86_64 2s 3s -33%
poly_tobytes_native_aarch64 2s 2s +0%
poly_tobytes_native_x86_64 2s 2s +0%
polyvec_basemul_acc_montgomery_cached_k2_native_x86_64 2s 3s -33%
rej_uniform_native_aarch64 2s 4s -50%
sys_check_capability 2s 2s +0%
keccak_f1600_x4_native_aarch64_v84a 1s 3s -67%
mlk_barrett_reduce 1s 2s -50%
mlk_ct_cmask_neg_i16 1s 2s -50%
mlk_ct_cmask_nonzero_u8 1s 1s +0%
mlk_ct_memcmp 1s 2s -50%
mlk_ct_sel_int16 1s 2s -50%
mlk_ct_sel_uint8 1s 3s -67%
mlk_deserialize_epp 1s - new
mlk_keccakf1600_extract_bytes 1s 3s -67%
mlk_keccakf1600_permute 1s 2s -50%
mlk_keccakf1600_xor_bytes 1s 3s -67%
mlk_keccakf1600x4_xor_bytes 1s 2s -50%
mlk_keccakf1600x4_xor_bytes_c 1s 1s +0%
mlk_montgomery_reduce 1s 1s +0%
mlk_poly_compress_d10 1s 3s -67%
mlk_poly_compress_d11 1s 3s -67%
mlk_poly_compress_d4 1s 2s -50%
mlk_poly_compress_d5_c 1s 4s -75%
mlk_poly_decompress_d10 1s 1s +0%
mlk_poly_decompress_d10_c 1s 1s +0%
mlk_poly_decompress_d5 1s 1s +0%
mlk_poly_getnoise_eta1122_4x 1s 1s +0%
mlk_poly_mulcache_compute 1s 2s -50%
mlk_poly_reduce 1s 1s +0%
mlk_poly_tobytes_c 1s 5s -80%
mlk_polyvec_basemul_acc_montgomery_cached 1s 1s +0%
mlk_polyvec_permute_bitrev_to_custom_native 1s 3s -67%
mlk_polyvec_tobytes 1s 2s -50%
mlk_scalar_compress_d4 1s 3s -67%
mlk_scalar_decompress_d5 1s 3s -67%
mlk_sha3_256 1s 3s -67%
mlk_sha3_512 1s 1s +0%
mlk_shake128_absorb_once 1s 1s +0%
poly_decompress_d11_native_x86_64 1s 3s -67%
poly_tomont_native_aarch64 1s 2s -50%

@oqs-bot

oqs-bot commented Mar 12, 2026

CBMC Results (ML-KEM-1024)

⚠️ Attention Required

Proof Status Current Previous Change
**TOTAL** ⚠️ 1772s 1380s +28.4%

Full Results (198 proofs)
Proof Status Current Previous Change
**TOTAL** ⚠️ 1772s 1380s +28.4%
mlk_indcpa_enc_u 491s - new
mlk_poly_rej_uniform 144s 159s -9%
mlk_rej_uniform_c 129s 131s -2%
mlk_indcpa_keypair_derand 126s 124s +2%
mlk_polyvec_basemul_acc_montgomery_cached_c 77s 77s +0%
poly_ntt_native 41s 41s +0%
polyvec_basemul_acc_montgomery_cached_native 38s 36s +6%
mlk_poly_reduce_native 35s 34s +3%
mlk_ntt_layer 31s 28s +11%
mlk_keccak_squeezeblocks_x4 27s 28s -4%
mlk_fqmul 16s 14s +14%
keccakf1600x4_permute_native_x4 15s 17s -12%
mlk_poly_decompress_d11_native 15s 14s +7%
mlk_poly_decompress_d5_native 14s 12s +17%
mlk_indcpa_enc_v 12s - new
mlk_polyvec_add 12s 11s +9%
mlk_indcpa_dec 11s 10s +10%
mlk_poly_frommsg 10s 12s -17%
mlk_enc_derand_u 9s - new
mlk_poly_frombytes_native 9s 9s +0%
mlk_keccak_squeeze_once 8s 8s +0%
mlk_ntt_butterfly_block 8s 7s +14%
mlk_poly_ntt 8s 7s +14%
rej_uniform_native_x86_64 8s 7s +14%
mlk_invntt_layer 7s 6s +17%
mlk_keccak_absorb_once_x4 7s 5s +40%
mlk_keccak_squeezeblocks 7s 7s +0%
mlk_keccakf1600_permute_c 7s 3s +133%
poly_decompress_d5_native_x86_64 7s 5s +40%
mlk_polymat_permute_bitrev_to_custom 6s 7s -14%
mlk_polyvec_ntt 6s 5s +20%
kem_dec 5s 4s +25%
mlk_enc_v 5s - new
mlk_gen_matrix_serial 5s 5s +0%
mlk_poly_add 5s 3s +67%
mlk_poly_compress_d11 5s 2s +150%
mlk_poly_compress_d11_c 5s 5s +0%
mlk_poly_getnoise_eta1_4x 5s 3s +67%
mlk_poly_mulcache_compute_c 5s 4s +25%
mlk_poly_rej_uniform_x4 5s 5s +0%
mlk_poly_tomsg 5s 5s +0%
mlk_polyvec_mulcache_compute 5s 3s +67%
poly_decompress_d11_native_x86_64 5s 4s +25%
poly_tomont_native_aarch64 5s 3s +67%
kem_check_pk 4s 3s +33%
kem_enc 4s 4s +0%
mlk_ct_sel_uint8 4s 2s +100%
mlk_keypair_getnoise_eta1 4s 3s +33%
mlk_poly_compress_d10_native 4s 1s +300%
mlk_poly_compress_d11_native 4s 2s +100%
mlk_poly_compress_dv 4s 4s +0%
mlk_poly_decompress_d10_c 4s 3s +33%
mlk_poly_decompress_d11 4s 1s +300%
mlk_poly_decompress_d4 4s 4s +0%
mlk_poly_decompress_d4_c 4s 2s +100%
mlk_poly_decompress_du 4s 3s +33%
mlk_poly_getnoise_eta2 4s 3s +33%
mlk_poly_ntt_c 4s 4s +0%
mlk_poly_reduce_c 4s 2s +100%
mlk_poly_tomont_native 4s 4s +0%
mlk_scalar_compress_d11 4s 2s +100%
mlk_sha3_256 4s 2s +100%
mlk_shake256x4 4s 3s +33%
poly_decompress_d4_native_x86_64 4s 1s +300%
polyvec_basemul_acc_montgomery_cached_k2_native_x86_64 4s 3s +33%
polyvec_basemul_acc_montgomery_cached_k3_native_x86_64 4s 2s +100%
sys_check_capability 4s 3s +33%
keccakf1600_permute_native 3s 3s +0%
kem_keypair 3s 2s +50%
mlk_check_pct 3s 1s +200%
mlk_ct_cmask_neg_i16 3s 1s +200%
mlk_ct_cmov_zero 3s 2s +50%
mlk_deserialize_polyvec_16le 3s - new
mlk_gen_matrix 3s 6s -50%
mlk_keccak_absorb_once 3s 7s -57%
mlk_keccakf1600_extract_bytes (big endian) 3s 3s +0%
mlk_keccakf1600_xor_bytes (big endian) 3s 2s +50%
mlk_keccakf1600x4_extract_bytes_c 3s 2s +50%
mlk_keccakf1600x4_xor_bytes 3s 3s +0%
mlk_matvec_mul 3s 4s -25%
mlk_poly_cbd_eta2 3s 2s +50%
mlk_poly_compress_d10 3s 2s +50%
mlk_poly_compress_d10_c 3s 3s +0%
mlk_poly_compress_d5_c 3s 2s +50%
mlk_poly_decompress_d10_native 3s 4s -25%
mlk_poly_decompress_d11_c 3s 2s +50%
mlk_poly_decompress_d5 3s 2s +50%
mlk_poly_decompress_dv 3s 2s +50%
mlk_poly_getnoise_eta1122_4x 3s 3s +0%
mlk_poly_invntt_tomont 3s 3s +0%
mlk_poly_invntt_tomont_c 3s 3s +0%
mlk_poly_reduce 3s 2s +50%
mlk_poly_tobytes 3s 4s -25%
mlk_poly_tobytes_c 3s 2s +50%
mlk_polyvec_decompress_du 3s 2s +50%
mlk_polyvec_permute_bitrev_to_custom_native 3s 1s +200%
mlk_scalar_compress_d10 3s 1s +200%
mlk_scalar_compress_d5 3s 2s +50%
mlk_scalar_decompress_d4 3s 3s +0%
mlk_scalar_signed_to_unsigned_q 3s 3s +0%
mlk_serialize_epp 3s - new
mlk_shake128_squeezeblocks 3s 4s -25%
poly_compress_d11_native_x86_64 3s 2s +50%
poly_compress_d4_native_x86_64 3s 2s +50%
poly_frombytes_native_x86_64 3s 3s +0%
poly_getnoise_eta1122_4x_native 3s 1s +200%
poly_invntt_tomont_native 3s 3s +0%
poly_mulcache_compute_native_x86_64 3s 2s +50%
poly_reduce_native_x86_64 3s 5s -40%
polyvec_basemul_acc_montgomery_cached_k2_native_aarch64 3s 3s +0%
polyvec_basemul_acc_montgomery_cached_k3_native_aarch64 3s 3s +0%
polyvec_basemul_acc_montgomery_cached_k4_native_aarch64 3s 2s +50%
rej_uniform_native 3s 4s -25%
intt_native_aarch64 2s 2s +0%
keccak_f1600_x1_native_aarch64_v84a 2s 1s +100%
keccak_f1600_x4_native_aarch64_v84a 2s 1s +100%
keccak_f1600_x4_native_aarch64_v8a_scalar_hybrid 2s 3s -33%
keccak_f1600_x4_native_avx2 2s 1s +100%
keccakf1600x4_extract_bytes_native 2s 2s +0%
kem_check_sk 2s 2s +0%
kem_keypair_derand 2s 1s +100%
mlk_barrett_reduce 2s 2s +0%
mlk_ct_cmask_nonzero_u8 2s 2s +0%
mlk_ct_get_optblocker_u32 2s 2s +0%
mlk_ct_get_optblocker_u8 2s 3s -33%
mlk_ct_memcmp 2s 3s -33%
mlk_deserialize_epp 2s - new
mlk_keccakf1600_extract_bytes 2s 2s +0%
mlk_keccakf1600x4_extract_bytes 2s 2s +0%
mlk_keccakf1600x4_permute 2s 2s +0%
mlk_poly_cbd_eta1 2s 2s +0%
mlk_poly_compress_d4 2s 3s -33%
mlk_poly_compress_d5 2s 2s +0%
mlk_poly_compress_d5_native 2s 3s -33%
mlk_poly_compress_du 2s 2s +0%
mlk_poly_decompress_d10 2s 3s -33%
mlk_poly_decompress_d5_c 2s 4s -50%
mlk_poly_frombytes_c 2s 3s -33%
mlk_poly_getnoise_eta1_4x_native 2s 4s -50%
mlk_poly_mulcache_compute_native 2s 2s +0%
mlk_poly_sub 2s 2s +0%
mlk_poly_tomont 2s 3s -33%
mlk_poly_tomont_c 2s 1s +100%
mlk_polyvec_compress_du 2s 2s +0%
mlk_polyvec_frombytes 2s 2s +0%
mlk_polyvec_invntt_tomont 2s 2s +0%
mlk_rej_uniform 2s 3s -33%
mlk_scalar_compress_d4 2s 2s +0%
mlk_sha3_512 2s 2s +0%
mlk_shake128x4_absorb_once 2s 2s +0%
mlk_shake128x4_squeezeblocks 2s 2s +0%
mlk_shake256 2s 1s +100%
mlk_value_barrier_i32 2s 1s +100%
mlk_value_barrier_u8 2s 1s +100%
ntt_native_aarch64 2s 3s -33%
ntt_native_x86_64 2s 3s -33%
nttunpack_native_x86_64 2s 4s -50%
poly_compress_d10_native_x86_64 2s 3s -33%
poly_compress_d5_native_x86_64 2s 4s -50%
poly_mulcache_compute_native_aarch64 2s 3s -33%
poly_reduce_native_aarch64 2s 3s -33%
poly_tobytes_native_aarch64 2s 3s -33%
poly_tobytes_native_x86_64 2s 2s +0%
rej_uniform_native_aarch64 2s 2s +0%
intt_native_x86_64 1s 2s -50%
keccak_f1600_x1_native_aarch64 1s 2s -50%
keccak_f1600_x4_native_aarch64_v8a_v84a_scalar_hybrid 1s 2s -50%
keccakf1600x4_xor_bytes_native 1s 3s -67%
kem_enc_derand 1s 4s -75%
mlk_ct_cmask_nonzero_u16 1s 1s +0%
mlk_ct_get_optblocker_i32 1s 2s -50%
mlk_ct_sel_int16 1s 1s +0%
mlk_indcpa_enc 1s 147s -99%
mlk_keccakf1600_permute 1s 2s -50%
mlk_keccakf1600_xor_bytes 1s 1s +0%
mlk_keccakf1600x4_xor_bytes_c 1s 2s -50%
mlk_montgomery_reduce 1s 4s -75%
mlk_poly_compress_d4_c 1s 1s +0%
mlk_poly_compress_d4_native 1s 2s -50%
mlk_poly_decompress_d4_native 1s 1s +0%
mlk_poly_frombytes 1s 1s +0%
mlk_poly_mulcache_compute 1s 1s +0%
mlk_poly_tobytes_native 1s 2s -50%
mlk_polyvec_basemul_acc_montgomery_cached 1s 2s -50%
mlk_polyvec_permute_bitrev_to_custom 1s 3s -67%
mlk_polyvec_reduce 1s 2s -50%
mlk_polyvec_tobytes 1s 2s -50%
mlk_polyvec_tomont 1s 2s -50%
mlk_scalar_compress_d1 1s 3s -67%
mlk_scalar_decompress_d10 1s 2s -50%
mlk_scalar_decompress_d11 1s 4s -75%
mlk_scalar_decompress_d5 1s 3s -67%
mlk_serialize_polyvec_16le 1s - new
mlk_shake128_absorb_once 1s 4s -75%
mlk_value_barrier_u32 1s 2s -50%
poly_decompress_d10_native_x86_64 1s 1s +0%
poly_tomont_native_x86_64 1s 2s -50%
polyvec_basemul_acc_montgomery_cached_k4_native_x86_64 1s 2s -50%
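
The Change column in the tables above appears to be the relative difference between the Current and Previous proof times. A minimal sketch of that arithmetic follows. It assumes rounding half away from zero to a whole percent, since the bot's actual rounding rule is not shown here, and `change_percent_rounded` is a name made up for illustration.

```c
#include <assert.h>

/* Relative change of a proof time in percent, rounded to a whole number.
 * Assumption: round half away from zero. The helper name is hypothetical
 * and not part of the CI tooling. */
static long change_percent_rounded(long current_s, long previous_s)
{
  double pct;
  assert(previous_s > 0); /* rows marked "new" have no previous time */
  pct = 100.0 * (double)(current_s - previous_s) / (double)previous_s;
  return (long)(pct >= 0.0 ? pct + 0.5 : pct - 0.5);
}
```

For example, `change_percent_rounded(7, 3)` reproduces the +133% reported for mlk_keccakf1600_permute_c. Per-proof times are whole seconds, so a short proof moving from 1s to 4s already reads as +300%.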

@hanno-becker hanno-becker added and removed the benchmark label (this PR should be benchmarked in CI) on Mar 13, 2026

@hanno-becker hanno-becker left a comment

Contributor

What's the purpose of 0a01cc4? Tests also serve as documentation, and using internal constants rather than public ones sets a wrong example.

If this is needed, can it be done in a preparatory PR? It seems unrelated to this PR.

@mkannwischer

mkannwischer commented Mar 13, 2026

Contributor Author

> What's the purpose of 0a01cc4? Tests also serve as documentation, and using internal constants rather than public ones sets a wrong example.
>
> If this is needed, can it be done in a preparatory PR? It seems unrelated to this PR.

The main question here is whether we want to add the new API to mlkem_native.h. If we don't, we can't test it in the standard test_mlkem.c, but we could test it in a separate test that includes kem.h instead of mlkem_native.h.
The purpose of 0a01cc4 was to get something working first, so we can discuss how we want to proceed.

I agree with you that we don't want to keep it as is right now.

@hanno-becker

Contributor

Seeing that you also observed a slowdown on x86, I wonder if we should treat the incremental API as internal by default and only expose it in the public API if some new option MLK_CONFIG_ENABLE_MLKEM_BRAID is set?
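
The opt-in gating suggested here could look roughly like the following header excerpt. This is only a sketch. The option name MLK_CONFIG_ENABLE_MLKEM_BRAID comes from the comment above, while every `sketch_*` and `SKETCH_*` name, including the placeholder prototypes, is invented for illustration and does not reflect the project's real declarations.

```c
/* Hypothetical excerpt of a public header. The incremental entry points
 * are only declared when the integrator opts in, for example by building
 * with -DMLK_CONFIG_ENABLE_MLKEM_BRAID. */
#if defined(MLK_CONFIG_ENABLE_MLKEM_BRAID)
#define SKETCH_INCREMENTAL_API_EXPOSED 1
/* Placeholder prototypes standing in for the two encapsulation phases. */
int sketch_enc_derand_u(void);
int sketch_enc_v(void);
#else
#define SKETCH_INCREMENTAL_API_EXPOSED 0
#endif

/* Lets a test or example decide whether to exercise the incremental path. */
static int sketch_incremental_api_exposed(void)
{
  return SKETCH_INCREMENTAL_API_EXPOSED;
}
```

With a guard of this shape, a default build sees no additional public symbols.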

@mkannwischer mkannwischer added and removed the benchmark label (this PR should be benchmarked in CI) on Mar 17, 2026

@oqs-bot oqs-bot left a comment

Contributor

⚠️ Performance Alert ⚠️

A possible performance regression was detected for benchmark 'Intel Xeon 4th gen (c7i)'.
The benchmark result of this commit is worse than the previous result by more than the threshold ratio of 1.03.

Benchmark suite Current: a4e4e31 Previous: 2bf8e59 Ratio
ML-KEM-1024 decaps 40620 cycles 39396 cycles 1.03
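
The Ratio column is Current divided by Previous, so the row above works out to 40620 / 39396 ≈ 1.031, which the table shows rounded to 1.03. A minimal sketch of the check, assuming the alert fires when the unrounded ratio is strictly greater than the threshold (the exact comparison used by the workflow is not shown here, and `exceeds_threshold` is a hypothetical name):

```c
#include <assert.h>
#include <stdbool.h>

/* Returns true when current/previous is strictly above the threshold.
 * Cycle counts are "smaller is better", so a larger ratio means slower.
 * Assumption: strict comparison on the unrounded ratio. */
static bool exceeds_threshold(unsigned long current_cycles,
                              unsigned long previous_cycles,
                              double threshold)
{
  assert(previous_cycles > 0);
  return (double)current_cycles / (double)previous_cycles > threshold;
}
```

Under that assumption, `exceeds_threshold(40620, 39396, 1.03)` is true even though the displayed ratio equals the threshold after rounding.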

This comment was automatically generated by workflow using github-action-benchmark.

@hanno-becker hanno-becker added and removed the benchmark label (this PR should be benchmarked in CI) on Mar 17, 2026
@mkannwischer mkannwischer force-pushed the incremental-enc-api branch 2 times, most recently from 4f0ace1 to 732adb5 on May 7, 2026 05:35
@mkannwischer mkannwischer added and removed the benchmark label (this PR should be benchmarked in CI) on May 7, 2026

@oqs-bot oqs-bot left a comment

Contributor

Mac Mini (M1, 2020) benchmarks

Details
Benchmark suite Current: 856b540 Previous: c0fb232 Ratio
ML-KEM-512 keypair 12320 cycles 12320 cycles 1
ML-KEM-512 encaps 15047 cycles 14999 cycles 1.00
ML-KEM-512 decaps 19599 cycles 19552 cycles 1.00
ML-KEM-768 keypair 21264 cycles 21264 cycles 1
ML-KEM-768 encaps 23880 cycles 23870 cycles 1.00
ML-KEM-768 decaps 30427 cycles 30414 cycles 1.00
ML-KEM-1024 keypair 30323 cycles 30327 cycles 1.00
ML-KEM-1024 encaps 34616 cycles 34573 cycles 1.00
ML-KEM-1024 decaps 44229 cycles 44193 cycles 1.00

@oqs-bot oqs-bot left a comment

Contributor

Arm Cortex-A55 (Snapdragon 888) benchmarks

Details
Benchmark suite Current: 856b540 Previous: c0fb232 Ratio
ML-KEM-512 keypair 59787 cycles 59728 cycles 1.00
ML-KEM-512 encaps 67447 cycles 67429 cycles 1.00
ML-KEM-512 decaps 86139 cycles 86125 cycles 1.00
ML-KEM-768 keypair 97408 cycles 97470 cycles 1.00
ML-KEM-768 encaps 110758 cycles 110896 cycles 1.00
ML-KEM-768 decaps 137357 cycles 138405 cycles 0.99
ML-KEM-1024 keypair 154780 cycles 154989 cycles 1.00
ML-KEM-1024 encaps 171299 cycles 172090 cycles 1.00
ML-KEM-1024 decaps 207123 cycles 209372 cycles 0.99

@oqs-bot oqs-bot left a comment

Contributor

Arm Cortex-A72 (Raspberry Pi 4) benchmarks

Details
Benchmark suite Current: 856b540 Previous: c0fb232 Ratio
ML-KEM-512 keypair 50693 cycles 51223 cycles 0.99
ML-KEM-512 encaps 58494 cycles 59547 cycles 0.98
ML-KEM-512 decaps 74583 cycles 75793 cycles 0.98
ML-KEM-768 keypair 85700 cycles 86166 cycles 0.99
ML-KEM-768 encaps 93550 cycles 94272 cycles 0.99
ML-KEM-768 decaps 117423 cycles 117661 cycles 1.00
ML-KEM-1024 keypair 130295 cycles 129800 cycles 1.00
ML-KEM-1024 encaps 141861 cycles 142914 cycles 0.99
ML-KEM-1024 decaps 173922 cycles 174806 cycles 0.99

@oqs-bot oqs-bot left a comment

Contributor

SpacemiT K1 8 (Banana Pi F3) benchmarks

Details
Benchmark suite Current: 856b540 Previous: c0fb232 Ratio
ML-KEM-512 keypair 155501 cycles 155510 cycles 1.00
ML-KEM-512 encaps 163235 cycles 163424 cycles 1.00
ML-KEM-512 decaps 206715 cycles 206679 cycles 1.00
ML-KEM-768 keypair 249857 cycles 249912 cycles 1.00
ML-KEM-768 encaps 270337 cycles 270404 cycles 1.00
ML-KEM-768 decaps 332607 cycles 332257 cycles 1.00
ML-KEM-1024 keypair 395706 cycles 396307 cycles 1.00
ML-KEM-1024 encaps 423713 cycles 423343 cycles 1.00
ML-KEM-1024 decaps 505216 cycles 507057 cycles 1.00

@mkannwischer mkannwischer force-pushed the incremental-enc-api branch from 1ce787b to a4e4e31 on May 7, 2026 06:44
@mkannwischer mkannwischer added and removed the benchmark label (this PR should be benchmarked in CI) on May 7, 2026
@mkannwischer mkannwischer force-pushed the incremental-enc-api branch from a4e4e31 to 856b540 on May 24, 2026 07:13
@mkannwischer mkannwischer added and removed the benchmark label (this PR should be benchmarked in CI) on May 24, 2026
@mkannwischer mkannwischer force-pushed the incremental-enc-api branch 3 times, most recently from ab8f4bf to 37d5620 on June 14, 2026 14:14
@rod-chapman rod-chapman force-pushed the incremental-enc-api branch 2 times, most recently from d488734 to a51ea38 on June 17, 2026 14:34
mkannwischer and others added 2 commits June 18, 2026 10:57
Split K-PKE.Encrypt and ML-KEM.Encaps into two phases (u and v) to
support protocols like ML-KEM Braid that transmit large KEM components
in parallel over bandwidth-constrained channels.

CPA level (indcpa):
- mlk_indcpa_enc_u: computes ct_u from ek_seed, outputs intermediate
  state (sp, epp, sp_cache)
- mlk_indcpa_enc_v: computes ct_v from ek_vector using intermediate
  state from enc_u

CCA KEM level (kem):
- mlk_kem_enc_derand_u: FO transform + enc_u, outputs shared secret
  and intermediate state; only needs ek_seed and H(pk)
- mlk_kem_enc_v: modulus check on ek_vector + enc_v; only needs
  ek_vector
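
The split described in the two lists above can be illustrated with a toy scalar analogue. This is a sketch for intuition only: it is neither ML-KEM nor secure, all `toy_*` names are hypothetical, and passing the message to the second phase as an explicit argument is an assumption, since the text above does not say how it reaches enc_v. The first phase touches only `a` (standing in for what is derived from ek_seed), and the second only `t` (standing in for ek_vector) plus the carried state.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_Q 3329 /* the ML-KEM modulus, reused here as a convenient prime */

/* State handed from the u-phase to the v-phase. */
typedef struct
{
  int32_t r;  /* ephemeral secret, toy analogue of sp */
  int32_t e2; /* noise for the v-component, toy analogue of epp */
} toy_enc_state;

/* Reduce to the representative in [0, TOY_Q). */
static int32_t toy_mod_q(int64_t x)
{
  int64_t m = x % TOY_Q;
  return (int32_t)(m < 0 ? m + TOY_Q : m);
}

/* One-shot encryption: needs both public-key parts (a, t) at once. */
static void toy_enc(int32_t *u, int32_t *v, int32_t a, int32_t t,
                    int32_t msg, int32_t r, int32_t e1, int32_t e2)
{
  *u = toy_mod_q((int64_t)a * r + e1);
  *v = toy_mod_q((int64_t)t * r + e2 + msg);
}

/* Phase 1: computes u from a alone and saves what phase 2 will need. */
static void toy_enc_u(int32_t *u, toy_enc_state *st, int32_t a, int32_t r,
                      int32_t e1, int32_t e2)
{
  *u = toy_mod_q((int64_t)a * r + e1);
  st->r = r;
  st->e2 = e2;
}

/* Phase 2: computes v from t and the saved state; a is not needed. */
static void toy_enc_v(int32_t *v, const toy_enc_state *st, int32_t t,
                      int32_t msg)
{
  assert(st->e2 >= -2 && st->e2 <= 2); /* toy noise bound on the state */
  *v = toy_mod_q((int64_t)t * st->r + st->e2 + msg);
}

/* Returns 1 when the split path reproduces the one-shot ciphertext. */
static int toy_paths_agree(int32_t a, int32_t t, int32_t msg, int32_t r,
                           int32_t e1, int32_t e2)
{
  int32_t u1, v1, u2, v2;
  toy_enc_state st;
  toy_enc(&u1, &v1, a, t, msg, r, e1, e2);
  toy_enc_u(&u2, &st, a, r, e1, e2);
  toy_enc_v(&v2, &st, t, msg);
  return u1 == u2 && v1 == v2;
}
```

The final helper mirrors the kind of check a test can make: both paths must yield the same (u, v) pair for the same inputs and randomness.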

epp is serialized as 4-bit nibbles (each coefficient x is stored as
ETA2 - x), which gives a natural coefficient bound on deserialization;
sp is serialized as 16-bit little-endian values. The shared sp mulcache
is computed once and threaded through enc_u/enc_v.
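
A minimal sketch of the two encodings described above, under stated assumptions. The functions in the PR (listed in the CBMC results as mlk_serialize_epp, mlk_deserialize_epp and mlk_serialize_polyvec_16le) are not shown here, so the `sketch_*` names, the low-nibble-first order and the signatures below are all hypothetical. The constants N = 256 and ETA2 = 2 are the standard ML-KEM values, which gives the 128-byte size mentioned in the PR description.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_N 256
#define SKETCH_ETA2 2
#define SKETCH_EPP_BYTES (SKETCH_N / 2)

/* Pack two coefficients per byte, each stored as the nibble ETA2 - x.
 * Assumption: the even-indexed coefficient goes into the low nibble. */
static void sketch_epp_serialize(uint8_t out[SKETCH_EPP_BYTES],
                                 const int16_t epp[SKETCH_N])
{
  size_t i;
  for (i = 0; i < SKETCH_N / 2; i++)
  {
    int16_t c0 = epp[2 * i];
    int16_t c1 = epp[2 * i + 1];
    assert(c0 >= -SKETCH_ETA2 && c0 <= SKETCH_ETA2);
    assert(c1 >= -SKETCH_ETA2 && c1 <= SKETCH_ETA2);
    out[i] = (uint8_t)((SKETCH_ETA2 - c0) | ((SKETCH_ETA2 - c1) << 4));
  }
}

/* Any input byte yields coefficients in [ETA2 - 15, ETA2], so the bound
 * holds without a precondition on the serialized data. */
static void sketch_epp_deserialize(int16_t epp[SKETCH_N],
                                   const uint8_t in[SKETCH_EPP_BYTES])
{
  size_t i;
  for (i = 0; i < SKETCH_N / 2; i++)
  {
    epp[2 * i] = (int16_t)(SKETCH_ETA2 - (in[i] & 0x0F));
    epp[2 * i + 1] = (int16_t)(SKETCH_ETA2 - (in[i] >> 4));
  }
}

/* 16-bit little-endian store/load for a single signed coefficient. */
static void sketch_store_i16_le(uint8_t out[2], int16_t x)
{
  uint16_t u = (uint16_t)x;
  out[0] = (uint8_t)(u & 0xFF);
  out[1] = (uint8_t)(u >> 8);
}

static int16_t sketch_load_i16_le(const uint8_t in[2])
{
  uint16_t u = (uint16_t)(in[0] | ((uint16_t)in[1] << 8));
  /* Sign-convert without relying on implementation-defined casts. */
  return (int16_t)((int32_t)(u ^ 0x8000u) - 0x8000);
}
```

The nibble form cannot represent a coefficient above ETA2 or below ETA2 - 15, which is one way to read the "natural coefficient bound". The 16-bit form can carry any int16_t value.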

Includes CBMC contracts and proofs for the new functions, the
MLK_CONFIG_ENABLE_MLKEM_BRAID configuration option exposing the API,
recomputed peak stack consumption values, and OpenTitan work buffer
size updates.

The test verifies that the incremental API produces the same
ciphertexts and shared secrets as the standard API across all three
parameter sets.

Co-authored-by: Hanno Becker <beckphan@amazon.co.uk>
Signed-off-by: Matthias J. Kannwischer <matthias@zerorisc.com>
Signed-off-by: Rod Chapman <rodchap@amazon.com>
@rod-chapman rod-chapman force-pushed the incremental-enc-api branch from a1672a8 to e9ea411 on June 18, 2026 09:58
Labels

benchmark this PR should be benchmarked in CI

4 participants