CI: Add native ppc64le (POWER10) CI #1406
hanno-becker
left a comment
We should use MLK_FORCE_PPC64LE to double-check that we're on a PPC64LE system.
Also, can we add this to the benchmarking CI? We'll need to see if it's stable enough, but until we know better, we may as well try?
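The kind of double-check suggested above could look roughly like the following. This is only a sketch under stated assumptions: the `DEMO_*` names are hypothetical stand-ins, and how `MLK_FORCE_PPC64LE` is actually wired up in the repository is not shown in this thread. The idea is that the CI job passes a "force" define, and the build fails at compile time if the compiler's predefined macros do not indicate a little-endian 64-bit POWER target.

```c
/* Hypothetical stand-in for a library's platform-detection macro.
 * GCC and Clang predefine __powerpc64__ on 64-bit POWER targets and
 * __BYTE_ORDER__ / __ORDER_LITTLE_ENDIAN__ for the target byte order. */
#if defined(__powerpc64__) && defined(__BYTE_ORDER__) && \
    defined(__ORDER_LITTLE_ENDIAN__) &&                  \
    (__BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__)
#define DEMO_SYS_PPC64LE
#endif

/* Assumption: the CI job builds with -DDEMO_FORCE_PPC64LE. If detection
 * disagrees, stop the build instead of silently testing another target. */
#if defined(DEMO_FORCE_PPC64LE) && !defined(DEMO_SYS_PPC64LE)
#error "DEMO_FORCE_PPC64LE is set, but this does not look like a PPC64LE target."
#endif

/* Exposes the compile-time detection result: 1 on PPC64LE, 0 otherwise. */
static int demo_is_ppc64le(void)
{
#if defined(DEMO_SYS_PPC64LE)
  return 1;
#else
  return 0;
#endif
}
```

With a guard like this, a misconfigured runner label or a cross-compiling toolchain would surface as a build error rather than as a passing test run on the wrong architecture.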
Mac Mini (M1, 2020) benchmarks

| Benchmark suite | Current: ae30f98 | Previous: 19b5b63 | Ratio |
|---|---|---|---|
| ML-KEM-512 keypair | 12324 cycles | 12324 cycles | 1 |
| ML-KEM-512 encaps | 14983 cycles | 14983 cycles | 1 |
| ML-KEM-512 decaps | 19543 cycles | 19542 cycles | 1.00 |
| ML-KEM-768 keypair | 21382 cycles | 21390 cycles | 1.00 |
| ML-KEM-768 encaps | 23949 cycles | 23949 cycles | 1 |
| ML-KEM-768 decaps | 30519 cycles | 30520 cycles | 1.00 |
| ML-KEM-1024 keypair | 30395 cycles | 30395 cycles | 1 |
| ML-KEM-1024 encaps | 34601 cycles | 34601 cycles | 1 |
| ML-KEM-1024 decaps | 44203 cycles | 44202 cycles | 1.00 |
This comment was automatically generated by workflow using github-action-benchmark.
Intel Xeon 4th gen (c7i)

| Benchmark suite | Current: ae30f98 | Previous: 19b5b63 | Ratio |
|---|---|---|---|
| ML-KEM-512 keypair | 9733 cycles | 9647 cycles | 1.01 |
| ML-KEM-512 encaps | 11069 cycles | 11147 cycles | 0.99 |
| ML-KEM-512 decaps | 15222 cycles | 15088 cycles | 1.01 |
| ML-KEM-768 keypair | 16586 cycles | 16678 cycles | 0.99 |
| ML-KEM-768 encaps | 17800 cycles | 17727 cycles | 1.00 |
| ML-KEM-768 decaps | 23278 cycles | 23386 cycles | 1.00 |
| ML-KEM-1024 keypair | 22462 cycles | 22298 cycles | 1.01 |
| ML-KEM-1024 encaps | 24320 cycles | 24452 cycles | 0.99 |
| ML-KEM-1024 decaps | 31888 cycles | 32044 cycles | 1.00 |
This comment was automatically generated by workflow using github-action-benchmark.
Intel Xeon 3rd gen (c6i)

| Benchmark suite | Current: ae30f98 | Previous: 19b5b63 | Ratio |
|---|---|---|---|
| ML-KEM-512 keypair | 16334 cycles | 16411 cycles | 1.00 |
| ML-KEM-512 encaps | 18587 cycles | 18543 cycles | 1.00 |
| ML-KEM-512 decaps | 25042 cycles | 25090 cycles | 1.00 |
| ML-KEM-768 keypair | 27685 cycles | 27726 cycles | 1.00 |
| ML-KEM-768 encaps | 29897 cycles | 29813 cycles | 1.00 |
| ML-KEM-768 decaps | 39251 cycles | 39356 cycles | 1.00 |
| ML-KEM-1024 keypair | 37945 cycles | 37784 cycles | 1.00 |
| ML-KEM-1024 encaps | 40464 cycles | 40560 cycles | 1.00 |
| ML-KEM-1024 decaps | 54150 cycles | 54259 cycles | 1.00 |
This comment was automatically generated by workflow using github-action-benchmark.
AMD EPYC 3rd gen (c6a)

| Benchmark suite | Current: ae30f98 | Previous: eceae93 | Ratio |
|---|---|---|---|
| ML-KEM-512 keypair | 16685 cycles | 16518 cycles | 1.01 |
| ML-KEM-512 encaps | 18449 cycles | 18270 cycles | 1.01 |
| ML-KEM-512 decaps | 23842 cycles | 23843 cycles | 1.00 |
| ML-KEM-768 keypair | 28589 cycles | 28904 cycles | 0.99 |
| ML-KEM-768 encaps | 29807 cycles | 30040 cycles | 0.99 |
| ML-KEM-768 decaps | 37782 cycles | 37756 cycles | 1.00 |
| ML-KEM-1024 keypair | 41213 cycles | 41400 cycles | 1.00 |
| ML-KEM-1024 encaps | 43496 cycles | 43646 cycles | 1.00 |
| ML-KEM-1024 decaps | 53964 cycles | 54140 cycles | 1.00 |
This comment was automatically generated by workflow using github-action-benchmark.
Intel Xeon 4th gen (c7i) (no-opt)

| Benchmark suite | Current: ae30f98 | Previous: 19b5b63 | Ratio |
|---|---|---|---|
| ML-KEM-512 keypair | 29065 cycles | 28994 cycles | 1.00 |
| ML-KEM-512 encaps | 35698 cycles | 35694 cycles | 1.00 |
| ML-KEM-512 decaps | 45648 cycles | 45680 cycles | 1.00 |
| ML-KEM-768 keypair | 47260 cycles | 47425 cycles | 1.00 |
| ML-KEM-768 encaps | 56634 cycles | 56896 cycles | 1.00 |
| ML-KEM-768 decaps | 70145 cycles | 70110 cycles | 1.00 |
| ML-KEM-1024 keypair | 71636 cycles | 71595 cycles | 1.00 |
| ML-KEM-1024 encaps | 83879 cycles | 83824 cycles | 1.00 |
| ML-KEM-1024 decaps | 99708 cycles | 100216 cycles | 0.99 |
This comment was automatically generated by workflow using github-action-benchmark.
AMD EPYC 4th gen (c7a)

| Benchmark suite | Current: ae30f98 | Previous: 19b5b63 | Ratio |
|---|---|---|---|
| ML-KEM-512 keypair | 12037 cycles | 12010 cycles | 1.00 |
| ML-KEM-512 encaps | 13163 cycles | 13175 cycles | 1.00 |
| ML-KEM-512 decaps | 18056 cycles | 18036 cycles | 1.00 |
| ML-KEM-768 keypair | 20735 cycles | 20671 cycles | 1.00 |
| ML-KEM-768 encaps | 21777 cycles | 21711 cycles | 1.00 |
| ML-KEM-768 decaps | 28806 cycles | 28753 cycles | 1.00 |
| ML-KEM-1024 keypair | 27881 cycles | 27862 cycles | 1.00 |
| ML-KEM-1024 encaps | 30004 cycles | 29922 cycles | 1.00 |
| ML-KEM-1024 decaps | 39505 cycles | 39445 cycles | 1.00 |
This comment was automatically generated by workflow using github-action-benchmark.
Graviton4

| Benchmark suite | Current: ae30f98 | Previous: 19b5b63 | Ratio |
|---|---|---|---|
| ML-KEM-512 keypair | 17668 cycles | 17686 cycles | 1.00 |
| ML-KEM-512 encaps | 20665 cycles | 20699 cycles | 1.00 |
| ML-KEM-512 decaps | 27117 cycles | 27115 cycles | 1.00 |
| ML-KEM-768 keypair | 30282 cycles | 30297 cycles | 1.00 |
| ML-KEM-768 encaps | 33018 cycles | 32952 cycles | 1.00 |
| ML-KEM-768 decaps | 42203 cycles | 42223 cycles | 1.00 |
| ML-KEM-1024 keypair | 43883 cycles | 43897 cycles | 1.00 |
| ML-KEM-1024 encaps | 48925 cycles | 48923 cycles | 1.00 |
| ML-KEM-1024 decaps | 61554 cycles | 61512 cycles | 1.00 |
This comment was automatically generated by workflow using github-action-benchmark.
Graviton3

| Benchmark suite | Current: ae30f98 | Previous: 19b5b63 | Ratio |
|---|---|---|---|
| ML-KEM-512 keypair | 18717 cycles | 18740 cycles | 1.00 |
| ML-KEM-512 encaps | 22009 cycles | 22057 cycles | 1.00 |
| ML-KEM-512 decaps | 29030 cycles | 29063 cycles | 1.00 |
| ML-KEM-768 keypair | 31973 cycles | 32006 cycles | 1.00 |
| ML-KEM-768 encaps | 35079 cycles | 35019 cycles | 1.00 |
| ML-KEM-768 decaps | 45055 cycles | 45102 cycles | 1.00 |
| ML-KEM-1024 keypair | 46367 cycles | 46361 cycles | 1.00 |
| ML-KEM-1024 encaps | 51714 cycles | 51734 cycles | 1.00 |
| ML-KEM-1024 decaps | 65244 cycles | 65265 cycles | 1.00 |
This comment was automatically generated by workflow using github-action-benchmark.
Graviton4 (no-opt)

| Benchmark suite | Current: ae30f98 | Previous: 19b5b63 | Ratio |
|---|---|---|---|
| ML-KEM-512 keypair | 35496 cycles | 35503 cycles | 1.00 |
| ML-KEM-512 encaps | 40366 cycles | 40344 cycles | 1.00 |
| ML-KEM-512 decaps | 51535 cycles | 51488 cycles | 1.00 |
| ML-KEM-768 keypair | 58670 cycles | 59133 cycles | 0.99 |
| ML-KEM-768 encaps | 65356 cycles | 66118 cycles | 0.99 |
| ML-KEM-768 decaps | 79644 cycles | 80002 cycles | 1.00 |
| ML-KEM-1024 keypair | 87929 cycles | 87867 cycles | 1.00 |
| ML-KEM-1024 encaps | 96465 cycles | 96415 cycles | 1.00 |
| ML-KEM-1024 decaps | 115853 cycles | 115717 cycles | 1.00 |
This comment was automatically generated by workflow using github-action-benchmark.
Intel Xeon 3rd gen (c6i) (no-opt)

| Benchmark suite | Current: ae30f98 | Previous: 19b5b63 | Ratio |
|---|---|---|---|
| ML-KEM-512 keypair | 45552 cycles | 45581 cycles | 1.00 |
| ML-KEM-512 encaps | 54414 cycles | 54457 cycles | 1.00 |
| ML-KEM-512 decaps | 69811 cycles | 69834 cycles | 1.00 |
| ML-KEM-768 keypair | 75914 cycles | 75972 cycles | 1.00 |
| ML-KEM-768 encaps | 86930 cycles | 86969 cycles | 1.00 |
| ML-KEM-768 decaps | 107096 cycles | 107117 cycles | 1.00 |
| ML-KEM-1024 keypair | 111924 cycles | 111830 cycles | 1.00 |
| ML-KEM-1024 encaps | 125365 cycles | 125372 cycles | 1.00 |
| ML-KEM-1024 decaps | 151908 cycles | 151473 cycles | 1.00 |
This comment was automatically generated by workflow using github-action-benchmark.
Graviton2

| Benchmark suite | Current: ae30f98 | Previous: 19b5b63 | Ratio |
|---|---|---|---|
| ML-KEM-512 keypair | 28383 cycles | 28320 cycles | 1.00 |
| ML-KEM-512 encaps | 34114 cycles | 34170 cycles | 1.00 |
| ML-KEM-512 decaps | 44452 cycles | 44402 cycles | 1.00 |
| ML-KEM-768 keypair | 48329 cycles | 48374 cycles | 1.00 |
| ML-KEM-768 encaps | 54297 cycles | 54150 cycles | 1.00 |
| ML-KEM-768 decaps | 68643 cycles | 68678 cycles | 1.00 |
| ML-KEM-1024 keypair | 70481 cycles | 70563 cycles | 1.00 |
| ML-KEM-1024 encaps | 78864 cycles | 79017 cycles | 1.00 |
| ML-KEM-1024 decaps | 98636 cycles | 98700 cycles | 1.00 |
This comment was automatically generated by workflow using github-action-benchmark.
AMD EPYC 3rd gen (c6a) (no-opt)

| Benchmark suite | Current: ae30f98 | Previous: eceae93 | Ratio |
|---|---|---|---|
| ML-KEM-512 keypair | 40086 cycles | 40021 cycles | 1.00 |
| ML-KEM-512 encaps | 48294 cycles | 48242 cycles | 1.00 |
| ML-KEM-512 decaps | 62762 cycles | 62605 cycles | 1.00 |
| ML-KEM-768 keypair | 65101 cycles | 65017 cycles | 1.00 |
| ML-KEM-768 encaps | 75330 cycles | 75419 cycles | 1.00 |
| ML-KEM-768 decaps | 94128 cycles | 94069 cycles | 1.00 |
| ML-KEM-1024 keypair | 95613 cycles | 95222 cycles | 1.00 |
| ML-KEM-1024 encaps | 109532 cycles | 109493 cycles | 1.00 |
| ML-KEM-1024 decaps | 132460 cycles | 132453 cycles | 1.00 |
This comment was automatically generated by workflow using github-action-benchmark.
AMD EPYC 4th gen (c7a) (no-opt)

| Benchmark suite | Current: ae30f98 | Previous: 19b5b63 | Ratio |
|---|---|---|---|
| ML-KEM-512 keypair | 36191 cycles | 36160 cycles | 1.00 |
| ML-KEM-512 encaps | 42603 cycles | 42572 cycles | 1.00 |
| ML-KEM-512 decaps | 55409 cycles | 55383 cycles | 1.00 |
| ML-KEM-768 keypair | 59524 cycles | 59403 cycles | 1.00 |
| ML-KEM-768 encaps | 67820 cycles | 67604 cycles | 1.00 |
| ML-KEM-768 decaps | 85029 cycles | 84820 cycles | 1.00 |
| ML-KEM-1024 keypair | 88193 cycles | 87986 cycles | 1.00 |
| ML-KEM-1024 encaps | 98301 cycles | 98017 cycles | 1.00 |
| ML-KEM-1024 decaps | 120351 cycles | 119995 cycles | 1.00 |
This comment was automatically generated by workflow using github-action-benchmark.
Graviton3 (no-opt)

| Benchmark suite | Current: ae30f98 | Previous: 19b5b63 | Ratio |
|---|---|---|---|
| ML-KEM-512 keypair | 38930 cycles | 38936 cycles | 1.00 |
| ML-KEM-512 encaps | 44714 cycles | 44727 cycles | 1.00 |
| ML-KEM-512 decaps | 56852 cycles | 56853 cycles | 1.00 |
| ML-KEM-768 keypair | 64701 cycles | 65598 cycles | 0.99 |
| ML-KEM-768 encaps | 71904 cycles | 72815 cycles | 0.99 |
| ML-KEM-768 decaps | 87706 cycles | 88085 cycles | 1.00 |
| ML-KEM-1024 keypair | 96063 cycles | 96009 cycles | 1.00 |
| ML-KEM-1024 encaps | 106038 cycles | 105989 cycles | 1.00 |
| ML-KEM-1024 decaps | 126745 cycles | 126642 cycles | 1.00 |
This comment was automatically generated by workflow using github-action-benchmark.
Graviton2 (no-opt)

| Benchmark suite | Current: ae30f98 | Previous: 19b5b63 | Ratio |
|---|---|---|---|
| ML-KEM-512 keypair | 59446 cycles | 59457 cycles | 1.00 |
| ML-KEM-512 encaps | 68618 cycles | 68767 cycles | 1.00 |
| ML-KEM-512 decaps | 87322 cycles | 87462 cycles | 1.00 |
| ML-KEM-768 keypair | 99689 cycles | 99400 cycles | 1.00 |
| ML-KEM-768 encaps | 111402 cycles | 111427 cycles | 1.00 |
| ML-KEM-768 decaps | 136176 cycles | 136336 cycles | 1.00 |
| ML-KEM-1024 keypair | 149129 cycles | 148506 cycles | 1.00 |
| ML-KEM-1024 encaps | 164521 cycles | 164520 cycles | 1.00 |
| ML-KEM-1024 decaps | 196322 cycles | 195526 cycles | 1.00 |
This comment was automatically generated by workflow using github-action-benchmark.
Arm Cortex-A76 (Raspberry Pi 5) benchmarks

| Benchmark suite | Current: ae30f98 | Previous: 19b5b63 | Ratio |
|---|---|---|---|
| ML-KEM-512 keypair | 28331 cycles | 28381 cycles | 1.00 |
| ML-KEM-512 encaps | 34152 cycles | 34101 cycles | 1.00 |
| ML-KEM-512 decaps | 44396 cycles | 44447 cycles | 1.00 |
| ML-KEM-768 keypair | 48366 cycles | 48324 cycles | 1.00 |
| ML-KEM-768 encaps | 54154 cycles | 54283 cycles | 1.00 |
| ML-KEM-768 decaps | 68666 cycles | 68625 cycles | 1.00 |
| ML-KEM-1024 keypair | 70528 cycles | 70588 cycles | 1.00 |
| ML-KEM-1024 encaps | 78931 cycles | 78886 cycles | 1.00 |
| ML-KEM-1024 decaps | 98638 cycles | 98660 cycles | 1.00 |
This comment was automatically generated by workflow using github-action-benchmark.
oqs-bot
left a comment
Arm Cortex-A72 (Raspberry Pi 4) benchmarks

| Benchmark suite | Current: ae30f98 | Previous: 19b5b63 | Ratio |
|---|---|---|---|
| ML-KEM-512 keypair | 50672 cycles | 50719 cycles | 1.00 |
| ML-KEM-512 encaps | 58483 cycles | 58738 cycles | 1.00 |
| ML-KEM-512 decaps | 75183 cycles | 74859 cycles | 1.00 |
| ML-KEM-768 keypair | 87334 cycles | 86523 cycles | 1.01 |
| ML-KEM-768 encaps | 95805 cycles | 94395 cycles | 1.01 |
| ML-KEM-768 decaps | 117947 cycles | 118457 cycles | 1.00 |
| ML-KEM-1024 keypair | 130676 cycles | 130138 cycles | 1.00 |
| ML-KEM-1024 encaps | 142553 cycles | 141998 cycles | 1.00 |
| ML-KEM-1024 decaps | 173516 cycles | 174177 cycles | 1.00 |
This comment was automatically generated by workflow using github-action-benchmark.
oqs-bot
left a comment
SpacemiT K1 8 (Banana Pi F3) benchmarks

| Benchmark suite | Current: ae30f98 | Previous: 19b5b63 | Ratio |
|---|---|---|---|
| ML-KEM-512 keypair | 155038 cycles | 155039 cycles | 1.00 |
| ML-KEM-512 encaps | 163114 cycles | 163167 cycles | 1.00 |
| ML-KEM-512 decaps | 206291 cycles | 206378 cycles | 1.00 |
| ML-KEM-768 keypair | 260837 cycles | 260805 cycles | 1.00 |
| ML-KEM-768 encaps | 275645 cycles | 275601 cycles | 1.00 |
| ML-KEM-768 decaps | 337672 cycles | 337639 cycles | 1.00 |
| ML-KEM-1024 keypair | 395307 cycles | 395346 cycles | 1.00 |
| ML-KEM-1024 encaps | 422195 cycles | 422233 cycles | 1.00 |
| ML-KEM-1024 decaps | 506969 cycles | 507110 cycles | 1.00 |
This comment was automatically generated by workflow using github-action-benchmark.
oqs-bot
left a comment
Arm Cortex-A55 (Snapdragon 888) benchmarks

| Benchmark suite | Current: ae30f98 | Previous: 19b5b63 | Ratio |
|---|---|---|---|
| ML-KEM-512 keypair | 59796 cycles | 59803 cycles | 1.00 |
| ML-KEM-512 encaps | 67186 cycles | 67251 cycles | 1.00 |
| ML-KEM-512 decaps | 85747 cycles | 85767 cycles | 1.00 |
| ML-KEM-768 keypair | 101993 cycles | 102021 cycles | 1.00 |
| ML-KEM-768 encaps | 113903 cycles | 113473 cycles | 1.00 |
| ML-KEM-768 decaps | 142240 cycles | 141097 cycles | 1.01 |
| ML-KEM-1024 keypair | 155122 cycles | 155020 cycles | 1.00 |
| ML-KEM-1024 encaps | 172159 cycles | 177600 cycles | 0.97 |
| ML-KEM-1024 decaps | 211303 cycles | 210243 cycles | 1.01 |
This comment was automatically generated by workflow using github-action-benchmark.
Great idea. I asked @bhess to help with this and we got it to work. I have no idea how meaningful and stable the benchmarks are, but we can find that out over time.
Courtesy of Basil Hess/IBM, we now have a self-hosted POWER10 runner. This commit adds functional tests to CI. I had attempted to add it to the existing "kat_tests", but the sanitizer tests don't work properly on that platform. Note that opt tests are already enabled so that #1193 can simply be rebased on top of this.
Signed-off-by: Matthias J. Kannwischer <matthias@kannwischer.eu>
hanno-becker
left a comment
Many thanks to @bhess for providing this, and to @mkannwischer for getting it integrated.
bhess
left a comment
Thanks @mkannwischer and @hanno-becker for your work integrating and reviewing this!
Courtesy of @bhess/IBM, we now have a self-hosted POWER10 runner.
This commit adds functional tests to CI.
I had attempted to add it to the existing "kat_tests", but the sanitizer tests don't work properly on that platform.
Note that opt tests are already enabled so that #1193 can simply be rebased on top of this.