Skip to content

Commit 0012a8f

Browse files
committed
proofs: added AutoCorrode/C proofs of a subset of poly.c functionality
AutoCorrode is a separation logic framework developed by AWS for use in the verification of the Nitro Isolation Engine. Originally targetting Rust, a new frontend for C11 has recently been added using an established Isabelle library, Isabelle/C, for parsing. C code can be embedded both in Isabelle theory files for reasoning about, or code can be loaded from .c files, parsed, and processed into Isabelle definitions automatically (post preprocessing with cpp). The C frontend is still a work-in-progress, but has sufficiently advanced enough that material from the poly.c file from mlkem-native can now be verified using an abstract specification and weakest-precondition style reasoning. In this initial commit: We define a pipeline which takes MLKEM .c and .h files and autogenerates AutoCorrode/C definitions. Setting this up correctly requires some care: C11 files must first be preprocessed using the C preprocessor, the relevant definitions and supporting types that we are interested in verifying must be filtered out from the resulting generated code, and any feature that we cannot support (e.g. __darwin_builtin types, when running on an Apple machine) must be filtered out. To do this we use a series of "manifest" files which identify the functions we are interested in working with, which is processed by dedicated Python script files and the `micro_c_translate` command in Isabelle. Using the pipeline above, we extract the `sub`, `add`, and `barrett_reduction` functions and supporting types from `poly.{c, h}` and the supporting `verify.{c, h}` file, specify them, and verify them with respect to a series of pre/postcondition specifications using Weakest Precondition-style reasoning. To do this, we define a machine "locale" or "interface" setting up the memory model of the AutoCorrode/C library. This presents a bit of a chicken-and-egg situation, as the AutoCorrode/C memory model requires knowledge of any types that will be mutated, however translated C functions must be defined within the scope of the machine model in order to constrain the types of memory addresses and memory values. As a result of this, extraction of C code proceeds in two stages, with the first stage extracting types alone, prior to the definition of the machine and used to define it, which is then used further when extracting C functions. Note for the time being there is no "inhabitant" or "model" of this machine model, which is kept axiomatic. As a consequence there is a small chance that this model cannot actually be realized by an explicit construction. Providing a model is future work, and will be entirely straightforward. Signed-off-by: Dominic Mulligan <dominic.p.mulligan@gmail.com>
1 parent 4e91a31 commit 0012a8f

25 files changed

Lines changed: 1483 additions & 0 deletions

.github/workflows/all.yml

Lines changed: 8 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -51,6 +51,14 @@ jobs:
5151
needs: [ base, nix ]
5252
uses: ./.github/workflows/cbmc.yml
5353
secrets: inherit
54+
isabelle:
55+
name: Isabelle
56+
permissions:
57+
contents: 'read'
58+
id-token: 'write'
59+
needs: [ base ]
60+
uses: ./.github/workflows/isabelle.yml
61+
secrets: inherit
5462
oqs_integration:
5563
name: libOQS
5664
permissions:

.github/workflows/isabelle.yml

Lines changed: 35 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,35 @@
1+
# Copyright (c) The mlkem-native project authors
2+
# SPDX-License-Identifier: Apache-2.0 OR ISC OR MIT
3+
4+
name: Isabelle
5+
permissions:
6+
contents: read
7+
on:
8+
workflow_call:
9+
workflow_dispatch:
10+
11+
jobs:
12+
isabelle-proofs:
13+
name: proofs/isabelle
14+
runs-on: ubuntu-latest
15+
container:
16+
image: makarius/isabelle:Isabelle2025-2
17+
options: --user root
18+
19+
steps:
20+
- name: Checkout mlkem-native
21+
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
22+
with:
23+
submodules: recursive
24+
25+
- name: Checkout AFP mirror
26+
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
27+
with:
28+
repository: isabelle-prover/mirror-afp-2025-1
29+
path: AutoCorrode/dependencies/afp
30+
31+
- name: Install build tooling
32+
run: apt-get update -qq && apt-get install -y -qq make > /dev/null
33+
34+
- name: Build Isabelle proofs
35+
run: make -C proofs/isabelle build ISABELLE_HOME=/home/isabelle/Isabelle/bin

.gitmodules

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,3 @@
1+
[submodule "AutoCorrode"]
2+
path = AutoCorrode
3+
url = https://github.com/awslabs/AutoCorrode

AutoCorrode

Submodule AutoCorrode added at ecb84dd

BUILDING.md

Lines changed: 21 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -102,3 +102,24 @@ See [CONTRIBUTING.md](CONTRIBUTING.md) for instructions on how to setup and use
102102
### Running the HOL-Light proofs
103103

104104
Once you are in the `nix` shell or have all tools setup by hand, use `./scripts/tests hol_light` (or just `tests hol_light` in the `nix` shell) to re-check the HOL-Light proofs. Note that depending on the function, they will take a long time. See `tests hol_light --help` for details on the command line options, and [proofs/hol_light](proofs/hol_light) for more details on the HOL-Light proofs in general.
105+
106+
## Isabelle (AutoCorrode)
107+
108+
### Prerequisites
109+
110+
To run the Isabelle proofs in [`proofs/isabelle`](proofs/isabelle), you need Isabelle2025-2 and the AFP mirror used by AutoCorrode (containing `Word_Lib` and `Isabelle_C`).
111+
112+
### Running the Isabelle proofs
113+
114+
From the repository root:
115+
116+
```bash
117+
make -C proofs/isabelle setup-afp # optional helper to clone mirror-afp-2025-1
118+
make -C proofs/isabelle build
119+
```
120+
121+
To open jEdit with all required session directories configured:
122+
123+
```bash
124+
make -C proofs/isabelle jedit
125+
```

README.md

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -66,6 +66,8 @@ interactive theorem prover and the [s2n-bignum](https://github.com/awslabs/s2n-b
6666
relevant parts of the Arm and x86 architectures). See [proofs/hol_light](proofs/hol_light) for details.
6767
The x86_64 FIPS-202 (Keccak) backend is still in C intrinsics and not yet covered by proof.
6868

69+
Selected C/functional components are also verified in [Isabelle/HOL](https://isabelle.in.tum.de/) using AutoCorrode. See [proofs/isabelle](proofs/isabelle).
70+
6971
## Security
7072

7173
All assembly in mlkem-native is constant-time in the sense that it is free of secret-dependent control flow, memory access,

proofs/README.md

Lines changed: 4 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -11,3 +11,7 @@ We use the [C Bounded Model Checker (CBMC)](https://github.com/diffblue/cbmc) to
1111
## Assembly verification: HOL-Light
1212

1313
We use the [HOL-Light](https://github.com/jrh13/hol-light) interactive theorem prover alongside the verification infrastructure from [s2n-bignum](https://github.com/awslabs/s2n-bignum) to show the functional correctness of all AArch64 assembly and all x86_64 ML-KEM arithmetic assembly in mlkem-native at the object-code level. See [proofs/hol_light](hol_light).
14+
15+
## C/functional verification in Isabelle (AutoCorrode)
16+
17+
We use [Isabelle/HOL](https://isabelle.in.tum.de/) together with [AutoCorrode](https://github.com/awslabs/AutoCorrode) to translate and verify selected C implementations and their refinement lemmas. See [proofs/isabelle](isabelle).

proofs/isabelle/Common.thy

Lines changed: 185 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,185 @@
1+
theory Common
2+
imports MLKEM_Poly_Definitions "Micro_C_Examples.C_While_Examples"
3+
begin
4+
5+
section \<open>Abstract polynomial arithmetic\<close>
6+
7+
text \<open>We model mlkem-native polynomials abstractly as fixed-size coefficient
8+
lists over the integers. This gives a clean mathematical specification
9+
independent of machine word sizes.\<close>
10+
11+
abbreviation MLKEM_N :: nat where
12+
\<open>MLKEM_N \<equiv> 256\<close>
13+
14+
type_synonym int_poly = \<open>int list\<close>
15+
16+
definition poly_add_int :: \<open>int_poly \<Rightarrow> int_poly \<Rightarrow> int_poly\<close> where
17+
\<open>poly_add_int ps qs = map2 (+) ps qs\<close>
18+
19+
definition poly_sub_int :: \<open>int_poly \<Rightarrow> int_poly \<Rightarrow> int_poly\<close> where
20+
\<open>poly_sub_int ps qs = map2 (-) ps qs\<close>
21+
22+
subsection \<open>Barrett reduction\<close>
23+
24+
abbreviation MLKEM_Q :: int where
25+
\<open>MLKEM_Q \<equiv> 3329\<close>
26+
27+
definition barrett_reduce_int :: \<open>int \<Rightarrow> int\<close> where
28+
\<open>barrett_reduce_int a = a - ((20159 * a + 2^25) div 2^26) * MLKEM_Q\<close>
29+
30+
lemma barrett_reduce_mod:
31+
shows \<open>barrett_reduce_int a mod MLKEM_Q = a mod MLKEM_Q\<close>
32+
unfolding barrett_reduce_int_def by algebra
33+
34+
section \<open>Concrete types and refinement\<close>
35+
36+
text \<open>
37+
Refinement relation: a concrete @{typ c_mlk_poly} refines an abstract
38+
@{typ int_poly} when its coefficient list has the correct length and its
39+
signed interpretation matches the abstract polynomial.
40+
\<close>
41+
definition refines_mlk_poly :: \<open>c_mlk_poly \<Rightarrow> int_poly \<Rightarrow> bool\<close> where
42+
\<open>refines_mlk_poly cp ap \<longleftrightarrow>
43+
length (c_mlk_poly_coeffs cp) = MLKEM_N \<and>
44+
List.map sint (c_mlk_poly_coeffs cp) = ap\<close>
45+
46+
text \<open>
47+
No-overflow condition: the mathematical sum of each coefficient pair
48+
fits in a signed 16-bit integer. This is required both for the C code
49+
to not abort (since @{const c_signed_add} checks for overflow) and for
50+
the refinement to integer arithmetic to hold.
51+
\<close>
52+
definition no_overflow_add :: \<open>int_poly \<Rightarrow> int_poly \<Rightarrow> bool\<close> where
53+
\<open>no_overflow_add ps qs \<longleftrightarrow>
54+
(\<forall>i < min (length ps) (length qs).
55+
ps ! i + qs ! i \<in> {-(2^15) ..< 2^15})\<close>
56+
57+
definition no_overflow_sub :: \<open>int_poly \<Rightarrow> int_poly \<Rightarrow> bool\<close> where
58+
\<open>no_overflow_sub ps qs \<longleftrightarrow>
59+
(\<forall>i < min (length ps) (length qs).
60+
ps ! i - qs ! i \<in> {-(2^15) ..< 2^15})\<close>
61+
62+
text \<open>
63+
The concrete (word-level) result of polynomial addition — internal helper
64+
for proofs.
65+
\<close>
66+
definition poly_add_c :: \<open>c_mlk_poly \<Rightarrow> c_mlk_poly \<Rightarrow> c_mlk_poly\<close> where
67+
\<open>poly_add_c p q = update_c_mlk_poly_coeffs
68+
(\<lambda>_. map2 (+) (c_mlk_poly_coeffs p) (c_mlk_poly_coeffs q)) p\<close>
69+
70+
subsection \<open>Refinement lemmas\<close>
71+
72+
lemma sint_plus_no_overflow:
73+
fixes a b :: \<open>'l::{len} sword\<close>
74+
assumes \<open>sint a + sint b \<in> {-(2^(LENGTH('l) - 1)) ..< 2^(LENGTH('l) - 1)}\<close>
75+
shows \<open>sint (a + b) = sint a + sint b\<close>
76+
using assms by (intro signed_arith_sint) (auto simp: word_size)
77+
78+
lemma sint_minus_no_overflow:
79+
fixes a b :: \<open>'l::{len} sword\<close>
80+
assumes \<open>sint a - sint b \<in> {-(2^(LENGTH('l) - 1)) ..< 2^(LENGTH('l) - 1)}\<close>
81+
shows \<open>sint (a - b) = sint a - sint b\<close>
82+
using assms by (intro signed_arith_sint) (auto simp: word_size)
83+
84+
text \<open>
85+
The key refinement theorem: under the no-overflow condition, the concrete
86+
word-level addition produces a result that refines the abstract integer sum.
87+
\<close>
88+
lemma poly_add_c_refines:
89+
assumes \<open>refines_mlk_poly p ap\<close>
90+
and \<open>refines_mlk_poly q aq\<close>
91+
and \<open>no_overflow_add ap aq\<close>
92+
shows \<open>refines_mlk_poly (poly_add_c p q) (poly_add_int ap aq)\<close>
93+
using assms by (auto simp add: c_mlk_poly.record_simps map2_map_map word_size refines_mlk_poly_def
94+
poly_add_c_def poly_add_int_def no_overflow_add_def intro!: nth_equalityI sint_plus_no_overflow)
95+
96+
subsection \<open>Auxiliary Lemmas\<close>
97+
98+
lemma nth_map2:
99+
assumes \<open>i < length xs\<close>
100+
and \<open>i < length ys\<close>
101+
shows \<open>map2 f xs ys ! i = f (xs ! i) (ys ! i)\<close>
102+
using assms by (induction xs arbitrary: i ys) (auto simp: less_Suc_eq_0_disj split: list.splits)
103+
104+
lemma inv_list_step:
105+
assumes \<open>n < length xs\<close>
106+
and \<open>n < length ys\<close>
107+
and \<open>length xs = length ys\<close>
108+
shows \<open>(take n (map2 f xs ys) @ drop n xs)[n := f (xs ! n) (ys ! n)] =
109+
take (Suc n) (map2 f xs ys) @ drop (Suc n) xs\<close>
110+
proof -
111+
let ?zs = \<open>map2 f xs ys\<close>
112+
from assms have ln: \<open>n < length ?zs\<close>
113+
by simp
114+
from assms have zn: \<open>?zs ! n = f (xs ! n) (ys ! n)\<close>
115+
by (simp add: nth_map2)
116+
from assms have drop_eq: \<open>drop n xs = xs ! n # drop (Suc n) xs\<close>
117+
by (metis Cons_nth_drop_Suc)
118+
have \<open>(take n ?zs @ drop n xs)[n := ?zs ! n] = take n ?zs @ (drop n xs)[0 := ?zs ! n]\<close>
119+
using ln by (simp add: list_update_append)
120+
also have \<open>\<dots> = take n ?zs @ ?zs ! n # drop (Suc n) xs\<close>
121+
using drop_eq by simp
122+
also have \<open>\<dots> = take (Suc n) ?zs @ drop (Suc n) xs\<close>
123+
using ln by (simp add: take_Suc_conv_app_nth)
124+
finally show ?thesis
125+
using zn by simp
126+
qed
127+
128+
lemma no_overflow_add_bounds:
129+
assumes \<open>refines_mlk_poly vr ar\<close>
130+
and \<open>refines_mlk_poly vb ab\<close>
131+
and \<open>no_overflow_add ar ab\<close> \<open>i < MLKEM_N\<close>
132+
shows \<open>sint (c_mlk_poly_coeffs vr ! i) + sint (c_mlk_poly_coeffs vb ! i) < 2 ^ 15\<close>
133+
and \<open>- (2 ^ 15) \<le> sint (c_mlk_poly_coeffs vr ! i) + sint (c_mlk_poly_coeffs vb ! i)\<close>
134+
proof -
135+
from assms(1) have lr: \<open>length (c_mlk_poly_coeffs vr) = MLKEM_N\<close>
136+
and mr: \<open>List.map sint (c_mlk_poly_coeffs vr) = ar\<close>
137+
unfolding refines_mlk_poly_def by auto
138+
from assms(2) have lb: \<open>length (c_mlk_poly_coeffs vb) = MLKEM_N\<close>
139+
and mb: \<open>List.map sint (c_mlk_poly_coeffs vb) = ab\<close>
140+
unfolding refines_mlk_poly_def by auto
141+
have \<open>ar ! i + ab ! i \<in> {-(2^15) ..< 2^15}\<close>
142+
using assms(3,4) lr lb mr mb unfolding no_overflow_add_def by auto
143+
moreover have \<open>ar ! i = sint (c_mlk_poly_coeffs vr ! i)\<close>
144+
using mr lr assms(4) by (simp add: nth_map[symmetric])
145+
moreover have \<open>ab ! i = sint (c_mlk_poly_coeffs vb ! i)\<close>
146+
using mb lb assms(4) by (simp add: nth_map[symmetric])
147+
ultimately show \<open>sint (c_mlk_poly_coeffs vr ! i) + sint (c_mlk_poly_coeffs vb ! i) < 2 ^ 15\<close>
148+
and \<open>- (2 ^ 15) \<le> sint (c_mlk_poly_coeffs vr ! i) + sint (c_mlk_poly_coeffs vb ! i)\<close>
149+
by auto
150+
qed
151+
152+
lemma no_overflow_sub_bounds:
153+
assumes \<open>refines_mlk_poly vr ar\<close>
154+
and \<open>refines_mlk_poly vb ab\<close>
155+
and \<open>no_overflow_sub ar ab\<close> \<open>i < MLKEM_N\<close>
156+
shows \<open>sint (c_mlk_poly_coeffs vr ! i) - sint (c_mlk_poly_coeffs vb ! i) < 2 ^ 15\<close>
157+
and \<open>- (2 ^ 15) \<le> sint (c_mlk_poly_coeffs vr ! i) - sint (c_mlk_poly_coeffs vb ! i)\<close>
158+
proof -
159+
from assms(1) have lr: \<open>length (c_mlk_poly_coeffs vr) = MLKEM_N\<close>
160+
and mr: \<open>List.map sint (c_mlk_poly_coeffs vr) = ar\<close>
161+
unfolding refines_mlk_poly_def by auto
162+
from assms(2) have lb: \<open>length (c_mlk_poly_coeffs vb) = MLKEM_N\<close>
163+
and mb: \<open>List.map sint (c_mlk_poly_coeffs vb) = ab\<close>
164+
unfolding refines_mlk_poly_def by auto
165+
have \<open>ar ! i - ab ! i \<in> {-(2^15) ..< 2^15}\<close>
166+
using assms(3,4) lr lb mr mb unfolding no_overflow_sub_def by auto
167+
moreover have \<open>ar ! i = sint (c_mlk_poly_coeffs vr ! i)\<close>
168+
using mr lr assms(4) by (simp add: nth_map[symmetric])
169+
moreover have \<open>ab ! i = sint (c_mlk_poly_coeffs vb ! i)\<close>
170+
using mb lb assms(4) by (simp add: nth_map[symmetric])
171+
ultimately show \<open>sint (c_mlk_poly_coeffs vr ! i) - sint (c_mlk_poly_coeffs vb ! i) < 2 ^ 15\<close>
172+
and \<open>- (2 ^ 15) \<le> sint (c_mlk_poly_coeffs vr ! i) - sint (c_mlk_poly_coeffs vb ! i)\<close>
173+
by auto
174+
qed
175+
176+
lemma MLKEM_N_sub_step [simp]:
177+
assumes \<open>k < MLKEM_N\<close>
178+
shows \<open>MLKEM_N - k = Suc (255 - k)\<close>
179+
using assms by simp
180+
181+
lemma mlkem_rev_index_bound [simp]:
182+
shows \<open>255 - k < MLKEM_N\<close>
183+
by simp
184+
185+
end
Lines changed: 12 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,12 @@
1+
theory MLKEM_Functional_Correctness
2+
imports
3+
MLKEM_Poly_Functional_Correctness
4+
begin
5+
6+
text \<open>
7+
Top-level entry point for \<^verbatim>\<open>mlkem-native\<close> functional-correctness proofs.
8+
9+
Intentionally empty, barring imports.
10+
\<close>
11+
12+
end
Lines changed: 35 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,35 @@
1+
theory MLKEM_Machine_Model
2+
imports "Micro_C_Examples.Simple_C_Functions"
3+
begin
4+
5+
text \<open>
6+
Machine-model assumptions used by generated \<^verbatim>\<open>mlkem-native\<close> C definitions.
7+
Types are extracted first so locale assumptions can depend on the generated
8+
C record/datatype names.
9+
\<close>
10+
11+
micro_c_file manifest: "generated/manifests/poly.types.manifest" "generated/c/poly.pre.c"
12+
13+
locale c_mlk_machine_model =
14+
reference reference_types +
15+
ref_c_mlk_poly: reference_allocatable reference_types _ _ _ _ _ _ _ c_mlk_poly_prism +
16+
ref_c_uint: reference_allocatable reference_types _ _ _ _ _ _ _ c_uint_prism +
17+
ref_c_int: reference_allocatable reference_types _ _ _ _ _ _ _ c_int_prism +
18+
ref_c_short: reference_allocatable reference_types _ _ _ _ _ _ _ c_short_prism
19+
for reference_types :: \<open>'s::{sepalg} \<Rightarrow> 'addr \<Rightarrow> 'gv \<Rightarrow> c_abort \<Rightarrow> 'i prompt \<Rightarrow>
20+
'o prompt_output \<Rightarrow> unit\<close>
21+
and c_mlk_poly_prism :: \<open>('gv, c_mlk_poly) prism\<close>
22+
and c_uint_prism :: \<open>('gv, c_uint) prism\<close>
23+
and c_int_prism :: \<open>('gv, c_int) prism\<close>
24+
and c_short_prism :: \<open>('gv, c_short) prism\<close>
25+
begin
26+
27+
adhoc_overloading store_reference_const \<rightleftharpoons> ref_c_mlk_poly.new
28+
adhoc_overloading store_reference_const \<rightleftharpoons> ref_c_uint.new
29+
adhoc_overloading store_reference_const \<rightleftharpoons> ref_c_int.new
30+
adhoc_overloading store_reference_const \<rightleftharpoons> ref_c_short.new
31+
adhoc_overloading store_update_const \<rightleftharpoons> update_fun
32+
33+
end
34+
35+
end

0 commit comments

Comments
 (0)