Skip to content

Commit f41359e

Browse files
authored
Merge pull request #21 from GaloisInc/improved-readme
Add an actual README that explains what copilot-verifier is
2 parents a845c2f + 34d0cd8 commit f41359e

1 file changed

Lines changed: 265 additions & 1 deletion

File tree

README.md

Lines changed: 265 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1 +1,265 @@
1-
TODO
1+
# Copilot Verifier
2+
3+
This "Copilot Verifier" is an add-on to the
4+
[Copilot Stream DSL](https://copilot-language.github.io)
5+
for verifying the correctness of C code generated
6+
by the `copilot-c99` package.
7+
8+
The main idea of the verifier is to use the
9+
[Crucible symbolic simulator](https://github.com/galoisinc/crucible)
10+
to interpret the semantics of the generated C program and
11+
and to produce verification conditions sufficient to guarantee
12+
that the meaning of the generated program corresponds in a precise
13+
way to the meaning of the original stream specification. The generated
14+
verification conditions are then dispatched to SMT solvers to
15+
be automatically solved. We will have more to say about exactly
16+
what is meant by this correspondence below.
17+
18+
## Using the Copilot Verifier
19+
20+
The main interface to the verifier is the `Copilot.Verifier.verify`
21+
function, which takes some options to drive the code generation
22+
process and a Copilot `Spec` object. It will invoke the Copilot
23+
code generator to obtain C sources, compile the C sources into
24+
LLVM bitcode using the `clang` compiler front-end, then
25+
parse and interpret the bitcode using `crucible`. After generating
26+
verification conditions, it will dispatch them to an SMT solver,
27+
and the result of those queries will be presented to the user.
28+
29+
There are a number of examples (based on similar examples
30+
from the main Copilot repository) demonstrating how to
31+
incorporate the verifier into a Copilot program.
32+
See the `copilot-verifier/examples` subdirectory of this repository.
33+
34+
## Details of the Verification
35+
36+
### Synopsis of Copilot semantics
37+
38+
The Copilot DSL represents a certain nicely-behaved
39+
class of discrete-time stream programs. Each `Stream`
40+
in Copilot denotes an infinite stream of values; one may
41+
just as well think that `Stream a` represents a pure mathematical
42+
function `ℕ → a` from natural numbers to values of type `a`.
43+
See the
44+
[Copilot manual](https://ntrs.nasa.gov/api/citations/20200003164/downloads/20200003164.pdf)
45+
for more details of the Copilot language itself and its semantics.
46+
47+
One of the central design considerations for Copilot is that is should
48+
be possible to implement stream programs using a fixed, finite amount
49+
of storage. As a result, Copilot will only accept stream programs
50+
that access a bounded amount of their input streams (including any
51+
recursive stream references). This allows an
52+
implementation strategy where the generated C code can use fixed-size
53+
ring buffers to store a limited number of previous stream values.
54+
55+
The execution model for the generated programs is that the program
56+
starts in a state corresponding to stream values at time `t = 0`;
57+
"external" stream input values are placed in distinguished global
58+
variables by the calling environment, which then executes a `step()`
59+
function to move to the next time step. The `step()` function captures
60+
the values of these external inputs and does whatever computation is
61+
necessary to update its internal state from time `t=n` to time
62+
`t=n+1`. In addition, it will call any defined "trigger" functions
63+
if the stream state at that time step satisfies the defined guard condition.
64+
In short, the generated C program steps one moment at a time through
65+
the stream program, consuming a sequence of input values provided by
66+
a cooperating environment and calling handler functions whenever
67+
states of interest occur.
68+
69+
### The Desired Correspondence
70+
71+
What does it mean, then, for a generated C program in this style
72+
to correctly implement a given stream program? The intuition
73+
is basically that after `n` calls to the `step` function,
74+
the state of the ring-buffers of the C program should correctly
75+
compute the value of the corresponding stream expression
76+
evaluated at index `n`, assuming the C program has been fed
77+
inputs corresponding to the first `n` values of the external stream
78+
inputs. Moreover, the trigger functions should be called from
79+
the `step` function exactly at the time values when the stream expressions
80+
evaluate to true.
81+
82+
The notion of correspondence for the values flowing in streams is
83+
relatively straightforward: these values consist of fixed-width
84+
machine integers, floating-point values, structs and fixed-length
85+
arrays. For each, the appropriate notion of equality is fairly clear.
86+
87+
Both the original `Stream` program and the generated C program
88+
can be viewed straightforwardly as a transition system, and under
89+
this view, the correspondence we want to establish is a bisimulation
90+
between the states of the high-level stream program and the low-level
91+
C program. The proof method for bisimulation requires us to provide
92+
a "correspondence" relation between the program states, and then prove
93+
three things about this relation:
94+
95+
1. that the initial states of the programs are in the relation;
96+
2. if we assume two arbitrary program states begin in the relation
97+
and each takes a single transition (consuming corresponding inputs),
98+
the resulting states are back in the relation;
99+
3. that any observable
100+
actions taken by one program are exactly mirrored by the other.
101+
102+
On the high-level side of the bisimulation, the program
103+
state is essentially just the value of the current time step `n`,
104+
whereas on the C side it consists of the regions of global memory that
105+
contain the ring-buffers and their current indices. The transition
106+
relation for the high-level program just increments the time value by
107+
one, and the transition for the C program is defined by the action
108+
of the generated `step()` function.
109+
110+
Suppose `s` is one of the stream definitions in the original Copilot
111+
program which is required to retain `k` previous values;
112+
let `buf` be the global variable name of the ring-buffer in the C
113+
program, and `idx` be the global variable name maintaining the
114+
current index into the ring buffer. Then the correspondence
115+
relation is basically that `0 <= idx < k` and
116+
`s[n+i] = buf[(idx+i) mod k]` as `i` ranges from `0 .. k-1`.
117+
By abuse of notation, here we mean that `s[j]` is
118+
the value of the stream expression `s` evaluated at index `j`,
119+
whereas `buf[j]` means the value obtained by reading the `j`th value
120+
of the buffer `buf` from memory. The overall correspondence relation
121+
is a conjunction of statements like this, one for each stream
122+
expression that is realized via a buffer.
123+
124+
### Implementing the Bisimulation proof steps
125+
126+
The kind of program correspondence property we desire is a largely
127+
mechanical affair. As the code under consideration is automatically
128+
generated, it has a very regular structure and is specifically
129+
intended to implement the semantics we wich to verify it against. As
130+
such, we expect these proofs to be highly automatable.
131+
132+
The proof principle of bisimulation itself is not amenable
133+
to reduction to SMT, as if falls outside the first-order theories
134+
those solvers understand. Likewise, the semantics of Copilot
135+
and C might possibly be reducible directly to SMT, but it would be
136+
impractical to do so. However, we can reduce the individual
137+
proof obligations mentioned above into a series of lower-level
138+
logical statements that are amenable to SMT proof by
139+
defining the logical semantics of stream programs, and using
140+
symbolic simulation techniques to interpret the semantics of the
141+
C program. Performing this reduction is the key contribution
142+
of `copilot-verifier`.
143+
144+
#### Initial state correspondence
145+
146+
The proof first obligation we must discharge is to show that
147+
the initial states of the two programs correspond. For each
148+
stream `s` there is a corresponding `k`, which is the length of
149+
the ring-buffer implementing it. We must simply verify that
150+
the C program initializes its buffer with the first `k` values
151+
of the stream `s`, and that the `idx` value starts at 0.
152+
Because of the restrictions Copilot places on programs, these
153+
first `k` values must be specified concretely and will not be
154+
able to depend on any external inputs. As such, this step
155+
is quite easily performed, as it requires only direct evaluation
156+
of concrete inputs.
157+
158+
#### Transition correspondence
159+
160+
The bulk of the proof effort consists of demonstrating that
161+
the bisimulation relation is preserved by transitions.
162+
In order to do this step, we must begin with totally symbolic
163+
initial states: we know nothing except that we are at some
164+
arbitrary time value `n`, and that the C program buffers
165+
correspond to their stream values as required by the relation.
166+
Thus, we create fresh symbolic variables for the streams
167+
from `n` to `n + k-1`, and for the values of all the involved
168+
external streams at time `n`. Then, we run forward the Copilot
169+
program by evaluating the stream recurrence expression to
170+
compute the value of each stream at time `n+k`.
171+
172+
Next we set up an initial state of the C program by choosing,
173+
for each ring buffer, an arbitrary value for its current index
174+
within its allowed range, and then writing the variables
175+
corresponding to each stream value into the buffers at
176+
their appropriate offsets. The symbolic simulator is then
177+
invoked to compute the state update effects of the `step()`
178+
function. Afterward, we read the poststate values from the
179+
ring-buffers and verify that they correspond to the stream
180+
values from `n+1` up to `n+k`.
181+
182+
As part of symbolic simulation, Crucible may also generate
183+
side-conditions that relate to memory safety of the program, or to
184+
error conditions that must be avoided. All of the bisimulation
185+
equivalence conditions and the safety side-conditions will be
186+
submitted to an SMT solver.
187+
188+
#### Observable effects
189+
190+
For our purposes, the only observable effects of a Copilot program
191+
relate to any "trigger" functions defined in the spec. Our task is to
192+
show that the generated C code calls the external trigger functions if
193+
and only if the corresponding guard condition is true, and that the
194+
arguments passed to those functions are as expected.
195+
This proof obligation is proved in the same phase along with
196+
the transition relation proof above because the `step()` function
197+
is responsible for both invoking triggers and for performing state
198+
updates.
199+
200+
The technique we use to perform this aspect of the proof is to
201+
install "override" functions to the external symbols corresponding
202+
to the C trigger functions before we begin symbolic simulation.
203+
In a real system, the external environment would be responsible
204+
for implementing these functions and taking whatever appropriate
205+
action is necessary when the triggers fire. However, we are verifying
206+
the generated code in isolation from its environment, so we have no
207+
implementation in hand. Instead, the override will
208+
essentially implement a stub function that simply captures its
209+
arguments and the path condition under which it was called.
210+
After simulation completes, the captured arguments and path condition
211+
are required to be equivalent to the corresponding trigger guard
212+
and arguments from the Copilot spec. These conditions are
213+
discharged to the SMT solver at the same time as the transition
214+
relation obligations.
215+
216+
Because of the way we model the trigger functions, we make a number of
217+
implicit assumptions about how the actual implementations of those
218+
functions must behave. The most important of those assumptions is that
219+
the trigger functions must not modify any memory under the control of
220+
the Copilot program, including its ring buffers and stack. We also
221+
assume that the trigger functions are well defined, i.e. they are
222+
memory safe and do not perform any undefined behavior.
223+
224+
### Caveats About the Verifier
225+
226+
We rely on the `clang` compiler front-end to consume C source files
227+
and produce LLVM intermediate language, which then becomes the input
228+
to the later verification steps. To the extent that the input program
229+
is not fully-portable C, `clang` may make implementation-specific
230+
decisions about how to compile the program which might be made
231+
different if compiled by a different compiler (e.g. `gcc`). We expect
232+
this aspect to be mitigated by the fact that Copilot programs are
233+
automatically generated into a rather simple subset of the C language,
234+
and is designed to be as simple as possible.
235+
Any code-generation bugs in `clang` itself may affect the soundness
236+
of our verifier. Again, however, Copilot generates a well-understood
237+
subset of the language, and we expect `clang` to be well-tested on
238+
the code patterns produced.
239+
240+
The semantics of LLVM bitcode, as encoded in the `crucible-llvm`
241+
package, may have errors that affect soundness. We mitigate this risk
242+
by testing our semantics against a corpus of verification problems
243+
produced for the SV-COMP verification competition, paying special
244+
attention to any soundness issues that arise. `Crux`, a standalone
245+
verification system based on `crucible-llvm`, was a participant in the
246+
2022 edition of SV-COMP.
247+
248+
The semantics of Copilot programs, as encoded in the
249+
`Copilot.Theorem.What4` module may have errors that affect soundness.
250+
For the moment we do not have an effective mitigation strategy for
251+
this risk other than manual examination and comparison against the
252+
intended semantics of Copilot, as encoded in the interpreter.
253+
254+
There is limited SMT solver support for floating-point values,
255+
especially for transcendental functions like the trig primitives. As a
256+
result, we reason about floating point expressions via uninterpreted
257+
functions. In other words, we leave the semantics of the
258+
floating-point operations totally abstract, and simply verify that the
259+
Copilot program and the corresponding C program apply the same
260+
operations in the same order. This is sound, but leaves the possibility
261+
that the compiler will apply some correct transformation to
262+
floating-point expressions that we are nonetheless unable to verify.
263+
However, on low optimizations and without the `--fast-math` flag,
264+
compilers generally (and `clang` in particular) are very reluctant to
265+
rearrange floating-point code, and the verifications generally succeed.

0 commit comments

Comments
 (0)