Skip to content

Commit 9c7b0aa

Browse files
committed
Additional comments in the README
1 parent 11c1a15 commit 9c7b0aa

1 file changed

Lines changed: 45 additions & 8 deletions

File tree

README.md

Lines changed: 45 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -41,8 +41,7 @@ in Copilot denotes an infinite stream of values; one may
4141
just as well think that `Stream a` represents a pure mathematical
4242
function `ℕ → a` from natural numbers to values of type `a`.
4343
See the
44-
[Copilot manual]
45-
(https://ntrs.nasa.gov/api/citations/20200003164/downloads/20200003164.pdf)
44+
[Copilot manual](https://ntrs.nasa.gov/api/citations/20200003164/downloads/20200003164.pdf)
4645
for more details of the Copilot language itself and its semantics.
4746

4847
One of the central design considerations for Copilot is that is should
@@ -220,9 +219,47 @@ functions must behave. The most important of those assumptions is that
220219
the trigger functions must not modify any memory under the control of
221220
the Copilot program, including its ring buffers and stack. We also
222221
assume that the trigger functions are well defined, i.e. they are
223-
memory safe and do not perform any undefined behavior. Finally, we
224-
assume that they implement "normal" control flow and will eventually
225-
return to their caller. This last requirement may well be violated if
226-
the trigger function actually performs some aborting action, or
227-
otherwise halts normal execution; however, this seems relatively
228-
harmless from the point of view of correctness of the generated code.
222+
memory safe and do not perform any undefined behavior.
223+
224+
### Caveats About the Verifier
225+
226+
We rely on the `clang` compiler front-end to consume C source files
227+
and produce LLVM intermediate language, which then becomes the input
228+
to the later verification steps. To the extent that the input program
229+
is not fully-portable C, `clang` may make implementation-specific
230+
decisions about how to compile the program which might be made
231+
different if compiled by a different compiler, (e.g. `gcc`). We expect
232+
this aspect to be mitigated by the fact that Copilot programs are
233+
automatically generated into a rather simple subset of the C language,
234+
and is designed to be as simple as possible.
235+
Any code-generation bugs in `clang` itself may affect the soundness
236+
of our verifier. Again, however, Copilot generates a well-understood
237+
subset of the language, and we expect `clang` to be well-tested on
238+
the code patterns produced.
239+
240+
The semantics of LLVM bitcode, as encoded in the `crucible-llvm`
241+
package, may have errors that affect soundness. We mitigate this risk
242+
by testing our semantics against a corpus of verification problems
243+
produced for the SV-COMP verification competition, paying special
244+
attention to any soundness issues that arise. `Crux`, a standalone
245+
verification system based on `crucible-llvm`, was a participant in the
246+
2022 edition of SV-COMP.
247+
248+
The semantics of Copilot programs, as encoded in the
249+
`Copilot.Theorem.What4` module may have errors that affect soundness.
250+
For the moment we do not have an effective mitigation strategy for
251+
this risk other than manual examination and comparison against the
252+
intended semantics of Copilot, as encoded in the interpreter.
253+
254+
There is limited SMT solver support for floating-point values,
255+
especially for trancendental functions like the trig primitives. As a
256+
result, we reason about floating point expressions via uninterpreted
257+
functions. In other words, we leave the semantics of the
258+
floating-point operations totally abstract, and simply verify that the
259+
Copilot program and the corresponding C program apply the same
260+
operations in the same order. This is sound, but leaves the possibility
261+
that the compiler will apply some correct transformation to
262+
floating-point expressions that we are nonetheless unable to verify.
263+
However, on low optimizations and without the `--fast-math` flag,
264+
compilers generally (and `clang` in particular) are very reluctant to
265+
rearrange floating-point code, and the verifications generally succeed.

0 commit comments

Comments
 (0)