Skip to content

Commit bf34c5c

Browse files
committed
contribution
1 parent cf30a76 commit bf34c5c

5 files changed

Lines changed: 587 additions & 0 deletions

File tree

doc/llm/CLAUDE.md

Lines changed: 97 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -181,6 +181,45 @@ SEARCH (fdom _)
181181
SEARCH (_ %/ _)
182182
```
183183

184+
### When the agent can't hold a persistent pipe
185+
186+
Some LLM front-ends (Claude Code, Codex CLI, etc.) expose the shell as
187+
one-shot command execution — the agent *cannot* keep stdin/stdout open
188+
to `easycrypt llm` across turns. Naively, every tactic would then
189+
require re-`LOAD`-ing the file prefix.
190+
191+
The `scripts/llm/` directory contains a reference fifo-backed daemon
192+
that lets many short-lived shell invocations drive one persistent EC
193+
REPL. Start it once at the beginning of a session, then issue each
194+
tactic as a separate `ec_send.sh` call. See `scripts/llm/README.md` for
195+
the full pattern.
196+
197+
### Iterate interactively, then codify
198+
199+
For long proofs — especially program-logic bodies with 5+ chained
200+
`ecall`s, or big algebraic derivations — write tactics one at a time
201+
in the REPL, inspecting `GOALS` after each, and only commit the final
202+
sequence back to the file once it works. Two reasons:
203+
204+
- **SMT budget.** Committing a long batch and issuing one LOAD
205+
compounds SMT cost across every intermediate side-condition at once.
206+
The same tactics often pass individually with a shorter per-call
207+
timeout.
208+
- **Cheap backtracking.** Use `CHECKPOINT` at every branching point
209+
and `REVERT` liberally — it's O(1), whereas re-LOAD re-compiles the
210+
whole prefix.
211+
212+
Pattern:
213+
214+
```
215+
LOAD "file.ec" 100 ← note uuid:N
216+
CHECKPOINT before_body
217+
<try tactic>
218+
GOALS ← inspect
219+
<try tactic>
220+
REVERT before_body ← if the branch fails
221+
```
222+
184223
## EasyCrypt proof strategy
185224

186225
### General approach
@@ -246,6 +285,64 @@ SEARCH (_ %/ _)
246285
- `rewrite lemma in H` modifies hypothesis `H` in place (it does
247286
not consume it). If you need to preserve the original, copy it
248287
first: `have H' := H; rewrite lemma in H'`.
288+
- **`have H := lemma args` may leave unevaluated lambdas.** Lemma
289+
applications that produce `init`/`map`/`filter` terms often carry
290+
β-redexes into the introduced hypothesis. Later `rewrite H` can then
291+
hang indefinitely while the kernel tries to unify. Force
292+
normalization at introduction with `have /= H := lemma args`.
293+
- **Tactic-repetition (`!tac`) over-applies on nested structures.**
294+
`!mapiE`, `!initiE`, `!allP` descend into inner layers at different
295+
array sizes, generate side-conditions that can't be closed by `/#`,
296+
and either leave stale goals or hang. Use the exact count needed
297+
(usually 1 or 2 for an equality: `rewrite mapiE 1:/# /= mapiE 1:/#`).
298+
- **`1:/#` vs `1:smt(hints)` in rewrite chains.** Inside a
299+
`rewrite lemma 1:...` position only `/#` parses; `1:smt(hints)` is
300+
a parse error. Break the chain:
301+
`rewrite lemma 1:/#; first smt(hints). rewrite next_lemma.`
302+
- **`suff` not `suffices`.** The surface syntax uses `suff H : P.`;
303+
`suffices` triggers a parse error.
304+
- **`rewrite allP` only touches the first `all`.** When unfolding
305+
range predicates on both sides of an implication
306+
(`all P x => all Q x`), use `!allP` — otherwise the subsequent
307+
`=> H j Hj` fails with "nothing to introduce" because the conclusion
308+
is still wrapped in `all`.
309+
- **`congr` vs `congr 1`.** `congr` walks down an array equality all
310+
the way to list/element equality; `congr 1` takes one structural
311+
step. Reach for `congr 1` when `congr` descends too far and exposes
312+
stale `init`/`mkseq` structure.
313+
314+
### Reconciling syntactically distinct constants
315+
316+
When a word-type modulus (`W13.modulus`, `W8.modulus`, …) appears on
317+
one side of a goal and its numeric literal (`8192`, `256`, …) on the
318+
other — for instance after one subterm was lowered through a circuit
319+
lemma while the other came from the spec — `smt()` treats them as
320+
distinct atoms and won't bridge. Normalize explicitly:
321+
322+
```easycrypt
323+
have W13_eq : W13.modulus = 8192 by smt(W13.ge2_modulus).
324+
rewrite W13_eq.
325+
```
326+
327+
Same pattern for `modulus_W`, cardinalities of finite types, and other
328+
constants that have both a symbolic and a numeric form.
329+
330+
### Ring/field commutativity: use the fully-qualified path
331+
332+
Bare `mulrC`/`addrC` may not resolve against a concrete ring clone's
333+
theory. When working in a cloned ring (`Zq`, `Fq`, polynomial rings
334+
over Zq, …), the fully-qualified name of the `ComRing`/`ZModule` lemma
335+
usually succeeds where the short form fails:
336+
337+
```easycrypt
338+
by rewrite Zq.ComRing.mulrC.
339+
```
340+
341+
Use `SEARCH` to find the right path:
342+
343+
```
344+
SEARCH (_ * _) (commutative _)
345+
```
249346

250347
## EasyCrypt language overview
251348

scripts/llm/README.md

Lines changed: 107 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,107 @@
1+
# LLM-oriented tooling for `easycrypt llm`
2+
3+
This directory contains two example clients for the interactive REPL
4+
exposed by `easycrypt llm` (see `doc/llm/CLAUDE.md` for the wire
5+
protocol). They are layered: each can be useful independently, depending
6+
on how your driver (human, script, or LLM agent) is structured.
7+
8+
| Layer | File | Use when |
9+
|-------|------|----------|
10+
| Python binding | `ec_llm.py` | Writing a Python program that drives one EasyCrypt REPL. |
11+
| Fifo daemon | `daemon.py` + `ec_send.sh` | Driving one persistent REPL from *many* short-lived shell invocations — notably LLM agents that issue one bash tool call per tactic. |
12+
13+
## Why a fifo daemon?
14+
15+
`easycrypt llm` is a stateful, long-running process. Python programs can
16+
keep its stdin/stdout pipe open across commands, but many LLM-agent
17+
front-ends (Claude Code, Codex CLI, etc.) only expose the shell via
18+
one-shot command execution. Without a persistent intermediary, every
19+
tactic would require re-LOADing the file prefix — minutes wasted per
20+
step on a large project.
21+
22+
The daemon opens `easycrypt llm` once, listens on a named pipe for JSON
23+
requests, and writes JSON responses to a second named pipe. The
24+
companion `ec_send.sh` is the shell-side client: it takes a single JSON
25+
request, forwards it through the pipes, and prints the structured
26+
response. Combined, they let an agent drive a persistent REPL with one
27+
bash call per step.
28+
29+
## Python binding (Layer 1)
30+
31+
```python
32+
from ec_llm import ECLLM
33+
34+
with ECLLM(cwd="/path/to/project", extra_args=["-p", "Z3"]) as ec:
35+
ok, resp = ec.load("proofs/myfile.ec", 42, nosmt=True)
36+
print(ec.goals())
37+
ok, resp = ec.tactic("smt().")
38+
print(resp)
39+
```
40+
41+
Everything in the `easycrypt llm` protocol is exposed as a method:
42+
`load`, `checkpoint`, `revert`, `undo`, `goals`, `goals_all`, `tactic`,
43+
`tactic_multiline`, `search`, `quiet`, `close`. Status parsing tracks
44+
the running uuid in `ec.uuid`.
45+
46+
CLI mode provides a quick "LOAD and report" check:
47+
48+
```
49+
python3 ec_llm.py proofs/myfile.ec 42 --cwd /path/to/project --nosmt
50+
```
51+
52+
## Fifo daemon (Layer 2)
53+
54+
Start the daemon once (most easily in the background):
55+
56+
```
57+
# Create the fifos if needed and start the daemon
58+
mkfifo /tmp/ec_in /tmp/ec_out 2>/dev/null || true
59+
python3 scripts/llm/daemon.py \
60+
--cwd /path/to/project \
61+
--prover Z3 --timeout 5 &
62+
```
63+
64+
Drive it with `ec_send.sh`:
65+
66+
```
67+
./ec_send.sh '{"op":"load","arg":"proofs/myfile.ec","arg2":"42","nosmt":true}'
68+
./ec_send.sh '{"op":"goals"}'
69+
./ec_send.sh '{"op":"tactic","arg":"smt()."}'
70+
./ec_send.sh '{"op":"checkpoint","arg":"before_split"}'
71+
./ec_send.sh '{"op":"tactic","arg":"split."}'
72+
./ec_send.sh '{"op":"revert","arg":"before_split"}'
73+
```
74+
75+
For payloads with shell-hostile content (EC escapes like `/\\`),
76+
write the JSON to a file and use `--file`:
77+
78+
```
79+
cat > /tmp/req.json <<'EOF'
80+
{"op":"tactic","arg":"rewrite /wpoly_srng !allP => H j Hj; smt(hint)."}
81+
EOF
82+
./ec_send.sh --file /tmp/req.json
83+
```
84+
85+
## Operational tips
86+
87+
- **Per-session setup.** Create fifos once per session. The daemon keeps
88+
them; don't delete them while it's running.
89+
- **Recovering from a dead daemon.** If the EasyCrypt process crashes
90+
(stack overflow, OOM on SMT), `ec_send.sh` will hang. Kill the daemon
91+
process, recreate the fifos (`rm -f /tmp/ec_in /tmp/ec_out; mkfifo
92+
/tmp/ec_in /tmp/ec_out`), and restart.
93+
- **Fast SMT feedback.** Pass `--timeout 1 --prover Z3` to the daemon
94+
during exploration; fragile SMT calls surface immediately. Raise the
95+
timeout when codifying a final proof.
96+
- **Parallel sessions.** To run multiple daemons side-by-side, give each
97+
a distinct pair of fifo paths and set `EC_IN_FIFO` / `EC_OUT_FIFO`
98+
for the `ec_send.sh` caller.
99+
- **Logging.** Exceptions inside the daemon are written to `/tmp/ec_daemon.log`
100+
(or `--log`). Check there first if a request seems silently dropped.
101+
102+
## Security note
103+
104+
The daemon trusts whatever shows up on its input fifo — it forwards the
105+
`arg` payload straight to EasyCrypt as tactic text. Don't expose the
106+
fifos on a shared filesystem, and don't run the daemon as a privileged
107+
user.

scripts/llm/daemon.py

Lines changed: 142 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,142 @@
1+
#!/usr/bin/env python3
2+
"""
3+
Fifo-backed daemon exposing one persistent `easycrypt llm` instance to many
4+
shell callers. Typical use case: an LLM agent (Claude Code, etc.) that
5+
invokes a bash tool *once per tactic* — the agent can't hold stdin/stdout
6+
open across turns, so without a daemon every tactic would re-LOAD the
7+
whole file prefix.
8+
9+
Protocol (matches the JSON line format produced by `ec_send.sh`):
10+
11+
request (one line on stdin-fifo):
12+
{"op": "<op>", "arg": "...", "arg2": "...", "nosmt": bool}
13+
14+
response (one line on stdout-fifo):
15+
{"ok": bool, "uuid": N, "resp": "..."}
16+
17+
Supported ops:
18+
load arg=path, arg2=line (string or int), optional nosmt: bool
19+
tactic arg="proc."
20+
tactic_multiline arg="<multi-line body>" (wraps in <BEGIN>/<DONE>)
21+
undo
22+
revert arg="<uuid-or-name>"
23+
checkpoint arg="<name>"
24+
goals
25+
goals_all
26+
search arg="<pattern>"
27+
quiet arg="on"|"off"
28+
close
29+
30+
Usage:
31+
# start daemon (in background)
32+
python3 scripts/llm/daemon.py --cwd /path/to/project \\
33+
--in-fifo /tmp/ec_in --out-fifo /tmp/ec_out &
34+
35+
# then drive with scripts/llm/ec_send.sh (see that file for details)
36+
37+
The fifos are created if they don't exist. Only one connected writer and
38+
one connected reader are expected at a time — the client (`ec_send.sh`)
39+
enforces this by opening them sequentially.
40+
"""
41+
42+
import argparse
43+
import json
44+
import os
45+
import sys
46+
import traceback
47+
48+
HERE = os.path.dirname(os.path.abspath(__file__))
49+
sys.path.insert(0, HERE)
50+
from ec_llm import ECLLM # noqa: E402
51+
52+
53+
def _ensure_fifo(path):
54+
if not os.path.exists(path):
55+
os.mkfifo(path)
56+
elif not os.path.exists(path): # raced out; recreate
57+
os.mkfifo(path)
58+
59+
60+
def _handle(req, ec):
61+
op = req.get("op")
62+
arg = req.get("arg", "")
63+
arg2 = req.get("arg2", "")
64+
65+
if op == "load":
66+
ok, resp = ec.load(arg, int(arg2), nosmt=bool(req.get("nosmt", False)))
67+
elif op == "tactic":
68+
ok, resp = ec.tactic(arg)
69+
elif op == "tactic_multiline":
70+
ok, resp = ec.tactic_multiline(arg)
71+
elif op == "undo":
72+
ok, resp = ec.undo()
73+
elif op == "revert":
74+
ok, resp = ec.revert(arg)
75+
elif op == "checkpoint":
76+
ok, resp = ec.checkpoint(arg)
77+
elif op == "goals":
78+
resp, ok = ec.goals(), True
79+
elif op == "goals_all":
80+
resp, ok = ec.goals_all(), True
81+
elif op == "search":
82+
resp, ok = ec.search(arg), True
83+
elif op == "quiet":
84+
ok, resp = ec.quiet(on=(arg.lower() == "on"))
85+
elif op == "close":
86+
ec.close()
87+
return {"ok": True, "uuid": ec.uuid, "resp": "bye"}, True
88+
else:
89+
return {"ok": False, "uuid": ec.uuid, "resp": f"unknown op: {op}"}, False
90+
91+
return {"ok": ok, "uuid": ec.uuid, "resp": resp}, False
92+
93+
94+
def main():
95+
p = argparse.ArgumentParser(description=__doc__.splitlines()[1])
96+
p.add_argument("--cwd", default=None, help="Project root (default: cwd)")
97+
p.add_argument("--in-fifo", default="/tmp/ec_in", help="Request fifo path")
98+
p.add_argument("--out-fifo", default="/tmp/ec_out", help="Response fifo path")
99+
p.add_argument("--log", default="/tmp/ec_daemon.log", help="Log file")
100+
p.add_argument("--prover", default=None, help="SMT prover (e.g. Z3)")
101+
p.add_argument("--timeout", type=int, default=None, help="Per-goal SMT timeout (s)")
102+
args = p.parse_args()
103+
104+
_ensure_fifo(args.in_fifo)
105+
_ensure_fifo(args.out_fifo)
106+
107+
extra = []
108+
if args.prover:
109+
extra += ["-p", args.prover]
110+
if args.timeout is not None:
111+
extra += ["-timeout", str(args.timeout)]
112+
113+
ec = ECLLM(cwd=args.cwd, extra_args=extra)
114+
with open(args.log, "w") as lf:
115+
lf.write(f"EC started. uuid={ec.uuid}\n")
116+
117+
while True:
118+
try:
119+
# Opening a fifo for reading blocks until a writer opens it.
120+
with open(args.in_fifo, "r") as fi:
121+
line = fi.readline()
122+
if not line.strip():
123+
continue
124+
req = json.loads(line)
125+
result, should_exit = _handle(req, ec)
126+
with open(args.out_fifo, "w") as fo:
127+
fo.write(json.dumps(result) + "\n")
128+
if should_exit:
129+
break
130+
except Exception:
131+
with open(args.log, "a") as lf:
132+
lf.write("EXCEPTION: " + traceback.format_exc() + "\n")
133+
try:
134+
with open(args.out_fifo, "w") as fo:
135+
fo.write(json.dumps({"ok": False, "uuid": -1,
136+
"resp": traceback.format_exc()}) + "\n")
137+
except Exception:
138+
pass
139+
140+
141+
if __name__ == "__main__":
142+
main()

0 commit comments

Comments
 (0)