diff --git a/AXIOMS.md b/AXIOMS.md index 7029c6648..26d7648a9 100644 --- a/AXIOMS.md +++ b/AXIOMS.md @@ -185,6 +185,29 @@ to contracts that use the module. The compiler lists all of them at compile time in `--verbose` output. Use `--deny-unchecked-dependencies` to make compilation fail if any assumption hasn't been reviewed. +### Caller-frame preservation theorems + +Independently of the ECM interface assumptions above, the *EVM frame +condition* that an external `CALL` cannot mutate the caller's storage, +transient storage, or memory outside the declared output buffer is now a +**theorem** of `Verity.EVM.Frame`, no longer an assumption. The relevant +results are: + +- `Verity.EVM.Frame.external_call_preserves_caller_storage` +- `Verity.EVM.Frame.external_call_preserves_caller_transient_storage` +- `Verity.EVM.Frame.external_call_preserves_caller_memory_outside_output_buffer` +- `Verity.EVM.Frame.external_call_preserves_caller_memory` (disjoint-region form) +- their iterated-CALL variants `Verity.EVM.Frame.external_calls_preserve_*` + +The theorems quantify universally over `Verity.EVM.Frame.CalleeResult`, +which is the observational interface of any EVM callee program. Downstream +contract proofs can consume these theorems directly to discharge the EVM +frame condition without re-stating it. + +The abstract memory model on which these theorems compose lives at +`Verity.EVM.MemoryModel`; the standard solc memory-layout schema and the +call-buffer-disjoint-from-heap theorem live at `Verity.EVM.Layout`. + ### Standard Module Assumptions | Module | Assumption | Meaning | diff --git a/Contracts/Common.lean b/Contracts/Common.lean index 3bd435c98..1e8f42cd9 100644 --- a/Contracts/Common.lean +++ b/Contracts/Common.lean @@ -13,6 +13,14 @@ open Verity.EVM.Uint256 open Verity.Stdlib.Math macro_rules + -- DSL ergonomics: rewrite `let _ := rhs` into a do-block-friendly form + -- so that consumers can discard an external-call result naturally. + -- (Without this rule the verity_contract function-body parser rejects + -- `let _ := …` as an unsupported do element.) + | `(doElem| let _ := $rhs:term) => do + let fresh ← Lean.Macro.addMacroScope `_callResult + let freshIdent := Lean.mkIdent fresh + `(doElem| let $freshIdent := $rhs) | `(term| ecmCall $_moduleFactory:term $_args:term) => `(term| do let _ := $_moduleFactory diff --git a/TRUST_ASSUMPTIONS.md b/TRUST_ASSUMPTIONS.md index ad00346a6..be5bd3aa8 100644 --- a/TRUST_ASSUMPTIONS.md +++ b/TRUST_ASSUMPTIONS.md @@ -78,6 +78,7 @@ Current theorem totals, property-test coverage, and proof status live in [docs/V - **Role**: Reusable typed external call patterns (ERC-20 writes/reads including `totalSupply`, ERC-4626 preview/conversion helpers plus `totalAssets`, `asset`, `max*` limit reads, and `deposit`, oracle reads, precompiles 0x01 / 0x02 / 0x06 / 0x07 / 0x08 — `ecrecover`, `sha256`, BN254 `bn256Add`, `bn256ScalarMul`, `bn256Pairing` — callbacks). - **Trust**: Each module's `compile` produces correct Yul. Bug in one module doesn't affect others. - **Mitigation**: Axiom aggregation at compile time (`--verbose`), machine-readable trust-surface emission via `--trust-report `, and a fail-closed verification gate via `--deny-unchecked-dependencies` when unchecked foreign surfaces must be excluded. See [docs/EXTERNAL_CALL_MODULES.md](docs/EXTERNAL_CALL_MODULES.md). +- **Caller-frame preservation**: The EVM frame condition (external `CALL` cannot mutate caller storage / transient storage / memory outside the declared output buffer) is now a *theorem* of `Verity.EVM.Frame` rather than an assumption. Downstream proofs consume `external_call_preserves_caller_storage` etc. directly. The abstract memory model used by these theorems is `Verity.EVM.MemoryModel`; the solc memory-layout schema and call-buffer-disjoint-from-heap result are in `Verity.EVM.Layout`. EvmYul ↔ abstract-model correspondence remains the open follow-up. ### 8. Lean Kernel - **Role**: Proof checker soundness. Foundational assumption for all Lean-based verification. diff --git a/Verity/Compiler/FromSolidity.lean b/Verity/Compiler/FromSolidity.lean new file mode 100644 index 000000000..179c4669a --- /dev/null +++ b/Verity/Compiler/FromSolidity.lean @@ -0,0 +1,142 @@ +import Lean + +namespace Verity.Compiler.FromSolidity + +open Lean Elab Command + +/-! +# `#fromSolidity` Lean entry point — scaffold + +Today the Verity compiler runs out-of-band via the `verity-cli` +command-line tool: a developer manually invokes +`verity-cli compile path/to/Foo.sol --emit lean` and copies the +generated Lean term into the benchmark. + +This module is the **scaffold** for surfacing that pipeline as a Lean +elaborator-level command, so that a benchmark file can declare: + +``` +#fromSolidity "vendor/account-abstraction/EntryPoint.sol" +``` + +and obtain a proof-checked `verity_contract` term as if it had been +hand-written. + +## Status + +This scaffold lands the public API surface (`#fromSolidity` command, +options, error type) and a stub elaborator that defers to the CLI via +`IO.Process.run`. The full machine translation requires hooking +`Compiler/CompilationModel.lean`'s Solidity-IR pipeline directly into +Lean's elaborator, which is a multi-week project tracked separately +(see `docs/ROADMAP.md` "machine fromSolidity"). + +The scaffold exists so: + +* The API contract is fixed (the command name and option shape will not + change underneath downstream benchmarks). +* The CLI integration point is documented as a single function + (`runVerityCliCompile`) that the next PR replaces with an in-process + call. +* Benchmarks that today write `verity_contract` blocks by hand can + switch to `#fromSolidity` once the CLI shells out cleanly, without + changing their proof files. + +## Trust boundary today vs. after full implementation + +Today (`fromSolidity` shells out to `verity-cli`): +* Trust the CLI binary on disk. +* Trust the file-system path is consistent with what the developer + proved against. + +After full implementation (in-process): +* No CLI binary trust — translation runs inside Lean. +* The same kernel checks the resulting term that checks every Lean + proof. + +The headline difference: the post-implementation form makes +*translation faithfulness* part of the proof-checked trust base. +-/ + +/-- Options for `#fromSolidity`. -/ +structure Options where + /-- Path to the Solidity source. Resolved relative to the workspace root. -/ + path : System.FilePath + /-- Optional contract name selector (defaults to the only contract in the file). -/ + contractName : Option String := none + /-- Verity CLI binary to invoke. Defaults to `verity-cli` on PATH. -/ + cliBinary : String := "verity-cli" + deriving Repr + +/-- Result of running the CLI translation step. -/ +inductive Result where + | ok (lean : String) + | error (msg : String) + deriving Repr + +/-- Locate the Lake workspace root containing the current process, if any. -/ +partial def findWorkspaceRoot? (dir : System.FilePath) : IO (Option System.FilePath) := do + if ← (dir / "lakefile.lean").pathExists then + pure (some dir) + else if ← (dir / "lakefile.toml").pathExists then + pure (some dir) + else + match dir.parent with + | some parent => + if parent == dir then + pure none + else + findWorkspaceRoot? parent + | none => pure none + +/-- Resolve `path` according to the `Options.path` contract. + +Absolute paths are preserved. Relative paths are anchored at the Lake workspace +root containing the current process, falling back to the process working +directory when the command is run outside a Lake workspace. -/ +def resolveSourcePath (path : System.FilePath) : IO System.FilePath := do + if path.isAbsolute then + pure path.normalize + else + let cwd ← IO.currentDir + let root ← findWorkspaceRoot? cwd + pure (((root.getD cwd) / path).normalize) + +/-- Invoke `verity-cli compile path --emit lean`. Returns the emitted + Lean source on stdout or an error message. -/ +def runVerityCliCompile (opts : Options) : IO Result := do + let sourcePath ← resolveSourcePath opts.path + let args := match opts.contractName with + | some c => + #["compile", sourcePath.toString, "--contract", c, "--emit", "lean"] + | none => + #["compile", sourcePath.toString, "--emit", "lean"] + try + let out ← IO.Process.run { cmd := opts.cliBinary, args } + return .ok out + catch e => + return .error s!"verity-cli failed: {e.toString}" + +/-- `#fromSolidity "path"` command. Currently emits a placeholder + `axiom` whose name documents the pending translation; replace with a + proper elaborator emission once the in-process translator lands. -/ +syntax (name := fromSolidityCmd) "#fromSolidity " str (" contract " str)? : command + +@[command_elab fromSolidityCmd] +def elabFromSolidity : CommandElab := fun stx => do + let pathLit := stx[1] + let path : System.FilePath := pathLit.isStrLit?.get! + let contractOpt : Option String := + if stx[2].getNumArgs > 0 then some stx[2][1].isStrLit?.get! else none + let opts : Options := { path := path, contractName := contractOpt } + let result : Result ← (runVerityCliCompile opts).toIO + (fun e => IO.userError s!"fromSolidity IO failure: {e.toString}") + match result with + | .ok _emitted => + let sourcePath ← (resolveSourcePath opts.path).toIO + (fun e => IO.userError s!"fromSolidity path resolution failure: {e.toString}") + logInfo m!"fromSolidity scaffold: would elaborate {sourcePath}; in-process translator pending" + | .error msg => + throwError m!"fromSolidity: {msg}. Falling back: declare the verity_contract block by hand for now." + +end Verity.Compiler.FromSolidity diff --git a/Verity/Core.lean b/Verity/Core.lean index fb073a2d1..4e5f49ef9 100644 --- a/Verity/Core.lean +++ b/Verity/Core.lean @@ -621,6 +621,52 @@ theorem require_succeeds (cond : Bool) (msg : String) (s : ContractState) : simp [hLocked] simp [Contract.run, nonReentrant, hNe] +/-- EIP-1153 transient-storage reentrancy guard primitive. + +Uses the transient-storage slot at `lockOffset` as a mutex (`0` = unlocked, +nonzero = locked), sets it before running `body`, and clears it on both +success and revert paths. Mirrors OpenZeppelin's `ReentrancyGuardTransient`. + +The transient-storage variant is preferred over storage-slot +`nonReentrant` for new contracts: it is automatically cleared at the end +of every transaction, eliminating the upgrade-related storage-layout +concerns of storage-slot guards. -/ +def nonReentrantTransient (lockOffset : Uint256) (body : Contract α) : Contract α := + fun s => + if s.transientStorage (lockOffset : Nat) == 0 then + let sLocked := { s with + transientStorage := fun i => + if i == (lockOffset : Nat) then 1 else s.transientStorage i } + match body sLocked with + | ContractResult.success a s' => + ContractResult.success a + { s' with + transientStorage := fun i => + if i == (lockOffset : Nat) then 0 else s'.transientStorage i } + | ContractResult.revert msg s' => + ContractResult.revert msg + { s' with + transientStorage := fun i => + if i == (lockOffset : Nat) then 0 else s'.transientStorage i } + else + ContractResult.revert "ReentrancyGuardTransient: reentrant call" s + +@[simp] theorem nonReentrantTransient_locked_reverts + (lockOffset : Uint256) (body : Contract α) (s : ContractState) + (hLocked : s.transientStorage (lockOffset : Nat) ≠ 0) : + (nonReentrantTransient lockOffset body).run s = + ContractResult.revert "ReentrancyGuardTransient: reentrant call" s := by + have hNe : (s.transientStorage (lockOffset : Nat) == 0) = false := by + simp [hLocked] + simp [Contract.run, nonReentrantTransient, hNe] + +theorem nonReentrantTransient_revert_preserves_state + (lockOffset : Uint256) (body : Contract α) (s : ContractState) + (hLocked : s.transientStorage (lockOffset : Nat) ≠ 0) : + ((nonReentrantTransient lockOffset body).run s).snd = s := by + rw [nonReentrantTransient_locked_reverts lockOffset body s hLocked] + rfl + -- Regression for #254: mutations before a revert do not leak through `run`. theorem run_revert_rolls_back_storage (value : Uint256) (s : ContractState) : ((bind (setStorage ⟨0⟩ value) (fun _ => require false "revert")).run s) = diff --git a/Verity/EVM/Frame.lean b/Verity/EVM/Frame.lean new file mode 100644 index 000000000..a3c71d674 --- /dev/null +++ b/Verity/EVM/Frame.lean @@ -0,0 +1,167 @@ +import Verity.EVM.Uint256 +import Verity.EVM.MemoryModel +import Verity.Core.Uint256 + +namespace Verity.EVM.Frame + +open Verity.Core (Uint256) +open Verity.EVM.MemoryModel + +/-! +# EVM `CALL` boundary frame conditions + +Models the slice of EVM `CALL` semantics that touches the caller's +frame, and proves the frame conditions that downstream contract proofs +need to reason about external calls without quantifying over callee +bytecode. + +Promoted from `Benchmark.Cases.ERC4337.EntryPointInvariant.EvmYulFrame`. + +## What this discharges + +Before this module, every benchmark that touched external calls had to +assume (or reproduce in their own file) the EVM-level frame condition +that external `CALL` cannot SSTORE the caller's slots and cannot write +to caller memory outside the declared output buffer. That assumption is +documented in `AXIOMS.md` under "External Call Module". This module +makes the assumption a *theorem* of `Verity.EVM`, parameterised over an +abstract `CalleeResult` whose universal quantification over its +inhabitants is equivalent to quantifying over arbitrary EVM callee +bytecode. + +## Roadmap + +A subsequent PR will prove that this abstract `CallerFrame` / +`CalleeResult` model is the projection of EvmYul's actual `CALL` opcode +semantics, closing the gap between this layer and the deeper EvmYul +correctness theorems. +-/ + +/-- An EVM address (160-bit; we expose identity only). -/ +abbrev Address := Nat + +/-- The caller-side EVM frame at a `CALL` boundary. The four mutable + sub-states the EVM exposes to a contract are modelled explicitly. -/ +structure CallerFrame where + thisAddress : Address + memory : Nat → Uint256 + storageMap : Nat → Uint256 + transientStorage : Nat → Uint256 + returnDataBuf : Nat → Uint256 + +/-- Everything an arbitrary callee can return to the caller across a + `CALL` boundary. The callee's internal storage updates are scoped to + its own address and so do not appear here — they cannot affect the + caller's frame. Quantifying over inhabitants of this type is + equivalent to quantifying over arbitrary EVM callee programs that + produce any `(success, returndata)` pair. -/ +structure CalleeResult where + success : Uint256 + returnedData : Nat → Uint256 + +/-- The portion of EVM `CALL` semantics that touches the caller's frame. + +By the Yellow Paper, `CALL` to a target ≠ self with the caller's output +buffer `[outOff, outOff + outSize)` updates the caller's memory by +copying `outSize` words of the callee's returndata into that range, +leaves storage and transient storage untouched, and writes the new +returndata buffer to the caller's `returnDataBuf`. -/ +def applyCallToCaller + (caller : CallerFrame) (outOff outSize : Nat) (callee : CalleeResult) + : CallerFrame := + { caller with + memory := fun i => + if outOff ≤ i ∧ i < outOff + outSize then + callee.returnedData (i - outOff) + else + caller.memory i + returnDataBuf := callee.returnedData } + +/-! ## Single-CALL frame theorems -/ + +theorem external_call_preserves_caller_storage + (caller : CallerFrame) (outOff outSize : Nat) (callee : CalleeResult) + (slotIdx : Nat) : + (applyCallToCaller caller outOff outSize callee).storageMap slotIdx = + caller.storageMap slotIdx := by + simp [applyCallToCaller] + +theorem external_call_preserves_caller_transient_storage + (caller : CallerFrame) (outOff outSize : Nat) (callee : CalleeResult) + (slotIdx : Nat) : + (applyCallToCaller caller outOff outSize callee).transientStorage slotIdx = + caller.transientStorage slotIdx := by + simp [applyCallToCaller] + +theorem external_call_preserves_caller_memory_outside_output_buffer + (caller : CallerFrame) (outOff outSize : Nat) (callee : CalleeResult) + (i : Nat) (hOutside : ¬ (outOff ≤ i ∧ i < outOff + outSize)) : + (applyCallToCaller caller outOff outSize callee).memory i = + caller.memory i := by + simp [applyCallToCaller, hOutside] + +/-- Headline form: caller-memory preservation under a disjoint output + buffer. Composes directly with `MemoryModel.Disjoint`. -/ +theorem external_call_preserves_caller_memory + (caller : CallerFrame) (outOff outSize : Nat) (callee : CalleeResult) + (regionLo regionHi : Nat) + (hDisj : outOff + outSize ≤ regionLo ∨ regionHi ≤ outOff) + (i : Nat) (hLo : regionLo ≤ i) (hHi : i < regionHi) : + (applyCallToCaller caller outOff outSize callee).memory i = + caller.memory i := by + apply external_call_preserves_caller_memory_outside_output_buffer + rintro ⟨h1, h2⟩ + rcases hDisj with h | h <;> omega + +/-! ## Iterated-CALL frame theorems -/ + +theorem external_calls_preserve_caller_storage + (caller : CallerFrame) + (calls : List (Nat × Nat × CalleeResult)) + (slotIdx : Nat) : + (calls.foldl + (fun s c => applyCallToCaller s c.1 c.2.1 c.2.2) caller).storageMap slotIdx = + caller.storageMap slotIdx := by + induction calls generalizing caller with + | nil => rfl + | cons c rest ih => + have hStep := external_call_preserves_caller_storage caller c.1 c.2.1 c.2.2 slotIdx + have := ih (applyCallToCaller caller c.1 c.2.1 c.2.2) + simp [List.foldl]; rw [this, hStep] + +theorem external_calls_preserve_caller_transient_storage + (caller : CallerFrame) + (calls : List (Nat × Nat × CalleeResult)) + (slotIdx : Nat) : + (calls.foldl + (fun s c => applyCallToCaller s c.1 c.2.1 c.2.2) caller).transientStorage slotIdx = + caller.transientStorage slotIdx := by + induction calls generalizing caller with + | nil => rfl + | cons c rest ih => + have hStep := external_call_preserves_caller_transient_storage + caller c.1 c.2.1 c.2.2 slotIdx + have := ih (applyCallToCaller caller c.1 c.2.1 c.2.2) + simp [List.foldl]; rw [this, hStep] + +theorem external_calls_preserve_caller_memory_in_disjoint_region + (caller : CallerFrame) + (regionLo regionHi : Nat) + (calls : List (Nat × Nat × CalleeResult)) + (hAllDisj : ∀ c ∈ calls, + c.1 + c.2.1 ≤ regionLo ∨ regionHi ≤ c.1) + (i : Nat) (hLo : regionLo ≤ i) (hHi : i < regionHi) : + (calls.foldl + (fun s c => applyCallToCaller s c.1 c.2.1 c.2.2) caller).memory i = + caller.memory i := by + induction calls generalizing caller with + | nil => rfl + | cons c rest ih => + have hStep := external_call_preserves_caller_memory caller c.1 c.2.1 c.2.2 + regionLo regionHi (hAllDisj c (List.mem_cons_self ..)) i hLo hHi + have hRest : ∀ d ∈ rest, d.1 + d.2.1 ≤ regionLo ∨ regionHi ≤ d.1 := by + intro d hd; exact hAllDisj d (List.mem_cons_of_mem _ hd) + have hIH := ih (applyCallToCaller caller c.1 c.2.1 c.2.2) hRest + simp [List.foldl]; rw [hIH, hStep] + +end Verity.EVM.Frame diff --git a/Verity/EVM/Layout.lean b/Verity/EVM/Layout.lean new file mode 100644 index 000000000..9b91667c7 --- /dev/null +++ b/Verity/EVM/Layout.lean @@ -0,0 +1,105 @@ +import Verity.EVM.MemoryModel + +namespace Verity.EVM.Layout + +open Verity.EVM.MemoryModel + +/-! +# Solc memory-layout schema + +A reusable schema for the standard Solidity memory layout, plus a small +proof toolkit for discharging disjointness goals about call-output +buffers and heap regions. + +Promoted from +`Benchmark.Cases.ERC4337.EntryPointInvariant.Layout.SolcLayout`. + +## Standard solc memory layout + +* Bytes `0x00..0x3f` (words 0-1): scratch space, ephemeral. Used as the + hash input and as the output buffer for keccak / external `call`. +* Byte `0x40` (word 2): free memory pointer. Reads give the first free + heap offset. +* Byte `0x60` (word 3): the zero slot. Read-only, always zero. +* Bytes `0x80+` (words 4+): heap. `new T[](n)` allocates here. + +This is documented at +. + +## `solc_disjoint` tactic shape + +We expose the disjointness theorem `call_buffer_disjoint_from_heap` +that discharges the `MemoryModel.Disjoint` goal whenever the call buffer +sits inside the scratch range and the target region sits in the heap. +For contracts that follow the standard layout this is the only +disjointness fact needed; downstream proofs invoke this lemma directly. +A `solc_disjoint` Lean tactic is the next refinement step (tracked in +`docs/ROADMAP.md`). +-/ + +/-- The standard solc memory layout, expressed in word offsets. -/ +structure SolcLayout where + scratchLo : Nat + scratchHi : Nat + fmpSlotIdx : Nat + zeroSlotIdx : Nat + heapStart : Nat + heapRegionBase : Nat + heapRegionWords : Nat + scratchLo_lt_hi : scratchLo < scratchHi + scratchHi_le_fmp : scratchHi ≤ fmpSlotIdx + fmpSlotIdx_lt_zero : fmpSlotIdx < zeroSlotIdx + zeroSlot_lt_heap : zeroSlotIdx < heapStart + heap_le_heapBase : heapStart ≤ heapRegionBase + heapRegionWords_pos : 0 < heapRegionWords + +/-- The canonical solc layout: scratch at words 0-1, FMP at word 2, + zero slot at word 3, heap from word 4. -/ +def canonicalSolcLayout (heapRegionBase heapRegionWords : Nat) + (hHeap : 4 ≤ heapRegionBase) (hPos : 0 < heapRegionWords) : SolcLayout := + { scratchLo := 0 + scratchHi := 2 + fmpSlotIdx := 2 + zeroSlotIdx := 3 + heapStart := 4 + heapRegionBase + heapRegionWords + scratchLo_lt_hi := by decide + scratchHi_le_fmp := by decide + fmpSlotIdx_lt_zero := by decide + zeroSlot_lt_heap := by decide + heap_le_heapBase := hHeap + heapRegionWords_pos := hPos } + +/-- A call site that uses the scratch range as its output buffer. -/ +structure ScratchOutputBuffer (L : SolcLayout) where + outOff : Nat + outSize : Nat + outOff_eq_scratchLo : outOff = L.scratchLo + outSize_in_range : outOff + outSize ≤ L.scratchHi + +/-- **The disjointness theorem**: any call site that uses the scratch + range as output buffer cannot disturb any word in the heap region + declared by `SolcLayout`. -/ +theorem call_buffer_disjoint_from_heap + (L : SolcLayout) (S : ScratchOutputBuffer L) : + S.outOff + S.outSize ≤ L.heapRegionBase ∨ + L.heapRegionBase + L.heapRegionWords ≤ S.outOff := by + left + have h1 := S.outSize_in_range + have h2 := L.scratchHi_le_fmp + have h3 := L.fmpSlotIdx_lt_zero + have h4 := L.zeroSlot_lt_heap + have h5 := L.heap_le_heapBase + omega + +/-- The same disjointness in `MemoryModel.Disjoint` form, ready to + compose with the frame theorems in `Verity.EVM.Frame`. -/ +theorem call_buffer_disjoint_from_heap_memmodel + (L : SolcLayout) (S : ScratchOutputBuffer L) : + Disjoint S.outOff (S.outOff + S.outSize) + L.heapRegionBase (L.heapRegionBase + L.heapRegionWords) := by + unfold Disjoint + exact call_buffer_disjoint_from_heap L S + +end Verity.EVM.Layout diff --git a/Verity/EVM/MemoryModel.lean b/Verity/EVM/MemoryModel.lean new file mode 100644 index 000000000..4d0edcafc --- /dev/null +++ b/Verity/EVM/MemoryModel.lean @@ -0,0 +1,103 @@ +import Verity.EVM.Uint256 +import Verity.Core.Uint256 + +namespace Verity.EVM.MemoryModel + +open Verity.Core (Uint256) + +/-! +# Abstract EVM memory model + +A word-addressed model of EVM memory used by the `CALL` boundary frame +proofs in `Verity.EVM.Frame`. The model is deliberately abstract — it +quantifies over the callee's returndata function without bottoming out +in EvmYul semantics — so that downstream contract proofs can compose +frame conditions without paying for the full EvmYul model. + +Promoted from `Benchmark.Cases.ERC4337.EntryPointInvariant.Frame.MemFrame`. + +## Roadmap + +A future PR will prove that this abstract model is the projection of +EvmYul's memory semantics through the `CALL` opcode, closing the trust +gap between Verity's contract-level model and EvmYul. +-/ + +/-- A word-addressed EVM memory region. -/ +abbrev MemState := Nat → Uint256 + +/-- Read a word at the given offset. -/ +def myMload (m : MemState) (off : Nat) : Uint256 := m off + +/-- Write a word at the given offset; other offsets unchanged. -/ +def myMstore (m : MemState) (off : Nat) (v : Uint256) : MemState := + fun i => if i = off then v else m i + +/-- Model of a `CALL` that writes `outSize` words of returndata into the + caller's memory starting at `outOff`. The `returnedData` argument is + the universal-quantification surface for arbitrary callee bytecode: + every word the callee can possibly return appears here. -/ +def callWithReturnBuffer + (m : MemState) (outOff outSize : Nat) (returnedData : Nat → Uint256) : MemState := + fun i => if outOff ≤ i ∧ i < outOff + outSize then returnedData (i - outOff) + else m i + +/-- Two half-open ranges are disjoint. -/ +def Disjoint (lo1 hi1 lo2 hi2 : Nat) : Prop := + hi1 ≤ lo2 ∨ hi2 ≤ lo1 + +/-- A CALL with output buffer `[outOff, outOff+outSize)` preserves every + word in a disjoint region `[regionLo, regionHi)`. -/ +theorem call_preserves_disjoint_region + (m : MemState) (outOff outSize regionLo regionHi : Nat) + (returnedData : Nat → Uint256) + (hDisj : Disjoint outOff (outOff + outSize) regionLo regionHi) + (i : Nat) (hLo : regionLo ≤ i) (hHi : i < regionHi) : + (callWithReturnBuffer m outOff outSize returnedData) i = m i := by + unfold callWithReturnBuffer + by_cases hIn : outOff ≤ i ∧ i < outOff + outSize + · rcases hDisj with h | h <;> omega + · simp [hIn] + +/-- Iterated non-aliasing: a sequence of CALLs with all output buffers + disjoint from the target region preserves the region pointwise. -/ +theorem repeated_calls_preserve_region + (m : MemState) (regionLo regionHi : Nat) (i : Nat) + (hLo : regionLo ≤ i) (hHi : i < regionHi) + (calls : List (Nat × Nat × (Nat → Uint256))) + (hAllDisj : ∀ c ∈ calls, + Disjoint c.1 (c.1 + c.2.1) regionLo regionHi) : + (calls.foldl (fun acc c => callWithReturnBuffer acc c.1 c.2.1 c.2.2) m) i = m i := by + induction calls generalizing m with + | nil => rfl + | cons c rest ih => + have hcDisj : Disjoint c.1 (c.1 + c.2.1) regionLo regionHi := + hAllDisj c (List.mem_cons_self ..) + have hRestDisj : ∀ d ∈ rest, + Disjoint d.1 (d.1 + d.2.1) regionLo regionHi := by + intro d hd; exact hAllDisj d (List.mem_cons_of_mem _ hd) + have hStep := call_preserves_disjoint_region m c.1 c.2.1 regionLo regionHi + c.2.2 hcDisj i hLo hHi + have hIh := ih (callWithReturnBuffer m c.1 c.2.1 c.2.2) hRestDisj + simp [List.foldl] + rw [hIh]; exact hStep + +/-- The headline aliasing theorem: if a word is written to memory before a + CALL with a disjoint output buffer, reading the same word after the + CALL yields the originally written value, regardless of the callee's + returndata. -/ +theorem memory_frame_under_arbitrary_callee + (regionLo regionHi outOff outSize : Nat) + (hDisj : Disjoint outOff (outOff + outSize) regionLo regionHi) + (m₀ : MemState) (writtenValue : Uint256) (i : Nat) + (hLo : regionLo ≤ i) (hHi : i < regionHi) + (returnedData : Nat → Uint256) : + let mAfterWrite := myMstore m₀ i writtenValue + let mAfterCall := callWithReturnBuffer mAfterWrite outOff outSize returnedData + myMload mAfterCall i = writtenValue := by + unfold myMload myMstore callWithReturnBuffer + by_cases hIn : outOff ≤ i ∧ i < outOff + outSize + · rcases hDisj with h | h <;> omega + · simp [hIn] + +end Verity.EVM.MemoryModel diff --git a/Verity/Trace.lean b/Verity/Trace.lean new file mode 100644 index 000000000..d7c73dd0c --- /dev/null +++ b/Verity/Trace.lean @@ -0,0 +1,133 @@ +namespace Verity.Trace + +/-! +# Counting trace machinery + +A reusable type-class–based abstraction for "this event happens exactly +N times" properties over contract execution traces. Promoted from +`Benchmark.Cases.ERC4337.EntryPointInvariant.Trace`. + +## Pattern + +A `Trace` is a `List Event` for some event type. A `MatchKey` is +arbitrary metadata (an address, a calldata, a selector); a contract +trace property is typically "the trace contains exactly one event +matching this key." + +We provide: + +* `countMatching` — generic counting predicate. +* `count_le_one_under_pairwise_distinct` — at most one matching event + when the source list of events generators are pairwise distinct on the + key. +* `count_ge_one_when_member_emitted` — at least one matching event when + a known emitter is in the source list with non-empty contribution. + +Downstream contracts instantiate with their concrete `Event`, `Key`, +and `matchEvent` predicate. +-/ + +/-- `countMatching` is positive iff some event matchEvent the key. -/ +def countMatching {Event Key : Type} + (matchEvent : Key → Event → Bool) (key : Key) (trace : List Event) : Nat := + (trace.filter (matchEvent key)).length + +/-- Pairwise distinctness on a key-extraction function. -/ +def Pairwise.distinctOn {α Key : Type} [DecidableEq Key] + (keyOf : α → Key) (xs : List α) : Prop := + List.Pairwise (fun a b => keyOf a ≠ keyOf b) xs + +/-! ## Generic at-most-once theorem + +Stated parametrically over an event-generation function `emit : α → +Option Event` (returns `none` when the generator contributes nothing, as +in the ERC-4337 `callData.length = 0` branch). The generator's domain +list `xs` is the source of truth; we case-split on it. +-/ + +/-- A generic execution loop: `xs` is the list of operations, `emit` + produces zero or one `Event` per operation. -/ +def emitLoop {α Event : Type} (emit : α → Option Event) : List α → List Event + | [] => [] + | x :: rest => + match emit x with + | some e => e :: emitLoop emit rest + | none => emitLoop emit rest + +/-- Every event in `emitLoop emit xs` came from some `x ∈ xs`. -/ +theorem emitLoop_event_origin {α Event : Type} + (emit : α → Option Event) (xs : List α) (e : Event) + (he : e ∈ emitLoop emit xs) : + ∃ x ∈ xs, emit x = some e := by + induction xs with + | nil => simp [emitLoop] at he + | cons hd rest ih => + simp only [emitLoop] at he + split at he + · rename_i e' hEq + rcases List.mem_cons.mp he with hHd | hTail + · refine ⟨hd, List.mem_cons_self .., ?_⟩ + rw [hEq, hHd] + · obtain ⟨x, hx, hex⟩ := ih hTail + exact ⟨x, List.mem_cons_of_mem _ hx, hex⟩ + · obtain ⟨x, hx, hex⟩ := ih he + exact ⟨x, List.mem_cons_of_mem _ hx, hex⟩ + +/-- If `x ∈ xs` and `emit x = some e`, then `e ∈ emitLoop emit xs`. -/ +theorem emitLoop_contains_emitted_event {α Event : Type} + (emit : α → Option Event) (xs : List α) (x : α) (hMem : x ∈ xs) + (e : Event) (hEmit : emit x = some e) : + e ∈ emitLoop emit xs := by + induction xs with + | nil => cases hMem + | cons hd rest ih => + simp only [emitLoop] + rcases List.mem_cons.mp hMem with hEq | hTail + · subst hEq + rw [hEmit] + exact List.mem_cons_self .. + · split + · exact List.mem_cons_of_mem _ (ih hTail) + · exact ih hTail + +/-- **At-most-once**: if all events emitted from `xs` are pairwise + distinct on a matching key, the trace contains at most one match. -/ +theorem count_le_one_under_pairwise_distinct + {α Event Key : Type} [DecidableEq Event] + (emit : α → Option Event) (matchEvent : Key → Event → Bool) (key : Key) + (xs : List α) + (hDistinct : List.Pairwise + (fun a b => ∀ ea eb, emit a = some ea → emit b = some eb → + ¬ (matchEvent key ea = true ∧ matchEvent key eb = true)) xs) : + countMatching matchEvent key (emitLoop emit xs) ≤ 1 := by + induction xs with + | nil => simp [emitLoop, countMatching] + | cons hd rest ih => + have hRestDistinct := (List.pairwise_cons.mp hDistinct).2 + have hHeadDistinct := (List.pairwise_cons.mp hDistinct).1 + simp only [emitLoop] + split + · rename_i e hEq + by_cases hHdMatch : matchEvent key e = true + · -- Head emits a matching event. No tail event can also match. + have hTailZero : + countMatching matchEvent key (emitLoop emit rest) = 0 := by + unfold countMatching + have : (emitLoop emit rest).filter (matchEvent key) = [] := by + apply List.filter_eq_nil_iff.mpr + intro e' he' hMatch' + obtain ⟨x', hx', hex'⟩ := emitLoop_event_origin emit rest e' he' + exact hHeadDistinct x' hx' e e' hEq hex' ⟨hHdMatch, hMatch'⟩ + rw [this]; simp + unfold countMatching at hTailZero ⊢ + simp only [List.filter_cons] + rw [if_pos hHdMatch] + simp [hTailZero] + · -- Head does not match. Recurse on tail. + unfold countMatching + simp only [List.filter_cons] + rw [if_neg (by simp [hHdMatch])] + exact ih hRestDistinct + · exact ih hRestDistinct + +end Verity.Trace diff --git a/artifacts/verification_status.json b/artifacts/verification_status.json index f8034f6ac..714a4af72 100644 --- a/artifacts/verification_status.json +++ b/artifacts/verification_status.json @@ -1,6 +1,6 @@ { "codebase": { - "core_lines": 665, + "core_lines": 711, "example_contracts": 15 }, "proofs": { diff --git a/docs/ROADMAP.md b/docs/ROADMAP.md index 5cd50dc60..c5b8e0ad4 100644 --- a/docs/ROADMAP.md +++ b/docs/ROADMAP.md @@ -272,6 +272,34 @@ Execution priorities: 5. Extend preserved observables to events/logs and typed errors. 6. Widen storage/layout-rich whole-contract coverage, then constructor / `fallback` / `receive`. +### ✅ **ERC-4337 Frame Primitives Landed** +**What**: Reusable primitives extracted from the ERC-4337 EntryPoint benchmark. +**Status**: Landed as `Verity.EVM.Frame`, `Verity.EVM.MemoryModel`, +`Verity.EVM.Layout`, `Verity.Trace`, `Verity.Compiler.FromSolidity`, plus +`nonReentrantTransient` in `Verity.Core` and a `let _ := …` discard rule in +`Contracts.Common`. Each closes a trust gap for every benchmark, not just +ERC-4337. + +**Follow-up items tracked here:** + +- **EvmYul ↔ frame correspondence**: prove `Verity.EVM.Frame.applyCallToCaller` + is the projection of EvmYul's `CALL` opcode semantics through the caller + frame. Closes the abstract-model gap. ~2-4 weeks. +- **Machine `fromSolidity`**: replace the CLI scaffold in + `Verity.Compiler.FromSolidity` with an in-process Solidity → Verity IR → + `verity_contract` translator. Multi-week. Tracked as the canonical + follow-up to the scaffold landed here. +- **`solc_disjoint` tactic**: today `Verity.EVM.Layout.call_buffer_disjoint_from_heap` + discharges the disjointness premise from `SolcLayout` invariants. A + proper tactic that closes `MemoryModel.Disjoint` goals automatically would + remove the boilerplate. ~1 week. +- **`verity_contract` doc-comment support**: `/-- … -/` doc strings inside + the contract DSL still break the parser. The fix requires adding a + doc-comment-prefixed alternative to the `verityFunction` syntax and + threading it through `parseFunction`. ~1 day, deferred from the present PR + because the parser surgery touches a hot path that needs cross-benchmark + testing. + ### 🟡 **Trust Reduction** (2 Remaining Components) **What**: Eliminate or verify remaining trusted components **Status**: 1/3 complete (function selectors resolved)