feat(evm): symbolic-assist worker#14783
Closed
figtracer wants to merge 7 commits into
Closed
Conversation
…ded fuzzing
Adds a directed-mutation worker that periodically replays a corpus seed under a
branch-trace inspector, finds the deepest unseen opposite-side branch ("frontier"),
and proposes ABI-aware calldata rewrites that flip the branch — without invoking
an SMT solver. Inspired by Echidna's Echidna.SymExec.Exploration and the
SaferMaker writeup (https://hackmd.io/@SaferMaker/EVM-Sym-Exec).
Architecture:
- BranchTraceInspector (crates/evm/evm/src/inspectors/branch_trace.rs):
observation-only REVM Inspector that records every JUMPI together with the
immediately-preceding compare opcode (EQ/LT/GT/SLT/SGT/ISZERO) and its
concrete operands. Uses the same edge-id hashing as EdgeCovInspector so
frontier ids can be looked up directly in the corpus history map.
- crates/evm/evm/src/executors/symexec/:
- types.rs: FrontierKey, FrontierStats, SymExecState, Candidate, hard caps.
- select.rs: seed scoring (favored > productive > shorter), deepest
unseen-frontier picker, attempt bookkeeping.
- mutate.rs: ABI-aware Redqueen-style calldata rewrites for scalar args
(uintN/intN/bool/address/bytesN). Skips dynamic types in v1.
- mod.rs: SymBackend trait + HeuristicAbiRewrite v1 backend +
run_symexec_assist top-level loop. Trait stays in place so v2 can swap in
z3/hevm/halmos without changing the loop.
- WorkerCorpus helpers: is_master, symexec_seed_pool, history_map_snapshot,
master_sync_dir, symexec_validate. The validator clones the executor and
history map and only reports new edges; it never mutates the live history
map, so the next calibrate() picks up the persisted file in the normal way
without poisoning the corpus.
- Config: FuzzCorpusConfig.symexec_assist (bool, default false) +
symexec_assist_interval (u32, default 200) + symexec_assist_active() guard
(requires coverage-guided + EVM edge coverage; sancov-edges disables it).
- Wiring:
- executors/fuzz/mod.rs: master worker only, periodic call after each sync,
stateful = false.
- executors/invariant/mod.rs: single worker, periodic call at the top of the
run loop, stateful = true (commits prefix txs during replay/validation,
resolves the final tx's ABI dynamically via FuzzRunIdentifiedContracts.
- Inspector plumbing: branch_trace propagates through InspectorStack ->
InspectorData -> RawCallResult.branch_trace (mirrors the existing
edge_coverage path).
Validation gate ensures only candidates that produce a real new EVM edge are
persisted to <corpus_dir>/worker0/sync/, where the existing corpus protocol
distributes them to other workers.
Hard CPU budget per cycle: 1 seed, 1 frontier, <=8 candidates, max 3 attempts
per frontier per campaign.
Tests:
- forge cli: symexec_assist_solves_equality_guard (stateless fuzz) — handler
with require(x == MAGIC) where MAGIC is a 32-byte distinctive value;
asserts the corpus directory contains the magic bytes after the run.
- forge cli: symexec_assist_solves_equality_guard_invariant (stateful
invariant) — same handler exposed via targetContract; trivial invariant;
same structural corpus check.
All existing tests still pass:
forge cli fuzz tests: 16 passed (incl. new regression)
forge cli invariant tests: 68 passed (incl. new regression)
foundry-evm lib tests: 21 passed
foundry-config lib tests: 154 passed
Future extensions kept open by the SymBackend trait:
- Z3PathFlip backend (real SMT)
- ExternalHevm / ExternalHalmos backends
- Mutating the final-tx sender for access-control branches
EOF
)
Amp-Thread-ID: https://ampcode.com/threads/T-019e2bc9-ece8-718f-9279-971e3e982bec
Co-authored-by: Amp <amp@ampcode.com>
figtracer
added a commit
to figtracer/foundry
that referenced
this pull request
May 18, 2026
- pick up upstream CallDetails.value field (added in foundry-rs#14482) and propagate it through the heuristic ABI rewrite in symexec/mutate.rs - drop unused FrontierKey import - docs/dev/symbolic-halmos-port.md: comprehensive plan for landing native symbolic testing in Foundry by porting Halmos in phases; positions foundry-rs#14783 as Phase 0 (concolic-lite assist) complementary to a future native engine Amp-Thread-ID: https://ampcode.com/threads/T-019e2bc9-ece8-718f-9279-971e3e982bec Co-authored-by: Amp <amp@ampcode.com>
387cfba to
04c35eb
Compare
- fmt/clippy/rustdoc fixes: const fn on BranchTrace::{len,is_empty},
allow too_many_arguments on SymBackend::propose, drop redundant clones
in regression tests, fix broken intra-doc link, wrap SaferMaker URL
- demote 'pub mod symexec' to 'pub(crate)' (private_interfaces lint)
and trim re-exports to the surface actually used by the crate
- propagate new CallDetails.value (introduced upstream in foundry-rs#14482)
through the heuristic ABI rewrite
- hoist fully-qualified inline imports (crate::executors::symexec::*,
crate::inspectors::BranchTrace, alloy_primitives::U256::ZERO,
std::time::*, foundry_common::fs::write_json_file, std::fs::*) to
top-of-file 'use' blocks across our diff
- drop unused MAX_SEEDS_PER_CYCLE constant
Amp-Thread-ID: https://ampcode.com/threads/T-019e2bc9-ece8-718f-9279-971e3e982bec
Co-authored-by: Amp <amp@ampcode.com>
04c35eb to
8b0593e
Compare
… multiple frontiers Live-testing the assist worker against a Foundry project surfaced three real gaps that this commit fixes: 1. Solidity with the optimizer enabled lowers `if (a == b)` to `SUB(a, b); JUMPI(skip, diff)` (and the Yul IR pipeline occasionally uses `XOR`). The branch-trace inspector previously only recognised `EQ`/`LT`/`GT`/`SLT`/`SGT`/`ISZERO`, so the most common shape of an equality guard in optimised bytecode was invisible. Record `SUB` and `XOR` as equality-class compares with `inverted = true`. 2. Even when the underlying compare was recognised, the predicate-held logic looked at the JUMPI direction directly. With Solidity's `EQ; ISZERO; JUMPI` lowering and the new SUB/XOR cases, the JUMPI direction is the *opposite* of the predicate. Track `inverted` on `CmpObservation` and compute `predicate_held = took_branch XOR cmp.inverted` in the mutation heuristic, plus widen the inert-op window through `DUP` / `SWAP` / `POP` so SOLC-emitted stack manipulation between the compare and the JUMPI doesn't clear the pending compare. 3. The frontier picker walked the trace in reverse and returned the first eligible branch. The very deepest unseen branch is almost always in the post-call test-harness bookkeeping (forge-std asserting on a returned bool, ABI cleanup, etc.) rather than the contract under test, so the worker spent its attempts mutating calldata to flip guards that no rewrite could possibly satisfy. Expose `collect_frontiers` to gather up to `MAX_FRONTIERS_PER_CYCLE = 16` deepest-first frontiers, skip trivially uninvertible `EQ(a, a)` compares, and have `HeuristicAbiRewrite::propose` fan candidate rewrites across all of them so a single replay reaches guards that live several frames back along the trace. Existing regression tests (`symexec_assist_solves_equality_guard` and its invariant counterpart) continue to pass. Amp-Thread-ID: https://ampcode.com/threads/T-019e2bc9-ece8-718f-9279-971e3e982bec Co-authored-by: Amp <amp@ampcode.com>
- Remove the `SymBackend` trait and `HeuristicAbiRewrite` newtype (single impl, single caller) and replace with a plain `propose_candidates` free function. The trait was placeholder scaffolding for a hypothetical SMT backend that does not exist; the worker has no symbolic engine of its own to feed one. - Drop module-level mentions of Echidna, the SaferMaker writeup, AFL / Redqueen, and "concolic-lite v1" / "v2". Describe the module by what it does today: a directed-mutation helper for coverage-guided fuzzing that captures EVM compare operands at runtime and rewrites scalar ABI args to flip frontier branches, with no SMT solver. - Drop the now-unused `pick_frontier` wrapper (`collect_frontiers` is the only caller path). Resolves the `dead_code` warning surfaced by rustc after the previous frontier-fan-out change. Amp-Thread-ID: https://ampcode.com/threads/T-019e2bc9-ece8-718f-9279-971e3e982bec Co-authored-by: Amp <amp@ampcode.com>
Three concrete bugs that left the worker's accepted candidates silently inert in live runs: 1. mutate.rs: `abi_encode_input` includes the function selector, so prepending the original selector ourselves produced calldata that was 4 bytes too long. Switch to `abi_encode_input_raw`. 2. corpus.rs::symexec_validate: defensively re-enable the edge coverage inspector on the cloned executor so candidate replay actually populates `result.edge_coverage` (otherwise `merge_all_coverage` could only ever report "no new coverage"). 3. corpus.rs::run_symexec_assist: route accepted candidates directly into the master's in-memory corpus via a new `insert_symexec_candidate` helper, instead of relying on the `sync/` -> `calibrate()` path. The sync filter is in second-resolution timestamps, so candidates produced inside the same second as the last sync (very common) were filtered out and never imported. corpus.rs::new_input now also returns a freshly-imported corpus entry verbatim on its first selection (when `total_mutations == 0`) before letting `abi_mutate` scramble it. Without this, a candidate whose value is the exact magic the worker discovered would be mutated immediately on its first selection and the precise value would be lost. Live benchmark (5000 runs, runtime-derived keccak guard, dictionary PUSH-bytes + storage scraping disabled) on seed 0x1: baseline passes (no counterexample found in 5000 runs); assist returns a real counterexample with the magic preimage as calldata. Amp-Thread-ID: https://ampcode.com/threads/T-019e2bc9-ece8-718f-9279-971e3e982bec Co-authored-by: Amp <amp@ampcode.com>
…play fidelity) Concrete bugs identified by the audit on the previous commit: 1. Hash-builder mismatch (CRITICAL). frontier_edge_id and EdgeCovInspector::store_hit each used an independent DefaultHashBuilder::default(). alloy's DefaultHashBuilder is foldhash::fast::RandomState — randomly seeded per instance — so frontier IDs landed in completely different history_map buckets than the real coverage map. Now expose the live inspector's hash_builder via EdgeCovInspector::hash_builder() and pass that exact builder into propose_candidates. 2. collect_edge_coverage(true) in symexec_validate was harmful: it replaced the cloned executor's EdgeCovInspector with a fresh instance whose hasher had a different random seed, so the validation's edge_coverage and the corpus's history_map were bucketing into mismatched indices. Bail out instead when the live executor has no EdgeCovInspector (runner.rs already enables it whenever symexec is active). 3. new_input's verbatim-on-first-pick was global to all coverage-guided fuzzing and credited no favorability for productive verbatim runs. Replaced with a per-entry try_verbatim_once flag on CorpusEntry, set only by insert_symexec_candidate, and we now set current_mutated so process_inputs credits productive verbatim hits. 4. Pending compare window in branch_trace was too permissive: it preserved across DUP, SWAP, POP — opcodes that can copy, reorder or discard the cmp result on the stack and let us attribute a stale compare to an unrelated JUMPI. Restricted to PUSH0..PUSH32 and JUMPDEST. 5. Replay used call_raw(..., U256::ZERO) and ignored warp/roll/value. Replaced with execute_tx (same helper invariant/fuzz use) so the traced path matches the path the real fuzz loop took. 6. Removed the duplicate write to worker0/sync/ — going through the sync timestamp filter was a re-import / stale-skip hazard, and the candidate already lives in the in-memory corpus + corpus/ directory via insert_symexec_candidate. After these fixes the worker on the 10-seed × 6-case benchmark finds 50/50 planted bugs (same as baseline) and correctly leaves the honest negative alone. It's no longer a robustness win on most cases; only one outlier (TB5 seed 0x7: baseline 1247 runs, assist 151) is materially faster. The previous "baseline misses, assist finds" pattern was largely an artifact of the buggy worker flooding the corpus with noise. Amp-Thread-ID: https://ampcode.com/threads/T-019e2bc9-ece8-718f-9279-971e3e982bec Co-authored-by: Amp <amp@ampcode.com>
Collaborator
Author
|
Closing: after correctness fixes (stable edge hashing, faithful replay, per-entry verbatim flag), the worker no longer shows a clear advantage over the existing coverage-guided fuzzer on the real-world benchmark (https://gist.github.com/figtracer/592995f52e002710205aa91f5db15018). Symbolic work belongs in #14796. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What this is
A directed-mutation worker for the existing coverage-guided fuzzer. Each cycle: replay a corpus seed under a branch-trace inspector, find unseen opposite-side
JUMPIbranches whose preceding compare opcode operands are visible, propose ABI-aware scalar rewrites that would flip the branch, validate by re-running through a cloned executor (must produce a real new EVM edge), and insert accepted candidates directly into the master's in-memory corpus.No SMT solver, no symbolic execution. Only flips branches whose compare operands surface at runtime and are reachable by rewriting a scalar ABI argument. Keccak-of-input and similar non-invertible guards are out of scope and belong in the symbolic engine work (#14796).
Benchmark
Real-world benchmark (contracts, profiles, raw per-seed results): https://gist.github.com/figtracer/592995f52e002710205aa91f5db15018