
refactor: in-memory MemoryMerkleDB reference; decouple vm2 + AVM fuzzer from world_state #24306

Open
charlielye wants to merge 3 commits into cl/ipc-5-avm-cutover from cl/ipc-6-memory-merkle-db

@charlielye

Summary

Introduces MemoryMerkleDB — a minimal, self-contained in-memory implementation of LowLevelMerkleDBInterface that faithfully reproduces world_state::WorldState's tree rules (genesis prefill, zero-hashes, indexed/append-only semantics) — and uses it to remove the AVM simulator's dependency on the in-process WorldState.

After this PR, production AVM (bb-avm-sim) talks to world state only via the generated IPC client (WsdbIpcMerkleDB), and the AVM fuzzer runs entirely on MemoryMerkleDB. vm2 no longer references world_state at all.

This is the precursor to extracting world_state/lmdblib/persistent merkle out of barretenberg into a top-level native-packages/.

What changes

  • MemoryMerkleDB (vm2/simulation/lib/memory_merkle_db.{hpp,cpp} + sparse_memory_tree.hpp): a faithful, full-height, sparse in-memory reference of the four AVM trees. Replaces the WorldState-backed PureRawMerkleDB (deleted).
  • Fidelity gate (memory_merkle_db.test.cpp): constructs an ephemeral WorldState and a MemoryMerkleDB with identical genesis, applies an identical sequence of appends/inserts/updates/pads/checkpoints, and asserts roots, sibling paths, low-leaf lookups, preimages, and leaf values match at every step. This both proves canonical fidelity and guards against drift.
  • MerkleTreeId relocated out of world_state/types.hpp (it's merkle vocabulary, not storage); SequentialInsertionResult repointed to its crypto::merkle_tree definition. vm2 drops the world_state includes and CMake dependency.
  • AVM fuzzer decoupled: the C++ side simulates on MemoryMerkleDB; the TS differential simulator self-bootstraps its own world state (NativeWorldStateService.tmp(), identical genesis by construction) instead of reading a shared on-disk lmdb. FuzzerWorldStateManager drops its WorldState member.
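The "full-height, sparse" idea in the first bullet can be illustrated with a small sketch. It is not the PR's SparseMemoryTree: the class name, the `hash2` stand-in (a simple 64-bit mixer, not Poseidon2, with no domain separation) and the `uint64_t` node type are all assumptions made for illustration. The point it shows is that precomputed zero-hashes let a depth-42 tree store only the nodes on written paths.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Stand-in two-to-one hash (NOT Poseidon2). Order-sensitive, so left/right matter.
inline uint64_t hash2(uint64_t left, uint64_t right)
{
    uint64_t h = 0xcbf29ce484222325ULL;
    h = (h ^ left) * 0x100000001b3ULL;
    h = (h ^ right) * 0x100000001b3ULL;
    return h ^ (h >> 29);
}

// Hypothetical sparse tree: level 0 holds leaves, level `depth` holds the root.
class SparseTree {
  public:
    explicit SparseTree(std::size_t depth)
        : depth_(depth)
        , zero_(depth + 1, 0)
    {
        assert(depth <= 48); // node keys pack the level above bit 48
        // zero_[0] is the empty leaf; zero_[l] is the root of an empty subtree of height l.
        for (std::size_t level = 1; level <= depth_; ++level) {
            zero_[level] = hash2(zero_[level - 1], zero_[level - 1]);
        }
    }

    // Write one leaf and rehash only the nodes on its path to the root.
    void set(uint64_t index, uint64_t leaf)
    {
        assert(index < (uint64_t{ 1 } << depth_));
        nodes_[key(0, index)] = leaf;
        for (std::size_t level = 0; level < depth_; ++level) {
            const uint64_t left = node(level, index & ~uint64_t{ 1 });
            const uint64_t right = node(level, index | 1);
            index >>= 1;
            nodes_[key(level + 1, index)] = hash2(left, right);
        }
    }

    uint64_t root() const { return node(depth_, 0); }

    // Siblings from leaf level upwards; untouched siblings come from zero_.
    std::vector<uint64_t> sibling_path(uint64_t index) const
    {
        std::vector<uint64_t> path;
        for (std::size_t level = 0; level < depth_; ++level) {
            path.push_back(node(level, index ^ 1));
            index >>= 1;
        }
        return path;
    }

    std::size_t stored_nodes() const { return nodes_.size(); }

  private:
    static uint64_t key(std::size_t level, uint64_t index) { return (static_cast<uint64_t>(level) << 48) | index; }

    uint64_t node(std::size_t level, uint64_t index) const
    {
        const auto it = nodes_.find(key(level, index));
        return it == nodes_.end() ? zero_[level] : it->second;
    }

    std::size_t depth_;
    std::vector<uint64_t> zero_;
    std::unordered_map<uint64_t, uint64_t> nodes_;
};

// Recompute a root from a leaf and its sibling path, as a verifier would.
inline uint64_t root_from_path(uint64_t leaf, uint64_t index, const std::vector<uint64_t>& path)
{
    uint64_t acc = leaf;
    for (const uint64_t sibling : path) {
        acc = (index & 1) ? hash2(sibling, acc) : hash2(acc, sibling);
        index >>= 1;
    }
    return acc;
}
```

In this sketch, a single `set` on a depth-42 tree stores 43 nodes (one per level), and `root_from_path` applied to the returned sibling path reproduces `root()`. A dense array of 2^42 leaves would not be practical at that height.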

Validation

  • memory_merkle_db.test.cpp: 7/7 green (genesis, appends, pad, nullifier inserts, public-data insert+update, checkpoints, mixed sequence) against a real ephemeral WorldState.
  • FUZZING_AVM=ON: all three fuzzer targets compile/link; prover.fuzzer runs (simulate → check_circuit → prove → verify) with no divergence.
  • grep for world_state over vm2/ (excluding tests) returns no matches.
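The fidelity gate follows a lockstep differential pattern: apply each operation to a reference and a candidate, then compare observable state before moving on. The sketch below shows only that pattern. `ReferenceLog`, `RunningLog`, `SkipsZeroLog`, `mix` and `first_divergence` are hypothetical toy names, not the types used in memory_merkle_db.test.cpp, and the toy "root" is a running digest rather than a Merkle root.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Stand-in mixer (NOT Poseidon2); makes the toy digest depend on every leaf in order.
inline uint64_t mix(uint64_t a, uint64_t b)
{
    return a ^ (b + 0x9e3779b97f4a7c15ULL + (a << 6) + (a >> 2));
}

// Reference: keeps every leaf and recomputes the digest from scratch.
struct ReferenceLog {
    std::vector<uint64_t> leaves;
    void append(uint64_t leaf) { leaves.push_back(leaf); }
    uint64_t root() const
    {
        uint64_t acc = 0;
        for (const uint64_t leaf : leaves) {
            acc = mix(acc, leaf);
        }
        return acc;
    }
    std::size_t size() const { return leaves.size(); }
};

// Candidate: keeps only a running digest and a count.
struct RunningLog {
    uint64_t acc = 0;
    std::size_t count = 0;
    void append(uint64_t leaf)
    {
        acc = mix(acc, leaf);
        ++count;
    }
    uint64_t root() const { return acc; }
    std::size_t size() const { return count; }
};

// Deliberately wrong candidate: silently drops zero leaves.
struct SkipsZeroLog : RunningLog {
    void append(uint64_t leaf)
    {
        if (leaf != 0) {
            RunningLog::append(leaf);
        }
    }
};

// Drive both implementations in lockstep; return the first step at which they disagree.
template <typename Ref, typename Cand>
std::optional<std::size_t> first_divergence(Ref& ref, Cand& cand, const std::vector<uint64_t>& ops)
{
    for (std::size_t step = 0; step < ops.size(); ++step) {
        ref.append(ops[step]);
        cand.append(ops[step]);
        if (ref.root() != cand.root() || ref.size() != cand.size()) {
            return step;
        }
    }
    return std::nullopt;
}

inline void gate_example()
{
    ReferenceLog ref_a;
    RunningLog good;
    assert(!first_divergence(ref_a, good, { 1, 0, 2 }).has_value());

    ReferenceLog ref_b;
    SkipsZeroLog bad;
    assert(first_divergence(ref_b, bad, { 1, 0, 2 }) == std::optional<std::size_t>{ 1 });
}
```

Comparing after every step, rather than only at the end, pins a mismatch to the operation that introduced it. The description attributes the same property to the real gate when it says the assertions run "at every step".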

One divergence found + fixed

MemoryIndexedTree reported the freshly-inserted leaf in insertion_witness_data[0].leaf where ContentAddressedIndexedTree reports the empty pre-write leaf; corrected to match.
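To make the distinction concrete, here is a toy indexed-leaf list whose insertion result captures the target slot as it was before the write. The types `IndexedLeaf`, `LeafWitness`, `InsertionResult` and `ToyIndexedLeaves` are hypothetical and simplified: there is no hashing, and the genesis is a single zero leaf rather than the 128-leaf prefill. They are not the PR's MemoryIndexedTree or ContentAddressedIndexedTree.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Toy indexed leaf: leaves form a sorted linked list via next_index/next_value.
struct IndexedLeaf {
    uint64_t value = 0;
    uint64_t next_index = 0;
    uint64_t next_value = 0;
    bool operator==(const IndexedLeaf& other) const
    {
        return value == other.value && next_index == other.next_index && next_value == other.next_value;
    }
};

struct LeafWitness {
    IndexedLeaf leaf;  // leaf contents as they were BEFORE this insertion touched them
    std::size_t index; // position of that leaf
};

struct InsertionResult {
    LeafWitness low_leaf;  // predecessor, before its next-pointer was updated
    LeafWitness insertion; // target slot, before the new leaf was written (i.e. empty)
};

class ToyIndexedLeaves {
  public:
    // Assumption: one zero leaf at index 0 stands in for the real genesis prefill.
    ToyIndexedLeaves()
        : leaves_(1)
    {}

    InsertionResult insert(uint64_t value)
    {
        // Find the low leaf: the largest stored value strictly below `value`.
        std::size_t low = 0;
        for (std::size_t i = 0; i < leaves_.size(); ++i) {
            if (leaves_[i].value == value) {
                throw std::invalid_argument("value already present");
            }
            if (leaves_[i].value < value && leaves_[i].value >= leaves_[low].value) {
                low = i;
            }
        }
        assert(leaves_[low].value < value);

        const std::size_t new_index = leaves_.size();
        // Capture both witnesses before mutating anything.
        const InsertionResult result{ { leaves_[low], low }, { IndexedLeaf{}, new_index } };

        const IndexedLeaf fresh{ value, leaves_[low].next_index, leaves_[low].next_value };
        leaves_[low].next_index = new_index;
        leaves_[low].next_value = value;
        leaves_.push_back(fresh);
        return result;
    }

    const IndexedLeaf& at(std::size_t index) const { return leaves_.at(index); }

  private:
    std::vector<IndexedLeaf> leaves_;
};
```

In this sketch, after `insert(30)` the stored leaf at index 1 is `{30, 0, 0}`, while `insertion.leaf` in the returned result is the empty leaf. Reporting the stored leaf there instead is the kind of divergence described above.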

Replace the AVM differential fuzzer's WorldState-backed PureRawMerkleDB
with a self-contained in-memory MemoryMerkleDB so vm2 no longer depends on
world_state in process.

- Add SparseMemoryTree: a full-height (up to depth 42) sparse Merkle tree
  with domain-separated Poseidon2 node hashing, since the dense MemoryTree
  is capped at depth 20.
- Add MemoryMerkleDB implementing LowLevelMerkleDBInterface over four
  full-height trees (note-hash/L1->L2 append-only, nullifier/public-data
  indexed), using the same AztecMerkleHashPolicy domain separators and
  indexed-tree genesis convention as the WorldState so roots and sibling
  paths agree. Empty padding leaves hash to zero, matching the WorldState's
  batch insertion. The indexed-tree insertion witness reports the empty
  pre-write leaf (matching ContentAddressedIndexedTree's original-leaf
  witness), not the freshly inserted leaf.
- Add memory_merkle_db.test.cpp: a canonical-fidelity gate that drives a
  real world_state::WorldState and a MemoryMerkleDB through the same
  genesis ({NULLIFIER:128, PUBLIC_DATA:128}) and op sequence (append note
  hashes, insert nullifiers, insert/update public data, pad, checkpoint
  create/commit/revert) and asserts equality of roots, sibling paths,
  low-leaf lookups, indexed-leaf preimages and append-only leaf values
  after every step. world_state is a test-only link dependency of
  vm2_tests for this; vm2 itself no longer links it.
- Cut the AVM fuzzer's C++ simulator and prover paths over to
  simulate_fast_internal / simulate_for_hint_collection_internal against a
  per-simulation copy of the in-memory DB; the file-backed world state is
  retained only for the TS differential.
- Delete PureRawMerkleDB and the simulate_fast_with_existing_ws /
  simulate_for_hint_collection entry points.
- Relocate MerkleTreeId, getMerkleTreeName and WorldStateRevision into
  crypto/merkle_tree/merkle_tree_id.hpp and SequentialInsertionResult /
  BatchInsertionResult into crypto/merkle_tree/response.hpp, re-exported
  from world_state for existing callers.
- Remove world_state from vm2's CMake dependencies.

The fuzzer no longer passes a shared on-disk lmdb between the C++ and TS
differential simulators. The C++ FuzzerWorldStateManager now seeds only an
in-memory MemoryMerkleDB (genesis 128), and the TS simulator self-bootstraps a
fresh NativeWorldStateService.tmp() per process. Both produce an identical
genesis by construction (same 128 nullifier/public-data prefill and
header-generator point), so the shared database is unnecessary.

- FuzzerWorldStateManager: remove the world_state::WorldState member; setup
  methods apply only to the in-memory DB; fork() re-seeds the DB and
  checkpoint/commit/revert/reset_world_state become no-ops.
- Drop wsDataDir/wsMapSizeKb from the serialized FuzzerSimulationRequest on
  both sides.
- Remove world_state from the avm_fuzzer CMake dependencies.
@charlielye charlielye force-pushed the cl/ipc-6-memory-merkle-db branch from c06723b to 815f8ac Compare June 26, 2026 13:36

The fuzzing-avm syntax check (ci-full-no-test-cache) compiles the avm_fuzzer,
which the cutover left with latent compile errors never caught by ci/x-fast:
- simulator.cpp: missing #include <thread> for std::this_thread
- merkle_check.fuzzer.cpp, fuzzer_lib.cpp: missing serialize/msgpack_impl.hpp for
  msgpack_encode_buffer
- emit_public_log.fuzzer.cpp, merkle_check.fuzzer.cpp: unqualified avm2:: -> bb::avm2::
@AztecBot

Flakey Tests

🤖 says: This CI run detected 1 test that failed but was tolerated due to a .test_patterns.yml entry.

FLAKED (http://ci.aztec-labs.com/59d6488769b3f38c): yarn-project/end-to-end/scripts/run_test.sh simple src/e2e_epochs/epochs_invalidate_block.parallel.test.ts "proposer invalidates multiple checkpoints" (433s) (code: 0) group:e2e-p2p-epoch-flakes
