refactor: in-memory MemoryMerkleDB reference; decouple vm2 + AVM fuzzer from world_state #24306
Open
charlielye wants to merge 3 commits into
Replace the AVM differential fuzzer's WorldState-backed PureRawMerkleDB
with a self-contained in-memory MemoryMerkleDB so vm2 no longer depends on
world_state in process.
- Add SparseMemoryTree: a full-height (up to depth 42) sparse Merkle tree
with domain-separated Poseidon2 node hashing, since the dense MemoryTree
is capped at depth 20.
- Add MemoryMerkleDB implementing LowLevelMerkleDBInterface over four
full-height trees (note-hash/L1->L2 append-only, nullifier/public-data
indexed), using the same AztecMerkleHashPolicy domain separators and
indexed-tree genesis convention as the WorldState so roots and sibling
paths agree. Empty padding leaves hash to zero, matching the WorldState's
batch insertion. The indexed-tree insertion witness reports the empty
pre-write leaf (matching ContentAddressedIndexedTree's original-leaf
witness), not the freshly inserted leaf.
- Add memory_merkle_db.test.cpp: a canonical-fidelity gate that drives a
real world_state::WorldState and a MemoryMerkleDB through the same
genesis ({NULLIFIER:128, PUBLIC_DATA:128}) and op sequence (append note
hashes, insert nullifiers, insert/update public data, pad, checkpoint
create/commit/revert) and asserts equality of roots, sibling paths,
low-leaf lookups, indexed-leaf preimages and append-only leaf values
after every step. world_state is a test-only link dependency of
vm2_tests for this; vm2 itself no longer links it.
- Cut the AVM fuzzer's C++ simulator and prover paths over to
simulate_fast_internal / simulate_for_hint_collection_internal against a
per-simulation copy of the in-memory DB; the file-backed world state is
retained only for the TS differential.
- Delete PureRawMerkleDB and the simulate_fast_with_existing_ws /
simulate_for_hint_collection entry points.
- Relocate MerkleTreeId, getMerkleTreeName and WorldStateRevision into
crypto/merkle_tree/merkle_tree_id.hpp and SequentialInsertionResult /
BatchInsertionResult into crypto/merkle_tree/response.hpp, re-exported
from world_state for existing callers.
- Remove world_state from vm2's CMake dependencies.
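
For readers unfamiliar with the full-height sparse layout mentioned above, here is a minimal sketch of the idea, not the PR's SparseMemoryTree. It stores only touched nodes in a map and falls back to precomputed per-level zero hashes for everything else, which is what makes depth 42 affordable. ToySparseTree, toy_hash2 and root_from_path are hypothetical names, and toy_hash2 is an arbitrary 64-bit mixer standing in for the domain-separated Poseidon2 hash.

```cpp
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

// ASSUMPTION: toy 2-to-1 hash for illustration only. This is NOT Poseidon2.
inline uint64_t toy_hash2(uint64_t left, uint64_t right)
{
    uint64_t h = 0x9e3779b97f4a7c15ULL ^ left;
    h = (h ^ (h >> 31)) * 0xbf58476d1ce4e5b9ULL;
    h ^= right + 0x94d049bb133111ebULL + (h << 6) + (h >> 2);
    h = (h ^ (h >> 29)) * 0x94d049bb133111ebULL;
    return h ^ (h >> 32);
}

// Hypothetical sparse tree: level 0 holds leaves, level `depth` holds the root.
class ToySparseTree {
  public:
    using Key = std::pair<uint32_t, uint64_t>; // (level, index within level)

    explicit ToySparseTree(uint32_t depth)
        : depth_(depth)
        , zero_(depth + 1, 0)
    {
        // zero_[0] is the empty leaf (0); zero_[l] is the root of an empty subtree of height l.
        for (uint32_t l = 1; l <= depth; ++l) {
            zero_[l] = toy_hash2(zero_[l - 1], zero_[l - 1]);
        }
    }

    // A node that was never written is the zero hash of its level.
    uint64_t node(uint32_t level, uint64_t index) const
    {
        auto it = nodes_.find(Key{ level, index });
        return it == nodes_.end() ? zero_[level] : it->second;
    }

    // Write one leaf and recompute the `depth` ancestors on its path.
    void set_leaf(uint64_t index, uint64_t value)
    {
        nodes_[Key{ 0, index }] = value;
        for (uint32_t l = 0; l < depth_; ++l) {
            uint64_t parent = index >> 1;
            nodes_[Key{ l + 1, parent }] = toy_hash2(node(l, parent * 2), node(l, parent * 2 + 1));
            index = parent;
        }
    }

    uint64_t root() const { return node(depth_, 0); }

    // Siblings from leaf level upwards, one per level.
    std::vector<uint64_t> sibling_path(uint64_t index) const
    {
        std::vector<uint64_t> path;
        for (uint32_t l = 0; l < depth_; ++l) {
            path.push_back(node(l, index ^ 1));
            index >>= 1;
        }
        return path;
    }

  private:
    uint32_t depth_;
    std::vector<uint64_t> zero_;
    std::map<Key, uint64_t> nodes_;
};

// Recompute a root from a leaf, its index and its sibling path.
inline uint64_t root_from_path(uint64_t leaf, uint64_t index, const std::vector<uint64_t>& path)
{
    uint64_t acc = leaf;
    for (uint64_t sib : path) {
        acc = (index & 1) ? toy_hash2(sib, acc) : toy_hash2(acc, sib);
        index >>= 1;
    }
    return acc;
}
```

Each write touches one map entry per level, so memory grows with the number of writes rather than with 2^depth. An empty leaf hashing to zero here mirrors the "empty padding leaves hash to zero" convention only loosely; the real leaf encoding is not shown in this thread.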
The fuzzer no longer shares an on-disk lmdb between the C++ and TS
differential simulators. The C++ FuzzerWorldStateManager now seeds only
an in-memory MemoryMerkleDB (genesis 128), and the TS simulator
self-bootstraps a fresh NativeWorldStateService.tmp() per process. Both
produce an identical genesis by construction (same 128
nullifier/public-data prefill and header-generator point), so the shared
database is unnecessary.
- FuzzerWorldStateManager: remove the world_state::WorldState member;
setup methods apply only to the in-memory DB; fork() re-seeds the DB and
checkpoint/commit/revert/reset_world_state become no-ops.
- Drop wsDataDir/wsMapSizeKb from the serialized FuzzerSimulationRequest
on both sides.
- Remove world_state from the avm_fuzzer CMake dependencies.
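
A rough sketch of the seeding and fork pattern described in this commit, under stated assumptions: ToyDb and ToyManager are hypothetical stand-ins, the prefill values are placeholders, and digest() is an FNV-style fold standing in for a tree root. The point is only that a value-type DB seeded deterministically gives every simulation its own copy with an identical starting state.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical in-memory DB: a plain value type, so copying it is a deep copy.
struct ToyDb {
    std::vector<uint64_t> nullifiers;

    // Deterministic genesis prefill. ASSUMPTION: the values 0..prefill-1 are placeholders.
    static ToyDb genesis(std::size_t prefill)
    {
        ToyDb db;
        for (std::size_t i = 0; i < prefill; ++i) {
            db.nullifiers.push_back(i);
        }
        return db;
    }

    void insert(uint64_t value) { nullifiers.push_back(value); }

    // Order-sensitive digest standing in for a Merkle root (FNV-1a style fold).
    uint64_t digest() const
    {
        uint64_t h = 1469598103934665603ULL;
        for (uint64_t v : nullifiers) {
            h ^= v;
            h *= 1099511628211ULL;
        }
        return h;
    }
};

// Hypothetical manager: holds the seed state and hands out per-simulation copies.
class ToyManager {
  public:
    explicit ToyManager(std::size_t prefill)
        : seed_(ToyDb::genesis(prefill))
    {}

    // Setup applies to the seed only.
    void seed_insert(uint64_t value) { seed_.insert(value); }

    // Each simulation mutates its own copy; the seed is never touched.
    ToyDb fork() const { return seed_; }

  private:
    ToyDb seed_;
};
```

Because a fork is just a copy, there is nothing to checkpoint or revert on the manager side: discarding the copy is the revert.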
The fuzzing-avm syntax check (ci-full-no-test-cache) compiles the
avm_fuzzer, which the cutover left with latent compile errors that
ci/x-fast never caught:
- simulator.cpp: missing #include <thread> for std::this_thread
- merkle_check.fuzzer.cpp, fuzzer_lib.cpp: missing
serialize/msgpack_impl.hpp for msgpack_encode_buffer
- emit_public_log.fuzzer.cpp, merkle_check.fuzzer.cpp: unqualified
avm2:: -> bb::avm2::
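
Two of these error classes are easy to reproduce in isolation. The snippet below is an illustrative sketch, not code from the fuzzer: answer() and wait_then_answer() are hypothetical, and the namespace is recreated locally just to show the qualification issue.

```cpp
#include <chrono>
#include <thread> // declares std::this_thread; relying on a transitive include is fragile

namespace bb {
namespace avm2 {
// Hypothetical placeholder for something that lives in bb::avm2.
inline int answer()
{
    return 42;
}
} // namespace avm2
} // namespace bb

inline int wait_then_answer()
{
    // Needs <thread> (and <chrono> for the duration type).
    std::this_thread::sleep_for(std::chrono::milliseconds(1));
    // At global scope with no using-directive, `avm2::answer()` does not resolve;
    // the fully qualified name does.
    return bb::avm2::answer();
}
```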
Flakey Tests 🤖 says: This CI run detected 1 test that failed but was tolerated due to a .test_patterns.yml entry.
Summary
Introduces MemoryMerkleDB, a minimal, self-contained in-memory implementation of LowLevelMerkleDBInterface that faithfully reproduces world_state::WorldState's tree rules (genesis prefill, zero-hashes, indexed/append-only semantics), and uses it to remove the AVM simulator's dependency on the in-process WorldState.

After this PR, production AVM (bb-avm-sim) talks to world state only via the generated IPC client (WsdbIpcMerkleDB), and the AVM fuzzer runs entirely on MemoryMerkleDB. vm2 no longer references world_state at all.

This is the precursor to extracting world_state/lmdblib/persistent merkle out of barretenberg into a top-level native-packages/.

What changes
- MemoryMerkleDB (vm2/simulation/lib/memory_merkle_db.{hpp,cpp} + sparse_memory_tree.hpp): a faithful, full-height, sparse in-memory reference of the four AVM trees. Replaces the WorldState-backed PureRawMerkleDB (deleted).
- memory_merkle_db.test.cpp: constructs an ephemeral WorldState and a MemoryMerkleDB with identical genesis, applies an identical sequence of appends/inserts/updates/pads/checkpoints, and asserts that roots, sibling paths, low-leaf lookups, preimages, and leaf values match at every step. This both proves canonical fidelity and guards against drift.
- MerkleTreeId relocated out of world_state/types.hpp (it's merkle vocabulary, not storage); SequentialInsertionResult repointed to its crypto::merkle_tree definition. vm2 drops the world_state includes and CMake dependency.
- The AVM fuzzer runs on MemoryMerkleDB; the TS differential simulator self-bootstraps its own world state (NativeWorldStateService.tmp(), identical genesis by construction) instead of reading a shared on-disk lmdb. FuzzerWorldStateManager drops its WorldState member.

Validation
- memory_merkle_db.test.cpp: 7/7 green (genesis, appends, pad, nullifier inserts, public-data insert+update, checkpoints, mixed sequence) against a real ephemeral WorldState.
- FUZZING_AVM=ON: all three fuzzer targets compile/link; prover.fuzzer runs (simulate → check_circuit → prove → verify) with no divergence.
- grep world_state over vm2/ (excluding tests) is clean.

One divergence found + fixed
MemoryIndexedTree reported the freshly-inserted leaf in insertion_witness_data[0].leaf where ContentAddressedIndexedTree reports the empty pre-write leaf; corrected to match.