MerkleForge

Building a Faster, More Efficient Data Verification Toolkit for Modern Blockchains

⚠️ Research Software — Not Production Ready

MerkleForge is a final-year academic research project exploring the design, implementation, and benchmarking of Merkle tree variants in Rust. The codebase has not been independently security-audited. While correctness is a core goal and the library is covered by unit, property-based, and integration tests, no guarantees are made about its fitness for use in production systems, financial applications, or any environment where cryptographic correctness is safety-critical.

Use at your own risk. Feedback and bug reports are welcome.

The Problem

One of the major bottlenecks in modern blockchain networks is data verification. As blockchain systems process millions of transactions, verifying the integrity of state data becomes increasingly slow and resource-intensive. Every state lookup, transaction batch, and light-client proof depends on the speed, efficiency, and correctness of the underlying data structure.

The Rust ecosystem makes this worse by being fragmented. Libraries like rs-merkle cover only binary trees, while Ethereum-specific crates are maintained separately and incompatibly. Developers are forced to wire together niche tools that differ in documentation quality, testing coverage, and performance with no way to compare them systematically.

The Solution

MerkleForge is a high-performance Rust toolkit designed to attack this bottleneck directly. It provides an optimized, energy-efficient engine capable of generating fast cryptographic proofs, enabling distributed systems to validate data integrity without overwhelming storage or computational resources.

The toolkit unifies the three Merkle tree variants used across production blockchain systems — binary trees for transaction batching, sparse trees for account state, and Patricia tries for Ethereum compatibility — under a single cohesive API with pluggable hash functions and a rigorous benchmarking suite.

This is the implementation artifact of a final-year Software Engineering research project at MIVA Open University, supervised by Dr. Oluwasegun Ishaya Adelaiye.

Features

Unified API — BinaryMerkleTree<H>, SparseMerkleTree<H>, and MerklePatriciaTrie<H> all implement the same MerkleTree<H> trait. Switch variants without rewriting your integration code.
Pluggable hash functions — swap SHA-256, Keccak-256, and BLAKE3 with zero changes to tree logic. Rust generics monomorphise at compile time — no runtime overhead.
Zero-cost abstractions — the MerkleTree<H: HashFunction> bound compiles to the same machine code as a hand-rolled implementation for each hash.
Stateless proof verification — the ProofVerifier module lets light clients verify inclusion proofs using only the root hash, without storing the full tree.
Energy-efficient design — BLAKE3's lower CPU cycles per byte directly reduces energy consumption per proof. The benchmarking suite measures this across all three algorithms so you can make hardware-specific choices.
Systematic benchmarking — a dedicated merkle-bench crate runs Criterion statistical benchmarks (mean/median/stddev) across all tree sizes and hash algorithms.
Property-based testing — Proptest verifies mathematical invariants: any leaf change must update the root; a corrupted proof must fail verification.
#[forbid(unsafe_code)] — guaranteed memory safety across the entire workspace.
CI/CD — GitHub Actions runs tests, Clippy (pedantic), rustfmt, doc builds, and a benchmark compile check on every push.

Architecture

The toolkit is structured in two logical layers:

┌──────────────────────────────────────────────────────────────────────┐
│                          merkle-core                                 │
│  HashFunction trait · MerkleTree trait · ProofVerifier trait         │
│  MerkleProof · LeafIndex · NodeIndex · MerkleError                   │
└────────────────────────────┬─────────────────────────────────────────┘
                             │ depends on
          ┌──────────────────┼───────────────────┐
          ▼                  ▼                   ▼
  merkleforge-hash    merkle-variants        merkle-bench
  SHA-256 adapter    BinaryMerkleTree     Criterion suite
  Keccak-256         SparseMerkleTree     baseline_construction
  BLAKE3             PatriciaTrie         hash_throughput

Trait Hierarchy

HashFunction  ←  pluggable crypto abstraction
      │
      └──▶  MerkleTree<H: HashFunction>
                 insert() · remove() · root()
                 generate_proof() · metadata()
                      │
                      └──▶  ProofVerifier<H>  (stateless)
                                verify(root, leaf_data, proof)

Domain separation is enforced at the hash level:

Leaf hashes: H(0x00 || data)
Internal node hashes: H(0x01 || left || right)

This prevents second-preimage attacks where an attacker substitutes an internal node for a leaf.

Workspace Structure

MerkleForge/
├── Cargo.toml                        # workspace manifest + shared dep versions
│
├── merkle-core/                      # foundation — traits, types, errors
│   └── src/
│       ├── lib.rs
│       ├── error.rs                  # MerkleError (non_exhaustive)
│       ├── traits/
│       │   ├── hash_function.rs      # HashFunction trait
│       │   ├── merkle_tree.rs        # MerkleTree + ProofVerifier traits
│       │   └── serializable.rs       # blanket serde/bincode impl
│       └── types/
│           └── common.rs             # NodeIndex, LeafIndex, MerkleProof, ...
│
├── merkle-hash/                      # pluggable hash adapters
│   └── src/
│       ├── lib.rs
│       ├── sha256.rs
│       ├── keccak256.rs
│       └── blake3.rs
│
├── merkle-bench/                     # isolated benchmarking crate
│   ├── benches/
│   │   ├── baseline_construction.rs  # leaf + node hashing latency
│   │   └── hash_throughput.rs        # sustained MB/s per algorithm
│   └── src/lib.rs
│
└── .github/workflows/ci.yml          # automated test · lint · bench · docs

Note on crate names: The folder names above reflect the workspace directory structure. The published crate names on crates.io are merkle-core and merkleforge-hash (the original names were already taken).

Getting Started

Add the crates you need to your Cargo.toml:

[dependencies]
merkle-core = "0.1"
merkleforge-hash = "0.1"
# merkle-variants = "0.1"   # coming in Phase 2–4

Quick example (once `merkle-variants` lands in Phase 2)

use merkleforge_hash::Sha256;
use merkle_variants::BinaryMerkleTree;
use merkle_core::prelude::*;
 
fn main() -> Result<(), MerkleError> {
    let mut tree = BinaryMerkleTree::<Sha256>::new();
 
    tree.insert(b"alice:100")?;
    tree.insert(b"bob:250")?;
    tree.insert(b"carol:75")?;
 
    let root = tree.root().expect("tree is non-empty");
    println!("Root: {:?}", root);
 
    // Generate an inclusion proof for leaf 0 (alice)
    let proof = tree.generate_proof(LeafIndex(0))?;
    println!("Proof depth: {}", proof.depth());
 
    // Stateless verification — no tree needed
    let valid = BinaryMerkleTree::<Sha256>::verify(root, b"alice:100", &proof);
    assert!(valid);
    Ok(())
}

Swap the hash function

// From SHA-256 to BLAKE3 — one type parameter, zero other changes
use merkleforge_hash::Blake3;
let mut tree = BinaryMerkleTree::<Blake3>::new();

Hash Functions

Type	Algorithm	Output	Best for
`Sha256`	SHA-256	32 bytes	Production deployments with hardware SHA extensions
`Keccak256`	Keccak-256	32 bytes	Ethereum-compatible Patricia Tries
`Blake3`	BLAKE3	32 bytes	Maximum throughput and lowest energy cost per proof

All three adapters implement HashFunction with domain-separated leaf and internal-node hashing. SHA-256 and Keccak-256 use 0x00/0x01 byte prefixes; BLAKE3 uses its native derive_key mode with context strings for zero-overhead domain separation.

Full benchmark data comparing throughput, latency, and energy efficiency across all three will be published after Phase 5.

Tree Variants

Crate	Variant	Status	Best for
`merkle-variants`	`BinaryMerkleTree<H>`	🔨 Phase 2	Transaction batching, Bitcoin-style SPV
`merkle-variants`	`SparseMerkleTree<H>`	⏳ Phase 3	Account state, Layer-2 rollups
`merkle-variants`	`MerklePatriciaTrie<H>`	⏳ Phase 4	Ethereum state roots, EVM compatibility

Binary Merkle Tree

Balanced, power-of-two tree with iterative bottom-up construction. Optimised for throughput when the full leaf set is known at build time. Proof size is O(log n)

Sparse Merkle Tree

256-bit key space (2²⁵⁶ possible slots). Implements shortcut nodes (Buterin, 2018), precomputed empty-hash caching, and node batching (Ouvrard, 2018/2019) for sub-linear memory and storage use on mostly-empty trees. Supports one-phase batch updates (Ma et al., 2023) for rollup workloads — directly reducing the storage overhead that makes Layer-2 verification expensive.

Merkle Patricia Trie

Ethereum-compatible trie with four node types (branch, extension, leaf, empty), RLP encoding, and state-root outputs that match Ethereum mainnet. Validated against official Ethereum test vectors.

Proof Generation & Verification

A MerkleProof<D> carries the sibling hashes along the path from a specific leaf to the root:

pub struct MerkleProof<D> {
    pub leaf_index: LeafIndex,
    pub leaf_count: usize,
    pub path:       Vec<ProofNode<D>>,   // O(log n) siblings
}

Verification is stateless — a light client needs only the root hash (e.g. from a trusted block header) and the proof. No full tree. No full block. This is the core property that makes MerkleForge useful for resource-constrained nodes:

1. current = H(0x00 || leaf_data)
2. for each ProofNode in proof.path:
     if side == Left:  current = H(0x01 || sibling || current)
     if side == Right: current = H(0x01 || current || sibling)
3. return current == expected_root

Proofs are serde-serialisable out of the box via the blanket Serializable impl — store them in a database or send them over a network with no extra setup.

Benchmarking

The merkle-bench crate provides an isolated Criterion suite that reports mean, median, and standard deviation to eliminate noise.

# Run all benchmarks and open the HTML report
cargo bench --workspace
open target/criterion/report/index.html
 
# Run just the hash throughput comparison
cargo bench --bench hash_throughput

Metric	Status
Construction latency — leaf hash + node combine	✅ Phase 1
Throughput — sustained MB/s per algorithm (32 B → 1 MB)	✅ Phase 1
Tree construction — 100 / 1K / 10K / 100K / 1M leaves	🔜 Phase 5
Proof generation & verification latency	🔜 Phase 5
Proof size in bytes	🔜 Phase 5
Peak memory consumption (RSS)	🔜 Phase 5
Comparative results vs `rs-merkle` and `merkle_light`	🔜 Phase 5

Development Status

Phase	Scope	Status
1 — Core Infrastructure	Trait hierarchy, hash adapters, CI/CD	✅ Complete
2 — Binary Merkle Tree	`BinaryMerkleTree<H>`, property tests	🔨 In Progress
3 — Sparse Merkle Tree	`SparseMerkleTree<H>`, node batching	⏳ Planned
4 — Merkle Patricia Trie	Ethereum-compatible MPT, RLP	⏳ Planned
5 — Benchmarking	Full Criterion suite, comparative report	⏳ Planned
6 — Docs & Publication	`crates.io` publish, mdBook, paper	⏳ Planned

Roadmap

BinaryMerkleTree<H> with iterative construction and stateless proof verification
SparseMerkleTree<H> with shortcut nodes and one-phase batch updates
MerklePatriciaTrie<H> with RLP encoding, validated against Ethereum test vectors
Full Criterion benchmark suite with comparative results vs rs-merkle and merkle_light
mdBook user guide with copy-pasteable examples for each variant
Publish merkle-variants to crates.io
Research paper on benchmark findings and tree-type trade-off analysis

Contributing

This project is currently closed to external contributions while active development is underway as part of a final-year research project. Watch this space for an announcement when contributions open.

Bug reports and feedback via GitHub Issues are always welcome.

References

This library is informed by the following research:

Kuznetsov et al. (2024) — Adaptive Merkle trees for enhanced blockchain scalability
Ma et al. (2023) — One-phase batch update on sparse Merkle trees for rollups
Dahlberg et al. (2016) — Efficient sparse Merkle trees: caching strategies and secure proofs
Buterin, V. (2018) — Optimizing sparse Merkle trees
Ouvrard, P. A. (2018/2019) — Sparse Merkle tree performance-oriented implementations
Wood, G. (2014) — Ethereum Yellow Paper

Full bibliography in the accompanying research proposal.

License

Licensed under either of:

at your option.

Name		Name	Last commit message	Last commit date
Latest commit History 22 Commits
.github/workflows		.github/workflows
merkle-bench		merkle-bench
merkle-core		merkle-core
merkle-hash		merkle-hash
target		target
CHANGELOG.md		CHANGELOG.md
Cargo.lock		Cargo.lock
Cargo.toml		Cargo.toml
LICENSE		LICENSE
LICENSE-APACHE		LICENSE-APACHE
LICENSE-MIT		LICENSE-MIT
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

MerkleForge

Building a Faster, More Efficient Data Verification Toolkit for Modern Blockchains

The Problem

The Solution

Table of Contents

Features

Architecture

Trait Hierarchy

Workspace Structure

Getting Started

Quick example (once `merkle-variants` lands in Phase 2)

Swap the hash function

Hash Functions

Tree Variants

Binary Merkle Tree

Sparse Merkle Tree

Merkle Patricia Trie

Proof Generation & Verification

Benchmarking

Development Status

Roadmap

Contributing

References

License

About

Licenses found

Uh oh!

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

MerkleForge

Building a Faster, More Efficient Data Verification Toolkit for Modern Blockchains

The Problem

The Solution

Table of Contents

Features

Architecture

Trait Hierarchy

Workspace Structure

Getting Started

Quick example (once merkle-variants lands in Phase 2)

Swap the hash function

Hash Functions

Tree Variants

Binary Merkle Tree

Sparse Merkle Tree

Merkle Patricia Trie

Proof Generation & Verification

Benchmarking

Development Status

Roadmap

Contributing

References

License

About

Topics

Resources

License

Licenses found

Uh oh!

Stars

Watchers

Forks

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Quick example (once `merkle-variants` lands in Phase 2)

Packages