
perf(l1): short-circuit KECCAK256 on empty input #6775

Merged
edg-l merged 5 commits into main from perf/keccak-empty-input
Jun 3, 2026


@edg-l edg-l commented Jun 2, 2026

Motivation

Benchmarkoor (ethrex bal-full suite) shows ethrex is at parity with reth for non-empty KECCAK256 (msg_size 32/256/1024 ≈ 1.0x throughput ratio), but ~19x slower on zero-length input:

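| msg_size | ethrex MGas/s | reth MGas/s | gap |
|----------|---------------|-------------|------|
| 0 | ~600 | ~11,500 | ~19x |
| 32 | ~680 | ~605 | 0.9x |
| 256 | ~710 | ~720 | 1.0x |
| 1024 | ~510 | ~510 | 1.0x |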

The gap is isolated to running the keccak permutation on empty input. keccak256("") is a constant; fast clients return it directly.

Change

In the KECCAK256 handler, return the precomputed keccak256("") value as a U256 when len == 0, skipping both the memory load and the permutation. Gas accounting is unchanged. A unit test asserts the constant equals NativeCrypto.keccak256(&[]).
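The fast path described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual ethrex handler: `keccak256_word` and the `hash` closure are hypothetical stand-ins for the opcode handler and the crypto provider, and the result is a plain `[u64; 4]` of little-endian limbs rather than a `U256`.

```rust
/// keccak256("") as four u64 limbs, least significant limb first.
const EMPTY_KECCAK_LIMBS: [u64; 4] = [
    0x7bfad8045d85a470,
    0xe500b653ca82273b,
    0x927e7db2dcc703c0,
    0xc5d2460186f7233c,
];

/// Hypothetical stand-in for the handler body. `hash` plays the role of the
/// real hasher and is only invoked when the input is non-empty.
fn keccak256_word<F>(input: &[u8], hash: F) -> [u64; 4]
where
    F: FnOnce(&[u8]) -> [u64; 4],
{
    if input.is_empty() {
        // Fast path: no memory load and no permutation, just the constant.
        return EMPTY_KECCAK_LIMBS;
    }
    // Slow path: delegate to the real hasher.
    hash(input)
}
```

Passing the hasher as a closure keeps the sketch free of external crates and makes it easy to check that the hasher is never called for empty input.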

Testing

  • cargo test -p ethrex-levm --lib opcode_handlers::keccak passes (const verified against the hasher).
  • cargo clippy -p ethrex-levm clean.

Return the precomputed keccak256("") constant for zero-length input
instead of running the permutation. Benchmarkoor shows ethrex ~19x
slower than reth on empty-input KECCAK256 while at parity for non-empty
hashing, isolating the cost to per-op hashing of empty data.
@edg-l edg-l requested a review from a team as a code owner June 2, 2026 12:34
@github-actions github-actions Bot added L1 Ethereum client performance Block execution throughput and performance in general labels Jun 2, 2026

github-actions Bot commented Jun 2, 2026

🤖 Codex Code Review

No findings.

crates/vm/levm/src/opcode_handlers/keccak.rs:17-62 looks sound: the new fast path only changes the len == 0 case, still charges gas before branching, and calculate_memory_size(offset, 0) already resolves to 0 in crates/vm/levm/src/memory.rs:371-379, so this does not appear to alter EVM memory-expansion semantics. The constant’s limb ordering is also covered by the added unit test.

Residual risk: I could not run the Rust test locally in this environment because cargo/rustup attempted to write under a read-only /home/runner/.rustup, so this review is based on static analysis only.


Automated review by OpenAI Codex · gpt-5.4 · custom prompt


greptile-apps Bot commented Jun 2, 2026

Greptile Summary

This PR short-circuits the KECCAK256 opcode handler in the LEVM when input length is zero, returning the precomputed keccak256("") constant (EMPTY_KECCAK_U256) instead of running the full permutation. Gas accounting is unchanged; both calculate_memory_size and load_range already early-return for size == 0, so memory semantics are fully preserved.

  • Introduces a compile-time EMPTY_KECCAK_U256 constant with correct little-endian limb ordering, verified against NativeCrypto.keccak256(&[]) in a new unit test.
  • The optimization targets a well-documented ~19x throughput gap on zero-length inputs compared to reth, with no behavioral change for non-empty inputs.

Confidence Score: 5/5

Safe to merge — the change is narrowly scoped, the constant is mathematically verified by the accompanying test, and gas/memory accounting paths are untouched.

The short-circuit only fires when len == 0, a path that was already a no-op in calculate_memory_size and load_range. The constant's limb ordering matches the u256_from_big_endian encoding used everywhere else in the codebase, and the unit test provides a runtime proof against the live hasher. No edge cases were left uncovered.

No files require special attention.

Important Files Changed

Filename Overview
crates/vm/levm/src/opcode_handlers/keccak.rs Adds a compile-time EMPTY_KECCAK_U256 constant and short-circuits the KECCAK256 opcode handler for zero-length input, skipping the permutation entirely; gas accounting and memory semantics are unchanged.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[KECCAK256 opcode] --> B[Pop offset, len from stack]
    B --> C[Compute & charge gas\ncalculate_memory_size + keccak256 word cost]
    C --> D{len == 0?}
    D -- Yes --> E[Return EMPTY_KECCAK_U256\nprecomputed constant\nskip memory load & permutation]
    D -- No --> F[memory.load_range offset, len]
    F --> G[crypto.keccak256 bytes]
    G --> H[u256_from_big_endian hash]
    E --> I[stack.push hash]
    H --> I
    I --> J[OpcodeResult::Continue]

Reviews (1): Last reviewed commit: "perf(l1): short-circuit KECCAK256 on emp..."


github-actions Bot commented Jun 2, 2026

🤖 Kimi Code Review

The optimization to skip the keccak permutation for empty input is a standard performance improvement used by other clients (geth, revm, etc.). However, there's a potential consensus-critical issue regarding memory expansion.

Critical Check Required

crates/vm/levm/src/opcode_handlers/keccak.rs, lines 40-48:
The optimization skips memory.load_range(offset, len) when len == 0. If load_range has side effects (e.g., expanding the memory vector to accommodate offset + len bytes or validating memory access bounds), skipping it violates EVM semantics. Even with zero length, the EVM must ensure memory is expanded to at least offset bytes and charge appropriate gas.

Verify that:

  1. Memory expansion and gas charging for offset + len occur during the word_size calculation (line 31) or earlier in the gas pricing phase
  2. load_range is side-effect free and purely reads data

If memory expansion happens during load_range, you must either:

  • Keep the load_range(offset, 0) call to trigger expansion, or
  • Ensure memory expansion is handled independently before this handler executes

Code Correctness

Lines 17-21: The constant values correctly represent the little-endian U256 encoding of keccak256(""):

  • c5d2460186f7233c927e7db2dcc703c0e500b653ca82273b7bfad8045d85a470
  • Array order [least_sig_u64, ..., most_sig_u64] is correct for ethrex_common::U256

Style

Line 13: The import use ethrex_common::U256; should be grouped with the existing ethrex_common import (line 14) per Rust conventions:

use ethrex_common::{U256, utils::u256_from_big_endian};

Testing

Lines 56-60: The unit test correctly verifies the constant against the cryptographic implementation. Consider also adding an integration test ensuring KECCAK256 with size=0 charges the correct gas and handles memory expansion at the boundary, though this may be covered by existing opcode test suites.

Recommendation: Confirm the memory expansion behavior of load_range before merging. If expansion is handled elsewhere, this optimization is correct and safe.


Automated review by Kimi (Moonshot AI) · kimi-k2.5 · custom prompt


github-actions Bot commented Jun 2, 2026

🤖 Claude Code Review

Here is the review for PR #6775, perf(l1): short-circuit KECCAK256 on empty input.


PR Overview

Short-circuits the KECCAK256 opcode handler when len == 0 by returning a precomputed constant (keccak256("")) instead of running the Keccak permutation. Gas accounting is unchanged. The PR includes a unit test verifying the constant.


Correctness

Constant value — verified correct. keccak256("") = 0xc5d2460186f7233c927e7db2dcc703c0e500b653ca82273b7bfad8045d85a470. The U256([u64; 4]) constructor stores words in little-endian order (index 0 = least significant), and the encoding in the PR matches exactly:

[0] 0x7bfad8045d85a470  ← LSW
[1] 0xe500b653ca82273b
[2] 0x927e7db2dcc703c0
[3] 0xc5d2460186f7233c  ← MSW

The unit test confirms the constant against the live hasher, which is the right approach.
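The limb ordering above can be checked by hand with a small conversion, sketched here with the standard library only. `hex_to_bytes32` and `be_bytes_to_le_limbs` are illustrative helpers, not ethrex functions; the second one mirrors what a big-endian-to-`U256` conversion is assumed to do.

```rust
/// Parses a 64-character hex string into 32 bytes (illustrative helper).
fn hex_to_bytes32(hex: &str) -> [u8; 32] {
    assert_eq!(hex.len(), 64, "expected 64 hex characters");
    let mut out = [0u8; 32];
    for (i, byte) in out.iter_mut().enumerate() {
        *byte = u8::from_str_radix(&hex[2 * i..2 * i + 2], 16).expect("valid hex");
    }
    out
}

/// Splits a big-endian 256-bit value into u64 limbs, least significant first.
fn be_bytes_to_le_limbs(bytes: &[u8; 32]) -> [u64; 4] {
    let mut limbs = [0u64; 4];
    for (i, chunk) in bytes.chunks_exact(8).enumerate() {
        let mut word = [0u8; 8];
        word.copy_from_slice(chunk);
        // The first 8 bytes are the most significant, so they land in limb 3.
        limbs[3 - i] = u64::from_be_bytes(word);
    }
    limbs
}
```

Feeding the hex digest of keccak256("") through these two helpers should yield the four limbs listed above, in the same index order.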

Memory expansion — no behavioral change. For len == 0:

  • calculate_memory_size(offset, 0) returns Ok(0) early (memory.rs:372–373)
  • memory::load_range(offset, 0) returns Bytes::new() early (memory.rs:128–129)

Both paths were already no-ops before the branch was introduced. The branch does not alter observable memory state.
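To illustrate why skipping the load is unobservable, here is a simplified memory model with the same two early returns. It is a sketch only: the `Memory` struct, its `buf` field, and the exact signatures are assumptions and do not reproduce ethrex's `memory.rs`.

```rust
/// Hypothetical, simplified memory model; not ethrex's `Memory` type.
struct Memory {
    buf: Vec<u8>,
}

/// Memory size in bytes (rounded up to a 32-byte word) needed by an access,
/// or 0 when the access is zero-length. Returns `None` on overflow.
fn calculate_memory_size(offset: usize, size: usize) -> Option<usize> {
    if size == 0 {
        return Some(0); // a zero-length access never expands memory
    }
    let end = offset.checked_add(size)?;
    let words = end.checked_add(31)? / 32;
    words.checked_mul(32)
}

impl Memory {
    /// Copies `size` bytes starting at `offset`, growing the buffer if needed.
    fn load_range(&mut self, offset: usize, size: usize) -> Option<Vec<u8>> {
        if size == 0 {
            return Some(Vec::new()); // early return: the buffer is untouched
        }
        let new_size = calculate_memory_size(offset, size)?;
        if self.buf.len() < new_size {
            self.buf.resize(new_size, 0);
        }
        Some(self.buf[offset..offset + size].to_vec())
    }
}
```

In this model a zero-length load at any offset leaves `buf` unchanged, so replacing it with a direct return of the constant changes nothing a contract could observe.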

Gas accounting — unchanged and correct. gas_cost::keccak256 is called before the if len == 0 branch (line 34–38), so it always fires. With new_memory_size=0 and size=0, only the static 30-gas base is charged, which matches the EVM spec.
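As a worked example of the gas arithmetic, the sketch below uses the Yellow Paper fee schedule (30 gas static, 6 gas per 32-byte word hashed, plus memory expansion). `keccak256_gas` and `memory_cost` are hypothetical helpers written for this illustration; the parameter order is an assumption and need not match ethrex's `gas_cost::keccak256`.

```rust
const KECCAK256_STATIC_GAS: u64 = 30; // G_keccak256
const KECCAK256_WORD_GAS: u64 = 6; // G_keccak256word

/// Total cost of a memory of `size_bytes` bytes: words^2 / 512 + 3 * words.
fn memory_cost(size_bytes: u64) -> Option<u64> {
    let words = size_bytes.checked_add(31)? / 32;
    let quadratic = words.checked_mul(words)? / 512;
    let linear = words.checked_mul(3)?;
    quadratic.checked_add(linear)
}

/// Hypothetical stand-in for the KECCAK256 gas function.
fn keccak256_gas(new_memory_size: u64, current_memory_size: u64, len: u64) -> Option<u64> {
    let hashed_words = len.checked_add(31)? / 32;
    // Expansion is only charged when the access grows memory.
    let expansion =
        memory_cost(new_memory_size)?.saturating_sub(memory_cost(current_memory_size)?);
    KECCAK256_STATIC_GAS
        .checked_add(hashed_words.checked_mul(KECCAK256_WORD_GAS)?)?
        .checked_add(expansion)
}
```

For a zero-length hash, `keccak256_gas(0, current, 0)` evaluates to the static cost alone regardless of `current`, while a 32-byte hash into empty memory costs 30 + 6 + 3 = 39.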

EVM spec compliance — KECCAK256 of zero-length input must return the well-known constant. This is correct.


Minor Issues

Import style (line 14–15) — the two imports from ethrex_common should be merged per Rust convention:

// before
use ethrex_common::U256;
use ethrex_common::utils::u256_from_big_endian;

// after
use ethrex_common::{U256, utils::u256_from_big_endian};

Comment wording (line 17–18) — "hot path skips the permutation" is misleading. Empty-input keccak is an optimised cold path, not the hot path. Consider: "Skips the permutation for zero-length input; the result is a well-known constant."

Test hardening (optional) — the current test catches a wrong constant or a wrong hasher, but not a mutual bug where both return the same wrong value. Adding a literal byte assertion is a low-cost guard:

assert_eq!(
    NativeCrypto.keccak256(&[]),
    hex_literal::hex!("c5d2460186f7233c927e7db2dcc703c0e500b653ca82273b7bfad8045d85a470")
);

This is advisory — the existing test is already meaningful.


Summary

The optimization is correct, spec-compliant, and well-tested. Gas and memory semantics are preserved. The only actionable change is merging the two use ethrex_common:: lines. Everything else is advisory.


Automated review by Claude (Anthropic) · sonnet · custom prompt


github-actions Bot commented Jun 2, 2026

Lines of code report

Total lines added: 28
Total lines removed: 0
Total lines changed: 28

Detailed view
+-----------------------------------------------------+-------+------+
| File                                                | Lines | Diff |
+-----------------------------------------------------+-------+------+
| ethrex/crates/vm/levm/src/opcode_handlers/keccak.rs | 57    | +28  |
+-----------------------------------------------------+-------+------+


github-actions Bot commented Jun 2, 2026

Benchmark Results Comparison

No significant difference was registered for any benchmark run.

Detailed Results

Benchmark Results: BubbleSort

| Command | Mean [s] | Min [s] | Max [s] | Relative |
|---------|----------|---------|---------|----------|
| main_revm_BubbleSort | 3.024 ± 0.009 | 3.004 | 3.034 | 1.08 ± 0.01 |
| main_levm_BubbleSort | 2.802 ± 0.036 | 2.766 | 2.885 | 1.00 ± 0.01 |
| pr_revm_BubbleSort | 2.959 ± 0.020 | 2.932 | 2.986 | 1.06 ± 0.01 |
| pr_levm_BubbleSort | 2.797 ± 0.019 | 2.778 | 2.842 | 1.00 |

Benchmark Results: ERC20Approval

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|---------|-----------|----------|----------|----------|
| main_revm_ERC20Approval | 996.1 ± 14.1 | 985.9 | 1031.8 | 1.00 |
| main_levm_ERC20Approval | 1055.0 ± 7.4 | 1045.3 | 1067.4 | 1.06 ± 0.02 |
| pr_revm_ERC20Approval | 996.6 ± 19.5 | 984.9 | 1048.3 | 1.00 ± 0.02 |
| pr_levm_ERC20Approval | 1066.8 ± 47.2 | 1036.1 | 1194.2 | 1.07 ± 0.05 |

Benchmark Results: ERC20Mint

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|---------|-----------|----------|----------|----------|
| main_revm_ERC20Mint | 135.0 ± 0.8 | 133.8 | 136.5 | 1.00 ± 0.01 |
| main_levm_ERC20Mint | 156.7 ± 0.9 | 155.7 | 157.9 | 1.16 ± 0.01 |
| pr_revm_ERC20Mint | 134.9 ± 1.1 | 133.3 | 136.7 | 1.00 |
| pr_levm_ERC20Mint | 156.7 ± 0.6 | 155.8 | 157.7 | 1.16 ± 0.01 |

Benchmark Results: ERC20Transfer

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|---------|-----------|----------|----------|----------|
| main_revm_ERC20Transfer | 236.2 ± 3.5 | 230.8 | 240.6 | 1.00 ± 0.02 |
| main_levm_ERC20Transfer | 264.9 ± 8.6 | 260.5 | 288.3 | 1.13 ± 0.04 |
| pr_revm_ERC20Transfer | 235.3 ± 1.5 | 232.3 | 237.8 | 1.00 |
| pr_levm_ERC20Transfer | 261.4 ± 1.4 | 258.9 | 263.8 | 1.11 ± 0.01 |

Benchmark Results: Factorial

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|---------|-----------|----------|----------|----------|
| main_revm_Factorial | 223.2 ± 1.6 | 222.1 | 227.3 | 1.00 |
| main_levm_Factorial | 271.2 ± 3.9 | 268.1 | 281.6 | 1.22 ± 0.02 |
| pr_revm_Factorial | 225.5 ± 6.1 | 222.0 | 242.4 | 1.01 ± 0.03 |
| pr_levm_Factorial | 269.6 ± 2.3 | 267.0 | 275.5 | 1.21 ± 0.01 |

Benchmark Results: FactorialRecursive

| Command | Mean [s] | Min [s] | Max [s] | Relative |
|---------|----------|---------|---------|----------|
| main_revm_FactorialRecursive | 1.657 ± 0.027 | 1.619 | 1.688 | 1.01 ± 0.02 |
| main_levm_FactorialRecursive | 1.657 ± 0.013 | 1.639 | 1.675 | 1.01 ± 0.01 |
| pr_revm_FactorialRecursive | 1.673 ± 0.040 | 1.599 | 1.721 | 1.02 ± 0.02 |
| pr_levm_FactorialRecursive | 1.634 ± 0.007 | 1.623 | 1.646 | 1.00 |

Benchmark Results: Fibonacci

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|---------|-----------|----------|----------|----------|
| main_revm_Fibonacci | 204.7 ± 1.6 | 201.8 | 207.9 | 1.02 ± 0.01 |
| main_levm_Fibonacci | 254.2 ± 12.3 | 249.7 | 289.2 | 1.26 ± 0.06 |
| pr_revm_Fibonacci | 201.3 ± 1.5 | 198.0 | 203.0 | 1.00 |
| pr_levm_Fibonacci | 251.5 ± 2.9 | 249.6 | 259.2 | 1.25 ± 0.02 |

Benchmark Results: FibonacciRecursive

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|---------|-----------|----------|----------|----------|
| main_revm_FibonacciRecursive | 859.5 ± 8.2 | 848.1 | 872.1 | 1.18 ± 0.02 |
| main_levm_FibonacciRecursive | 732.7 ± 7.8 | 720.2 | 747.8 | 1.01 ± 0.02 |
| pr_revm_FibonacciRecursive | 873.9 ± 12.1 | 857.0 | 899.1 | 1.20 ± 0.02 |
| pr_levm_FibonacciRecursive | 727.9 ± 10.2 | 719.2 | 748.2 | 1.00 |

Benchmark Results: ManyHashes

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|---------|-----------|----------|----------|----------|
| main_revm_ManyHashes | 8.3 ± 0.0 | 8.3 | 8.4 | 1.00 |
| main_levm_ManyHashes | 10.0 ± 0.1 | 9.9 | 10.2 | 1.19 ± 0.01 |
| pr_revm_ManyHashes | 8.4 ± 0.4 | 8.2 | 9.5 | 1.01 ± 0.05 |
| pr_levm_ManyHashes | 9.9 ± 0.0 | 9.8 | 9.9 | 1.18 ± 0.01 |

Benchmark Results: MstoreBench

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|---------|-----------|----------|----------|----------|
| main_revm_MstoreBench | 256.7 ± 7.0 | 252.3 | 269.6 | 1.07 ± 0.03 |
| main_levm_MstoreBench | 238.8 ± 0.9 | 237.3 | 239.7 | 1.00 |
| pr_revm_MstoreBench | 254.5 ± 1.4 | 252.1 | 256.7 | 1.07 ± 0.01 |
| pr_levm_MstoreBench | 244.0 ± 1.9 | 241.6 | 246.5 | 1.02 ± 0.01 |

Benchmark Results: Push

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|---------|-----------|----------|----------|----------|
| main_revm_Push | 291.1 ± 1.1 | 289.8 | 293.6 | 1.00 ± 0.01 |
| main_levm_Push | 297.0 ± 0.5 | 296.3 | 298.0 | 1.02 ± 0.00 |
| pr_revm_Push | 290.1 ± 1.3 | 288.5 | 292.6 | 1.00 |
| pr_levm_Push | 306.0 ± 27.7 | 295.4 | 384.8 | 1.05 ± 0.10 |

Benchmark Results: SstoreBench_no_opt

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|---------|-----------|----------|----------|----------|
| main_revm_SstoreBench_no_opt | 167.3 ± 3.5 | 161.7 | 173.6 | 1.66 ± 0.03 |
| main_levm_SstoreBench_no_opt | 101.3 ± 0.6 | 100.6 | 102.7 | 1.00 ± 0.01 |
| pr_revm_SstoreBench_no_opt | 169.2 ± 9.3 | 161.4 | 185.2 | 1.67 ± 0.09 |
| pr_levm_SstoreBench_no_opt | 101.1 ± 0.1 | 100.9 | 101.3 | 1.00 |


@ElFantasma ElFantasma left a comment


Approved with a note. It's up to you whether to fix it in this PR or file a follow-up issue.


/// `keccak256("")` as a `U256`. Returned directly for zero-length input so the
/// hot path skips the permutation entirely (matches what other clients do).
const EMPTY_KECCAK_U256: U256 = U256([

We now have two constants for keccak256(""): EMPTY_KECCACK_HASH: LazyLock<H256> in crates/common/constants.rs:38 (note the pre-existing typo, an extra C) and this new EMPTY_KECCAK_U256: U256. They have to stay in lock-step forever, in two files, under two different names. That's the kind of duplication that tends to drift quietly.

The LazyLock was only needed because the existing definition uses hex::decode(...).expect(...) at runtime. A true const H256 is straightforward — H256 is just H256([u8; 32]) (verified: H256([2; 32]) construction is already used elsewhere in the tree), and hex_literal::hex! is already a workspace dep (used by crates/common/crypto/provider.rs and others). With that, both shapes become real consts side-by-side:

// in crates/common/constants.rs
pub const EMPTY_KECCAK_HASH: H256 = H256(hex_literal::hex!(
    "c5d2460186f7233c927e7db2dcc703c0e500b653ca82273b7bfad8045d85a470"
));
pub const EMPTY_KECCAK_U256: U256 = U256([
    0x7bfad8045d85a470, 0xe500b653ca82273b,
    0x927e7db2dcc703c0, 0xc5d2460186f7233c,
]);
// + a #[test] asserting the two encode the same bytes — catches drift.

Two paths I can see:

  1. In this PR: define both constants together in common::constants, deprecate (or rename) EMPTY_KECCACK_HASH. The rename touches ~17 files (git grep -l EMPTY_KECCACK_HASH returns: vm.rs, account.rs, block.rs, block_execution_witness.rs, healing/state.rs, snap_sync.rs, types/block.rs, store.rs, levm db.rs/mod.rs/tracing.rs, levm/account.rs, levm/db/gen_db.rs, store_tests.rs, archive_sync, ef_tests test_runner.rs). Mechanical s/EMPTY_KECCACK_HASH/EMPTY_KECCAK_HASH/g.
  2. Land this as-is, file a follow-up: this PR stays surgical (1 const, 1 test, hot-path fix); the unification + rename PR can stand on its own. If you go this way, a drift-catching test extension would be the cheapest in-PR guard: assert_eq!(EMPTY_KECCAK_U256, u256_from_big_endian(EMPTY_KECCACK_HASH.as_bytes())); next to the existing assertion catches divergence if either ever drifts.
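A drift guard of the kind suggested in both options could even run at compile time. The sketch below uses only the standard library, with a `[u8; 32]` standing in for `H256` and a `[u64; 4]` standing in for `U256`; the names `EMPTY_KECCAK_BYTES`, `EMPTY_KECCAK_LIMBS`, `limbs_to_be_bytes` and `bytes_eq` are hypothetical and chosen for this example.

```rust
/// keccak256("") as raw big-endian bytes (stand-in for the `H256` constant).
const EMPTY_KECCAK_BYTES: [u8; 32] = [
    0xc5, 0xd2, 0x46, 0x01, 0x86, 0xf7, 0x23, 0x3c,
    0x92, 0x7e, 0x7d, 0xb2, 0xdc, 0xc7, 0x03, 0xc0,
    0xe5, 0x00, 0xb6, 0x53, 0xca, 0x82, 0x27, 0x3b,
    0x7b, 0xfa, 0xd8, 0x04, 0x5d, 0x85, 0xa4, 0x70,
];

/// The same value as little-endian u64 limbs (stand-in for the `U256` constant).
const EMPTY_KECCAK_LIMBS: [u64; 4] = [
    0x7bfad8045d85a470,
    0xe500b653ca82273b,
    0x927e7db2dcc703c0,
    0xc5d2460186f7233c,
];

/// Serializes little-endian limbs as 32 big-endian bytes.
const fn limbs_to_be_bytes(limbs: [u64; 4]) -> [u8; 32] {
    let mut out = [0u8; 32];
    let mut i = 0;
    while i < 4 {
        // Limb 3 is the most significant, so it is written first.
        let word = limbs[3 - i].to_be_bytes();
        let mut j = 0;
        while j < 8 {
            out[i * 8 + j] = word[j];
            j += 1;
        }
        i += 1;
    }
    out
}

/// Byte-wise equality that can be evaluated in a const context.
const fn bytes_eq(a: [u8; 32], b: [u8; 32]) -> bool {
    let mut i = 0;
    while i < 32 {
        if a[i] != b[i] {
            return false;
        }
        i += 1;
    }
    true
}

// Compile-time guard: the build fails if the two encodings ever diverge.
const _: () = assert!(bytes_eq(
    limbs_to_be_bytes(EMPTY_KECCAK_LIMBS),
    EMPTY_KECCAK_BYTES
));
```

A const assertion like this needs no test run to catch drift, though a plain `#[test]` with `assert_eq!` would give a more readable failure message.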

Your call. Non-blocking on the perf merit of this PR either way.

@github-project-automation github-project-automation Bot moved this to In Review in ethrex_l1 Jun 2, 2026
@edg-l edg-l enabled auto-merge June 3, 2026 09:16

github-actions Bot commented Jun 3, 2026

Benchmark Block Execution Results Comparison Against Main

| Command | Mean [s] | Min [s] | Max [s] | Relative |
|---------|----------|---------|---------|----------|
| base | 61.022 ± 0.108 | 60.823 | 61.206 | 1.00 ± 0.00 |
| head | 60.793 ± 0.130 | 60.576 | 61.022 | 1.00 |

@edg-l edg-l added this pull request to the merge queue Jun 3, 2026
Merged via the queue into main with commit 890efb6 Jun 3, 2026
86 of 89 checks passed
@edg-l edg-l deleted the perf/keccak-empty-input branch June 3, 2026 12:36
@github-project-automation github-project-automation Bot moved this from Todo to Done in ethrex_performance Jun 3, 2026
@github-project-automation github-project-automation Bot moved this from In Review to Done in ethrex_l1 Jun 3, 2026