perf(l1): short-circuit KECCAK256 on empty input by edg-l · Pull Request #6775 · lambdaclass/ethrex

edg-l · 2026-06-02T12:34:33Z

Motivation

Benchmarkoor (ethrex bal-full suite) shows ethrex is at parity with reth for non-empty KECCAK256 (msg_size 32/256/1024 ≈ 1.0x throughput ratio), but ~19x slower on zero-length input:

msg_size	ethrex MGas/s	reth MGas/s	gap
0	~600	~11,500	~19x
32	~680	~605	0.9x
256	~710	~720	1.0x
1024	~510	~510	1.0x

The gap is isolated to running the keccak permutation on empty input. keccak256("") is a constant; fast clients return it directly.

Change

In the KECCAK256 handler, return the precomputed keccak256("") value as a U256 when len == 0, skipping both the memory load and the permutation. Gas accounting is unchanged. A unit test asserts the constant equals NativeCrypto.keccak256(&[]).
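
The shape of the change can be sketched without any ethrex types. This is a minimal illustration, not the PR's code: `keccak256_op`, `EMPTY_KECCAK_LIMBS`, and the injected `charge_gas` and `hash` closures are hypothetical stand-ins, and out-of-bounds reads return an error here instead of expanding memory as a real EVM would.

```rust
/// keccak256("") as four u64 limbs, least significant limb first.
/// Hypothetical stand-in for the PR's `EMPTY_KECCAK_U256` constant.
pub const EMPTY_KECCAK_LIMBS: [u64; 4] = [
    0x7bfad8045d85a470,
    0xe500b653ca82273b,
    0x927e7db2dcc703c0,
    0xc5d2460186f7233c,
];

/// Simplified KECCAK256 handler body (assumption: gas charging and hashing
/// are injected as closures so the sketch has no external dependencies).
pub fn keccak256_op<G, H>(
    memory: &[u8],
    offset: usize,
    len: usize,
    mut charge_gas: G,
    mut hash: H,
) -> Result<[u64; 4], &'static str>
where
    G: FnMut(usize) -> Result<(), &'static str>,
    H: FnMut(&[u8]) -> [u64; 4],
{
    // Gas is charged before branching, so the fast path costs the same gas.
    charge_gas(len)?;

    // Fast path: zero-length input touches neither memory nor the hasher.
    if len == 0 {
        return Ok(EMPTY_KECCAK_LIMBS);
    }

    // Slow path: read the requested range and hash it.
    let end = offset.checked_add(len).ok_or("offset overflow")?;
    let bytes = memory.get(offset..end).ok_or("out of bounds")?;
    Ok(hash(bytes))
}
```

Passing a counting closure as `hash` makes it easy to check that the hasher is never invoked when `len == 0`, while `charge_gas` still runs exactly once.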

Testing

cargo test -p ethrex-levm --lib opcode_handlers::keccak passes (const verified against the hasher).
cargo clippy -p ethrex-levm clean.

Return the precomputed keccak256("") constant for zero-length input instead of running the permutation. Benchmarkoor shows ethrex ~19x slower than reth on empty-input KECCAK256 while at parity for non-empty hashing, isolating the cost to per-op hashing of empty data.

github-actions · 2026-06-02T12:36:26Z

🤖 Codex Code Review

No findings.

crates/vm/levm/src/opcode_handlers/keccak.rs:17-62 looks sound: the new fast path only changes the len == 0 case, still charges gas before branching, and calculate_memory_size(offset, 0) already resolves to 0 in crates/vm/levm/src/memory.rs:371-379, so this does not appear to alter EVM memory-expansion semantics. The constant’s limb ordering is also covered by the added unit test.

Residual risk: I could not run the Rust test locally in this environment because cargo/rustup attempted to write under a read-only /home/runner/.rustup, so this review is based on static analysis only.

Automated review by OpenAI Codex · gpt-5.4 · custom prompt

greptile-apps · 2026-06-02T12:36:31Z

Greptile Summary

This PR short-circuits the KECCAK256 opcode handler in the LEVM when input length is zero, returning the precomputed keccak256("") constant (EMPTY_KECCAK_U256) instead of running the full permutation. Gas accounting is unchanged; both calculate_memory_size and load_range already early-return for size == 0, so memory semantics are fully preserved.

Introduces a compile-time EMPTY_KECCAK_U256 constant with correct little-endian limb ordering, verified against NativeCrypto.keccak256(&[]) in a new unit test.
The optimization targets a well-documented ~19x throughput gap on zero-length inputs compared to reth, with no behavioral change for non-empty inputs.

Confidence Score: 5/5

Safe to merge — the change is narrowly scoped, the constant is mathematically verified by the accompanying test, and gas/memory accounting paths are untouched.

The short-circuit only fires when len == 0, a path that was already a no-op in calculate_memory_size and load_range. The constant's limb ordering matches the u256_from_big_endian encoding used everywhere else in the codebase, and the unit test provides a runtime proof against the live hasher. No edge cases were left uncovered.

No files require special attention.

Important Files Changed

Filename	Overview
crates/vm/levm/src/opcode_handlers/keccak.rs	Adds a compile-time EMPTY_KECCAK_U256 constant and short-circuits the KECCAK256 opcode handler for zero-length input, skipping the permutation entirely; gas accounting and memory semantics are unchanged.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[KECCAK256 opcode] --> B[Pop offset, len from stack]
    B --> C[Compute & charge gas\ncalculate_memory_size + keccak256 word cost]
    C --> D{len == 0?}
    D -- Yes --> E[Return EMPTY_KECCAK_U256\nprecomputed constant\nskip memory load & permutation]
    D -- No --> F[memory.load_range offset, len]
    F --> G[crypto.keccak256 bytes]
    G --> H[u256_from_big_endian hash]
    E --> I[stack.push hash]
    H --> I
    I --> J[OpcodeResult::Continue]


github-actions · 2026-06-02T12:36:32Z

🤖 Kimi Code Review

The optimization to skip the keccak permutation for empty input is a standard performance improvement used by other clients (geth, revm, etc.). However, there's a potential consensus-critical issue regarding memory expansion.

Critical Check Required

crates/vm/levm/src/opcode_handlers/keccak.rs, lines 40-48:
The optimization skips memory.load_range(offset, len) when len == 0. If load_range has side effects (e.g., expanding the memory vector to accommodate offset + len bytes or validating memory access bounds), skipping it violates EVM semantics. Even with zero length, the EVM must ensure memory is expanded to at least offset bytes and charge appropriate gas.

Verify that:

Memory expansion and gas charging for offset + len occur during the word_size calculation (line 31) or earlier in the gas pricing phase
load_range is side-effect free and purely reads data

If memory expansion happens during load_range, you must either:

Keep the load_range(offset, 0) call to trigger expansion, or
Ensure memory expansion is handled independently before this handler executes

Code Correctness

Lines 17-21: The constant values correctly represent the little-endian U256 encoding of keccak256(""):

c5d2460186f7233c927e7db2dcc703c0e500b653ca82273b7bfad8045d85a470
Array order [least_sig_u64, ..., most_sig_u64] is correct for ethrex_common::U256

Style

Line 13: The import use ethrex_common::U256; should be grouped with the existing ethrex_common import (line 14) per Rust conventions:

use ethrex_common::{U256, utils::u256_from_big_endian};

Testing

Lines 56-60: The unit test correctly verifies the constant against the cryptographic implementation. Consider also adding an integration test ensuring KECCAK256 with size=0 charges the correct gas and handles memory expansion at the boundary, though this may be covered by existing opcode test suites.

Recommendation: Confirm the memory expansion behavior of load_range before merging. If expansion is handled elsewhere, this optimization is correct and safe.

Automated review by Kimi (Moonshot AI) · kimi-k2.5 · custom prompt

github-actions · 2026-06-02T12:36:41Z

🤖 Claude Code Review

Here is the review for PR #6775 — perf(l1): short-circuit KECCAK256 on empty input.

PR Overview

Short-circuits the KECCAK256 opcode handler when len == 0 by returning a precomputed constant (keccak256("")) instead of running the Keccak permutation. Gas accounting is unchanged. The PR includes a unit test verifying the constant.

Correctness

Constant value — verified correct. keccak256("") = 0xc5d2460186f7233c927e7db2dcc703c0e500b653ca82273b7bfad8045d85a470. The U256([u64; 4]) constructor stores words in little-endian order (index 0 = least significant), and the encoding in the PR matches exactly:

[0] 0x7bfad8045d85a470  ← LSW
[1] 0xe500b653ca82273b
[2] 0x927e7db2dcc703c0
[3] 0xc5d2460186f7233c  ← MSW

The unit test confirms the constant against the live hasher, which is the right approach.
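
The limb layout above can be reproduced with the standard library alone. This is a sketch under assumptions: `parse_hash` and `limbs_from_big_endian` are hypothetical helper names, and a plain `[u64; 4]` stands in for the `U256` type.

```rust
/// Parse a 64-character hex string into 32 big-endian bytes.
/// Returns `None` on a wrong length or a non-hex character.
pub fn parse_hash(hex: &str) -> Option<[u8; 32]> {
    if hex.len() != 64 || !hex.bytes().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    let mut out = [0u8; 32];
    for (i, byte) in out.iter_mut().enumerate() {
        *byte = u8::from_str_radix(&hex[2 * i..2 * i + 2], 16).ok()?;
    }
    Some(out)
}

/// Split 32 big-endian bytes into four u64 limbs, least significant limb first.
pub fn limbs_from_big_endian(bytes: &[u8; 32]) -> [u64; 4] {
    let mut limbs = [0u64; 4];
    for (i, chunk) in bytes.chunks_exact(8).enumerate() {
        let mut word = [0u8; 8];
        word.copy_from_slice(chunk);
        // Chunk 0 holds the most significant 8 bytes, so it lands in limb 3.
        limbs[3 - i] = u64::from_be_bytes(word);
    }
    limbs
}
```

Feeding the hex digest quoted above through `parse_hash` and then `limbs_from_big_endian` yields the four words in the listed order, with the `0x7bfa…` word at index 0.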

Memory expansion — no behavioral change. For len == 0:

calculate_memory_size(offset, 0) returns Ok(0) early (memory.rs:372–373)
memory::load_range(offset, 0) returns Bytes::new() early (memory.rs:128–129)

Both paths were already no-ops before the branch was introduced. The branch does not alter observable memory state.
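
A rough model of those two early returns is sketched below. The function names mirror the ones cited in the review, but the bodies are assumptions written for illustration, with a `Vec<u8>` standing in for the real memory type.

```rust
/// Size in bytes that memory must reach to cover `[offset, offset + size)`,
/// rounded up to a 32-byte word. Returns `None` on arithmetic overflow.
pub fn calculate_memory_size(offset: usize, size: usize) -> Option<usize> {
    // A zero-length access touches no memory, whatever the offset is.
    if size == 0 {
        return Some(0);
    }
    let end = offset.checked_add(size)?;
    // Round up to the next multiple of 32.
    end.checked_add(31).map(|n| n / 32 * 32)
}

/// Copy `size` bytes starting at `offset`, growing memory with zeros if needed.
pub fn load_range(memory: &mut Vec<u8>, offset: usize, size: usize) -> Option<Vec<u8>> {
    // Zero-length loads return early and never resize memory.
    if size == 0 {
        return Some(Vec::new());
    }
    let new_len = calculate_memory_size(offset, size)?;
    if new_len > memory.len() {
        memory.resize(new_len, 0);
    }
    Some(memory[offset..offset + size].to_vec())
}
```

In this model a zero-length load at an arbitrarily large offset leaves the memory length untouched, which is why skipping the call on the fast path changes nothing observable.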

Gas accounting: unchanged and correct. gas_cost::keccak256 is called before the if len == 0 branch (line 34–38), so it always fires. With new_memory_size=0 and size=0, only the static 30-gas base is charged, which matches the EVM spec.
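
For reference, the arithmetic can be sketched as follows. This assumes the Yellow Paper fee schedule (30 gas static, 6 gas per 32-byte word, plus the usual memory expansion formula); `keccak256_gas` and `memory_cost` are hypothetical names and do not reproduce the signature of ethrex's `gas_cost::keccak256`.

```rust
/// Assumed constants from the Ethereum Yellow Paper fee schedule.
const KECCAK256_STATIC_GAS: u64 = 30;
const KECCAK256_WORD_GAS: u64 = 6;

/// Total cost of holding `size_bytes` of memory: 3 gas per word plus words^2 / 512.
fn memory_cost(size_bytes: u64) -> Option<u64> {
    let words = size_bytes.checked_add(31)? / 32;
    let quadratic = words.checked_mul(words)? / 512;
    quadratic.checked_add(words.checked_mul(3)?)
}

/// Static cost + per-word hashing cost + memory expansion cost.
/// Returns `None` on arithmetic overflow.
pub fn keccak256_gas(new_memory_size: u64, current_memory_size: u64, len: u64) -> Option<u64> {
    let words = len.checked_add(31)? / 32;
    // Expansion is only paid when memory actually grows.
    let expansion = if new_memory_size > current_memory_size {
        memory_cost(new_memory_size)? - memory_cost(current_memory_size)?
    } else {
        0
    };
    KECCAK256_STATIC_GAS
        .checked_add(KECCAK256_WORD_GAS.checked_mul(words)?)?
        .checked_add(expansion)
}
```

With `len == 0` the word count is zero and the required memory size is zero, so both variable terms vanish and only the static cost remains.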

EVM spec compliance — KECCAK256 of zero-length input must return the well-known constant. This is correct.

Minor Issues

Import style (line 14–15) — the two imports from ethrex_common should be merged per Rust convention:

// before
use ethrex_common::U256;
use ethrex_common::utils::u256_from_big_endian;

// after
use ethrex_common::{U256, utils::u256_from_big_endian};

Comment wording (line 17–18) — "hot path skips the permutation" is misleading. Empty-input keccak is an optimised cold path, not the hot path. Consider: "Skips the permutation for zero-length input; the result is a well-known constant."

Test hardening (optional) — the current test catches a wrong constant or a wrong hasher, but not a mutual bug where both return the same wrong value. Adding a literal byte assertion is a low-cost guard:

assert_eq!(
    NativeCrypto.keccak256(&[]),
    hex_literal::hex!("c5d2460186f7233c927e7db2dcc703c0e500b653ca82273b7bfad8045d85a470")
);

This is advisory — the existing test is already meaningful.

Summary

The optimization is correct, spec-compliant, and well-tested. Gas and memory semantics are preserved. The only actionable change is merging the two use ethrex_common:: lines. Everything else is advisory.

Automated review by Claude (Anthropic) · sonnet · custom prompt

github-actions · 2026-06-02T12:37:14Z

Lines of code report

Total lines added: 28
Total lines removed: 0
Total lines changed: 28

Detailed view

+-----------------------------------------------------+-------+------+
| File                                                | Lines | Diff |
+-----------------------------------------------------+-------+------+
| ethrex/crates/vm/levm/src/opcode_handlers/keccak.rs | 57    | +28  |
+-----------------------------------------------------+-------+------+

github-actions · 2026-06-02T12:49:10Z

Benchmark Results Comparison

No significant difference was registered for any benchmark run.

Detailed Results

Benchmark Results: BubbleSort

Command	Mean [s]	Min [s]	Max [s]	Relative
`main_revm_BubbleSort`	3.024 ± 0.009	3.004	3.034	1.08 ± 0.01
`main_levm_BubbleSort`	2.802 ± 0.036	2.766	2.885	1.00 ± 0.01
`pr_revm_BubbleSort`	2.959 ± 0.020	2.932	2.986	1.06 ± 0.01
`pr_levm_BubbleSort`	2.797 ± 0.019	2.778	2.842	1.00

Benchmark Results: ERC20Approval

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`main_revm_ERC20Approval`	996.1 ± 14.1	985.9	1031.8	1.00
`main_levm_ERC20Approval`	1055.0 ± 7.4	1045.3	1067.4	1.06 ± 0.02
`pr_revm_ERC20Approval`	996.6 ± 19.5	984.9	1048.3	1.00 ± 0.02
`pr_levm_ERC20Approval`	1066.8 ± 47.2	1036.1	1194.2	1.07 ± 0.05

Benchmark Results: ERC20Mint

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`main_revm_ERC20Mint`	135.0 ± 0.8	133.8	136.5	1.00 ± 0.01
`main_levm_ERC20Mint`	156.7 ± 0.9	155.7	157.9	1.16 ± 0.01
`pr_revm_ERC20Mint`	134.9 ± 1.1	133.3	136.7	1.00
`pr_levm_ERC20Mint`	156.7 ± 0.6	155.8	157.7	1.16 ± 0.01

Benchmark Results: ERC20Transfer

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`main_revm_ERC20Transfer`	236.2 ± 3.5	230.8	240.6	1.00 ± 0.02
`main_levm_ERC20Transfer`	264.9 ± 8.6	260.5	288.3	1.13 ± 0.04
`pr_revm_ERC20Transfer`	235.3 ± 1.5	232.3	237.8	1.00
`pr_levm_ERC20Transfer`	261.4 ± 1.4	258.9	263.8	1.11 ± 0.01

Benchmark Results: Factorial

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`main_revm_Factorial`	223.2 ± 1.6	222.1	227.3	1.00
`main_levm_Factorial`	271.2 ± 3.9	268.1	281.6	1.22 ± 0.02
`pr_revm_Factorial`	225.5 ± 6.1	222.0	242.4	1.01 ± 0.03
`pr_levm_Factorial`	269.6 ± 2.3	267.0	275.5	1.21 ± 0.01

Benchmark Results: FactorialRecursive

Command	Mean [s]	Min [s]	Max [s]	Relative
`main_revm_FactorialRecursive`	1.657 ± 0.027	1.619	1.688	1.01 ± 0.02
`main_levm_FactorialRecursive`	1.657 ± 0.013	1.639	1.675	1.01 ± 0.01
`pr_revm_FactorialRecursive`	1.673 ± 0.040	1.599	1.721	1.02 ± 0.02
`pr_levm_FactorialRecursive`	1.634 ± 0.007	1.623	1.646	1.00

Benchmark Results: Fibonacci

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`main_revm_Fibonacci`	204.7 ± 1.6	201.8	207.9	1.02 ± 0.01
`main_levm_Fibonacci`	254.2 ± 12.3	249.7	289.2	1.26 ± 0.06
`pr_revm_Fibonacci`	201.3 ± 1.5	198.0	203.0	1.00
`pr_levm_Fibonacci`	251.5 ± 2.9	249.6	259.2	1.25 ± 0.02

Benchmark Results: FibonacciRecursive

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`main_revm_FibonacciRecursive`	859.5 ± 8.2	848.1	872.1	1.18 ± 0.02
`main_levm_FibonacciRecursive`	732.7 ± 7.8	720.2	747.8	1.01 ± 0.02
`pr_revm_FibonacciRecursive`	873.9 ± 12.1	857.0	899.1	1.20 ± 0.02
`pr_levm_FibonacciRecursive`	727.9 ± 10.2	719.2	748.2	1.00

Benchmark Results: ManyHashes

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`main_revm_ManyHashes`	8.3 ± 0.0	8.3	8.4	1.00
`main_levm_ManyHashes`	10.0 ± 0.1	9.9	10.2	1.19 ± 0.01
`pr_revm_ManyHashes`	8.4 ± 0.4	8.2	9.5	1.01 ± 0.05
`pr_levm_ManyHashes`	9.9 ± 0.0	9.8	9.9	1.18 ± 0.01

Benchmark Results: MstoreBench

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`main_revm_MstoreBench`	256.7 ± 7.0	252.3	269.6	1.07 ± 0.03
`main_levm_MstoreBench`	238.8 ± 0.9	237.3	239.7	1.00
`pr_revm_MstoreBench`	254.5 ± 1.4	252.1	256.7	1.07 ± 0.01
`pr_levm_MstoreBench`	244.0 ± 1.9	241.6	246.5	1.02 ± 0.01

Benchmark Results: Push

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`main_revm_Push`	291.1 ± 1.1	289.8	293.6	1.00 ± 0.01
`main_levm_Push`	297.0 ± 0.5	296.3	298.0	1.02 ± 0.00
`pr_revm_Push`	290.1 ± 1.3	288.5	292.6	1.00
`pr_levm_Push`	306.0 ± 27.7	295.4	384.8	1.05 ± 0.10

Benchmark Results: SstoreBench_no_opt

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`main_revm_SstoreBench_no_opt`	167.3 ± 3.5	161.7	173.6	1.66 ± 0.03
`main_levm_SstoreBench_no_opt`	101.3 ± 0.6	100.6	102.7	1.00 ± 0.01
`pr_revm_SstoreBench_no_opt`	169.2 ± 9.3	161.4	185.2	1.67 ± 0.09
`pr_levm_SstoreBench_no_opt`	101.1 ± 0.1	100.9	101.3	1.00

ElFantasma

Approved with a note. It's up to you to fix it or file a follow-up issue.

ElFantasma · 2026-06-02T19:54:47Z


+/// `keccak256("")` as a `U256`. Returned directly for zero-length input so the
+/// hot path skips the permutation entirely (matches what other clients do).
+const EMPTY_KECCAK_U256: U256 = U256([


We now have two constants for keccak256("") — EMPTY_KECCACK_HASH: LazyLock<H256> in crates/common/constants.rs:38 (note the pre-existing typo — extra C) and this new EMPTY_KECCAK_U256: U256. They have to stay in lock-step forever, in two files, under two different names. That's the kind of duplication that tends to drift quietly.

The LazyLock was only needed because the existing definition uses hex::decode(...).expect(...) at runtime. A true const H256 is straightforward — H256 is just H256([u8; 32]) (verified: H256([2; 32]) construction is already used elsewhere in the tree), and hex_literal::hex! is already a workspace dep (used by crates/common/crypto/provider.rs and others). With that, both shapes become real consts side-by-side:

// in crates/common/constants.rs
pub const EMPTY_KECCAK_HASH: H256 = H256(hex_literal::hex!(
    "c5d2460186f7233c927e7db2dcc703c0e500b653ca82273b7bfad8045d85a470"
));
pub const EMPTY_KECCAK_U256: U256 = U256([
    0x7bfad8045d85a470,
    0xe500b653ca82273b,
    0x927e7db2dcc703c0,
    0xc5d2460186f7233c,
]);
// + a #[test] asserting the two encode the same bytes (catches drift).

Two paths I can see:

In this PR: define both constants together in common::constants, deprecate (or rename) EMPTY_KECCACK_HASH. The rename touches ~17 files (git grep -l EMPTY_KECCACK_HASH returns: vm.rs, account.rs, block.rs, block_execution_witness.rs, healing/state.rs, snap_sync.rs, types/block.rs, store.rs, levm db.rs/mod.rs/tracing.rs, levm/account.rs, levm/db/gen_db.rs, store_tests.rs, archive_sync, ef_tests test_runner.rs). Mechanical s/EMPTY_KECCACK_HASH/EMPTY_KECCAK_HASH/g.

Land this as-is, file a follow-up: this PR stays surgical (1 const, 1 test, hot-path fix); the unification + rename PR can stand on its own. If you go this way, a drift-catching test extension would be the cheapest in-PR guard: assert_eq!(EMPTY_KECCAK_U256, u256_from_big_endian(EMPTY_KECCACK_HASH.as_bytes())); next to the existing assertion catches divergence if either ever drifts.
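
A dependency-free sketch of such a drift guard is shown here. Plain arrays stand in for `H256` and `U256`, and `EMPTY_KECCAK_BYTES`, `EMPTY_KECCAK_LIMBS`, `limbs_to_big_endian`, and `constants_agree` are hypothetical names chosen for illustration.

```rust
/// keccak256("") as 32 big-endian bytes (stand-in for an `H256` constant).
pub const EMPTY_KECCAK_BYTES: [u8; 32] = [
    0xc5, 0xd2, 0x46, 0x01, 0x86, 0xf7, 0x23, 0x3c,
    0x92, 0x7e, 0x7d, 0xb2, 0xdc, 0xc7, 0x03, 0xc0,
    0xe5, 0x00, 0xb6, 0x53, 0xca, 0x82, 0x27, 0x3b,
    0x7b, 0xfa, 0xd8, 0x04, 0x5d, 0x85, 0xa4, 0x70,
];

/// The same value as four u64 limbs, least significant first
/// (stand-in for a `U256` constant).
pub const EMPTY_KECCAK_LIMBS: [u64; 4] = [
    0x7bfad8045d85a470,
    0xe500b653ca82273b,
    0x927e7db2dcc703c0,
    0xc5d2460186f7233c,
];

/// Re-encode limbs (least significant first) as 32 big-endian bytes.
pub fn limbs_to_big_endian(limbs: &[u64; 4]) -> [u8; 32] {
    let mut out = [0u8; 32];
    // Walk the limbs from most to least significant while filling left to right.
    for (i, limb) in limbs.iter().rev().enumerate() {
        out[i * 8..(i + 1) * 8].copy_from_slice(&limb.to_be_bytes());
    }
    out
}

/// Drift guard: true only while both constants encode the same 256-bit value.
pub fn constants_agree() -> bool {
    limbs_to_big_endian(&EMPTY_KECCAK_LIMBS) == EMPTY_KECCAK_BYTES
}
```

Asserting `constants_agree()` in a unit test would fail as soon as either constant is edited without the other, which is the cheap in-PR guard described above.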

Your call. Non-blocking on the perf merit of this PR either way.

github-actions · 2026-06-03T09:43:39Z

Benchmark Block Execution Results Comparison Against Main

Command	Mean [s]	Min [s]	Max [s]	Relative
`base`	61.022 ± 0.108	60.823	61.206	1.00 ± 0.00
`head`	60.793 ± 0.130	60.576	61.022	1.00

edg-l requested a review from a team as a code owner June 2, 2026 12:34

github-actions Bot assigned edg-l Jun 2, 2026

github-actions Bot added the L1 (Ethereum client) and performance (Block execution throughput and performance in general) labels Jun 2, 2026

github-project-automation Bot added this to ethrex_performance and ethrex_l1 Jun 2, 2026

github-project-automation Bot moved this to Todo in ethrex_performance Jun 2, 2026

MegaRedHand approved these changes Jun 2, 2026

ElFantasma approved these changes Jun 2, 2026

azteca1998 approved these changes Jun 2, 2026

github-project-automation Bot moved this to In Review in ethrex_l1 Jun 2, 2026

edg-l added 3 commits June 3, 2026 10:58

refactor(l1): address review on KECCAK256 empty-input fast path

8ef02c2

- merge ethrex_common imports
- reword misleading hot-path comment
- add drift guard test against EMPTY_KECCACK_HASH

refactor(l1,l2): fix EMPTY_KECCACK_HASH typo -> EMPTY_KECCAK_HASH

4c99952

docs: changelog entry for KECCAK256 empty-input fast path

f43eb32

edg-l enabled auto-merge June 3, 2026 09:16

Merge branch 'main' into perf/keccak-empty-input

5b824f8

edg-l added this pull request to the merge queue Jun 3, 2026

Merged via the queue into main with commit 890efb6 Jun 3, 2026
86 of 89 checks passed

edg-l deleted the perf/keccak-empty-input branch June 3, 2026 12:36

github-project-automation Bot moved this from Todo to Done in ethrex_performance Jun 3, 2026

github-project-automation Bot moved this from In Review to Done in ethrex_l1 Jun 3, 2026

4 participants
