feat(spec-specs, tests): EIP-8037 - CREATE failure refunds state gas to reservoir #2704
Merged
spencer-tb merged 2 commits into ethereum:eips/amsterdam/eip-8037 from Apr 19, 2026
Codecov Report

✅ All modified and coverable lines are covered by tests.

Coverage diff for eips/amsterdam/eip-8037, #2704 (base values are shown as `?` in the report):

| Metric   | Value  |
|----------|--------|
| Coverage | 88.18% |
| Files    | 524    |
| Lines    | 31120  |
| Branches | 3036   |
| Hits     | 27444  |
| Misses   | 3161   |
| Partials | 515    |
Adds six tests in `test_state_gas_create.py` covering every failure branch in `generic_create` (silent failure, address collision, child revert, child exceptional halt, code deposit OOG) plus block level accounting with a mixed success and failure transaction:

- test_create_silent_failure_refunds_state_gas
- test_create_child_revert_refunds_state_gas
- test_create_child_halt_refunds_state_gas
- test_create_mixed_success_and_failure_block_accounting
- test_create_collision_refunds_state_gas
- test_create_code_deposit_oog_refunds_state_gas

All six are strict discriminators of the spec change (24 out of 24 variants fail when the refund is reverted, all pass when applied).

Tests that depend on a narrow gas window (child halt, collision, code deposit OOG) use a caller wrapper with tight gas tuning so the probe SSTORE can only succeed via the refunded reservoir.

Tests that verify block header `gas_used` compute the expected value as `max(tx_regular, tx_state)` defensively from `factory_code.gas_cost(fork)` so the assertion stays correct if the underlying constants drift.

Adds `init_code_at_high_bytes` helper to `spec.py` for placing short init code at the high bytes of a 32 byte memory slot, shared across the new tests.

Removes `test_code_deposit_oog_reservoir_inflation_detection` because the new refund behavior intentionally inflates the parent reservoir on child failure, swamping the ordering signal that test relied on. Code deposit ordering remains covered by `test_create_oog_reservoir_inflation_detection`.
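To make the two helper ideas in this commit message concrete, here is a minimal sketch, not the code from `spec.py`. `place_at_high_bytes` and `expected_header_gas_used` are hypothetical names chosen for illustration; the real `init_code_at_high_bytes` signature is not shown in this PR page and may differ. The sketch assumes the init code is stored with a single MSTORE at memory offset 0, so the bytes end up right-aligned in the 32 byte word.

```python
WORD_SIZE = 32  # bytes in one EVM memory word


def place_at_high_bytes(init_code: bytes) -> tuple[int, int, int]:
    """Hypothetical helper: right-align short init code in one 32-byte word.

    Returns (word, offset, size): the integer to MSTORE at memory offset 0,
    plus the memory offset and length to hand to CREATE/CREATE2.
    """
    if not 0 < len(init_code) <= WORD_SIZE:
        raise ValueError("init code must be between 1 and 32 bytes")
    word = int.from_bytes(init_code, "big")  # big-endian: code sits in the last bytes
    offset = WORD_SIZE - len(init_code)      # first byte of the init code in memory
    return word, offset, len(init_code)


def expected_header_gas_used(tx_regular: int, tx_state: int) -> int:
    """Expected block gas_used for a single-tx block, as described above."""
    return max(tx_regular, tx_state)


# Illustration: PUSH1 0, PUSH1 0, REVERT as a 5-byte reverting init code.
init_code = bytes.fromhex("60006000fd")
word, offset, size = place_at_high_bytes(init_code)
memory = word.to_bytes(WORD_SIZE, "big")  # what memory[0:32] holds after MSTORE(0, word)
assert memory[offset:offset + size] == init_code
```

Deriving the offset from the code length keeps the CREATE arguments in sync with the init code if a test later swaps in a longer or shorter snippet.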
Merged commit e723e7d into ethereum:eips/amsterdam/eip-8037 (11 of 16 checks passed).
spencer-tb added a commit to spencer-tb/execution-specs that referenced this pull request (Apr 19, 2026):

Ports three tests from the closed PR ethereum#2639 that cover reservoir behavior paths not exercised by the merged ethereum#2689/ethereum#2704/ethereum#2707 tests.

test_top_level_halt_preserves_restored_reservoir
(parametrized reservoir_delta in {-1, 0, 1} x child_termination in {revert, halt})
Regression test for the bal-devnet-3 Besu bug (ethereum#2644). Child runs an SSTORE then fails, restoring state gas to the parent. Parent then INVALIDs, triggering the top-level failure refund. Expected `header.gas_used = gas_limit_cap + min(reservoir_delta, 0)` so the reservoir (including any spill-restore) is preserved across the halt.

test_callcode_value_no_new_account_state_gas
CALLCODE transfers value to the caller, not to the target, so no new-account state gas is ever charged regardless of whether the target exists. The reservoir stays intact for a subsequent SSTORE.

test_create_oog_during_state_gas_charge
Parent CALLs an inner with only 20k gas forwarded. The inner's CREATE charges GAS_NEW_ACCOUNT which exceeds the forwarded budget, OOGing before any state gas lands. Per PR ethereum#2704 the refund restores the parent's reservoir and the parent's subsequent SSTORE succeeds from it.
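The expected header value for `test_top_level_halt_preserves_restored_reservoir` can be worked through for the three parametrized deltas. This is only an arithmetic sketch of the formula quoted above; `GAS_LIMIT_CAP` is a placeholder value picked for illustration, not the cap used by the tests, and the function name is hypothetical.

```python
GAS_LIMIT_CAP = 1_000_000  # placeholder for illustration only, not the real cap


def expected_gas_used_after_top_level_halt(gas_limit_cap: int, reservoir_delta: int) -> int:
    """header.gas_used = gas_limit_cap + min(reservoir_delta, 0), per the commit message."""
    return gas_limit_cap + min(reservoir_delta, 0)


# Only a negative delta lowers the expected value; 0 and +1 both land on the cap.
expected = {
    delta: expected_gas_used_after_top_level_halt(GAS_LIMIT_CAP, delta)
    for delta in (-1, 0, 1)
}
```

The `min(..., 0)` clamp is what makes the `+1` case indistinguishable from the `0` case in the header.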
spencer-tb added a commit to spencer-tb/execution-specs that referenced this pull request (Apr 19, 2026):
Two tests that exercise state-gas paths the merged PRs don't
cover directly: both involve a CREATION tx (to=None) whose
initcode interacts with nested CREATE / SELFDESTRUCT semantics.
test_selfdestruct_in_create_tx_initcode
Creation tx whose initcode SELFDESTRUCTs to a new beneficiary.
The outer contract is in `tx_state.created_accounts` and
`accounts_to_delete`, so PR ethereum#2707 refunds its GAS_NEW_ACCOUNT
end-of-tx. The beneficiary's new-account charge is NOT
refunded (beneficiary is not in `created_accounts`), but it
equals the refund amount, so `state_gas_used` nets to zero.
Only the outer intrinsic_state remains in the header.
test_inner_create_succeeds_code_deposit_state_gas
(parametrized `outer_outcome` in {succeeds, reverts, halts} x
`create_opcode` in {CREATE, CREATE2})
Creation tx whose initcode does an inner CREATE that succeeds
and deploys 1 byte of code. The outer then terminates normally,
reverts, or halts.
* outer_succeeds: inner GAS_NEW_ACCOUNT + code-deposit
accumulate via `incorporate_child_on_success`. Block state
= 2 * GAS_NEW_ACCOUNT + inner code deposit.
* outer_reverts / outer_halts: top-level failure refund (PR
ethereum#2689) zeroes execution state gas. Only the outer intrinsic
remains.
Both tests complete the coverage gap between ethereum#2707/ethereum#2704/ethereum#2689
single-scenario tests for creation-tx initcode compositions.
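The state-gas totals described in this commit message can be checked with a small arithmetic sketch. Everything here is illustrative: the constant values are made up, the function names are hypothetical, and the sketch assumes the creation tx's intrinsic state gas equals one `GAS_NEW_ACCOUNT` (inferred from the `2 * GAS_NEW_ACCOUNT` figure above, not confirmed by the spec code).

```python
GAS_NEW_ACCOUNT = 100     # illustrative value only
INNER_CODE_DEPOSIT = 7    # illustrative state gas for the 1-byte inner deployment


def selfdestruct_initcode_block_state(gas_new_account: int) -> int:
    """Creation tx whose initcode SELFDESTRUCTs to a new beneficiary."""
    outer_intrinsic = gas_new_account      # assumption: intrinsic == one new account
    beneficiary_charge = gas_new_account   # charged during execution, never refunded
    outer_refund = gas_new_account         # end-of-tx refund for the created-then-deleted outer
    execution_state_gas = beneficiary_charge - outer_refund  # nets to zero
    return outer_intrinsic + execution_state_gas


def inner_create_block_state(outer_outcome: str, gas_new_account: int, inner_deposit: int) -> int:
    """Creation tx whose initcode runs a successful inner CREATE/CREATE2."""
    outer_intrinsic = gas_new_account      # assumption: intrinsic == one new account
    if outer_outcome == "succeeds":
        # Inner charges accumulate into the outer frame on success.
        return outer_intrinsic + gas_new_account + inner_deposit
    if outer_outcome in ("reverts", "halts"):
        # Top-level failure refund zeroes execution state gas.
        return outer_intrinsic
    raise ValueError(f"unknown outer_outcome: {outer_outcome}")
```

Keeping intrinsic and execution state gas as separate terms mirrors the distinction the two tests rely on: only the execution part is ever zeroed or netted out.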
spencer-tb added a commit to spencer-tb/execution-specs that referenced this pull request (Apr 19, 2026):
Ports the remaining two tests from the
`feat/eip-8037-additional-tests` / `feat/eip-8037-tests-devnet3`
branches that were not yet covered.
test_nested_create_fail_parent_revert_state_gas
Two-layer refund composition: caller CALLs factory, factory
does CREATE with failing initcode, factory then REVERTs or
STOPs. Parametrized over `child_failure` (revert, halt) x
`parent_reverts` x `create_opcode`. Verifies the nonce
side effect of factory's CREATE is rolled back when the
parent reverts, and preserved (nonce=2) when it STOPs.
Complements PR ethereum#2704's single-layer refund tests by
exercising the caller→factory→inner chain through
`incorporate_child_on_error` at both depths.
test_create_stack_depth_state_gas_consumed
Deep-recursion robustness check. The contract CALLs itself
until gas exhaustion (EIP-150 63/64 rule limits effective
depth well below STACK_DEPTH_LIMIT at the current
`gas_limit_cap`; reaching depth 1024 is physically
infeasible since the cumulative survival factor is
`(63/64)**1024 ≈ 1e-7`). As recursion unwinds, frames run
an SSTORE; the outermost frame's SSTORE must succeed,
proving the reservoir threads through nested CALLs intact.
Docstring notes that despite the name (retained for
continuity with closed PR ethereum#2639), this exercises CALL's
silent-failure branch rather than `generic_create`'s
depth-1024 branch (which is unreachable at current gas
params — effectively dead code in the spec).
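The `(63/64)**1024` figure quoted above is easy to reproduce. The sketch below also estimates how deep a self-calling contract can go before the forwarded gas drops under some per-frame minimum. `START_GAS` and `MIN_FRAME_GAS` are hypothetical placeholders, and the model ignores the per-CALL overhead, so it overestimates the real depth.

```python
# Fraction of the original gas that could still be forwarded at depth 1024
# if every frame passed on "all but one 64th" (EIP-150) and spent nothing else.
survival_factor = (63 / 64) ** 1024

START_GAS = 16_000_000   # hypothetical starting budget, for illustration
MIN_FRAME_GAS = 5_000    # hypothetical minimum a frame needs to do useful work


def max_depth_with_at_least(start_gas: int, min_gas: int) -> int:
    """Count nested calls until the forwarded gas falls below min_gas."""
    if min_gas < 64:
        raise ValueError("min_gas must be at least 64 in this simplified model")
    depth = 0
    gas = start_gas
    while True:
        forwarded = gas - gas // 64  # all but one 64th is forwarded
        if forwarded < min_gas:
            return depth
        gas = forwarded
        depth += 1


reachable_depth = max_depth_with_at_least(START_GAS, MIN_FRAME_GAS)
```

Under these assumptions the reachable depth stays far below 1024, which is consistent with the docstring's note that the test exercises gas exhaustion rather than the depth limit.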
spencer-tb added a commit to spencer-tb/execution-specs that referenced this pull request (Apr 19, 2026):

The three regression-fix tests in commit 4828ae6 used hardcoded empirical `block_regular` dicts (per CREATE/CREATE2 x self/external variant) to discriminate a spurious `GAS_NEW_ACCOUNT` charge on the CALL. The dicts are brittle to any regular-gas constant change and the spurious-charge discriminator is redundant: PR ethereum#2707's own tests (`test_create_selfdestruct_*`) already exercise the refund path.

Drop `header_verify` from:

- test_call_value_to_self_destructed_header_gas_used
- test_call_value_to_self_destructed_burns_value
- test_call_zero_value_to_self_destructed_same_tx_account

The tests still verify runtime behavior: NONEXISTENT created address and orchestrator balance burned to zero.

Also adds a cross-over test for the ethereum#2704 + ethereum#2689 refund composition that PR ethereum#2704 does not exercise directly:

test_inner_create_fail_refunds_in_creation_tx
(parametrized `outer_outcome` in {succeeds, reverts}, `num_inner_ops` in {1, 3}, `create_opcode` in {CREATE, CREATE2})
Creation tx with `num_inner_ops` inner CREATE/CREATE2 calls whose initcode REVERTs. Each inner CREATE's GAS_NEW_ACCOUNT is refunded by PR ethereum#2704. Outer then succeeds or reverts. block_state == outer intrinsic in both cases; a client that regressed to pre-ethereum#2704 "gas persists" behavior would inflate it by `num_inner_ops * GAS_NEW_ACCOUNT`.

Rewrites the inverted-premise test from the closed PR ethereum#2639.
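The discriminator in `test_inner_create_fail_refunds_in_creation_tx` comes down to one line of arithmetic. The sketch below compares the post-#2704 behavior with the regressed "gas persists" behavior for the case where the outer frame succeeds. The constants are illustrative and `block_state_gas` is a hypothetical name, not a function from the spec or the test suite.

```python
GAS_NEW_ACCOUNT = 100   # illustrative value only
OUTER_INTRINSIC = 100   # illustrative intrinsic state gas of the creation tx


def block_state_gas(num_inner_ops: int, refund_on_failure: bool) -> int:
    """Block state gas after num_inner_ops failing inner CREATEs, outer succeeds."""
    charged = num_inner_ops * GAS_NEW_ACCOUNT           # pay before execute
    refunded = charged if refund_on_failure else 0      # refund on each failed CREATE
    return OUTER_INTRINSIC + charged - refunded


for num_inner_ops in (1, 3):
    with_refund = block_state_gas(num_inner_ops, refund_on_failure=True)
    regressed = block_state_gas(num_inner_ops, refund_on_failure=False)
    # A regressed client overshoots by exactly num_inner_ops * GAS_NEW_ACCOUNT.
    assert regressed - with_refund == num_inner_ops * GAS_NEW_ACCOUNT
```

Parametrizing over `num_inner_ops` means the inflation scales with the number of failed creates, so a regression cannot hide behind a single coincidental value.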
🗒️ Description
Implements CREATE/CREATE2 failure refunds: the account state gas charge (`GAS_NEW_ACCOUNT`) is returned to `state_gas_reservoir` (`STATE_BYTES_PER_NEW_ACCOUNT * cost_per_state_byte`).

EIPs change for reference: ethereum/EIPs#11532

Spec change: 2be49b2

`CREATE`/`CREATE2` keeps pay before execute (`GAS_NEW_ACCOUNT` charged upfront). On any failure path, the charge is refunded to `state_gas_reservoir` (and `execution_state_gas_used` is decremented). Covers:

- Silent failures (insufficient balance, nonce overflow, stack depth)
- Address collision
- Child frame revert
- Child exceptional halt (including code deposit OOG and EIP 3541 invalid prefix)
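The pay-before-execute and refund flow can be modeled roughly as follows. This is a toy model under stated assumptions, not the spec's `generic_create`: the `Frame` class and the helper functions are hypothetical, only the three field names come from this PR, and the spillover rule (draw from the reservoir first, then from `gas_left`, with the refund going to the reservoir only) follows the behavior described for the spillover test variant.

```python
from dataclasses import dataclass


class OutOfGas(Exception):
    """Raised when neither the reservoir nor gas_left can cover a state charge."""


@dataclass
class Frame:
    gas_left: int
    state_gas_reservoir: int
    execution_state_gas_used: int = 0


def charge_state_gas(frame: Frame, amount: int) -> None:
    """Draw from the reservoir first; any shortfall spills into gas_left."""
    from_reservoir = min(frame.state_gas_reservoir, amount)
    spill = amount - from_reservoir
    if spill > frame.gas_left:
        raise OutOfGas
    frame.state_gas_reservoir -= from_reservoir
    frame.gas_left -= spill
    frame.execution_state_gas_used += amount


def refund_state_gas(frame: Frame, amount: int) -> None:
    """The refund always lands in the reservoir, never back in gas_left."""
    frame.state_gas_reservoir += amount
    frame.execution_state_gas_used -= amount


def create(frame: Frame, gas_new_account: int, child_succeeds: bool) -> bool:
    """Charge upfront, then refund on any failure path."""
    charge_state_gas(frame, gas_new_account)
    if not child_succeeds:
        refund_state_gas(frame, gas_new_account)
    return child_succeeds


# Spillover example: reservoir covers 30 of the 100 charged, gas_left covers 70.
frame = Frame(gas_left=1_000, state_gas_reservoir=30)
create(frame, gas_new_account=100, child_succeeds=False)
```

In the spillover case the failed CREATE leaves the reservoir larger than it started, which is the inflation effect that the removed ordering test could no longer tolerate.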
Tests: 136312e
Six new tests in `test_state_gas_create.py` covering every failure branch:

- `test_create_silent_failure_refunds_state_gas`, parametrized `failure_mode=[nonce_overflow, insufficient_balance]`. Silent failures refund `GAS_NEW_ACCOUNT`, verified via `header_verify` on block `gas_used`.
- `test_create_child_revert_refunds_state_gas`, parametrized `gas_limit_mode=[with_reservoir, spillover]` × `create_opcode=[CREATE, CREATE2]`. Child REVERT refunds the parent's CREATE charge. The spillover variant runs with tx gas at the cap so the state charge spills into `gas_left`, then the refund returns to the reservoir (not back to `gas_left`). Verified via `header_verify`.
- `test_create_child_halt_refunds_state_gas`, parametrized `failure_mode=[initcode_halt, invalid_prefix]` × `create_opcode=[CREATE, CREATE2]`. Exceptional halts consume all forwarded gas as `regular_gas_used` so block accounting cannot strictly discriminate via header gas. Uses a caller wrapper plus tight gas tuning pattern so the probe SSTORE can only succeed via the refunded reservoir.
- `test_create_mixed_success_and_failure_block_accounting`, parametrized `create_opcode=[CREATE, CREATE2]`. Successful CREATE followed by failed CREATE; block state gas reflects only the successful charges via `header_verify`.
- `test_create_collision_refunds_state_gas`, parametrized `create_opcode=[CREATE, CREATE2]` with `@pre_alloc_mutable`. A contract pre deployed at the CREATE/CREATE2 target triggers the `account_has_code_or_nonce` silent failure branch. Tight gas tuning via a caller wrapper leaves the factory in the discrimination window so the probe SSTORE succeeds only via the refund.
- `test_create_code_deposit_oog_refunds_state_gas`, parametrized `create_opcode=[CREATE, CREATE2]`. Initcode returns `MAX_CODE_SIZE + 1` bytes, triggering an exceptional halt in code deposit. Tight gas tuning leaves the factory in the discrimination window so the probe SSTORE succeeds only via the refund.

🔗 Related Issues or PRs