Skip to content

feat(spec-specs, tests): EIP-8037 - zero execution state gas on top-level failure#2689

Merged
spencer-tb merged 4 commits intoethereum:eips/amsterdam/eip-8037from
kclowes:feat/8037-top-level-res-refund
Apr 19, 2026
Merged

feat(spec-specs, tests): EIP-8037 - zero execution state gas on top-level failure#2689
spencer-tb merged 4 commits intoethereum:eips/amsterdam/eip-8037from
kclowes:feat/8037-top-level-res-refund

Conversation

@kclowes
Copy link
Copy Markdown
Contributor

@kclowes kclowes commented Apr 15, 2026

🗒️ Description

On a top level transaction failure (revert, exceptional halt, or out of gas) no state persists, so execution state gas should not be billed. This PR updates the spec and tests so the consumed execution state gas is returned to the reservoir and state_gas_used is zeroed before the transaction fee is computed.

EIPs change for reference: ethereum/EIPs#11476

Spec change: ad40e40

src/ethereum/forks/amsterdam/fork.py process_transaction, after process_message_call returns, when tx_output.error is set: tx_output.state_gas_left += tx_output.state_gas_used and tx_output.state_gas_used = 0. The billing formula tx.gas - gas_left - state_gas_left then sees the full reservoir restored and refunds the sender. Intrinsic state gas is untouched.

Existing fork fixups: 7a0358f

Updates two pre-EIP-8037 tests to branch on fork.is_eip_enabled(8037). They assumed sender balance was fully consumed on OOG, which is no longer true once the state gas portion is refunded.

New tests: c00d017

Six tests in tests/amsterdam/eip8037_state_creation_gas_cost_increase/test_state_gas_reservoir.py.

  • test_top_level_failure_refunds_execution_state_gas, parametrized failure_mode=[revert, halt, oog]. Receipt cumulative_gas_used excludes the SSTORE state gas that was rolled back, proving the sender is refunded.
  • test_top_level_failure_zeros_block_state_gas, parametrized failure_mode=[revert, halt, oog]. blockchain_test + header_verify asserting the block header gas_used equals only the regular gas component.
  • test_creation_tx_failure_preserves_intrinsic_state_gas. A creation tx whose initcode halts carries intrinsic state gas for the new account; the top level failure refund zeroes only execution state gas and preserves the intrinsic component in block accounting.
  • test_subcall_failure_does_not_zero_top_level_state_gas. Parent calls a child that reverts and then runs its own SSTORE. The top level tx succeeds, so the refund does not apply and the parent's state gas surfaces in the block header. Proves the refund is scoped to top level failures.
  • test_top_level_failure_refunds_spilled_state_gas. Reservoir sized to half the SSTORE state gas so the other half spills into gas_left. Proves the refund covers both portions, not just the reservoir.
  • test_top_level_failure_refunds_state_gas_propagated_from_child. A successful subcall runs SSTORE and returns to the parent, then the parent reverts. Proves the refund catches state gas propagated up via the child success path, not just direct top frame charges.

🔗 Related Issues or PRs

✅ Checklist

  • All: Ran fast static checks to avoid unnecessary CI fails, see also Code Standards and Enabling Pre-commit Checks:
    just static
  • All: PR title adheres to the repo standard - it will be used as the squash commit message and should start type(scope):.
  • All: Considered updating the online docs in the ./docs/ directory.
  • All: Set appropriate labels for the changes (only maintainers can apply labels).

@kclowes kclowes added C-test Category: test A-test-specs Area: execution_testing.specs labels Apr 15, 2026
@kclowes kclowes marked this pull request as ready for review April 15, 2026 21:42
@kclowes kclowes requested a review from spencer-tb April 15, 2026 21:42
@codecov
Copy link
Copy Markdown

codecov Bot commented Apr 15, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
⚠️ Please upload report for BASE (eips/amsterdam/eip-8037@44b47cc). Learn more about missing BASE report.

Additional details and impacted files
@@                    Coverage Diff                     @@
##             eips/amsterdam/eip-8037    #2689   +/-   ##
==========================================================
  Coverage                           ?   88.18%           
==========================================================
  Files                              ?      524           
  Lines                              ?    31120           
  Branches                           ?     3036           
==========================================================
  Hits                               ?    27444           
  Misses                             ?     3161           
  Partials                           ?      515           
Flag Coverage Δ
unittests 88.18% <ø> (?)

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@jochem-brouwer
Copy link
Copy Markdown
Member

If execution fails (the computation goes out of gas) then the EIP-7702 authorizations still go through, and if an account does not exist, this account is created (so this is related to state cost). However I think in EIP-8037 this is taken from the intrinsic cost, and although this does create state, I think setting state gas used to 0 is correct in all cases... except maybe if the account is already there? Because then you would get a refund added to state gas, all regular gas is spent, but I think you should in the end get this refund back 🤔

As a side note, due to the base fees the sender account is guaranteed to exist (otherwise it could not pay for the transaction). But coinbase could be empty prior to the transaction. The gas fees paid for executing the failure could create the coinbase (although this is currently free, and should stay like so)

@kclowes kclowes marked this pull request as draft April 16, 2026 04:20
@kclowes
Copy link
Copy Markdown
Contributor Author

kclowes commented Apr 16, 2026

This adds a whole bunch of new fails, so converting back to draft for now.

@felix314159 felix314159 force-pushed the eips/amsterdam/eip-8037 branch from 9755dba to 3e1d7c4 Compare April 16, 2026 10:34
@kclowes kclowes force-pushed the feat/8037-top-level-res-refund branch 2 times, most recently from 152fcc2 to 9eb4af7 Compare April 16, 2026 19:39
@kclowes kclowes marked this pull request as ready for review April 16, 2026 20:23
@spencer-tb spencer-tb force-pushed the eips/amsterdam/eip-8037 branch from 3e1d7c4 to 44b47cc Compare April 17, 2026 16:01
@spencer-tb spencer-tb force-pushed the feat/8037-top-level-res-refund branch from 9eb4af7 to 081d253 Compare April 17, 2026 16:19
@spencer-tb spencer-tb changed the title feat(tests): 8037 Use zero state gas on top-level fail feat(spec-specs, tests): EIP-8037 - zero execution state gas on top-level failure Apr 17, 2026
Copy link
Copy Markdown
Contributor

@spencer-tb spencer-tb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Feel free to add more coverage!

@kclowes
Copy link
Copy Markdown
Contributor Author

kclowes commented Apr 17, 2026

@jochem-brouwer I'm not sure I understand your comment above. Are you opening these points for discussion or do you feel that there are changes that need to happen on this PR?

@kclowes kclowes force-pushed the feat/8037-top-level-res-refund branch from 081d253 to b799ce6 Compare April 17, 2026 21:04
@spencer-tb spencer-tb force-pushed the feat/8037-top-level-res-refund branch from 9d6c5d9 to c00d017 Compare April 19, 2026 07:20
@spencer-tb
Copy link
Copy Markdown
Contributor

If execution fails (the computation goes out of gas) then the EIP-7702 authorizations still go through, and if an account does not exist, this account is created (so this is related to state cost). However I think in EIP-8037 this is taken from the intrinsic cost, and although this does create state, I think setting state gas used to 0 is correct in all cases... except maybe if the account is already there? Because then you would get a refund added to state gas, all regular gas is spent, but I think you should in the end get this refund back 🤔

As a side note, due to the base fees the sender account is guaranteed to exist (otherwise it could not pay for the transaction). But coinbase could be empty prior to the transaction. The gas fees paid for executing the failure could create the coinbase (although this is currently free, and should stay like so)

Yes I think you get the refund back, this change only touches state_gas_used, so the 7702 existing account refund sits in state_gas_left and survives the failure path, getting refunded to the sender

My understanding here is the interaction between point 1 (this pr) and point 6 from marias doc, we'll have a seperate PR up for interactions between each point. This PR only targets point 1 spec and tests specifically :)

@spencer-tb
Copy link
Copy Markdown
Contributor

CI fails are expected static fails, and mirror issues on devops runners! Merging :D

@spencer-tb spencer-tb merged commit cef0fde into ethereum:eips/amsterdam/eip-8037 Apr 19, 2026
16 of 19 checks passed
spencer-tb added a commit to spencer-tb/execution-specs that referenced this pull request Apr 19, 2026
`test_create_oog_from_eoa_refunds` asserted `balance=0` on the
sender for every OOG scenario. Under EIP-8037 this breaks on two
fronts:

- Top-level OOG refunds execution state gas to `state_gas_left`
  (ethereum#2689).
- Nested CREATE failure refunds `GAS_NEW_ACCOUNT` (this PR).

Both paths leave the sender with a positive residual balance after
the tx gas refund, so drop the strict balance assertion on
Amsterdam+ while keeping the legacy `balance=0` check for
pre-EIP-8037 forks. Mirrors the fixup already on
ethereum#2689 (duplicated here because the base
branch does not yet include it).
spencer-tb added a commit to spencer-tb/execution-specs that referenced this pull request Apr 19, 2026
Ported from closed PR ethereum#2639. Adds four tests that exercise
scenarios the existing 2D accounting tests don't cover:

  test_tx_rejected_when_regular_gas_exceeds_block_limit_small
    Complements `test_block_regular_gas_limit`. Uses a tight block
    gas_limit (2 * intrinsic) and a rejected tx sized just one gas
    above the remaining regular budget. Catches clients that only
    handle the TX_MAX-sized case.

  test_block_2d_gas_tx_gas_limit_exceeds_regular_remaining
    Parametrized over `tx_gas_limit_equals_block_limit` and
    `tx_gas_limit_just_above_remaining`. A preceding STOP tx
    consumes regular gas, then a second tx has
    `gas_limit >> TX_MAX_GAS_LIMIT`. The pre-execution check must
    use `min(TX_MAX_GAS_LIMIT, tx.gas - intrinsic.state)` not the
    raw `tx.gas_limit`; clients that subtract the full gas_limit
    reject this valid block.

  test_receipt_cumulative_differs_from_header_gas_used
    Explicit assertion that 2D `header.gas_used` can diverge
    from 1D receipt `cumulative_gas_used` when state dominates.
    Catches clients that mix up the two values.

  test_failed_create_tx_state_gas_dominates (parametrized
    `revert`, `halt`)
    Creation tx (to=None) with tight gas where initcode fails.
    Intrinsic state gas (GAS_NEW_ACCOUNT) is preserved across
    the top-level failure refund; tight regular budget keeps
    block_regular below `create_state_gas` so the state
    dimension dominates the header. Complements PR ethereum#2689's
    `test_creation_tx_failure_preserves_intrinsic_state_gas`
    by covering the REVERT path and the tight-gas scenario.
spencer-tb added a commit to spencer-tb/execution-specs that referenced this pull request Apr 19, 2026
Ports three tests from the closed PR ethereum#2639 that cover reservoir
behavior paths not exercised by the merged ethereum#2689/ethereum#2704/ethereum#2707
tests.

  test_top_level_halt_preserves_restored_reservoir (parametrized
    reservoir_delta in {-1, 0, 1} x child_termination in {revert,
    halt})
    Regression test for the bal-devnet-3 Besu bug (ethereum#2644). Child
    runs an SSTORE then fails, restoring state gas to the parent.
    Parent then INVALIDs, triggering the top-level failure
    refund. Expected `header.gas_used =
    gas_limit_cap + min(reservoir_delta, 0)` so the reservoir
    (including any spill-restore) is preserved across the halt.

  test_callcode_value_no_new_account_state_gas
    CALLCODE transfers value to the caller, not to the target,
    so no new-account state gas is ever charged regardless of
    whether the target exists. The reservoir stays intact for a
    subsequent SSTORE.

  test_create_oog_during_state_gas_charge
    Parent CALLs an inner with only 20k gas forwarded. The
    inner's CREATE charges GAS_NEW_ACCOUNT which exceeds the
    forwarded budget, OOGing before any state gas lands. Per
    PR ethereum#2704 the refund restores the parent's reservoir and the
    parent's subsequent SSTORE succeeds from it.
spencer-tb added a commit to spencer-tb/execution-specs that referenced this pull request Apr 19, 2026
Two tests that exercise state-gas paths the merged PRs don't
cover directly: both involve a CREATION tx (to=None) whose
initcode interacts with nested CREATE / SELFDESTRUCT semantics.

  test_selfdestruct_in_create_tx_initcode
    Creation tx whose initcode SELFDESTRUCTs to a new beneficiary.
    The outer contract is in `tx_state.created_accounts` and
    `accounts_to_delete`, so PR ethereum#2707 refunds its GAS_NEW_ACCOUNT
    end-of-tx. The beneficiary's new-account charge is NOT
    refunded (beneficiary is not in `created_accounts`), but it
    equals the refund amount, so `state_gas_used` nets to zero.
    Only the outer intrinsic_state remains in the header.

  test_inner_create_succeeds_code_deposit_state_gas
    (parametrized `outer_outcome` in {succeeds, reverts, halts} x
     `create_opcode` in {CREATE, CREATE2})
    Creation tx whose initcode does an inner CREATE that succeeds
    and deploys 1 byte of code. The outer then terminates normally,
    reverts, or halts.
    * outer_succeeds: inner GAS_NEW_ACCOUNT + code-deposit
      accumulate via `incorporate_child_on_success`. Block state
      = 2 * GAS_NEW_ACCOUNT + inner code deposit.
    * outer_reverts / outer_halts: top-level failure refund (PR
      ethereum#2689) zeroes execution state gas. Only the outer intrinsic
      remains.

Both tests complete the coverage gap between ethereum#2707/ethereum#2704/ethereum#2689
single-scenario tests for creation-tx initcode compositions.
spencer-tb added a commit to spencer-tb/execution-specs that referenced this pull request Apr 19, 2026
The three regression-fix tests in commit 4828ae6 used hardcoded
empirical `block_regular` dicts (per CREATE/CREATE2 x
self/external variant) to discriminate a spurious
`GAS_NEW_ACCOUNT` charge on the CALL. The dicts are brittle to any
regular-gas constant change and the spurious-charge discriminator
is redundant: PR ethereum#2707's own tests (`test_create_selfdestruct_*`)
already exercise the refund path.

Drop `header_verify` from:
  test_call_value_to_self_destructed_header_gas_used
  test_call_value_to_self_destructed_burns_value
  test_call_zero_value_to_self_destructed_same_tx_account

The tests still verify runtime behavior: NONEXISTENT created
address and orchestrator balance burned to zero.

Also adds a cross-over test for the ethereum#2704 + ethereum#2689 refund
composition that PR ethereum#2704 does not exercise directly:

  test_inner_create_fail_refunds_in_creation_tx (parametrized
    `outer_outcome` in {succeeds, reverts}, `num_inner_ops` in
    {1, 3}, `create_opcode` in {CREATE, CREATE2})
    Creation tx with `num_inner_ops` inner CREATE/CREATE2 calls
    whose initcode REVERTs. Each inner CREATE's GAS_NEW_ACCOUNT
    is refunded by PR ethereum#2704. Outer then succeeds or reverts.
    block_state == outer intrinsic in both cases; a client that
    regressed to pre-ethereum#2704 "gas persists" behavior would inflate
    it by `num_inner_ops * GAS_NEW_ACCOUNT`. Rewrites the
    inverted-premise test from the closed PR ethereum#2639.
spencer-tb added a commit to spencer-tb/execution-specs that referenced this pull request Apr 19, 2026
…cstring

Expand the docstring to explicitly reference the geth bal-devnet-3
behavior the test codifies, spell out the regression signal, and
justify the `outer_halts` omission by pointing to PR ethereum#2689's
`test_creation_tx_failure_preserves_intrinsic_state_gas`.

if tx_output.error is not None:
tx_output.state_gas_left += tx_output.state_gas_used
tx_output.state_gas_used = Uint(0)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this is incorrect: if an EIP-7702 account is delegated and the account we are delegating already exists, we get a state gas refund which should propagate here. AFAIK all other state updates are triggered by execution, so indeed if there is an error this should be subtracted.

The gas calculation for the block is now incorrect: in case of a EIP-7702 delegation the state gas used of the block should increment. This is here set to 0.

For instance the case where the tx fails vs tx does not fail (either call an account with INVALID/REVERT(0,0) as opcode and where the tx creates an EIP-7702 account PER_EMPTY_ACCOUNT_COST is paid in state gas (or at least the PER_AUTH_BASE_COST). However via current code this sets block state gas to 0 and execution gas to the intrinsic gas cost (plus any cost for the revert / or all gas in case of invalid). This fails in case state gas > execution gas.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok I see where this comes from (from EIP-8037)

On child exceptional halt, the child’s gas_left is additionally consumed (zeroed).

This is incorrect in the EIP-7702 case, see EIP-7702 (see Behavior section):

Note, if transaction execution results in failure (e.g. any exceptional condition or code reverting), the processed delegation indicators is not rolled back.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jochem-brouwer I think this is right. The 7702 interactions are handled in forks/amsterdam/vm/eoa_delegation.py. I put up PR #2722 to add some tests around these scenarios. Let me know here or there if those look wrong or if I misunderstood something!

marioevz pushed a commit that referenced this pull request Apr 20, 2026
…evel failure (#2689)

Co-authored-by: spencer-tb <spencer.tb@ethereum.org>
spencer-tb added a commit to spencer-tb/execution-specs that referenced this pull request Apr 20, 2026
Ported from closed PR ethereum#2639. Adds four tests that exercise
scenarios the existing 2D accounting tests don't cover:

  test_tx_rejected_when_regular_gas_exceeds_block_limit_small
    Complements `test_block_regular_gas_limit`. Uses a tight block
    gas_limit (2 * intrinsic) and a rejected tx sized just one gas
    above the remaining regular budget. Catches clients that only
    handle the TX_MAX-sized case.

  test_block_2d_gas_tx_gas_limit_exceeds_regular_remaining
    Parametrized over `tx_gas_limit_equals_block_limit` and
    `tx_gas_limit_just_above_remaining`. A preceding STOP tx
    consumes regular gas, then a second tx has
    `gas_limit >> TX_MAX_GAS_LIMIT`. The pre-execution check must
    use `min(TX_MAX_GAS_LIMIT, tx.gas - intrinsic.state)` not the
    raw `tx.gas_limit`; clients that subtract the full gas_limit
    reject this valid block.

  test_receipt_cumulative_differs_from_header_gas_used
    Explicit assertion that 2D `header.gas_used` can diverge
    from 1D receipt `cumulative_gas_used` when state dominates.
    Catches clients that mix up the two values.

  test_failed_create_tx_state_gas_dominates (parametrized
    `revert`, `halt`)
    Creation tx (to=None) with tight gas where initcode fails.
    Intrinsic state gas (GAS_NEW_ACCOUNT) is preserved across
    the top-level failure refund; tight regular budget keeps
    block_regular below `create_state_gas` so the state
    dimension dominates the header. Complements PR ethereum#2689's
    `test_creation_tx_failure_preserves_intrinsic_state_gas`
    by covering the REVERT path and the tight-gas scenario.
spencer-tb added a commit to spencer-tb/execution-specs that referenced this pull request Apr 20, 2026
Ports three tests from the closed PR ethereum#2639 that cover reservoir
behavior paths not exercised by the merged ethereum#2689/ethereum#2704/ethereum#2707
tests.

  test_top_level_halt_preserves_restored_reservoir (parametrized
    reservoir_delta in {-1, 0, 1} x child_termination in {revert,
    halt})
    Regression test for the bal-devnet-3 Besu bug (ethereum#2644). Child
    runs an SSTORE then fails, restoring state gas to the parent.
    Parent then INVALIDs, triggering the top-level failure
    refund. Expected `header.gas_used =
    gas_limit_cap + min(reservoir_delta, 0)` so the reservoir
    (including any spill-restore) is preserved across the halt.

  test_callcode_value_no_new_account_state_gas
    CALLCODE transfers value to the caller, not to the target,
    so no new-account state gas is ever charged regardless of
    whether the target exists. The reservoir stays intact for a
    subsequent SSTORE.

  test_create_oog_during_state_gas_charge
    Parent CALLs an inner with only 20k gas forwarded. The
    inner's CREATE charges GAS_NEW_ACCOUNT which exceeds the
    forwarded budget, OOGing before any state gas lands. Per
    PR ethereum#2704 the refund restores the parent's reservoir and the
    parent's subsequent SSTORE succeeds from it.
spencer-tb added a commit to spencer-tb/execution-specs that referenced this pull request Apr 20, 2026
Two tests that exercise state-gas paths the merged PRs don't
cover directly: both involve a CREATION tx (to=None) whose
initcode interacts with nested CREATE / SELFDESTRUCT semantics.

  test_selfdestruct_in_create_tx_initcode
    Creation tx whose initcode SELFDESTRUCTs to a new beneficiary.
    The outer contract is in `tx_state.created_accounts` and
    `accounts_to_delete`, so PR ethereum#2707 refunds its GAS_NEW_ACCOUNT
    end-of-tx. The beneficiary's new-account charge is NOT
    refunded (beneficiary is not in `created_accounts`), but it
    equals the refund amount, so `state_gas_used` nets to zero.
    Only the outer intrinsic_state remains in the header.

  test_inner_create_succeeds_code_deposit_state_gas
    (parametrized `outer_outcome` in {succeeds, reverts, halts} x
     `create_opcode` in {CREATE, CREATE2})
    Creation tx whose initcode does an inner CREATE that succeeds
    and deploys 1 byte of code. The outer then terminates normally,
    reverts, or halts.
    * outer_succeeds: inner GAS_NEW_ACCOUNT + code-deposit
      accumulate via `incorporate_child_on_success`. Block state
      = 2 * GAS_NEW_ACCOUNT + inner code deposit.
    * outer_reverts / outer_halts: top-level failure refund (PR
      ethereum#2689) zeroes execution state gas. Only the outer intrinsic
      remains.

Both tests complete the coverage gap between ethereum#2707/ethereum#2704/ethereum#2689
single-scenario tests for creation-tx initcode compositions.
spencer-tb added a commit that referenced this pull request Apr 20, 2026
…evel failure (#2689)

Co-authored-by: spencer-tb <spencer.tb@ethereum.org>
spencer-tb added a commit to spencer-tb/execution-specs that referenced this pull request Apr 21, 2026
…evel failure (ethereum#2689)

Co-authored-by: spencer-tb <spencer.tb@ethereum.org>
spencer-tb added a commit to spencer-tb/execution-specs that referenced this pull request Apr 21, 2026
…evel failure (ethereum#2689)

Co-authored-by: spencer-tb <spencer.tb@ethereum.org>
spencer-tb added a commit that referenced this pull request Apr 21, 2026
…evel failure (#2689)

Co-authored-by: spencer-tb <spencer.tb@ethereum.org>
@jangko jangko mentioned this pull request Apr 21, 2026
9 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

A-test-specs Area: execution_testing.specs C-test Category: test

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants