feat(tests): stateful test filling for benchmark tests#2637
feat(tests): stateful test filling for benchmark tests#2637fselmo wants to merge 8 commits intoethereum:forks/amsterdamfrom
Conversation
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## forks/amsterdam #2637 +/- ##
================================================
Coverage 86.24% 86.24%
================================================
Files 599 599
Lines 36984 36984
Branches 3795 3795
================================================
Hits 31895 31895
Misses 4525 4525
Partials 564 564
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
2aa1826 to
f01b02b
Compare
f01b02b to
1aa5f99
Compare
jochem-brouwer
left a comment
There was a problem hiding this comment.
Super exciting this is already close to what we would need! This is starting to come close to indeed what the MITM is doing, except now EELS is driving it 😄 👍
Lemme know if there are any questions after this review 😄 👍
| next_block_number = int(HexNumber(head_block["number"]) + 1) | ||
| next_timestamp = int(HexNumber(head_block["timestamp"]) + 1) | ||
| payload_attributes = self._payload_attributes( | ||
| next_block_number=next_block_number, |
There was a problem hiding this comment.
Block number is not part of payload attributes? See also the amsterdam version of PayloadAttributesV4
| next_timestamp=next_timestamp, | ||
| withdrawals=withdrawals, | ||
| ) | ||
| new_payload = self.testing_rpc.build_block( |
| new_payload = self.testing_rpc.build_block( | ||
| parent_block_hash=Hash(head_block["hash"]), | ||
| payload_attributes=payload_attributes, | ||
| transactions=None, |
There was a problem hiding this comment.
If the
transactionsparameter is JSONnull, the client MAY build a block from its local transaction pool (mempool). (from https://github.com/ethereum/execution-apis/blob/main/src/testing/testing_buildBlockV1.yaml)
I don't think we want this (?), I think
| transactions=None, | |
| transactions=[], |
| @pytest.fixture(scope="session") | ||
| def sender_fund_refund_gas_limit() -> int: | ||
| """Gas limit for fund/refund transactions (replaces sender).""" | ||
| return 21_000 |
There was a problem hiding this comment.
(should be read from fork config as this changes in eip2780)
| Resolved before ``execute_required_contracts`` via fixture | ||
| dependency, so this records the pre-setup chain head. | ||
| """ | ||
| block = eth_rpc.get_block_by_number("latest") |
There was a problem hiding this comment.
I think this should not be queried by number, this should be a command line argument where you provide the blockhash. A number is ambiguous, if you have a client and you stop the client at chain head, if it then reorgs we are now building on a block which no one has 😅 \
If no argument for snapshot hash provided, it is ok (with warn/log) to read chain head and use that.
There was a problem hiding this comment.
I agree, using the latest block can lead to unexpected behavior, like we saw in test_ec_pairing, where the block did not reset correctly to the original snapshot start point.
| start_hex = start_block["number"] | ||
| logger.info(f"Resetting chain to start block {start_hex}") | ||
| try: | ||
| debug_rpc.set_head(start_hex) |
There was a problem hiding this comment.
I'm not sure what this does, would it also set forkchoice in the client correctly? I think this is what should be done instead using standardized APIs:
Let's say we have done the setup and we have reached the start block. Build two testing_buildBlock on top of this start block. Everything should be empty, no withdrawals, no transactions) and all other parameters should be equal except extraData. One of the blocks will be the block hash you use as parentHash in testing_buildBlockV1 and the other one is the reorg block. We never build on top of this blockhash, we only use this to reorg to.
Why? If you try to fcu back to a block which is already in the current canonical chain, then most/all clients will throw. If we use this approach by creating two different blocks which are basically equal except their extraData, we can use it as fork block because everything is the same (RPC calls would also return the correct data on state, except of course last block hash).
This would use existing API and not debug_setHead as I think this differs for all clients and there might be unforeseen side effects.
BTW, this double-block approach is not my idea, it's from @kamilchodola 😄 👍
If something unclear here let us know 😄 👍
| ) | ||
|
|
||
|
|
||
| class StatefulPreRunFixture(CamelModel): |
There was a problem hiding this comment.
Is this a special class or a class with a special name? I think these prerun fixtures are essentially the same layout as normal fixtures right?
For benchmark/repricing effort we would need to control this presetup also. For repricing and also stateful benchmarks we cannot do test setup right before the benchmark block (this loads all state into memory cache and makes the ops super fast). We would need to create this state already in the snapshot / start block (the block where all tests start from).
To give an example, we would take a recent snapshot of mainnet and now in the prerun fill it will the desired data for repricings (basically setting up the state which we also have on perf-devnet-3, except that this entire state is much bigger and is thus worse than mainnet).
Once this is done, we don't run any tests (yet) and now use the chain state right after this prerun as the snapshot state. This is what we distribute to benchmarkoor. We also fill the benchmark fixtures on top of this.
We do this because now the state which we target in the benchs is already in the snapshot (we do not create it when actually running the benchs). This forces the client to actually read the DB and not use some memory cache.
There was a problem hiding this comment.
(Not super urgent because can also write one big test and once test finishes just use that to get the desired snapshot)
There was a problem hiding this comment.
This is a special setup phase that already exists in the gas-benchmarks mechanism (which is globals setting, or something else) It's not for individual benchmark setup; rather, it ensures common contracts (like the factory contract using Nick's method) are pre-deployed on the network. Without this initial setup, all subsequent create2 address calculations and contract deployments would fail.
| def set_head(self, block_number: str) -> None: | ||
| """`debug_setHead`: Reset chain head to the given block.""" | ||
| self.post_request( | ||
| request=RPCCall(method="setHead", params=[block_number]) |
There was a problem hiding this comment.
The fact that this is a number and not a hash indicates that we cannot for instance perform a reorg via this RPC call (we can only roll back into the canonical chain). This would be limiting. Also don´t know if this updates the forkchoice status of client.
|
Okay I realized I might be too skeptical about debug_setHead 😅 Ignore that if it's not right. |
LouisTsai-Csie
left a comment
There was a problem hiding this comment.
fill_stateful.py is mixing three things, types, pytest hooks, and fixtures, now without clear separation. It's a bit hard to know where to start.
The entry point (pytest_configure) is at L234 behind classes and helpers, and pytest_sessionfinish is way down at L587. Plus, fixtures are declared out of order (e.g., execute_required_contracts at L291 depends on stuff at L406 and L428), so we need to jump around to understand the flow.
I would suggest adding section headers and reorganizing (maybe types → helpers → pytest hooks → fixtures by lifecycle → per-test fixtures) might make it way easier to follow. wdyt?
| @click.command( | ||
| name="fill-stateful", | ||
| context_settings={ | ||
| "help_option_names": ["-h", "--help"], |
There was a problem hiding this comment.
I run uv run fill-stateful --help command and this shows up, is it expected behavior?
(ethereum-execution) caijiacheng@caijiachengs-MacBook-Air wt-forks-op-gas-cost % uv run fill-stateful --help
Retrying execution_testing.rpc.rpc.BaseRPC._make_request in 0.5 seconds as it raised ConnectionError: HTTPConnectionPool(host='localhost', port=8545): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x10be56e40>: Failed to establish a new connection: [Errno 61] Connection refused')).
Retrying execution_testing.rpc.rpc.BaseRPC._make_request in 1.0 seconds as it raised ConnectionError: HTTPConnectionPool(host='localhost', port=8545): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x10bedc910>: Failed to establish a new connection: [Errno 61] Connection refused')).
Retrying execution_testing.rpc.rpc.BaseRPC._make_request in 2.0 seconds as it raised ConnectionError: HTTPConnectionPool(host='localhost', port=8545): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x10bedccd0>: Failed to establish a new connection: [Errno 61] Connection refused')).Or i should connect to the node first?
There was a problem hiding this comment.
Not sure if this the root cause: the kwargs below (see fill_stateful function) is deleted and thus not being passed to the is_help_or_collectonly_mode function
| Resolved before ``execute_required_contracts`` via fixture | ||
| dependency, so this records the pre-setup chain head. | ||
| """ | ||
| block = eth_rpc.get_block_by_number("latest") |
There was a problem hiding this comment.
I agree, using the latest block can lead to unexpected behavior, like we saw in test_ec_pairing, where the block did not reset correctly to the original snapshot start point.
| if is_help_or_collectonly_mode(config): | ||
| return |
There was a problem hiding this comment.
Not sure - should this check done before the rpc_endpoint and engine_endpoint check
| if version >= 4 and response.execution_requests is not None: | ||
| params.append(response.execution_requests) |
There was a problem hiding this comment.
| if version >= 4 and response.execution_requests is not None: | |
| params.append(response.execution_requests) | |
| if version >= 4: | |
| assert response.execution_requests is not None | |
| params.append(response.execution_requests) |
I checked Engine API v4+ (link), and found the request field is required. Should we assert its presence instead of ignoring it when it’s None?
| # Mixed phases: SETUP takes precedence (a block containing | ||
| # any setup transaction is considered part of the setup). | ||
| if TestPhase.SETUP in phases: | ||
| return TestPhase.SETUP |
There was a problem hiding this comment.
As I recall, some benchmarks include a single setup transaction for 7702 authorizations, followed by the benchmark transactions within the same block. In these cases, we will be labeling the block as a setup block.
This isn’t an issue for gas-benchmarks, since they keep collecting transactions within the same phase until either the buffer reaches the block gas limit or the phase changes. At that point, it flushes the transactions into a block and generate the payload.
|
Quick question: who's responsible for mounting clients to the snapshot? In gas-benchmarks, Nethermind clients are mounted via overlay, a layer that captures state changes without modifying the snapshot itself. Does this PR have a similar mechanism? If not, where would that be handled? |
|
@LouisTsai-Csie I appreciate the review. This is a working PoC, not something I plan to merge as is. I should've clarified it's still a work in progress and I'm actually making this a lot more |
🗒️ Description
Adds a new fill-stateful CLI command that generates
BlockchainEngineStatefulFixturefiles by running benchmark tests against a live EL client via testing_buildBlockV1.Key points:
debug_setHeadThe generated fixtures are consumed by
benchmarkoorto replay benchmarks against any EL client from the same snapshot. Replaces the gas-benchmarks MITMProxy-based approach.Run against a running client that has
testing,eth,debugavailable and hastesting_buildBlockV1support...uv run fill-stateful \ --clean \ --engine-jwt-secret-file /path/to/jwt \ --fork Osaka \ tests/benchmark/stateful/ \ -k ...Optional:
--rpc-endpoint— defaults tohttp://localhost:8545--engine-endpoint— derived from RPC endpoint with port8551--chain-id— auto-detected from the client--rpc-seed-key— random key generated and funded via CL withdrawal--output— defaults to./fixtures🔗 Related Issues or PRs
#2560
✅ Checklist
just statictype(scope):.Cute Animal Picture