perf: optimize BAL lookups and eliminate redundant GetCodeHash calls#11446
perf: optimize BAL lookups and eliminate redundant GetCodeHash calls#11446kamilchodola wants to merge 6 commits into
Conversation
- Replace SortedDictionary with Dictionary in BlockAccessList for O(1) account lookups on the EVM hot path (was O(log n) with 20-byte span comparisons). Sorted enumeration preserved for encoding/validation. - Merge TryGetDelegation + GetCachedCodeInfo into a single call in InstructionCall, eliminating a redundant GetCodeHash per CALL opcode. - Inline IsDeadAccount in EXTCODEHASH to avoid a second GetCodeHash call for the same address.
|
Claude finished @kamilchodola's task in 6m 17s —— View job Code Review
SummaryThe core optimizations are valid and the benchmark numbers are strong. Three targeted changes — Dictionary for O(1) BAL lookups, collapsing FindingsMedium —
|
| Severity | Count | Items |
|---|---|---|
| Critical | 0 | — |
| High | 0 | — |
| Medium | 1 | LINQ OrderBy in AccountChangesSorted (style + validation perf) |
| Low | 1 | GetContentLength inconsistency |
|
|
||
| public EnumerableWithCount<AccountChanges> AccountChanges => new(_accountChanges.Values, _accountChanges.Values.Count); | ||
| public EnumerableWithCount<AccountChanges> AccountChangesSorted => new( | ||
| _accountChanges.Values.OrderBy(static a => a.Address, GenericComparer.GetOptimized<Address>()), |
There was a problem hiding this comment.
Medium — LINQ rule violation + O(n log n) sort on every access
AccountChangesSorted calls OrderBy on every access. The coding-style rules say "No LINQ when a simple for/foreach works" — here List<T>.Sort() is the idiomatic alternative.
More importantly, GetChangesAtIndex (called once per transaction index during block validation in BlockAccessListManager) now triggers an O(n log n) sort per call, whereas the previous SortedDictionary kept items pre-sorted. For a block with T transactions and A accounts this degrades validation from O(T·A) to O(T·A log A).
Suggest replacing with a sort-on-demand pattern:
public EnumerableWithCount<AccountChanges> AccountChangesSorted
{
get
{
List<AccountChanges> sorted = new(_accountChanges.Values);
sorted.Sort(static (a, b) => GenericComparer.GetOptimized<Address>().Compare(a.Address, b.Address));
return new EnumerableWithCount<AccountChanges>(sorted, sorted.Count);
}
}This avoids the LINQ overhead and hidden deferred-execution allocations while still producing the same result. The per-call O(n log n) cost is unchanged, but at least there's no additional LINQ machinery on top.
| => stream.EncodeArray([.. item.AccountChangesSorted], rlpBehaviors); | ||
|
|
||
| private static int GetContentLength(BlockAccessList item, RlpBehaviors rlpBehaviors) | ||
| => AccountChangesDecoder.Instance.GetContentLength([.. item.AccountChanges], rlpBehaviors); |
There was a problem hiding this comment.
Low — inconsistency: GetContentLength uses unsorted AccountChanges while Encode uses AccountChangesSorted
RLP encoding length is order-independent, so this is functionally correct. But it reads oddly side-by-side with the Encode method that explicitly sorts. For clarity, consider using AccountChanges (the simpler property) in both, or documenting why they differ.
| => AccountChangesDecoder.Instance.GetContentLength([.. item.AccountChanges], rlpBehaviors); | |
| => AccountChangesDecoder.Instance.GetContentLength([.. item.AccountChangesSorted], rlpBehaviors); |
|
Claude finished @kamilchodola's task in 3m 48s —— View job Code Review
SummaryAll findings from the first review have been correctly resolved. The code is correct, performant, and clean. No new issues found. Resolved Findings (verified)Medium (now fixed) — LINQ
|
| Severity | Count | Items |
|---|---|---|
| Critical | 0 | — |
| High | 0 | — |
| Medium | 0 | All resolved |
| Low | 0 | All resolved |
Ready to merge. The benchmark numbers (up to −26% at 300 M gas) are strong, both previous findings are properly fixed, and no new issues are present.
LukaszRozmej
left a comment
There was a problem hiding this comment.
Some minor optimizations still possible:
| List<AccountChanges> sorted = new(_accountChanges.Values); | ||
| sorted.Sort(static (a, b) => a.Address.CompareTo(b.Address)); | ||
| return new EnumerableWithCount<AccountChanges>(sorted, sorted.Count); |
There was a problem hiding this comment.
should this be cached (and dropped on any _accountChanges changes)?
| } | ||
| // Otherwise, push the account's code hash. | ||
| ValueHash256 hash = state.GetCodeHash(address); | ||
| if (state.GetBalance(address) == 0 && state.GetNonce(address) == 0 && hash == Keccak.OfAnEmptyString) |
There was a problem hiding this comment.
| if (state.GetBalance(address) == 0 && state.GetNonce(address) == 0 && hash == Keccak.OfAnEmptyString) | |
| if (state.GetBalance(address).IsZero && state.GetNonce(address).IsZero && hash == Keccak.OfAnEmptyString) |
| ValueHash256 hash = state.GetCodeHash(address); | ||
| if (state.GetBalance(address) == 0 && state.GetNonce(address) == 0 && hash == Keccak.OfAnEmptyString) |
There was a problem hiding this comment.
Maybe IsDeadAccount or IsNonZeroAccount would be better?
Or TryGetAccount and then IsEmpty or IsNull?
|
|
||
| // todo: optimize to use hashmaps where appropriate, separate data structures for tracing and state reading | ||
| private readonly SortedDictionary<Address, AccountChanges> _accountChanges = new(GenericComparer.GetOptimized<Address>()); | ||
| private readonly Dictionary<Address, AccountChanges> _accountChanges = new(); |
There was a problem hiding this comment.
should we pre-initialize with some sensible count? Like 4*transaction count in block?
EXPB Benchmark ComparisonRun: View workflow run superblocksScenario: Client Processing (SSE)
K6 TTFB
realblocksScenario: Client Processing (SSE)
K6 TTFB
|
…lining - Add [JsonIgnore] to AccountChangesSorted property to prevent it from being serialized in JSON-RPC responses (fixes Eth_get_block_access_list tests) - Revert EXTCODEHASH to use IsDeadAccount (single state lookup vs 3 separate calls, logically equivalent but more efficient)
Revert the optimization that merged TryGetDelegation + GetCachedCodeInfo into a single call. Moving GetCachedCodeInfo before the new-account gas check causes EIP-7702 delegation + OOG Pyspec tests to fail (callcode and staticcall variants). The exact mechanism needs further investigation but the original two-call sequence is required for correct gas accounting.
Summary
SortedDictionarywithDictionaryinBlockAccessListfor O(1) account lookups on the EVM hot path (was O(log n) with 20-byte span comparisons). Sorted enumeration preserved for encoding/validation viaAccountChangesSorted.TryGetDelegation+GetCachedCodeInfointo a single call inInstructionCall, eliminating a redundantGetCodeHashper CALL opcode.IsDeadAccountinEXTCODEHASHto avoid a secondGetCodeHashcall for the same address.Benchmark results (compute/perf-devnet-3, test_gas_op filter, dotTrace profiling enabled)
Types of changes
Testing