Skip to content

Feat/evm enginetest#11399

Draft
kamilchodola wants to merge 35 commits intomasterfrom
feat/evm-enginetest
Draft

Feat/evm enginetest#11399
kamilchodola wants to merge 35 commits intomasterfrom
feat/evm-enginetest

Conversation

@kamilchodola
Copy link
Copy Markdown
Contributor

Fixes Closes Resolves #

Please choose one of the keywords above to refer to the issue this PR solves followed by the issue number (e.g. Fixes #000). If no issue number, remove the line. Also, remove everything marked optional that is not applicable. Remove this note after reading.

Changes

  • List the changes

Types of changes

What types of changes does your code introduce?

  • Bugfix (a non-breaking change that fixes an issue)
  • New feature (a non-breaking change that adds functionality)
  • Breaking change (a change that causes existing functionality not to work as expected)
  • Optimization
  • Refactoring
  • Documentation update
  • Build-related changes
  • Other: Description

Testing

Requires testing

  • Yes
  • No

If yes, did you write tests?

  • Yes
  • No

Notes on testing

Optional. Remove if not applicable.

Documentation

Requires documentation update

  • Yes
  • No

If yes, link the PR to the docs update or the issue with the details labeled docs. Remove if not applicable.

Requires explanation in Release Notes

  • Yes
  • No

If yes, fill in the details here. Remove if not applicable.

Remarks

Optional. Remove if not applicable.

spencer-tb and others added 30 commits April 4, 2026 12:59
…ests

Parse all fixture files upfront and flatten to individual test cases
before distributing to workers, instead of parallelizing by file.

Nethermind full suite: 2m48s with 8 workers (was 2m58s file-level).
- Pre-warm thread pool capped at CPU count (fixes starvation at w>=4)
- Parallel file parsing via Parallel.For
- UTF-8 byte span deserialization (File.ReadAllBytes + ReadOnlySpan<byte>)
- Pre-allocated results array instead of ConcurrentBag + sort
- Lightweight BlockchainTestsRunner constructor for parallel path
- Compiled Regex for filter matching
- Pre-initialize KZG once upfront

Prague engine tests: 24.1s → 18.5s at w=8 (23% faster).
Full suite: 2m48s → 1m45s at w=4 (37% faster).
…irect

- Add ForkName to BlockchainTest, use raw network string (e.g. CancunToPragueAtTime15k)
- Add lastBlockHash, lastPayloadStatus to JSON output for block/engine tests
- Remove stateRoot from block/engine results (only state tests)
- Report validation error in `error` even on pass for negative tests:
  engine tests via PayloadStatus.ValidationError, block tests via
  BlockRemoved event and ValidateSuggestedBlock, state tests via
  TransactionResult.ErrorDescription
- Suggest invalid blocks (blockHeader=null) to actually test rejection
- Enables EELS consume direct exception mapping for all test types
Resolve 3 conflicts:
- BlockchainTestBase.cs: take master's refactored helper methods and comments
- JsonToEthereumTest.cs: keep PR's full fixture key as name
- StateTestRunner.cs: merge _suppressOutput check with master's explicit types

Also fix all var usages to explicit types and target-typed new (M5).
- H2: use pattern match `is not BlockchainTest test` to prevent NRE
- L1: use `is FailedToLoadTest` instead of `as ... is not null`
- M2: return exit code 1 on usage error
- M3: remove dead ParseBlockchainTestFile method
- M4: replace int.Parse with int.TryParse + descriptive FormatException
- M5: replace all var with explicit types (already done in merge)
- L2: use JsonSerializer.Serialize for error string in EIP-3155 JSON
- L3: compile filter regex once in constructor
- L4: add guard for null _testsSource in RunTestsAsync
- Fix lastStatus/lastValidationError never updated in RunNewPayloads
- Fix traceStack/excludeStack semantic inversion for state tests
- Replace hardcoded developer path in nethtest script with $SCRIPT_DIR
- Use Regex.IsMatch instead of Regex.Match(...).Success in ParseStateTestFile
Adds a workflow_dispatch workflow that:
- Downloads EEST fixtures from ethereum/execution-spec-tests releases
- Builds and runs nethtest with configurable test type, workers, filter
- Parses JSON results and posts pass/fail summary to GITHUB_STEP_SUMMARY
- Includes copy-pastable local repro command in the summary
- Uploads results JSON as artifact
- Add push trigger on feat/evm-enginetest so workflow runs on push
- Use env vars with fallback defaults so push trigger works without inputs
- Include copy-pastable local repro command in GITHUB_STEP_SUMMARY
The EEST archive uses 'blockchain_tests_engine' (with underscore),
not 'blockchain_test_engine'. This caused the fixture path resolution
to fail on CI.
launchSettings.json overrides CLI args with hardcoded test paths,
causing the runner to ignore the actual --engineTest/--input args.
The dotnet run process was being killed silently on CI (likely OOM with
4 workers × 2844 tests). The | tee pipeline with pipefail caused the
step to report "operation cancelled" with no diagnostic output.

Now: redirect stdout to file, stderr to log, check exit code explicitly,
and dump stderr if something goes wrong.
- Use bal@v5.6.0 with fixtures_bal.tar.gz (has BAL/Amsterdam tests)
- Let stderr flow to console for live progress visibility
- Only redirect stdout (JSON) to file
- Wrap RunTest in try/catch so individual test failures don't crash the
  whole runner — report them as failed results with the exception message
- Improve GITHUB_STEP_SUMMARY: group failures by error type with counts,
  show failed test names in collapsible details section
When running with suppressOutput (CI/parallel mode), each test result
is printed to stderr as "PASS test_name" or "FAIL test_name" so CI logs
show live progress instead of silence until the JSON blob appears.
ParallelExecution causes unhandled exceptions from background threads
in BlockAccessListManager that crash the process. Disable by default
for the CLI test runner; NUnit tests that explicitly override to true
are unaffected.
- PASS/FAIL/EXCEPTION prefixes now use ANSI colors (green/red) on stderr
- CI workflow captures stderr to file alongside console streaming
- If process crashes before writing JSON, Parse results falls back to
  parsing PASS/FAIL lines from stderr log for the summary
- Summary header shows "(process crashed)" when using stderr fallback
- Process substitution 2> >(tee ...) causes bash to die on SIGTERM,
  skipping all subsequent steps including if: always() ones
- Use continue-on-error: true on Run nethtest so crashes don't block
- Add separate "Show stderr output" step to dump the log
- Strip ANSI codes in stderr fallback parsing
Separate steps can't survive runner SIGTERM (exit 143 from OOM).
Now write_summary runs as a trap EXIT handler in the same shell,
so the summary is written even if the dotnet process is killed.
Also reduce workers to 4 (8 causes OOM on ubuntu-latest).
- Use mkfifo + tee to stream stderr to both console and log file
  (process substitution caused bash-level crashes on SIGTERM)
- Reduce default workers from 4 to 2 to avoid OOM on ubuntu-latest
  (4 workers hit SIGTERM/exit 143 after ~55s)
The crash (exit 143/SIGTERM) was caused by unhandled exceptions from
BlockAccessListManager background threads when ParallelExecution=true,
not OOM. Override ParallelExecutionOverride to false in
BlockchainTestsRunner only — NUnit tests are unaffected.

Restore workers to 4 since memory was not the issue.
Block validator rejections and exception stack traces from internal
components were interleaved with PASS/FAIL lines on stderr, making
output unreadable — especially with multiple workers.

Add ComponentLogManagerOverride to BlockchainTestBase. The CLI runner
sets it to LogLevel.Error when in suppressOutput mode, so only real
errors appear. PASS/FAIL/EXCEPTION lines with their error messages
are still printed. NUnit tests are unaffected (override is null).
- Strip ANSI codes and extract PASS/FAIL/EXCEPTION lines more robustly
- Group errors by text after " — " separator
- Show ALL failed tests in details section (no head -100 limit)
- Include error message in JSON-based failure list too
@claude
Copy link
Copy Markdown
Contributor

claude Bot commented Apr 29, 2026

Claude finished @kamilchodola's task in 4m 57s —— View job


PR Review — feat/evm-enginetest

This PR adds EEST blockchain_tests_engine fixture support: a new GitHub Actions workflow, a nethtest launcher script, and C# infrastructure to route engine API calls through IJsonRpcService for realistic deserialization. The approach is sound and the engine-test plumbing (RunNewPayloads, AssertPayloadStatus, validation-error mapping) is well-structured.


Findings

High

1. Workflow push trigger is feature-branch-specific (line 8)
The push trigger fires only on feat/evm-enginetest. After merging to master, the workflow will never run automatically unless updated. Decide before merge whether it should run on master pushes, PRs, or be manual-only.

2. Verify GitHub Actions version tags (line 67)
actions/checkout@v6, actions/upload-artifact@v7 (line 264), actions/setup-dotnet@v5, and actions/cache@v5 are all pinned to versions that may not exist. As of my knowledge cutoff the latest stable versions were @v4 for all of these. If the tags don't resolve, every run will fail immediately with a 404. Please verify each action's latest available tag.

Medium

3. NewPayloadParamCounts static initializer can produce opaque TypeInitializationException (BlockchainTestBase.cs line 360)
The reflection-based dictionary is built at class-load time. A NotSupportedException thrown during static initialization wraps the real cause in a TypeInitializationException, making future API mismatches hard to diagnose. Use a named factory method or defer building until first use.

4. Two distinct EEST exceptions mapped to the same regex (BlockchainTestBase.cs ~line 493)
BlockException.INVALID_DEPOSIT_EVENT_LAYOUT and BlockException.SYSTEM_CONTRACT_CALL_FAILED both match "InvalidBlockLevelAccessList: Suggested block-level access list missing account changes". AssertValidationError will accept either expected exception for a single Nethermind error string, masking regressions where the wrong underlying failure fires. If Nethermind cannot distinguish these cases, add an explicit comment. Fix this →

5. nethtest script missing --no-launch-profile (nethtest line 3)
The workflow always adds --no-launch-profile but the convenience script doesn't, meaning local runs can behave differently from CI when launchSettings.json is present.

6. Block tests without --jsonout produce no stdout output (Program.cs line 143)
When isBlockTest && !jsonOutput, results are computed but neither written to stdout (JSON is gated on forceJson) nor to the console (suppressOutput: true is always passed). Users running nethtest --blockTest ... locally without --jsonout see per-test PASS/FAIL on stderr but no summary. Consider always serializing results or documenting the requirement.

Low

7. LINQ in CollectFiles (Program.cs line 169)
Where + OrderBy in CollectFiles. Repo style prefers for/foreach. Non-hot path so no functional impact.

8. IsEip7843Enabled changed to read-only delegation (OverridableReleaseSpec.cs line 128)
Changed from a settable initialized property to => spec.IsEip7843Enabled. Any existing test code that initializes this override will silently stop having effect (or fail to compile). If intentional, a comment would help.

9. --workers $WORKERS unquoted (workflow line 243)
Safe for numeric input but should be quoted as "$WORKERS" for robustness.

10. No timeout on curl download (workflow line 98)
A slow GitHub release server will hang the step indefinitely. Consider --max-time 300 or similar.


What's working well

  • Engine test routing through IJsonRpcService is the right architectural choice — it exercises the full deserialization path, not a synthetic shortcut.
  • ValidationErrorSubstringMappings + ValidationErrorRegexMappings approach is clean and extensible.
  • The write_summary trap-on-EXIT pattern in the workflow is resilient to test runner crashes.
  • EthereumTestResult.LastPayloadStatus and LastBlockHash additions are clean structured-output improvements.
  • The genesisUsesTargetFork / EIP-7928 logic is well-commented.

cancel-in-progress: true

on:
push:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

High — Workflow trigger is branch-specific and will stop running after merge

The push trigger is hard-coded to feat/evm-enginetest. After this PR merges to master, the workflow will never run automatically. Decide on the post-merge trigger strategy (e.g. push: branches: [master], a pull_request trigger, or workflow_dispatch-only) before merging.

Suggested change
push:
push:
branches: [master]

or remove the push trigger entirely if the workflow is intentionally manual-only.

timeout-minutes: 60
steps:
- name: Check out repository
uses: actions/checkout@v6
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

High — Verify GitHub Actions version numbers exist

actions/checkout@v6 and actions/upload-artifact@v7 (line 264) are pinned to versions that may not exist. As of my knowledge cutoff the latest stable tags were @v4 for both. The same question applies to actions/setup-dotnet@v5 (line 72) and actions/cache@v5 (line 75). If these tags don't exist, the workflow will fail immediately on every run with a 404.

Run gh release list -R actions/checkout | head -5 etc. to verify, or pin to known-stable versions.

private static readonly Dictionary<int, int> NewPayloadParamCounts = Enumerable
.Range(1, EngineApiVersions.NewPayload.Latest)
.ToDictionary(v => v, v => (typeof(IEngineRpcModule).GetMethod($"engine_newPayloadV{v}")
?? throw new NotSupportedException($"engine_newPayloadV{v} not found on IEngineRpcModule")).GetParameters().Length);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Medium — Reflection-based static initializer can produce opaque TypeInitializationException

If engine_newPayloadV{v} is not found on IEngineRpcModule (e.g. after an API refactor), the throw new NotSupportedException(...) fires during the static initializer, which wraps it in a TypeInitializationException. The real cause is buried in the inner exception and does not surface as a clear test failure — it crashes any test that touches this class with a confusing error.

Consider making this lazy (built on first use) or at least wrapping the static init in a try-catch that rethrows with context:

private static readonly Dictionary<int, int> NewPayloadParamCounts = BuildNewPayloadParamCounts();

private static Dictionary<int, int> BuildNewPayloadParamCounts()
{
    var result = new Dictionary<int, int>();
    for (int v = 1; v <= EngineApiVersions.NewPayload.Latest; v++)
    {
        MethodInfo? method = typeof(IEngineRpcModule).GetMethod($"engine_newPayloadV{v}");
        if (method is null)
            throw new InvalidOperationException($"engine_newPayloadV{v} not found on IEngineRpcModule. Update EngineApiVersions.NewPayload.Latest.");
        result[v] = method.GetParameters().Length;
    }
    return result;
}

Comment thread nethtest
@@ -0,0 +1,3 @@
#!/bin/bash
SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"
exec dotnet run --no-build -c Release --project "$SCRIPT_DIR/src/Nethermind/Nethermind.Test.Runner/Nethermind.Test.Runner.csproj" -- "$@"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Medium — Script is missing --no-launch-profile that the workflow uses

The workflow runs dotnet run --no-launch-profile -c Release ... but this script uses dotnet run --no-build -c Release ... without --no-launch-profile. Without that flag, dotnet run picks up the project's launchSettings.json if present, which can set environment variables (e.g. ASPNETCORE_ENVIRONMENT, ports) that differ from CI. Add --no-launch-profile for consistent behavior:

Suggested change
exec dotnet run --no-build -c Release --project "$SCRIPT_DIR/src/Nethermind/Nethermind.Test.Runner/Nethermind.Test.Runner.csproj" -- "$@"
exec dotnet run --no-build --no-launch-profile -c Release --project "$SCRIPT_DIR/src/Nethermind/Nethermind.Test.Runner/Nethermind.Test.Runner.csproj" -- "$@"

!parseResult.GetValue(Options.ExcludeMemory),
parseResult.GetValue(Options.ExcludeStack)));
bool forceJson = isEngineTest || jsonOutput;
List<EthereumTestResult> results = await RunBlockTestFiles(files, filter, chainId, trace, traceMemory, excludeStack, forceJson, workers);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Medium — Block tests without --jsonout silently produce no stdout output

forceJson = isEngineTest || jsonOutput. When isBlockTest && !jsonOutput, forceJson = false, so the results computed by RunBlockTestFiles are discarded — nothing is written to stdout (the JSON serializer at line 145 is not called, and suppressOutput: true is always passed so per-test console output is suppressed too). A user running nethtest --blockTest --input ./fixtures with no --jsonout flag gets zero output beyond the per-test PASS/FAIL on stderr, with no summary.

Consider either always serializing block test results or explicitly documenting that --jsonout is required for block tests. The workflow always passes --jsonout, but local users will be confused.


if (Directory.Exists(path))
return Directory.GetFiles(path, "*.json", SearchOption.AllDirectories)
.Where(f => !f.Contains("/.meta/") && !f.Contains("\\.meta\\"))
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Low — LINQ used in non-hot path (coding style preference)

Both Where and OrderBy are LINQ. Per the repo coding style, for/foreach is preferred over LINQ. Since this is called once at startup (not a hot path), the impact is negligible, but worth aligning with the project convention.

Suggested change
.Where(f => !f.Contains("/.meta/") && !f.Contains("\\.meta\\"))
if (Directory.Exists(path))
{
List<string> result = [];
foreach (string f in Directory.GetFiles(path, "*.json", SearchOption.AllDirectories))
{
if (!f.Contains("/.meta/") && !f.Contains("\\.meta\\"))
result.Add(f);
}
result.Sort(StringComparer.Ordinal);
return result;
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants