Feat/evm enginetest #11399
…ests Parse all fixture files upfront and flatten to individual test cases before distributing to workers, instead of parallelizing by file. Nethermind full suite: 2m48s with 8 workers (was 2m58s file-level).
- Pre-warm thread pool capped at CPU count (fixes starvation at w>=4)
- Parallel file parsing via Parallel.For
- UTF-8 byte span deserialization (File.ReadAllBytes + ReadOnlySpan<byte>)
- Pre-allocated results array instead of ConcurrentBag + sort
- Lightweight BlockchainTestsRunner constructor for the parallel path
- Compiled Regex for filter matching
- Pre-initialize KZG once upfront

Prague engine tests: 24.1s → 18.5s at w=8 (23% faster). Full suite: 2m48s → 1m45s at w=4 (37% faster).
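The case-level distribution can be sketched in shell (a hedged illustration with a hypothetical case list; the actual runner does this in C# with a worker pool, not xargs):

```shell
#!/bin/sh
# Hypothetical sketch: flatten fixtures to one line per test case, then let
# workers pull individual cases, so a file with many cases cannot serialize
# behind a single worker the way file-level parallelism does.
tmp=$(mktemp -d)
printf '%s\n' \
  'a.json::case1' 'a.json::case2' \
  'b.json::case1' \
  'c.json::case1' 'c.json::case2' 'c.json::case3' > "$tmp/cases.txt"
# Case-level parallelism: 2 workers, each grabbing one case at a time.
ran=$(xargs -P 2 -I{} sh -c 'echo "ran {}"' < "$tmp/cases.txt" | sort)
rm -rf "$tmp"
echo "$ran"
```

With file-level parallelism the three cases of c.json would all run on one worker; here they are free to land on either.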
…r field, hide timeInMs
…irect
- Add ForkName to BlockchainTest, use raw network string (e.g. CancunToPragueAtTime15k)
- Add lastBlockHash, lastPayloadStatus to JSON output for block/engine tests
- Remove stateRoot from block/engine results (only state tests)
- Report the validation error in `error` even on pass for negative tests: engine tests via PayloadStatus.ValidationError, block tests via the BlockRemoved event and ValidateSuggestedBlock, state tests via TransactionResult.ErrorDescription
- Suggest invalid blocks (blockHeader=null) to actually test rejection
- Enables EELS consume direct exception mapping for all test types
… hardcoded string
Resolve 3 conflicts:
- BlockchainTestBase.cs: take master's refactored helper methods and comments
- JsonToEthereumTest.cs: keep the PR's full fixture key as the name
- StateTestRunner.cs: merge the _suppressOutput check with master's explicit types

Also fix all var usages to explicit types and target-typed new (M5).
- H2: use pattern match `is not BlockchainTest test` to prevent NRE
- L1: use `is FailedToLoadTest` instead of `as ... is not null`
- M2: return exit code 1 on usage error
- M3: remove the dead ParseBlockchainTestFile method
- M4: replace int.Parse with int.TryParse + a descriptive FormatException
- M5: replace all var with explicit types (already done in the merge)
- L2: use JsonSerializer.Serialize for the error string in EIP-3155 JSON
- L3: compile the filter regex once in the constructor
- L4: add a guard for null _testsSource in RunTestsAsync
- Fix lastStatus/lastValidationError never being updated in RunNewPayloads
- Fix the traceStack/excludeStack semantic inversion for state tests
- Replace the hardcoded developer path in the nethtest script with $SCRIPT_DIR
- Use Regex.IsMatch instead of Regex.Match(...).Success in ParseStateTestFile
Adds a workflow_dispatch workflow that:
- Downloads EEST fixtures from ethereum/execution-spec-tests releases
- Builds and runs nethtest with configurable test type, workers, and filter
- Parses JSON results and posts a pass/fail summary to GITHUB_STEP_SUMMARY
- Includes a copy-pastable local repro command in the summary
- Uploads the results JSON as an artifact
- Add a push trigger on feat/evm-enginetest so the workflow runs on push
- Use env vars with fallback defaults so the push trigger works without inputs
- Include a copy-pastable local repro command in GITHUB_STEP_SUMMARY
The EEST archive uses 'blockchain_tests_engine' (with underscore), not 'blockchain_test_engine'. This caused the fixture path resolution to fail on CI.
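A minimal shell repro of the mismatch (the fixture directory here is created locally for illustration, not downloaded):

```shell
#!/bin/sh
# Hypothetical repro of the naming bug: the extracted EEST archive contains
# 'blockchain_tests_engine' (plural "tests"), so resolving the singular
# 'blockchain_test_engine' finds nothing.
fixtures=$(mktemp -d)
mkdir -p "$fixtures/blockchain_tests_engine"
found=none
for name in blockchain_test_engine blockchain_tests_engine; do
  [ -d "$fixtures/$name" ] && found="$name"
done
rm -rf "$fixtures"
echo "resolved: $found"
```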
launchSettings.json overrides CLI args with hardcoded test paths, causing the runner to ignore the actual --engineTest/--input args.
The dotnet run process was being killed silently on CI (likely OOM with 4 workers × 2844 tests). The | tee pipeline with pipefail caused the step to report "operation cancelled" with no diagnostic output. Now: redirect stdout to file, stderr to log, check exit code explicitly, and dump stderr if something goes wrong.
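A minimal sketch of the new pattern, assuming the shapes described above; the runner function is a stand-in for the real dotnet invocation:

```shell
#!/bin/sh
# Hedged sketch: stdout (JSON) goes to a file, stderr to a log, and the exit
# code is checked explicitly instead of trusting a `| tee` pipeline under
# pipefail, which hid the failure with "operation cancelled".
tmp=$(mktemp -d)
runner() { echo '{"pass":true}'; echo 'diagnostic noise' >&2; }
runner > "$tmp/results.json" 2> "$tmp/stderr.log"
status=$?
if [ "$status" -ne 0 ]; then
  echo "runner failed with exit code $status; dumping stderr:" >&2
  cat "$tmp/stderr.log" >&2
fi
json=$(cat "$tmp/results.json")
log=$(cat "$tmp/stderr.log")
rm -rf "$tmp"
echo "exit=$status json=$json"
```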
- Use bal@v5.6.0 with fixtures_bal.tar.gz (has BAL/Amsterdam tests)
- Let stderr flow to the console for live progress visibility
- Only redirect stdout (JSON) to a file
- Wrap RunTest in try/catch so individual test failures don't crash the whole runner; report them as failed results with the exception message
- Improve GITHUB_STEP_SUMMARY: group failures by error type with counts, show failed test names in a collapsible details section
When running with suppressOutput (CI/parallel mode), each test result is printed to stderr as "PASS test_name" or "FAIL test_name" so CI logs show live progress instead of silence until the JSON blob appears.
ParallelExecution causes unhandled exceptions from background threads in BlockAccessListManager that crash the process. Disable by default for the CLI test runner; NUnit tests that explicitly override to true are unaffected.
- PASS/FAIL/EXCEPTION prefixes now use ANSI colors (green/red) on stderr
- The CI workflow captures stderr to a file alongside console streaming
- If the process crashes before writing JSON, result parsing falls back to the PASS/FAIL lines in the stderr log for the summary
- The summary header shows "(process crashed)" when using the stderr fallback
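The coloring scheme can be sketched as follows (a hypothetical report() helper, not the actual C# code): color codes wrap only the PASS/FAIL prefix, and everything goes to stderr so stdout stays pure JSON.

```shell
#!/bin/sh
# Hedged sketch: green PASS / red FAIL prefixes written to stderr.
report() {  # $1 = PASS|FAIL, $2 = test name
  case "$1" in
    PASS) printf '\033[32mPASS\033[0m %s\n' "$2" >&2 ;;
    *)    printf '\033[31mFAIL\033[0m %s\n' "$2" >&2 ;;
  esac
}
# Capture stderr just to show the line; normally it streams to the console.
line=$(report FAIL test_invalid_block 2>&1)
echo "$line"
```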
- Process substitution (2> >(tee ...)) causes bash to die on SIGTERM, skipping all subsequent steps, including if: always() ones
- Use continue-on-error: true on the Run nethtest step so crashes don't block
- Add a separate "Show stderr output" step to dump the log
- Strip ANSI codes in the stderr fallback parsing
Separate steps can't survive runner SIGTERM (exit 143 from OOM). Now write_summary runs as a trap EXIT handler in the same shell, so the summary is written even if the dotnet process is killed. Also reduce workers to 4 (8 causes OOM on ubuntu-latest).
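The trap EXIT pattern can be sketched like this (a hedged illustration; the failing sh -c stands in for the dotnet step, and a real SIGKILL would still bypass any handler):

```shell
#!/bin/sh
# Hedged sketch: the summary writer is registered in the same shell as the
# run step, so it still fires when a later command fails (and, in the real
# job, on graceful termination of the shell).
summary=$(
  (
    rc=0
    trap 'echo "summary written after exit code $rc"' EXIT
    sh -c 'exit 143' || rc=$?   # stand-in for the dotnet step dying
  )
)
echo "$summary"
```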
- Use mkfifo + tee to stream stderr to both the console and a log file (process substitution caused bash-level crashes on SIGTERM)
- Reduce default workers from 4 to 2 to avoid OOM on ubuntu-latest (4 workers hit SIGTERM/exit 143 after ~55s)
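The mkfifo + tee plumbing can be sketched as follows (a hedged illustration; the brace group stands in for the test runner writing live progress to stderr):

```shell
#!/bin/sh
# Hedged sketch: stderr streams through a named pipe into tee, which copies
# it to both the console and a log file, avoiding the process-substitution
# syntax that misbehaved under SIGTERM.
tmp=$(mktemp -d)
mkfifo "$tmp/err.pipe"
tee "$tmp/stderr.log" < "$tmp/err.pipe" >&2 &
teepid=$!
# Stand-in for the test runner's live progress output:
{ echo 'PASS test_a'; echo 'FAIL test_b'; } > "$tmp/err.pipe"
wait "$teepid"
logged=$(cat "$tmp/stderr.log")
rm -rf "$tmp"
echo "$logged"
```

Closing the write end gives tee EOF, so wait returns once the log is complete.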
The crash (exit 143/SIGTERM) was caused by unhandled exceptions from BlockAccessListManager background threads when ParallelExecution=true, not OOM. Override ParallelExecutionOverride to false in BlockchainTestsRunner only — NUnit tests are unaffected. Restore workers to 4 since memory was not the issue.
Block validator rejections and exception stack traces from internal components were interleaved with PASS/FAIL lines on stderr, making output unreadable — especially with multiple workers. Add ComponentLogManagerOverride to BlockchainTestBase. The CLI runner sets it to LogLevel.Error when in suppressOutput mode, so only real errors appear. PASS/FAIL/EXCEPTION lines with their error messages are still printed. NUnit tests are unaffected (override is null).
- Strip ANSI codes and extract PASS/FAIL/EXCEPTION lines more robustly
- Group errors by the text after the " — " separator
- Show ALL failed tests in the details section (no head -100 limit)
- Include the error message in the JSON-based failure list too
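The stripping and grouping can be sketched as follows (a hedged illustration; the sample log lines are hypothetical, and the real workflow does this in its summary step):

```shell
#!/bin/sh
# Hedged sketch: strip ANSI escapes with sed, keep FAIL lines, and group by
# the error text after the " — " separator.
esc=$(printf '\033')
log=$(printf '%s[31mFAIL%s[0m test_a — InvalidBlock\n%s[32mPASS%s[0m test_b\n%s[31mFAIL%s[0m test_c — InvalidBlock\n' \
  "$esc" "$esc" "$esc" "$esc" "$esc" "$esc")
clean=$(printf '%s\n' "$log" | sed "s/$esc\[[0-9;]*m//g")
groups=$(printf '%s\n' "$clean" | grep '^FAIL' | sed 's/.* — //' | sort | uniq -c | sed 's/^ *//')
echo "$groups"
```

Grepping for a literal '^FAIL' only works after the escape codes are stripped, which is why the sed pass comes first.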
Claude finished @kamilchodola's task in 4m 57s — View job

PR Review — feat/evm-enginetest

This PR adds EEST …

Findings

High
1. Workflow push trigger is feature-branch-specific (line 8)
2. Verify GitHub Actions version tags (line 67)

Medium
3. …
4. Two distinct EEST exceptions mapped to the same regex (BlockchainTestBase.cs ~line 493)
5. …
6. Block tests without --jsonout silently produce no stdout output

Low
7. LINQ in non-hot path (coding style preference)
8. …
9. …
10. No timeout on …

What's working well
Reviewed diff context:

    cancel-in-progress: true

    on:
      push:
High — Workflow trigger is branch-specific and will stop running after merge
The push trigger is hard-coded to feat/evm-enginetest. After this PR merges to master, the workflow will never run automatically. Decide on the post-merge trigger strategy (e.g. push: branches: [master], a pull_request trigger, or workflow_dispatch-only) before merging.
Suggested change:

    push:
      branches: [master]
or remove the push trigger entirely if the workflow is intentionally manual-only.
Reviewed diff context:

    timeout-minutes: 60
    steps:
      - name: Check out repository
        uses: actions/checkout@v6
High — Verify GitHub Actions version numbers exist
actions/checkout@v6 and actions/upload-artifact@v7 (line 264) are pinned to versions that may not exist. As of my knowledge cutoff the latest stable tags were @v4 for both. The same question applies to actions/setup-dotnet@v5 (line 72) and actions/cache@v5 (line 75). If these tags don't exist, the workflow will fail immediately on every run with a 404.
Run gh release list -R actions/checkout | head -5 etc. to verify, or pin to known-stable versions.
Reviewed diff context:

    private static readonly Dictionary<int, int> NewPayloadParamCounts = Enumerable
        .Range(1, EngineApiVersions.NewPayload.Latest)
        .ToDictionary(v => v, v => (typeof(IEngineRpcModule).GetMethod($"engine_newPayloadV{v}")
            ?? throw new NotSupportedException($"engine_newPayloadV{v} not found on IEngineRpcModule")).GetParameters().Length);
Medium — Reflection-based static initializer can produce opaque TypeInitializationException
If engine_newPayloadV{v} is not found on IEngineRpcModule (e.g. after an API refactor), the throw new NotSupportedException(...) fires during the static initializer, which wraps it in a TypeInitializationException. The real cause is buried in the inner exception and does not surface as a clear test failure — it crashes any test that touches this class with a confusing error.
Consider making this lazy (built on first use) or at least wrapping the static init in a try-catch that rethrows with context:
    private static readonly Dictionary<int, int> NewPayloadParamCounts = BuildNewPayloadParamCounts();

    private static Dictionary<int, int> BuildNewPayloadParamCounts()
    {
        var result = new Dictionary<int, int>();
        for (int v = 1; v <= EngineApiVersions.NewPayload.Latest; v++)
        {
            MethodInfo? method = typeof(IEngineRpcModule).GetMethod($"engine_newPayloadV{v}");
            if (method is null)
                throw new InvalidOperationException($"engine_newPayloadV{v} not found on IEngineRpcModule. Update EngineApiVersions.NewPayload.Latest.");
            result[v] = method.GetParameters().Length;
        }
        return result;
    }
Reviewed diff context:

    #!/bin/bash
    SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"
    exec dotnet run --no-build -c Release --project "$SCRIPT_DIR/src/Nethermind/Nethermind.Test.Runner/Nethermind.Test.Runner.csproj" -- "$@"
Medium — Script is missing --no-launch-profile that the workflow uses
The workflow runs dotnet run --no-launch-profile -c Release ... but this script uses dotnet run --no-build -c Release ... without --no-launch-profile. Without that flag, dotnet run picks up the project's launchSettings.json if present, which can set environment variables (e.g. ASPNETCORE_ENVIRONMENT, ports) that differ from CI. Add --no-launch-profile for consistent behavior:
Suggested change:

    exec dotnet run --no-build --no-launch-profile -c Release --project "$SCRIPT_DIR/src/Nethermind/Nethermind.Test.Runner/Nethermind.Test.Runner.csproj" -- "$@"
Reviewed diff context:

        !parseResult.GetValue(Options.ExcludeMemory),
        parseResult.GetValue(Options.ExcludeStack)));
    bool forceJson = isEngineTest || jsonOutput;
    List<EthereumTestResult> results = await RunBlockTestFiles(files, filter, chainId, trace, traceMemory, excludeStack, forceJson, workers);
Medium — Block tests without --jsonout silently produce no stdout output
forceJson = isEngineTest || jsonOutput. When isBlockTest && !jsonOutput, forceJson = false, so the results computed by RunBlockTestFiles are discarded — nothing is written to stdout (the JSON serializer at line 145 is not called, and suppressOutput: true is always passed so per-test console output is suppressed too). A user running nethtest --blockTest --input ./fixtures with no --jsonout flag gets zero output beyond the per-test PASS/FAIL on stderr, with no summary.
Consider either always serializing block test results or explicitly documenting that --jsonout is required for block tests. The workflow always passes --jsonout, but local users will be confused.
Reviewed diff context:

    if (Directory.Exists(path))
        return Directory.GetFiles(path, "*.json", SearchOption.AllDirectories)
            .Where(f => !f.Contains("/.meta/") && !f.Contains("\\.meta\\"))
Low — LINQ used in non-hot path (coding style preference)
Both Where and OrderBy are LINQ. Per the repo coding style, for/foreach is preferred over LINQ. Since this is called once at startup (not a hot path), the impact is negligible, but worth aligning with the project convention.
Suggested change:

    if (Directory.Exists(path))
    {
        List<string> result = [];
        foreach (string f in Directory.GetFiles(path, "*.json", SearchOption.AllDirectories))
        {
            if (!f.Contains("/.meta/") && !f.Contains("\\.meta\\"))
                result.Add(f);
        }
        result.Sort(StringComparer.Ordinal);
        return result;
    }