feat(ci): overhaul fixture releases #2888

Open
spencer-tb wants to merge 5 commits into ethereum:forks/amsterdam from spencer-tb:ci/overhaul-fixture-releases

Conversation

@spencer-tb
Contributor

@spencer-tb spencer-tb commented May 20, 2026

🗒️ Description

This PR describes and updates our test fixture release process for non-benchmark tests.

Key Changes

  • Release configs are cleaned up:
    • evm-impl.yaml is merged into evm.yaml and each entry now carries its impl, repo, ref, evm-bin and xdist.
    • A benchmark client alias is kept so feature.yaml workflows can reference a client by name, and build-evm-base resolves which builder to run via the impl field.
    • The ethereumjs t8n tool is removed from the release path.
  • We now have tests, devnet and benchmark features (in forks/amsterdam).
    • devnet is specifically chosen so we don't ever need to update feature.yaml for future devnet features. The devnet keyword is used as a substring match on release tagging, and <feat>-devnet keeps its friendly name.
  • release_fixture_feature.yaml is replaced with release_fixtures.yaml:
    • Triggered now via workflow_dispatch with explicit feature + version inputs. The git tag is tests-<feature>@<version> (the tests feature tags as tests@<version>) and the release title is <feature>@<version>.
    • Input validation fails fast: version must match vX.Y.Z, feature must be non-empty, *-devnet requires a branch, and an evm override must be a key in evm.yaml.
    • Optional evm / evm_repo / evm_ref inputs override the client impl and the t8n tool repo/ref for a one-off release.
    • Releases are tagged with the gh CLI on success only: no tag push and no delete/recreate process going forward.
    • Releases are only drafted in EELS (EEST mirror is removed, EEST is now archived).
    • LLLC and Solc dependencies are removed.
  • Multi-runner split: BPO forks are filled within the osaka range (no standalone bpo split).
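
The tag and title rules listed above for release_fixtures.yaml can be sketched in Python as follows. This is only an illustration of the naming convention, not the workflow's actual code; the helper names `release_tag` and `release_title` are hypothetical.

```python
def release_tag(feature: str, version: str) -> str:
    """Return the git tag for a fixture release (hypothetical helper)."""
    # The `tests` feature tags as tests@<version>; every other feature
    # gets the `tests-` prefix, giving tests-<feature>@<version>.
    if feature == "tests":
        return f"tests@{version}"
    return f"tests-{feature}@{version}"


def release_title(feature: str, version: str) -> str:
    """Return the release title, <feature>@<version> (hypothetical helper)."""
    return f"{feature}@{version}"


print(release_tag("tests", "v20.3.1"))        # tests@v20.3.1
print(release_tag("bal-devnet", "v7.0.0"))    # tests-bal-devnet@v7.0.0
print(release_title("bal-devnet", "v7.0.0"))  # bal-devnet@v7.0.0
```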

Tagging A Release

To tag releases we must now use the GitHub CLI. This has the benefit of only creating tags in EELS if the fixture building process is successful. No more tag deletion and recreation, only workflow triggers.

The following can be run locally or, optionally, triggered from the GitHub Actions web UI.

gh workflow run release_fixtures.yaml -f feature=tests -f version=v20.3.1
# devnet releases additionally require the branch to release from:
gh workflow run release_fixtures.yaml -f feature=bal-devnet -f version=v7.0.0 -f branch=bal-devnet-7
# optional: override the client / t8n repo+ref for a one-off release
gh workflow run release_fixtures.yaml -f feature=tests -f version=v20.3.1 \
  -f evm=geth -f evm_repo=ethereum/go-ethereum -f evm_ref=master

Downloading A Release

Fixtures can be downloaded in two separate ways: gh release download for the raw tarball, or consume cache if you want tag resolution (@latest), local caching, and --input integration with consume subcommands.

# via gh cli
gh release download tests@v20.3.1 --repo ethereum/execution-specs
gh release download tests-bal-devnet@v7.0.0 --repo ethereum/execution-specs

# latest tests release by publish time:
LATEST=$(gh release list --repo ethereum/execution-specs --limit 100 \
  --json tagName --jq '[.[] | select(.tagName | startswith("tests@v")) | .tagName][0]')
gh release download "$LATEST" --repo ethereum/execution-specs

# via consume cache
uv run consume cache --input=tests@v20.3.1
uv run consume cache --input=tests@latest
uv run consume cache --input=bal-devnet@v7.0.0
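
For anyone scripting this in Python rather than jq, the "latest tests release" lookup above could be sketched as below. It is a rough equivalent under the same assumption as the shell snippet, namely that `gh release list` returns releases newest first; the helper names `latest_tag` and `list_release_tags` are hypothetical.

```python
import json
import subprocess
from typing import Optional


def latest_tag(tag_names: list[str], prefix: str = "tests@v") -> Optional[str]:
    """Return the first tag starting with `prefix`, or None if there is none."""
    # `tag_names` is expected in the order gh returns it (newest first).
    return next((tag for tag in tag_names if tag.startswith(prefix)), None)


def list_release_tags(repo: str = "ethereum/execution-specs") -> list[str]:
    """List release tag names through the gh CLI (gh must be installed)."""
    result = subprocess.run(
        ["gh", "release", "list", "--repo", repo,
         "--limit", "100", "--json", "tagName"],
        check=True,
        capture_output=True,
        text=True,
    )
    return [entry["tagName"] for entry in json.loads(result.stdout)]


# Offline example with a hand-written tag list:
tags = ["tests-bal-devnet@v7.0.0", "tests@v20.3.1", "tests@v20.3.0"]
print(latest_tag(tags))  # tests@v20.3.1
```

Passing `prefix="tests-bal-devnet@v7."` selects a devnet release in the same way.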

Fixed Release Types

tests@vX.Y.Z

In the past (pre-Weld) we had the following release features: stable & develop, where stable was a subset of develop. To converge these two features we now define the tests feature. This is the counterpart to the benchmark feature.

The tests feature acts as our mainnet set of tests. For now that means fill --until BPO4: all tests for all forks up to the last mainnet fork. It will be released weekly whenever tests are changed or added, always from the latest development branch (currently forks/amsterdam). Clients will use this release on their main/master branches in CI, and eventually this release will contain all tests from ethereum/tests & ethereum/legacy-tests (allowing us to archive both of these repos). TL;DR: one tag type to verify you will not break mainnet. Additionally, this will be run in our Hive CI (under what is currently labelled generic).

Here we define a new semantic versioning type for our tests releases:

  • X is the fork number; BPO2 is at index 20.
    • This makes it clear what fork the release is up to date with.
  • Y is a consensus-breaking spec change targeting fork X. Should rarely occur.
  • Z is a non-breaking change (refactor), new or modified tests.
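
As a rough illustration of how a consumer might read these version components, the sketch below classifies the difference between two tests releases. It assumes plain vX.Y.Z strings; `VERSION_PATTERN`, `parse_version` and `describe_bump` are hypothetical names, not part of the release tooling.

```python
import re

# Hypothetical pattern for the vX.Y.Z scheme described above.
VERSION_PATTERN = re.compile(r"^v(\d+)\.(\d+)\.(\d+)$")


def parse_version(version: str) -> tuple[int, int, int]:
    """Split 'vX.Y.Z' into integer (X, Y, Z)."""
    match = VERSION_PATTERN.match(version)
    if match is None:
        raise ValueError(f"version '{version}' must match vX.Y.Z")
    x, y, z = (int(part) for part in match.groups())
    return x, y, z


def describe_bump(old: str, new: str) -> str:
    """Say what kind of change separates two releases."""
    old_v, new_v = parse_version(old), parse_version(new)
    if new_v <= old_v:
        raise ValueError("new version must be greater than old version")
    if new_v[0] != old_v[0]:
        return "new fork"
    if new_v[1] != old_v[1]:
        return "consensus-breaking spec change"
    return "non-breaking change or new/modified tests"


print(describe_bump("v20.0.0", "v20.0.1"))  # non-breaking change or new/modified tests
print(describe_bump("v20.0.1", "v20.1.0"))  # consensus-breaking spec change
```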

The first release of this type in EELS will be tests@v20.0.0, to catch up from the last fork. For the fork under development (Amsterdam) the first tests release will be tagged once all CFI'd EIPs are deemed successful in a devnet (purposely ambiguous here, things change in Ethereum). Typically this will resolve to the last devnet before the first testnet; this first release can be viewed as the testnet release. For Amsterdam we will tag tests@v24.0.0, likely after glamsterdam-devnet-6 is deemed successful. Spec changes can still occur, and that is why we have Y in the new fixture versioning scheme.

tests-<feat>-devnet@vX.Y.Z

This release type will follow on from the current devnet release process but more explicitly. Today <feat> is bal and soon it will be glamsterdam. <feat> can essentially be any keyword but typically is the fork name or the headliner feature.

The devnet feature is entirely for test releases during the fork development process for the upcoming fork. Here we still fill for all forks so clients can make sure they do not introduce any regressions: currently fill --until Amsterdam, all tests for all forks up to the development fork. Clients will use this release on their devnet branches in CI; they must pass all of these tests before being included in the devnet that the release is tagged for. This release will be run in Hive CI under the same naming scheme as we use currently.

For the bal devnets today we use the tag tests-bal@vX.Y.Z. This PR will change the process to tests-bal-devnet@vX.Y.Z. As bal-devnet-7 is the last of the bal devnets, we will start the new tagging scheme for glamsterdam-devnet-5. The devnet releases will additionally follow a new versioning scheme:

  • X is the devnet number, so for glamsterdam-devnet-5 this would be 5.
    • This makes it clear what devnet the release targets.
  • Y/Z follow the same semantics as the tests releases.

Following the latter, the first glamsterdam-devnet-5 release will be tagged as tests-glamsterdam-devnet@v5.0.0. All devnet releases must be tagged from a devnet branch in EELS, in this case glamsterdam-devnet-5. Here we are specifically choosing to diverge from the EELS / branch naming scheme to align every repo under the same devnet name.
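
The mapping from a devnet branch to its first release tag can be sketched as below. This assumes branches named `<feat>-devnet-<N>` with a feature keyword made of lowercase letters and digits only; `first_devnet_tag` is a hypothetical helper, not part of the workflow.

```python
import re


def first_devnet_tag(branch: str) -> str:
    """Map '<feat>-devnet-<N>' to 'tests-<feat>-devnet@v<N>.0.0'."""
    # Assumption: <feat> contains only lowercase letters and digits.
    match = re.fullmatch(r"(?P<feat>[a-z0-9]+)-devnet-(?P<number>\d+)", branch)
    if match is None:
        raise ValueError(f"'{branch}' is not of the form <feat>-devnet-<N>")
    return f"tests-{match['feat']}-devnet@v{int(match['number'])}.0.0"


print(first_devnet_tag("glamsterdam-devnet-5"))  # tests-glamsterdam-devnet@v5.0.0
```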

🔗 Related Issues or PRs

✅ Checklist

  • All: Ran fast static checks to avoid unnecessary CI fails, see also Code Standards and Enabling Pre-commit Checks:
    just static
  • All: PR title adheres to the repo standard - it will be used as the squash commit message and should start type(scope):.
  • All: Considered updating the online docs in the ./docs/ directory.
  • All: Set appropriate labels for the changes (only maintainers can apply labels).


@spencer-tb spencer-tb added C-feat Category: an improvement or new feature P-medium A-ci Area: Continuous Integration labels May 20, 2026
@spencer-tb spencer-tb force-pushed the ci/overhaul-fixture-releases branch from abb005f to 8bd6b77 Compare May 20, 2026 14:00
@codecov

codecov Bot commented May 22, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 90.50%. Comparing base (62b914c) to head (cfbdc9d).
⚠️ Report is 32 commits behind head on forks/amsterdam.

Additional details and impacted files
@@                 Coverage Diff                 @@
##           forks/amsterdam    #2888      +/-   ##
===================================================
+ Coverage            87.16%   90.50%   +3.34%     
===================================================
  Files                  586      535      -51     
  Lines                35791    32407    -3384     
  Branches              3364     3011     -353     
===================================================
- Hits                 31198    29331    -1867     
+ Misses                3943     2559    -1384     
+ Partials               650      517     -133     
Flag Coverage Δ
unittests 90.50% <ø> (+3.34%) ⬆️

@spencer-tb spencer-tb force-pushed the ci/overhaul-fixture-releases branch 2 times, most recently from 7eac7b3 to d558bc7 Compare May 22, 2026 12:33
@spencer-tb
Contributor Author

This PR needs to be merged to fully test out the new workflow, but I did some smaller smoke tests on my fork:

To keep the workflow runnable in minutes on a fork (no gigachungus runners), I scoped both features to a small Cancun test module and pointed build at ubuntu-latest. The split/combine/release wiring is identical to production; the only differences are the fill scope and the runners.

Smoke test releases can be found here: https://github.com/spencer-tb/execution-specs/releases

@spencer-tb spencer-tb force-pushed the ci/overhaul-fixture-releases branch from d558bc7 to b2ac847 Compare May 25, 2026 13:40
@spencer-tb spencer-tb marked this pull request as ready for review May 25, 2026 13:40
@chfast
Member

chfast commented May 25, 2026

  1. "consensus" name not very good. I skipped the section on the first read because I thought this is for CL. I don't want to be very picky, but some of my suggestions: just "tests", "tests-main", "tests-stable" (in a sense for stable spec).
  2. I don't really care about the version number so you can choose any versioning. However, you are adding an implicit matching rule: bal-devnet-7 → tests-bal-devnet@v7.x.y. This breaks semver (which I don't care about, as mentioned). Make sure you really want this.
  3. Having a weekly / bi-weekly cadence would be a big improvement for me. I often contribute new test cases for "tests-consensus" and want to integrate them into my CI as soon as possible. Do releases often. Skip if nothing new to release.

Comment thread .github/configs/feature.yaml Outdated
@marioevz
Member

  1. "consensus" name not very good. I skipped the section on the first read because I thought this is for CL. I don't want to be very picky, but some of my suggestions: just "tests", "tests-main", "tests-stable" (in a sense for stable spec).
  2. I don't really care about the version number so you can choose any versioning. However, you are adding an implicit matching rule: bal-devnet-7 → tests-bal-devnet@v7.x.y. This breaks semver (which I don't care about, as mentioned). Make sure you really want this.
  3. Having a weekly / bi-weekly cadence would be a big improvement for me. I often contribute new test cases for "tests-consensus" and want to integrate them into my CI as soon as possible. Do releases often. Skip if nothing new to release.

  1. I had not thought about this; I think it's important. Maybe we should roll back to "stable" or similar?
  2. I agree that breaks semver but also think that it should not be a dealbreaker. I see these releases as more "ephemeral" in the sense that these are probably not going to be in client's CI workflows for more than a month or two, so this versioning scheme is ok for now IMO.
  3. I think this could be the next step: Once we have a good release process, we can start looking for automation for a certain release cadence (run a workflow that lists the stable tests for the current commit, compares against the list from the previous release, if any addition, release and list the changes).

@taratorio
Contributor

This is great, thanks. Just one question about benchmark releases. Does it make sense for those to also have benchmark-consensus and benchmark-devnet variants? For example, recently ive been hitting a slight complication which ive had to work around - the benchmark fixtures dont have BALs in them so I couldnt use them locally for optimisation work on top of the devnet changes. I had to synthesise my own fixtures essentially. If we had benchmark-devnet variant for those I wouldve not had to do this workaround

@LouisTsai-Csie
Collaborator

I support @taratorio 's idea if this does not complicate the workflow too much

Member

@danceratopz danceratopz left a comment

Thanks so much for this effort! Looking forward to getting all releases in execution-specs!

I feel a little bit weird about hijacking the MAJOR version for fork/devnet version (fork version is MINOR in EELS releases, so another slight inconsistency). We do have to hope PandaOps never launches feat-devnet-alphaone or similar to avoid invalid versions 😆

On face value, it seems to fit well. And i think one of your major motivations is to avoid hard-coding releases in the hive-tests runner config repos...
https://github.com/ethpandaops/hive-tests/blob/52cf2fcfa6bb698c11344994d8e223e69dd3a969/.github/workflows/hive-devnet-7.yaml#L161-L162

But it might be worth thinking through a little more. If we have two devnet versions (bal-devnet-3,bal-devnet-7) running at the same time, does it still simplify artifact download? I guess we never have hive runner configs for two devnets simultaneously?

As-is we could already do:

consume cache --input=bal-devnet-7@latest

and if we aim to deprecate consume cache then both versions are equally convenient with gh release download, I think?

Versioning scheme in this PR:

#!/usr/bin/env bash

TAG=$(gh release list --repo ethereum/execution-specs --limit 100 --json tagName \
    --jq 'map(select(.tagName | startswith("tests-bal-devnet@v7."))) | .[0].tagName')
gh release download "$TAG" --repo ethereum/execution-specs --pattern '*.tar.gz'

Versioning scheme currently:

#!/usr/bin/env bash
TAG=$(gh release list --repo ethereum/execution-specs --limit 100 --json tagName \
    --jq 'map(select(.tagName | startswith("tests-bal-devnet-7@v"))) | .[0].tagName')
gh release download "$TAG" --repo ethereum/execution-specs --pattern '*.tar.gz'

If this does not greatly help downstream convenience I would opt for explicit hard-coded release names here (bal-devnet-7) and not add our own custom convention to versioning.

If we keep it, I wonder whether, at the risk of complicating the workflows, we should automate/hard-code the major version to avoid user error (see the docs corrections below) as suggested here. This would avoid duplicating this in the case of the devnet branch and put it close to the fill --until=<fork> config in the case of a fork branch. The comments on the invalid tag/version highlight that this is error prone.

Could you restructure the docs to keep all (or most of) the "Formats and Release Layout" section, but move it below the "Release Tracks" (or "Test Release Types" if we rename)?

I think removing blockchain_test_engine from the benchmark spec deserves its own PR as it gets a bit lost here.

Comment thread .github/configs/feature.yaml Outdated
Comment thread .github/configs/feature.yaml Outdated
Comment thread .github/configs/evm.yaml
Comment thread .github/configs/evm.yaml
Comment thread .github/configs/evm.yaml Outdated
Comment thread docs/running_tests/releases.md Outdated
Comment thread docs/running_tests/releases.md Outdated
Comment thread docs/running_tests/releases.md Outdated
Comment thread .github/workflows/release_fixtures.yaml
Comment thread .github/workflows/release_fixtures.yaml
@danceratopz
Member

This is great, thanks. Just one question about benchmark releases. Does it make sense for those to also have benchmark-consensus and benchmark-devnet variants?

Hey @taratorio, thanks for asking and the feedback! Yes, this def makes sense in general and we can add these as required.

For example, recently ive been hitting a slight complication which ive had to work around - the benchmark fixtures dont have BALs in them so I couldnt use them locally for optimisation work on top of the devnet changes. I had to synthesise my own fixtures essentially. If we had benchmark-devnet variant for those I wouldve not had to do this workaround

Were you filling these fixtures yourself? The latest release https://github.com/ethereum/execution-specs/releases/tag/tests-benchmark%40v0.0.9 is only filled for Osaka. There might have been a bit of flux here due to incompatible/incomplete t8n tools for EELS and geth (benchmark releases have been using geth), but as-is today 😆 on forks/amsterdam with ethereum/go-ethereum#35025 both regular and benchmark test fixtures have blockAccessList. They should be in the next benchmark release, which will target Amsterdam!

@taratorio
Contributor

This is great, thanks. Just one question about benchmark releases. Does it make sense for those to also have benchmark-consensus and benchmark-devnet variants?

Hey @taratorio, thanks for asking and the feedback! Yes, this def makes sense in general and we can add these as required.

For example, recently ive been hitting a slight complication which ive had to work around - the benchmark fixtures dont have BALs in them so I couldnt use them locally for optimisation work on top of the devnet changes. I had to synthesise my own fixtures essentially. If we had benchmark-devnet variant for those I wouldve not had to do this workaround

Were you filling these fixtures yourself? The latest release https://github.com/ethereum/execution-specs/releases/tag/tests-benchmark%40v0.0.9 is only filled for Osaka. There might have been a bit of flux here due to incompatible/incomplete t8n tools for EELS and geth (benchmark releases have been using geth), but as-is today 😆 on forks/amsterdam with ethereum/go-ethereum#35025 both regular and benchmark test fixtures have blockAccessList. They should be in the next benchmark release, which will target Amsterdam!

yes, I filled a few that I was interested in with erigon but not via t8n but my own quick and hacky way

@danceratopz
Member

danceratopz commented May 27, 2026

Just discussed with @spencer-tb and @LouisTsai-Csie, this is our suggestion going forward:

  1. Instead of changing what is now "mainnet" to "consensus" (in this PR), we just tag these releases as tests@vX.Y.Z. No stable or other labelling required, they're just the tests. We aim for an on-demand release schedule, which could be as frequent as weekly if tests are added/fixed. The forks covered/filled-for in these releases should be slightly ahead of clients' testnet/mainnet release schedules. I.e., we include Amsterdam in good time before the first testnet releases.
  2. The benchmark changes will move to another PR. Benchmark devnet releases:
    • A: For devnet 7, merge 8037 PR, bump Osaka->Amsterdam, pick/decide on an EVM :), then create the release from forks/amsterdam.
    • B: In general, for now, if necessary just add a new entry to feature.yaml (Move benchmark changes to another PR). I.e., if we need a benchmark-glamsterdam-devnet-5@v1.0.0, we can create a new entry for that.
  3. We simplify the release versioning scheme, which can be applied to any test fixture release, to:
    X: fix spec (spec change)
    Y: fix test
    Z: new test
  4. The next/new release versions start at:
    • tests@v1.0.0
    • benchmark@v1.0.0 maybe from the devnet-7 release on / maybe the next one TBD ??
    • next devnet release - <feat>-devnet-<N>@v1.0.0

@marioevz
Member

Just discussed with @spencer-tb and @LouisTsai-Csie, this is our suggestion going forward:

1. Instead of changing what is now "mainnet" to "consensus" (in this PR), we just tag these releases as `tests@vX.Y.Z`. No stable or other labelling required, they're just the tests. We aim for an on-demand release schedule, which could be as frequent as weekly if tests are added/fixed. The forks covered/filled-for in these releases should be slightly ahead of clients' testnet/mainnet release schedules. I.e., we include Amsterdam in good time before the first testnet releases.

2. The benchmark changes will move to another PR. Benchmark devnet releases:
   
   * A: For devnet 7, merge 8037 PR, bump Osaka->Amsterdam, pick/decide on an EVM :), then create the release from forks/amsterdam.
   * B: In general, for now, if necessary just add a new entry to feature.yaml (Move benchmark changes to another PR). I.e., if we need a benchmark-glamsterdam-devnet-5@v1.0.0, we can create a new entry for that.

3. We simplify the release versioning scheme, which can be applied to any test fixture release, to:
   X: fix spec (spec change)
   Y: fix test
   Z: new test

4. The next/new release versions start at:
   
   * `tests@v1.0.0`
   * `benchmark@v1.0.0` maybe from the devnet-7 release on / maybe the next one TBD ??
   * next devnet release - `<feat>-devnet-<N>@v1.0.0`

I mostly agree with this comment except for points 3 and 4, since I think the MAJOR should refer to the devnet/fork:

My suggestion is as follows:

For tests@vX.Y.Z:

  • X: Fork-based number
  • Y: Consensus-breaking spec change targeting fork X
  • Z: Non-breaking spec change (refactoring), new tests, modified tests

For <feat>-devnet@vX.Y.Z:

  • X: Devnet number
  • Y: Consensus-breaking spec change targeting devnet X
  • Z: Non-breaking spec change (refactoring), new tests, modified tests

I.e. if a client is targeting to join fork/devnet X, it should target MAJOR equal to X, must take the latest MINOR, and should ideally take the latest PATCH.

With fork/devnet as MAJOR, the version number alone tells a client whether a release is relevant to them. Under the alternative (MAJOR tracks any spec change), you'd have to combine the version and the -N suffix in the release name to figure out whether a spec change targets your devnet or a different one. Devnet compatibility shouldn't require parsing two fields IMO.

Concretely, the mindset just from looking at the version (ignoring the name) should be: I'm a client dev targeting devnet 7, and I was passing the tests contained in v7.0.0, but now v7.1.0 has been released, which means there was a spec change in the devnet my client was targeting, hence I should read the release notes to figure out if the spec change affects my client's ability to join the devnet.

On the topic of parallel maintenance of two different devnets, this scheme handles this naturally IMO. E.g. if we are maintaining devnets 3 & 7 for the same feature, releasing v3.0.1 after or alongside v7.0.0 is not a problem (think Python 2.x vs 3.x). It is a well-known Semver pattern, and Semver is good at this. We should simply make this rule clear and follow it so we are predictable.
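
The selection rule described above (MAJOR equal to the target fork/devnet, then the latest MINOR and PATCH) can be sketched as follows. This is a minimal illustration assuming plain vX.Y.Z version strings; `pick_release` and `split_version` are hypothetical names.

```python
import re
from typing import Optional


def split_version(version: str) -> tuple[int, int, int]:
    """Split 'vX.Y.Z' into integer (X, Y, Z)."""
    match = re.fullmatch(r"v(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"version '{version}' must match vX.Y.Z")
    x, y, z = (int(part) for part in match.groups())
    return x, y, z


def pick_release(versions: list[str], target_major: int) -> Optional[str]:
    """Return the newest version whose MAJOR equals the target devnet/fork."""
    candidates = [
        parsed
        for parsed in map(split_version, versions)
        if parsed[0] == target_major
    ]
    if not candidates:
        return None
    # Tuples compare element-wise, so max() picks latest MINOR, then PATCH.
    x, y, z = max(candidates)
    return f"v{x}.{y}.{z}"


# Devnets 3 and 7 maintained in parallel:
released = ["v3.0.0", "v7.0.0", "v3.0.1", "v7.0.3", "v7.1.0"]
print(pick_release(released, 7))  # v7.1.0
print(pick_release(released, 3))  # v3.0.1
```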

On consume cache, it's a solvable problem. We will need to update our tooling, but it's not a big issue.

On benchmarking, major and minor should mirror the feature they are targeting, while the patch moves freely at its own pace.

@danceratopz
Member

🚢 :shipit:

@spencer-tb spencer-tb force-pushed the ci/overhaul-fixture-releases branch from ac49382 to 93a501c Compare June 3, 2026 11:49
@spencer-tb spencer-tb force-pushed the ci/overhaul-fixture-releases branch 10 times, most recently from 9ffaf6a to bfafa90 Compare June 3, 2026 16:05
@spencer-tb spencer-tb force-pushed the ci/overhaul-fixture-releases branch from bfafa90 to 4b74a4a Compare June 4, 2026 11:13
@spencer-tb
Contributor Author

PR is now consolidated. Description updated. Benchmark release changes are moved to this PR: #2954.

@danceratopz danceratopz self-requested a review June 4, 2026 12:18
Member

@danceratopz danceratopz left a comment

Thanks for handling all the comments and for moving the benchmark changes to #2954!

I would like to go over the docs again before merge, but posting this beforehand as the larger blocker right now: Both the validation scripts and docs assume our branches have the form bal-devnet-7 (as the rest of the ecosystem uses), whereas we currently use devnets/bal/7 in EELS.

Comment on lines -15 to -65
## Fixture Output Directory Structure

Inside each format directory, fixtures are grouped by **target fork**.

The top-level subdirectory identifies the fork **under test**. Below it,
fixtures mirror the `./tests/` source layout: each directory corresponds
to the fork where the functionality was originally introduced. Because
tests declare `valid_from`, a single target fork directory contains
fixtures from every prior fork whose tests are still valid at that fork.

### Consensus fixture layout

```text
fixtures/
└── blockchain_tests/
    ├── for_prague/                      # filled targeting Prague
    │   ├── istanbul/                    # tests introduced in Istanbul
    │   │   └── eip1344_chainid/...
    │   ├── cancun/                      # tests introduced in Cancun
    │   │   └── eip4844_blobs/...
    │   └── prague/                      # tests introduced in Prague
    │       └── eip7702_set_code_tx/...
    └── for_osaka/                       # filled targeting Osaka
        ├── istanbul/
        │   └── eip1344_chainid/...
        ├── cancun/
        │   └── eip4844_blobs/...
        ├── prague/
        │   └── eip7702_set_code_tx/...
        └── osaka/                       # tests introduced in Osaka
            └── eip7692_eof_v1/...
```
| Blob Transaction Tests | - using the [eels/execute-blobs Simulator](./execute/hive.md#the-eelsexecute-blobs-simulator) | None; executed directly from Python source,</br>using a release tag |

Other format directories (`state_tests/`, `blockchain_tests_engine/`)
follow the same layout.
## Release Tracks

### Benchmark fixture layout
Fixtures are released on independent tracks. Each track has its own tag namespace, artifact, and cadence.

When filling with `--gas-benchmark-values`, benchmark tests additionally
include the gas limit in the subdirectory name (`for_{fork}_at_{gas}M`,
where `{gas}` is in millions, zero-padded to four digits), with one
subdirectory per gas value:
| Track | Tag | Artifact | Scope | Built from |
| --------- | -------------------- | ------------------------------- | ------------------------------------------------------------------ | ----------------- |
| Consensus | `consensus@vX.Y.Z` | `fixtures_consensus.tar.gz` | All forks, all tests (including legacy tests) | latest `forks/*` branch |
| Devnet | `<feat>-devnet@vX.Y.Z` | `fixtures_<feat>-devnet.tar.gz` | All forks, all tests, for an upcoming-fork feature under active devnet testing | the devnet branch |
| Benchmark | `benchmark@vX.Y.Z` | `fixtures_benchmark.tar.gz` | EVM benchmarking tests | latest `forks/*` branch |

```text
fixtures/
└── blockchain_tests/
    ├── for_osaka_at_0001M/      # 1M gas benchmark
    │   └── benchmark/compute/...
    └── for_osaka_at_0002M/      # 2M gas benchmark
        └── benchmark/compute/...
```
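
The directory naming described in the removed docs lines above (`for_{fork}`, plus `_at_{gas}M` with the gas value in millions zero-padded to four digits for benchmarks) can be sketched as follows. `target_dir` is a hypothetical helper for illustration, not the fill tooling's actual function.

```python
from typing import Optional


def target_dir(fork: str, gas_millions: Optional[int] = None) -> str:
    """Build the top-level fixture subdirectory name for a target fork."""
    name = f"for_{fork.lower()}"
    if gas_millions is not None:
        # Benchmark fixtures append the gas limit in millions,
        # zero-padded to four digits.
        name += f"_at_{gas_millions:04d}M"
    return name


print(target_dir("Prague"))    # for_prague
print(target_dir("Osaka", 1))  # for_osaka_at_0001M
```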
Member

Still missing!

branch:

```bash
gh workflow run release_fixtures.yaml -f feature=bal-devnet -f version=v7.0.0 -f branch=bal-devnet-7
```
Member

This (and the changes in .github/scripts/) is forward looking to an EELS branch scheme rename?

devnets/bal/7 -> bal-devnet-7.

I just asked in the chat ;-)

Comment thread .github/workflows/release_fixtures.yaml Outdated
Comment thread .github/workflows/release_fixtures.yaml Outdated
Comment thread packages/testing/src/execution_testing/specs/benchmark.py
Comment on lines +80 to +130
def validate_inputs(feature: str, version: str, branch: str) -> None:
    """
    Validate the release dispatch inputs before building a matrix.

    Centralize the feature/version checks here so they are unit-testable
    rather than living as inline bash in the release workflow.

    For `<feat>-devnet` releases the major version (`X` of `vX.Y.Z`)
    must equal the devnet number encoded in the release branch, so a
    `bal-devnet` release from `bal-devnet-7` must be tagged `v7.*.*`.
    """
    if not feature:
        fail("feature name is empty")
    if not VERSION_RE.match(version):
        fail(f"version '{version}' must match vX.Y.Z (e.g. v20.0.0)")

    # A bare `devnet` has no friendly `<feat>-` prefix to tag with.
    if feature in ("devnet", "-devnet"):
        fail("devnet releases require a <feat>- prefix, e.g. bal-devnet")

    # `<feat>-devnet-<n>`: the devnet index belongs in the version (X of
    # vX.Y.Z), not in the feature name.
    if "-devnet-" in feature:
        suggested_feature, _, suggested_index = feature.rpartition("-")
        fail(
            "devnet index must go in 'version', not the feature name; "
            f"did you mean feature={suggested_feature} "
            f"version=v{suggested_index}.0.0?"
        )

    if feature.endswith("-devnet"):
        if not branch:
            fail(
                "devnet releases require a 'branch' input, "
                "e.g. branch=bal-devnet-7"
            )
        match = re.search(r"(\d+)$", branch)
        if not match:
            fail(
                f"could not parse a devnet number from branch '{branch}' "
                "(expected a trailing number, e.g. bal-devnet-7)"
            )
        devnet_number = int(match.group(1))
        major = int(version.lstrip("v").split(".")[0])
        if major != devnet_number:
            minor_patch = version.split(".", 1)[1]
            fail(
                f"version major (v{major}) must equal the devnet number "
                f"({devnet_number}) from branch '{branch}'; "
                f"did you mean version=v{devnet_number}.{minor_patch}?"
            )
Member

This validation assumes execution-specs uses <feat>-devnet-<X> scheme, currently we use devnets/bal/7.

We need to sort this out first, there's some debate in the aforementioned chat.
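
For discussion only: if both branch naming schemes had to be supported while the rename is debated, the parsing could look roughly like the sketch below. The patterns and the name `parse_devnet_branch` are hypothetical, and the feature keyword is assumed to contain only lowercase letters and digits.

```python
import re

# Hypothetical patterns for the two branch schemes mentioned in this thread.
BRANCH_PATTERNS = (
    re.compile(r"^(?P<feat>[a-z0-9]+)-devnet-(?P<number>\d+)$"),   # bal-devnet-7
    re.compile(r"^devnets/(?P<feat>[a-z0-9]+)/(?P<number>\d+)$"),  # devnets/bal/7
)


def parse_devnet_branch(branch: str) -> tuple[str, int]:
    """Return (feature keyword, devnet number) for either branch scheme."""
    for pattern in BRANCH_PATTERNS:
        match = pattern.match(branch)
        if match:
            return match["feat"], int(match["number"])
    raise ValueError(f"'{branch}' is not a recognised devnet branch name")


print(parse_devnet_branch("bal-devnet-7"))   # ('bal', 7)
print(parse_devnet_branch("devnets/bal/7"))  # ('bal', 7)
```

Matching the full branch name, rather than only a trailing number, would also let the validation check that the feature keyword in the branch agrees with the feature input.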

Labels

A-ci Area: Continuous Integration C-feat Category: an improvement or new feature P-medium

8 participants