Use sharding in CI workflows by rockbmb · Pull Request #619 · open-web3-stack/polkadot-ecosystem-tests

rockbmb · 2026-05-18T18:17:40Z

Closes #610.

Context

#610 listed three CI perf items:
* a shared SQLite DB cache,
* Subway binary caching, and
* a switch to --pool=forks.

The latter two have already landed in master. The DB cache was not worthwhile: a full 14-run benchmark (sqlite3 ± DB, better-sqlite3 ± DB) showed ~0-5% deltas at PET scale, well within run-to-run noise. Test time is WASM-bound, not storage-bound, so a shared SQLite layer does not improve it.

What does produce significant improvement is splitting the work across more runners.

Changes

Shard the test matrix 3 ways per network via Vitest's --shard flag. The two ecosystem-test jobs (polkadot, kusama) become six (polkadot×{1,2,3}, kusama×{1,2,3}), each running ~1/3 of the test files in parallel on separate runners. Applied to both ci.yml and update-known-good.yml.
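As a rough illustration of the matrix described above, a sharded job could look something like the fragment below. This is a sketch, not the PR's actual diff: the job name, runner label, action versions, the `yarn vitest run` invocation, the `packages/<network>` file filter, and `fail-fast: false` are all assumptions. Only the `--shard` flag, the 3-way split, the two networks, and the `--retry` and `timeout-minutes` values come from the description.

```yaml
# Hypothetical excerpt of a sharded test job; names and versions are assumptions.
jobs:
  tests:
    runs-on: ubuntu-latest
    timeout-minutes: 60          # value stated in the Impact section
    strategy:
      fail-fast: false           # assumption: let the other shards finish if one fails
      matrix:
        network: [polkadot, kusama]
        shard: [1, 2, 3]         # 2 networks x 3 shards = 6 jobs
    steps:
      - uses: actions/checkout@v4
      # Dependency installation and Subway setup omitted.
      - name: Run shard ${{ matrix.shard }}/3 for ${{ matrix.network }}
        # How the real workflow selects a network's tests is not shown in the
        # PR description; the directory filter below is a placeholder.
        run: >-
          yarn vitest run packages/${{ matrix.network }}
          --shard=${{ matrix.shard }}/3
          --retry=2
```

Vitest assigns test files to shards, so each of the three jobs per network runs roughly a third of the files.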

Per-shard --reporter=blob outputs are uploaded as artifacts and combined by a follow-on merge-reports job (vitest --merge-reports), so the GitHub run summary still shows a single unified set of test totals instead of one section per shard. Each shard writes to blob-${network}-${shard}.json to avoid filename collisions when artifacts are merged.
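The blob-and-merge flow described above might be wired up roughly as follows. Treat this as a sketch under assumptions: the artifact names, the `.vitest-reports` directory, the action versions, and the step layout are illustrative and not taken from the PR's diff. The `--reporter=blob`, `--outputFile`, and `--merge-reports` flags are Vitest CLI options, and the blob file name follows the pattern given in the description.

```yaml
# Hypothetical excerpt; artifact names, paths, and versions are assumptions.
jobs:
  tests:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        network: [polkadot, kusama]
        shard: [1, 2, 3]
    steps:
      - uses: actions/checkout@v4
      # Dependency installation omitted.
      - name: Run shard and write a uniquely named blob
        run: >-
          yarn vitest run
          --shard=${{ matrix.shard }}/3
          --reporter=blob
          --outputFile=.vitest-reports/blob-${{ matrix.network }}-${{ matrix.shard }}.json
      - name: Upload blob report
        if: always()             # upload even when tests fail
        uses: actions/upload-artifact@v4
        with:
          name: blob-${{ matrix.network }}-${{ matrix.shard }}
          path: .vitest-reports/
          include-hidden-files: true   # the report directory is dot-prefixed

  merge-reports:
    needs: tests
    if: always()
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # Dependency installation omitted.
      - name: Download all blob artifacts into one directory
        uses: actions/download-artifact@v4
        with:
          pattern: blob-*
          path: .vitest-reports
          merge-multiple: true   # flattens all six blobs into one directory
      - name: Merge shard reports into one summary
        run: yarn vitest --merge-reports
```

Because `merge-multiple: true` places every artifact's files in the same directory, the per-network, per-shard file names are what keep one blob from overwriting another.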

update-known-good.yml's failed-chains-${network} artifact is renamed to failed-chains-${network}-${shard} to avoid collisions; the existing notify job already unions chain names across all downloaded artifacts, so per-chain GitHub-issue notifications continue to work.
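One way the renamed artifact and the notify job's union could fit together is sketched below. The `failed-chains.txt` file, its one-chain-per-line format, the job conditions, and the `sort -u` step are hypothetical; the PR description only states the artifact naming scheme and that the notify job unions chain names across downloaded artifacts.

```yaml
# Hypothetical excerpt; file name, format, and conditions are assumptions.
jobs:
  tests:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        network: [polkadot, kusama]
        shard: [1, 2, 3]
    steps:
      # Test steps omitted; assume they write failing chain names,
      # one per line, to failed-chains.txt.
      - name: Upload failed chains for this shard
        if: failure()
        uses: actions/upload-artifact@v4
        with:
          name: failed-chains-${{ matrix.network }}-${{ matrix.shard }}
          path: failed-chains.txt
          if-no-files-found: ignore

  notify:
    needs: tests
    if: failure()
    runs-on: ubuntu-latest
    steps:
      - uses: actions/download-artifact@v4
        with:
          pattern: failed-chains-*
          path: failed-chains    # each artifact lands in its own subdirectory
      - name: Union chain names across shards
        run: sort -u failed-chains/*/failed-chains.txt
```

Artifact names must be unique within a run in `actions/upload-artifact@v4`, which is why the shard index has to be part of the name once each network runs as three jobs.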

Also drops two pruned RPC endpoints (wss://bridgehub-kusama.public.curie.radiumblock.co/ws, wss://bulletin.amperfix.de) and adds wss://bulletin-rpc.polkadot.io for Polkadot Bulletin. Both removed endpoints accept connections and serve block headers, but fail on state_getStorage at PET's pinned blocks with UnknownBlock: State already discarded.

Impact

CI wall time on a clean run drops from ~35min to ~15min (measured on this branch). Each shard still gets the full 9-chain Subway pool (no infra changes), so per-shard RPC cache duplication across runners is the tradeoff for the parallelism. A future move to self-hosted runners with a long-lived Subway pool would close that gap further.

--retry=3 is reduced to --retry=2 since #616 cut endpoint flakiness, and timeout-minutes is reduced from 150 to 60 to match the new, shorter run time.

github-actions

The introduction of sharding in the CI workflow is a great step for performance. However, reducing the test retry count could impact the stability of the CI pipeline if tests are flaky. The changes to the .env files are just block number updates and have no issues.

- ci.yml: add --reporter=blob per shard, upload as artifact, and a merge-reports job that combines them via vitest --merge-reports so the run summary shows one unified set of test totals instead of one section per shard.
- update-known-good.yml: shard the tests matrix 3 ways to match ci.yml, and disambiguate the failed-chains artifact name by shard so the notify job downloads all per-(network, shard) reports without collisions.

The previous amperfix endpoint has pruned state at PET's pinned block (probed live: UnknownBlock error in 136ms). bulletin-rpc.polkadot.io serves the same block in 110ms. simplystaking's spectrum endpoint is also pruned and not used.

Vitest's default blob filename is blob-${shard}-${total}.json, which collides across networks when 'actions/download-artifact' merges all artifacts into one directory. Each shard now writes to a uniquely named file, so the merge-reports job can parse all six blobs without the JSON-after-JSON SyntaxError seen in run 26061812546.

rockbmb · 2026-05-19T01:10:35Z

I am also testing this in the runtimes repo: polkadot-fellows/runtimes#1180

If all looks good over there as well (that work is based on this), I will merge this one.

rockbmb added 2 commits May 18, 2026 17:26

Bump block numbers

334a1a7

Use vitest sharding in CI workflow

1a29548

rockbmb added this to the Refactors & redesigns milestone May 18, 2026

rockbmb self-assigned this May 18, 2026

rockbmb added the enhancement and ci labels May 18, 2026

github-actions Bot reviewed May 18, 2026

Comment thread .github/workflows/ci.yml Outdated

rockbmb added 4 commits May 18, 2026 18:35

Remove non-archive radiumblock endpoint for BridgeHub Kusama

19cb875

rockbmb changed the title ~~Use sharding in CI workflow~~ Use sharding in CI workflows May 18, 2026

xlc approved these changes May 19, 2026

rockbmb merged commit 5fd1659 into master May 19, 2026
13 checks passed

rockbmb deleted the ci-sharding branch May 19, 2026 14:26

rockbmb merged 6 commits into master from ci-sharding
