[DX03578] speed up local CRE startup #21719

Merged
Tofel merged 6 commits into develop from codex/dx-3578-local-cre-startup-speedup
Mar 30, 2026

Tofel commented Mar 26, 2026

Speeds up local CRE startup by parallelizing:

  • secret generation (big one!)
  • JD linking
  • job proposal
  • job approval

Before

Environment setup completed successfully in 63.87 seconds

or

Environment setup completed successfully in 70.01 seconds

After

Environment setup completed successfully in 40.93 seconds

or

Environment setup completed successfully in 33.99 seconds

(measured on a local machine, without building the Docker image)
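
For a rough sense of scale, the four timings quoted above can be averaged and compared. This is only arithmetic on the numbers in this description (two runs each side, so not a benchmark); the mean helper is introduced here for illustration. It works out to roughly a 44% reduction, or about 1.79x faster.

```go
package main

import "fmt"

// mean returns the arithmetic mean of the given samples.
func mean(xs ...float64) float64 {
	sum := 0.0
	for _, x := range xs {
		sum += x
	}
	return sum / float64(len(xs))
}

func main() {
	// Timings (seconds) copied from the Before/After runs above.
	before := mean(63.87, 70.01)
	after := mean(40.93, 33.99)

	fmt.Printf("before %.2fs, after %.2fs\n", before, after)
	fmt.Printf("reduction %.1f%%, speedup %.2fx\n",
		(before-after)/before*100, before/after)
}
```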

github-actions bot commented Mar 26, 2026

✅ No conflicts with other open PRs targeting develop

github-actions commented

I see you updated files related to core. Please run make gocs in the root directory to add a changeset, and include at least one of the following tags in the changeset text:

  • #added For any new functionality added.
  • #breaking_change For any functionality that requires manual action for the node to boot.
  • #bugfix For bug fixes.
  • #changed For any change to the existing functionality.
  • #db_update For any feature that introduces updates to database schema.
  • #deprecation_notice For any upcoming deprecation functionality.
  • #internal For changesets that need to be excluded from the final changelog.
  • #nops For any feature that is NOP facing and needs to be in the official Release Notes for the release.
  • #removed For any functionality/config that is removed.
  • #updated For any functionality that is updated.
  • #wip For any change that is not ready yet and external communication about it should be held off till it is feature complete.

Tofel force-pushed the codex/dx-3578-local-cre-startup-speedup branch 3 times, most recently from 272ccda to 721a668, March 30, 2026 10:24
Tofel force-pushed the codex/dx-3578-local-cre-startup-speedup branch from 721a668 to da13c82, March 30, 2026 10:51
mchain0 previously approved these changes Mar 30, 2026
Tofel marked this pull request as ready for review March 30, 2026 13:33
Tofel requested review from a team as code owners March 30, 2026 13:33
Copilot AI review requested due to automatic review settings March 30, 2026 13:33
Copilot AI left a comment

Pull request overview

Risk Rating: HIGH (broad concurrency changes in CRE system-test harness + CI workflow timeout reduction)

This PR speeds up local CRE environment startup by parallelizing several system-test setup steps (node metadata/key generation, JD linking, job proposal/approval) and by reordering some feature initialization to reduce blocking.

Changes:

  • Parallelize node metadata/key generation and several job-spec proposal flows while preserving deterministic per-node/spec ordering via a merge helper.
  • Parallelize JD linking across DONs and cache/reuse JD chain-config bundle IDs; add timing logs for key steps.
  • Add a temporary wait for workflow-worker capability-registry sync state (non-Kubernetes) to mitigate a known race, and reduce GitHub Actions job timeouts.

Reviewed changes

Copilot reviewed 19 out of 19 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| system-tests/lib/cre/types.go | Parallelize node metadata creation; add timing logs for node metadata + key generation; extend capability registry input. |
| system-tests/lib/cre/types_test.go | Add tests for newNodes ordering and error index reporting. |
| system-tests/lib/cre/features/solana/v2/solana.go | Parallelize Solana v2 job proposals and merge results deterministically. |
| system-tests/lib/cre/features/evm/v2/evm.go | Parallelize EVM v2 job proposals across (chain,node) work items and merge results deterministically. |
| system-tests/lib/cre/features/read_contract/read_contract.go | Parallelize per-chain job proposals and merge results deterministically. |
| system-tests/lib/cre/features/log_event_trigger/log_event_trigger.go | Parallelize per-chain job proposals and merge results deterministically. |
| system-tests/lib/cre/features/jobhelpers/helpers.go | Add helpers for bounded parallelism + ordered merge of per-worker proposal results. |
| system-tests/lib/cre/features/jobhelpers/helpers_test.go | Add test ensuring merge preserves input order. |
| system-tests/lib/cre/features/sets/sets.go | Reorder features so OCR3-related pieces run later to avoid blocking other job features. |
| system-tests/lib/cre/features/consensus/v2/consensus.go | Minor formatting-only adjustment. |
| system-tests/lib/cre/environment/environment.go | Pass provider into capability-registry configuration; update call to new ctx-aware API; add stage log line. |
| system-tests/lib/cre/don.go | Parallelize DON JD linking, cache OCR2 bundle IDs per chain type, add JD chain-config timing logs, add retry/list helpers, add rounding helper. |
| system-tests/lib/cre/don_test.go | Add tests for JD chain-config creation behavior (skip existing, create missing, timeout). |
| system-tests/lib/cre/don/jobs/jobs.go | Optimize approvals by fetching proposals once per node and approving per node concurrently (sequential within node). |
| system-tests/lib/cre/don/jobs/jobs_test.go | Add tests for approval concurrency/behavior and workflow-spec “already approved” handling. |
| system-tests/lib/cre/contracts/keystone.go | Make capability registry configuration ctx-aware; add post-config wait for workflow-worker registry sync; refactor v1/v2 flows. |
| system-tests/lib/cre/contracts/registry_pickup_wait.go | Add polling logic to wait for workflow-worker DB registry-sync snapshots (non-Kubernetes). |
| .github/workflows/cre-system-tests.yaml | Reduce workflow job timeout from 60 to 10 minutes. |
| .github/workflows/cre-regression-system-tests.yaml | Reduce workflow job timeout from 60 to 10 minutes. |

Areas that need careful human review:

  • Concurrency limiting choices in job proposal paths (especially EVM v2) to avoid overwhelming JD/nodes.
  • The new capability-registry sync polling (DB access patterns, timeouts, and correctness of the readiness condition).
  • The CI timeout reduction impact on real-world runtime variability (cold caches, image pulls, runner performance).

Suggested reviewers (per CODEOWNERS):

  • For system-test harness changes (default owners): @smartcontractkit/core, @smartcontractkit/foundations.
  • For workflow/CI timeout changes (/.github/**): @smartcontractkit/devex-cicd (also @smartcontractkit/devex-tooling, @smartcontractkit/core).
  • For capability-registry / registrysyncer-adjacent behavior: @smartcontractkit/keystone.

Comment thread system-tests/lib/cre/features/evm/v2/evm.go Outdated
Comment thread system-tests/lib/cre/don.go Outdated
Comment thread .github/workflows/cre-system-tests.yaml
Comment thread .github/workflows/cre-regression-system-tests.yaml
Tofel enabled auto-merge March 30, 2026 14:33
Tofel added this pull request to the merge queue Mar 30, 2026
Merged via the queue into develop with commit 2c8e5ea Mar 30, 2026
128 of 129 checks passed
Tofel deleted the codex/dx-3578-local-cre-startup-speedup branch March 30, 2026 16:44
prashantkumar1982 pushed a commit that referenced this pull request Apr 2, 2026
* parallelize secrets generation, JD linking and job proposal/approvals

* clean up

* lints lints

* allow CRE system/regression tests to run max 10 minutes in the CI

* CR changes
4 participants