fix(ci): pre-clone nargo external git deps with retry to survive DNS flakes #23263

Closed
AztecBot wants to merge 1 commit into merge-train/spartan from claudebox/fix-pr-23253-dequeue

@AztecBot (Collaborator)

Why

PR #23253 (the merge-train/spartan train PR) was dequeued from the merge queue on commit b5431aa3e4 (run 25829983065). The aztec-nr shard failed because nargo couldn't clone noir-lang/sha256@v0.3.0:

Cloning into '/home/aztec-dev/nargo/github.com/noir-lang/sha256/v0.3.0'...
fatal: unable to access 'https://github.com/noir-lang/sha256/': Could not resolve host: github.com
Cannot read file /home/aztec-dev/nargo/github.com/noir-lang/sha256/v0.3.0/Nargo.toml - does it exist?

ci.sh runs GNU parallel with --halt now,fail=1, which kills all remaining jobs as soon as one fails, and merge-queue-heavy runs 10 grind shards, so one shard's DNS hiccup kills the entire run. This is the same flavour of failure as the recent CRS HTTP fix (#23244): no code regression, just a transient network blip.

nargo clones git deps on-demand and has no retry. noir-projects/bootstrap.sh prep already warms the cache with nargo fmt --check, but the top-level Makefile invokes each sub-bootstrap (aztec-nr/, noir-contracts/, noir-protocol-circuits/) in parallel directly, bypassing that prep.

What

New noir-projects/scripts/prefetch_nargo_git_deps.sh:

  • Walks every Nargo.toml under noir-projects/, extracts (owner/repo, tag) git dep pairs.
  • For each missing dep, runs git clone --depth 1 --branch <tag> into $NARGO_HOME/github.com/<owner>/<repo>/<tag>/, with up to 3 attempts and 2s/4s backoff, cleaning up partial state between attempts.
  • Per-dep flock makes it safe under concurrent calls from parallel make targets; idempotent re-runs are sub-50ms no-ops.
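
The three steps above can be sketched roughly as follows. This is a minimal illustration, not the script from this PR. The function names, the BACKOFF_BASE_SECS variable, the lock-file location, and the use of Nargo.toml as the "already cloned" marker are all hypothetical. It also assumes each git dependency is written as a single-line inline table with git and tag keys, and that flock(1) from util-linux is available.

```sh
#!/bin/sh
# Hypothetical sketch of a nargo git-dep prefetcher; names are illustrative only.
set -e

NARGO_HOME="${NARGO_HOME:-$HOME/nargo}"   # assumed default cache location

# Read Nargo.toml content on stdin; print "owner/repo tag" per GitHub git dep.
extract_git_deps() {
  grep -E 'git[[:space:]]*=[[:space:]]*"https://github\.com/' |
  while IFS= read -r line; do
    repo=$(printf '%s\n' "$line" |
      sed -nE 's#.*git[[:space:]]*=[[:space:]]*"https://github\.com/([^"]+)".*#\1#p')
    tag=$(printf '%s\n' "$line" |
      sed -nE 's#.*tag[[:space:]]*=[[:space:]]*"([^"]+)".*#\1#p')
    repo="${repo%/}"; repo="${repo%.git}"   # normalise URL suffixes
    [ -n "$repo" ] && [ -n "$tag" ] || continue
    printf '%s %s\n' "$repo" "$tag"
  done
}

# Run "$@" up to 3 times, sleeping 2s then 4s between attempts by default.
retry_with_backoff() {
  delay="${BACKOFF_BASE_SECS:-2}"
  for attempt in 1 2 3; do
    if "$@"; then return 0; fi
    if [ "$attempt" -lt 3 ]; then
      sleep "$delay"
      delay=$((delay * 2))
    fi
  done
  return 1
}

# One clone attempt; removes any partial state left by a previous failure.
clone_once() {
  rm -rf "$3"
  git clone --quiet --depth 1 --branch "$2" "https://github.com/$1" "$3"
}

# Clone one dep under an exclusive per-dep lock; a no-op if already present.
prefetch_dep() {
  dest="$NARGO_HOME/github.com/$1/$2"
  mkdir -p "$(dirname "$dest")"
  (
    flock 9                                   # wait for exclusive lock on fd 9
    if [ -f "$dest/Nargo.toml" ]; then exit 0; fi
    retry_with_backoff clone_once "$1" "$2" "$dest"
  ) 9>"$dest.lock"
}

# Walk every Nargo.toml under a root directory and prefetch each unique dep.
prefetch_all() {
  find "$1" -name Nargo.toml -exec cat {} + | extract_git_deps | sort -u |
  while read -r repo tag; do
    prefetch_dep "$repo" "$tag" </dev/null    # keep git off the loop's stdin
  done
}
```

In this sketch, a caller would run prefetch_all noir-projects before invoking nargo. The presence check sits inside the lock, so a process that waited on another process's clone sees the finished directory and skips its own.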

The script is wired into the start of build in the aztec-nr, noir-contracts, and noir-protocol-circuits bootstraps (this also covers mock-protocol-circuits, which delegates to noir-protocol-circuits), and into noir-projects/bootstrap.sh prep for symmetry.

Verification

Locally:

  • Clones all 7 unique deps into the layout nargo expects.
  • Idempotent re-run completes in ~36ms with no network I/O.
  • 4× concurrent invocations against an empty $NARGO_HOME lock-arbitrate correctly.

A full ./bootstrap.sh ci run takes hours and the underlying flake is not reproducible on demand, so local verification is structural rather than behavioural. CI will exercise the change on real worker nodes.

Detailed analysis: https://gist.github.com/AztecBot/c0c66726b060de2dc986318121bc0f11

ClaudeBox log: https://claudebox.work/s/a27c0434cc7008b9?run=1

@AztecBot added the ci-draft and claudebox labels on May 13, 2026
@alexghr closed this on May 14, 2026