fix(ci): make redis_setexz a clean noop when redis is unavailable #24046

Closed
AztecBot wants to merge 1 commit into next from cb/ci-redis-setexz-noop-no-redis

@AztecBot

Collaborator

Problem

The nightly barretenberg debug build failed: https://github.com/AztecProtocol/aztec-claude/actions/runs/27398926124

The job dies almost immediately (~1s of script time) with:

--- Run barretenberg-debug CI ---
gzip: stdout: Broken pipe
##[error]Process completed with exit code 1.

Root cause

ci3/source enables set -euo pipefail (via source_options). The first thing bootstrap_ec2 does is:

echo "CI booting..." | redis_setexz "$CI_LOG_ID" 300   # ci3/bootstrap_ec2:24

and redis_setexz was:

function redis_setexz {
  gzip | redis_cli -x SETEX $1 $2 &>/dev/null
}

redis_cli is intentionally a noop when redis is unavailable (CI_REDIS_AVAILABLE=0), but it is a noop that never reads its stdin. The upstream gzip in the pipe therefore writes to a closed pipe, gets SIGPIPE/EPIPE, and exits nonzero; under pipefail that status becomes the pipeline's status, and set -e aborts the whole script before the build can even start. cache_log already guards this exact case by draining stdin (cat >/dev/null) when redis is unavailable. redis_setexz did not, so it remained a latent failure for any environment where redis can't be reached (mirror repos, external contributors, local runs without docker).

In normal aztec-packages CI redis is always reachable, so this path never triggered. It surfaced on the aztec-claude mirror nightly, where AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY / BUILD_INSTANCE_SSH_KEY are empty, so no redis tunnel is established and CI_REDIS_AVAILABLE=0.
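
The failure mode described in this section can be reproduced outside ci3. The following is a minimal sketch, not the ci3 code: `consumer` is a hypothetical stand-in for the noop redis_cli, and `yes` replaces the one-line `echo | gzip` producer so that the broken pipe is deterministic rather than a race. Each pipeline runs in a child bash so its failure cannot abort the calling shell.

```shell
# Case 1: the consumer exits without ever reading stdin
# (hypothetical stand-in for the noop redis_cli).
rc_noread=0
bash -c 'set -o pipefail; consumer() { :; }; yes | consumer' 2>/dev/null || rc_noread=$?

# Case 2: same pipeline shape, but the consumer drains stdin first,
# which is the approach the description attributes to cache_log.
rc_drain=0
bash -c 'set -o pipefail; consumer() { cat >/dev/null; }; seq 1 100000 | gzip | consumer' || rc_drain=$?

echo "non-reading consumer: pipeline exit status $rc_noread"
echo "draining consumer:    pipeline exit status $rc_drain"
```

The first status is nonzero: typically 141 (128 + SIGPIPE), or 1 if the environment ignores SIGPIPE and the producer sees EPIPE on write instead, which would be consistent with the "gzip: stdout: Broken pipe" line and exit code 1 in the job log. The second status is 0 because the producer is always able to finish writing.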

Fix

Guard redis_setexz on CI_REDIS_AVAILABLE exactly like redis_cli/cache_log already do, draining stdin when redis is unavailable so the upstream producer never hits a broken pipe.

Verified red/green in isolation under set -euo pipefail with CI_REDIS_AVAILABLE=0: the old function aborts the caller (exit 141 / broken pipe); the new function returns cleanly and the caller continues.
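
For illustration, a guard of the kind described above might look like the sketch below. This is an assumption about the shape of the change, not the actual diff: the `redis_cli` stub is hypothetical and only exists so the snippet is self-contained, and `example-log-id` is a placeholder key.

```shell
# Hypothetical stub for ci3's redis_cli so this snippet runs on its own;
# it consumes stdin the way a real client invoked with -x would.
function redis_cli {
  cat >/dev/null
}

# Sketch of a guarded redis_setexz (the real patch may differ).
function redis_setexz {
  if [ "${CI_REDIS_AVAILABLE:-0}" -eq 1 ]; then
    gzip | redis_cli -x SETEX "$1" "$2" &>/dev/null
  else
    # Drain stdin so the upstream producer never writes to a closed pipe.
    cat >/dev/null
  fi
}

# "Green" check with redis unavailable: under pipefail the pipeline status
# must be 0, otherwise set -e in the caller would abort the script.
set -o pipefail
CI_REDIS_AVAILABLE=0
seq 1 100000 | redis_setexz "example-log-id" 300
green_rc=$?
echo "redis_setexz with CI_REDIS_AVAILABLE=0: pipeline exit status $green_rc"
```

Draining with `cat >/dev/null` rather than returning immediately is the important design choice: the function stays a noop from the caller's point of view, but the producer on the left side of the pipe can always complete its writes.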

Secondary note (not fixed here — config/sync, not code)

The deeper reason the scheduled nightly runs on aztec-claude at all is that the mirror's next is stale. The workflow in this repo already carries a guard added in 4a284902a0a (2026-06-04, "chore: don't run scheduled jobs on forks"):

if: ${{ github.event_name != 'schedule' || github.repository == 'AztecProtocol/aztec-packages' }}

The aztec-claude copy is pinned to a 2026-05-20 upstream merge — before that guard landed — so it still runs the scheduled job on the mirror. Once aztec-claude syncs next, the scheduled run will be skipped there entirely. This PR fixes the underlying script fragility regardless, so the broken pipe can't recur in any no-redis environment.


Created by claudebox · group: slackbot

AztecBot added labels on Jun 12, 2026: ci-draft (Run CI on draft PRs), ci-no-fail-fast (Sets NO_FAIL_FAST in the CI so the run is not aborted on the first failure), claudebox (Owned by claudebox; it can push to this PR).
@AztecBot

Collaborator Author

Automatically closing this stale claudebox draft PR (no updates for 5+ days). Re-open if still needed.

AztecBot closed this on Jun 17, 2026