fix(containers): apt install fallback to archive.ubuntu.com #5266
The agent and squid Dockerfiles rewrite the apt mirror to azure.archive.ubuntu.com (faster on Azure-hosted GitHub runners). The existing fallback to archive.ubuntu.com only triggered when 'apt-get update' reported 'Failed to fetch' (metadata failures).

When the Azure mirror's metadata was reachable but the package (.deb) downloads timed out during 'apt-get install', the retry path re-ran apt_update_retry (which succeeded, leaving sources pointed at azure) and retried the install against the same failing mirror, causing slow retries and ultimately a hard build failure.

Extract the mirror rewrite into a shared force_archive_mirror helper, have apt_update_retry reuse it, and add apt_install_retry / apt_upgrade_retry that force the archive.ubuntu.com mirror before retrying. This covers the install and upgrade phases, not just update, so a flaky Azure mirror no longer fails the build (archive.ubuntu.com is already in the firewall allowlist).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
✅ Coverage Check Passed
Pull request overview
This PR improves build reliability for the agent and squid container images by adding retry helpers that fall back from the Azure Ubuntu apt mirror to archive.ubuntu.com not only during apt-get update, but also when apt-get install/upgrade fails mid-download.
Changes:
- Extracted apt mirror fallback logic into `force_archive_mirror()` and reused it from update retries.
- Added `apt_install_retry()` (and `apt_upgrade_retry()` in the agent image) to force the archive mirror before retrying installs/upgrades.
- Updated Dockerfile RUN blocks to use the new retry helpers consistently.
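The helper split described in the list above could look roughly like the following. This is a minimal sketch, not the PR's actual code: the function names come from the PR, but the bodies are assumptions, and `APT_GET`, `APT_ROOT`, and `APT_LISTS` are hypothetical overrides added only so the functions can be exercised outside a container.

```shell
# Hypothetical overrides (not part of the PR) so the sketch can run against
# a scratch directory and a stub command instead of the real system.
APT_GET="${APT_GET:-apt-get}"
APT_ROOT="${APT_ROOT:-/etc/apt}"
APT_LISTS="${APT_LISTS:-/var/lib/apt/lists}"

# Point the apt sources back at archive.ubuntu.com, then refresh the lists.
force_archive_mirror() {
  if [ -f "$APT_ROOT/sources.list" ]; then
    sed -i 's|http://azure.archive.ubuntu.com|http://archive.ubuntu.com|g' \
      "$APT_ROOT/sources.list"
  fi
  if [ -d "$APT_ROOT/sources.list.d" ]; then
    find "$APT_ROOT/sources.list.d" -name '*.sources' -exec \
      sed -i 's|http://azure.archive.ubuntu.com|http://archive.ubuntu.com|g' {} + \
      2>/dev/null || true
  fi
  # ${VAR:?} aborts instead of expanding to "/*" if the variable is empty.
  rm -rf "${APT_LISTS:?}"/* && "$APT_GET" update
}

# Try the install on the current mirror; if it fails, switch mirrors first
# and only then retry, so the retry does not hit the same failing mirror.
apt_install_retry() {
  "$APT_GET" install -y --no-install-recommends "$@" && return 0
  echo "apt-get install failed, falling back to archive.ubuntu.com..." >&2
  force_archive_mirror || return 1
  "$APT_GET" install -y --no-install-recommends "$@"
}
```

The point of the design is that the mirror switch happens before the retry rather than after repeated failures, so a reachable-metadata-but-slow-downloads mirror costs at most one failed install attempt.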
Show a summary per file
| File | Description |
|---|---|
| containers/squid/Dockerfile | Adds force_archive_mirror + apt_install_retry and wires them into the Squid image’s apt flow. |
| containers/agent/Dockerfile | Adds force_archive_mirror + apt_install_retry + apt_upgrade_retry across multiple apt blocks to reduce flaky CI builds. |
Copilot's findings
- Files reviewed: 2/2 changed files
- Comments generated: 9
    if [ -d /etc/apt/sources.list.d ]; then \
      find /etc/apt/sources.list.d -name '*.sources' -exec \
        sed -i 's|http://azure.archive.ubuntu.com|http://archive.ubuntu.com|g' {} + 2>/dev/null || true; \
    fi; \
    rm -rf /var/lib/apt/lists/* && apt-get update; \

    apt_update_retry() { \
      local i; for i in 1 2 3; do \
        rm -rf /var/lib/apt/lists/* && apt-get update 2>&1 | tee /tmp/apt-update.log && \
        if ! grep -q "Failed to fetch" /tmp/apt-update.log; then return 0; fi; \
        echo "apt-get update attempt $i/3 had fetch failures, retrying in $((i*10))s..." >&2; sleep $((i*10)); \
      done; \
      echo "All apt-get update retries failed, falling back to archive.ubuntu.com..." >&2; \
      force_archive_mirror; \
    }; \
|
✅ Copilot review passed with no inline comments. @lpcox Add the |
Address review feedback on the apt mirror fallback helpers:

- force_archive_mirror: also rewrite security.ubuntu.com (not just azure.archive.ubuntu.com) inside deb822 /etc/apt/sources.list.d/*.sources entries, matching the /etc/apt/sources.list branch. Otherwise, if the initial Azure-mirror rewrite never ran (e.g. DNS failure) and the base image uses .sources files, the fallback would leave security.ubuntu.com in place and apt-get update could keep failing.
- apt_update_retry: stop masking apt-get update's exit code. The previous 'apt-get update 2>&1 | tee' pipeline returned tee's status (0) under /bin/sh (no pipefail), so a non-"Failed to fetch" failure (e.g. a dpkg lock error) was treated as success and never retried. Redirect to a log file, capture the real exit code, and retry/fall back when either the command fails or "Failed to fetch" appears.

Applied consistently across all apt blocks in both Dockerfiles.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
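The exit-code change to `apt_update_retry` described in this commit message might be sketched as follows. This is an illustration under assumptions, not the committed code: `APT_GET`, `APT_LOG`, and `RETRY_SLEEP` are hypothetical knobs, `force_archive_mirror` is reduced to a placeholder, and the per-attempt cleanup of the apt lists directory is omitted for brevity.

```shell
# Hypothetical overrides (not part of the PR) for running the sketch safely.
APT_GET="${APT_GET:-apt-get}"
APT_LOG="${APT_LOG:-/tmp/apt-update.log}"
RETRY_SLEEP="${RETRY_SLEEP:-10}"   # base backoff in seconds

# Placeholder for the PR's mirror-rewrite helper; here it only refreshes lists.
force_archive_mirror() { "$APT_GET" update; }

apt_update_retry() {
  # Plain variables keep the sketch strictly POSIX; the Dockerfiles use `local`.
  i=1
  while [ "$i" -le 3 ]; do
    rc=0
    # Redirect to the log instead of piping into tee, so the status that is
    # captured is the update command's own rather than tee's.
    "$APT_GET" update >"$APT_LOG" 2>&1 || rc=$?
    cat "$APT_LOG"
    # Success requires both a zero exit code and no fetch failures in the log.
    if [ "$rc" -eq 0 ] && ! grep -q "Failed to fetch" "$APT_LOG"; then
      return 0
    fi
    echo "apt-get update attempt $i/3 failed (exit $rc), retrying in $((i * RETRY_SLEEP))s..." >&2
    sleep $((i * RETRY_SLEEP))
    i=$((i + 1))
  done
  echo "All apt-get update retries failed, falling back to archive.ubuntu.com..." >&2
  force_archive_mirror
}
```

With this shape, a failure that never prints "Failed to fetch" (such as a lock error) still counts as a failed attempt and is retried.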
|
Addressed the review feedback in 4608238: 1. 2. Applied consistently across all apt blocks in both Dockerfiles. Validation: all 12 |
The runGhCommand tests spawn the real `gh` binary, so they are subject to runner contention. Jest's 5s default could fire before the helper's own 30s COMMAND_TIMEOUT_MS, producing a spurious "Exceeded timeout of 5000 ms" failure on a slow runner (observed on Node 22). Give these tests a 30s timeout to match the internal command timeout. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
|
❌ Smoke Claude failed |
|
✅ Contribution Check completed successfully! |
|
🔑 Smoke Copilot PAT PAT auth validated. All systems operational. ✅ |
|
🔌 Smoke Services — All services reachable! ✅ |
|
✅ Smoke Copilot BYOK AOAI (api-key) completed. Copilot AOAI BYOK (api-key) mode operational. 🔓 |
|
✅ Smoke Gemini completed. All facets verified. 💎 Smoke test completed with FAIL status. Comment added to PR #5266. |
|
✨ The prophecy is fulfilled... Smoke Codex has completed its mystical journey. The stars align. 🌟 |
|
✅ Smoke Copilot BYOK AOAI (Entra) completed. Copilot AOAI BYOK (Entra) mode operational. 🔓 |
|
Chroot tests passed! Smoke Chroot - All security and functionality tests succeeded. |
|
✅ Build Test Suite completed successfully! |
|
📡 Smoke OTel Tracing completed. All tracing scenarios validated. ✅ |
|
📰 VERDICT: Smoke Copilot has concluded. All systems operational. This is a developing story. 🎤 |
|
✅ Smoke Copilot BYOK completed. Copilot BYOK mode operational. 🔓 |
|
🤖 Smoke Test Results: Copilot Engine Validation
PR: fix(containers): apt install fallback to archive.ubuntu.com
Overall: FAIL — Pre-step data collection did not resolve template variables (
|
🔍 Smoke Test: API Proxy OpenTelemetry Tracing
All scenarios pass. OTEL integration is functional.
|
|
Merged PRs:

Warning
Firewall blocked 1 domain
The following domain was blocked by the firewall during workflow execution:

network:
  allowed:
    - defaults
    - "registry.npmjs.org"

See Network Configuration for more information.
|
🔬 Chroot Version Comparison Results
Overall: ❌ Not all tests passed — Python and Node.js versions differ between host and chroot environments.
|
Running in direct BYOK mode (AWF_AUTH_TYPE=github-oidc + AWF_AUTH_AZURE_* + COPILOT_PROVIDER_BASE_URL) via api-proxy → Azure OpenAI (Foundry, o4-mini-aw) authenticated via Microsoft Entra
Overall: PASS
|
🔐 Smoke Test: Copilot PAT Auth — PASS
Overall: PASS · Auth mode: PAT (COPILOT_GITHUB_TOKEN) cc @lpcox
|
|
Smoke Test: Copilot BYOK (Direct) Mode ✅ PASS
Running in direct BYOK mode ( cc @lpcox
|
|
@lpcox - PRs: fix(containers): apt install fallback to archive.ubuntu.com; docs: sync schemas and specs with source changes
|
🏗️ Build Test Suite Results
Overall: 8/8 ecosystems passed — ✅ PASS
|
Smoke Test Results
Overall status: FAIL

Warning
Firewall blocked 1 domain
The following domain was blocked by the firewall during workflow execution:

network:
  allowed:
    - defaults
    - "localhost"

See Network Configuration for more information.
|
Smoke Test Results — Services Connectivity
Overall: FAIL
|
Problem

The `agent` and `squid` Dockerfiles rewrite the apt mirror to azure.archive.ubuntu.com (normally much faster on Azure-hosted GitHub runners). They already retry on transient failures, but the fallback to `archive.ubuntu.com` only triggered when `apt-get update` reported `Failed to fetch`, i.e. metadata failures.

When the Azure mirror's metadata was reachable but the package downloads timed out during `apt-get install`, the retry path re-ran `apt_update_retry` (which succeeded, leaving sources still pointed at azure) and retried the install against the same failing mirror, producing slow backoff retries and ultimately a hard build failure.

This is what intermittently killed the smoke-claude agent job (the build runs lazily inside the `Execute … CLI` step under `--build-local`), and caused the slow azure pulls observed in CI.

Fix

- Extract the mirror rewrite into a shared `force_archive_mirror` helper.
- Have `apt_update_retry` reuse it.
- Add `apt_install_retry` / `apt_upgrade_retry` that force the archive.ubuntu.com mirror before retrying, covering the install and upgrade phases, not just update.

A flaky Azure mirror now transparently falls back to `archive.ubuntu.com` (already in the firewall allowlist) instead of failing the build. Applied consistently across all apt blocks in both Dockerfiles (the `iptables-init` container is built from the agent image, so it's covered; `api-proxy` is alpine/apk and unaffected).

Validation

- `RUN` blocks pass `bash -n` syntax checks (after mimicking Docker's comment stripping).
- `force_archive_mirror` rewrites sources → install retried against archive → succeeds (exit 0).
- `src/services/{agent,squid}-service.test.ts` still pass (40 tests); lint + build pass via pre-commit hook.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
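The `bash -n` validation step mentioned above could be reproduced along these lines. This is a rough sketch under assumptions: `check_run_block` is a hypothetical helper, and it expects a file that already contains the shell text of a single `RUN` instruction (the PR does not say how the blocks were extracted from the Dockerfiles).

```shell
# check_run_block FILE
# FILE holds the shell body of one RUN instruction, continuation
# backslashes included (hypothetical input format).
check_run_block() {
  # Docker drops comment-only lines inside a multi-line instruction before
  # the shell ever sees it, so remove them here too; then let bash parse
  # the result without executing anything (-n).
  grep -v '^[[:space:]]*#' "$1" | bash -n
}
```

A call such as `check_run_block /tmp/agent-run-1.sh` (an illustrative path) returns zero when the block parses and non-zero on a syntax error, such as an unclosed function body.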