Commit 8dec6b2

fix(ci3): cache DNS on build instances to dodge link-local PPS throttling (#24105)
## Problem

CI DNS failures (`curl: (6) Could not resolve host …`, e.g. the `chonk_inputs.sh` S3 download) are consistent with AWS's **link-local PPS limit**: traffic to the Amazon resolver (the VPC `.2` address / `169.254.169.253`) is capped at **~1024 packets/sec per ENI**, and over that, packets are silently dropped (`linklocal_allowance_exceeded` in `ethtool -S`).

We confirmed the build's DNS path makes this likely: the devbox container **and** nested docker-in-docker both get `nameserver 172.31.0.2` and query the VPC resolver directly, with no caching. The host's `systemd-resolved` *is* caching (~48% hit on host-only traffic) but listens on loopback only (`127.0.0.53`), so containers can't use it. With the build's parallelism (and the larger spot instances), the aggregate DNS rate blows past 1024 pps.

## Fix

Route container DNS through the host's caching `systemd-resolved`:

- Expose its stub on the instance's **primary private IP** (derived from `ip route get`, no IMDS dependency) via `DNSStubListenerExtra`. That's the one address reachable from the devbox container *and* nested dind (unlike the docker0 gateway).
- Point containers at it with `docker run --dns <priv_ip>`. Non-loopback nameservers propagate through the nested dockerd, so dind inherits it.

Repeat lookups become cache hits and never reach the throttled resolver.

## Safety

This can only help, never break resolution: if the IP can't be derived, `systemd-resolved` isn't active, or the stub doesn't come up on the IP (5×0.5s health check via `ss`), `priv_ip` is cleared and `--dns` is omitted, leaving DNS exactly as today.

No counter instrumentation included; we're treating link-local throttling as the known cause. (PR #379's `linklocal_allowance_exceeded` logging can confirm before/after if desired.)

## Validation

- `bash -n` on `ci3/bootstrap_ec2`; rendered+`bash -n` the injected host-script block; verified the `ip route get` parse and the `${priv_ip:+--dns …}` expansion locally.
- Full validation is the PR's own CI run: a build instance that resolves through the cache and (ideally) flat `linklocal_allowance_exceeded`.
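
For anyone who wants the before/after confirmation mentioned above, here is a minimal sketch of reading the counter. It is not part of this commit: the helper name `linklocal_drops` is hypothetical, and the sample text only imitates the `name: value` layout that `ethtool -S` prints.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not in the commit): sum every
# linklocal_allowance_exceeded counter found in `ethtool -S`-style
# text read from stdin. Prints 0 when the counter is absent.
linklocal_drops() {
  awk -F: '/linklocal_allowance_exceeded/ { gsub(/[ \t]/, "", $2); sum += $2 }
           END { print sum + 0 }'
}

# Canned sample so the sketch runs without a NIC; the values are made up.
sample='NIC statistics:
     bw_in_allowance_exceeded: 0
     linklocal_allowance_exceeded: 42'

printf '%s\n' "$sample" | linklocal_drops   # prints 42
```

On a real build instance the input would come from something like `ethtool -S ens5 | linklocal_drops`, where `ens5` is an assumed interface name. Sampling it before and after a build and comparing the two numbers shows whether the counter stayed flat.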
2 parents 872362d + 1e44a97 commit 8dec6b2

1 file changed

Lines changed: 31 additions & 1 deletion

ci3/bootstrap_ec2

@@ -259,6 +259,36 @@ sudo systemctl mask --now apt-daily.timer apt-daily-upgrade.timer apt-daily.serv
 sudo sysctl fs.inotify.max_user_watches=1048576 &>/dev/null
 sudo sysctl fs.inotify.max_user_instances=1048576 &>/dev/null
 
+# DNS caching. CI's massively parallel jobs resolve the same handful of hosts (S3,
+# Docker Hub, npm, cargo, github) thousands of times. By default every lookup — from
+# the devbox container and nested docker-in-docker alike — goes straight to the VPC
+# resolver, which AWS caps at ~1024 pps per ENI for link-local services; over that,
+# packets are silently dropped and surface as "could not resolve host".
+# Route container lookups through the host's (caching) systemd-resolved instead, by
+# exposing its stub on the instance's primary private IP — the one address reachable
+# from both the devbox container and nested dind — and pointing containers at it via
+# --dns on docker run below. priv_ip is cleared (and --dns omitted, leaving DNS
+# unchanged) if any step fails, so this can only help, never break resolution.
+priv_ip=\$(ip -4 route get 169.254.169.253 2>/dev/null | awk '{for(i=1;i<=NF;i++) if(\$i=="src"){print \$(i+1); exit}}' || true)
+if [ -n "\$priv_ip" ] && systemctl is-active --quiet systemd-resolved \
+  && echo "DNSStubListenerExtra=\$priv_ip" | sudo tee -a /etc/systemd/resolved.conf >/dev/null \
+  && sudo systemctl restart systemd-resolved; then
+  # Only route containers to the cache once it's actually listening on priv_ip.
+  for _ in 1 2 3 4 5; do
+    sudo ss -lnu "sport = :53" 2>/dev/null | grep -qF "\$priv_ip:53" && { dns_ready=1; break; }
+    sleep 0.5
+  done
+  if [ "\${dns_ready:-0}" = 1 ]; then
+    echo "HOST: DNS cache active on \$priv_ip (systemd-resolved)."
+  else
+    echo "HOST: DNS cache failed to bind \$priv_ip; using default resolver."
+    priv_ip=
+  fi
+else
+  echo "HOST: DNS cache not configured; using default resolver."
+  priv_ip=
+fi
+
 # Pin host processes to top CPU cores to keep benchmark cores clean.
 # CPU layout: physical cores 0..N/2-1, hyperthreads N/2..N-1.
 # OS gets top 8 physical cores + their hyperthread siblings.
@@ -309,7 +339,7 @@ start_build() {
   local_uid=\$(id -u)
   local_gid=\$(id -g)
 
-  docker run --privileged --rm \${docker_args:-} \
+  docker run --privileged --rm \${docker_args:-} \${priv_ip:+--dns \$priv_ip} \
     --name aztec_build \
     --hostname $docker_hostname \
     -v bootstrap_ci_local_docker:/var/lib/docker \
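
The two pieces the commit message says were verified locally, the `ip route get` parse and the `:+` expansion, can be tried in isolation. The sketch below is an illustration rather than the author's test: `route_src` and `dns_flag` are hypothetical wrappers, the sample route line and address are made up, and the `\$` escapes in the diff are dropped because this runs as a plain script instead of the injected host-script block.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the awk parse from the diff: print the
# token that follows "src" in `ip -4 route get` output read from stdin.
route_src() {
  awk '{for (i = 1; i <= NF; i++) if ($i == "src") {print $(i + 1); exit}}'
}

# Hypothetical wrapper around the ${priv_ip:+...} expansion: print
# "--dns <ip>" when an IP is given, and an empty line otherwise.
dns_flag() {
  local priv_ip=${1:-}
  printf '%s\n' "${priv_ip:+--dns $priv_ip}"
}

# Made-up sample resembling `ip -4 route get 169.254.169.253` output.
sample='169.254.169.253 dev ens5 src 172.31.5.9 uid 1000'

ip_found=$(printf '%s\n' "$sample" | route_src)
echo "$ip_found"          # prints 172.31.5.9
dns_flag "$ip_found"      # prints --dns 172.31.5.9
dns_flag ""               # prints an empty line: the flag is omitted
```

Because `${var:+word}` expands to nothing when `var` is unset or empty, clearing `priv_ip` on any failure leaves the `docker run` command line the same as before the change.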
