Mirror public images to internal registry #313

Open
pawelchcki wants to merge 25 commits into master from update_images

Conversation

@pawelchcki

@pawelchcki pawelchcki commented Feb 27, 2026

Collaborator

Pull all public Docker images from registry.ddbuild.io mirrors. Adds bin/mirror_images.py to manage the list, lock digests, lint, and sync. Removes CircleCI.

  • example/ keeps public refs (works outside DD infra)
  • MIRROR_REGISTRY="" in GHA for public pulls
  • Auto-cancel pipelines on new push

- Add mirror_images.py for lock file generation, lint checking, and mirroring
- Add mirror_images.lock.yaml with resolved digests for all images
- Use MIRROR_REGISTRY ARG for flexible image sourcing in Dockerfiles
- Add GitLab CI jobs for linting and mirroring images
- Auto-cancel previous pipelines on new push
- Add AGENTS.md with codebase overview for AI agents
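
For readers unfamiliar with the `MIRROR_REGISTRY` idea, here is a minimal sketch of the prefixing rule the description implies: a non-empty registry is prepended to the public reference, and an empty value leaves the public reference untouched. The function name `resolve_image` is hypothetical, and the assumption that mirrored images keep their public name under the mirror prefix is mine, not something this PR states.

```python
def resolve_image(public_ref: str, mirror_registry: str) -> str:
    """Return the reference to pull for a public image.

    An empty mirror_registry means "pull straight from the public registry",
    which is how MIRROR_REGISTRY="" is described for GitHub Actions.
    Assumption: the mirror keeps the public name under its own prefix.
    """
    if not mirror_registry:
        return public_ref
    # Strip trailing slashes so the join never produces "//".
    return f"{mirror_registry.rstrip('/')}/{public_ref}"


MIRROR = "registry.ddbuild.io/ci/nginx-datadog/mirror"
print(resolve_image("alpine:3.20.3", MIRROR))  # mirrored reference
print(resolve_image("alpine:3.20.3", ""))      # public reference, unchanged
```

The same switch is what lets `example/` keep working outside Datadog infrastructure: nothing there needs to know the mirror exists.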
@pawelchcki pawelchcki force-pushed the update_images branch 2 times, most recently from aa7c235 to 07d3131 Compare March 6, 2026 22:23
@pawelchcki pawelchcki force-pushed the update_images branch 2 times, most recently from 7defe6c to 00532b0 Compare March 6, 2026 22:33
@pawelchcki pawelchcki marked this pull request as ready for review March 6, 2026 22:35
@pawelchcki pawelchcki requested a review from a team as a code owner March 6, 2026 22:35
@pawelchcki pawelchcki requested review from cataphract and removed request for a team March 6, 2026 22:35
Copilot AI review requested due to automatic review settings March 10, 2026 15:20
- Migrate all Docker image references to use registry.ddbuild.io mirrors
- Factorize and simplify registry usage across CI and test configs
- Fix OpenResty detection for mirrored base images
- Ensure examples are usable outside Datadog internal infrastructure
- Add build-step retries in run.py for transient failures
- Increase npm fetch-retries in Dockerfiles
- Remove CircleCI configuration

Copilot AI left a comment


Pull request overview

This PR migrates all public Docker image references (Docker Hub, ghcr.io, registry.k8s.io) in CI configs, Dockerfiles, and compose files to use an internal registry.ddbuild.io/ci/nginx-datadog/mirror/ registry. It also introduces tooling to manage the mirror lifecycle and adds CI automation for linting and mirroring.

Changes:

  • Adds bin/mirror_images.py CLI tool with lint, add, lock, relock, and mirror subcommands; mirror_images.yaml source config; and mirror_images.lock.yaml lock file with digests for 101 images.
  • Replaces all public image references across CI configs (.gitlab/build-and-test-*.yml, .gitlab/common.yml), Dockerfiles (build_env/, test/, example/, injection/), and compose files with mirror-prefixed equivalents.
  • Adds GitLab CI mirror-images stage with lint and mirror jobs, auto-cancellation on new commits, and an AGENTS.md codebase overview.
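
To make the `lint` idea concrete, the following is a rough sketch of a check that flags image references that bypass the internal registry. It is not the logic of `bin/mirror_images.py`: the regular expression, the `find_unmirrored` name, and the rule "anything outside `registry.ddbuild.io/` is external" are all assumptions chosen for illustration, and real Dockerfile syntax (`--platform` flags, stage aliases, `scratch`) would need extra handling.

```python
import re

# Assumption: every acceptable reference lives under this internal host.
INTERNAL_PREFIX = "registry.ddbuild.io/"

# Matches "FROM <ref>" in Dockerfiles and "image: <ref>" in YAML files.
IMAGE_REF = re.compile(r"""^\s*(?:FROM|image:)\s+["']?([^\s"']+)""", re.IGNORECASE)


def find_unmirrored(lines):
    """Yield (line_number, ref) for image refs that bypass the internal registry."""
    for number, line in enumerate(lines, start=1):
        match = IMAGE_REF.match(line)
        if not match:
            continue
        ref = match.group(1)
        # Refs built from variables cannot be judged statically, so skip them.
        if "$" in ref or ref.startswith(INTERNAL_PREFIX):
            continue
        yield number, ref


sample = [
    "FROM alpine:3.19",
    "FROM ${MIRROR_REGISTRY}alpine:3.20.3",
    "image: registry.ddbuild.io/images/docker:27.3.1",
]
for number, ref in find_unmirrored(sample):
    print(f"line {number}: {ref} is not mirrored")
```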

Reviewed changes

Copilot reviewed 20 out of 21 changed files in this pull request and generated 4 comments.

Show a summary per file
| File | Description |
| --- | --- |
| bin/mirror_images.py | New CLI tool for managing mirrored image lifecycle |
| mirror_images.yaml | Source-of-truth list of all images to mirror |
| mirror_images.lock.yaml | Auto-generated lock file with resolved digests (missing alpine:3.20.3) |
| .gitlab/mirror.yml | New CI stage with lint and mirror jobs |
| .gitlab-ci.yml | Adds mirror-images stage, auto-cancel, and interruptible default |
| .gitlab/build-and-test-fast.yml | Replaces all public image refs with mirror refs |
| .gitlab/build-and-test-all.yml | Replaces all public image refs with mirror refs |
| .gitlab/common.yml | Replaces ingress-nginx and openresty image refs |
| build_env/Dockerfile | Introduces MIRROR_REGISTRY ARG; uses mirrored uv and alpine |
| Makefile | Passes MIRROR_REGISTRY build-arg conditionally |
| .github/workflows/system-tests.yml | Sets MIRROR_REGISTRY="" for GHA to use public images |
| test/Dockerfile | Replaces Python and uv image refs with mirrors |
| test/services/client/Dockerfile | Replaces alpine/curl-http3 with mirror ref |
| injection/ingress-nginx/docker-compose.yaml | Replaces testagent and nginx refs with mirrors |
| example/tracing/docker-compose.yml | Replaces datadog/agent with mirror ref |
| example/openresty/docker-compose.yml | Replaces datadog/agent with mirror ref |
| example/tracing/services/client/Dockerfile | Replaces alpine:3.19 with mirror ref |
| example/openresty/services/client/Dockerfile | Replaces alpine:3.19 with mirror ref |
| example/ingress-nginx/test-application.yaml | Replaces ealen/echo-server with mirror ref |
| example/ingress-nginx/helm/values.yaml | Updates image path but leaves conflicting registry: docker.io |
| AGENTS.md | New file with codebase overview for AI agents |


Comment thread example/ingress-nginx/helm/values.yaml Outdated
Comment thread mirror_images.yaml Outdated
Comment thread .gitlab/mirror.yml
Comment thread .gitlab/mirror.yml
@cataphract

cataphract commented Mar 10, 2026

Contributor

Wouldn't it be simpler to use our internal mirror that automatically proxies and caches the images? (486234852809.dkr.ecr.us-east-1.amazonaws.com and 669783387624.dkr.ecr.us-east-1.amazonaws.com, I believe.) Its disadvantage is that you have no control over cache invalidation, but since you want to resolve to individual digests through locking, that doesn't really matter. This would avoid all the boilerplate needed to mirror manually.

@pawelchcki

Collaborator Author

> Wouldn't it be simpler to use our internal mirror that automatically proxies and caches the images (486234852809.dkr.ecr.us-east-1.amazonaws.com 669783387624.dkr.ecr.us-east-1.amazonaws.com I believe)

Having explicit mirrors also makes us a little more resilient than a pull-through cache in case an image is pulled from the upstream repository. Also, it's better to use the official registry.ddbuild.io rather than one of the random ECR repositories.

@codecov-commenter

codecov-commenter commented Mar 10, 2026


Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 68.85%. Comparing base (b9e382b) to head (b176db1).

Additional details and impacted files

Impacted file tree graph

@@           Coverage Diff           @@
##           master     #313   +/-   ##
=======================================
  Coverage   68.85%   68.85%           
=======================================
  Files          56       56           
  Lines        7471     7471           
  Branches     1058     1058           
=======================================
  Hits         5144     5144           
  Misses       1820     1820           
  Partials      507      507           

- Add new images: nginx 1.29.6, openresty 1.29.2.1, ingress-nginx v1.13.8/v1.14.4/v1.15.0, alpine 3.20.3
- Extend mirror lint to scan GitLab CI matrix variables for undeclared images
- Move datadog/agent entry to end of mirror_images.yaml
- Update dd-trace-cpp submodule
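
As an illustration of what scanning GitLab CI matrix variables could involve, here is a small sketch that expands a `parallel:matrix`-style structure into concrete image references and reports the ones missing from the declared list. The function names, the `${BASE_IMAGE}:${TAG}` template, and the assumption that every matrix value is a list of strings are hypothetical and not taken from the script in this PR.

```python
from itertools import product
from string import Template


def expand_matrix_images(template, matrix):
    """Expand a GitLab-style matrix into the set of image refs it can produce.

    `matrix` is a list of dicts mapping variable names to lists of values
    (an assumption; GitLab also accepts bare strings as values).
    """
    images = set()
    for entry in matrix:
        keys = list(entry)
        # Each combination of variable values corresponds to one CI job.
        for values in product(*(entry[key] for key in keys)):
            images.add(Template(template).safe_substitute(dict(zip(keys, values))))
    return images


def undeclared(images, declared):
    """Return the sorted refs used by CI but absent from the declared set."""
    return sorted(images - declared)


matrix = [{"BASE_IMAGE": ["nginx"], "TAG": ["1.29.6", "1.29.6-alpine"]}]
used = expand_matrix_images("${BASE_IMAGE}:${TAG}", matrix)
print(undeclared(used, declared={"nginx:1.29.6"}))
```

`string.Template` is used here because it handles both `$VAR` and `${VAR}` without one variable name accidentally matching the prefix of another.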
Replace bare `except Exception: continue` with specific `(OSError,
yaml.YAMLError)` catches and log warnings to stderr, preventing silent
failures in the lint command.
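
The pattern described in this commit can be sketched as follows. To keep the example standard-library only, it parses JSON instead of YAML, so `json.JSONDecodeError` stands in for `yaml.YAMLError`; the `load_configs` name is hypothetical and this is not the code from the script.

```python
import json
import sys
from pathlib import Path


def load_configs(paths):
    """Parse each file, warning on stderr instead of silently skipping."""
    loaded = {}
    for path in paths:
        try:
            loaded[str(path)] = json.loads(Path(path).read_text())
        except (OSError, json.JSONDecodeError) as exc:
            # Narrow catch: unreadable file or malformed content only.
            # Any other exception is a real bug and is allowed to propagate.
            print(f"warning: skipping {path}: {exc}", file=sys.stderr)
    return loaded
```

The point of the narrow tuple is that a typo inside the loop body (say, an `AttributeError`) now fails loudly instead of being swallowed by `except Exception: continue`.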
- Lint now skips example/ directory (examples use public image references)
- Remove example-only images from mirror_images.yaml
- Use cwd as PROJECT_DIR so the script works from other repos
- Fall back to script-relative mirror_images.yaml if not found in cwd
- Add comment explaining GitLab CI auto_cancel workflow
- Condense AGENTS.md
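
The cwd-first lookup with a script-relative fallback might look roughly like this. It is a sketch only: `find_mirror_yaml` takes both directories as parameters so it can be exercised in isolation, and which directory counts as "script-relative" (the script's own folder or its parent) is an assumption left to the caller.

```python
from pathlib import Path

CONFIG_NAME = "mirror_images.yaml"


def find_mirror_yaml(cwd: Path, script_root: Path) -> Path:
    """Return the config from cwd if present, else from the script's directory."""
    for base in (cwd, script_root):
        candidate = base / CONFIG_NAME
        if candidate.is_file():
            return candidate
    raise FileNotFoundError(
        f"{CONFIG_NAME} not found in {cwd} or {script_root}"
    )
```

A caller would typically pass `Path.cwd()` and something derived from `Path(__file__).resolve()`.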
- Move print override after imports with explanatory comment
- Wrap YAML path discovery in _find_mirror_yaml() to avoid leaked locals
- Rename ambiguous variables: lock→progress_lock, _check→_resolve_tag/_check_if_mirrored, d→tag_digest, ref→digest_ref
- Extract _current_lock_entries(), _extract_matrix_combos(), _expand_matrix_images() helpers
- Consolidate _is_external() conditions and check_digest_exists() tool branches
- Use collections.deque for BFS traversal, list comprehension for _version_sort_key
- Move results dict mutation inside progress_lock for thread safety
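
The thread-safety change in the last bullet can be illustrated with a short sketch. The names `check_all` and `worker` are hypothetical and the `check` callable stands in for whatever per-image work the script does; only the idea of mutating the shared dict while holding `progress_lock` comes from the commit message.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def check_all(refs, check, max_workers=8):
    """Run check(ref) for every ref in parallel and collect the results."""
    results = {}
    progress_lock = threading.Lock()

    def worker(ref):
        outcome = check(ref)  # slow work happens outside the lock
        with progress_lock:
            # The dict update and the progress line share one critical
            # section, so the printed counter always matches the dict.
            results[ref] = outcome
            print(f"[{len(results)}/{len(refs)}] {ref}")

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Consuming the iterator re-raises any exception from a worker.
        list(pool.map(worker, refs))
    return results
```

Keeping the slow call outside the lock means the lock only serializes bookkeeping, not the checks themselves.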
- Revert dd-trace-cpp to origin/master commit (f8c3913)
- Mark mirror_images.lock.yaml as linguist-generated
@pawelchcki pawelchcki changed the title Mirror public test images to internal registry Use mirrored images + mirror_images.py tooling Mar 13, 2026
@pawelchcki pawelchcki changed the title Use mirrored images + mirror_images.py tooling Mirror public images to internal registry Mar 13, 2026
Resolve AGENTS.md conflict by combining both sections.
@xlamorlette-datadog

xlamorlette-datadog commented Apr 3, 2026

Contributor

Since this PR was last updated three weeks ago and has conflicts and failing tests, I am converting it to draft.

@xlamorlette-datadog xlamorlette-datadog marked this pull request as draft April 3, 2026 08:48
Add the new base/test images introduced by PR #378
(nginx 1.30.1, 1.31.0) along with the others already present in the PR's
build/test matrices: nginx 1.28.3, 1.29.7, 1.29.8, 1.30.0 (+ alpine
variants), amazonlinux 2023.11.20260427.1, ingress-nginx v1.13.9 / v1.14.5
/ v1.15.1, and openresty 1.29.2.3-alpine.
Fetch the canonical mirror_images.py from dd-repo-tools via
`uv run --no-config --script <pinned-url>` instead of carrying our own
991-line copy. The wrapper exports MIRROR_DEST_REGISTRY to preserve this
repo's `registry.ddbuild.io/ci/nginx-datadog/mirror` prefix, since the
upstream script defaults to `libdatadog-build/mirror`.
The new revision auto-detects the destination registry from the git
origin remote, so the wrapper no longer needs to export
MIRROR_DEST_REGISTRY to preserve this repo's `nginx-datadog/mirror`
prefix.
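
How the upstream script detects the destination registry is not shown in this PR. Purely as an illustration, one way to derive a `registry.ddbuild.io/ci/<repo>/mirror` prefix from an origin remote URL is sketched below; the function name, the regular expression, and the `example-org` remote are all hypothetical.

```python
import re


def dest_registry_from_remote(remote_url: str) -> str:
    """Derive a mirror prefix from a git remote URL (illustrative only)."""
    # Capture the last path segment, ignoring an optional ".git" suffix.
    match = re.search(r"[/:]([^/:]+?)(?:\.git)?/?$", remote_url.strip())
    if not match:
        raise ValueError(f"cannot parse repository name from {remote_url!r}")
    return f"registry.ddbuild.io/ci/{match.group(1)}/mirror"


print(dest_registry_from_remote("git@github.com:example-org/nginx-datadog.git"))
print(dest_registry_from_remote("https://github.com/example-org/nginx-datadog"))
```

In a real wrapper the URL would come from something like `git remote get-url origin`; it is passed in as a string here so the parsing can be checked without a repository.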
@datadog-prod-us1-5

datadog-prod-us1-5 Bot commented May 26, 2026


Pipelines


⚠️ Warnings

🚦 7 Pipeline jobs failed

System Tests | main / End-to-end #1 / nginx 1   View in Datadog   GitHub Actions

🛟 This job is unlikely to succeed on retry. Please review your pipeline configuration. Action 'nick-fields/retry' not allowed due to repository ownership policies.

System Tests | main / End-to-end #2 / nginx 2   View in Datadog   GitHub Actions

🛟 This job is unlikely to succeed on retry. Please review your pipeline configuration. Error: Action 'nick-fields/retry@ad984534de44a9489a53aefd81eb77f87c70dc60' is not allowed due to enterprise restrictions on actions.

System Tests | main / End-to-end #3 / nginx 3   View in Datadog   GitHub Actions

🛟 This job is unlikely to succeed on retry. Please review your pipeline configuration. The action 'nick-fields/retry@ad984534de44a9489a53aefd81eb77f87c70dc60' is not allowed due to enterprise restrictions on actions.

View all 7 failed jobs.

ℹ️ Info

🎯 Code Coverage (details)
Patch Coverage: 100.00%
Overall Coverage: 67.62%


This comment will be updated automatically if new data arrives.
🔗 Commit SHA: 0279422 | Docs | Datadog PR Page | Give us feedback!

Bump mirror_images.py to fb4f39a, which supports the new yaml schema:
top-level `images:` list plus an `ignore:` block. Move per-repo path
exclusions (example/, dd-trace-cpp/, libddwaf/) out of the script and
into the config, where they belong.
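
To show how path exclusions in the config could be applied, here is a sketch that treats the parsed config as a plain dict. The exact shape of the `ignore:` block is not visible in this PR, so the flat list of directory prefixes below is an assumption, as is the `is_ignored` helper.

```python
from pathlib import PurePosixPath

# Assumed shape of the parsed mirror_images.yaml (illustrative values).
config = {
    "images": ["nginx:1.29.6", "alpine:3.20.3"],
    "ignore": ["example/", "dd-trace-cpp/", "libddwaf/"],
}


def is_ignored(path: str, ignore: list[str]) -> bool:
    """Return True if path lies under any of the ignored directory prefixes."""
    parts = PurePosixPath(path).parts
    for prefix in ignore:
        prefix_parts = PurePosixPath(prefix).parts
        # Compare whole path components so "example/" does not
        # accidentally match "example_extra/".
        if parts[: len(prefix_parts)] == prefix_parts:
            return True
    return False


print(is_ignored("example/tracing/docker-compose.yml", config["ignore"]))
print(is_ignored("test/Dockerfile", config["ignore"]))
```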
@pawelchcki pawelchcki marked this pull request as ready for review May 27, 2026 09:14
@pawelchcki pawelchcki requested a review from a team as a code owner May 27, 2026 09:14

@xlamorlette-datadog xlamorlette-datadog left a comment

Contributor

Thanks! Please address comments and resolve conflicts (and update images list).

Comment thread bin/mirror_images

Contributor

Suffix the file name with .sh (as other bash script files in the folder).
(Then update references in .gitlab/mirror.yml.)

Comment thread bin/mirror_images
#!/usr/bin/env bash
set -euo pipefail
exec uv run --no-config --script \
https://binaries.ddbuild.io/dd-repo-tools/default/ca/fb4f39a542e4dd42b646c300b539c7a9f4201531/mirror_images.py \

Contributor

Please add a comment explaining when and how to update this URL.
(Also, it may be interesting to add a link to the source file.)

Comment thread .gitlab/mirror.yml
# and crane binary for copying images.
image: registry.ddbuild.io/images/docker:27.3.1
tags: ["arch:amd64"]
needs: []

Contributor

Should lint-mirror-images be a pre-requisite?

Comment thread AGENTS.md

`make format` to format, `make lint` to check. Rebuild formatter image after editing `Dockerfile.formatter` with `make build-formatter-image`.

Public Docker images must use `registry.ddbuild.io` mirrors. Managed by `bin/mirror_images` (`add`, `lock`, `relock`, `mirror`, `lint`). Config: `mirror_images.yaml` / `mirror_images.lock.yaml`. In `build_env/Dockerfile`, `MIRROR_REGISTRY` ARG is `""` in GHA to use public registries.

Contributor

"GitHub Actions" rather than "GHA"?

Comment thread mirror_images.yaml
# - "nginx:1.27.5":
# target: "registry.ddbuild.io/ci/nginx-datadog/custom/nginx:1.27.5"
#
# Default destination registry: registry.ddbuild.io/ci/nginx-datadog/mirror

Contributor

Where is this configured? Please add a comment.

Comment thread mirror_images.yaml
# - "nginx:1.27.5":
# target: "registry.ddbuild.io/ci/nginx-datadog/custom/nginx:1.27.5"
#
# Default destination registry: registry.ddbuild.io/ci/nginx-datadog/mirror

Contributor

Please explain how to update mirror_images.lock.yaml when one modifies this file.

Also, is there a command to run to update all the files that reference images? (I'm guessing this from the presence of the ignore section at the end.)

Comment thread .gitlab-ci.yml
- build-and-test-fast
- build-all
- test-all
- mirror-images

Contributor

I think the build and test stages need the mirror-images stage to be run before them, don't they?
How much time should this take?…


5 participants