Fix cooldown breaking Docker updates when registry API calls fail #14149

Merged
robaiken merged 8 commits into main from copilot/fix-docker-cooldown-issue
May 20, 2026

Contributor

Copilot AI commented Feb 10, 2026

What are you trying to accomplish?

When cooldown is enabled for Docker dependencies, the update checker makes additional API calls (digest + blob HEAD) to determine tag publication dates. These calls can fail with 404, 401, or other errors depending on the registry/image. The unhandled errors crash the entire update process, preventing any Docker updates from being proposed.
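The cooldown decision itself can be pictured with a minimal sketch, assuming the publication time has already been obtained from the registry. The helper name `in_cooldown?`, its parameters, and the sample dates are hypothetical and chosen for illustration only; this is not the Dependabot implementation.

```ruby
COOLDOWN_SECONDS_PER_DAY = 86_400

# Hypothetical helper: true when a tag was published less than
# `cooldown_days` days before `now`, so it should not be proposed yet.
def in_cooldown?(published_at, cooldown_days, now: Time.now)
  (now - published_at) < cooldown_days * COOLDOWN_SECONDS_PER_DAY
end

reference_time = Time.utc(2026, 2, 10, 12, 0, 0)
fresh_release = Time.utc(2026, 2, 10, 6, 0, 0) # published 6 hours earlier
old_release = Time.utc(2025, 1, 1)             # published over a year earlier

puts in_cooldown?(fresh_release, 1, now: reference_time) # true
puts in_cooldown?(old_release, 1, now: reference_time)   # false
```

The check only works when a publication time is available, which is why a failed registry lookup needs an explicit fallback.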

Three bugs fixed:

  • get_tag_publication_details had no error handling — registry errors (NotFound, auth failures, rate limiting) propagated up and killed the update
  • apply_cooldown used next on missing publication details — if details couldn't be fetched for any tag, the method returned [], silently blocking all updates
  • publication_detail used incorrect T.cast: T.cast(nil, PackageRelease) fails at Sorbet runtime when get_tag_publication_details returns nil

# Before: 404 on blob → exception → update crashes
client.dohead "v2/#{repo}/blobs/#{digest}"  # raises DockerRegistry2::NotFound

# After: caught, logged, cooldown skipped for that tag
rescue *transient_docker_errors,
       DockerRegistry2::RegistryAuthenticationException,
       RestClient::Forbidden,
       RestClient::TooManyRequests => e
  Dependabot.logger.warn("Failed to fetch publication details for #{repo}:#{tag.name}, skipping cooldown: ...")
  nil

The apply_cooldown fix changes next to return [tag] when details are nil — treating unknown publication dates as "not in cooldown" rather than skipping the tag entirely.
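The combined effect of the two changes can be sketched in isolation. The following is a simplified illustration under stated assumptions, not the code from update_checker.rb: `FakeRegistryError`, `SketchDetails`, `fetch_publication_details`, `apply_cooldown_sketch`, and the `lookup` callable are all hypothetical stand-ins for the registry client and the real methods.

```ruby
# Hypothetical stand-in for the registry errors rescued in the real code.
class FakeRegistryError < StandardError; end

# Hypothetical container for a tag's publication time.
SketchDetails = Struct.new(:released_at)

# Returns publication details, or nil when the lookup fails.
# `lookup` is a callable standing in for the digest + HEAD requests.
def fetch_publication_details(tag, lookup)
  SketchDetails.new(lookup.call(tag))
rescue FakeRegistryError => e
  warn("Failed to fetch publication details for #{tag}, skipping cooldown: #{e.message}")
  nil
end

# Walks candidate tags (assumed newest first) and returns the first usable one.
def apply_cooldown_sketch(tags, lookup, cooldown_days:, now: Time.now)
  tags.each do |tag|
    details = fetch_publication_details(tag, lookup)
    # Unknown publication date: treat the tag as not in cooldown.
    return [tag] if details.nil?
    # Known date: accept the tag only once the cooldown window has passed.
    return [tag] if (now - details.released_at) >= cooldown_days * 86_400
  end
  []
end

always_failing = ->(_tag) { raise FakeRegistryError, "404 Not Found" }
p apply_cooldown_sketch(%w(3.21.0 3.20.1), always_failing, cooldown_days: 1)
```

With a lookup that always fails, the sketch returns the newest candidate instead of an empty list, which mirrors the behavior change described above.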

Anything you want to highlight for special attention from reviewers?

The apply_cooldown behavior change is the most significant: previously, tags with nil publication details were skipped (could result in no update). Now they're returned immediately (cooldown is bypassed). This is the correct tradeoff — cooldown is an optimization that should degrade gracefully, not block updates entirely.

How will you know you've accomplished your goal?

  • 7 new tests added: 5 unit tests covering each error type in get_tag_publication_details, plus 2 integration tests validating end-to-end behavior (cooldown + blob 404, cooldown + auth failure)
  • All 121 existing update_checker tests continue to pass
  • The reproduction cases from the issue (alpine with cooldown, axllent/mailpit with private registry) would no longer crash

Checklist

  • I have run the complete test suite to ensure all tests and linters pass.
  • I have thoroughly tested my code changes to ensure they work as expected, including adding additional tests for new functionality.
  • I have written clear and descriptive commit messages.
  • I have provided a detailed description of the changes in the pull request, including the problem it addresses, how it fixes the problem, and any relevant details about the implementation.
  • I have ensured that the code is well-documented and easy to understand.

Warning

Firewall rules blocked me from connecting to one or more addresses (expand for details)

I tried to connect to the following addresses, but was blocked by firewall rules:

  • tuf-repo-cdn.sigstore.dev
    • Triggering command: /opt/bin/cosign cosign verify --certificate-oidc-issuer REDACTED --certificate-identity-regexp REDACTED ghcr.io/regclient/regctl:v0.11.1 (dns block)

Original prompt

This section details on the original issue you should resolve

<issue_title>Cooldown breaks breaks docker updates</issue_title>
<issue_description>### Is there an existing issue for this?

  • I have searched the existing issues

Package ecosystem

Docker

Package manager version

No response

Language version

No response

Manifest location and content before the Dependabot update

FROM alpine:3.20.0

dependabot.yml content

version: 2
updates:
  - package-ecosystem: "docker"
    directory: "/"
    schedule:
      interval: "weekly"
      day: "sunday"
      time: "12:00"
      timezone: "Europe/Helsinki"
    cooldown:
      default-days: 1
    groups:
      all:
        patterns:
          - "*"

Updated dependency

No response

What you expected to see, versus what you actually saw

Over one year old alpine version should be updated but isn't. Removing cooldown makes it update. If private registry is used same happens but with actual errors.

Native package manager behavior

No response

Images of the diff or a link to the PR, issue, or logs

Example repo with test runs: https://github.com/heppu/dependabot-bug

Smallest manifest that reproduces the issue

All in repo above.</issue_description>

<agent_instructions>Seems like we are getting issues when fetching the blobs or tag lists for certain Docker images. Error 1 occurs when we try to get the tags; error 2 appears to be a 404 when we try to get a blob:
2026/02/09 13:37:49 [588] 404 https://registry.hub.docker.com:443/v2/axllent/mailpit/blobs/sha256:f58cd09655f4a1fac184e4e4d5de497b659f1dd3897c04e8d2a07031984af708

error 1: https://github.com/heppu/dependabot-bug/actions/runs/21796308221/job/62884360984

updater | 2026/02/08 10:02:50 INFO <job_1237101327> Checking if alpine 3.20.0 needs updating
  proxy | 2026/02/08 10:02:50 [012] GET https://registry.hub.docker.com:443/v2/library/alpine/tags/list
  proxy | 2026/02/08 10:02:50 [012] 401 https://registry.hub.docker.com:443/v2/library/alpine/tags/list
2026/02/08 10:02:50 [012] Remote response: {"errors":[{"code":"UNAUTHORIZED","message":"authentication required","detail":[{"Type":"repository","Class":"","Name":"library/alpine","Action":"pull"}]}]}

error 2: https://github.com/biodiversitycellatlas/bca-website/actions/runs/21827259651/job/62975376658

updater | 2026/02/09 13:37:50 ERROR <job_1238150414> Error processing axllent/mailpit (DockerRegistry2::NotFound)
2026/02/09 13:37:50 ERROR <job_1238150414> 404 Not Found
2026/02/09 13:37:50 ERROR <job_1238150414> /home/dependabot/dependabot-updater/vendor/ruby/3.4.0/gems/docker_registry2-1.18.2/lib/registry/registry.rb:374:in 'DockerRegistry2::Registry#do_bearer_req'
2026/02/09 13:37:50 ERROR <job_1238150414> /home/dependabot/dependabot-updater/vendor/ruby/3.4.0/gems/docker_registry2-1.18.2/lib/registry/registry.rb:308:in 'DockerRegistry2::Registry#doreq'
2026/02/09 13:37:50 ERROR <job_1238150414> /home/dependabot/dependabot-updater/vendor/ruby/3.4.0/gems/docker_registry2-1.18.2/lib/registry/registry.rb:43:in '
</agent_instructions>

## Comments on the Issue (you are @copilot in this section)

<comments>
</comments>


Copilot AI and others added 2 commits February 10, 2026 16:11
…efully

When cooldown is enabled for Docker dependencies, the update checker
makes additional API calls (digest + blob HEAD) to determine tag
publication dates. These calls can fail with 404, 401, or other errors
for certain images/registries.

Changes:
- Add error handling in get_tag_publication_details to catch registry
  errors (NotFound, auth, rate limiting) and return nil instead of
  crashing the entire update process
- Fix apply_cooldown to treat tags with unknown publication dates as
  "not in cooldown" instead of skipping them (which could block all
  updates when the registry doesn't support the required API calls)
- Remove incorrect T.cast in publication_detail that would fail when
  get_tag_publication_details returns nil

Co-authored-by: robaiken <6567647+robaiken@users.noreply.github.com>
Co-authored-by: robaiken <6567647+robaiken@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Fix cooldown breaks preventing docker updates" to "Fix cooldown breaking Docker updates when registry API calls fail" Feb 10, 2026
Copilot AI requested a review from robaiken February 10, 2026 16:15
@robaiken
Contributor

@copilot fix conflict

…oldown-issue

# Conflicts:
#	docker/spec/dependabot/docker/update_checker_spec.rb

Co-authored-by: robaiken <6567647+robaiken@users.noreply.github.com>
Contributor Author

Copilot AI commented Apr 24, 2026

@copilot fix conflict

Resolved the merge conflict in docker/spec/dependabot/docker/update_checker_spec.rb — kept both our cooldown error-handling tests and the new tests from main (architecture suffix, timestamp validation, cooldown_period? with semver). All 167 tests pass. Merge commit: 9d81be6.

@robaiken
Contributor

@copilot could you refresh this PR now that #14691 has landed? A few things to do:

  • Rebase on main so the diff is clean against the merged manifest-list fix.
  • Confirm the rescue list is still right. #14691 (fix(docker): use manifests endpoint for manifest-list digests) fixed the multi-arch/OCI case at the source, so DockerRegistry2::NotFound is no longer expected from that path. The current rescue list here doesn't include NotFound, which looks correct; just double-check that nothing in the specs still relies on simulating a 404 for the manifest-list scenario.
  • Update the PR description to be explicit about the apply_cooldown behavior change: when publication details can't be determined, the tag is now used (cooldown bypassed) rather than skipped. That's a reasonable default, but it's a policy decision worth calling out for reviewers.
  • Update the title to reflect the post-#14691 scope, something like "Skip cooldown gracefully when registry API calls fail (auth, 403, 429, transient)".

Copilot AI and others added 2 commits May 19, 2026 14:51
Co-authored-by: robaiken <6567647+robaiken@users.noreply.github.com>
…text

Co-authored-by: robaiken <6567647+robaiken@users.noreply.github.com>
Contributor Author

Copilot AI commented May 19, 2026

@copilot could you refresh this PR now that #14691 has landed? A few things to do:

  • Rebase on main so the diff is clean against the merged manifest-list fix.
    ...

Done. Merged main (which includes #14691) into the branch so the diff is clean.

Rescue list verification: DockerRegistry2::NotFound is not listed explicitly in the rescue — it's included via *transient_docker_errors. This is still correct because a 404 can still occur for reasons unrelated to the manifest-list scenario (deleted images, race conditions between digest lookup and HEAD request). The #14691 fix prevents the common case (manifest-list digests hitting /blobs/ instead of /manifests/), but the rescue provides defense-in-depth for edge cases.
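How a splatted list in a rescue clause can cover an error class that is never named explicitly can be shown with a small sketch. It assumes that a helper like transient_docker_errors returns a plain array of exception classes; every class and method name below is a hypothetical stand-in, not the registry client's API.

```ruby
# Hypothetical error classes standing in for the registry client's errors.
class SketchNotFound < StandardError; end
class SketchTimeout < StandardError; end
class SketchAuthError < StandardError; end

# Assumed shape of a helper like transient_docker_errors: an array of classes.
def sketch_transient_errors
  [SketchNotFound, SketchTimeout]
end

# Runs the block and reports which listed error class was rescued, if any.
def classify_failure
  yield
  :ok
rescue *sketch_transient_errors, SketchAuthError => e
  # The splat expands the array, so SketchNotFound is rescued here
  # even though it does not appear by name in the rescue clause.
  e.class
end

p classify_failure { raise SketchNotFound, "404 Not Found" }
p classify_failure { raise SketchAuthError, "401 Unauthorized" }
p classify_failure { :no_error }
```

Errors outside the combined list still propagate, so the rescue stays limited to the failure modes it was written for.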

No specs simulate a 404 specifically for the manifest-list scenario — the existing test mocks dohead raising NotFound generically (any endpoint), which remains valid.

apply_cooldown behavior change: When publication details can't be determined (nil from get_tag_publication_details), the tag is now used immediately (cooldown bypassed) rather than skipped. This treats cooldown as a best-effort optimization that degrades gracefully.

I've updated the test name from "blob request returns 404" to "HEAD request returns 404" to reflect that the endpoint is now dynamically selected per #14691.

@github-actions github-actions Bot added the L: docker Docker containers label May 20, 2026
@robaiken
Contributor

@copilot lint Offenses:

docker/spec/dependabot/docker/update_checker_spec.rb:1535:9: C: [Correctable] RSpec/ReceiveMessages: Use receive_messages instead of multiple stubs on lines [1536, 1538].
allow(mock_client).to receive(:tags).and_return("tags" => %w(17.04 17.10))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
docker/spec/dependabot/docker/update_checker_spec.rb:1536:9: C: [Correctable] RSpec/ReceiveMessages: Use receive_messages instead of multiple stubs on lines [1535, 1538].
allow(mock_client).to receive(:digest).and_return("sha256:abc123")
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
docker/spec/dependabot/docker/update_checker_spec.rb:1538:9: C: [Correctable] RSpec/ReceiveMessages: Use receive_messages instead of multiple stubs on lines [1535, 1536].
allow(mock_client).to receive(:manifest_digest) ...
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
docker/spec/dependabot/docker/update_checker_spec.rb:1559:9: C: [Correctable] RSpec/ReceiveMessages: Use receive_messages instead of multiple stubs on lines [1562].
allow(mock_client).to receive(:tags).and_return("tags" => %w(17.04 17.10))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
docker/spec/dependabot/docker/update_checker_spec.rb:1562:9: C: [Correctable] RSpec/ReceiveMessages: Use receive_messages instead of multiple stubs on lines [1559].
allow(mock_client).to receive(:manifest_digest) ...
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Co-authored-by: robaiken <6567647+robaiken@users.noreply.github.com>
Contributor Author

Copilot AI commented May 20, 2026

@copilot lint Offenses:

docker/spec/dependabot/docker/update_checker_spec.rb:1535:9: C: [Correctable] RSpec/ReceiveMessages: Use receive_messages instead of multiple stubs on lines [1536, 1538].
...

Fixed — replaced individual receive stubs with receive_messages in both cooldown test blocks. Lint now passes clean.

@robaiken robaiken marked this pull request as ready for review May 20, 2026 12:30
@robaiken robaiken requested a review from a team as a code owner May 20, 2026 12:30
Copilot AI review requested due to automatic review settings May 20, 2026 12:30
Contributor

Copilot AI left a comment

Pull request overview

This PR fixes Docker update failures when cooldown logic performs extra registry API calls (digest + HEAD) that can return errors (e.g., 404/401/403/429). Instead of letting these errors crash the update or silently filter out all candidates, the update checker now degrades gracefully by bypassing cooldown when publication details can’t be determined.

Changes:

  • Add error handling in get_tag_publication_details to catch common registry/auth/rate-limit failures, log a warning, and return nil.
  • Change apply_cooldown to treat missing/unknown publication timestamps as “skip cooldown and allow the tag” rather than excluding it (which could previously block all updates).
  • Add/extend specs to cover the new error-handling and “still returns latest version” behavior.
Show a summary per file

| File | Description |
| --- | --- |
| docker/lib/dependabot/docker/update_checker.rb | Makes cooldown publication-date fetching resilient to registry failures and ensures cooldown can’t block updates when dates are unavailable. |
| docker/spec/dependabot/docker/update_checker_spec.rb | Adds test coverage for registry errors (404/auth/403/429) and verifies updates still proceed when cooldown publication lookups fail. |

Copilot's findings

  • Files reviewed: 2/2 changed files
  • Comments generated: 0

@robaiken robaiken merged commit 6463993 into main May 20, 2026
63 checks passed
@robaiken robaiken deleted the copilot/fix-docker-cooldown-issue branch May 20, 2026 14:43

@albertoblue87-netizen albertoblue87-netizen left a comment

Ok


Development

Successfully merging this pull request may close these issues.

Cooldown breaks breaks docker updates

5 participants