🐛 OCPBUGS-62942: Fix ClusterExtension deletion when BoxcutterRuntime is enabled#2299
tmshort wants to merge 3 commits into operator-framework:main
Add detailed specification document explaining the root cause and solution for OCPBUGS-62942, where ClusterExtensions cannot be deleted when the BoxcutterRuntime feature gate is enabled and catalogs are unavailable.

The specification documents:
- Problem statement and root cause analysis
- Solution design with code flow analysis
- Implementation steps and testing plan
- Risks and mitigations

🤖 Generated with [Claude Code](https://claude.com/claude-code) via /jira:solve OCPBUGS-62942 origin

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Todd Short <tshort@redhat.com>
Move the deletion timestamp check to occur immediately after finalizer error handling in the reconcile loop. This ensures that resolution, unpacking, and installation are skipped entirely when a ClusterExtension is being deleted, regardless of whether finalizers have been updated.

This fixes OCPBUGS-62942 where ClusterExtensions could not be deleted after enabling the BoxcutterRuntime feature gate, because the controller would attempt to resolve bundles even during deletion. If the original catalog was no longer available (deleted or cache cleared), this would fail with "cache for catalog not found" error and prevent deletion.

The fix ensures that when a ClusterExtension has a deletion timestamp, the reconcile loop returns early without attempting any resolution or installation operations.

Fixes: OCPBUGS-62942

🤖 Generated with [Claude Code](https://claude.com/claude-code) via /jira:solve OCPBUGS-62942 origin

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Todd Short <tshort@redhat.com>
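The early return this commit message describes can be sketched in isolation. This is a minimal illustration and not the controller's code: the `extension` type, the `resolveFunc` signature, the `markDeleted` helper, and the status strings are all hypothetical stand-ins, and only the "cache for catalog not found" error text comes from the commit message.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// extension is a hypothetical stand-in for a ClusterExtension; only the
// deletion timestamp matters for this sketch.
type extension struct {
	Name              string
	DeletionTimestamp *time.Time
}

// resolveFunc is a hypothetical stand-in for bundle resolution against a catalog.
type resolveFunc func(ext *extension) (string, error)

// markDeleted sets the deletion timestamp, mimicking what the API server does
// when an object with finalizers is deleted.
func markDeleted(ext *extension) {
	now := time.Now()
	ext.DeletionTimestamp = &now
}

// reconcile returns early when the object is being deleted, so resolve is
// never called for such objects.
func reconcile(ext *extension, resolve resolveFunc) (string, error) {
	if ext.DeletionTimestamp != nil {
		return "deleting: skipped resolution", nil
	}
	bundle, err := resolve(ext)
	if err != nil {
		return "", fmt.Errorf("resolving %q: %w", ext.Name, err)
	}
	return "installed " + bundle, nil
}

// unavailableCatalog simulates a catalog whose cache is gone.
func unavailableCatalog(*extension) (string, error) {
	return "", errors.New("cache for catalog not found")
}

func main() {
	live := &extension{Name: "example"}
	deleting := &extension{Name: "example"}
	markDeleted(deleting)

	// Without a deletion timestamp, the resolver error surfaces.
	_, err := reconcile(live, unavailableCatalog)
	fmt.Println("live object error:", err)

	// With a deletion timestamp, the resolver is never consulted.
	status, err := reconcile(deleting, unavailableCatalog)
	fmt.Println("deleting object:", status, err)
}
```

In this sketch the guard is the first statement, so nothing that depends on the catalog can run for an object that is being deleted.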
Add TestClusterExtensionDeletionWithUnavailableCatalog to verify that when a ClusterExtension is being deleted, the reconcile loop does not attempt resolution even if the catalog is unavailable (which would cause a "cache for catalog not found" error).

This test ensures that both the RevisionStatesGetter and Resolver are not called during deletion, preventing errors that would block deletion when catalogs are unavailable.

The test creates a ClusterExtension with a finalizer, deletes it, and verifies that reconciliation succeeds without calling the resolver or revision states getter, both of which are configured to return errors.

🤖 Generated with [Claude Code](https://claude.com/claude-code) via /jira:solve OCPBUGS-62942 origin

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Todd Short <tshort@redhat.com>
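The testing approach described here (dependencies configured to fail, plus an assertion that they were never called) can be sketched with plain Go and no controller machinery. Everything below is a hypothetical reduction: `failingStub`, `reconcileUnderTest`, and `checkDeletionSkipsResolution` are invented names, not the types or helpers used by the real test.

```go
package main

import (
	"errors"
	"fmt"
)

// failingStub is a hypothetical test double: it counts calls and always
// fails, the way a resolver would when the catalog cache is missing.
type failingStub struct {
	name  string
	calls int
}

func (s *failingStub) call() error {
	s.calls++
	return errors.New(s.name + ": cache for catalog not found")
}

// reconcileUnderTest is a hypothetical reduction of the reconcile loop: it
// returns before touching either dependency when the object is being deleted.
func reconcileUnderTest(beingDeleted bool, revisionStates, resolver *failingStub) error {
	if beingDeleted {
		return nil
	}
	if err := revisionStates.call(); err != nil {
		return err
	}
	return resolver.call()
}

// checkDeletionSkipsResolution mirrors the shape of the described test:
// reconcile a deleted object and require success with zero calls to the stubs.
func checkDeletionSkipsResolution() error {
	revisionStates := &failingStub{name: "revision states getter"}
	resolver := &failingStub{name: "resolver"}

	if err := reconcileUnderTest(true, revisionStates, resolver); err != nil {
		return fmt.Errorf("reconcile failed during deletion: %w", err)
	}
	if revisionStates.calls != 0 || resolver.calls != 0 {
		return fmt.Errorf("unexpected calls: revisionStates=%d resolver=%d",
			revisionStates.calls, resolver.calls)
	}
	return nil
}

func main() {
	if err := checkDeletionSkipsResolution(); err != nil {
		fmt.Println("FAIL:", err)
		return
	}
	fmt.Println("PASS")
}
```

Making the stubs fail as well as count calls gives two independent signals: a regression would show up both as a returned error and as a non-zero call count.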
Closing this PR - the fix doesn't actually change the logic. Just reordering two early-return checks that both check for the same condition (deletion) doesn't fix anything. Need to investigate the actual root cause more carefully.
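The reasoning in this comment can be checked with a small model. The sketch below assumes, as the comment states, that both early-return checks fire on the same condition and that no work happens between them; the `guard` type and the guard names are hypothetical and do not reflect the controller's actual code.

```go
package main

import "fmt"

// guard is a hypothetical early-return check: stop reports whether reconcile
// should return before reaching resolution.
type guard struct {
	name string
	stop func(beingDeleted bool) bool
}

// reachesResolution runs the guards in the given order and reports whether
// control gets past all of them.
func reachesResolution(beingDeleted bool, guards []guard) bool {
	for _, g := range guards {
		if g.stop(beingDeleted) {
			return false
		}
	}
	return true
}

// orderMatters compares both orderings for every input and reports whether
// any input produces a different outcome.
func orderMatters() bool {
	onDeletion := func(beingDeleted bool) bool { return beingDeleted }
	a := guard{name: "finalizer handling", stop: onDeletion}
	b := guard{name: "deletion timestamp check", stop: onDeletion}

	for _, beingDeleted := range []bool{false, true} {
		ab := reachesResolution(beingDeleted, []guard{a, b})
		ba := reachesResolution(beingDeleted, []guard{b, a})
		if ab != ba {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println("order changes outcome:", orderMatters()) // prints "order changes outcome: false"
}
```

Under those assumptions the two orderings are indistinguishable, which is consistent with the conclusion that the failure must originate somewhere other than the order of these checks.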
Codecov Report
✅ All modified and coverable lines are covered by tests.

Coverage Diff

| Metric | main | #2299 | +/- |
|---|---|---|---|
| Coverage | 74.32% | 74.24% | -0.09% |
| Files | 90 | 90 | |
| Lines | 7008 | 7008 | |
| Hits | 5209 | 5203 | -6 |
| Misses | 1392 | 1395 | +3 |
| Partials | 407 | 410 | +3 |
Summary
Fixes OCPBUGS-62942 where ClusterExtensions cannot be deleted after enabling the BoxcutterRuntime feature gate when the original catalog is no longer available.
Problem
When the BoxcutterRuntime feature gate is enabled after ClusterExtensions have been installed, attempting to delete those ClusterExtensions fails with a "cache for catalog not found" error.
Root Cause
The controller's reconcile loop attempted to resolve bundles even when a ClusterExtension was being deleted. If the original catalog was no longer available (deleted or cache cleared), this resolution would fail and prevent deletion from completing.
Solution
Move the deletion timestamp check in the `reconcile()` function to occur immediately after finalizer error handling. This ensures that resolution, unpacking, and installation are skipped entirely when a ClusterExtension is being deleted, regardless of whether finalizers have been updated.
Changes
- `clusterextension_controller.go`: check for deletion timestamp before attempting resolution
- `TestClusterExtensionDeletionWithUnavailableCatalog`: test to verify the fix
- `contrib/spec-OCPBUGS-62942.md`: specification document
Testing
- `make fmt`
Related Issues
- OCPBUGS-62942

🤖 Generated with Claude Code via `/jira:solve OCPBUGS-62942 origin`

Co-Authored-By: Claude <noreply@anthropic.com>