Skip to content

OCPBUGS-90710: Destroy all private cluster backend service resources#10643

Open
barbacbd wants to merge 2 commits into
openshift:release-4.17from
barbacbd:OCPBUGS-90710-release-4.17
Open

OCPBUGS-90710: Destroy all private cluster backend service resources#10643
barbacbd wants to merge 2 commits into
openshift:release-4.17from
barbacbd:OCPBUGS-90710-release-4.17

Conversation

@barbacbd

@barbacbd barbacbd commented Jun 22, 2026

Copy link
Copy Markdown
Contributor

When a private cluster was created, the destroy process would not find
and destroy backend services. This was happening specifically when no
backends were created/found.

Enhanced backend service discovery to check firewall rules for cluster
ID tags when backends are empty. This allows proper identification of
orphaned backend services that should be deleted during cluster destroy.

Also added global health check discovery to ensure all related resources
are properly cleaned up.

Summary by CodeRabbit

  • Bug Fixes
    • Improved cloud controller backend service discovery to more accurately match and select the correct regional backend services and avoid missed associations.
    • Enhanced instance discovery so returned instance details now include label information for more reliable identification and tracking.
  • Documentation
    • (No user-facing documentation changes included in this update.)

This fixes a regression where finding instances by label was not working
correctly. The labels field must be requested when querying for instance
information.
@openshift-ci-robot openshift-ci-robot added jira/severity-important Referenced Jira bug's severity is important for the branch this PR is targeting. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. jira/invalid-bug Indicates that a referenced Jira bug is invalid for the branch this PR is targeting. labels Jun 22, 2026
@openshift-ci-robot

Copy link
Copy Markdown
Contributor

@barbacbd: This pull request references Jira Issue OCPBUGS-43779, which is invalid:

  • expected the bug to be open, but it isn't
  • expected the bug to target either version "4.17." or "openshift-4.17.", but it targets "4.19.0" instead
  • expected the bug to be in one of the following states: NEW, ASSIGNED, POST, but it is Closed (Done-Errata) instead
  • expected Jira Issue OCPBUGS-43779 to depend on a bug targeting a version in 4.18.0, 4.18.z and in one of the following states: VERIFIED, RELEASE PENDING, CLOSED (ERRATA), CLOSED (CURRENT RELEASE), CLOSED (DONE), CLOSED (DONE-ERRATA), but no dependents were found

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

The bug has been updated to refer to the pull request using the external bug tracker.

Details

In response to this:

When a private cluster was created, the destroy process would not find
and destroy backend services. This was happening specifically when no
backends were created/found.

Enhanced backend service discovery to check firewall rules for cluster
ID tags when backends are empty. This allows proper identification of
orphaned backend services that should be deleted during cluster destroy.

Also added global health check discovery to ensure all related resources
are properly cleaned up.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@coderabbitai

coderabbitai Bot commented Jun 22, 2026

Copy link
Copy Markdown

Walkthrough

listCloudControllerBackendServices is rewritten to directly paginate regional backend services with inline name-regex filtering and conditional validation: firewall rule tag matching when no backends are present, or instance-group URL set membership when backends exist. Imports are updated to support string matching and explicit API field selection. Instance query field selection is extended to request labels alongside machineType.

Changes

GCP Destroy Logic Updates

Layer / File(s) Summary
Backend service discovery refactoring
pkg/destroy/gcp/cloudcontroller.go
Imports are updated to add strings and googleapi for field selection and string matching. listCloudControllerBackendServices is rewritten to directly list regional backend services, construct an instance-group URL set, fetch all firewall rules once for validation, and apply conditional filtering: firewall rule tag matching when Backends is empty, or instance-group URL membership when Backends is non-empty. Matched services produce cloudResource entries with backend_services quota.
Instance label field selection
pkg/destroy/gcp/instance.go
The Fields string passed to listInstancesWithFilter for both byName and byLabel paths is updated to request labels in addition to machineType.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 14 | ❌ 1

❌ Failed checks (1 inconclusive)

Check name Status Explanation Resolution
Test Structure And Quality ❓ Inconclusive The custom check requests review of Ginkgo test code, but the PR contains no Ginkgo tests. The codebase uses standard Go testing (testing.T) in pkg/destroy/gcp/, and no tests were added for modifie... Clarify whether tests using standard Go testing framework should be reviewed instead, or whether Ginkgo tests were specifically required for this PR.
✅ Passed checks (14 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly summarizes the main change: fixing backend service resource destruction for private clusters in OCPBUGS-90710.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Stable And Deterministic Test Names ✅ Passed This PR contains no Ginkgo tests. The codebase uses Go's standard testing package only. No test files were added or modified by this PR.
Microshift Test Compatibility ✅ Passed This PR modifies only GCP resource destruction logic (cloudcontroller.go, instance.go) with no new Ginkgo e2e tests added, making the MicroShift test compatibility check not applicable.
Single Node Openshift (Sno) Test Compatibility ✅ Passed No Ginkgo e2e tests are added in this PR. Changes are limited to internal GCP cluster destruction/cleanup helper functions, not test code.
Topology-Aware Scheduling Compatibility ✅ Passed This PR modifies GCP resource cleanup utility functions in the installer's destroy module, not deployment manifests or operator code. No scheduling constraints or workload definitions are introduced.
Ote Binary Stdout Contract ✅ Passed The modified files (pkg/destroy/gcp/cloudcontroller.go and pkg/destroy/gcp/instance.go) contain only methods on ClusterUninstaller, not process-level code. No stdout writes detected (no fmt.Print/P...
Ipv6 And Disconnected Network Test Compatibility ✅ Passed No Ginkgo e2e tests are added in this PR. Changes are to pkg/destroy/gcp package code (cloudcontroller.go and instance.go), which are regular GCP resource cleanup utilities, not test files.
No-Weak-Crypto ✅ Passed No weak cryptographic algorithms, custom crypto implementations, or non-constant-time secret comparisons detected. Changes use only basic string operations on non-sensitive GCP resource metadata.
Container-Privileges ✅ Passed PR contains only Go source code files for the OpenShift installer's destroy process; no Kubernetes manifests or container configurations present that could violate privilege-related security policies.
No-Sensitive-Data-In-Logs ✅ Passed No sensitive data (passwords, tokens, API keys, PII, session IDs) is exposed in logging statements. All logged values are generic resource identifiers and operational status information appropriate...

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 golangci-lint (2.12.2)

Error: can't load config: unsupported version of the configuration: "" See https://golangci-lint.run/docs/product/migration-guide for migration instructions
The command is terminated due to an error: can't load config: unsupported version of the configuration: "" See https://golangci-lint.run/docs/product/migration-guide for migration instructions


Comment @coderabbitai help to get the list of available commands.

@openshift-ci openshift-ci Bot requested review from bfournie and patrickdillon June 22, 2026 18:58
@openshift-ci

openshift-ci Bot commented Jun 22, 2026

Copy link
Copy Markdown
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign patrickdillon for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@barbacbd barbacbd changed the title OCPBUGS-43779: Destroy all private cluster backend service resources OCPBUGS-90710: Destroy all private cluster backend service resources Jun 22, 2026
@openshift-ci-robot openshift-ci-robot removed the jira/severity-important Referenced Jira bug's severity is important for the branch this PR is targeting. label Jun 22, 2026
@openshift-ci-robot

Copy link
Copy Markdown
Contributor

@barbacbd: This pull request references Jira Issue OCPBUGS-90710, which is invalid:

  • expected the bug to target the "4.17.z" version, but no target version was set
  • release note text must be set and not match the template OR release note type must be set to "Release Note Not Required". For more information you can reference the OpenShift Bug Process.
  • expected Jira Issue OCPBUGS-90710 to depend on a bug targeting a version in 4.18.0, 4.18.z and in one of the following states: VERIFIED, RELEASE PENDING, CLOSED (ERRATA), CLOSED (CURRENT RELEASE), CLOSED (DONE), CLOSED (DONE-ERRATA), but no dependents were found

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

The bug has been updated to refer to the pull request using the external bug tracker.

Details

In response to this:

When a private cluster was created, the destroy process would not find
and destroy backend services. This was happening specifically when no
backends were created/found.

Enhanced backend service discovery to check firewall rules for cluster
ID tags when backends are empty. This allows proper identification of
orphaned backend services that should be deleted during cluster destroy.

Also added global health check discovery to ensure all related resources
are properly cleaned up.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@barbacbd

Copy link
Copy Markdown
Contributor Author

/jira refresh

@openshift-ci-robot

Copy link
Copy Markdown
Contributor

@barbacbd: This pull request references Jira Issue OCPBUGS-90710, which is invalid:

  • expected the bug to target the "4.17.z" version, but no target version was set
  • release note text must be set and not match the template OR release note type must be set to "Release Note Not Required". For more information you can reference the OpenShift Bug Process.
  • expected Jira Issue OCPBUGS-90710 to depend on a bug targeting a version in 4.18.0, 4.18.z and in one of the following states: VERIFIED, RELEASE PENDING, CLOSED (ERRATA), CLOSED (CURRENT RELEASE), CLOSED (DONE), CLOSED (DONE-ERRATA), but no dependents were found

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

Details

In response to this:

/jira refresh

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@barbacbd

Copy link
Copy Markdown
Contributor Author

/jira refresh

@openshift-ci-robot

Copy link
Copy Markdown
Contributor

@barbacbd: This pull request references Jira Issue OCPBUGS-90710, which is invalid:

  • release note text must be set and not match the template OR release note type must be set to "Release Note Not Required". For more information you can reference the OpenShift Bug Process.
  • expected Jira Issue OCPBUGS-90710 to depend on a bug targeting a version in 4.18.0, 4.18.z and in one of the following states: VERIFIED, RELEASE PENDING, CLOSED (ERRATA), CLOSED (CURRENT RELEASE), CLOSED (DONE), CLOSED (DONE-ERRATA), but no dependents were found

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

Details

In response to this:

/jira refresh

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@barbacbd

Copy link
Copy Markdown
Contributor Author

/jira refresh

@openshift-ci-robot

Copy link
Copy Markdown
Contributor

@barbacbd: This pull request references Jira Issue OCPBUGS-90710, which is invalid:

  • release note text must be set and not match the template OR release note type must be set to "Release Note Not Required". For more information you can reference the OpenShift Bug Process.

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

Details

In response to this:

/jira refresh

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@barbacbd

Copy link
Copy Markdown
Contributor Author

/jira refresh

@openshift-ci-robot openshift-ci-robot added jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. and removed jira/invalid-bug Indicates that a referenced Jira bug is invalid for the branch this PR is targeting. labels Jun 22, 2026
@openshift-ci-robot

Copy link
Copy Markdown
Contributor

@barbacbd: This pull request references Jira Issue OCPBUGS-90710, which is valid.

7 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.17.z) matches configured target version for branch (4.17.z)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, POST)
  • release note text is set and does not match the template
  • dependent bug Jira Issue OCPBUGS-48611 is in the state Closed (Done-Errata), which is one of the valid states (VERIFIED, RELEASE PENDING, CLOSED (ERRATA), CLOSED (CURRENT RELEASE), CLOSED (DONE), CLOSED (DONE-ERRATA))
  • dependent Jira Issue OCPBUGS-48611 targets the "4.18.0" version, which is one of the valid target versions: 4.18.0, 4.18.z
  • bug has dependents
Details

In response to this:

/jira refresh

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@barbacbd

Copy link
Copy Markdown
Contributor Author

/cc @tthvo

@openshift-ci openshift-ci Bot requested a review from tthvo June 22, 2026 19:11
@barbacbd barbacbd force-pushed the OCPBUGS-90710-release-4.17 branch from 4fe72c0 to 062b048 Compare June 22, 2026 19:15
@tthvo

tthvo commented Jun 22, 2026

Copy link
Copy Markdown
Member

/retest

@tthvo

tthvo commented Jun 23, 2026

Copy link
Copy Markdown
Member

@coderabbitai full review

@coderabbitai

coderabbitai Bot commented Jun 23, 2026

Copy link
Copy Markdown
✅ Action performed

Full review finished.

@tthvo

tthvo commented Jun 23, 2026

Copy link
Copy Markdown
Member

/retest

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@pkg/destroy/gcp/cloudcontroller.go`:
- Around line 199-204: Fix the nil pointer dereference in the
listHealthChecksWithFilter call by replacing the nil parameter with
o.healthCheckList, which is the global HealthChecks API object that the function
expects to use. Additionally, either add a case handler for the
"globalhealthcheck" typeName in the destroy loop to handle deletion of these
items when they are inserted via insertPendingItems, or change the typeName
string from "globalhealthcheck" to "healthcheck" to use the existing global
health check handler.
- Around line 43-66: The firewall list is being fetched separately for every
backend service with no backends (N+1 problem) and only returns the first page
due to using .Do() instead of handling pagination. Hoist the firewall list fetch
outside the loop that iterates through backend services so it is called once per
cloudcontroller operation. Replace the Firewalls.List().Do() call with
Firewalls.List().Pages() to properly handle pagination across all firewall
rules. Reuse the paginated firewall list across all backend service checks
instead of refetching for each item, and return the error instead of continuing
to avoid silently skipping cleanup when transient failures occur.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 1ad8130e-3e8d-45be-a9df-78b865153b2d

📥 Commits

Reviewing files that changed from the base of the PR and between b102c3a and 062b048.

📒 Files selected for processing (2)
  • pkg/destroy/gcp/cloudcontroller.go
  • pkg/destroy/gcp/instance.go

Comment thread pkg/destroy/gcp/cloudcontroller.go
Comment thread pkg/destroy/gcp/cloudcontroller.go Outdated
Comment thread pkg/destroy/gcp/cloudcontroller.go Outdated
Comment thread pkg/destroy/gcp/cloudcontroller.go Outdated
Comment thread pkg/destroy/gcp/cloudcontroller.go Outdated
Comment thread pkg/destroy/gcp/cloudcontroller.go Outdated
Comment thread pkg/destroy/gcp/cloudcontroller.go Outdated
Comment thread pkg/destroy/gcp/cloudcontroller.go Outdated
When a private cluster was created, the destroy process would not find
and destroy backend services. This was happening specifically when no
backends were created/found.

Enhanced backend service discovery to check firewall rules for cluster
ID tags when backends are empty. This allows proper identification of
orphaned backend services that should be deleted during cluster destroy.

Also added global health check discovery to ensure all related resources
are properly cleaned up.
@barbacbd barbacbd force-pushed the OCPBUGS-90710-release-4.17 branch from ecf4ac4 to 631c430 Compare June 23, 2026 13:14
@barbacbd

Copy link
Copy Markdown
Contributor Author

/retest-required

@openshift-ci

openshift-ci Bot commented Jun 23, 2026

Copy link
Copy Markdown
Contributor

@barbacbd: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/e2e-gcp-ovn 631c430 link true /test e2e-gcp-ovn
ci/prow/e2e-gcp-ovn-xpn 631c430 link false /test e2e-gcp-ovn-xpn
ci/prow/e2e-gcp-ovn-byo-vpc 631c430 link false /test e2e-gcp-ovn-byo-vpc
ci/prow/e2e-aws-ovn 631c430 link true /test e2e-aws-ovn
ci/prow/images 631c430 link true /test images
ci/prow/e2e-gcp-secureboot 631c430 link false /test e2e-gcp-secureboot

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@tthvo

tthvo commented Jun 23, 2026

Copy link
Copy Markdown
Member

@coderabbitai review

@coderabbitai

coderabbitai Bot commented Jun 23, 2026

Copy link
Copy Markdown
✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@tthvo

tthvo commented Jun 23, 2026

Copy link
Copy Markdown
Member

/retest

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@pkg/destroy/gcp/cloudcontroller.go`:
- Around line 28-102: The function listCloudControllerBackendServices lacks unit
test coverage, which violates project coding guidelines. Add unit tests that
cover the three main filtering scenarios: (1) backend services with no backends
that are validated by checking if firewall targetTags contain the cluster ID,
(2) backend services with backends where all backend Groups match instance-group
URLs from the urls set, and (3) backend services that should be filtered out due
to mismatched backends or missing firewall tag matches. Ensure your tests mock
the compute service Firewalls.List and RegionBackendServices.List calls
appropriately.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: b5f2b3c2-9312-4f66-bc11-1f66162a4027

📥 Commits

Reviewing files that changed from the base of the PR and between 062b048 and 631c430.

📒 Files selected for processing (1)
  • pkg/destroy/gcp/cloudcontroller.go

Comment thread pkg/destroy/gcp/cloudcontroller.go
@tthvo

tthvo commented Jun 23, 2026

Copy link
Copy Markdown
Member

/payload-job periodic-ci-openshift-release-main-ci-4.17-e2e-gcp-ovn

@openshift-ci

openshift-ci Bot commented Jun 23, 2026

Copy link
Copy Markdown
Contributor

@tthvo: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

  • periodic-ci-openshift-release-main-ci-4.17-e2e-gcp-ovn

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/55e14d40-6f33-11f1-83b6-eb89d48c8156-0

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants