Skip to content

ESO-237: Fix reconciliation blocked when managed labels are externally modified#153

Merged
openshift-merge-bot[bot] merged 6 commits into
openshift:mainfrom
chiragkyal:fix-recon
Jun 18, 2026
Merged

ESO-237: Fix reconciliation blocked when managed labels are externally modified#153
openshift-merge-bot[bot] merged 6 commits into
openshift:mainfrom
chiragkyal:fix-recon

Conversation

@chiragkyal

@chiragkyal chiragkyal commented Jun 16, 2026

Copy link
Copy Markdown
Member

Description

When the app=external-secrets label on a managed resource is changed
externally, the resource disappears from the label-filtered informer
cache. Subsequent reconciliation attempts to create the resource,
receives AlreadyExists, and gets stuck in a permanent error loop.

Changes

  1. Add a createWithFallback helper that handles AlreadyExists by falling back to an uncached UpdateWithRetry, restoring the managed labels and annotations.
  2. Update the controller predicate that check both ObjectOld and ObjectNew, so label-removal events are not silently dropped. Simplify mapFunc accordingly.

Summary by CodeRabbit

  • New Features

    • Added automatic restoration of managed labels on Kubernetes resources (ServiceAccounts, Roles, Deployments, NetworkPolicies) after external removal
    • Enhanced metadata-only reconciliation for ConfigMaps while preserving managed data
  • Bug Fixes

    • Improved handling of resource creation conflicts and cache inconsistencies
    • Fixed metadata restoration during already-existing resource scenarios
  • Tests

    • Added comprehensive test coverage for managed label restoration across multiple resource types
    • Enhanced test coverage for ConfigMap metadata management and error scenarios

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Jun 16, 2026
@openshift-ci-robot

openshift-ci-robot commented Jun 16, 2026

Copy link
Copy Markdown

@chiragkyal: This pull request references ESO-237 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the bug to target the "5.0.0" version, but no target version was set.

Details

In response to this:

Description

When the app=external-secrets label on a managed resource is changed
externally, the resource disappears from the label-filtered informer
cache. Subsequent reconciliation attempts to create the resource,
receives AlreadyExists, and gets stuck in a permanent error loop.

Changes

  1. Add a createWithFallback helper that handles AlreadyExists by falling back to an uncached UpdateWithRetry, restoring the managed labels and annotations.
  2. Update the controller predicate that check both ObjectOld and ObjectNew, so label-removal events are not silently dropped. Simplify mapFunc accordingly.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@coderabbitai

coderabbitai Bot commented Jun 16, 2026

Copy link
Copy Markdown

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review

Walkthrough

Introduces a createWithFallback helper on the reconciler that handles AlreadyExists errors (caused by label-filtered informer cache inconsistencies) by performing an uncached UpdateWithRetry to restore managed labels and annotations. All resource creation paths (certificate, deployment, network policies, RBAC, secret, service accounts, services, validating webhook, and ConfigMap) are migrated to this helper. The controller's SetupWithManager predicate is updated to trigger reconciliation when the managed label is removed from a watched object.

Changes

Managed Label Restoration

Layer / File(s) Summary
createWithFallback helper implementation
pkg/controller/external_secrets/utils.go, pkg/controller/external_secrets/utils_test.go, pkg/controller/external_secrets/test_utils.go
Adds imports and implements createWithFallback: on AlreadyExists it resolves GVK, removes obsolete annotations, and calls UncachedClient.UpdateWithRetry to restore managed metadata, emitting distinct events for create vs. restore. Unit tests cover all paths including GVK fallback. test_utils.go scheme registration is updated to include client-go and cert-manager types.
SetupWithManager predicate for label removal
pkg/controller/external_secrets/controller.go
Decouples event filtering from request mapping: mapFunc unconditionally enqueues the singleton ExternalSecretsConfig, while a new predicate.Funcs guards create/delete/generic events on label presence and allows update events when either old or new object carries the managed label.
ConfigMap AlreadyExists and metadata-only patch
pkg/controller/external_secrets/configmap.go, pkg/controller/external_secrets/configmap_test.go
Adds apierrors import and reworks the not-found path (treats AlreadyExists on Create as a cache miss, patches only metadata) and the exists path (replaces full UpdateWithRetry with a conditional metadata-only patch). TestEnsureTrustedCABundleConfigMap covers create, no-op, patch, AlreadyExists regression, proxy-disabled, and all error propagation paths.
Migrate all resource creation paths to createWithFallback
pkg/controller/external_secrets/certificate.go, deployments.go, networkpolicy.go, rbacs.go, secret.go, serviceaccounts.go, services.go, validatingwebhook.go
Replaces direct r.Create + common.FromClientError wrapping + explicit "created" event emission with r.createWithFallback across all resource reconcilers (certificate, deployment, 3 network policy branches, 4 RBAC kinds, secret, service account, service, validating webhook).
Update unit test error message expectations
pkg/controller/external_secrets/certificate_test.go, networkpolicy_test.go, rbacs_test.go, secret_test.go, service_test.go, serviceaccounts_test.go, validatingwebhook_test.go
Updates wantErr strings across all resource unit tests to match the new error message format from createWithFallback (e.g., failed to create Certificate namespace/name: err instead of certificate resource name: err).
E2E tests for managed label restoration
test/e2e/e2e_test.go
Adds the "Managed Label Restoration" Ginkgo context with an AfterEach verifying ExternalSecretsConfig stays Ready and operand pods remain up, plus three It blocks that remove the managed app=external-secrets label from a ServiceAccount, Role, and Deployment and poll for operator-driven restoration.

Sequence Diagram(s)

sequenceDiagram
  rect rgba(173, 216, 230, 0.5)
    Note over SetupWithManager,predicate.Funcs: Label removal detection
    SetupWithManager->>predicate.Funcs: configure Update predicate
    Note right of predicate.Funcs: allow if old OR new object has managed label
  end
  rect rgba(144, 238, 144, 0.5)
    Note over Reconciler,UncachedClient: createWithFallback on AlreadyExists
    Reconciler->>CachedClient: Create(desired)
    CachedClient-->>Reconciler: AlreadyExists
    Reconciler->>Reconciler: resolve GVK, remove obsolete annotations
    Reconciler->>UncachedClient: UpdateWithRetry(desired with restored labels)
    UncachedClient-->>Reconciler: nil
    Reconciler->>eventRecorder: emit "restored" Reconciled event
  end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

  • openshift/external-secrets-operator#12: Establishes the initial desired-state reconciliation flows for ServiceAccount, Service, and Deployment resources that this PR extends with createWithFallback label restoration.
  • openshift/external-secrets-operator#146: Reworks NetworkPolicy reconciliation with prefixed names, proxy egress, and migration cleanup in the same networkpolicy.go file where this PR migrates creation paths to createWithFallback.

Suggested reviewers

  • swghosh
🚥 Pre-merge checks | ✅ 13 | ❌ 2

❌ Failed checks (2 warnings)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 33.33% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
Microshift Test Compatibility ⚠️ Warning New test "Managed Label Restoration" uses ExternalSecretsConfig (operator.openshift.io/v1alpha1), an OpenShift-specific API unavailable on MicroShift, without [Skipped:MicroShift], [apigroup:...] t... Add [apigroup:operator.openshift.io] tag to Context name, add [Skipped:MicroShift] label, or guard with exutil.IsMicroShiftCluster() check to skip on MicroShift.
✅ Passed checks (13 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly and specifically summarizes the main change: fixing a reconciliation issue when managed labels are externally modified.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Stable And Deterministic Test Names ✅ Passed All test titles added in this PR are stable and deterministic with no dynamic information; no pod names, timestamps, UUIDs, node names, namespaces with random suffixes, IPs, fmt.Sprintf calls, or v...
Test Structure And Quality ✅ Passed Tests exhibit good quality: Each test has single responsibility; setup/cleanup properly handled via preReq functions (units) and AfterEach (e2e); all timeouts specified; assertions include messages...
Single Node Openshift (Sno) Test Compatibility ✅ Passed The new "Managed Label Restoration" e2e test only verifies label restoration and pod readiness; it makes no multi-node assumptions and works on SNO with all pods on a single node.
Topology-Aware Scheduling Compatibility ✅ Passed PR introduces no new scheduling constraints. Changes are limited to controller logic for label restoration via createWithFallback helper and predicate updates; no deployment manifests, affinity rul...
Ote Binary Stdout Contract ✅ Passed The PR properly handles stdout writes: e2e_suite_test.go's TestE2E() function uses GinkgoWriter for output, no klog writes occur without stderr redirection, and all test code follows proper Ginkgo...
Ipv6 And Disconnected Network Test Compatibility ✅ Passed The new "Managed Label Restoration" e2e test context contains no IPv4 assumptions, hardcoded IPv4 addresses, external connectivity requirements, or IPv4-specific parsing. It uses only cluster-inter...
No-Weak-Crypto ✅ Passed No weak cryptography usage detected. The PR contains no MD5, SHA1, DES, RC4, 3DES, Blowfish, ECB mode imports or usages; no custom crypto implementations; and no non-constant-time secret/token comp...
Container-Privileges ✅ Passed PR contains no manifest changes and existing manifests have no privileged: true, hostPID/hostNetwork/hostIPC: true, SYS_ADMIN capabilities, or allowPrivilegeEscalation: true.
No-Sensitive-Data-In-Logs ✅ Passed All logging statements in the PR changes log only safe metadata: resource kind, namespace, resource name, and error messages. No passwords, tokens, API keys, PII, session IDs, credentials, or sensi...

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@pkg/controller/external_secrets/configmap.go`:
- Around line 75-79: The fallback recovery path in the desiredConfigMap update
block does not preserve external annotations from the live ConfigMap. While the
code copies Data and BinaryData from actualConfigMap and removes obsolete
annotations via RemoveObsoleteAnnotations, it fails to merge back any external
annotations that exist on the live ConfigMap before calling UpdateWithRetry.
Merge the annotations from actualConfigMap into desiredConfigMap (preserving
external annotations that aren't managed by the controller) before the
UpdateWithRetry call. Additionally, extend the regression test in
pkg/controller/external_secrets/configmap_test.go to assert that external
annotations are preserved when the AlreadyExists recovery update path is
triggered.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 2ed66a5d-9c8d-4e45-a044-2d82020f48f7

📥 Commits

Reviewing files that changed from the base of the PR and between 86b890f and 6b694a6.

📒 Files selected for processing (20)
  • pkg/controller/external_secrets/certificate.go
  • pkg/controller/external_secrets/certificate_test.go
  • pkg/controller/external_secrets/configmap.go
  • pkg/controller/external_secrets/configmap_test.go
  • pkg/controller/external_secrets/controller.go
  • pkg/controller/external_secrets/deployments.go
  • pkg/controller/external_secrets/networkpolicy.go
  • pkg/controller/external_secrets/networkpolicy_test.go
  • pkg/controller/external_secrets/rbacs.go
  • pkg/controller/external_secrets/rbacs_test.go
  • pkg/controller/external_secrets/secret.go
  • pkg/controller/external_secrets/secret_test.go
  • pkg/controller/external_secrets/service_test.go
  • pkg/controller/external_secrets/serviceaccounts.go
  • pkg/controller/external_secrets/serviceaccounts_test.go
  • pkg/controller/external_secrets/services.go
  • pkg/controller/external_secrets/utils.go
  • pkg/controller/external_secrets/utils_test.go
  • pkg/controller/external_secrets/validatingwebhook.go
  • pkg/controller/external_secrets/validatingwebhook_test.go

Comment thread pkg/controller/external_secrets/configmap.go Outdated
@codecov-commenter

codecov-commenter commented Jun 16, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 43.70370% with 76 lines in your changes missing coverage. Please review.
✅ Project coverage is 46.67%. Comparing base (86b890f) to head (1175fc1).
⚠️ Report is 4 commits behind head on main.

Files with missing lines Patch % Lines
pkg/controller/external_secrets/utils.go 55.55% 13 Missing and 7 partials ⚠️
pkg/controller/external_secrets/configmap.go 0.00% 16 Missing ⚠️
pkg/controller/external_secrets/rbacs.go 0.00% 4 Missing and 4 partials ⚠️
pkg/controller/external_secrets/secret.go 50.00% 5 Missing and 3 partials ⚠️
pkg/controller/external_secrets/networkpolicy.go 0.00% 4 Missing and 2 partials ⚠️
pkg/controller/external_secrets/test_utils.go 0.00% 5 Missing ⚠️
pkg/controller/external_secrets/controller.go 89.65% 3 Missing ⚠️
pkg/controller/external_secrets/certificate.go 0.00% 2 Missing ⚠️
pkg/controller/external_secrets/deployments.go 0.00% 1 Missing and 1 partial ⚠️
pkg/controller/external_secrets/serviceaccounts.go 0.00% 1 Missing and 1 partial ⚠️
... and 2 more
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #153      +/-   ##
==========================================
- Coverage   46.69%   46.67%   -0.02%     
==========================================
  Files          29       29              
  Lines        4420     4486      +66     
==========================================
+ Hits         2064     2094      +30     
- Misses       2054     2089      +35     
- Partials      302      303       +1     
Flag Coverage Δ
e2e 46.67% <43.70%> (-0.02%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@bharath-b-rh bharath-b-rh left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Just few suggestions.

Comment thread pkg/controller/external_secrets/certificate_test.go Outdated
Comment thread pkg/controller/external_secrets/configmap.go
Comment thread pkg/controller/external_secrets/configmap.go Outdated
Comment thread pkg/controller/external_secrets/controller.go
Signed-off-by: chiragkyal <ckyal@redhat.com>
Signed-off-by: chiragkyal <ckyal@redhat.com>

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (2)
pkg/controller/external_secrets/configmap.go (2)

72-76: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Preserve external annotations before patching metadata (AlreadyExists recovery path).

The current code patches metadata with only the desired annotations from desiredConfigMap. Since MergePatch replaces the entire annotations map (RFC 7386), any external annotations added by other controllers or users are removed. Fetch the existing ConfigMap and merge its annotations before calling patchConfigMapMetadata.

Suggested fix
 		r.log.V(1).Info("trusted CA bundle ConfigMap exists on API server but absent from label-filtered cache, patching metadata", "name", configMapName)
+		actualConfigMap := &corev1.ConfigMap{}
+		if getErr := r.UncachedClient.Get(r.ctx, client.ObjectKeyFromObject(desiredConfigMap), actualConfigMap); getErr != nil {
+			return common.FromClientError(getErr, "failed to fetch existing %s trusted CA bundle ConfigMap", configMapName)
+		}
+		mergedAnnotations := maps.Clone(actualConfigMap.GetAnnotations())
+		if mergedAnnotations == nil {
+			mergedAnnotations = map[string]string{}
+		}
+		for k, v := range desiredConfigMap.GetAnnotations() {
+			mergedAnnotations[k] = v
+		}
+		desiredConfigMap.SetAnnotations(mergedAnnotations)
 		common.RemoveObsoleteAnnotations(desiredConfigMap, resourceMetadata)
 		if patchErr := r.patchConfigMapMetadata(desiredConfigMap, r.UncachedClient); patchErr != nil {
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/controller/external_secrets/configmap.go` around lines 72 - 76, Before
calling patchConfigMapMetadata in the AlreadyExists recovery path, fetch the
existing ConfigMap from the API server to retrieve its current annotations.
Merge the existing ConfigMap's annotations with the desired annotations in
desiredConfigMap to preserve any external annotations added by other controllers
or users, then pass the merged ConfigMap to patchConfigMapMetadata. This
prevents MergePatch (RFC 7386) from replacing the entire annotations map and
losing external annotations.

87-91: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Preserve external annotations before patching metadata (exists + metadata-modified path).

Similar to the AlreadyExists path, this code replaces the entire annotations map when patching. External annotations from existingConfigMap should be merged into desiredConfigMap before patching to avoid losing metadata managed by other controllers.

Suggested fix
 	if exist && common.ObjectMetadataModified(desiredConfigMap, existingConfigMap, &resourceMetadata) {
 		r.log.V(1).Info("trusted CA bundle ConfigMap has been modified, patching metadata to desired state", "name", configMapName)
+		mergedAnnotations := maps.Clone(existingConfigMap.GetAnnotations())
+		if mergedAnnotations == nil {
+			mergedAnnotations = map[string]string{}
+		}
+		for k, v := range desiredConfigMap.GetAnnotations() {
+			mergedAnnotations[k] = v
+		}
+		desiredConfigMap.SetAnnotations(mergedAnnotations)
 		common.RemoveObsoleteAnnotations(desiredConfigMap, resourceMetadata)
 		if err := r.patchConfigMapMetadata(desiredConfigMap, r.CtrlClient); err != nil {
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/controller/external_secrets/configmap.go` around lines 87 - 91, When
patching the ConfigMap metadata in the metadata-modified path, external
annotations from the existingConfigMap are being lost. Before calling
common.RemoveObsoleteAnnotations on desiredConfigMap, merge the annotations from
existingConfigMap into desiredConfigMap to preserve annotations managed by other
controllers. This ensures that only obsolete annotations are removed while
external annotations are retained, similar to the handling in the AlreadyExists
path.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Duplicate comments:
In `@pkg/controller/external_secrets/configmap.go`:
- Around line 72-76: Before calling patchConfigMapMetadata in the AlreadyExists
recovery path, fetch the existing ConfigMap from the API server to retrieve its
current annotations. Merge the existing ConfigMap's annotations with the desired
annotations in desiredConfigMap to preserve any external annotations added by
other controllers or users, then pass the merged ConfigMap to
patchConfigMapMetadata. This prevents MergePatch (RFC 7386) from replacing the
entire annotations map and losing external annotations.
- Around line 87-91: When patching the ConfigMap metadata in the
metadata-modified path, external annotations from the existingConfigMap are
being lost. Before calling common.RemoveObsoleteAnnotations on desiredConfigMap,
merge the annotations from existingConfigMap into desiredConfigMap to preserve
annotations managed by other controllers. This ensures that only obsolete
annotations are removed while external annotations are retained, similar to the
handling in the AlreadyExists path.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 5cec64ab-709d-46c7-8a97-89df9e9e6c42

📥 Commits

Reviewing files that changed from the base of the PR and between 6b694a6 and 8c5fe48.

📒 Files selected for processing (13)
  • pkg/controller/external_secrets/certificate_test.go
  • pkg/controller/external_secrets/configmap.go
  • pkg/controller/external_secrets/configmap_test.go
  • pkg/controller/external_secrets/controller.go
  • pkg/controller/external_secrets/networkpolicy_test.go
  • pkg/controller/external_secrets/rbacs_test.go
  • pkg/controller/external_secrets/secret_test.go
  • pkg/controller/external_secrets/service_test.go
  • pkg/controller/external_secrets/serviceaccounts_test.go
  • pkg/controller/external_secrets/test_utils.go
  • pkg/controller/external_secrets/utils.go
  • pkg/controller/external_secrets/utils_test.go
  • pkg/controller/external_secrets/validatingwebhook_test.go
✅ Files skipped from review due to trivial changes (4)
  • pkg/controller/external_secrets/certificate_test.go
  • pkg/controller/external_secrets/networkpolicy_test.go
  • pkg/controller/external_secrets/rbacs_test.go
  • pkg/controller/external_secrets/secret_test.go
🚧 Files skipped from review as they are similar to previous changes (4)
  • pkg/controller/external_secrets/service_test.go
  • pkg/controller/external_secrets/utils.go
  • pkg/controller/external_secrets/utils_test.go
  • pkg/controller/external_secrets/controller.go

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@test/e2e/e2e_test.go`:
- Around line 1069-1152: Add meaningful failure messages to all assertions
within the Eventually blocks across the three restoration test cases
(ServiceAccount, Role, and Deployment). For each error assertion lacking a
message (g.Expect(err).NotTo(HaveOccurred())), add a diagnostic message like
"failed to get ServiceAccount" or "failed to get Role". For each label assertion
lacking a message (g.Expect(resource.Labels).To(HaveKeyWithValue(...))), add a
message like "operator should restore app=external-secrets on ServiceAccount
saName" or similar for Role and Deployment. Apply these messages consistently
across all three test functions to ensure CI failures are easily triageable.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 6c9b66ef-791e-4c5a-a8b5-de4d4b65bf7a

📥 Commits

Reviewing files that changed from the base of the PR and between 8c5fe48 and 2654a78.

📒 Files selected for processing (1)
  • test/e2e/e2e_test.go

Comment thread test/e2e/e2e_test.go
Comment on lines +1069 to +1152
Eventually(func(g Gomega) {
sa, err := clientset.CoreV1().ServiceAccounts(operandNamespace).Get(ctx, saName, metav1.GetOptions{})
g.Expect(err).NotTo(HaveOccurred())
g.Expect(sa.Labels).To(HaveKeyWithValue(managedLabelKey, managedLabelValue))
}, time.Minute, 5*time.Second).Should(Succeed())

By("Removing the managed label from the ServiceAccount")
Expect(retry.RetryOnConflict(retry.DefaultRetry, func() error {
sa, err := clientset.CoreV1().ServiceAccounts(operandNamespace).Get(ctx, saName, metav1.GetOptions{})
if err != nil {
return err
}
delete(sa.Labels, managedLabelKey)
_, err = clientset.CoreV1().ServiceAccounts(operandNamespace).Update(ctx, sa, metav1.UpdateOptions{})
return err
})).To(Succeed(), "should remove the managed label")

By("Waiting for operator to restore the managed label")
Eventually(func(g Gomega) {
sa, err := clientset.CoreV1().ServiceAccounts(operandNamespace).Get(ctx, saName, metav1.GetOptions{})
g.Expect(err).NotTo(HaveOccurred())
g.Expect(sa.Labels).To(HaveKeyWithValue(managedLabelKey, managedLabelValue),
"operator should restore app=external-secrets on ServiceAccount %s", saName)
}, 2*time.Minute, 5*time.Second).Should(Succeed())
})

It("should restore the app=external-secrets label on a Role after external removal", func() {
roleName := "external-secrets"

By("Verifying Role has the managed label initially")
Eventually(func(g Gomega) {
role, err := clientset.RbacV1().Roles(operandNamespace).Get(ctx, roleName, metav1.GetOptions{})
g.Expect(err).NotTo(HaveOccurred())
g.Expect(role.Labels).To(HaveKeyWithValue(managedLabelKey, managedLabelValue))
}, time.Minute, 5*time.Second).Should(Succeed())

By("Removing the managed label from the Role")
Expect(retry.RetryOnConflict(retry.DefaultRetry, func() error {
role, err := clientset.RbacV1().Roles(operandNamespace).Get(ctx, roleName, metav1.GetOptions{})
if err != nil {
return err
}
delete(role.Labels, managedLabelKey)
_, err = clientset.RbacV1().Roles(operandNamespace).Update(ctx, role, metav1.UpdateOptions{})
return err
})).To(Succeed(), "should remove the managed label")

By("Waiting for operator to restore the managed label")
Eventually(func(g Gomega) {
role, err := clientset.RbacV1().Roles(operandNamespace).Get(ctx, roleName, metav1.GetOptions{})
g.Expect(err).NotTo(HaveOccurred())
g.Expect(role.Labels).To(HaveKeyWithValue(managedLabelKey, managedLabelValue),
"operator should restore app=external-secrets on Role %s", roleName)
}, 2*time.Minute, 5*time.Second).Should(Succeed())
})

It("should restore the app=external-secrets label on a Deployment after external removal", func() {
depName := "external-secrets"

By("Verifying Deployment has the managed label initially")
Eventually(func(g Gomega) {
dep, err := clientset.AppsV1().Deployments(operandNamespace).Get(ctx, depName, metav1.GetOptions{})
g.Expect(err).NotTo(HaveOccurred())
g.Expect(dep.Labels).To(HaveKeyWithValue(managedLabelKey, managedLabelValue))
}, time.Minute, 5*time.Second).Should(Succeed())

By("Removing the managed label from the Deployment")
Expect(retry.RetryOnConflict(retry.DefaultRetry, func() error {
dep, err := clientset.AppsV1().Deployments(operandNamespace).Get(ctx, depName, metav1.GetOptions{})
if err != nil {
return err
}
delete(dep.Labels, managedLabelKey)
_, err = clientset.AppsV1().Deployments(operandNamespace).Update(ctx, dep, metav1.UpdateOptions{})
return err
})).To(Succeed(), "should remove the managed label")

By("Waiting for operator to restore the managed label")
Eventually(func(g Gomega) {
dep, err := clientset.AppsV1().Deployments(operandNamespace).Get(ctx, depName, metav1.GetOptions{})
g.Expect(err).NotTo(HaveOccurred())
g.Expect(dep.Labels).To(HaveKeyWithValue(managedLabelKey, managedLabelValue),
"operator should restore app=external-secrets on Deployment %s", depName)
}, 2*time.Minute, 5*time.Second).Should(Succeed())

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Add meaningful failure messages to assertions in the new restoration specs.

Several new assertions omit diagnostic messages (for example g.Expect(err).NotTo(HaveOccurred())), which makes e2e failures harder to triage in CI. Please add explicit messages to the error and label assertions in these Eventually blocks.

As per coding guidelines, “Flag Ginkgo test assertions that lack meaningful failure messages. Assertions should include messages like 'failed to create test pod' to help diagnose what went wrong.”

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/e2e/e2e_test.go` around lines 1069 - 1152, Add meaningful failure
messages to all assertions within the Eventually blocks across the three
restoration test cases (ServiceAccount, Role, and Deployment). For each error
assertion lacking a message (g.Expect(err).NotTo(HaveOccurred())), add a
diagnostic message like "failed to get ServiceAccount" or "failed to get Role".
For each label assertion lacking a message
(g.Expect(resource.Labels).To(HaveKeyWithValue(...))), add a message like
"operator should restore app=external-secrets on ServiceAccount saName" or
similar for Role and Deployment. Apply these messages consistently across all
three test functions to ensure CI failures are easily triageable.

Source: Coding guidelines

Comment thread pkg/controller/external_secrets/utils.go Outdated
if !exist {
if err := r.Create(r.ctx, desired); err != nil {
return common.FromClientError(err, "failed to create %s secret resource", secretName)
if err := r.createWithFallback(desired, resourceMetadata, secretName); err != nil {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We will need to use the same logic as that of CNO co-managed ConfigMap for Secrets too.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please share your thought process behind this? The ConfigMap uses a metadata-only MergePatch because CNO co-manages its Data. Whereas the webhook TLS Secret has no such co-manager; its Data comes entirely from a static asset owned by this operator.

Could you clarify what specific scenario you have in mind? Is there an external controller that writes to the Secret's Data that we should be aware of? or I'm missing something

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The operator creates an empty secret when cert-manager is not configured, during this scenario, the cert-controller component of external-secrets will inject the TLS content to it which will be used by the webhook. The cert-controller expects the secret to be present and has only updates it.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the clarification. Updated in the latest commit.

Comment thread pkg/controller/external_secrets/configmap.go Outdated
Comment thread pkg/controller/external_secrets/certificate.go Outdated
Comment thread pkg/controller/external_secrets/controller.go Outdated
Comment on lines +269 to +271
isManagedResource := func(object client.Object) bool {
return object.GetLabels() != nil && object.GetLabels()[requestEnqueueLabelKey] == requestEnqueueLabelValue
})
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
isManagedResource := func(object client.Object) bool {
return object.GetLabels() != nil && object.GetLabels()[requestEnqueueLabelKey] == requestEnqueueLabelValue
})
}
isManagedResource := func(object client.Object) bool {
admit := object.GetLabels() != nil && object.GetLabels()[requestEnqueueLabelKey] == requestEnqueueLabelValue
if !admit {
logIgnored(object)
}
return admit
})

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would cause confusion for UpdateFunc. If logging is inside isManagedResource, and ObjectOld doesn't match but ObjectNew does, we'd log "not of interest" for ObjectOld, then accept the event anyway. That's misleading.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah agree and we could improve it with an optional arg, whether log or not. My intention is to avoid repetitive code which will be flagged by linter. Or we could use the suggestion made on trust-manager as well, where we were passing objects as vargs and check if present in any of it.
Ref: openshift/cert-manager-operator#379 (comment)

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah we can do like that also; thanks. Updated.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the change missing from the commit or are we not making the change?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's already pushed I think, could you please check the latest changes.

Comment thread pkg/controller/external_secrets/controller.go Outdated
Comment thread test/e2e/e2e_test.go
if tt.wantCreate && cached.CreateCallCount() == 0 {
t.Error("expected Create to be called, but it wasn't")
}
if tt.wantPatch && cached.PatchCallCount() == 0 && uncached.PatchCallCount() == 0 {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be?

Suggested change
if tt.wantPatch && cached.PatchCallCount() == 0 && uncached.PatchCallCount() == 0 {
if tt.wantPatch && (cached.PatchCallCount() == 0 || uncached.PatchCallCount() == 0) {

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Modified as per the latest changes.

@bharath-b-rh

Copy link
Copy Markdown
Contributor

cc: @snarayan-redhat for docs approval. Please help add release notes for the changes.

== Fixed issues

* Before this release, if the `app=external-secrets` managed label was externally removed from a resource that the {external-secrets-operator} owns, the resource fell out of the label-filtered informer cache. Subsequent reconciliation attempts to create the resource received an `AlreadyExists` error, causing the controller to enter a permanent error loop. With this release, the controller detects this cache-miss condition and restores the managed labels and annotations directly on the API server using an uncached client, without interrupting the operand. (link:https://issues.redhat.com/browse/ESO-237[ESO-237])

@snarayan-redhat

Copy link
Copy Markdown

/label docs-approved

@openshift-ci openshift-ci Bot added the docs-approved Signifies that Docs has signed off on this PR label Jun 17, 2026

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
pkg/controller/external_secrets/configmap_test.go (1)

248-255: ⚡ Quick win

Add a symmetric negative assertion for unexpected Create calls.

The table asserts Create when expected, but it doesn’t fail when Create is called unexpectedly in non-create scenarios.

Suggested test hardening
 			if tt.wantCreate && cached.CreateCallCount() == 0 {
 				t.Error("expected Create to be called, but it wasn't")
 			}
+			if !tt.wantCreate && cached.CreateCallCount() > 0 {
+				t.Error("expected no Create call, but one was made")
+			}
 			if tt.wantPatch && cached.PatchCallCount() == 0 {
 				t.Error("expected Patch to be called, but it wasn't")
 			}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/controller/external_secrets/configmap_test.go` around lines 248 - 255,
The test currently has a negative assertion to verify that Patch is not called
when unexpected (using !tt.wantPatch && cached.PatchCallCount() > 0), but lacks
a symmetric negative assertion for Create calls. Add a similar negative
assertion check after the positive Create assertion that verifies when
tt.wantCreate is false and cached.CreateCallCount() is greater than zero, the
test fails with an appropriate error message indicating that an unexpected
Create call was made. This ensures symmetry in test validation for both Create
and Patch operations.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@Makefile`:
- Around line 216-218: The E2E_GINKGO_LABEL_FILTER variable is currently running
tests labeled with Platform:Generic that also include Proxy:HTTP labels, which
require cluster-wide proxy configuration but are not gated on whether proxy is
actually configured. Modify the E2E_GINKGO_LABEL_FILTER value to exclude tests
with the Proxy:HTTP label from the default run, ensuring that proxy-dependent
tests are only executed when proxy configuration is explicitly provided or
enabled. Update the filter expression to use Ginkgo's label syntax to exclude or
restrict tests that have the Proxy:HTTP label requirement.

---

Nitpick comments:
In `@pkg/controller/external_secrets/configmap_test.go`:
- Around line 248-255: The test currently has a negative assertion to verify
that Patch is not called when unexpected (using !tt.wantPatch &&
cached.PatchCallCount() > 0), but lacks a symmetric negative assertion for
Create calls. Add a similar negative assertion check after the positive Create
assertion that verifies when tt.wantCreate is false and cached.CreateCallCount()
is greater than zero, the test fails with an appropriate error message
indicating that an unexpected Create call was made. This ensures symmetry in
test validation for both Create and Patch operations.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 7e8912d3-42f2-4836-b44c-4d4da948ec22

📥 Commits

Reviewing files that changed from the base of the PR and between 2654a78 and cb5500c.

📒 Files selected for processing (14)
  • Makefile
  • pkg/controller/external_secrets/certificate.go
  • pkg/controller/external_secrets/configmap.go
  • pkg/controller/external_secrets/configmap_test.go
  • pkg/controller/external_secrets/deployments.go
  • pkg/controller/external_secrets/networkpolicy.go
  • pkg/controller/external_secrets/rbacs.go
  • pkg/controller/external_secrets/secret.go
  • pkg/controller/external_secrets/serviceaccounts.go
  • pkg/controller/external_secrets/services.go
  • pkg/controller/external_secrets/utils.go
  • pkg/controller/external_secrets/utils_test.go
  • pkg/controller/external_secrets/validatingwebhook.go
  • test/e2e/e2e_test.go
🚧 Files skipped from review as they are similar to previous changes (3)
  • pkg/controller/external_secrets/secret.go
  • pkg/controller/external_secrets/rbacs.go
  • pkg/controller/external_secrets/utils.go

Comment thread Makefile Outdated
Signed-off-by: chiragkyal <ckyal@redhat.com>
…d TLS data

Signed-off-by: chiragkyal <ckyal@redhat.com>
@chiragkyal

Copy link
Copy Markdown
Member Author

/retest

@chiragkyal

Copy link
Copy Markdown
Member Author

/test e2e-operator

@bharath-b-rh

Copy link
Copy Markdown
Contributor

/lgtm

/label px-approved

Find the list of pre-merge tests performed for the changes here
/label qe-approved

@openshift-ci openshift-ci Bot added px-approved Signifies that Product Support has signed off on this PR qe-approved Signifies that QE has signed off on this PR labels Jun 18, 2026
@openshift-ci openshift-ci Bot added the lgtm Indicates that a PR is ready to be merged. label Jun 18, 2026
@chiragkyal

Copy link
Copy Markdown
Member Author

/hold
The e2e failure needs some investigation I think

@openshift-ci openshift-ci Bot added do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. and removed lgtm Indicates that a PR is ready to be merged. labels Jun 18, 2026
Comment thread test/e2e/e2e_test.go
// Resources intentionally excluded:
// - Deployment: Kubernetes adds deployment.kubernetes.io/revision on every rollout,
// so annotation events are suppressed to avoid infinite reconcile loops.
Context("Managed Annotation Restoration", Ordered, Label("Platform:AWS"), func() {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Context("Managed Annotation Restoration", Ordered, Label("Platform:AWS"), func() {
Context("Managed Annotation Restoration", Ordered, Label("Platform:Generic"), func() {

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have Platfrom:Generic filter applied? Otherwise the tests won't run.

…urces

Signed-off-by: chiragkyal <ckyal@redhat.com>
@openshift-ci

openshift-ci Bot commented Jun 18, 2026

Copy link
Copy Markdown

@chiragkyal: all tests passed!

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@bharath-b-rh

Copy link
Copy Markdown
Contributor

/lgtm

@openshift-ci openshift-ci Bot added the lgtm Indicates that a PR is ready to be merged. label Jun 18, 2026
@bharath-b-rh

Copy link
Copy Markdown
Contributor

/approve
/unhold

@openshift-ci openshift-ci Bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jun 18, 2026
@openshift-ci

openshift-ci Bot commented Jun 18, 2026

Copy link
Copy Markdown

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: bharath-b-rh, chiragkyal

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci Bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 18, 2026
@openshift-merge-bot openshift-merge-bot Bot merged commit 35d2ed9 into openshift:main Jun 18, 2026
9 checks passed
@chiragkyal chiragkyal deleted the fix-recon branch June 18, 2026 17:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. docs-approved Signifies that Docs has signed off on this PR jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. lgtm Indicates that a PR is ready to be merged. px-approved Signifies that Product Support has signed off on this PR qe-approved Signifies that QE has signed off on this PR

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants