Skip to content

OLS-3348 Reconcile agentic alerts adapter as lightspeed-operator operand#1759

Open
blublinsky wants to merge 1 commit into
openshift:mainfrom
blublinsky:adapter-operator
Open

OLS-3348 Reconcile agentic alerts adapter as lightspeed-operator operand#1759
blublinsky wants to merge 1 commit into
openshift:mainfrom
blublinsky:adapter-operator

Conversation

@blublinsky

@blublinsky blublinsky commented Jun 25, 2026

Copy link
Copy Markdown
Contributor

Description

Summary

  • Add internal/controller/alertsadapter/ to reconcile the agentic alerts adapter as a first-class lightspeed-operator operand (OLS-3348).
  • Opt-in via spec.ols.deployment.alertsAdapter.configMapRef. When unset, the operator tears down operand resources and reports AlertsAdapterReady=True with Reason=NotConfigured.
  • Phase 1 (when enabled): validate user-managed runtime ConfigMap in openshift-lightspeed (config.yaml required when CM exists; missing CM allowed), ServiceAccount, proposals ClusterRole/ClusterRoleBinding, config Role/RoleBinding, Alertmanager RoleBinding in openshift-monitoring, NetworkPolicy.
  • Phase 2: Deployment (lightspeed-agentic-alerts-adapter, 1 replica) with ALERTMANAGER_URL, POD_NAMESPACE, and AlertsAdapterReady status condition.
  • Runtime config: operator does not create/update ConfigMap data; ConfigMap changes trigger rolling restart via watcher (RestartAlertsAdapter).
  • Disable / finalizer: RemoveAlertsAdapter() deletes deployment, namespaced RBAC, SA, NetworkPolicy, monitoring RoleBinding, and proposals ClusterRole/ClusterRoleBinding.
  • API: spec.ols.deployment.alertsAdapter (AlertsAdapterSpec: scheduling/resources + configMapRef); --alerts-adapter-image flag in cmd/main.go (default Konflux :main).
  • Operator RBAC: rolebindings in openshift-monitoring namespace.
  • Docs: .ai/spec, ARCHITECTURE.md, AGENTS.md. Release notes on OLS-3348.
  • Deferred (OLS-3236): OLM bundle regeneration, CSV --alerts-adapter-image arg, related_images.json entry, agentic console.

Type of change

  • Refactor
  • New feature
  • Bug fix
  • CVE fix
  • Optimization
  • Documentation Update
  • Configuration Update
  • Bump-up dependent library

Related Tickets & Documents

Checklist before requesting a review

  • I have performed a self-review of my code.
  • PR has passed all pre-merge test jobs.
  • If it is a core feature, I have added thorough tests.

Testing

  • make test (unit tests including internal/controller/alertsadapter and watcher coverage)
  • Enable adapter:
    spec:
      ols:
        deployment:
          alertsAdapter:
            configMapRef:
              name: alerts-adapter-config
  • Create ConfigMap alerts-adapter-config in openshift-lightspeed with key config.yaml (see adapter manifests)
  • Verify AlertsAdapterReady=True, deployment running, proposals RBAC present
  • Edit ConfigMap data → adapter pod restarts
  • Remove configMapRef → operand resources removed, NotConfigured condition

Summary by CodeRabbit

  • New Features
    • Added an agentic alerts adapter operand with phase-based reconciliation and an AlertsAdapterReady condition, including a “NotConfigured” state when disabled.
    • Introduced spec.ols.deployment.alertsAdapter.configMapRef for opt-in runtime configuration, including automatic restart on ConfigMap updates.
    • Added --alerts-adapter-image to control the alerts adapter image used by the operator.
  • Bug Fixes
    • Improved finalizer cleanup to remove alerts-adapter operand RBAC/resources and extended watcher-driven restarts for it.
  • Documentation
    • Updated architecture, reconciliation, configuration surface, CRD API, and lifecycle docs for the alerts adapter.
  • Tests
    • Added alerts adapter reconciliation and watcher integration coverage.
  • Chores
    • Enhanced make run to support passthrough ARGS.

@coderabbitai

coderabbitai Bot commented Jun 25, 2026

Copy link
Copy Markdown

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 06da39de-dfde-4ea0-8ac9-6cb529d3cbbc

📥 Commits

Reviewing files that changed from the base of the PR and between ccc58ef and 386f287.

⛔ Files ignored due to path filters (3)
  • api/v1alpha1/zz_generated.deepcopy.go is excluded by !**/zz_generated.deepcopy.go
  • config/crd/bases/ols.openshift.io_olsconfigs.yaml is excluded by !config/crd/bases/**
  • config/rbac/role.yaml is excluded by !config/rbac/role.yaml
📒 Files selected for processing (29)
  • .ai/spec/how/project-structure.md
  • .ai/spec/how/reconciliation.md
  • .ai/spec/what/bundle-composition.md
  • .ai/spec/what/crd-api.md
  • .ai/spec/what/reconciliation.md
  • .ai/spec/what/resource-lifecycle.md
  • AGENTS.md
  • ARCHITECTURE.md
  • Makefile
  • api/v1alpha1/olsconfig_types.go
  • cmd/main.go
  • internal/controller/alertsadapter/assets.go
  • internal/controller/alertsadapter/assets_test.go
  • internal/controller/alertsadapter/deployment.go
  • internal/controller/alertsadapter/reconciler.go
  • internal/controller/alertsadapter/reconciler_test.go
  • internal/controller/alertsadapter/suite_test.go
  • internal/controller/appserver/reconciler.go
  • internal/controller/olsconfig_controller.go
  • internal/controller/olsconfig_helpers.go
  • internal/controller/reconciler/interface.go
  • internal/controller/utils/constants.go
  • internal/controller/utils/errors.go
  • internal/controller/utils/resource_defaults_test.go
  • internal/controller/utils/testing.go
  • internal/controller/utils/types.go
  • internal/controller/utils/utils.go
  • internal/controller/watchers/watchers.go
  • internal/controller/watchers/watchers_test.go
✅ Files skipped from review due to trivial changes (7)
  • ARCHITECTURE.md
  • .ai/spec/how/reconciliation.md
  • .ai/spec/what/resource-lifecycle.md
  • AGENTS.md
  • .ai/spec/what/bundle-composition.md
  • .ai/spec/what/reconciliation.md
  • .ai/spec/how/project-structure.md
🚧 Files skipped from review as they are similar to previous changes (15)
  • internal/controller/reconciler/interface.go
  • internal/controller/utils/resource_defaults_test.go
  • internal/controller/watchers/watchers.go
  • internal/controller/utils/types.go
  • internal/controller/alertsadapter/suite_test.go
  • internal/controller/utils/constants.go
  • internal/controller/watchers/watchers_test.go
  • internal/controller/utils/testing.go
  • api/v1alpha1/olsconfig_types.go
  • .ai/spec/what/crd-api.md
  • internal/controller/utils/utils.go
  • internal/controller/olsconfig_controller.go
  • internal/controller/olsconfig_helpers.go
  • cmd/main.go
  • Makefile

📝 Walkthrough

Walkthrough

The PR adds an optional alerts-adapter operand with new API fields, image wiring, reconciliation, deployment generation, cleanup, watcher restarts, and documentation updates. It also changes the appserver metrics secret reconciliation to skip in local development mode.

Changes

Alerts adapter rollout

Layer / File(s) Summary
Config and contract
api/v1alpha1/olsconfig_types.go, cmd/main.go, internal/controller/reconciler/interface.go, internal/controller/olsconfig_helpers.go, internal/controller/utils/{constants.go,types.go,testing.go,utils.go,resource_defaults_test.go}, .ai/spec/what/{crd-api.md,bundle-composition.md,reconciliation.md,resource-lifecycle.md}, .ai/spec/how/{project-structure.md,reconciliation.md}, AGENTS.md, ARCHITECTURE.md
alertsAdapter config fields, readiness constants, image defaults, reconciler getters, config-map helpers, and matching documentation/spec updates are added.
Assets and deployment
internal/controller/alertsadapter/{assets.go,deployment.go,assets_test.go}
The alerts-adapter package generates its service account, RBAC, network policy, config-map access, selector labels, and deployment, with tests covering generation and ConfigMap-dependent volume mounting behavior.
Reconcile and finalize
internal/controller/alertsadapter/{reconciler.go,reconciler_test.go,suite_test.go}, internal/controller/olsconfig_controller.go, internal/controller/utils/errors.go, internal/controller/watchers/{watchers.go,watchers_test.go}
Phase 1/2 reconciliation, disablement cleanup, finalizer teardown, and watcher-triggered deployment restarts are wired through the controller and covered by tests.

Local dev appserver behavior

Layer / File(s) Summary
Local dev skip
internal/controller/appserver/reconciler.go, Makefile
reconcileMetricsReaderSecret now returns early in local development mode, and the run target forwards optional ARGS to go run ./cmd/main.go.

Sequence Diagram(s)

sequenceDiagram
  participant OLSConfigController
  participant alertsadapter
  participant Watchers
  participant KubernetesAPI

  OLSConfigController->>alertsadapter: ReconcileAlertsAdapterResources()
  alertsadapter->>KubernetesAPI: create/update or remove operand resources
  OLSConfigController->>alertsadapter: ReconcileAlertsAdapterDeployment()
  alertsadapter->>KubernetesAPI: generate or update Deployment
  Watchers->>alertsadapter: RestartAlertsAdapter()
  alertsadapter->>KubernetesAPI: patch Deployment pod template annotation
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Suggested reviewers

  • bparees
  • joshuawilson

Possibly related PRs

  • openshift/lightspeed-operator#1751: Updates the same alerts-adapter lifecycle/spec area, including AlertsAdapterReady, spec.ols.deployment.alertsAdapter, and image wiring.
🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 46.51% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly matches the main change: adding and reconciling the agentic alerts adapter as a lightspeed-operator operand.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands.

@openshift-ci openshift-ci Bot requested review from joshuawilson and xrajesh June 25, 2026 15:15
@openshift-ci

openshift-ci Bot commented Jun 25, 2026

Copy link
Copy Markdown

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign bparees for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (1)
internal/controller/alertsadapter/deployment.go (1)

100-101: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Wrap owner-reference errors with a shared error constant.

Return from SetControllerReference should follow repo error-wrapping format (fmt.Errorf("%s: %w", ErrConstant, err)) for consistent debugging/context in reconciler logs.

As per coding guidelines: “Wrap errors with fmt.Errorf("%s: %w", ErrConstant, err) when adding context in Go error handling.”

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/controller/alertsadapter/deployment.go` around lines 100 - 101, The
SetControllerReference error path in deployment reconciliation should use the
shared wrapped-error format instead of returning the raw error. Update the
controllerutil.SetControllerReference call site in the deployment helper so the
returned error is wrapped with the existing error constant via fmt.Errorf("%s:
%w", ErrConstant, err), keeping context consistent with repo logging and
reconciler error handling.

Sources: Coding guidelines, Path instructions

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@internal/controller/alertsadapter/reconciler.go`:
- Around line 191-199: The NetworkPolicy update path in the alerts adapter
reconciler only copies the desired Spec, so label differences detected by
utils.NetworkPolicyEqual are never fixed and can cause repeated reconciles. In
the reconciler’s update branch, alongside foundNP.Spec = np.Spec, also
synchronize the desired Labels from np onto foundNP before calling r.Update,
keeping the update path consistent with the equality check and preventing a
reconcile loop.

In `@internal/controller/alertsadapter/suite_test.go`:
- Around line 78-98: The suite setup in alertsadapter/suite_test.go registers
several API schemes but misses configv1, so ClusterVersion is not recognized
when the test client creates it. Add configv1.AddToScheme alongside the other
AddToScheme calls in the test setup before client.New, using the existing
configv1 import, so k8sClient.Create with ClusterVersion works correctly.

---

Nitpick comments:
In `@internal/controller/alertsadapter/deployment.go`:
- Around line 100-101: The SetControllerReference error path in deployment
reconciliation should use the shared wrapped-error format instead of returning
the raw error. Update the controllerutil.SetControllerReference call site in the
deployment helper so the returned error is wrapped with the existing error
constant via fmt.Errorf("%s: %w", ErrConstant, err), keeping context consistent
with repo logging and reconciler error handling.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: d38107cb-e362-433d-866e-14a779361185

📥 Commits

Reviewing files that changed from the base of the PR and between fb5c715 and 62159d9.

⛔ Files ignored due to path filters (3)
  • api/v1alpha1/zz_generated.deepcopy.go is excluded by !**/zz_generated.deepcopy.go
  • config/crd/bases/ols.openshift.io_olsconfigs.yaml is excluded by !config/crd/bases/**
  • config/rbac/role.yaml is excluded by !config/rbac/role.yaml
📒 Files selected for processing (25)
  • .ai/spec/how/project-structure.md
  • .ai/spec/how/reconciliation.md
  • .ai/spec/what/bundle-composition.md
  • .ai/spec/what/crd-api.md
  • .ai/spec/what/reconciliation.md
  • AGENTS.md
  • ARCHITECTURE.md
  • Makefile
  • api/v1alpha1/olsconfig_types.go
  • cmd/main.go
  • internal/controller/alertsadapter/assets.go
  • internal/controller/alertsadapter/assets_test.go
  • internal/controller/alertsadapter/deployment.go
  • internal/controller/alertsadapter/reconciler.go
  • internal/controller/alertsadapter/reconciler_test.go
  • internal/controller/alertsadapter/suite_test.go
  • internal/controller/olsconfig_controller.go
  • internal/controller/olsconfig_helpers.go
  • internal/controller/reconciler/interface.go
  • internal/controller/utils/constants.go
  • internal/controller/utils/errors.go
  • internal/controller/utils/resource_defaults_test.go
  • internal/controller/utils/testing.go
  • internal/controller/utils/types.go
  • internal/controller/utils/utils.go

Comment thread internal/controller/alertsadapter/reconciler.go
Comment thread internal/controller/alertsadapter/suite_test.go

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
api/v1alpha1/olsconfig_types.go (1)

356-375: 🗄️ Data Integrity & Integration | 🟠 Major | 🏗️ Heavy lift

Don't expose an unsupported alertsAdapter.replicas knob.

Line 358 inlines Config, so the CRD accepts spec.ols.deployment.alertsAdapter.replicas, but Lines 369-371 document that Alerts Adapter is always reconciled at 1 replica. That means values like 0 or 3 serialize cleanly and are then ignored by the controller. Either switch this field to an alerts-adapter-specific config type without Replicas, or add schema validation that only permits 1 here. As per path instructions, "kubebuilder markers, validation, and defaulting must stay consistent with controller behavior in internal/controller/".

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@api/v1alpha1/olsconfig_types.go` around lines 356 - 375, The
AlertsAdapterSpec currently inlines Config, which exposes an unsupported
alertsAdapter.replicas field even though the controller always reconciles Alerts
Adapter at 1 replica. Update AlertsAdapterSpec to avoid inheriting
Config.Replicas or add schema validation that only allows 1 for this spec, and
keep the kubebuilder markers/defaulting aligned with the behavior in the Alerts
Adapter reconciliation logic under internal/controller/. Use AlertsAdapterSpec
and Config as the main symbols to locate the change.

Source: Path instructions

🧹 Nitpick comments (1)
internal/controller/utils/utils.go (1)

474-482: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Base alerts-adapter labels on DefaultLabels().

This duplicates the shared managed-by / part-of labels instead of deriving from the common label set, so alerts-adapter assets can drift the next time the operator’s default labels change.

Suggested refactor
 func GenerateAlertsAdapterSelectorLabels() map[string]string {
-	return map[string]string{
-		"app":                          AlertsAdapterDeploymentName,
-		"app.kubernetes.io/component":  AlertsAdapterComponentLabel,
-		"app.kubernetes.io/managed-by": "lightspeed-operator",
-		"app.kubernetes.io/name":       AlertsAdapterDeploymentName,
-		"app.kubernetes.io/part-of":    "openshift-lightspeed",
-	}
+	labels := DefaultLabels()
+	labels["app"] = AlertsAdapterDeploymentName
+	labels["app.kubernetes.io/component"] = AlertsAdapterComponentLabel
+	labels["app.kubernetes.io/name"] = AlertsAdapterDeploymentName
+	return labels
 }

As per coding guidelines, "Use utils.DefaultLabels() for consistent labeling in asset generation functions".

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/controller/utils/utils.go` around lines 474 - 482,
`GenerateAlertsAdapterSelectorLabels` is hardcoding shared labels instead of
deriving them from the common label set. Update this function to start from
`DefaultLabels()` and then add the alerts-adapter specific selector entries such
as `app`, `app.kubernetes.io/component`, and `app.kubernetes.io/name`, so the
`managed-by` and `part-of` values stay consistent with the rest of the operator.

Source: Coding guidelines

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@internal/controller/alertsadapter/assets_test.go`:
- Around line 102-115: The “missing ConfigMap” spec in ValidateUserConfigMap is
mutating the shared default AlertsAdapter ConfigMap and leaving it deleted,
which can make other controller tests flaky. Update the test around
crWithAlertsAdapterConfigMapRef and k8sClient.Get/Delete to avoid touching
utils.AlertsAdapterConfigMapName directly, either by using a uniquely named
ConfigMap for this case or by restoring the original object in cleanup. Keep the
change localized to the ValidateUserConfigMap test in assets_test.go and follow
the existing envtest/Ginkgo patterns used by the controller tests.

In `@internal/controller/alertsadapter/reconciler_test.go`:
- Around line 103-123: The Ordered block in ReconcileAlertsAdapterResources
setup leaks created state because it only provisions the alerts adapter
ConfigMap and resources in BeforeAll without cleaning them up. Add an AfterAll
in the same test context to explicitly call RemoveAlertsAdapter(...) and delete
the ConfigMap created via k8sClient, using the same
enabledCR/testReconcilerInstance objects so the cleanup matches the setup. This
should follow the existing Ginkgo v2/envtest patterns used in the controller
tests to prevent cross-spec interference with assets_test.go.

In `@internal/controller/alertsadapter/reconciler.go`:
- Around line 162-181: The RBAC reconciliation paths are only doing existence
checks, so managed RoleBinding/ClusterRole/ClusterRoleBinding resources never
get updated when their desired spec changes. In reconcileConfigRoleBinding and
the other RBAC reconciliations referenced in this diff, fetch the existing
object, compare its relevant desired fields against the output of the
corresponding generator, and update or recreate it when drift is detected
instead of returning success immediately. Use the existing helper generators and
reconciliation functions to keep the fix consistent across all RBAC resources.
- Around line 148-156: In alerts adapter reconciliation, the early return in
reconciler.go is unsafe because it indexes ResourceNames[0] without checking the
slice length and only compares a single field, so update the logic in the role
reconciliation path to compare the full Rules content before returning early and
avoid any direct indexing unless ResourceNames is confirmed non-empty. Use the
foundRole and role comparison in the reconcile flow to ensure verb, API group,
and resource name differences are all detected before skipping the update.

---

Outside diff comments:
In `@api/v1alpha1/olsconfig_types.go`:
- Around line 356-375: The AlertsAdapterSpec currently inlines Config, which
exposes an unsupported alertsAdapter.replicas field even though the controller
always reconciles Alerts Adapter at 1 replica. Update AlertsAdapterSpec to avoid
inheriting Config.Replicas or add schema validation that only allows 1 for this
spec, and keep the kubebuilder markers/defaulting aligned with the behavior in
the Alerts Adapter reconciliation logic under internal/controller/. Use
AlertsAdapterSpec and Config as the main symbols to locate the change.

---

Nitpick comments:
In `@internal/controller/utils/utils.go`:
- Around line 474-482: `GenerateAlertsAdapterSelectorLabels` is hardcoding
shared labels instead of deriving them from the common label set. Update this
function to start from `DefaultLabels()` and then add the alerts-adapter
specific selector entries such as `app`, `app.kubernetes.io/component`, and
`app.kubernetes.io/name`, so the `managed-by` and `part-of` values stay
consistent with the rest of the operator.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 2c803b2b-b2fe-45b1-b87b-60481dadb527

📥 Commits

Reviewing files that changed from the base of the PR and between 62159d9 and 8ee2078.

⛔ Files ignored due to path filters (3)
  • api/v1alpha1/zz_generated.deepcopy.go is excluded by !**/zz_generated.deepcopy.go
  • config/crd/bases/ols.openshift.io_olsconfigs.yaml is excluded by !config/crd/bases/**
  • config/rbac/role.yaml is excluded by !config/rbac/role.yaml
📒 Files selected for processing (28)
  • .ai/spec/how/project-structure.md
  • .ai/spec/how/reconciliation.md
  • .ai/spec/what/bundle-composition.md
  • .ai/spec/what/crd-api.md
  • .ai/spec/what/reconciliation.md
  • .ai/spec/what/resource-lifecycle.md
  • AGENTS.md
  • ARCHITECTURE.md
  • Makefile
  • api/v1alpha1/olsconfig_types.go
  • cmd/main.go
  • internal/controller/alertsadapter/assets.go
  • internal/controller/alertsadapter/assets_test.go
  • internal/controller/alertsadapter/deployment.go
  • internal/controller/alertsadapter/reconciler.go
  • internal/controller/alertsadapter/reconciler_test.go
  • internal/controller/alertsadapter/suite_test.go
  • internal/controller/olsconfig_controller.go
  • internal/controller/olsconfig_helpers.go
  • internal/controller/reconciler/interface.go
  • internal/controller/utils/constants.go
  • internal/controller/utils/errors.go
  • internal/controller/utils/resource_defaults_test.go
  • internal/controller/utils/testing.go
  • internal/controller/utils/types.go
  • internal/controller/utils/utils.go
  • internal/controller/watchers/watchers.go
  • internal/controller/watchers/watchers_test.go
✅ Files skipped from review due to trivial changes (4)
  • .ai/spec/what/resource-lifecycle.md
  • AGENTS.md
  • .ai/spec/how/reconciliation.md
  • .ai/spec/what/bundle-composition.md
🚧 Files skipped from review as they are similar to previous changes (15)
  • internal/controller/utils/resource_defaults_test.go
  • internal/controller/reconciler/interface.go
  • internal/controller/olsconfig_helpers.go
  • internal/controller/utils/types.go
  • internal/controller/utils/constants.go
  • .ai/spec/what/reconciliation.md
  • internal/controller/olsconfig_controller.go
  • cmd/main.go
  • .ai/spec/what/crd-api.md
  • internal/controller/alertsadapter/deployment.go
  • Makefile
  • ARCHITECTURE.md
  • internal/controller/alertsadapter/suite_test.go
  • .ai/spec/how/project-structure.md
  • internal/controller/utils/testing.go

Comment thread internal/controller/alertsadapter/assets_test.go Outdated
Comment thread internal/controller/alertsadapter/reconciler_test.go
Comment thread internal/controller/alertsadapter/reconciler.go Outdated
Comment on lines +162 to +181
func reconcileConfigRoleBinding(r reconciler.Reconciler, ctx context.Context, cr *olsv1alpha1.OLSConfig) error {
rb, err := GenerateConfigRoleBinding(r, cr)
if err != nil {
return fmt.Errorf("%s: %w", utils.ErrGenerateAlertsAdapterConfigRoleBinding, err)
}

foundRB := &rbacv1.RoleBinding{}
err = r.Get(ctx, client.ObjectKey{Name: rb.Name, Namespace: r.GetNamespace()}, foundRB)
if err != nil && errors.IsNotFound(err) {
r.GetLogger().Info("creating alerts adapter config role binding", "RoleBinding", rb.Name)
if err := r.Create(ctx, rb); err != nil {
return fmt.Errorf("%s: %w", utils.ErrCreateAlertsAdapterConfigRoleBinding, err)
}
return nil
} else if err != nil {
return fmt.Errorf("%s: %w", utils.ErrGetAlertsAdapterConfigRoleBinding, err)
}

r.GetLogger().Info("alerts adapter config role binding reconciled", "RoleBinding", rb.Name)
return nil

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎯 Functional Correctness | 🟠 Major | 🏗️ Heavy lift

These RBAC reconciliations never heal drift.

Once the RoleBinding / ClusterRole / ClusterRoleBinding / Alertmanager RoleBinding exists, reconcile reports success without comparing desired fields. That leaves stale rules or subjects in place after manual edits or future permission changes, even though these are operator-managed resources. Compare and update them (or recreate bindings when immutable fields differ).

Also applies to: 206-247, 250-269

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/controller/alertsadapter/reconciler.go` around lines 162 - 181, The
RBAC reconciliation paths are only doing existence checks, so managed
RoleBinding/ClusterRole/ClusterRoleBinding resources never get updated when
their desired spec changes. In reconcileConfigRoleBinding and the other RBAC
reconciliations referenced in this diff, fetch the existing object, compare its
relevant desired fields against the output of the corresponding generator, and
update or recreate it when drift is detected instead of returning success
immediately. Use the existing helper generators and reconciliation functions to
keep the fix consistent across all RBAC resources.

Comment thread config/rbac/role.yaml
- list
- patch
- update
- watch

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need this role?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes — this is operator RBAC, not operand RBAC. When the alerts adapter is enabled we reconcile a RoleBinding in openshift-monitoring that binds the adapter ServiceAccount to the existing monitoring-alertmanager-view Role so it can poll Alertmanager. The operator needs permission to create/delete that RoleBinding on opt-in/opt-out.

Same pattern as the existing openshift-lightspeed namespace Role for operand RBAC, but scoped to openshift-monitoring and limited to rolebindings only.

Alternative would be a static manifest (like the prometheus RoleBindings), but then we couldn't lifecycle it with the operand when configMapRef is removed.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

kustomize build config/manifests is failing

func getUserConfigMap(r reconciler.Reconciler, ctx context.Context, cr *olsv1alpha1.OLSConfig) (*corev1.ConfigMap, error) {
name, ok := utils.AlertsAdapterConfigMapRef(cr)
if !ok {
return nil, fmt.Errorf("%s: alerts adapter configMapRef is not set", utils.ErrValidateAlertsAdapterConfigMap)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so if the configmap ref is not set then it's error? The error is then propagated at line 212 here right? What is the reason please?

I thought missing configMapRef is a valid state - user is not interested about the alert-adapter at all.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Its not. We proceed normally

VolumeMounts: []corev1.VolumeMount{
{
Name: utils.TmpVolumeName,
MountPath: utils.TmpVolumeMountPath,

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is overriden during bundle creation or when please?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It isn't overridden anywhere — it's always set in the deployment generator, same as postgres and appserver.

We use RestrictedContainerSecurityContext() (readOnlyRootFilesystem: true), so the container needs a writable emptyDir at /tmp. ApplyPodDeploymentConfig only applies scheduling fields (nodeSelector, tolerations, affinity, replicas) from OLSConfig; it doesn't touch volumes. Bundle/CSV generation doesn't modify operand Deployment specs at runtime either — the operator reconciles this from code on each loop.

@vimalk78 vimalk78 left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OLS-3348 Review — Score: 92/100 (jira mode, round 1)

8/9 ACs pass, 0 fail, 1 needs human review.

Well-structured alerts adapter controller following established opt-in operand patterns. Cross-namespace RBAC, ConfigMap watcher integration, and status condition handling are all correctly implemented. Comprehensive unit test coverage.

Acceptance Criteria

# AC Status
1 spec.ols.deployment.alertsAdapter.configMapRef field accepted PASS
2 --alerts-adapter-image flag overrides default PASS
3 Deployment with correct env vars, volumes, security context PASS
4 ClusterRole/CRB for agentic.openshift.io/proposals PASS
5 RoleBinding in openshift-monitoring for Alertmanager PASS
6 AlertsAdapterReady condition (health or NotConfigured) PASS
7 ConfigMap watcher restarts adapter on config changes PASS
8 Finalizer removes all alerts adapter resources PASS
9 CSV --alerts-adapter-image arg and relatedImages entry NEEDS_REVIEW

AC #9: CSV/relatedImages explicitly deferred to follow-up PR. Image uses hardcoded Konflux URL with comment marking it as temporary — consistent with pre-productization pattern.

Code Quality

should-fix — RemoveAlertsAdapter fail-fast error handling (alertsadapter/reconciler.go)
The function returns on the first error, which could leave subsequent resources (RoleBindings, ClusterRole) behind if an earlier deletion fails. Phase 1 reconciliation uses continue-on-error to maximize progress. The finalizer caller handles the error gracefully (logs and continues), so finalization isn't blocked — but the function itself would be more robust collecting all errors and continuing through all tasks.

nice-to-have — Bundled unrelated changes (appserver/reconciler.go, Makefile)
LOCAL_DEV_MODE guard for metrics reader secret and $(ARGS) Makefile support are unrelated to the alerts adapter work. Small and low-risk, but ideally in a separate PR.

@openshift-ci

openshift-ci Bot commented Jun 26, 2026

Copy link
Copy Markdown

@blublinsky: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/bundle-e2e-4-21 386f287 link true /test bundle-e2e-4-21

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants