Commit 8ee2078

OLS-3348 Reconcile agentic alerts adapter as lightspeed-operator operand
1 parent fb5c715 commit 8ee2078

31 files changed

Lines changed: 3092 additions & 74 deletions

.ai/spec/how/project-structure.md

Lines changed: 15 additions & 8 deletions
@@ -21,6 +21,9 @@
 | `internal/controller/console/reconciler.go` | `ReconcileConsoleUIResources()`, `ReconcileConsoleUIDeploymentAndPlugin()`, `RemoveConsoleUI()` | Console UI Phase 1 + Phase 2 + cleanup |
 | `internal/controller/console/deployment.go` | `GenerateConsoleUIDeployment()` | Console UI deployment generation |
 | `internal/controller/console/assets.go` | ConsolePlugin CR generator, nginx config, service, network policy | Console UI resource generation |
+| `internal/controller/alertsadapter/reconciler.go` | `ReconcileAlertsAdapterResources()`, `ReconcileAlertsAdapterDeployment()`, `RemoveAlertsAdapter()`, `RestartAlertsAdapter()` | Alerts adapter Phase 1 + Phase 2 + operand teardown (disable/finalizer) + rolling restart |
+| `internal/controller/alertsadapter/deployment.go` | `GenerateDeployment()` | Alerts adapter deployment generation |
+| `internal/controller/alertsadapter/assets.go` | SA, ClusterRole, ClusterRoleBinding, config Role/RoleBinding, monitoring RoleBinding, NetworkPolicy generators | Alerts adapter resource generation |
 | `internal/controller/reconciler/interface.go` | `Reconciler` interface | Dependency injection interface for component packages |
 | `internal/controller/utils/constants.go` | ~200 constants | Resource names, ports, paths, annotation keys, defaults |
 | `internal/controller/utils/errors.go` | ~80 error message constants | Structured error messages for all operations |
@@ -69,10 +72,13 @@ OLSConfigReconciler.Reconcile()
    +-- console.ReconcileConsoleUIResources()
    +-- postgres.ReconcilePostgresResources()
    +-- appserver.ReconcileAppServerResources()
+   +-- alertsadapter.ReconcileAlertsAdapterResources()
+       (opt-in via configMapRef; RemoveAlertsAdapter() when disabled)
 6. reconcileDeploymentsAndStatus() -- Phase 2: Deployments, Services, TLS certs, status
    +-- console.ReconcileConsoleUIDeploymentAndPlugin()
    +-- postgres.ReconcilePostgresDeployment()
    +-- appserver.ReconcileAppServerDeployment()
+   +-- alertsadapter.ReconcileAlertsAdapterDeployment() # when configMapRef set
    +-- checkDeploymentStatus() per deployment -> build newStatus
    +-- UpdateStatusCondition()
 ```
@@ -90,14 +96,14 @@ External secret/configmap changes
   -> Match against SystemResources list (by name+namespace)
   -> OR match against WatcherAnnotationKey annotation
   -> Resolve "ACTIVE_BACKEND" to appserver deployment name
-  -> Call RestartAppServer() / RestartPostgres() / RestartConsoleUI()
+  -> Call RestartAppServer() / RestartPostgres() / RestartConsoleUI() / RestartAlertsAdapter()
   -> Set force-reload annotation with current timestamp
 ```
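The last steps of the watcher flow above (resolve the target deployment, then stamp a force-reload annotation) can be sketched in Go. This is a minimal illustration under stated assumptions, not the operator's code: `resolveDeploymentName`, `markForRestart`, and the annotation key `example.com/force-reload` are hypothetical names; only the `ACTIVE_BACKEND` placeholder and the `lightspeed-app-server` deployment name come from the spec text.

```go
package main

import (
	"fmt"
	"time"
)

// Hypothetical constants for illustration only. The operator keeps its real
// names and annotation keys in internal/controller/utils/constants.go.
const (
	activeBackend      = "ACTIVE_BACKEND"
	appServerName      = "lightspeed-app-server"
	forceReloadAnnoKey = "example.com/force-reload" // assumed key, not the operator's
)

// resolveDeploymentName maps the ACTIVE_BACKEND placeholder to the app server
// deployment name and passes every other name through unchanged.
func resolveDeploymentName(name string) string {
	if name == activeBackend {
		return appServerName
	}
	return name
}

// markForRestart writes the given timestamp into the annotation map, creating
// the map if needed. A changed annotation value is what signals the restart.
func markForRestart(annotations map[string]string, stamp string) map[string]string {
	if annotations == nil {
		annotations = map[string]string{}
	}
	annotations[forceReloadAnnoKey] = stamp
	return annotations
}

func main() {
	target := resolveDeploymentName("ACTIVE_BACKEND")
	annos := markForRestart(nil, time.Now().Format(time.RFC3339Nano))
	fmt.Println(target, annos[forceReloadAnnoKey])
}
```

Passing the timestamp in as a string keeps the helper deterministic, which makes it easy to unit test without faking the clock.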

 ## Key Abstractions

 ### Image Management
-Default images are stored in a `defaultImages` map in `cmd/main.go` keyed by logical name (e.g., `"lightspeed-service"`, `"postgres-image"`, `"console-plugin"`). Default values come from `internal/relatedimages/` which reads `related_images.json` at build time. Command-line flags override individual images. The map is passed to the reconciler via `OLSConfigReconcilerOptions` as individual named fields (e.g., `LightspeedServiceImage`, `ConsoleUIImage`).
+Default images are stored in a `defaultImages` map in `cmd/main.go` keyed by logical name (e.g., `"lightspeed-service"`, `"postgres-image"`, `"console-plugin"`, `"alerts-adapter"`). Default values come from `internal/relatedimages/` which reads `related_images.json` at build time. Command-line flags override individual images. The map is passed to the reconciler via `OLSConfigReconcilerOptions` as individual named fields (e.g., `LightspeedServiceImage`, `ConsoleUIImage`, `AlertsAdapterImage`).
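The "defaults in a map, overridden per flag" idea from the Image Management paragraph can be illustrated with the standard `flag` package. This is a sketch under assumptions rather than a copy of `cmd/main.go`: the image references are placeholders, `resolveImages` is a hypothetical helper, and `--service-image` is an invented flag name. Only `--alerts-adapter-image` and the map keys appear in the spec.

```go
package main

import (
	"flag"
	"fmt"
)

// defaultImages mirrors the idea of the map keyed by logical name.
// The image references are placeholders, not the operator's real defaults.
func defaultImages() map[string]string {
	return map[string]string{
		"lightspeed-service": "example.invalid/lightspeed-service:main",
		"alerts-adapter":     "example.invalid/alerts-adapter:main",
	}
}

// resolveImages parses args and lets each flag override one map entry.
// A flag that is not passed keeps the default from the map.
func resolveImages(args []string) (map[string]string, error) {
	images := defaultImages()
	fs := flag.NewFlagSet("operator", flag.ContinueOnError)
	// "--service-image" is a hypothetical flag name used for illustration.
	service := fs.String("service-image", images["lightspeed-service"], "override the service image")
	adapter := fs.String("alerts-adapter-image", images["alerts-adapter"], "override the alerts adapter image")
	if err := fs.Parse(args); err != nil {
		return nil, err
	}
	images["lightspeed-service"] = *service
	images["alerts-adapter"] = *adapter
	return images, nil
}

func main() {
	images, err := resolveImages([]string{"--alerts-adapter-image=quay.io/example/adapter:v1"})
	if err != nil {
		panic(err)
	}
	fmt.Println("alerts-adapter:", images["alerts-adapter"])
	fmt.Println("lightspeed-service:", images["lightspeed-service"])
}
```

Using the map value as the flag default means an unset flag needs no special handling: the parsed value is already the default.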

 ### WatcherConfig
 Declarative configuration for external resource watching. Contains:
@@ -108,7 +114,7 @@ Declarative configuration for external resource watching. Contains:
 The special deployment name `"ACTIVE_BACKEND"` resolves to the AppServer deployment name (`lightspeed-app-server`).

 ### Component Package Pattern
-Each component (appserver, postgres, console) follows the same package structure:
+Each component (appserver, postgres, console, alertsadapter) follows the same package structure:
 - `reconciler.go`: Phase 1 (resources) and Phase 2 (deployment) entry points
 - `deployment.go`: Deployment spec generation and update detection
 - `assets.go` and/or `config.go`: Resource and config generation
@@ -117,17 +123,18 @@ The packages receive `reconciler.Reconciler` interface, never import the control
 ### Reconciler Interface (`internal/controller/reconciler/interface.go`)
 Embeds `client.Client` and adds getter methods for:
 - `GetScheme()`, `GetLogger()`, `GetNamespace()`
-- Image getters: `GetAppServerImage()`, `GetPostgresImage()`, `GetConsoleUIImage()`, `GetOpenShiftMCPServerImage()`, `GetDataverseExporterImage()`
+- Image getters: `GetAppServerImage()`, `GetPostgresImage()`, `GetConsoleUIImage()`, `GetAlertsAdapterImage()`, `GetOpenShiftMCPServerImage()`, `GetDataverseExporterImage()`
 - Version getters: `GetOpenShiftMajor()`, `GetOpenshiftMinor()`
 - Config getters: `IsPrometheusAvailable()`, `GetWatcherConfig()`

 ### Finalizer Pattern
 The OLSConfig CR uses finalizer `ols.openshift.io/finalizer` (defined in `utils.OLSConfigFinalizer`). On deletion:
 1. Remove Console UI (deactivate plugin, delete ConsolePlugin CR)
-2. List all owned resources via owner references
-3. Explicitly delete owned resources
-4. Wait up to 3 minutes for deletion (poll every 5 seconds)
-5. Remove finalizer (proceeds even if cleanup times out)
+2. Remove alerts adapter operand resources (`alertsadapter.RemoveAlertsAdapter()`: deployment, namespaced RBAC, SA, NetworkPolicy, monitoring RoleBinding, proposals ClusterRole/ClusterRoleBinding)
+3. List all owned resources via owner references
+4. Explicitly delete owned resources
+5. Wait up to 3 minutes for deletion (poll every 5 seconds)
+6. Remove finalizer (proceeds even if cleanup times out)
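The "wait with a timeout, then remove the finalizer regardless" behaviour in the last two steps above can be sketched with the standard library alone. The spec says the operator uses `wait.PollUntilContextTimeout`; the hand-rolled loop below is a stand-in so the example has no external dependencies. `waitForDeletion` and `simulateCleanup` are hypothetical names, and the intervals are shortened from the documented 5 seconds / 3 minutes so the demo finishes quickly.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// waitForDeletion calls gone() immediately and then once per interval until it
// returns true or the timeout expires. It reports whether cleanup finished.
func waitForDeletion(ctx context.Context, interval, timeout time.Duration, gone func() bool) bool {
	ctx, cancel := context.WithTimeout(ctx, timeout)
	defer cancel()
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		if gone() {
			return true
		}
		select {
		case <-ctx.Done():
			return false // timed out; the caller still removes the finalizer
		case <-ticker.C:
		}
	}
}

// simulateCleanup pretends the owned resources disappear after pollsUntilGone
// polls and reports whether the wait timed out first.
func simulateCleanup(pollsUntilGone int) (timedOut bool) {
	polls := 0
	gone := func() bool {
		polls++
		return polls >= pollsUntilGone
	}
	cleaned := waitForDeletion(context.Background(), 5*time.Millisecond, 200*time.Millisecond, gone)
	// In the pattern described above, the finalizer is removed in both cases.
	return !cleaned
}

func main() {
	fmt.Println("timed out:", simulateCleanup(3))       // resources vanish on the third poll
	fmt.Println("timed out:", simulateCleanup(1000000)) // resources never vanish in time
}
```

Removing the finalizer even after a timeout trades completeness of cleanup for the guarantee that a stuck child resource cannot block deletion of the CR indefinitely.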

 ## Integration Points

.ai/spec/how/reconciliation.md

Lines changed: 7 additions & 4 deletions
@@ -18,22 +18,25 @@ Reconcile(ctx, req)
   -> handleFinalizer() # Add/remove finalizer, run cleanup
   -> reconcileOperatorResources() # ServiceMonitor, NetworkPolicy (operator-level)
   -> annotateExternalResources() # Validate secrets, annotate for watching
-  -> reconcileIndependentResources() # Phase 1: console, postgres, backend resources
+  -> reconcileIndependentResources() # Phase 1: console, postgres, backend, alerts adapter resources
   | |-- console.ReconcileConsoleUIResources()
   | |-- postgres.ReconcilePostgresResources()
-  | +-- appserver.ReconcileAppServerResources()
+  | |-- appserver.ReconcileAppServerResources()
+  | +-- alertsadapter.ReconcileAlertsAdapterResources()
+  | (opt-in via configMapRef; RemoveAlertsAdapter() when disabled)
   -> reconcileDeploymentsAndStatus() # Phase 2: deployments + status update
   |-- console.ReconcileConsoleUIDeploymentAndPlugin()
   |-- postgres.ReconcilePostgresDeployment()
   |-- appserver.ReconcileAppServerDeployment()
+  |-- alertsadapter.ReconcileAlertsAdapterDeployment() # when configMapRef set
   |-- checkDeploymentStatus() for each # Collect diagnostics
   +-- UpdateStatusCondition() # Single status update
 ```

 ## Key Abstractions

 ### Reconciler Interface
-The `reconciler.Reconciler` interface breaks the circular dependency between the main controller and component packages. Component packages (appserver, postgres, console) receive this interface instead of importing the controller package directly. It embeds `client.Client` and adds getter methods for images, namespace, and OpenShift version.
+The `reconciler.Reconciler` interface breaks the circular dependency between the main controller and component packages. Component packages (appserver, postgres, console, alertsadapter) receive this interface instead of importing the controller package directly. It embeds `client.Client` and adds getter methods for images, namespace, and OpenShift version.

 ### ReconcileSteps Pattern
 Both phases use a slice of `ReconcileSteps` structs, each containing a Name, reconcile function, and (for Phase 2) a ConditionType and Deployment name. Phase 1 iterates with continue-on-error; Phase 2 iterates but tracks all conditions and diagnostics.
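The continue-on-error iteration described for Phase 1 can be sketched as follows. The `reconcileStep` type, `runPhase1`, and `demoPhase1` are hypothetical stand-ins for illustration; the real `ReconcileSteps` struct carries more fields (ConditionType and Deployment name for Phase 2) and its reconcile function takes controller arguments that are omitted here.

```go
package main

import (
	"errors"
	"fmt"
)

// reconcileStep is a simplified stand-in for the ReconcileSteps struct:
// a name plus the function that reconciles one component.
type reconcileStep struct {
	Name string
	Fn   func() error
}

// runPhase1 runs every step even if an earlier one fails, and returns the
// failures joined into one error (nil when all steps succeed).
func runPhase1(steps []reconcileStep) error {
	var errs []error
	for _, s := range steps {
		if err := s.Fn(); err != nil {
			// Continue-on-error: record the failure, keep reconciling the rest.
			errs = append(errs, fmt.Errorf("%s: %w", s.Name, err))
		}
	}
	return errors.Join(errs...)
}

// demoPhase1 runs four steps where the second fails and reports which steps
// actually executed.
func demoPhase1() (ran []string, err error) {
	ok := func(name string) func() error {
		return func() error { ran = append(ran, name); return nil }
	}
	fail := func(name string) func() error {
		return func() error { ran = append(ran, name); return errors.New("boom") }
	}
	steps := []reconcileStep{
		{Name: "console", Fn: ok("console")},
		{Name: "postgres", Fn: fail("postgres")},
		{Name: "appserver", Fn: ok("appserver")},
		{Name: "alertsadapter", Fn: ok("alertsadapter")},
	}
	err = runPhase1(steps)
	return ran, err
}

func main() {
	ran, err := demoPhase1()
	fmt.Println("ran:", ran)
	fmt.Println("err:", err)
}
```

One consequence of this design is that a failing component does not prevent the remaining components from converging during the same reconcile pass.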
@@ -44,7 +47,7 @@ Two ownership models:
 2. **External resources**: Watches() with custom predicates. Annotation-based filtering. Secret/ConfigMap handlers compare data and trigger deployment restarts.

 ### Finalizer Cleanup
-The `finalizeOLSConfig()` method uses `listOwnedResources()` which queries every resource type by owner reference UID (not labels). This is more reliable than label-based cleanup. The wait loop polls with a fixed interval and timeout, using `wait.PollUntilContextTimeout`.
+The `finalizeOLSConfig()` method removes Console UI, deletes all alerts adapter operand resources via `alertsadapter.RemoveAlertsAdapter()` (deployment, namespaced RBAC, SA, NetworkPolicy, cross-namespace monitoring RoleBinding, proposals ClusterRole/ClusterRoleBinding), then uses `listOwnedResources()` which queries every resource type by owner reference UID (not labels). This is more reliable than label-based cleanup. The wait loop polls with a fixed interval and timeout, using `wait.PollUntilContextTimeout`.

 ### Status Update Mechanics
 `UpdateStatusCondition()` uses `retry.RetryOnConflict` with `client.MergeFrom` patch. It preserves `LastTransitionTime` for conditions whose status hasn't changed. It re-fetches the CR before each update attempt to get the latest ResourceVersion.
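The `LastTransitionTime` rule from the Status Update Mechanics paragraph can be shown in isolation. The sketch below covers only that rule; the conflict retry, the re-fetch, and the merge patch are left out. The `condition` type, `setCondition`, and `demoTransitions` are simplified, hypothetical stand-ins rather than the operator's types.

```go
package main

import (
	"fmt"
	"time"
)

// condition is a pared-down stand-in for a Kubernetes status condition.
type condition struct {
	Type               string
	Status             string
	LastTransitionTime time.Time
}

// setCondition upserts next into conds. If a condition of the same type
// already exists with the same status, its LastTransitionTime is preserved;
// otherwise the timestamp carried by next is used.
func setCondition(conds []condition, next condition) []condition {
	for i, c := range conds {
		if c.Type != next.Type {
			continue
		}
		if c.Status == next.Status {
			next.LastTransitionTime = c.LastTransitionTime
		}
		conds[i] = next
		return conds
	}
	return append(conds, next)
}

// demoTransitions applies two updates to one condition and reports whether the
// timestamp survived an unchanged status and moved on a changed one.
func demoTransitions() (keptOnSameStatus, movedOnChange bool) {
	t0 := time.Date(2024, 1, 1, 0, 0, 0, 0, time.UTC)
	t1 := t0.Add(time.Minute)
	t2 := t1.Add(time.Minute)

	conds := []condition{{Type: "AlertsAdapterReady", Status: "True", LastTransitionTime: t0}}

	conds = setCondition(conds, condition{Type: "AlertsAdapterReady", Status: "True", LastTransitionTime: t1})
	keptOnSameStatus = conds[0].LastTransitionTime.Equal(t0)

	conds = setCondition(conds, condition{Type: "AlertsAdapterReady", Status: "False", LastTransitionTime: t2})
	movedOnChange = conds[0].LastTransitionTime.Equal(t2)
	return keptOnSameStatus, movedOnChange
}

func main() {
	kept, moved := demoTransitions()
	fmt.Println("kept on same status:", kept)
	fmt.Println("moved on status change:", moved)
}
```

Preserving the timestamp this way keeps it meaningful as "when the status last flipped" rather than "when the controller last reconciled".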

.ai/spec/what/bundle-composition.md

Lines changed: 4 additions & 4 deletions
@@ -36,8 +36,8 @@ The lightspeed-operator OLM bundle installs both the lightspeed-operator control

 ### Agentic Operand Deployment

-16. [PLANNED: OLS-3236] The lightspeed-operator deploys the agentic alerts adapter and the agentic console plugin as fully reconciled operands, with Phase 1/2 reconciliation, status conditions, health monitoring, and finalizer cleanup. The agentic-operator does not deploy these operands.
-17. [PLANNED: OLS-3236] Agentic operand images default to `:main` tags until Konflux onboarding provides SHA-pinned productized images. CLI flags (`--alerts-adapter-image`, `--agentic-console-image`) on the lightspeed-operator deployment override the defaults.
+16. The lightspeed-operator reconciles the agentic alerts adapter as a fully managed operand (OLS-3348, opt-in via `spec.ols.deployment.alertsAdapter.configMapRef`): Phase 1/2 reconciliation when enabled, `AlertsAdapterReady` status condition (`NotConfigured` when disabled), health monitoring, operand teardown on disable, ConfigMap watcher restarts, and finalizer cleanup via `RemoveAlertsAdapter()`. The agentic console plugin remains [PLANNED: OLS-3236].
+17. Agentic operand images default to `:main` tags until Konflux onboarding provides SHA-pinned productized images. The `--alerts-adapter-image` flag is implemented on the lightspeed-operator binary; wiring it into the CSV deployment spec is [PLANNED: OLS-3236]. The `--agentic-console-image` flag is [PLANNED: OLS-3236].

 ## Configuration Surface

@@ -48,7 +48,7 @@ The lightspeed-operator OLM bundle installs both the lightspeed-operator control
 | Agentic controller startup flags | CSV deployment spec args | Operand image overrides for the agentic controller |
 | Agentic controller `--sandbox-mode` | CSV deployment spec args | `bare-pod` (default) or `sandbox-claim` — selects sandbox provisioning strategy |
 | Agentic controller `--agentic-sandbox-image` | CSV deployment spec args | [PLANNED: OLS-3236] Sandbox container image (default: `:main` tag, overridable) |
-| Lightspeed controller `--alerts-adapter-image` | CSV deployment spec args | [PLANNED: OLS-3236] Alerts adapter container image (default: `:main` tag) |
+| Lightspeed controller `--alerts-adapter-image` | `cmd/main.go` flag (implemented); CSV deployment spec args [PLANNED: OLS-3236] | Alerts adapter container image (default: Konflux `:main` tag) |
 | Lightspeed controller `--agentic-console-image` | CSV deployment spec args | [PLANNED: OLS-3236] Agentic console plugin container image (default: `:main` tag) |

 ## Constraints
@@ -61,4 +61,4 @@ The lightspeed-operator OLM bundle installs both the lightspeed-operator control

 | Ticket | Summary |
 |---|---|
-| OLS-3236 | Migrate agentic console deployment from agentic-operator to lightspeed-operator. Add alerts-adapter as new operand. Add `--alerts-adapter-image` and `--agentic-console-image` flags to lightspeed-operator CSV deployment. Remove `--agentic-console-image` from agentic-operator CSV deployment. |
+| OLS-3236 | Migrate agentic console deployment from agentic-operator to lightspeed-operator. Wire `--alerts-adapter-image` and `--agentic-console-image` into lightspeed-operator CSV deployment. Remove `--agentic-console-image` from agentic-operator CSV deployment. |

.ai/spec/what/crd-api.md

Lines changed: 6 additions & 3 deletions
@@ -108,7 +108,9 @@ Field path (relative to `spec.ols.deployment`) | JSON key | Go type | Notes
 `mcpServer` | `mcpServer` | `ContainerConfig` | MCP server container. Resources only
 `console` | `console` | `Config` | Console container. Has replicas field but operator forces 1
 `database` | `database` | `Config` | Database container. Has replicas field but operator forces 1
-`alertsAdapter` | `alertsAdapter` | `Config` | [PLANNED: OLS-3236] Agentic alerts adapter container. Replicas forced to 1
+`alertsAdapter` | `alertsAdapter` | `AlertsAdapterSpec` | Agentic alerts adapter deployment and user-managed runtime config reference. Replicas forced to 1
+
+`AlertsAdapterSpec` embeds `Config` (deployment scheduling/resources) and optional `configMapRef` (`LocalObjectReference`). Setting `configMapRef` **enables** the alerts adapter operand. The referenced ConfigMap name is `configMapRef.name` (commonly `alerts-adapter-config`; see [adapter manifests](https://github.com/openshift/lightspeed-agentic-alerts-adapter/tree/main/manifests)). The operator does not create or update ConfigMap content. When the ConfigMap exists in the operator namespace it must contain data key `config.yaml`; a missing ConfigMap is allowed and the adapter uses built-in defaults.
 `agenticConsole` | `agenticConsole` | `Config` | [PLANNED: OLS-3236] Agentic console plugin container. Replicas forced to 1

 20. Replicas are only user-configurable for the API container (`spec.ols.deployment.api.replicas`). For console, database, alerts adapter, and agentic console, the operator always overrides replicas to 1 regardless of spec value.
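The `AlertsAdapterSpec` rules described in the hunk above (opt-in through `configMapRef`, the `config.yaml` key requirement, replicas forced to 1) can be sketched with simplified types. Everything below is an illustrative assumption: the struct definitions are pared-down stand-ins for the CRD types, and `alertsAdapterEnabled`, `validateConfigMapData`, and `effectiveReplicas` are hypothetical helper names. How the operator treats a `configMapRef` with an empty name is not stated in the spec, so the sketch only checks for presence.

```go
package main

import (
	"errors"
	"fmt"
)

// Simplified stand-ins for the CRD types; the real Config has more fields.
type LocalObjectReference struct {
	Name string `json:"name"`
}

type Config struct {
	Replicas *int32 `json:"replicas,omitempty"`
}

// AlertsAdapterSpec embeds Config and adds the optional ConfigMap reference.
type AlertsAdapterSpec struct {
	Config       `json:",inline"`
	ConfigMapRef *LocalObjectReference `json:"configMapRef,omitempty"`
}

// alertsAdapterEnabled treats the presence of configMapRef as the opt-in switch.
func alertsAdapterEnabled(spec *AlertsAdapterSpec) bool {
	return spec != nil && spec.ConfigMapRef != nil
}

// validateConfigMapData allows a missing ConfigMap (the adapter then uses its
// built-in defaults) but requires the config.yaml key when the ConfigMap exists.
func validateConfigMapData(exists bool, data map[string]string) error {
	if !exists {
		return nil
	}
	if _, ok := data["config.yaml"]; !ok {
		return errors.New(`ConfigMap is missing data key "config.yaml"`)
	}
	return nil
}

// effectiveReplicas ignores the spec value: the operator forces one replica.
func effectiveReplicas(_ *AlertsAdapterSpec) int32 {
	return 1
}

func main() {
	three := int32(3)
	spec := &AlertsAdapterSpec{
		Config:       Config{Replicas: &three},
		ConfigMapRef: &LocalObjectReference{Name: "alerts-adapter-config"},
	}
	fmt.Println("enabled:", alertsAdapterEnabled(spec))
	fmt.Println("replicas:", effectiveReplicas(spec))
	fmt.Println("validation:", validateConfigMapData(true, map[string]string{"other": "x"}))
}
```

Using a pointer for `ConfigMapRef` lets "not set" be distinguished from "set", which is what makes the field usable as an opt-in switch.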
@@ -280,7 +282,7 @@ Condition types used by the operator:
 - `ApiReady` -- API server deployment health
 - `CacheReady` -- PostgreSQL cache deployment health
 - `ConsolePluginReady` -- Console UI plugin deployment health
-- `AlertsAdapterReady` -- [PLANNED: OLS-3236] Agentic alerts adapter deployment health
+- `AlertsAdapterReady` -- Agentic alerts adapter deployment health
 - `AgenticConsolePluginReady` -- [PLANNED: OLS-3236] Agentic console plugin deployment health
 - `ResourceReconciliation` -- Overall resource reconciliation status (set directly, not deployment-based)

@@ -372,7 +374,8 @@ Path | Type | Default | Required | Validation | Description
 `spec.ols.deployment.database.nodeSelector` | `map[string]string` | -- | No | -- | Database node selector
 `spec.ols.deployment.database.affinity` | `*Affinity` | -- | No | -- | Database affinity
 `spec.ols.deployment.database.topologySpreadConstraints` | `[]TopologySpreadConstraint` | -- | No | -- | Database topology spread
-`spec.ols.deployment.alertsAdapter` | `Config` | -- | No | -- | [PLANNED: OLS-3236] Alerts adapter deployment
+`spec.ols.deployment.alertsAdapter` | `AlertsAdapterSpec` | -- | No | -- | Alerts adapter deployment and config reference
+`spec.ols.deployment.alertsAdapter.configMapRef` | `LocalObjectReference` | (none) | No | -- | Opt-in switch and runtime config reference: ConfigMap name in operator namespace (key `config.yaml` when present)
 `spec.ols.deployment.alertsAdapter.replicas` | `*int32` | `1` | No | Min=0 | Alerts adapter replicas (operator forces 1)
 `spec.ols.deployment.alertsAdapter.resources` | `*ResourceRequirements` | -- | No | -- | Alerts adapter resources
 `spec.ols.deployment.alertsAdapter.tolerations` | `[]Toleration` | -- | No | -- | Alerts adapter tolerations
