WIP: OCPCLOUD-3465: feat: implement cluster-wide proxy support via new ProxyController#586
WIP: OCPCLOUD-3465: feat: implement cluster-wide proxy support via new ProxyController#586tmshort wants to merge 10 commits into
Conversation
|
@tmshort: This pull request references OCPCLOUD-3465 which is a valid jira issue. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
/wip |
|
Pipeline controller notification For optional jobs, comment This repository is configured in: LGTM mode |
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
WalkthroughThis PR adds cluster-wide proxy reconciliation: installer wiring exposes the tracking cache, a new ProxyController syncs configv1.Proxy into tracked workloads' pod-template env, a util reads the Proxy resource, manifests add an inject-proxy annotation, and tests adjust manager metrics/TLS settings. ChangesProxy Configuration Reconciliation
Sequence DiagramsequenceDiagram
participant Proxy as configv1.Proxy
participant ProxyCtrl as ProxyController
participant Util as GetProxyEnvVars
participant TrackingCache as managedcache.TrackingCache
participant Workload as Deployment/DaemonSet/StatefulSet
Proxy->>ProxyCtrl: watch cluster proxy changes
ProxyCtrl->>Util: fetch HTTP_PROXY, HTTPS_PROXY, NO_PROXY
Util-->>ProxyCtrl: return proxy env vars
ProxyCtrl->>TrackingCache: list tracked workloads
ProxyCtrl->>Workload: patch pod template containers with proxy env vars via SSA
Workload-->>ProxyCtrl: acknowledge patch
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes 🚥 Pre-merge checks | ✅ 13 | ❌ 2❌ Failed checks (2 warnings)
✅ Passed checks (13 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
|
/assign mdbooth |
|
There's a few problems here. The biggest one is that The other big thing is that I don't want the revision/installer controllers to contain any heuristics about providers. This is why for the TLS stuff I made the providers include explicit substitution points for the TLS vars. In this case the heuristic is that we only want to add proxy vars to a D/DS/SS with a container called 'manager'. This isn't true today (OpenStack already packages ORC, which doesn't follow this pattern), and I only expect the exceptions to expand over time. The takeaways are:
So perhaps something like an annotation This also means we can't use the PROXY_HASH mechanism, because if we're going to apply the proxy env vars directly they need to be included in the revision in full. I also had another option in mind. We could create a new controller which watches objects managed by the installer controller which have the new annotation, and adds the env vars after creation. This works because:
This may or may not be simpler, not sure. It's something to consider. I think we need to:
|
|
Reevaluating based on @mdbooth's comments. |
ceacc09 to
b269f09
Compare
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
|
@CodeRabbit help |
ChatThere are 3 ways to chat with CodeRabbit:
CodeRabbit commands
Other keywords and placeholders
Status, support, documentation and community
|
|
@coderabbitai full review |
✅ Action performedFull review finished. |
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@pkg/controllers/installer/proxy_controller.go`:
- Around line 159-161: The per-object calls to applyProxyVars inside the
workload reconciliation loops currently return on the first error, which stops
processing remaining objects; update each loop that calls applyProxyVars (the
calls at deploy.Name / deploy.Namespace and the similar calls for other workload
kinds) to instead collect errors into a slice and continue processing the rest
of the items, then after the loop return a combined error if any occurred (use
k8s.io/apimachinery/pkg/util/errors.NewAggregate or errors.Join) or nil if none;
ensure you append the returned err from applyProxyVars to the collector and do
not early-return inside the loop.
- Around line 153-157: The code currently skips objects when
ProxyInjectAnnotation is missing (e.g., checks like
deploy.Spec.Template.Annotations[ProxyInjectAnnotation]) which leaves previously
injected proxy env vars owned by "capi-operator-proxy" stale; update the
reconcile logic for Deployments, StatefulSets and DaemonSets to, when the
annotation is absent, explicitly remove env vars injected by this operator
(filter by env var owner/marker and remove from each container/envFrom) and
persist the updated PodTemplate (e.g., handle deploy.Spec.Template,
statefulSet.Spec.Template, daemonSet.Spec.Template) so opting-out cleans up
proxy env vars instead of leaving them behind.
In `@pkg/controllers/installer/suite_test.go`:
- Around line 109-110: The SetupWithManager call in the test asserts a bare
success but lacks context on failure; modify the assertion around
SetupWithManager (the call that assigns _, err = SetupWithManager(mgr,
allProviderProfiles, triggerSource)) to include a descriptive failure message
(e.g., "failed to register installer controller with manager") by passing that
message to the Expect assertion so a BeforeSuite failure clearly points at
installer controller registration rather than surfacing only the raw error.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Central YAML (inherited)
Review profile: CHILL
Plan: Enterprise
Run ID: 24a04a88-f1b2-4d76-a9cf-ecad951c51aa
📒 Files selected for processing (7)
cmd/capi-operator/main.gomanifests-gen/customizations.gopkg/controllers/installer/installer_controller.gopkg/controllers/installer/proxy_controller.gopkg/controllers/installer/suite_test.gopkg/controllers/revision/helpers_test.gopkg/util/proxy.go
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@pkg/controllers/installer/proxy_controller.go`:
- Around line 180-190: When the ProxyInjectAnnotation value changes (e.g. from
"manager,sidecar" to "manager") the current code only patches containers listed
in the new annotation, leaving proxy env vars in containers removed from the
annotation. Update the logic around ProxyInjectAnnotation handling so that after
computing containers := containerNamesFromAnnotation(annotation) you also
compute specContainers := containerNamesFromSpec(item.containers), compute
removed := specContainers minus containers, and call c.applyProxyVars twice:
once for the annotated containers with effectiveVars and once for the removed
containers with nil (to clear stale vars). Use the existing symbols
ProxyInjectAnnotation, containerNamesFromAnnotation, containerNamesFromSpec,
effectiveVars and applyProxyVars to locate and implement the fix.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Central YAML (inherited)
Review profile: CHILL
Plan: Enterprise
Run ID: 99e7c71d-f60d-4ac5-8223-4d95f98cd930
📒 Files selected for processing (2)
pkg/controllers/installer/proxy_controller.gopkg/controllers/installer/suite_test.go
🚧 Files skipped from review as they are similar to previous changes (1)
- pkg/controllers/installer/suite_test.go
mdbooth
left a comment
There was a problem hiding this comment.
Nice. I think this is most of the way there. I've requested some code cleanup things like use of constants, using a fully qualified field owner, etc.
Big things:
Please can you put the new controller in its own package? This is mostly for consistency. We've got a bunch of controller, and they're all separated in their own package.
I'm pretty sure the SSA handling in the workitem loop isn't right. An SSA fully specifies everything owned by the field owner on the resource. Multiple SSAs with the same field owner will just overwrite each other. Fully construct the desired state (owned fields only) and submit a single SSA. Any which weren't specified will be removed automatically.
No tests. I think this would have highlighed the above. These should be envtests. We have a ton of examples, so Claude will write them for you in a minute.
I think the 'nothing to do' optimisation in applyProxyVars is important.
| return serviceSecretNames | ||
| } | ||
|
|
||
| const proxyInjectAnnotation = "capi-operator.openshift.io/inject-proxy" |
There was a problem hiding this comment.
I'm pretty sure we've got a constant defined somewhere for capi-operator.openshift.io. Please can we use it here?
| // container names that should receive HTTP_PROXY, HTTPS_PROXY, and NO_PROXY | ||
| // environment variables. The annotation is added by manifests-gen and can be | ||
| // overridden by provider manifests. | ||
| ProxyInjectAnnotation = "capi-operator.openshift.io/inject-proxy" |
There was a problem hiding this comment.
Perhaps use this constant from manifests-gen? I can't remember which way round we configured the dependencies (note that manifests-gen is a separate module).
| apiVersion: "apps/v1", | ||
| kind: "Deployment", |
There was a problem hiding this comment.
micro-nit: I'd personally have pulled these from the item, too.
Don't change this unless you're particularly inspired to.
| }, | ||
| } | ||
|
|
||
| if err := c.client.Patch(ctx, patch, util.ApplyConfigPatch(patch.Object), client.FieldOwner(proxyFieldManager), client.ForceOwnership); err != nil { |
There was a problem hiding this comment.
The field owner needs to be qualified (i.e. capi-operator.../thingy). The domain should again use an existing constant. Depending on where the constant is currently, we might have to move it somewhere more sensible.
| for _, containerName := range containerNames { | ||
| containerPatches = append(containerPatches, map[string]interface{}{ | ||
| "name": containerName, | ||
| "env": envEntries, |
There was a problem hiding this comment.
nit: I'd bet this could be simplified to just pass proxyVars directly here and get rid of the proxyEnvEntries function.
| if !hasAnnotation { | ||
| // No annotation: clear any stale proxy env vars from all containers. | ||
| if err := c.applyProxyVars(ctx, item.apiVersion, item.kind, item.name, item.namespace, allContainers, nil); err != nil { | ||
| errs = append(errs, err) | ||
| } | ||
|
|
||
| continue | ||
| } |
There was a problem hiding this comment.
I think this is redundant. If the annotation is not set, all we need is for containerNamesFromAnnotation below to return empty targets and any env vars previously set by us will be cleared when we construct an empty SSA.
| if len(containerNames) == 0 { | ||
| return nil | ||
| } |
There was a problem hiding this comment.
This is problematic. We need to be able to submit an empty transaction.
| // it applies an empty env list, which removes any env var entries previously | ||
| // owned by the proxy controller's SSA field manager. Only the three proxy env | ||
| // var entries are owned; all other fields remain owned by boxcutter. | ||
| func (c *ProxyController) applyProxyVars( |
There was a problem hiding this comment.
I think this needs an optimisation to prevent a potential apply storm during installation. The controller triggers on modification to any object, but for each modified object reconciles the Proxy object and iterates over all objects. We've got a bit of matrix amplification going here.
That's ok, but we're also unconditionally submitting an SSA for every workitem on every reconcile, even if it's blank. That's not very nice to KAS.
The typed apply configuration objects have an ExtractInto function which will extract all fields managed by a particular field owner. I wonder if an equivalent exists for Unstructured?
We could:
- construct
patchas we're already doing below - ExtractInto a new unstructured with the same field owner
- If patch deepequals existing, I think we can skip it
There was a problem hiding this comment.
Any idea why these changes were required? Doesn't look like we should be touching these.
There was a problem hiding this comment.
It was likely a CodeRabbit comment; it ought to be revertable.
There was a problem hiding this comment.
But the BindAddress was changed to explicitly be "0" so that a port can be randomly selected, and subsequently allowing more parallelization of tests.
| // Arbitrarily chosen insecure default tls configuration for tests | ||
| config.CipherSuites = []uint16{tls.TLS_RSA_WITH_RC4_128_SHA} |
There was a problem hiding this comment.
Especially shouldn't be changing these. Was this code rabbit thinking deliberately insecure test defaults are a security issue by any chance?
There was a problem hiding this comment.
I believe it was. Since the test have two stages; I couldn't tell if it passed or not. :(
a1d05f6 to
724bbbd
Compare
|
/test vendor |
|
/test e2e-aws-capi-techpreview Test at least one of the e2es. |
724bbbd to
b095b94
Compare
|
rebased |
|
/test e2e-aws-capi-techpreview Test at least one of the e2es. |
mdbooth
left a comment
There was a problem hiding this comment.
I know I asked you to move the proxy controller into a separate package same as we've done for the other controller, but I'm questioning that. I was wondering why you didn't also move the tests: it's because you're using the installer controller to create managed resources for the tests.
This wasn't what I expected, but I think I like it better than what I expected: the proxy controller explicitly only ever operates on resources managed by the installer controller. Integrating those in the integration tests absolute makes sense (to me, but I know this is not a universally popular view).
How can we make that relationship more explicit in the structure? For crdcompatibility we put related controllers in sub-packages. Suggest we also do that here? I think I'd leave the tests where they are for now.
I've asked for consideration of another SSA thing (UnstructuredExtractor): either a fix or at least a comment. And either way some test coverage of the case.
| // proxyVarNames are the only env var names this controller ever owns. | ||
| // Used for the skip-if-unchanged optimisation. | ||
|
|
| Watches(&appsv1.Deployment{}, handler.EnqueueRequestsFromMapFunc(toProxy), | ||
| builder.WithPredicates(isManagedWorkload)). | ||
| Watches(&appsv1.DaemonSet{}, handler.EnqueueRequestsFromMapFunc(toProxy), | ||
| builder.WithPredicates(isManagedWorkload)). | ||
| Watches(&appsv1.StatefulSet{}, handler.EnqueueRequestsFromMapFunc(toProxy), | ||
| builder.WithPredicates(isManagedWorkload)). | ||
| Complete(c) |
There was a problem hiding this comment.
I don't think is right. I think we need to use the tracking cache here for 2 reasons:
- This will start new D/DS/SS informers rather than using the tracking cache's existing watches, so it uses more memory/KAS capacity
- This will be subject to the DefaultNamespace cache options we configured in the entrypoint, so we won't watch anything the provider installs outside of the default namespaces
On the latter point, although we expect providers to install things in the default namespace I don't think we want or need to hardcode this requirement.
| // Watch only managed workloads (those labelled by the installer). | ||
| // Using standard controller-runtime Watches instead of trackingCache.Source() | ||
| // avoids registering a handler on the TrackingCache, which prevents it from | ||
| // stopping cleanly and can leave orphaned envtest processes. | ||
| managedReq, err := labels.NewRequirement(revisiongenerator.ManagedLabelKey, selection.Exists, nil) | ||
| if err != nil { | ||
| return fmt.Errorf("building managed label requirement: %w", err) | ||
| } | ||
|
|
||
| isManagedWorkload := predicate.NewPredicateFuncs(func(obj client.Object) bool { | ||
| return labels.NewSelector().Add(*managedReq).Matches(labels.Set(obj.GetLabels())) | ||
| }) | ||
|
|
There was a problem hiding this comment.
If we use the tracking cache directly I think we don't need (or want) this.
| } | ||
|
|
||
| // Build the set of containers that should receive proxy env vars. | ||
| annotatedSet := make(map[string]struct{}) |
There was a problem hiding this comment.
nit: Set[string] is idiomatic in this codebase
| } else { | ||
| // Not annotated, or annotated but proxy was cleared: none of our var | ||
| // names may be present (covers both "never had proxy" and "proxy removed"). | ||
| for _, name := range proxyVarNames() { | ||
| if _, ok := envByName[name]; ok { | ||
| return false | ||
| } | ||
| } | ||
| } | ||
| } |
There was a problem hiding this comment.
I don't think this is quite right, because it asserts that unannotated containers do not have proxy vars. They could have proxy vars which aren't managed by us. I think UnstructuredExtractor could be the answer here. Just be careful to create the Extractor 'globally' e.g. in SetupWithManager, because it has its own discovery cache.
That said, while my preference is to fix this, it should only result in 1 additional reconcile because the resulting SSA will still be correct and will not result in further reconciles. So if the fix starts looking too complex may just add a comment here explaining that it's not quite right and why the impact of that is contained.
| }, defaultNodeTimeout) | ||
| }) | ||
|
|
||
| Context("SSA field ownership correctness", func() { |
There was a problem hiding this comment.
Please can you add another test here: does not affect proxy vars set on un-annotated containers.
So: add some proxy vars to the other container with custom values unrelated to the cluster configuration. Annotate the deployment to manage the manager container. Proxy vars should be set on manager and left alone (i.e. not removed) on other.
|
I've copied an additional clanker review below. I agree with most of that, although I wonder if 'Only uppercase proxy env vars' should be major. BlockingNone. The feature works correctly on the primary happy paths (full proxy enable/disable/delete). Major Issues1. proxyVarsAlreadyApplied() allows stale env vars on partial proxy reconfigurationhttps://github.com/openshift/cluster-capi-operator/blob/b269f09a1bae68269e132911e56dea251c83138b/pkg/controllers/proxy When a container is annotated and proxyVars is non-empty, the function only checks that the desired vars are present Scenario: Admin previously had HTTPProxy, HTTPSProxy, and NoProxy all configured (three env vars injected). Admin then Fix: In the annotated+non-empty branch, additionally verify that no other proxy var names from proxyVarNames() are desiredNames := make(map[string]struct{}, len(proxyVars))
for _, pv := range proxyVars {
desiredNames[pv.Name] = struct{}{}
}
for _, name := range proxyVarNames() {
if _, desired := desiredNames[name]; !desired {
if _, present := envByName[name]; present {
return false
}
}
}2. Test coverage gaps for important behaviour pathsThe 6 integration tests cover the core flows well but miss:
Additionally, the haveProxyEnvVars() matcher only checks env var names, not values: https://github.com/openshift/cluster-capi-operator/blob/b269f09a1bae68269e132911e56dea251c83138b/pkg/controllers/insta A proxy URL update test would pass even if the values are wrong. Consider matching both name and value: func haveProxyEnvVarsWithURL(url string) OmegaMatcher {
return HaveField("Env", ContainElements(
SatisfyAll(HaveField("Name", "HTTP_PROXY"), HaveField("Value", url)),
SatisfyAll(HaveField("Name", "HTTPS_PROXY"), HaveField("Value", url)),
// ...
))
}3. No unit tests for SSA helpers in pkg/controllers/proxy/buildPatch, proxyVarsAlreadyApplied, and containerNamesFromAnnotation are non-trivial pure functions with edge cases.
Minor Issues4. Only uppercase proxy env varsSome Go HTTP clients (and libraries like net/http's ProxyFromEnvironment) check lowercase http_proxy first. The 5. EnvVar.ValueFrom not handled by skip checkIn proxyVarsAlreadyApplied, only Value is indexed. If another field manager sets HTTP_PROXY via ValueFrom (e.g. a 6. Stale doc reference in installer_controller.goThe updated SetupWithManager godoc references SetupProxyController, but the actual function is proxy.SetupWithManager. 7. manifests-gen domain constant duplicationDuplicated because manifests-gen is a separate Go module. The comment explaining this is good. No great alternative Nits
|
Add a ProxyController to the capi-installer manager that watches managed workloads opting into proxy injection via the new annotation capi-operator.openshift.io/inject-proxy: <container,...> and applies HTTP_PROXY/HTTPS_PROXY/NO_PROXY environment variables to the named containers via Server-Side Apply whenever the cluster-wide Proxy CR changes. Design (Option B from OCPCLOUD-3465 Jira): - A dedicated controller in the capi-installer manager watches the Proxy singleton (proxies/cluster) and all managed objects via the existing trackingCache, so it never creates a second informer for those objects. - Env vars are applied with a separate SSA field manager (capi-operator-proxy) that owns only the three proxy env var entries per container. Boxcutter continues to own all other fields on the same workload object without conflict. - Proxy changes propagate to running pods independently of the revision lifecycle, including when the installer has fallen back to an older revision (N-1) because the newest revision failed to roll out. - Starting with Option B avoids a future field-ownership migration: if Option A (inject at render time) had been deployed first, converting to Option B later would require force-stealing SSA ownership from boxcutter. The annotation is added to provider Deployments by manifests-gen (see next commit). Providers that use different container names or need proxy injection on additional containers should set the annotation in their upstream manifests. SetupWithManager now returns the TrackingCache so it can be shared with SetupProxyController without creating a second managed-object cache. Refs OCPCLOUD-3465 Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com> Signed-off-by: Todd Short <tshort@redhat.com>
…or new SetupWithManager signature - Add Metrics.BindAddress="0" to revision controller test manager so it binds to a random port instead of :8080, eliminating port conflicts when consecutive test runs start new managers before the previous one releases the port. Matches the pattern already used by every other controller test suite in this repo. - Replace RC4 (TLS_RSA_WITH_RC4_128_SHA / TLS 1.0) with TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 / TLS 1.2 in the default test TLS config to avoid the No-Weak-Crypto CI check. - Update installer suite_test.go to handle the new (TrackingCache, error) return type of SetupWithManager. Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com> Signed-off-by: Todd Short <tshort@redhat.com>
Clear stale proxy env vars on annotation removal: previously, workloads that lost the inject-proxy annotation kept proxy env vars owned by the capi-operator-proxy SSA field manager indefinitely. Now when the annotation is absent the controller applies an empty env list to all containers in the pod spec, which removes any previously-owned HTTP_PROXY/HTTPS_PROXY/NO_PROXY entries. Objects that never had the annotation are unaffected (no ownership to clear). Continue reconciling remaining workloads after per-object patch errors: the three separate applyTo* functions (which also caused a duplication lint failure) are now merged into a single applyToAllWorkloads+collectWorkloads pair. Errors are accumulated per-object and joined rather than short- circuiting the rest of the reconcile loop. Add assertion message to SetupWithManager in suite_test.go so a BeforeSuite failure names the failing step clearly. Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com> Signed-off-by: Todd Short <tshort@redhat.com>
…ation The previous fix only cleaned up stale proxy env vars when the annotation was entirely removed. If the annotation changed from "manager,sidecar" to "manager", the sidecar container kept HTTP_PROXY/HTTPS_PROXY/NO_PROXY owned by capi-operator-proxy indefinitely. Now for each workload we split containers into two groups: - Containers named in the annotation: proxy vars applied (or refreshed) - Containers in the spec but NOT in the annotation: empty env applied, removing our SSA field-manager ownership of any proxy vars we set before This covers three cases cleanly: - Annotation absent: all containers cleared (existing behaviour) - Annotation present, unchanged: annotated get vars, non-annotated get cleared (no-op if we never owned their env vars) - Annotation shrinks: dropped containers get cleared, remaining get vars Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com> Signed-off-by: Todd Short <tshort@redhat.com>
SSA field manager correctness:
The previous implementation sent two separate SSA applies from the same
field manager per workload (one for annotated containers, one for stale
containers). In Kubernetes SSA semantics, each apply from a field manager
is the *complete* desired state: the second apply would cancel the ownership
established by the first. The fix submits a single apply covering ALL
containers in the spec. Annotated containers receive proxy env vars;
non-annotated containers receive an empty env list, which clears any proxy
env vars previously owned by our field manager. As a result the separate
"stale" handling, the redundant !hasAnnotation fast path, and the early
return on empty containerNames are all removed.
Use existing domain constant for annotation and field manager:
- ProxyInjectAnnotation now uses operatorstatus.CAPIOperatorIdentifierDomain
so the string "capi-operator.openshift.io" is only defined once in the
main module.
- The SSA field manager now uses operatorstatus.CAPIFieldOwner("proxy"),
which produces the qualified name "capi-operator.openshift.io/proxy".
- manifests-gen defines a local capiOperatorDomain constant (manifests-gen
is a standalone module and cannot import from operatorstatus).
Remove proxyEnvEntries helper: env map construction is now inlined inside
applyProxyVars, removing the unnecessary indirection.
Revert inadvertent cipher change in helpers_test.go: the RC4/TLS1.0 default
was intentionally insecure for tests and should not have been changed. Only
the Metrics.BindAddress: "0" fix (needed to prevent port conflicts between
consecutive test runs) is retained.
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Todd Short <tshort@redhat.com>
Add six integration tests that run through the complete proxy injection lifecycle against a real envtest API server: 1. inject proxy env vars into the annotated "manager" container when the Proxy CR is configured, while leaving the unannotated "other" container unaffected — this validates the single-SSA-apply correctness (if the previous two-call design had been used, the second apply would have wiped the first container's vars) 2. clear proxy env vars when the Proxy CR is deleted 3. clear proxy env vars when the Proxy CR is reconfigured to empty values 4. do NOT inject env vars into any container when the inject-proxy annotation is absent — even with a configured proxy 5. clear env vars after the inject-proxy annotation is removed via a new revision (annotation change propagated through the installer) 6. preserve env vars set by other field managers alongside the proxy vars Infrastructure changes: - ProxyController is wired into the installer suite manager alongside the InstallerController, sharing the existing TrackingCache (SetupWithManager now returns it so SetupProxyController can reuse it) - Two new provider profiles (proxy-annotated, proxy-not-annotated) provide a two-container Deployment that exercises both annotated and plain containers in the same object - markDeploymentAvailable() patches the Deployment status to Available=True using komega.UpdateStatus so the installer's deploymentAvailableProbe passes and the revision is marked complete - Proxy CRD added to the shared envtest setup (pkg/test/envtest.go) - AfterSuite manager-stop wait changed from Eventually to a select with a 30s timeout so that tracking-cache drain delays under heavy parallel load do not fail the suite Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com> Signed-off-by: Todd Short <tshort@redhat.com>
…troller Registering a handler on the TrackingCache via trackingCache.Source() prevents the cache from stopping cleanly when the manager context is cancelled. This caused mgr.Start() to hang indefinitely in tests, which in turn prevented test.StopEnvTest() from running, leaving orphaned etcd and kube-apiserver processes after every make unit invocation. Replace the WatchesRawSource(c.trackingCache.Source(...)) registration with standard controller-runtime Watches on Deployment, DaemonSet, and StatefulSet, filtered to objects carrying the ManagedLabelKey. The manager's own cache handles these watches and shuts down cleanly with the context. The listing step in collectWorkloads still uses the TrackingCache (correct, since it only caches managed objects), but the event delivery no longer depends on it. Also revert the AfterSuite workaround (select with 30s timeout) back to the original Eventually now that the root cause is fixed. Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com> Signed-off-by: Todd Short <tshort@redhat.com>
…ged optimisation
Move pkg/controllers/installer/proxy_controller.go to
pkg/controllers/proxy/proxy_controller.go for consistency with the rest of
the codebase where each controller lives in its own package.
The type is renamed from ProxyController to Controller and the entry point
from SetupProxyController to SetupWithManager, following the existing
convention (e.g. revision.RevisionController / revision.SetupWithManager).
Add the "skip if nothing to do" optimisation that mdbooth requested.
Before submitting the SSA apply, proxyVarsAlreadyApplied() compares the
desired proxy state (derived from the annotation + current Proxy CR) against
what the tracking cache reports as the current container state:
- annotated container with proxy configured: all desired vars must be
present with the correct values
- annotated container with proxy cleared, or unannotated container: none
of the proxy var names (HTTP_PROXY / HTTPS_PROXY / NO_PROXY) may be
present
If the state already matches, the SSA apply is skipped entirely, avoiding
unnecessary KAS round-trips during installation when every newly-applied
managed object triggers a ProxyController reconcile.
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Todd Short <tshort@redhat.com>
After rebasing onto main, the installer/revision/proxy controllers moved from cmd/capi-operator to the new cmd/capi-installer binary (PR openshift#581). Wire the proxy controller setup into the new binary and reset cmd/capi-operator/main.go to the upstream version (which now only runs InstallerDeploymentController). Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com> Signed-off-by: Todd Short <tshort@redhat.com>
Use trackingCache.Source() via sub-reconciler pattern: The ProxyController no longer registers as a standalone controller. Instead it becomes a sub-reconciler called directly from the InstallerController's Reconcile(), so both share a single trackingCache.Source() registration and avoid the double-registration shutdown issue that previously caused orphaned envtest processes. The InstallerController gains a Proxy CR watch so that proxy changes still trigger reconciliation. Use UnstructuredExtractor for the skip-if-unchanged optimisation: The previous proxyVarsAlreadyApplied() checked the full container env, which caused an infinite reconcile loop when another field manager set proxy var names on an unannotated container (the check would keep seeing those vars and triggering applies, but SSA would never remove them since they're owned by a different manager). The new check uses UnstructuredExtractor to compare only what our field manager owns against the desired patch, which is both correct and efficient. Use sets.Set[string] for annotatedSet (idiomatic in this codebase). Remove stale orphaned comment about proxyVarNames. Add test: verify proxy vars set by other field managers on unannotated containers are not removed by our controller's SSA apply. Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com> Signed-off-by: Todd Short <tshort@redhat.com>
b095b94 to
26a862e
Compare
|
@tmshort: all tests passed! Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
Add a ProxyController to the capi-installer manager that watches managed
workloads opting into proxy injection via the new annotation
capi-operator.openshift.io/inject-proxy: <container,...>
and applies HTTP_PROXY/HTTPS_PROXY/NO_PROXY environment variables to the
named containers via Server-Side Apply whenever the cluster-wide Proxy CR
changes.
Design (Option B from OCPCLOUD-3465 Jira):
singleton (proxies/cluster) and all managed objects via the existing
trackingCache, so it never creates a second informer for those objects.
that owns only the three proxy env var entries per container. Boxcutter
continues to own all other fields on the same workload object without
conflict.
lifecycle, including when the installer has fallen back to an older
revision (N-1) because the newest revision failed to roll out.
Option A (inject at render time) had been deployed first, converting to
Option B later would require force-stealing SSA ownership from boxcutter.
The annotation is added to provider Deployments by manifests-gen (see next
commit). Providers that use different container names or need proxy injection
on additional containers should set the annotation in their upstream manifests.
SetupWithManager now returns the TrackingCache so it can be shared with
SetupProxyController without creating a second managed-object cache.
Summary by CodeRabbit
New Features
Tests