envoy: server-side apply, skip unchanged applies, and ConfigMap checksum for pod rollouts#246
Conversation
✅ Deploy Preview for kube-agentic-networking ready!
To edit notification comments on pull requests, go to your Netlify project configuration. |
492fdad to
717ab2f
Compare
717ab2f to
142f746
Compare
b3d3d1b to
0fd05f0
Compare
|
/test pull-kube-agentic-networking-verify |
|
I think since we are adding this new server-side apply logic, it would be good to send a PR for one resource for review to avoid having a quite large review. Once that review is done, we can have a follow up for the rest of the resources as they will be boilerplate in a single follow-up PR. In addition, it would be good to discuss explicitly which fields you chose to be managed vs unmanaged. There are quite a lot of choices here and it's not clear that the choices are obvious? Finally, the checksum on the ConfigMap. I would like this separate out into its own PR. |
0fd05f0 to
d7844d0
Compare
4437b9a to
5f53341
Compare
5f53341 to
b4a6c11
Compare
|
@bowei , Thanks for the review — agree on all three points.
|
|
/lgtm /hold for @LiorLieberman to approve |
| } | ||
|
|
||
| // ensureSA ensures that the ServiceAccount for the Envoy proxy exists. | ||
| // ensureSA applies the ServiceAccount for the Envoy proxy (server-side apply). |
There was a problem hiding this comment.
There are some resources we should just consider "owned completely" by the controller. In these cases, it's not out of the question to do the simple thing and overwrite all fields. Is Service Account one of these cases?
I'm trying to think through which controllers/users will "collaborate" on the definition of the service account.
There was a problem hiding this comment.
Yes — the Envoy proxy ServiceAccount is fully owned by this controller. It has a deterministic hashed name, a Gateway ownerReference, is only referenced by the proxy Deployment, and is deleted with the Gateway. We don’t expect users or other controllers to co-manage this object directly; the only supported customization path is Gateway.spec.infrastructure labels/annotations propagated to proxy infra.
We still shouldn’t do a full-object replace that clears .secrets (populated by the token controllers). SSA via serviceAccountApply only sets labels, annotations, and ownerReferences with our field manager + Force, which is the simple “overwrite what we own” model without touching apiserver-owned fields.
I removed the GET/compare skip-apply logic for the ServiceAccount — that optimization made sense for shared/complex resources but was unnecessary here. ensureSA now always Applies. ConfigMap/Deployment/Service still use skip-apply where the field surface is larger.
…ollouts. ServiceAccount: GET before Apply; if labels, annotations, and ownerReferences match the rendered desired object, skip Apply to avoid redundant writes. Document managed vs unmanaged fields in godoc on serviceAccountDesiredMatchesExisting for review clarity. Server-side apply for Envoy infra (apply.go) and ServiceAccount get/compare/skip. ConfigMap digest on Deployment pod template (annotation + reconcile helper + tests) so bootstrap/SDS ConfigMap updates roll pods. ConfigMap/Deployment/Service skip-apply comes in the following commit. Signed-off-by: Harshitha MS <harshitha.ms@ibm.com>
GET before Apply for bootstrap ConfigMap, Envoy Deployment, and proxy Service; skip Apply when desired state matches live objects per configMapApply / deploymentApply / serviceApply. Signed-off-by: Harshitha MS <harshitha.ms@ibm.com>
b4a6c11 to
2847436
Compare
|
New changes are detected. LGTM label has been removed. |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: HarshithaMS005 The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
What type of PR is this?
/kind feature
**What this PR does **:
The Envoy ResourceManager used to only create the proxy ServiceAccount, ConfigMap, Deployment, and Service when they were missing. This PR reconciles those objects on every sync so changes (Envoy image, bootstrap/SDS content, manual edits) converge without deleting the Gateway.
Infra updates use server-side apply (docs) with a single field manager and Force, instead of field-by-field comparison in reconcile.go. reconcile.go now only holds configMapDataChecksum for the pod-template annotation. Because updating a mounted ConfigMap alone does not restart pods, the Deployment pod template keeps a checksum annotation derived from rendered ConfigMap data so bootstrap/SDS changes still trigger a rollout. The controller does not wait for the Envoy Deployment Available condition, so it does not block reconciliation of other Gateways (#198).
Proxy resources use the standard Gateway label gateway.networking.k8s.io/gateway-name (GEP-1762).
Which issue(s) this PR fixes:
Fixes #238
Does this PR introduce a user-facing change?: