fix(rbg): use SSA Apply with distinct field managers on RBG writes#309
Open
sebest wants to merge 1 commit into
Open
fix(rbg): use SSA Apply with distinct field managers on RBG writes#309sebest wants to merge 1 commit into
sebest wants to merge 1 commit into
Conversation
There was a problem hiding this comment.
Code Review
This pull request migrates the RoleBasedGroup, RoleBasedGroupScalingAdapter, and RoleBasedGroupSet controllers to use Server-Side Apply (SSA) for resource updates, replacing previous update and patch mechanisms that relied on manual retry logic. Key changes include the introduction of a JSON-based conversion helper for complex spec fields and refactored metadata synchronization. Review feedback correctly identifies that the new SSA configurations in the RoleBasedGroup and RoleBasedGroupSet controllers are missing required Kind and APIVersion fields, which would lead to failures when the API server attempts to identify the target resource.
762f04a to
e03df62
Compare
31795e0 to
7c4d584
Compare
Three RBG-controller write paths used full-object client.Update or
MergeFrom patches on RoleBasedGroup, which the API server records under
the default "manager" field manager (binary name is /manager). On any
chart-value bump that touched a passively co-owned field (containers,
env, volumes, tolerations, ...), Helm v4 SSA Apply with
ForceConflicts=false aborted with field_conflict — even though the RBG
controller never authoritatively writes those sibling fields.
Replace the three writes with narrow SSA Apply calls and assign each its
own field manager:
- ensureDiscoveryConfigMode → "rbg-discovery"
Applies only metadata.annotations.discovery-config-mode.
- updateRoleReplicas (RBGSA controller) → "rbg-replicas"
Applies only spec.roles[].{name, replicas}.
- updateExistingRBGs (RBGSet controller) → "rbg-set-sync"
Applies metadata.{labels, annotations} and spec.roles.
A single shared "rbg" field manager would not work here. SSA's
claim-release semantics: when the same field manager re-applies with a
narrower claim, fields it previously owned but no longer claims are
released and (if no other manager owns them) removed by the API server.
With three Apply paths claiming disjoint subsets of the same RBG, a
shared field manager makes them ping-pong — each Apply silently strips
the others' writes, hot-looping the controller. Distinct field managers
give each path an independent ownership boundary.
The retry.RetryOnConflict wrappers from the previous Update path are
dropped: SSA Apply with Force=true does not 409 on field-manager
conflicts, and no resourceVersion is sent in the apply payload, so
version-skew conflicts do not arise either.
pkg/utils.PatchObjectApplyConfigurationWithFieldManager is the new
field-manager-aware variant; PatchObjectApplyConfiguration remains a
thin wrapper that defaults to the legacy "rbg" field manager for callers
that don't share the RBG main subresource with other rbg-controller
Apply paths (RBG status writes, RBGSA writes, ConfigMap writes — all on
different objects/subresources, no ping-pong risk).
Stripping pre-existing legacy "manager" Update entries on already-
deployed resources is intentionally out of scope; that is the
responsibility of the upstream SSA writer (e.g. via
csaupgrade.UpgradeManagedFields or a one-off managedFields strip).
Validated end-to-end on kind: applied an RBG via
'kubectl apply --server-side --field-manager=external-actor' with
scalingAdapter.enable=true, scaled the auto-created RBGSA to drive
updateRoleReplicas, and applied a parent RBGSet to drive
updateExistingRBGs. Generation is stable (would hot-loop at ~20 Hz with
a shared field manager), the discovery-config-mode annotation persists
alongside external-actor's annotations, external-actor retains
ownership of containers/env/image/resources/scalingAdapter, and
managedFields shows three independent rbg-* field managers each owning
only its narrow claim.
7c4d584 to
3b1ad5a
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Three RBG-controller write paths used full-object
client.UpdateorMergeFrompatches onRoleBasedGroup, which the API server records under the default"manager"field manager (the operator binary is/manager). On any chart-value bump that touched a passively co-owned field (containers, env, volumes, tolerations, ...), Helm v4 SSA Apply withForceConflicts=falseaborted withfield_conflict— even though the RBG controller never authoritatively writes those sibling fields.Replace the three writes with narrow SSA Apply calls and assign each its own field manager:
ensureDiscoveryConfigMode(RBG controller)rbg-discoverymetadata.annotations.discovery-config-modeupdateRoleReplicas(RBGSA controller)rbg-replicasspec.roles[].{name, replicas}updateExistingRBGs(RBGSet controller)rbg-set-syncmetadata.{labels, annotations},spec.rolesA shared
"rbg"field manager across these three paths would not work. Per SSA claim-release semantics, when the same field manager re-applies with a narrower claim, fields it previously owned but no longer claims are released and (if no other manager owns them) removed by the API server. With three Apply paths claiming disjoint subsets of the same RBG, a shared field manager makes them ping-pong — each Apply silently strips the others' writes, hot-looping the controller. Distinct field managers give each path an independent ownership boundary.pkg/utils.PatchObjectApplyConfigurationWithFieldManageris the new field-manager-aware variant;PatchObjectApplyConfigurationremains a thin wrapper that defaults to the legacy"rbg"field manager — preserved for callers that don't share the RBG main subresource with other rbg-controller Apply paths (RBG status writes, RBGSA writes, ConfigMap writes — all on different objects/subresources, no ping-pong risk).The
retry.RetryOnConflictwrappers from the previous Update paths are dropped: SSA Apply withForce=truedoes not 409 on field-manager conflicts, and noresourceVersionis sent in the apply payload, so version-skew conflicts do not arise either.Stripping pre-existing legacy
"manager"Update entries on already-deployed resources is intentionally out of scope; that is the responsibility of the upstream SSA writer (e.g. viacsaupgrade.UpgradeManagedFieldsor a one-offmanagedFieldsstrip).Test plan
go build ./...cleango vet ./...cleangofmt -lcleango test ./internal/controller/workloads/... ./pkg/utils/...green (incl. new tests for each Apply path:TestUpdateRoleReplicas,TestBuildSyncedMetadata,TestToRoleSpecApplyConfigurations,TestRoleBasedGroupSetReconciler_updateExistingRBGs_SSA, extendedTestEnsureDiscoveryConfigMode)kubectl apply --server-side --field-manager=external-actorwithscalingAdapter.enable=true, scale auto-created RBGSA to driveupdateRoleReplicas, apply parent RBGSet to driveupdateExistingRBGs.discovery-config-modeannotation persists alongside external-actor's annotations.kubectl get rbg --show-managed-fieldsshows three independentrbg-discovery,rbg-replicas,rbg-set-syncmanagers each owning only its narrow claim.HEDWIG_CLIENT_ID.value(the original failure mode) with--force-conflicts=falseapplies cleanly.Follow-up
#310 — RBGSet child RBGs carry a residual
managerUpdate entry fromclient.CreateinnewRBGForSet. Predates this PR and doesn't affect the i8s-bridge scenario (which deploys standalone RBGs, not RBGSets), but worth fixing for hygiene.