Skip to content

Commit 049f813

Browse files
camilamacedo86Per Goncalves da Silva
andauthored
✨ (feat): When using Boxcutter feature-gate, use ClusterExtension ServiceAccount for revision operations (#2429)
* (feat): [Boxcutter] Use ClusterExtension ServiceAccount for revision operations Implement serviceAccount-scoped token-based authentication for ClusterExtensionRevision controller using annotation-based configuration. - Add RevisionEngineFactory with CreateRevisionEngine(ctx, rev) interface - Read ServiceAccount from annotations (no ClusterExtension dependency) - Token-based auth using TokenInjectingRoundTripper - ServiceAccount name and namespace in annotations for observability - TrackingCache uses global client for caching/cleanup - Comprehensive error path tests ClusterExtensionRevision can exist independently. Easy mode impersonation deferred until API is finalized. Assisted-by: Cursor * (doc) Add godoc comments to label constants Adds documentation comments to all label/annotation constants explaining: - What each constant represents - Where they are applied (labels vs annotations) - ServiceAccount constants document their relationship to ClusterExtension spec Addresses code review feedback for improved maintainability. * (fix) Add ClusterExtensionRevision permissions to upgrade test RBAC The upgrade test ServiceAccount needs permissions to manage ClusterExtensionRevisions when BoxcutterRuntime is enabled. Without these permissions, the upgraded controller cannot create or update ClusterExtensionRevision resources, causing the ClusterExtension to fail reconciliation after upgrade. * review changes * (fix): e2e: add bind/escalate verbs for Boxcutter Server-Side Apply Add `bind` and `escalate` RBAC verbs to e2e test ServiceAccount to support Boxcutter applier's use of Kubernetes Server-Side Apply (SSA). Experimental e2e tests fail when Boxcutter uses ServiceAccount-scoped clients to apply bundle RBAC resources (ClusterRoles and ClusterRoleBindings): ``` clusterrolebindings.rbac.authorization.k8s.io is forbidden: User "system:serviceaccount:olmv1-e2e:olm-sa" cannot bind ClusterRole: RBAC: attempting to grant RBAC permissions not currently held ``` - Uses helm.sh/helm/v3 library - Applies resources with traditional CREATE/UPDATE operations - Kubernetes RBAC allows ClusterRoleBinding creation when the ServiceAccount already has all the permissions being granted (permission matching) - **Works WITHOUT `bind`/`escalate` verbs** ✅ - Uses pkg.package-operator.run/boxcutter machinery - Applies resources with **Server-Side Apply (SSA)** (`client.Apply`) - SSA enforces field-level ownership and **stricter RBAC enforcement** - Kubernetes API server **requires explicit `bind` verb** for ClusterRoleBindings - Permission matching fallback does NOT work reliably with SSA - **REQUIRES `bind`/`escalate` verbs** ❌ Validated by running actual tests: **Test 1: Main branch standard-e2e (Helm, NO bind/escalate)** ```bash make test-e2e ``` Result: ✅ PASS (21 scenarios passed) **Test 2: PR branch experimental-e2e (Boxcutter, NO bind/escalate)** ```bash make test-experimental-e2e ``` Result: ❌ FAIL (cannot bind ClusterRole error) **Test 3: PR branch experimental-e2e (Boxcutter, WITH bind/escalate)** Result: ✅ PASS (all tests pass) Add `bind` and `escalate` verbs to the e2e test RBAC template: ```yaml - apiGroups: ["rbac.authorization.k8s.io"] resources: [clusterroles, roles, clusterrolebindings, rolebindings] verbs: [ update, create, list, watch, get, delete, patch, bind, escalate ] ``` These verbs allow the ServiceAccount to: - `bind`: Create ClusterRoleBindings that reference roles with permissions the ServiceAccount doesn't have - `escalate`: Create ClusterRoles with permissions the ServiceAccount doesn't have This is the documented requirement in `docs/concepts/permission-model.md` for extension installers and aligns with Kubernetes RBAC best practices. 1. **Required for SSA**: Server-Side Apply has stricter RBAC enforcement 2. **Documented requirement**: OLMv1 docs specify bind/escalate as proper approach 3. **Industry best practice**: Operator installers should have these verbs 4. **Supports all operators**: Not just test-operator with matching permissions 5. **Maintains SSA benefits**: Field ownership, conflict resolution, GitOps support - Kubernetes RBAC: https://kubernetes.io/docs/reference/access-authn-authz/rbac/#privilege-escalation-prevention-and-bootstrapping - OLMv1 Permission Model: docs/concepts/permission-model.md - Boxcutter machinery: pkg.package-operator.run/boxcutter/machinery (uses client.Apply) - Testing evidence: FINAL_TESTED_ANSWER.md, SERVER_SIDE_APPLY_ANSWER.md Tested-by: Actual e2e test runs comparing Helm vs Boxcutter behavior Signed-off-by: Camila <camil@example.com> * Split rbac phase into two Signed-off-by: Per Goncalves da Silva <pegoncal@redhat.com> --------- Signed-off-by: Camila <camil@example.com> Signed-off-by: Per Goncalves da Silva <pegoncal@redhat.com> Co-authored-by: Per Goncalves da Silva <pegoncal@redhat.com>
1 parent 1fa4169 commit 049f813

10 files changed

Lines changed: 527 additions & 72 deletions

File tree

cmd/operator-controller/main.go

Lines changed: 22 additions & 17 deletions
Original file line numberDiff line numberDiff line change
@@ -42,10 +42,7 @@ import (
4242
_ "k8s.io/client-go/plugin/pkg/client/auth"
4343
"k8s.io/klog/v2"
4444
"k8s.io/utils/ptr"
45-
"pkg.package-operator.run/boxcutter/machinery"
4645
"pkg.package-operator.run/boxcutter/managedcache"
47-
"pkg.package-operator.run/boxcutter/ownerhandling"
48-
"pkg.package-operator.run/boxcutter/validation"
4946
ctrl "sigs.k8s.io/controller-runtime"
5047
crcache "sigs.k8s.io/controller-runtime/pkg/cache"
5148
"sigs.k8s.io/controller-runtime/pkg/certwatcher"
@@ -653,21 +650,29 @@ func (c *boxcutterReconcilerConfigurator) Configure(ceReconciler *controllers.Cl
653650
return fmt.Errorf("unable to add tracking cache to manager: %v", err)
654651
}
655652

653+
cerCoreClient, err := corev1client.NewForConfig(c.mgr.GetConfig())
654+
if err != nil {
655+
return fmt.Errorf("unable to create client for ClusterExtensionRevision controller: %w", err)
656+
}
657+
cerTokenGetter := authentication.NewTokenGetter(cerCoreClient, authentication.WithExpirationDuration(1*time.Hour))
658+
659+
revisionEngineFactory, err := controllers.NewDefaultRevisionEngineFactory(
660+
c.mgr.GetScheme(),
661+
trackingCache,
662+
discoveryClient,
663+
c.mgr.GetRESTMapper(),
664+
fieldOwnerPrefix,
665+
c.mgr.GetConfig(),
666+
cerTokenGetter,
667+
)
668+
if err != nil {
669+
return fmt.Errorf("unable to create revision engine factory: %w", err)
670+
}
671+
656672
if err = (&controllers.ClusterExtensionRevisionReconciler{
657-
Client: c.mgr.GetClient(),
658-
RevisionEngine: machinery.NewRevisionEngine(
659-
machinery.NewPhaseEngine(
660-
machinery.NewObjectEngine(
661-
c.mgr.GetScheme(), trackingCache, c.mgr.GetClient(),
662-
ownerhandling.NewNative(c.mgr.GetScheme()),
663-
machinery.NewComparator(ownerhandling.NewNative(c.mgr.GetScheme()), discoveryClient, c.mgr.GetScheme(), fieldOwnerPrefix),
664-
fieldOwnerPrefix, fieldOwnerPrefix,
665-
),
666-
validation.NewClusterPhaseValidator(c.mgr.GetRESTMapper(), c.mgr.GetClient()),
667-
),
668-
validation.NewRevisionValidator(), c.mgr.GetClient(),
669-
),
670-
TrackingCache: trackingCache,
673+
Client: c.mgr.GetClient(),
674+
RevisionEngineFactory: revisionEngineFactory,
675+
TrackingCache: trackingCache,
671676
}).SetupWithManager(c.mgr); err != nil {
672677
return fmt.Errorf("unable to setup ClusterExtensionRevision controller: %w", err)
673678
}

hack/test/pre-upgrade-setup.sh

Lines changed: 13 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -122,6 +122,19 @@ rules:
122122
- "update"
123123
resourceNames:
124124
- "${TEST_CLUSTER_EXTENSION_NAME}"
125+
- apiGroups:
126+
- "olm.operatorframework.io"
127+
resources:
128+
- "clusterextensionrevisions"
129+
- "clusterextensionrevisions/finalizers"
130+
verbs:
131+
- "create"
132+
- "update"
133+
- "patch"
134+
- "delete"
135+
- "get"
136+
- "list"
137+
- "watch"
125138
EOF
126139

127140
kubectl apply -f - <<EOF

internal/operator-controller/applier/boxcutter.go

Lines changed: 6 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -186,6 +186,12 @@ func (r *SimpleRevisionGenerator) buildClusterExtensionRevision(
186186
ext *ocv1.ClusterExtension,
187187
annotations map[string]string,
188188
) *ocv1.ClusterExtensionRevision {
189+
if annotations == nil {
190+
annotations = make(map[string]string)
191+
}
192+
annotations[labels.ServiceAccountNameKey] = ext.Spec.ServiceAccount.Name
193+
annotations[labels.ServiceAccountNamespaceKey] = ext.Spec.Namespace
194+
189195
return &ocv1.ClusterExtensionRevision{
190196
ObjectMeta: metav1.ObjectMeta{
191197
Annotations: annotations,

internal/operator-controller/applier/boxcutter_test.go

Lines changed: 30 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -66,6 +66,12 @@ func Test_SimpleRevisionGenerator_GenerateRevisionFromHelmRelease(t *testing.T)
6666
ObjectMeta: metav1.ObjectMeta{
6767
Name: "test-123",
6868
},
69+
Spec: ocv1.ClusterExtensionSpec{
70+
Namespace: "test-namespace",
71+
ServiceAccount: ocv1.ServiceAccountReference{
72+
Name: "test-sa",
73+
},
74+
},
6975
}
7076

7177
objectLabels := map[string]string{
@@ -79,10 +85,12 @@ func Test_SimpleRevisionGenerator_GenerateRevisionFromHelmRelease(t *testing.T)
7985
ObjectMeta: metav1.ObjectMeta{
8086
Name: "test-123-1",
8187
Annotations: map[string]string{
82-
"olm.operatorframework.io/bundle-name": "my-bundle",
83-
"olm.operatorframework.io/bundle-reference": "bundle-ref",
84-
"olm.operatorframework.io/bundle-version": "1.2.0",
85-
"olm.operatorframework.io/package-name": "my-package",
88+
"olm.operatorframework.io/bundle-name": "my-bundle",
89+
"olm.operatorframework.io/bundle-reference": "bundle-ref",
90+
"olm.operatorframework.io/bundle-version": "1.2.0",
91+
"olm.operatorframework.io/package-name": "my-package",
92+
"olm.operatorframework.io/service-account-name": "test-sa",
93+
"olm.operatorframework.io/service-account-namespace": "test-namespace",
8694
},
8795
Labels: map[string]string{
8896
labels.OwnerNameKey: "test-123",
@@ -172,6 +180,12 @@ func Test_SimpleRevisionGenerator_GenerateRevision(t *testing.T) {
172180
ObjectMeta: metav1.ObjectMeta{
173181
Name: "test-extension",
174182
},
183+
Spec: ocv1.ClusterExtensionSpec{
184+
Namespace: "test-namespace",
185+
ServiceAccount: ocv1.ServiceAccountReference{
186+
Name: "test-sa",
187+
},
188+
},
175189
}
176190

177191
rev, err := b.GenerateRevision(t.Context(), fstest.MapFS{}, ext, map[string]string{}, map[string]string{})
@@ -291,7 +305,12 @@ func Test_SimpleRevisionGenerator_AppliesObjectLabelsAndRevisionAnnotations(t *t
291305
"other": "value",
292306
}
293307

294-
rev, err := b.GenerateRevision(t.Context(), fstest.MapFS{}, &ocv1.ClusterExtension{}, map[string]string{
308+
rev, err := b.GenerateRevision(t.Context(), fstest.MapFS{}, &ocv1.ClusterExtension{
309+
Spec: ocv1.ClusterExtensionSpec{
310+
Namespace: "test-namespace",
311+
ServiceAccount: ocv1.ServiceAccountReference{Name: "test-sa"},
312+
},
313+
}, map[string]string{
295314
"some": "value",
296315
}, revAnnotations)
297316
require.NoError(t, err)
@@ -319,7 +338,12 @@ func Test_SimpleRevisionGenerator_Failure(t *testing.T) {
319338
ManifestProvider: r,
320339
}
321340

322-
rev, err := b.GenerateRevision(t.Context(), fstest.MapFS{}, &ocv1.ClusterExtension{}, map[string]string{}, map[string]string{})
341+
rev, err := b.GenerateRevision(t.Context(), fstest.MapFS{}, &ocv1.ClusterExtension{
342+
Spec: ocv1.ClusterExtensionSpec{
343+
Namespace: "test-namespace",
344+
ServiceAccount: ocv1.ServiceAccountReference{Name: "test-sa"},
345+
},
346+
}, map[string]string{}, map[string]string{})
323347
require.Nil(t, rev)
324348
t.Log("by checking rendering errors are propagated")
325349
require.Error(t, err)

internal/operator-controller/applier/phase.go

Lines changed: 13 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -28,20 +28,22 @@ func determinePhase(gk schema.GroupKind) Phase {
2828
type Phase string
2929

3030
const (
31-
PhaseNamespaces Phase = "namespaces"
32-
PhasePolicies Phase = "policies"
33-
PhaseRBAC Phase = "rbac"
34-
PhaseCRDs Phase = "crds"
35-
PhaseStorage Phase = "storage"
36-
PhaseDeploy Phase = "deploy"
37-
PhasePublish Phase = "publish"
31+
PhaseNamespaces Phase = "namespaces"
32+
PhasePolicies Phase = "policies"
33+
PhaseRBAC Phase = "rbac"
34+
PhaseRBACBindings Phase = "rbac-bindings"
35+
PhaseCRDs Phase = "crds"
36+
PhaseStorage Phase = "storage"
37+
PhaseDeploy Phase = "deploy"
38+
PhasePublish Phase = "publish"
3839
)
3940

4041
// Well known phases ordered.
4142
var defaultPhaseOrder = []Phase{
4243
PhaseNamespaces,
4344
PhasePolicies,
4445
PhaseRBAC,
46+
PhaseRBACBindings,
4547
PhaseCRDs,
4648
PhaseStorage,
4749
PhaseDeploy,
@@ -68,8 +70,11 @@ var (
6870
PhaseRBAC: {
6971
{Kind: "ServiceAccount"},
7072
{Kind: "Role", Group: "rbac.authorization.k8s.io"},
71-
{Kind: "RoleBinding", Group: "rbac.authorization.k8s.io"},
7273
{Kind: "ClusterRole", Group: "rbac.authorization.k8s.io"},
74+
},
75+
76+
PhaseRBACBindings: {
77+
{Kind: "RoleBinding", Group: "rbac.authorization.k8s.io"},
7378
{Kind: "ClusterRoleBinding", Group: "rbac.authorization.k8s.io"},
7479
},
7580

internal/operator-controller/applier/phase_test.go

Lines changed: 53 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -87,6 +87,30 @@ func Test_PhaseSort(t *testing.T) {
8787
},
8888
},
8989
},
90+
{
91+
Object: unstructured.Unstructured{
92+
Object: map[string]interface{}{
93+
"apiVersion": "rbac.authorization.k8s.io/v1",
94+
"kind": "ClusterRoleBinding",
95+
},
96+
},
97+
},
98+
{
99+
Object: unstructured.Unstructured{
100+
Object: map[string]interface{}{
101+
"apiVersion": "rbac.authorization.k8s.io/v1",
102+
"kind": "RoleBinding",
103+
},
104+
},
105+
},
106+
{
107+
Object: unstructured.Unstructured{
108+
Object: map[string]interface{}{
109+
"apiVersion": "rbac.authorization.k8s.io/v1",
110+
"kind": "Role",
111+
},
112+
},
113+
},
90114
{
91115
Object: unstructured.Unstructured{
92116
Object: map[string]interface{}{
@@ -150,6 +174,35 @@ func Test_PhaseSort(t *testing.T) {
150174
},
151175
},
152176
},
177+
{
178+
Object: unstructured.Unstructured{
179+
Object: map[string]interface{}{
180+
"apiVersion": "rbac.authorization.k8s.io/v1",
181+
"kind": "Role",
182+
},
183+
},
184+
},
185+
},
186+
},
187+
{
188+
Name: string(applier.PhaseRBACBindings),
189+
Objects: []v1.ClusterExtensionRevisionObject{
190+
{
191+
Object: unstructured.Unstructured{
192+
Object: map[string]interface{}{
193+
"apiVersion": "rbac.authorization.k8s.io/v1",
194+
"kind": "ClusterRoleBinding",
195+
},
196+
},
197+
},
198+
{
199+
Object: unstructured.Unstructured{
200+
Object: map[string]interface{}{
201+
"apiVersion": "rbac.authorization.k8s.io/v1",
202+
"kind": "RoleBinding",
203+
},
204+
},
205+
},
153206
},
154207
},
155208
{

internal/operator-controller/controllers/clusterextensionrevision_controller.go

Lines changed: 17 additions & 10 deletions
Original file line numberDiff line numberDiff line change
@@ -43,9 +43,9 @@ const (
4343
// ClusterExtensionRevisionReconciler actions individual snapshots of ClusterExtensions,
4444
// as part of the boxcutter integration.
4545
type ClusterExtensionRevisionReconciler struct {
46-
Client client.Client
47-
RevisionEngine RevisionEngine
48-
TrackingCache trackingCache
46+
Client client.Client
47+
RevisionEngineFactory RevisionEngineFactory
48+
TrackingCache trackingCache
4949
}
5050

5151
type trackingCache interface {
@@ -55,11 +55,6 @@ type trackingCache interface {
5555
Free(ctx context.Context, user client.Object) error
5656
}
5757

58-
type RevisionEngine interface {
59-
Teardown(ctx context.Context, rev machinerytypes.Revision, opts ...machinerytypes.RevisionTeardownOption) (machinery.RevisionTeardownResult, error)
60-
Reconcile(ctx context.Context, rev machinerytypes.Revision, opts ...machinerytypes.RevisionReconcileOption) (machinery.RevisionResult, error)
61-
}
62-
6358
//+kubebuilder:rbac:groups=olm.operatorframework.io,resources=clusterextensionrevisions,verbs=get;list;watch;update;patch;create;delete
6459
//+kubebuilder:rbac:groups=olm.operatorframework.io,resources=clusterextensionrevisions/status,verbs=update;patch
6560
//+kubebuilder:rbac:groups=olm.operatorframework.io,resources=clusterextensionrevisions/finalizers,verbs=update
@@ -139,7 +134,13 @@ func (c *ClusterExtensionRevisionReconciler) reconcile(ctx context.Context, rev
139134
return ctrl.Result{}, werr
140135
}
141136

142-
rres, err := c.RevisionEngine.Reconcile(ctx, *revision, opts...)
137+
revisionEngine, err := c.RevisionEngineFactory.CreateRevisionEngine(ctx, rev)
138+
if err != nil {
139+
setRetryingConditions(rev, err.Error())
140+
return ctrl.Result{}, fmt.Errorf("failed to create revision engine: %v", err)
141+
}
142+
143+
rres, err := revisionEngine.Reconcile(ctx, *revision, opts...)
143144
if err != nil {
144145
if rres != nil {
145146
// Log detailed reconcile reports only in debug mode (V(1)) to reduce verbosity.
@@ -253,7 +254,13 @@ func (c *ClusterExtensionRevisionReconciler) reconcile(ctx context.Context, rev
253254
func (c *ClusterExtensionRevisionReconciler) teardown(ctx context.Context, rev *ocv1.ClusterExtensionRevision, revision *boxcutter.Revision) (ctrl.Result, error) {
254255
l := log.FromContext(ctx)
255256

256-
tres, err := c.RevisionEngine.Teardown(ctx, *revision)
257+
revisionEngine, err := c.RevisionEngineFactory.CreateRevisionEngine(ctx, rev)
258+
if err != nil {
259+
markAsAvailableUnknown(rev, ocv1.ClusterExtensionRevisionReasonReconciling, err.Error())
260+
return ctrl.Result{}, fmt.Errorf("failed to create revision engine for teardown: %v", err)
261+
}
262+
263+
tres, err := revisionEngine.Teardown(ctx, *revision)
257264
if err != nil {
258265
if tres != nil {
259266
l.V(1).Info("teardown failure report", "report", tres.String())

0 commit comments

Comments
 (0)