OCPCLOUD-1645: Scope capi-controllers RBAC to least-privilege#585
OCPCLOUD-1645: Scope capi-controllers RBAC to least-privilege#585stefanonardo wants to merge 1 commit into
Conversation
|
Pipeline controller notification For optional jobs, comment This repository is configured in: LGTM mode |
|
@stefanonardo: This pull request references OCPCLOUD-1645 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "5.0.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
WalkthroughWildcard RBAC rules were removed and replaced with explicit ClusterRole and Role rules that enumerate specific API groups, resources, and verbs for the ChangesRBAC Permissions Hardening
🎯 4 (Complex) | ⏱️ ~45 minutes 🚥 Pre-merge checks | ✅ 15✅ Passed checks (15 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
8ea8125 to
cfd7831
Compare
|
/jira refresh |
|
@stefanonardo: This pull request references OCPCLOUD-1645 which is a valid jira issue. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
There was a problem hiding this comment.
🧹 Nitpick comments (1)
manifests/0000_30_cluster-api_03_rbac_roles.yaml (1)
246-256: 💤 Low valueCluster-wide secrets access is intentional but worth documenting.
Trivy flags this rule because cluster-wide secrets management is a sensitive permission. However, this appears justified given:
- Secret Sync Controller needs cross-namespace access (MAPI ↔ CAPI namespaces)
- Kubeconfig Controller manages kubeconfig secrets
- The
deleteverb is intentionally omitted, demonstrating least-privilege- This is a significant improvement over the previous wildcards
Consider adding a comment in the manifest documenting why cluster-wide scope is required, which helps future audits.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@manifests/0000_30_cluster-api_03_rbac_roles.yaml` around lines 246 - 256, Add an inline comment above the ClusterRole rule that grants resources: ["secrets"] under apiGroups: [""] explaining why cluster-wide secrets access is required (e.g., Secret Sync Controller must sync secrets between MAPI and CAPI namespaces and the Kubeconfig Controller manages kubeconfig secrets), note that delete is intentionally omitted to preserve least-privilege, and reference the controllers (Secret Sync Controller, Kubeconfig Controller) so auditors can understand the justification during reviews.Source: Linters/SAST tools
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@manifests/0000_30_cluster-api_03_rbac_roles.yaml`:
- Around line 246-256: Add an inline comment above the ClusterRole rule that
grants resources: ["secrets"] under apiGroups: [""] explaining why cluster-wide
secrets access is required (e.g., Secret Sync Controller must sync secrets
between MAPI and CAPI namespaces and the Kubeconfig Controller manages
kubeconfig secrets), note that delete is intentionally omitted to preserve
least-privilege, and reference the controllers (Secret Sync Controller,
Kubeconfig Controller) so auditors can understand the justification during
reviews.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Central YAML (inherited)
Review profile: CHILL
Plan: Enterprise
Run ID: 76c3409f-8bf9-49a3-a6ff-eeb8fc36df3c
📒 Files selected for processing (1)
manifests/0000_30_cluster-api_03_rbac_roles.yaml
|
/test ? |
|
/test required |
|
/lgtm |
|
Scheduling tests matching the |
|
/lgtm cancel Haven't actually looked at this, yet. Was just checking if it was waiting on the pipeline controller. Wasn't sure it was, as normally you see the pipeline jobs listed but 'Waiting', but they were not present at all. |
cfd7831 to
44913ce
Compare
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@manifests/0000_30_cluster-api_03_rbac_roles.yaml`:
- Around line 254-264: The ClusterRole currently grants cluster-wide read/write
on Secrets (apiGroups: "", resources: secrets, verbs:
get/list/watch/create/update/patch) which is too permissive; remove secret
permissions from the ClusterRole and instead create namespace-scoped Role +
RoleBinding pairs that grant only the needed secret verbs in the exact
namespaces touched by the secret-sync/kubeconfig flows, bind those Roles to the
capi-controllers ServiceAccount used by openshift-capi-controllers, and where
the set of secrets is fixed use resourceNames to restrict to specific Secret
objects; update any references to the removed ClusterRole permissions
accordingly.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Central YAML (inherited)
Review profile: CHILL
Plan: Enterprise
Run ID: 954effa5-a923-4230-8052-f53063758493
📒 Files selected for processing (1)
manifests/0000_30_cluster-api_03_rbac_roles.yaml
947e14a to
39788fd
Compare
|
@stefanonardo if we could add some guidance like a README + reference to a skill on how this was done, so it is repeatable in the future, it'd be awesome! |
39788fd to
d60a4f8
Compare
|
/test e2e-aws-capi-techpreview |
d60a4f8 to
3290aaa
Compare
|
/test e2e-aws-capi-techpreview |
3290aaa to
5a705ff
Compare
|
/test e2e-aws-capi-techpreview |
5a705ff to
eac2241
Compare
|
/test e2e-aws-capi-techpreview |
eac2241 to
b531611
Compare
|
/test ci/prow/verify-deps |
|
/test e2e-aws-capi-techpreview |
|
/test verify-deps |
|
@coderabbitai review |
✅ Action performedReview finished.
|
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
.claude/skills/rbac-review/SKILL.md (1)
19-19: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low valueMinor style suggestion: Use "ask" instead of "tell."
The LanguageTool linter flagged this as more polite phrasing for user-facing guidance. If you're aligning skill documentation with a particular voice/tone standard, consider:
- If `oc whoami` fails, stop and tell the user to connect to a cluster first. + If `oc whoami` fails, stop and ask the user to connect to a cluster first.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.claude/skills/rbac-review/SKILL.md at line 19, In the SKILL.md file, locate the instruction that handles the case when oc whoami fails and currently states "stop and tell the user to connect to a cluster first." Replace the word "tell" with "ask" so it reads "stop and ask the user to connect to a cluster first" to use more polite phrasing consistent with user-facing documentation standards.Source: Linters/SAST tools
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@manifests/0000_30_cluster-api_03_rbac_roles.yaml`:
- Around line 183-189: The RBAC rule for azureclusteridentities is missing the
list and watch verbs required for the controller-runtime cached client to
initialize its informer. Locate the azureclusteridentities resource definition
in the apiGroups section for infrastructure.cluster.x-k8s.io and add list and
watch verbs to the existing verbs array that currently contains get and create.
---
Nitpick comments:
In @.claude/skills/rbac-review/SKILL.md:
- Line 19: In the SKILL.md file, locate the instruction that handles the case
when oc whoami fails and currently states "stop and tell the user to connect to
a cluster first." Replace the word "tell" with "ask" so it reads "stop and ask
the user to connect to a cluster first" to use more polite phrasing consistent
with user-facing documentation standards.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Central YAML (inherited)
Review profile: CHILL
Plan: Enterprise
Run ID: 8bbb1a72-67c8-486d-bab2-f05961a19305
📒 Files selected for processing (4)
.claude/skills/rbac-review/SKILL.mddocs/rbac.mdmanifests/0000_30_cluster-api_03_rbac_roles.yamlmanifests/0000_30_cluster-api_04_rbac_bindings.yaml
✅ Files skipped from review due to trivial changes (1)
- docs/rbac.md
🚧 Files skipped from review as they are similar to previous changes (1)
- manifests/0000_30_cluster-api_04_rbac_bindings.yaml
b531611 to
1fd7d2f
Compare
| - apiGroups: | ||
| - config.openshift.io | ||
| resources: | ||
| - clusteroperators | ||
| verbs: | ||
| - get | ||
| - list | ||
| - watch | ||
| - create | ||
| - update | ||
| - patch |
There was a problem hiding this comment.
CVO creates this. I saw you noted there was some code which will create the CO if it doesn't exist. We should remove that and turn it into an error.
I would like to see:
- remove the create/update/patch permissions (we only patch status, below)
- restrict to the
cluster-apiobject specifically
| - apiGroups: | ||
| - cluster.x-k8s.io | ||
| resources: | ||
| - clusters |
There was a problem hiding this comment.
I think we need the same permissions for clusters, machines, machinesets. Lets just list them here so we DRY.
| - apiGroups: | ||
| - infrastructure.cluster.x-k8s.io | ||
| resources: | ||
| - azureclusteridentities |
There was a problem hiding this comment.
azureclusteridentities sounds like a cluster-scoped resource? If so, I don't think this permission will be effective in a Role (would need to be ClusterRole).
There was a problem hiding this comment.
| - apiGroups: | ||
| - machine.openshift.io | ||
| resources: | ||
| - machines |
There was a problem hiding this comment.
Lets combine machines and machinesets
| resources: | ||
| - secrets |
There was a problem hiding this comment.
IIRC we can (and should) confine this to a specific named secret. All secrets in kube-system is a bit broad.
78dd40c to
a5de004
Compare
|
/test e2e-aws-capi-techpreview |
|
@stefanonardo: The following tests failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
Replace wildcard RBAC rules with enumerated resources and verbs for the capi-controllers ServiceAccount. Validated with audit2rbac and e2e tests on an AWS cluster with MachineAPIMigration enabled. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
a5de004 to
76e2a76
Compare
|
/test e2e-aws-capi-techpreview |
Summary
apiGroups: ['*'], resources: ['*'], verbs: ['*']) in theopenshift-capi-controllersClusterRole andcapi-controllersRole with enumerated, auditable permissionsopenshift-cluster-api,openshift-machine-api, andkube-system— secrets access is no longer cluster-widecluster-apiobject specifically viaresourceNames, and remove thecreateverb (CVO creates the ClusterOperator, not this operator)GetOrCreateClusterOperatorcreate fallback inoperator_status.go— renamed toGetClusterOperatordocs/rbac.mddocumenting the RBAC structure and/rbac-reviewskill for repeating this processPermission justification
Each rule was derived from code analysis and validated against audit2rbac output from an AWS cluster with
MachineAPIMigrationenabled.Confirmed by audit2rbac
config.openshift.ioinfrastructurescorecluster_controller.go:122,infracluster_controller.go:278,kubeconfig.go:85config.openshift.ioclusteroperatorsoperator_status.go:115— informer cache (resourceNames not applicable to list/watch)config.openshift.ioclusteroperators(resourceNames: cluster-api)operator_status.go:115,186,controller_status.go:295config.openshift.ioclusteroperators/status(resourceNames: cluster-api)operator_status.go:186,controller_status.go:295config.openshift.iofeaturegates,clusterversionsfeaturegates.go:50-53apiextensions.k8s.iocustomresourcedefinitionsmachine-api-migration)""nodescluster-api-controller-manager)openshift-cluster-apicluster.x-k8s.ioclusters,machines,machinesetscorecluster_controller.go:122,152,machine_sync_controller.go:220,656,686,1060,machineset_sync_controller.go:200,838,866,1191openshift-cluster-apicluster.x-k8s.ioclusters/status,machines/status,machinesets/statuscorecluster_controller.go:216,machine_sync_controller.go:1365,machineset_sync_controller.go:917openshift-cluster-apiinfrastructure.cluster.x-k8s.ioawsclustersaws.go:57,90openshift-cluster-apiinfrastructure.cluster.x-k8s.ioawsclusters/statusinfracluster_controller.go:162openshift-cluster-apiinfrastructure.cluster.x-k8s.ioawsmachinesmachine_sync_mapi2capi_infrastructure.go:108,136,213openshift-cluster-apiinfrastructure.cluster.x-k8s.ioawsmachines/statusmachine_sync_mapi2capi_infrastructure.go:194openshift-cluster-apiinfrastructure.cluster.x-k8s.ioawsmachinetemplatesmachineset_sync_controller.go:251,453,763,492openshift-cluster-api""podscluster_accessor.go:270)openshift-cluster-api""secretssecret_sync_controller.go:73,90,161,kubeconfig.go:138,171,209openshift-cluster-api""eventsmachine_sync_controller.go:180,machineset_sync_controller.go:151openshift-cluster-apicoordination.k8s.ioleasesopenshift-machine-apimachine.openshift.iomachines,machinesetsmachine_sync_controller.go:203,1391,1467,1039,machineset_sync_controller.go:190,943,1149openshift-machine-apimachine.openshift.iomachines/status,machinesets/statusmachine_sync_controller.go:1441,machineset_sync_controller.go:996,migratestatus.go:105openshift-machine-apimachine.openshift.iocontrolplanemachinesetsinfracluster_controller.go:285-288,412openshift-machine-api""secretssecret_sync_controller.go:73— controller-runtime cache needs list/watch for Getopenshift-machine-api""eventsmachine_sync_controller.go:180Not observed by audit2rbac — justified by code evidence
openshift-cluster-apicluster.x-k8s.ioclusters/finalizers,machines/finalizers,machinesets/finalizersblockOwnerDeletion: trueon ownerReferences (machine_sync_controller.go:577,867,machineset_sync_controller.go:635). Kubernetes requiresupdateon the owner'sfinalizerssubresource. Not captured by audit2rbac (validated by GC admission plugin, not a separate API call).openshift-machine-apimachine.openshift.iomachines/finalizers,machinesets/finalizersblockOwnerDeletionpattern — MAPI machines set ownerReferences to MAPI machinesets (machine_sync_controller.go:953-958).openshift-cluster-api""secretssecret_sync_controller.go:152,azure.go:131— creates secrets if not found.openshift-cluster-api""eventscreatewas observed.openshift-cluster-apicoordination.k8s.ioleasescreateonly on first leader election.list/watch/deleteare standard controller-runtime patterns.openshift-cluster-apiinfrastructure.cluster.x-k8s.ioopenshift-cluster-apiinfrastructure.cluster.x-k8s.ioazureclusteridentitiesazure.go:142Get (controller-runtime cache needs list/watch),:188Create.openshift-cluster-apiinfrastructure.cluster.x-k8s.ioawsmachinetemplatesopenshift-machine-apimachine.openshift.iomachinesetsmachineset_sync_controller.go— creates MAPI MachineSets during CAPI→MAPI sync. Only MAPI→CAPI direction was tested.kube-system""secretskube-system""secrets(resourceNames: vsphere-creds)vsphere.go:157— only exercised on vSphere clusters.🤖 Generated with Claude Code