Add tolerations customization interface for service operators #1550

Merged: openshift-merge-bot[bot] merged 1 commit into openstack-k8s-operators:main from stuggi:OSPRH-18693 on Aug 5, 2025
Conversation

stuggi commented Aug 5, 2025

Contributor

Adds the ability for service operators to customize pod tolerations, similar to how resource limits/requests are currently handled.

Features:

  • Add Tolerations field to ContainerSpec API type
  • Implement merge behavior: custom tolerations are merged with the defaults, and a custom toleration replaces the default that has the same key
  • Set global default tolerations (node.kubernetes.io/not-ready and node.kubernetes.io/unreachable, each with a 120s timeout) in the controller
  • Update deployment templates (managers.yaml, operator.yaml) to render custom tolerations from Deployment struct
  • Add test coverage for merge logic and override behavior
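
As a rough illustration of the template bullet above, here is a minimal sketch of rendering a tolerations list from a Deployment-like struct. It assumes the manifests are Go `text/template` files. The `Toleration` and `Deployment` types, the template text, and `renderTolerations` are hypothetical stand-ins and not the code from this PR.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// Toleration is a hypothetical, trimmed-down stand-in for the Kubernetes
// toleration type so that this sketch needs only the standard library.
type Toleration struct {
	Key               string
	Operator          string
	Value             string
	Effect            string
	TolerationSeconds *int64
}

// Deployment is a hypothetical stand-in for the struct handed to the templates.
type Deployment struct {
	Name        string
	Tolerations []Toleration
}

// tolerationsTmpl is an assumed template fragment; optional fields are only
// emitted when they are set.
const tolerationsTmpl = `tolerations:
{{- range .Tolerations }}
- key: {{ printf "%q" .Key }}
  operator: {{ .Operator }}
{{- if .Value }}
  value: {{ printf "%q" .Value }}
{{- end }}
  effect: {{ .Effect }}
{{- if .TolerationSeconds }}
  tolerationSeconds: {{ .TolerationSeconds }}
{{- end }}
{{- end }}
`

// renderTolerations executes the template against one Deployment value.
func renderTolerations(d Deployment) (string, error) {
	tmpl, err := template.New("tolerations").Parse(tolerationsTmpl)
	if err != nil {
		return "", err
	}
	var out strings.Builder
	if err := tmpl.Execute(&out, d); err != nil {
		return "", err
	}
	return out.String(), nil
}

func main() {
	timeout := int64(600)
	out, err := renderTolerations(Deployment{
		Name: "keystone",
		Tolerations: []Toleration{
			{Key: "node.kubernetes.io/not-ready", Operator: "Exists", Effect: "NoExecute", TolerationSeconds: &timeout},
			{Key: "node.example.com/gpu", Operator: "Equal", Value: "nvidia", Effect: "NoSchedule"},
		},
	})
	if err != nil {
		panic(err)
	}
	fmt.Print(out)
}
```

Guarding `value` and `tolerationSeconds` with `if` keeps unset optional fields out of the rendered manifest instead of emitting empty values.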

Example usage:

operatorOverrides:
- name: "keystone"
  controllerManager:
    tolerations:
    - key: "node.kubernetes.io/not-ready"  # Override default timeout
      operator: "Exists"
      effect: "NoExecute"
      tolerationSeconds: 600
    - key: "node.example.com/gpu"          # Add new toleration
      operator: "Equal"
      value: "nvidia"
      effect: "NoSchedule"

The merge behavior ensures operators get both default tolerations (unless overridden by matching key) plus any additional custom ones, providing flexibility while maintaining safe defaults.
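
The merge-by-key rule described above can be sketched as follows. This is a minimal illustration, not the code from this PR: the `Toleration` type is a hypothetical stand-in for the Kubernetes type, `mergeTolerations` and `seconds` are made-up names, and the `Exists`/`NoExecute` settings on the defaults are assumptions (the description only states the keys and the 120s timeout).

```go
package main

import "fmt"

// Toleration is a hypothetical, trimmed-down stand-in for the Kubernetes
// toleration type so that this sketch needs only the standard library.
type Toleration struct {
	Key               string
	Operator          string
	Value             string
	Effect            string
	TolerationSeconds *int64
}

// mergeTolerations (hypothetical name) starts from the defaults, replaces any
// entry whose Key also appears in custom, and appends custom entries that
// introduce a new key, preserving their order.
func mergeTolerations(defaults, custom []Toleration) []Toleration {
	merged := make([]Toleration, 0, len(defaults)+len(custom))
	index := make(map[string]int, len(defaults)+len(custom))

	for _, t := range defaults {
		index[t.Key] = len(merged)
		merged = append(merged, t)
	}
	for _, t := range custom {
		if i, ok := index[t.Key]; ok {
			merged[i] = t // same key: the custom toleration wins
			continue
		}
		index[t.Key] = len(merged)
		merged = append(merged, t)
	}
	return merged
}

// seconds returns a pointer to s, mirroring the optional tolerationSeconds field.
func seconds(s int64) *int64 { return &s }

func main() {
	// Assumed shape of the global defaults (operator and effect are guesses).
	defaults := []Toleration{
		{Key: "node.kubernetes.io/not-ready", Operator: "Exists", Effect: "NoExecute", TolerationSeconds: seconds(120)},
		{Key: "node.kubernetes.io/unreachable", Operator: "Exists", Effect: "NoExecute", TolerationSeconds: seconds(120)},
	}
	// The overrides from the example usage in this description.
	custom := []Toleration{
		{Key: "node.kubernetes.io/not-ready", Operator: "Exists", Effect: "NoExecute", TolerationSeconds: seconds(600)},
		{Key: "node.example.com/gpu", Operator: "Equal", Value: "nvidia", Effect: "NoSchedule"},
	}
	for _, t := range mergeTolerations(defaults, custom) {
		fmt.Println(t.Key, t.Effect)
	}
}
```

With these inputs the result has three entries: the not-ready toleration with the custom 600s timeout, the untouched unreachable default, and the new GPU toleration.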

Assisted-by: claude-4-sonnet

Jira: OSPRH-18693

@openshift-ci openshift-ci Bot requested review from abays and viroel August 5, 2025 07:26
@openshift-ci openshift-ci Bot added the approved label Aug 5, 2025
@stuggi stuggi requested review from dprince and removed request for viroel August 5, 2025 07:26
Comment thread apis/operator/v1beta1/openstack_types.go Outdated
Adds ability for service operators to customize pod tolerations similar
to how resource limits/requests are currently handled.

Features:
- Add Tolerations field to ContainerSpec API type
- Implement merge behavior: custom tolerations are merged with defaults,
  overriding by key when same key exists
- Set global default tolerations (node.kubernetes.io/not-ready and
  node.kubernetes.io/unreachable with 120s timeout) in controller
- Update deployment templates (managers.yaml, operator.yaml) to render
  custom tolerations from Deployment struct
- Add test coverage for merge logic and override behavior

Example usage:
```yaml
operatorOverrides:
- name: "keystone"
  controllerManager:
    tolerations:
    - key: "node.kubernetes.io/not-ready"  # Override default timeout
      operator: "Exists"
      effect: "NoExecute"
      tolerationSeconds: 600
    - key: "node.example.com/gpu"          # Add new toleration
      operator: "Equal"
      value: "nvidia"
      effect: "NoSchedule"
```

The merge behavior ensures operators get both default tolerations
(unless overridden by matching key) plus any additional custom ones,
providing flexibility while maintaining safe defaults.

Jira: OSPRH-18693

Assisted-by: claude-4-sonnet

Signed-off-by: Martin Schuppert <mschuppert@redhat.com>

@abays abays left a comment

Contributor

/lgtm

@openshift-ci openshift-ci Bot added the lgtm label Aug 5, 2025
@openshift-ci

openshift-ci Bot commented Aug 5, 2025

Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: abays, stuggi

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@softwarefactory-project-zuul


Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/43a1fc26b3754b65840bf7c6e890664e

✔️ openstack-k8s-operators-content-provider SUCCESS in 1h 50m 18s
✔️ podified-multinode-edpm-deployment-crc SUCCESS in 1h 12m 03s
✔️ cifmw-crc-podified-edpm-baremetal SUCCESS in 1h 27m 29s
❌ adoption-standalone-to-crc-ceph-provider FAILURE in 1h 36m 15s
✔️ openstack-operator-tempest-multinode SUCCESS in 1h 27m 41s

@stuggi

stuggi commented Aug 5, 2025

Contributor Author

recheck

@stuggi

stuggi commented Aug 5, 2025

Contributor Author

/cherry-pick 18.0.fr3

@openshift-cherrypick-robot


@stuggi: once the present PR merges, I will cherry-pick it on top of 18.0.fr3 in a new PR and assign it to you.

Details

In response to this:

/cherry-pick 18.0.fr3

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@openshift-merge-bot openshift-merge-bot Bot merged commit 06610ad into openstack-k8s-operators:main Aug 5, 2025
10 checks passed
@openshift-cherrypick-robot


@stuggi: cannot checkout 18.0.fr3: error checking out "18.0.fr3": exit status 1 error: pathspec '18.0.fr3' did not match any file(s) known to git

Details

In response to this:

/cherry-pick 18.0.fr3

@stuggi

stuggi commented Aug 5, 2025

Contributor Author

/cherry-pick 18.0-fr3

@openshift-cherrypick-robot


@stuggi: new pull request created: #1551

Details

In response to this:

/cherry-pick 18.0-fr3

3 participants