OCPBUGS-79418: ctrl: sched: add topologySpreadConstraints to deployment by shajmakh · Pull Request #4177 · openshift-kni/numaresources-operator

shajmakh · 2026-05-25T13:52:49Z

To evenly spread the pods across the nodes in a balanced way taking
into account the new replicaset pods while rollout.

previous discussions:
(reverted) #4021
(cancelled due to regressions) #4028

coderabbitai · 2026-05-25T13:53:04Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

▶️ Resume reviews
🔍 Trigger review

📝 Walkthrough

Walkthrough

Scheduler pod placement shifts from affinity rules to hostname-based topology spread constraints. Explicit scheduler replica counts are now validated against auto-detected eligible nodes. Deployment-completion waiting is refactored to accept explicit replica counts and validate generation and per-field replica readiness independently with detailed logging.

Changes

Scheduler topology spread and replica validation

Layer / File(s)	Summary
TopologySpreadConstraints pod-placement helper `pkg/objectupdate/sched/sched.go`, `pkg/objectupdate/sched/sched_test.go`	`DeploymentTopologySpreadConstraints` replaces affinity-based placement; it configures hostname-scoped spread with `DoNotSchedule`, `MaxSkew=1`, and `pod-template-hash` matching, returning error if template has no labels. Unit tests verify error cases and constraint structure.
Controller replica detection and validation `internal/controller/numaresourcesscheduler_controller.go`, `internal/controller/numaresourcesscheduler_controller_test.go`	`computeSchedulerReplicas` now detects eligible nodes by platform role before validating explicit `spec.replicas` against that count, returning error if exceeded. Scheduler Deployment sync applies `DeploymentTopologySpreadConstraints` instead of affinity. Controller tests cover success paths, validation failure, and degradation on node scale-down.
Deployment-completion wait logic `internal/wait/deployment.go`	`ForDeploymentCompleteWithReplicas` handles polling with an explicit expected-replica count. `IsDeploymentComplete` refactored to accept generation and expected-replicas parameters, performing generation check first, then validating `Replicas`, `UpdatedReplicas`, and `AvailableReplicas` independently with per-field logging. `ForDeploymentComplete` delegates through the new variant.
Scheduler e2e topology spread tests `test/e2e/sched/sched_test.go`	Comprehensive e2e suite validates topology spread behavior: autodetection of replica count from platform nodes, constraint structure verification, replica toggling (0/explicit/auto) and deployment identity stability, distinct-node placement, and cordon-induced topology-failure scenario (pod Pending with Unschedulable). Adds deployment status logging helper.
Serial e2e wait signature updates `test/e2e/serial/tests/configuration.go`, `test/e2e/serial/tests/scheduler_removal.go`, `test/e2e/serial/tests/workload_placement_nodelabel.go`	Updates `wait.IsDeploymentComplete` calls to use the new generation- and replica-aware signature, passing `deployment.Generation`, status, and expected replica count.

Sequence Diagram(s)

sequenceDiagram
  participant NUMAResourcesScheduler
  participant Controller as computeSchedulerReplicas
  participant NodeAPI
  participant KubeAPI
  NUMAResourcesScheduler->>Controller: spec.Replicas (may be nil)
  Controller->>NodeAPI: List nodes by platform role
  NodeAPI-->>Controller: eligible node count
  alt explicit replicas set
    Controller->>Controller: validate spec.Replicas ≤ node count
    alt validation fails
      Controller-->>NUMAResourcesScheduler: error
    else validation passes
      Controller-->>NUMAResourcesScheduler: spec.Replicas
    end
  else autodetect
    Controller-->>NUMAResourcesScheduler: detected node count
  end
  NUMAResourcesScheduler->>KubeAPI: patch Deployment with TopologySpreadConstraints
  KubeAPI-->>NUMAResourcesScheduler: Deployment updated

🎯 4 (Complex) | ⏱️ ~45 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title accurately summarizes the main change: adding topologySpreadConstraints to the scheduler deployment, which is reflected throughout all modified files.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Description check	✅ Passed	The pull request description accurately describes the changes: adding topologySpreadConstraints to evenly spread scheduler pods across nodes during rollouts, with clear context from previous discussions.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@internal/controller/numaresourcesscheduler_controller.go`:
- Around line 279-285: The current replica validation rejects spec.replicas: 0
when replicaCount is 0; update the logic in the block handling
schedSpec.Replicas so that an explicit zero is always accepted: first check if
*schedSpec.Replicas == 0 and if so return schedSpec.Replicas, nil; otherwise
keep the existing guard that errors when *schedSpec.Replicas > replicaCount
(using the existing replicaCount and nodeRoleDescription symbols) and return the
validated value. This preserves the existing maximum check but allows
intentional scale-to-zero.

In `@test/e2e/sched/sched_test.go`:
- Around line 392-396: The current test compares Deployment.ResourceVersion
immediately (using schedDpPostUpdate.ResourceVersion ==
schedDp.ResourceVersion), which is flaky; replace that assertion with a
Consistently check that the deployment's spec and generation do not change over
a short interval. Specifically, fetch the deployment via
podlist.With(...).DeploymentByOwnerReference(ctx, nroSchedObj.UID) inside a
Gomega Consistently block and assert that schedDpPostUpdate.Spec (or relevant
spec fields) and schedDpPostUpdate.Generation remain equal to the original
schedDp.Spec and schedDp.Generation for the duration; remove the ResourceVersion
equality assertion. Apply the same replacement for the similar check around
schedDpPostUpdate/schedDp at lines 405-409.
- Around line 485-493: The cleanup uses an uninitialized pointer `var node
*corev1.Node`, causing nil deref when calling `e2eclient.Client.Get` and
accessing `node.Spec.Unschedulable`; allocate a concrete Node before use (e.g.,
declare `node := &corev1.Node{}`) in the `DeferCleanup` closure, then call
`e2eclient.Client.Get(ctx, client.ObjectKey{Name: targetNodeName}, node)` and
update `node.Spec.Unschedulable = false` followed by
`e2eclient.Client.Update(ctx, node)` inside the `Eventually` block so
`needToUncordon`, `targetNodeName`, `DeferCleanup`, and the get/update calls
work safely.
- Around line 503-518: In the Eventually polling loop that calls
podlist.With(...).ByDeployment and assigns to recreatedPod, guard against nil
before dereferencing recreatedPod by moving or conditionalizing the klog.InfoS
call and any access to recreatedPod.Namespace/Name until after you assert
recreatedPod is not nil (or wrap the logging in an if recreatedPod != nil
block); ensure g.Expect(recreatedPod).ToNot(BeNil()) still runs in the loop so
the loop can retry when no pending pod is found and only after that check use
recreatedPod fields for logging or further assertions.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Enterprise

Run ID: bb00a70c-f80a-4212-b730-569aa2e631d4

📥 Commits

Reviewing files that changed from the base of the PR and between b06519a and 9a45f1f.

📒 Files selected for processing (5)

internal/controller/numaresourcesscheduler_controller.go
internal/controller/numaresourcesscheduler_controller_test.go
pkg/objectupdate/sched/sched.go
pkg/objectupdate/sched/sched_test.go
test/e2e/sched/sched_test.go

ffromani · 2026-05-25T14:10:17Z

/cc

shajmakh · 2026-05-26T05:44:10Z

requires #4193

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@pkg/objectupdate/sched/sched.go`:
- Around line 107-110: The code only rejects nil dp.Spec.Template.Labels but
allows an empty map which yields an empty MatchLabels selector; update the
validation to treat empty maps as invalid by checking both nil and len(labels)
== 0 (i.e., if labels == nil || len(labels) == 0) and return an error before
constructing the selector/MatchLabels so you never build a LabelSelector with an
empty MatchLabels.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Enterprise

Run ID: 738f118f-0f41-492e-9e1e-9f31e89f752d

📥 Commits

Reviewing files that changed from the base of the PR and between 9a45f1f and b3da392.

📒 Files selected for processing (5)

internal/controller/numaresourcesscheduler_controller.go
internal/controller/numaresourcesscheduler_controller_test.go
pkg/objectupdate/sched/sched.go
pkg/objectupdate/sched/sched_test.go
test/e2e/sched/sched_test.go

ffromani

Good work. Do we have a test which actually verify the happy path? 3 control-plane nodes, check we autodetect 3 replicas and each one lands on a different node.

shajmakh · 2026-05-26T07:12:32Z

Good work. Do we have a test which actually verify the happy path? 3 control-plane nodes, check we autodetect 3 replicas and each one lands on a different node.

we have that as a step in one of the tests:
b3da392#diff-6e473fac5187d020f14665d506a6553280fc9db1866488805157ada1a93e82b0R479

shajmakh · 2026-05-26T09:07:33Z

/retest

openshift-ci-robot · 2026-05-26T11:50:05Z

@shajmakh: This pull request references Jira Issue OCPBUGS-79418, which is valid. The bug has been moved to the POST state.

3 validation(s) were run on this bug

bug is open, matching expected state (open)
bug target version (5.0.0) matches configured target version for branch (5.0.0)
bug is in the state New, which is one of the valid states (NEW, ASSIGNED, POST)

Requesting review from QA contact:
/cc @rshemtov13

The bug has been updated to refer to the pull request using the external bug tracker.

Details

In response to this:

To evenly spread the pods across the nodes in a a balanced way taking
into account the new replicaset pods while rollout.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

Tal-or

LGTM overall, left minor comment

ffromani

let's add one more e2e test (suggested inline) and we can merge

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@internal/wait/deployment.go`:
- Around line 53-63: areDeploymentReplicasAvailable currently always returns
true, which makes IsDeploymentComplete succeed as soon as ObservedGeneration
advances; change areDeploymentReplicasAvailable to perform a real availability
check: compare newStatus.Replicas, newStatus.UpdatedReplicas and
newStatus.AvailableReplicas against the desired replicas value and return false
if any required condition is not met (log the mismatches with klog.InfoS as
already used), returning true only when the availability predicate is satisfied;
if strict equality is too strict for your rollout semantics, implement a clear
predicate (e.g., UpdatedReplicas >= replicas and AvailableReplicas >=
minDesired) instead of unconditional true so IsDeploymentComplete behaves
correctly.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Enterprise

Run ID: a48fb4bf-6aa6-4125-8967-f83a3a69196d

📥 Commits

Reviewing files that changed from the base of the PR and between b3da392 and 098333c.

📒 Files selected for processing (6)

internal/controller/numaresourcesscheduler_controller.go
internal/controller/numaresourcesscheduler_controller_test.go
internal/wait/deployment.go
pkg/objectupdate/sched/sched.go
pkg/objectupdate/sched/sched_test.go
test/e2e/sched/sched_test.go

🚧 Files skipped from review as they are similar to previous changes (5)

internal/controller/numaresourcesscheduler_controller.go
pkg/objectupdate/sched/sched.go
test/e2e/sched/sched_test.go
internal/controller/numaresourcesscheduler_controller_test.go
pkg/objectupdate/sched/sched_test.go

PodAntiAffinity is a legacy way to handle HA scenarios and in project state it caused deadlocks and regressions. This reverts commit b4e7fd1. Signed-off-by: Shereen Haj <shajmakh@redhat.com>

It's reasonable to have high-availability enabled to have scheduler pods number matches the control-plane nodes' number or workers on hypershift. but a number greater than that makes no point. This commit degrades the scheduler status in case a higher number of replicas is requested. Signed-off-by: Shereen Haj <shajmakh@redhat.com>

coderabbitai

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

internal/wait/deployment.go (1)
27-40: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

This waiter can still report success before a no-scale rollout is observed.

IsDeploymentComplete(dp.Generation, ...) only proves the status is not older than the pre-existing Deployment generation. When the rollout is triggered indirectly and keeps the same replica count, callers like test/e2e/sched/sched_test.go Line 80 and Line 135 pass a Deployment fetched before the CR update, so ObservedGeneration == dp.Generation is already true and this can return before the new image/spec is actually rolled out.

That weakens the new “generation-aware” contract for exactly the same-replica updates this PR adds. The waiter needs a way to observe a newer Deployment generation for indirect rollouts, or a separate helper for “wait for reconcile after parent object change” instead of reusing the stale generation from the caller.

Also applies to: 66-72
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/wait/deployment.go` around lines 27 - 40, The waiter currently can
return success using the caller's stale dp.Generation; change
ForDeploymentCompleteWithReplicas (and the similar block at 66-72) so it will
not treat the Deployment as complete until a newer generation has been observed:
require that either updatedDp.Generation > dp.Generation or
updatedDp.Status.ObservedGeneration > dp.Generation (i.e., the controller has
observed a newer generation) before allowing IsDeploymentComplete to return
success; in practice, check the
updatedDp.Generation/updatedDp.Status.ObservedGeneration and only call/accept
IsDeploymentComplete when the newer generation has been observed, otherwise keep
polling.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@test/e2e/sched/sched_test.go`:
- Around line 653-660: The helper infoUponDeploymentCompleteFailure logs
deployment status but only iterates pods when
podlist.With(e2eclient.Client).ByDeployment(...) returns an error, which is
inverted; change the control flow so that after calling
podlist.With(...).ByDeployment(ctx, *schedDp) you iterate and log each pod
(using pods range and klog.InfoS with "podName"/"podStatus") when err == nil,
and handle the error case by logging the retrieval error (include err) instead;
update the branches around the call to ByDeployment to reflect this corrected
logic.

In `@test/e2e/serial/tests/scheduler_removal.go`:
- Around line 162-163: The wait call mistakenly calls Interval twice which
overwrites the first interval; update the chain that calls
wait.With(fxt.Client).Interval(2*time.Second).Interval(30*time.Second).ForDeploymentCompleteWithReplicas(...)
so the second Interval becomes a Timeout (e.g., Timeout(30*time.Second)) to
bound the overall wait while keeping a 2s poll cadence; target the wait.With /
Interval / Timeout / ForDeploymentCompleteWithReplicas chain in
scheduler_removal.go and ensure the poll interval remains 2s and the overall
timeout is 30s.

In `@test/e2e/serial/tests/workload_placement.go`:
- Line 286: After triggering a rollout and assigning updatedDp, change calls
that pass the original dp into the generation-aware waiter to instead pass the
updated deployment object returned from the update; specifically update the
three calls to
wait.With(fxt.Client).Interval(...).Timeout(...).ForDeploymentCompleteWithReplicas(...)
so they use updatedDp (the result of the update) rather than dp to ensure the
waiter observes the new generation when waiting for replicas, keeping references
to function ForDeploymentCompleteWithReplicas and variables dp/updatedDp to
locate the sites.

---

Outside diff comments:
In `@internal/wait/deployment.go`:
- Around line 27-40: The waiter currently can return success using the caller's
stale dp.Generation; change ForDeploymentCompleteWithReplicas (and the similar
block at 66-72) so it will not treat the Deployment as complete until a newer
generation has been observed: require that either updatedDp.Generation >
dp.Generation or updatedDp.Status.ObservedGeneration > dp.Generation (i.e., the
controller has observed a newer generation) before allowing IsDeploymentComplete
to return success; in practice, check the
updatedDp.Generation/updatedDp.Status.ObservedGeneration and only call/accept
IsDeploymentComplete when the newer generation has been observed, otherwise keep
polling.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Enterprise

Run ID: 41902836-39ef-489b-a0b6-58348bb7bdfb

📥 Commits

Reviewing files that changed from the base of the PR and between 098333c and 4f252a0.

📒 Files selected for processing (14)

internal/controller/numaresourcesscheduler_controller.go
internal/controller/numaresourcesscheduler_controller_test.go
internal/wait/deployment.go
pkg/objectupdate/sched/sched.go
pkg/objectupdate/sched/sched_test.go
test/e2e/sched/sched_test.go
test/e2e/serial/tests/configuration.go
test/e2e/serial/tests/non_regression.go
test/e2e/serial/tests/resource_accounting.go
test/e2e/serial/tests/scheduler_removal.go
test/e2e/serial/tests/workload_overhead.go
test/e2e/serial/tests/workload_placement.go
test/e2e/serial/tests/workload_placement_nodelabel.go
test/e2e/serial/tests/workload_placement_tmpol.go

🚧 Files skipped from review as they are similar to previous changes (4)

pkg/objectupdate/sched/sched_test.go
pkg/objectupdate/sched/sched.go
internal/controller/numaresourcesscheduler_controller.go
internal/controller/numaresourcesscheduler_controller_test.go

ffromani · 2026-05-28T08:35:56Z

 func areDeploymentReplicasAvailable(newStatus *appsv1.DeploymentStatus, replicas int32) bool {
-	return newStatus.UpdatedReplicas == replicas &&
-		newStatus.Replicas == replicas &&
-		newStatus.AvailableReplicas == replicas
+	if newStatus.Replicas != replicas {
+		klog.InfoS("newStatus.Replicas", "expected", replicas, "found", newStatus.Replicas)
+		return false
+	}
+	if newStatus.UpdatedReplicas != replicas {
+		klog.InfoS("newStatus.UpdatedReplicas", "expected", replicas, "found", newStatus.UpdatedReplicas)
+		return false
+	}
+	if newStatus.AvailableReplicas != replicas {
+		klog.InfoS("newStatus.AvailableReplicas", "expected", replicas, "found", newStatus.AvailableReplicas)
+		return false
+	}
+	return true
 }

 func IsDeploymentComplete(dp *appsv1.Deployment, newStatus *appsv1.DeploymentStatus) bool {
-	return areDeploymentReplicasAvailable(newStatus, *(dp.Spec.Replicas)) &&
-		newStatus.ObservedGeneration >= dp.Generation
+	if newStatus.ObservedGeneration < dp.Generation {
+		klog.InfoS("generation is older than expected to indicate the deployment is complete", "old", dp.Generation, "new", newStatus.ObservedGeneration)
+		return false
+	}
+
+	return areDeploymentReplicasAvailable(newStatus, *(dp.Spec.Replicas))


This is a bit heavy handed. Currently this is called only in test code, but the moment we pull in production code we will have log spam. There are options, but all require more work to fix. Can we simplify this commit or drop it entirely?

this was needed for debugging, it essentially describes why the functions fail which is why we have the last fixing commit so this is very helpful details for troubleshooting the tests.
If ever we decide to use that in production, we need to be aware of the old use which is initially meant for tests. we can anyway create a production version and improve it then as we see fit.
Thinking about the backport, I think it's better to keep the old function ForDeploymentComplete to reduce any tests conflicts which will then simplify the commit a bit in terms of the adaptations across the codebase. would that sound good enough?

discouraging production use could be as simple as move this code in a subtree of test/ perhaps test/internal/wait/... but I won't do in this PR.

A way forward would be post a followup PR with first the trivial move and then adding the logs

actually let's move forward and let's revisit later

ffromani · 2026-05-28T08:38:07Z

-	// not enough to guarantee that the deployment is ready with the NEW replica, thus need to cover that by
-	// additional checks as the context requires
+// ForDeploymentCompleteWithReplicas waits for the deployment to be complete and have the specified number of replicas.
+func (wt Waiter) ForDeploymentCompleteWithReplicas(ctx context.Context, dp *appsv1.Deployment, newExpectedReplicas int32) (*appsv1.Deployment, error) {


good addition. If most of the codebase actually uses the pattern

ForDeploymentCompleteWithReplicas(ctx, dp, *dp.Spec.Replicas)

and at glance it seems the case, it's worth keeping the old API ForDeploymentComplete

I intentionally updated the codebase because ForDeploymentComplete without having the correct expected replicas count does not sound complete and can be confusing. Internally in the function, replicas check is a core one and in order to avoid using the incorrect function for deployment waiter I overrode the old API. This was actually an old bug as noted in the omitted comment; the new version is expected to handle all deployments now.

ok, so let's just rename ForDeploymentCompleteWithReplicas to ForDeploymentComplete

in the latest push I restored ForDeploymentComplete to keep the diff as less as possible to reduce conflicts while backporting. I can address changing the whole code base in a followup PR including #4177 (comment)

ffromani

/approve
/lgtm

we can defer the remaining comments to followups

openshift-ci · 2026-05-28T11:53:51Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ffromani, shajmakh

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details

Needs approval from an approver in each of these files:

~~OWNERS~~ [ffromani,shajmakh]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

shajmakh · 2026-05-28T12:01:30Z

/hold
for updates to make the automated backport process smoother

To evenly spread the pods across the nodes in a a balanced way taking into account the new replicaset pods while rollout. Signed-off-by: Shereen Haj <shajmakh@redhat.com>

The check includes many verification steps that only when all passed the function pass. Sometimes the check fails without extra details of what exactly was the step that failed. This commit adds debugging logs to provide these details. Signed-off-by: Shereen Haj <shajmakh@redhat.com>

So far the old function was waiting for the deployment to be ready with the same replicas count as the old one. It is however possible that the new deployment version includes a different replicas field hence should expect to a different replicas count. The old version would fail on such case where the old count vs the new one are not the same. This commit fixes this issue by introducing new function that accepts the expected new replicas count and compare against it. The old function API still exists but is redirected to use the new function with same old behavior. Existing tests are not affected by this. Signed-off-by: Shereen Haj <shajmakh@redhat.com>

coderabbitai · 2026-05-28T12:22:33Z

Actionable comments posted: 0

openshift-ci-robot · 2026-05-28T12:41:43Z

@shajmakh: This pull request references Jira Issue OCPBUGS-79418, which is valid.

3 validation(s) were run on this bug

bug is open, matching expected state (open)
bug target version (5.0.0) matches configured target version for branch (5.0.0)
bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, POST)

Requesting review from QA contact:
/cc @rshemtov13

Details

In response to this:

To evenly spread the pods across the nodes in a balanced way taking
into account the new replicaset pods while rollout.

previous discussions:
(reverted) #4021
(cancelled due to regressions) #4028

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

ffromani

/lgtm

shajmakh · 2026-05-29T10:01:10Z

/unhold

shajmakh · 2026-05-29T10:02:09Z

/cherry-pick release-4.22,release-4.21, release-4.20

openshift-cherrypick-robot · 2026-05-29T10:02:13Z

@shajmakh: once the present PR merges, I will cherry-pick it on top of release-4.22,release-4.21, in a new PR and assign it to you.

Details

In response to this:

/cherry-pick release-4.22,release-4.21, release-4.20

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

openshift-ci-robot · 2026-05-29T12:22:50Z

@shajmakh: Jira Issue OCPBUGS-79418: All pull requests linked via external trackers have merged:

openshift-kni/numaresources-operator#4177

Jira Issue OCPBUGS-79418 has been moved to the MODIFIED state.

Details

In response to this:

To evenly spread the pods across the nodes in a balanced way taking
into account the new replicaset pods while rollout.

previous discussions:
(reverted) #4021
(cancelled due to regressions) #4028

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

openshift-cherrypick-robot · 2026-05-29T12:23:30Z

@shajmakh: cannot checkout release-4.22,release-4.21,: error checking out "release-4.22,release-4.21,": exit status 1 error: pathspec 'release-4.22,release-4.21,' did not match any file(s) known to git

Details

In response to this:

/cherry-pick release-4.22,release-4.21, release-4.20

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

shajmakh · 2026-05-29T13:27:26Z

/cherry-pick release-4.22 release-4.21 release-4.20

openshift-cherrypick-robot · 2026-05-29T13:28:09Z

@shajmakh: new pull request created: #4265

Details

In response to this:

/cherry-pick release-4.22 release-4.21 release-4.20

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

openshift-ci Bot requested review from mrniranjan and swatisehgal May 25, 2026 13:53

openshift-ci Bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label May 25, 2026

shajmakh mentioned this pull request May 25, 2026

WIP: topologySpreadConstraints for HA #4126

Closed

coderabbitai Bot reviewed May 25, 2026

View reviewed changes

Comment thread internal/controller/numaresourcesscheduler_controller.go

Comment thread test/e2e/sched/sched_test.go Outdated

Comment thread test/e2e/sched/sched_test.go Outdated

Comment thread test/e2e/sched/sched_test.go

openshift-ci Bot requested a review from ffromani May 25, 2026 14:10

shajmakh force-pushed the tsc-for-ha-sched branch from 9a45f1f to aad497d Compare May 26, 2026 05:42

shajmakh force-pushed the tsc-for-ha-sched branch from aad497d to b3da392 Compare May 26, 2026 06:32

coderabbitai Bot reviewed May 26, 2026

View reviewed changes

Comment thread pkg/objectupdate/sched/sched.go

ffromani reviewed May 26, 2026

View reviewed changes

Comment thread pkg/objectupdate/sched/sched.go Outdated

Comment thread pkg/objectupdate/sched/sched_test.go Outdated

shajmakh force-pushed the tsc-for-ha-sched branch 3 times, most recently from 2207af4 to c2e260d Compare May 26, 2026 08:46

shajmakh changed the title ~~ctrl: sched: add topologySpreadConstraints to deployment~~ OCPBUGS-79418: ctrl: sched: add topologySpreadConstraints to deployment May 26, 2026

openshift-ci Bot requested a review from rshemtov13 May 26, 2026 11:50

Tal-or reviewed May 26, 2026

View reviewed changes

Comment thread pkg/objectupdate/sched/sched.go

ffromani reviewed May 26, 2026

View reviewed changes

Comment thread internal/controller/numaresourcesscheduler_controller_test.go Outdated

Comment thread pkg/objectupdate/sched/sched_test.go Outdated

Comment thread test/e2e/sched/sched_test.go

shajmakh force-pushed the tsc-for-ha-sched branch from c2e260d to 098333c Compare May 27, 2026 06:45

coderabbitai Bot reviewed May 27, 2026

View reviewed changes

Comment thread internal/wait/deployment.go

shajmakh added 2 commits May 27, 2026 10:25

Revert "ctrl: sched: add PodAntiAffinity to deployment"

c04e8f7

PodAntiAffinity is a legacy way to handle HA scenarios and in project state it caused deadlocks and regressions. This reverts commit b4e7fd1. Signed-off-by: Shereen Haj <shajmakh@redhat.com>

shajmakh force-pushed the tsc-for-ha-sched branch 3 times, most recently from cc90aeb to 01d534d Compare May 27, 2026 15:17

coderabbitai Bot reviewed May 28, 2026

View reviewed changes

Comment thread test/e2e/sched/sched_test.go

Comment thread test/e2e/serial/tests/scheduler_removal.go Outdated

Comment thread test/e2e/serial/tests/workload_placement.go Outdated

shajmakh force-pushed the tsc-for-ha-sched branch from b1ee4d6 to ef1ff13 Compare May 28, 2026 07:11

ffromani reviewed May 28, 2026

View reviewed changes

openshift-ci Bot assigned ffromani May 28, 2026

openshift-ci Bot added the lgtm Indicates that a PR is ready to be merged. label May 28, 2026

openshift-ci Bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label May 28, 2026

shajmakh force-pushed the tsc-for-ha-sched branch from ef1ff13 to e3a5016 Compare May 28, 2026 12:17

openshift-ci Bot removed the lgtm Indicates that a PR is ready to be merged. label May 28, 2026

shajmakh added 3 commits May 28, 2026 15:21

ctrl: sched: add topologySpreadConstraints to deployment

e192eb5

To evenly spread the pods across the nodes in a a balanced way taking into account the new replicaset pods while rollout. Signed-off-by: Shereen Haj <shajmakh@redhat.com>

shajmakh force-pushed the tsc-for-ha-sched branch from e3a5016 to f30a924 Compare May 28, 2026 12:21

ffromani reviewed May 28, 2026

View reviewed changes

openshift-ci Bot added the lgtm Indicates that a PR is ready to be merged. label May 28, 2026

openshift-ci Bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label May 29, 2026

openshift-merge-bot Bot merged commit 2f6eb77 into openshift-kni:main May 29, 2026
21 checks passed

openshift-cherrypick-robot mentioned this pull request May 29, 2026

[release-4.22] OCPBUGS-85702: ctrl: sched: add topologySpreadConstraints to deployment #4265

Open

shajmakh mentioned this pull request Jun 2, 2026

[4.21] OCPBUGS-79418: ctrl: sched: add topologySpreadConstraints to deployment #4307

Open

Conversation

shajmakh commented May 25, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

coderabbitai Bot commented May 25, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Reviews paused

Walkthrough

Changes

Sequence Diagram(s)

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

ffromani commented May 25, 2026

Uh oh!

shajmakh commented May 26, 2026

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

ffromani left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

shajmakh commented May 26, 2026

Uh oh!

shajmakh commented May 26, 2026

Uh oh!

openshift-ci-robot commented May 26, 2026

Uh oh!

Tal-or left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

ffromani left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

ffromani left a comment

Choose a reason for hiding this comment

Uh oh!

openshift-ci Bot commented May 28, 2026

shajmakh commented May 25, 2026 •

edited

Loading

coderabbitai Bot commented May 25, 2026 •

edited

Loading