chore: patch metadata-only pod updates #10398
When a desired pod template sets CPU or memory limits but omits requests, Kubernetes can preserve a live request value defaulted from an older limit. The in-place resource comparison should not infer a desired request from the desired limit in that case; otherwise a post-resize reconcile keeps reporting a request difference even though the desired template does not own that request. Compare CPU and memory requests only when the desired template explicitly sets that request. Continue to compare limits normally.
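A minimal sketch of the comparison rule described above, not the controller's actual code. `ResourceList` here is a hypothetical stand-in for `corev1.ResourceList` with quantities reduced to plain integers, and `requestsEqual` is an illustrative name.

```go
package main

import "fmt"

// ResourceList is a hypothetical stand-in for corev1.ResourceList,
// mapping a resource name to a quantity in milli-units.
type ResourceList map[string]int64

// requestsEqual compares CPU and memory requests only for the names the
// desired template sets explicitly. A request that exists only on the live
// pod is treated as server-defaulted and ignored.
func requestsEqual(desired, live ResourceList) bool {
	for _, name := range []string{"cpu", "memory"} {
		want, explicit := desired[name]
		if !explicit {
			continue // the desired template does not own this request
		}
		if got, ok := live[name]; !ok || got != want {
			return false
		}
	}
	return true
}

func main() {
	// The desired template sets only limits, so it carries no explicit requests.
	desired := ResourceList{}
	// The live pod still holds a request defaulted from an older limit.
	live := ResourceList{"cpu": 500}
	fmt.Println(requestsEqual(desired, live)) // true

	// An explicit desired request is still compared against the live value.
	fmt.Println(requestsEqual(ResourceList{"cpu": 1000}, live)) // false
}
```

With this rule the post-resize reconcile stops reporting a difference for a request the template never set, while an explicit request mismatch is still detected.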
Codecov Report❌ Patch coverage is
Additional details and impacted files

Coverage Diff

| | main | #10398 | +/- |
|---|---|---|---|
| Coverage | 61.87% | 61.90% | +0.02% |
| Files | 533 | 533 | |
| Lines | 63609 | 63623 | +14 |
| Hits | 39360 | 39384 | +24 |
| Misses | 20661 | 20651 | -10 |
| Partials | 3588 | 3588 | |
Runtime validation for head Exact image / controller identity:
Focused gate result:
First-blocker classification: control plane. The setup, exact image identity, runner wait, and probe target are sufficient for this focused result; the Redis product/action path is not the first blocker in this run.
Boundary: #10398 remains draft. This head does not unblock the Redis addon runtime gate or #2819; a follow-up controller fix is required before another exact-image validation.
After a successful reconfigure, the remaining pod diff can be only config-hash metadata. A full Pod update can conflict with concurrent kubelet/status writes and leave the annotation stale, which keeps the OpsRequest running even after the runtime config changed. Add an explicit ObjectTree patch option and use it for metadata-only in-place pod updates in the InstanceSet and Instance reconcilers.
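To illustrate why a patch avoids the conflict, here is a minimal sketch that builds a JSON merge patch touching only `metadata.annotations`. It assumes the annotation key `example.io/config-hash`, which is hypothetical, and it does not show the ObjectTree patch option itself, whose API is not described here. Because the body carries no `resourceVersion`, `spec`, or `status`, concurrent status writes on the same Pod do not invalidate it.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// annotationPatch returns a JSON merge patch that sets only the annotations
// whose desired value differs from the live value. It returns nil when
// nothing needs to change. Annotations present only on the live object are
// left alone, since other writers may own them.
func annotationPatch(live, desired map[string]string) ([]byte, error) {
	changed := map[string]string{}
	for key, want := range desired {
		if got, ok := live[key]; !ok || got != want {
			changed[key] = want
		}
	}
	if len(changed) == 0 {
		return nil, nil
	}
	return json.Marshal(map[string]any{
		"metadata": map[string]any{"annotations": changed},
	})
}

func main() {
	// "example.io/config-hash" is a hypothetical key used for illustration.
	live := map[string]string{"example.io/config-hash": "old"}
	desired := map[string]string{"example.io/config-hash": "6b4845ccf7"}

	patch, err := annotationPatch(live, desired)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(patch))
	// {"metadata":{"annotations":{"example.io/config-hash":"6b4845ccf7"}}}
}
```

A body like this would typically be sent with a merge-patch content type; how the reconcilers submit it is outside the scope of this sketch.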
Head updated from 04f6d7e to 31908d0.
Updated this draft PR to a narrower follow-up head: What changed from the failed
Local checks for the new head:
Boundary: this is still draft. The previous exact-image runtime run failed, and the new head still needs CI plus exact-image Redis Parameters x replication-twemproxy validation before it can be treated as a candidate fix for #10369.
Runtime validation for head Exact controller identity:
Focused gate result:
Convergence evidence:
Acceptance classification: patch-version focused validation passed for the Redis Parameters x replication-twemproxy gate. No first blocker was observed in this focused run.
Boundary: this validates the current #10398 metadata-only PATCH path for this #10369-focused gate, but it is not a Redis addon release-ready claim or a broader matrix result. The PR is still stacked on #10394, so the final merge path depends on accepting, rebasing, or retargeting the base chain.
    }
    if !equalField(oc.Resources.Requests, realRequests) {
        return false
    for _, resourceName := range []corev1.ResourceName{corev1.ResourceCPU, corev1.ResourceMemory} {
[P1] Preserve explicit non-CPU/memory resource changes. VerticalScaling validation still accepts hugepages-* resources, but this loop now compares only CPU and memory requests, and equalField for ResourceList also ignores non-CPU/memory limits. A desired pod that changes a hugepages-* request/limit can therefore be treated as equal, so getPodUpdatePolicy returns noOps and the scaling operation can converge without applying the requested resource change. Keep the new omitted CPU/memory request handling scoped to CPU/memory defaulting, but continue comparing explicit non-CPU/memory requests/limits or reject them before this path. The same issue exists in pkg/controller/instance/in_place_update_utils.go.
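One way to read this suggestion, sketched under assumptions: tolerate a live-only request only for CPU and memory, and keep comparing every other resource name strictly. `ResourceList`, `isDefaultable`, and `requestsEqualKeepingExtended` are hypothetical names, and quantities are simplified to integers rather than `resource.Quantity`.

```go
package main

import "fmt"

// ResourceList is a hypothetical stand-in for corev1.ResourceList.
type ResourceList map[string]int64

// isDefaultable reports whether a live-only request may have been defaulted
// from a limit, which this sketch assumes applies to CPU and memory only.
func isDefaultable(name string) bool {
	return name == "cpu" || name == "memory"
}

// requestsEqualKeepingExtended requires every explicitly desired request to
// match the live value, and tolerates live-only entries only for CPU and
// memory. Other resources, such as hugepages-*, are compared strictly.
func requestsEqualKeepingExtended(desired, live ResourceList) bool {
	for name, want := range desired {
		if got, ok := live[name]; !ok || got != want {
			return false
		}
	}
	for name := range live {
		if _, ok := desired[name]; !ok && !isDefaultable(name) {
			return false
		}
	}
	return true
}

func main() {
	live := ResourceList{"cpu": 500, "hugepages-2Mi": 64}

	// A changed hugepages request is reported as a difference.
	fmt.Println(requestsEqualKeepingExtended(ResourceList{"hugepages-2Mi": 128}, live)) // false

	// An omitted CPU request alongside a matching hugepages request is equal.
	fmt.Println(requestsEqualKeepingExtended(ResourceList{"hugepages-2Mi": 64}, live)) // true
}
```

The alternative named in the comment, rejecting non-CPU/memory changes before this path, would make the strict branch unnecessary.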
Problem
Targets #10369.
This PR is stacked on #10394 so the resource-comparison fix stays separate from the config-hash/status convergence fix.
After a successful reconfigure, the remaining in-place Pod diff can be only metadata such as the config-hash annotation. Planning that as a full Pod update can race with concurrent status writes and leave the config-hash/status stale, which keeps the OpsRequest running even though the database-side reconfigure action returned OK.
The previous head `04f6d7e90770ed07d32a9c9b9435cab45782692d` used a normal-update-before-resize approach and failed the Redis Parameters x replication-twemproxy focused runtime gate. This head replaces that attempt with the narrower metadata-only patch path.
Changes
`/resize` subresource path.
Tests
- `go test ./pkg/controller/kubebuilderx ./pkg/controller/instance ./pkg/controller/instanceset -count=1`
- `git diff --check origin/bugfix/instanceset-resize-resource-compare..HEAD`
- `31908d0cb7aa64c79c882e6ee71dfbce0e668516`: green / expected-skip, including `make-test` and `push-pre-check (test)`.
Runtime status
- `04f6d7e90770ed07d32a9c9b9435cab45782692d`: exact-image Redis Parameters x replication-twemproxy focused gate failed, PASS 14 / FAIL 1 / SKIP 0; OpsRequest stayed `Running`; Component/InstanceSet status and config-hash did not converge.
- `31908d0cb7aa64c79c882e6ee71dfbce0e668516`: exact-image Redis Parameters x replication-twemproxy focused gate passed, PASS 34 / FAIL 0 / SKIP 0.
- `imagePullPolicy: Never`, manager pod `kubeblocks-6dfbccf448-8vbpw`, live imageID `sha256:80435fad50cb92d42df0aaff1bb7a23a8650559d44ff8c4f0bbb1f7ae15ff523`, restartCount 0.
- `Succeed`; Cluster `rds-params-twemproxy` reached `Running`; Redis Component generation/observedGeneration reached 5/5; Redis InstanceSet generation/observedGeneration reached 4/4 with ready/updated/available replicas 2/2 and configHash `6b4845ccf7`.
- `maxmemory-policy=allkeys-lru`, `hz=50`, and `databases=32`; dynamic changes kept pod UIDs unchanged; the static `databases` change rolled pods as expected; twemproxy SET/GET and replica readback passed.
- `redis-params-twemproxy-r12-31908-r1` was NotFound after cleanup and PV residue was empty.
Boundaries