Skip to content

fix: converge omitted-request resize resources#10394

Draft
weicao wants to merge 2 commits into
mainfrom
bugfix/instanceset-resize-resource-compare
Draft

fix: converge omitted-request resize resources#10394
weicao wants to merge 2 commits into
mainfrom
bugfix/instanceset-resize-resource-compare

Conversation

@weicao

@weicao weicao commented Jun 17, 2026

Copy link
Copy Markdown
Contributor

Problem

Fixes #10395.

This replaces closed PR #10362 with the narrower resource-only scope requested in review.

When the desired pod template omits CPU or memory requests and only sets limits, Kubernetes can keep a live request value that was defaulted from an older limit. The old comparison treated the omitted desired request as the desired limit, so after a resize updated the limit the next reconcile could still see a request difference even though the desired template does not own that request.

Changes

  • Compare CPU and memory requests only when the desired template explicitly sets that request.
  • Keep limit comparison unchanged.
  • Add focused Instance and InstanceSet tests for omitted-request resource comparison.

Runtime status

Exact-head Redis Parameters x replication-twemproxy validation with kubeblocks:pr-10394-75d1223-stella failed. The first blocker in that run is control-plane config-hash commit/status convergence, tracked separately by #10369. This PR remains draft and does not unblock Redis PR #2819.

Scope

  • No ObjectTree patch option.
  • No metadata-only pod patch path.
  • No Redis addon release-ready claim.

Tests

  • go test ./pkg/controller/instance ./pkg/controller/instanceset -count=1

@apecloud-bot

Copy link
Copy Markdown
Collaborator

Auto Cherry-pick Instructions

Usage:
  - /nopick: Not auto cherry-pick when PR merged.
  - /pick: release-x.x [release-x.x]: Auto cherry-pick to the specified branch when PR merged.

Example:
  - /nopick
  - /pick release-1.1

CLA Recheck Instructions

Usage:
  - /recheck-cla: Trigger a re-check of CLA status for this pull request.
Example:
  - /recheck-cla

@github-actions github-actions Bot added the size/XL Denotes a PR that changes 500-999 lines. label Jun 17, 2026
@weicao weicao force-pushed the bugfix/instanceset-resize-resource-compare branch from abaff59 to 75d1223 Compare June 17, 2026 14:37
@github-actions github-actions Bot added size/L Denotes a PR that changes 100-499 lines. and removed size/XL Denotes a PR that changes 500-999 lines. labels Jun 17, 2026
@weicao weicao added the nopick Not auto cherry-pick when PR merged label Jun 17, 2026
@codecov

codecov Bot commented Jun 17, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 61.87%. Comparing base (00dc1b7) to head (dc80a11).
⚠️ Report is 21 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main   #10394      +/-   ##
==========================================
- Coverage   61.87%   61.87%   -0.01%     
==========================================
  Files         533      533              
  Lines       63609    63621      +12     
==========================================
+ Hits        39360    39363       +3     
- Misses      20661    20674      +13     
+ Partials     3588     3584       -4     
Flag Coverage Δ
unittests 61.87% <100.00%> (-0.01%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@weicao

weicao commented Jun 17, 2026

Copy link
Copy Markdown
Contributor Author

Runtime validation update (2026-06-18 CST)

Exact head 75d122399 image kubeblocks:pr-10394-75d1223-stella was tested in the Redis Parameters x replication-twemproxy focused gate. Result: FAIL (PASS 14 / FAIL 1 / SKIP 0).

Evidence verified locally:

  • v2 key package sha256: 0621060b6d98fc77584f215bb8f03720a425e266501151711bfdbf419bda7c96
  • The live controller Deployment/Pod used the exact PR image with imagePullPolicy: Never and restartCount 0; all 3 nodes had the sideloaded tag.
  • OpsRequest rds-params-twemproxy-reconf-79353-2502 stayed Running for 900s with ReconfigureStarted.
  • Component rds-params-twemproxy-redis is generation 3 / observedGeneration 3, but stays Updating with Progressing=True, reason=WorkloadNotUpdated.
  • InstanceSet rds-params-twemproxy-redis is generation 2 / observedGeneration 1; status.instanceStatus still reports both Redis pods at old configHash 7fd467d678.
  • Component desired configHash and the ConfigMap are 75d4889cd7; the ConfigMap data contains maxmemory-policy allkeys-lru.
  • InstanceSet controller repeatedly logs successful reconfigure for rds-params-twemproxy-redis-1 with configHash 75d4889cd7, but the InstanceSet status does not converge.

Classification for this exact run: first blocker is control-plane config-hash commit/status convergence, tracked by #10369. This PR remains draft and should not be used to unblock Redis PR #2819. The resource-comparison change may still address #10395, but it is not sufficient for this Redis focused gate.

When a desired pod template sets CPU or memory limits but omits requests, Kubernetes can preserve a live request value defaulted from an older limit. The in-place resource comparison should not infer a desired request from the desired limit in that case; otherwise a post-resize reconcile keeps reporting a request difference even though the desired template does not own that request.

Compare CPU and memory requests only when the desired template explicitly sets that request. Continue to compare limits normally.
@weicao weicao force-pushed the bugfix/instanceset-resize-resource-compare branch from 75d1223 to 28c1e02 Compare June 17, 2026 16:45
@weicao

weicao commented Jun 17, 2026

Copy link
Copy Markdown
Contributor Author

Pushed scope cleanup after the runtime classification:

  • removed runtime-style config-hash convergence tests from this resource-only PR
  • kept the implementation to explicit CPU/memory request comparison only
  • added focused Instance/InstanceSet resource-comparison tests
  • local go test ./pkg/controller/instance ./pkg/controller/instanceset -count=1 passed

PR remains draft; #10369 carries the control-plane config-hash/status blocker.

@CLAassistant

Copy link
Copy Markdown

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
1 out of 2 committers have signed the CLA.

✅ weicao
❌ Wei


Wei seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you have already a GitHub account, please add the email address used for this commit to your account.
You have signed the CLA already but the status is still pending? Let us recheck it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

nopick Not auto cherry-pick when PR merged size/L Denotes a PR that changes 100-499 lines.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Omitted CPU/memory requests keep resize resource comparison from converging

3 participants