chore: patch metadata-only pod updates #10401

Merged
leon-ape merged 1 commit into main from bugfix/instanceset-metadata-only-main on Jun 18, 2026

@weicao weicao commented Jun 18, 2026

Contributor

Problem

Fixes #10369.

After a successful reconfigure, the remaining in-place Pod diff may consist only of metadata, such as the config-hash annotation. Planning that diff as a full Pod update can race with concurrent status writes and leave the config-hash and status stale, which keeps the OpsRequest in Running even though the database-side reconfigure action returned OK.

This PR is a clean metadata-only branch based directly on main. It intentionally does not include the resource-compare cleanup from #10394.

Changes

  • Add an explicit ObjectTree option for planning PATCH actions.
  • Use PATCH for safe metadata-only in-place Pod updates in the InstanceSet and Instance reconcilers.
  • Keep non-metadata Pod changes on the existing switchover/full-update path.
  • Add kubebuilderx plan-builder coverage and update the focused InstanceSet test to assert metadata-only PATCH planning.
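
The idea behind the second bullet can be sketched as follows. This is an illustration under stated assumptions, not the code in this PR: `podLike`, `metadataOnlyDiff`, and `metadataMergePatch` are hypothetical names, the spec is reduced to a flat map, and the actual reconcilers may apply further safety conditions before choosing PATCH. The sketch checks that only labels or annotations differ and then builds a JSON merge patch limited to `metadata`.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// podLike is a hypothetical, trimmed-down Pod: metadata plus an opaque spec.
type podLike struct {
	Labels      map[string]string
	Annotations map[string]string
	Spec        map[string]string
}

// sameMap compares two string maps, treating nil and empty as equal.
func sameMap(a, b map[string]string) bool {
	if len(a) != len(b) {
		return false
	}
	for k, v := range a {
		if w, ok := b[k]; !ok || w != v {
			return false
		}
	}
	return true
}

// metadataOnlyDiff reports whether desired differs from current only in
// labels or annotations. It returns false when nothing differs at all.
func metadataOnlyDiff(current, desired podLike) bool {
	if !sameMap(current.Spec, desired.Spec) {
		return false
	}
	return !sameMap(current.Labels, desired.Labels) ||
		!sameMap(current.Annotations, desired.Annotations)
}

// changedKeys returns added or changed keys with their new values and
// removed keys mapped to nil, which a JSON merge patch treats as deletion.
func changedKeys(current, desired map[string]string) map[string]any {
	out := map[string]any{}
	for k, v := range desired {
		if old, ok := current[k]; !ok || old != v {
			out[k] = v
		}
	}
	for k := range current {
		if _, ok := desired[k]; !ok {
			out[k] = nil
		}
	}
	return out
}

// metadataMergePatch builds a JSON merge patch that touches only metadata.
func metadataMergePatch(current, desired podLike) ([]byte, error) {
	meta := map[string]any{}
	if d := changedKeys(current.Labels, desired.Labels); len(d) > 0 {
		meta["labels"] = d
	}
	if d := changedKeys(current.Annotations, desired.Annotations); len(d) > 0 {
		meta["annotations"] = d
	}
	return json.Marshal(map[string]any{"metadata": meta})
}

func main() {
	// "config-hash" and the spec contents are placeholder values.
	current := podLike{
		Annotations: map[string]string{"config-hash": "old"},
		Spec:        map[string]string{"image": "redis:7"},
	}
	desired := podLike{
		Annotations: map[string]string{"config-hash": "new"},
		Spec:        map[string]string{"image": "redis:7"},
	}
	if metadataOnlyDiff(current, desired) {
		patch, err := metadataMergePatch(current, desired)
		if err != nil {
			panic(err)
		}
		fmt.Println(string(patch)) // {"metadata":{"annotations":{"config-hash":"new"}}}
	}
}
```

Any spec difference makes `metadataOnlyDiff` return false, which corresponds to staying on the existing switchover/full-update path.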

Tests

  • git diff --check origin/main..HEAD
  • go test ./pkg/controller/kubebuilderx ./pkg/controller/instance ./pkg/controller/instanceset -count=1

Runtime status

  • Previous failed approach 04f6d7e90770ed07d32a9c9b9435cab45782692d: exact-image Redis Parameters x replication-twemproxy focused gate failed, PASS 14 / FAIL 1 / SKIP 0; OpsRequest stayed Running; Component/InstanceSet status and config-hash did not converge.
  • Previous stacked head 31908d0cb7aa64c79c882e6ee71dfbce0e668516: exact-image Redis Parameters x replication-twemproxy focused gate passed, PASS 34 / FAIL 0 / SKIP 0.
  • Clean head 5c5c2a060d15a7eb2c0e7fb25d640ac7a29ad579: exact-image Redis Parameters x replication-twemproxy focused gate passed, PASS 34 / FAIL 0 / SKIP 0.

Clean-head runtime evidence:

  • Image tag: kubeblocks:pr-10401-5c5c2a0-stella
  • Image tar sha256: b25456144e7cceb6ab6e0d1c841521a99c5f8da2bd0fa914278bf690f4d66a89
  • Import digest from tar: sha256:4102f31aa873f864fe115ceaf6f59b29673be2e861f4902183f1cfe6be7ee56e
  • Live manager pod: kubeblocks-785f58d76b-8wd9b, imageID sha256:f4584d503aba16f3cc28d4057dbba319e6c73a71607d73f94e0314853bdb51d1, startTime 2026-06-18T06:17:34Z, restartCount 0
  • Evidence package: redis-pr10401-5c5c2a0-parameters-twemproxy-20260618T1411.tgz, sha256 bd3ebe3a2e91e9e0741fad0f80f5b31940fcb0390e1498006818296c5f4b863f

Observed clean-head gate details:

  • A02 dynamic maxmemory-policy: OpsRequest Succeed, ConfigMap updated, both Redis pods read back allkeys-lru, pod UIDs unchanged, twemproxy SET/GET OK.
  • A03 dynamic hz: OpsRequest Succeed, both Redis pods read back 50, pod UIDs unchanged, twemproxy SET/GET OK.
  • A04 static databases: OpsRequest Succeed, both Redis pods read back 32, pod UIDs changed as expected, twemproxy SET/GET OK.
  • A05 replica readback after reconfigure passed.
  • A06 cleanup passed: namespace NotFound and no PV residue.
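
The "pod UIDs unchanged" and "pod UIDs changed as expected" checks above separate in-place updates from Pod recreation. A minimal sketch of that comparison, assuming two snapshots of pod name to UID taken before and after the OpsRequest; the function name, pod names, and UIDs are hypothetical and not taken from the gate scripts:

```go
package main

import (
	"fmt"
	"sort"
)

// recreatedPods compares two snapshots of pod name -> UID and returns the
// sorted names whose UID changed or that exist only in the second snapshot.
func recreatedPods(before, after map[string]string) []string {
	var changed []string
	for name, uid := range after {
		if old, ok := before[name]; !ok || old != uid {
			changed = append(changed, name)
		}
	}
	sort.Strings(changed)
	return changed
}

func main() {
	before := map[string]string{"redis-0": "uid-a", "redis-1": "uid-b"}

	// Dynamic parameter: Pods are expected to be updated in place.
	afterDynamic := map[string]string{"redis-0": "uid-a", "redis-1": "uid-b"}
	fmt.Println("dynamic recreated:", len(recreatedPods(before, afterDynamic)))

	// Static parameter: Pods are expected to be recreated.
	afterStatic := map[string]string{"redis-0": "uid-c", "redis-1": "uid-d"}
	fmt.Println("static recreated:", recreatedPods(before, afterStatic))
}
```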

Boundaries

  • This PR is the metadata-only controller fix only.
  • #10394 (fix: converge omitted-request resize resources) remains a separate PR that covers the resource-compare cleanup.
  • This is patch-version focused validation for Redis Parameters x replication-twemproxy only, not a Redis addon release-ready claim or a broader matrix result.

After a successful reconfigure, the remaining pod diff can be only config-hash metadata. A full Pod update can conflict with concurrent kubelet/status writes and leave the annotation stale, which keeps the OpsRequest running even after the runtime config changed.

Add an explicit ObjectTree patch option and use it for metadata-only in-place pod updates in the InstanceSet and Instance reconcilers.
@weicao weicao requested review from a team and leon-ape as code owners June 18, 2026 06:00
@github-actions github-actions Bot added the size/L label (denotes a PR that changes 100-499 lines) Jun 18, 2026
@apecloud-bot

Collaborator

Auto Cherry-pick Instructions

Usage:
  - /nopick: Do not auto cherry-pick when the PR is merged.
  - /pick: release-x.x [release-x.x]: Auto cherry-pick to the specified branches when the PR is merged.

Example:
  - /nopick
  - /pick release-1.1

CLA Recheck Instructions

Usage:
  - /recheck-cla: Trigger a re-check of CLA status for this pull request.
Example:
  - /recheck-cla

@leon-ape leon-ape changed the title from "fix: patch metadata-only pod updates" to "chore: patch metadata-only pod updates" Jun 18, 2026
@leon-ape leon-ape added the pick-1.1 label (auto cherry-pick to release-1.1 when the PR is merged) Jun 18, 2026
@apecloud-bot apecloud-bot added the approved label (PR Approved Test) Jun 18, 2026
codecov Bot commented Jun 18, 2026


Codecov Report

❌ Patch coverage is 65.38462% with 9 lines in your changes missing coverage. Please review.
✅ Project coverage is 61.97%. Comparing base (00dc1b7) to head (5c5c2a0).

Files with missing lines                          Patch %   Lines
pkg/controller/instance/reconciler_update.go      0.00%     7 Missing ⚠️
pkg/controller/instanceset/reconciler_update.go   71.42%    1 Missing and 1 partial ⚠️

Additional details and impacted files
@@            Coverage Diff             @@
##             main   #10401      +/-   ##
==========================================
+ Coverage   61.87%   61.97%   +0.09%     
==========================================
  Files         533      533              
  Lines       63609    63621      +12     
==========================================
+ Hits        39360    39427      +67     
+ Misses      20661    20618      -43     
+ Partials     3588     3576      -12     

Flag        Coverage Δ
unittests   61.97% <65.38%> (+0.09%) ⬆️


weicao commented Jun 18, 2026

Contributor Author

Runtime validation update for clean head 5c5c2a060d15a7eb2c0e7fb25d640ac7a29ad579:

  • Scope: Redis Parameters x replication-twemproxy focused gate.
  • Result: PASS 34 / FAIL 0 / SKIP 0.
  • Image tag: kubeblocks:pr-10401-5c5c2a0-stella.
  • Image tar sha256: b25456144e7cceb6ab6e0d1c841521a99c5f8da2bd0fa914278bf690f4d66a89.
  • Import digest from tar: sha256:4102f31aa873f864fe115ceaf6f59b29673be2e861f4902183f1cfe6be7ee56e.
  • Live manager pod: kubeblocks-785f58d76b-8wd9b, imageID sha256:f4584d503aba16f3cc28d4057dbba319e6c73a71607d73f94e0314853bdb51d1, startTime 2026-06-18T06:17:34Z, restartCount 0.
  • Evidence package: redis-pr10401-5c5c2a0-parameters-twemproxy-20260618T1411.tgz, sha256 bd3ebe3a2e91e9e0741fad0f80f5b31940fcb0390e1498006818296c5f4b863f.

Observed gate details:

  • A02 dynamic maxmemory-policy: OpsRequest Succeed, ConfigMap updated, both Redis pods read back allkeys-lru, pod UIDs unchanged, twemproxy SET/GET OK.
  • A03 dynamic hz: OpsRequest Succeed, both Redis pods read back 50, pod UIDs unchanged, twemproxy SET/GET OK.
  • A04 static databases: OpsRequest Succeed, both Redis pods read back 32, pod UIDs changed as expected, twemproxy SET/GET OK.
  • A05 replica readback after reconfigure passed.
  • A06 cleanup passed: namespace NotFound and no PV residue.

Boundary: this is exact clean-head validation for the focused Redis Parameters x replication-twemproxy gate. It is not a Redis addon release-ready claim or a broader matrix result.

@leon-ape leon-ape merged commit e296fcd into main Jun 18, 2026
68 of 73 checks passed
@leon-ape leon-ape deleted the bugfix/instanceset-metadata-only-main branch June 18, 2026 07:04
@github-actions github-actions Bot added this to the Release 1.2.0 milestone Jun 18, 2026
@apecloud-bot

Collaborator

/cherry-pick release-1.1

@apecloud-bot

Collaborator

🤖 says: cherry pick action finished successfully 🎉!
See: https://github.com/apecloud/kubeblocks/actions/runs/27742714915

Labels

  • approved: PR Approved Test
  • pick-1.1: Auto cherry-pick to release-1.1 when PR merged
  • size/L: Denotes a PR that changes 100-499 lines.

Development

Successfully merging this pull request may close these issues.

InstanceSet config hash can stay stale when resize subresource is used first

3 participants