fix: prevent duplicate CNClaim binding to same CN Pod during migration#586
Closed
xzxiong wants to merge 4 commits intomatrixorigin:mainfrom
Closed
fix: prevent duplicate CNClaim binding to same CN Pod during migration#586xzxiong wants to merge 4 commits intomatrixorigin:mainfrom
xzxiong wants to merge 4 commits intomatrixorigin:mainfrom
Conversation
Add podClaimedByOthers() check to selectCN() and Finalize() to verify no other CNClaim references a Pod via spec.podName before binding or reclaiming it. This prevents the race condition where migration causes Pod labels to become stale while spec.podName still holds the true binding relationship. Fixes matrixorigin#581, matrixorigin#582, matrixorigin#585
…laims - Extract podClaimedByOthers loop into buildPodClaimIndex() that builds a map[podName]claimName once per reconcile, reducing N API calls to 1 - Skip CNClaims with DeletionTimestamp != nil to prevent Pod leak when both claims are being deleted simultaneously - Add holder claim name to log messages for easier debugging - Add test for deleting claim filtering and buildPodClaimIndex
Contributor
Author
Review 修复 (7217df0)已根据 code review 反馈修复以下问题: P1 性能优化 ✅
P2 正确性 ✅
P3 可观测性 ✅
新增测试 ✅
|
d65f56d to
f13d295
Compare
skywalking-eyes defaults to current year when validating copyright headers, causing all .go files to fail when the year rolls over. Set copyright-year to '2024-2030' to accept existing and future headers.
Contributor
Author
|
Closing: recreating with correct head (same-repo mode, not fork mode) |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
变更说明
修复 CNClaim controller 在 CN Pod 迁移流程中的竞态条件:多个 CNClaim 可以同时绑定到同一个 CN Pod,删除其中一个 claim 后 Pod 被错误回收,破坏另一个 claim 的绑定关系。核心思路是以
CNClaim.spec.podName为 Pod 归属的 source of truth,在selectCN()和Finalize()中增加 CNClaim 列表校验。podClaimedByOthers()helper:列出 namespace 下所有 CNClaim,检查是否有其他 claim 的spec.podName指向目标 PodselectCN()绑定前校验:在ensureOwnership()前调用podClaimedByOthers()跳过已被占用的 PodFinalize()回收前校验:回收 Pod 前检查是否有其他 CNClaim 仍引用该 Pod,有则跳过核心变更
1. CNClaim Controller (
pkg/controllers/cnclaim/controller.go)selectCN(): 在遍历 idle Pod 列表时,调用podClaimedByOthers()过滤已被其他 claim 绑定的 PodFinalize(): 回收 Pod 前检查podClaimedByOthers(),如果有其他 claim 引用则跳过回收podClaimedByOthers()函数:列出 namespace 下所有 CNClaim,检查spec.podName是否匹配调用链
新增文件
变更类型
影响范围
测试验证
UT 覆盖说明
podClaimedByOthers— 5 个 case(controller_test.go)回归验证(Bug Fix PR 专项)
Test_podClaimedByOthers/pod_claimed_by_another_claim精确复现了 Issue #581 的竞态场景:两个 CNClaim 的spec.podName指向同一 Pod。修复前selectCN()不检查此条件会绑定成功;修复后返回true并跳过。相关链接