feat: delay ERI attachment until node is ready#29

Open
BSWANG wants to merge 1 commit into AliyunContainerService:main from BSWANG:feature/wait-node-ready

@BSWANG BSWANG commented Jun 25, 2026

Member

Summary

  • Wait for kubelet to report NodeReady before starting ERI selection and attachment, avoiding repeated API failures and potential kernel memory allocation issues when internal drivers are not yet initialized
  • Add a configurable timeout (waitNodeReadyTimeoutSeconds, default 300s) as a fallback: if a node stays NotReady longer than this, proceed with ERI selection and attachment anyway
  • Update node watch predicate to trigger reconciliation on NotReady→Ready transitions
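
A minimal sketch of the gating logic described above, for readers who want to see how the readiness check and the timeout fallback interact. It is not the code in this PR: `Node` and `NodeCondition` are simplified stand-ins for the `corev1` types, `shouldAttach` and `defaultWaitNodeReadyTimeout` are hypothetical names, and measuring the wait from the node's creation time is an assumption (the PR does not state where the timer starts).

```go
package main

import (
	"fmt"
	"time"
)

// NodeCondition is a simplified stand-in for corev1.NodeCondition.
type NodeCondition struct {
	Type   string // e.g. "Ready"
	Status string // "True", "False", or "Unknown"
}

// Node is a simplified stand-in for corev1.Node.
type Node struct {
	Name       string
	CreatedAt  time.Time // assumption: the wait is measured from node creation
	Conditions []NodeCondition
}

// Hypothetical constant mirroring the waitNodeReadyTimeoutSeconds default (300s).
const defaultWaitNodeReadyTimeout = 300 * time.Second

// isNodeReady reports whether the Ready condition is present and "True".
// A missing Ready condition is treated as not ready.
func isNodeReady(n Node) bool {
	for _, c := range n.Conditions {
		if c.Type == "Ready" {
			return c.Status == "True"
		}
	}
	return false
}

// shouldAttach decides whether ERI selection and attachment may start.
// It returns true when the node is Ready, or when the node has been
// NotReady for at least timeout (the fallback path). Otherwise it returns
// false together with the remaining time to wait before rechecking.
func shouldAttach(n Node, now time.Time, timeout time.Duration) (bool, time.Duration) {
	if isNodeReady(n) {
		return true, 0
	}
	waited := now.Sub(n.CreatedAt)
	if waited >= timeout {
		return true, 0 // fallback: proceed even though the node is NotReady
	}
	return false, timeout - waited
}

func main() {
	created := time.Date(2026, 6, 25, 8, 0, 0, 0, time.UTC)
	n := Node{
		Name:       "node-a",
		CreatedAt:  created,
		Conditions: []NodeCondition{{Type: "Ready", Status: "False"}},
	}

	// One minute after creation: still waiting.
	ok, retry := shouldAttach(n, created.Add(time.Minute), defaultWaitNodeReadyTimeout)
	fmt.Println("proceed:", ok, "recheck in:", retry)

	// Six minutes after creation: timeout exceeded, proceed anyway.
	ok, retry = shouldAttach(n, created.Add(6*time.Minute), defaultWaitNodeReadyTimeout)
	fmt.Println("proceed:", ok, "recheck in:", retry)
}
```

Returning the remaining wait alongside the decision lets a reconciler requeue the node once the timeout would expire, so the fallback does not depend on another node event arriving.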

Test plan

  • go build passes for all modified packages
  • go test ./internal/controller/... passes, including new TestIsNodeReady and TestPredictNodeUpdate cases
  • Deploy to test cluster, add a new node, verify controller logs show "waiting" messages before NodeReady and no ECS API calls until ready (or timeout)
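
The transition cases that a test such as TestPredictNodeUpdate would plausibly need to cover can be sketched as a small table. This is an illustration under assumptions, not the PR's test or predicate: `becameReady` and `withReady` are hypothetical helpers, the types are simplified stand-ins for `corev1.Node`, and only the readiness transition is modelled (the real predicate may also pass other kinds of node updates). In a controller-runtime controller, a check like this would typically sit in the `UpdateFunc` of a `predicate.Funcs`.

```go
package main

import "fmt"

// NodeCondition is a simplified stand-in for corev1.NodeCondition.
type NodeCondition struct {
	Type   string
	Status string
}

// Node is a simplified stand-in for corev1.Node.
type Node struct {
	Name       string
	Conditions []NodeCondition
}

// isNodeReady reports whether the Ready condition is present and "True".
func isNodeReady(n Node) bool {
	for _, c := range n.Conditions {
		if c.Type == "Ready" {
			return c.Status == "True"
		}
	}
	return false
}

// becameReady reports whether an update moved the node from NotReady
// (including Unknown or a missing Ready condition) to Ready.
func becameReady(oldNode, newNode Node) bool {
	return !isNodeReady(oldNode) && isNodeReady(newNode)
}

// withReady builds a node whose Ready condition has the given status.
func withReady(status string) Node {
	return Node{
		Name:       "node-a",
		Conditions: []NodeCondition{{Type: "Ready", Status: status}},
	}
}

func main() {
	cases := []struct {
		name    string
		oldNode Node
		newNode Node
	}{
		{"NotReady to Ready", withReady("False"), withReady("True")},
		{"Unknown to Ready", withReady("Unknown"), withReady("True")},
		{"no condition to Ready", Node{Name: "node-a"}, withReady("True")},
		{"Ready to Ready", withReady("True"), withReady("True")},
		{"Ready to NotReady", withReady("True"), withReady("False")},
	}
	for _, tc := range cases {
		fmt.Printf("%-22s enqueue=%v\n", tc.name, becameReady(tc.oldNode, tc.newNode))
	}
}
```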

BSWANG force-pushed the feature/wait-node-ready branch from 830d73e to 2cc4cbd on June 25, 2026 08:07
Wait for kubelet to report NodeReady before starting ERI selection
and attachment. This avoids repeated failures and potential kernel
memory allocation issues when internal drivers are not yet initialized.

If a node stays NotReady beyond a configurable timeout (default 5min),
proceed anyway as a fallback.
BSWANG force-pushed the feature/wait-node-ready branch from 2cc4cbd to bbf03fb on June 25, 2026 08:13