Add safe-to-evict annotations to DRA installers#2081

Open
Haihan-Jiang wants to merge 1 commit into GoogleCloudPlatform:main from Haihan-Jiang:codex/kes-dra-safe-to-evict

Conversation

@Haihan-Jiang

Fixes #1740.

The DRA guide applies the NVIDIA driver installer from the container-engine-accelerators repo. That DaemonSet runs in kube-system and can block scale-down unless the pod template is marked safe to evict.

This PR updates the guide to patch the driver installer DaemonSet after applying it, and adds the same safe-to-evict annotation to the local NVIDIA container toolkit installer DaemonSet.
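For readers who want to see what such a patch might look like, here is a minimal sketch. It is not the exact payload from this PR's README change. It assumes the standard Cluster Autoscaler annotation key `cluster-autoscaler.kubernetes.io/safe-to-evict`, takes the DaemonSet name `nvidia-driver-installer` from the linked issue title, and uses a hypothetical helper named `mark_safe_to_evict`.

```shell
#!/bin/sh
# Merge patch that sets the annotation on the pod template, so the pods
# created by the DaemonSet carry it (assumed key: the standard Cluster
# Autoscaler safe-to-evict annotation).
PATCH='{"spec":{"template":{"metadata":{"annotations":{"cluster-autoscaler.kubernetes.io/safe-to-evict":"true"}}}}}'

# Hypothetical helper: patch one DaemonSet in the given namespace.
mark_safe_to_evict() {
  namespace="$1"
  daemonset="$2"
  kubectl -n "$namespace" patch daemonset "$daemonset" --type merge -p "$PATCH"
}
```

Under these assumptions, the driver installer would be patched with `mark_safe_to_evict kube-system nvidia-driver-installer` once the upstream manifest has been applied. Patching the pod template triggers a rollout of the DaemonSet pods, which is why the annotation goes under `spec.template.metadata` rather than on the DaemonSet's own metadata.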

Validation:

  • ruby -e YAML.load_file(...) check for nvidia-container-toolkit-installer.yaml
  • jq validation for the README patch payload
  • git diff --check

I could not run kubectl client-side dry-run locally because kubectl is not installed in this environment.
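A reviewer with `jq` and `kubectl` available could repeat the payload check and run the missing dry run along these lines. This is only a sketch: both helper names are hypothetical, the payload check assumes the standard Cluster Autoscaler annotation key, and the manifest path is left as an argument because the file's location in the repo is not shown here.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the payload is valid JSON and sets
# the safe-to-evict annotation on the pod template to the string "true".
check_patch_payload() {
  printf '%s' "$1" | jq -e '.spec.template.metadata.annotations["cluster-autoscaler.kubernetes.io/safe-to-evict"] == "true"' >/dev/null
}

# Hypothetical helper: client-side dry run of a manifest file.
# Requires kubectl; nothing is persisted to the cluster.
dry_run_manifest() {
  kubectl apply --dry-run=client -f "$1"
}
```

For example, `dry_run_manifest path/to/nvidia-container-toolkit-installer.yaml` would exercise the check that could not be run in the author's environment, with the path adjusted to wherever the manifest lives.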

@Haihan-Jiang Haihan-Jiang requested review from a team and yoshi-approver as code owners May 30, 2026 06:03
@Haihan-Jiang Haihan-Jiang force-pushed the codex/kes-dra-safe-to-evict branch from 646b1f6 to 368ffb8 Compare June 14, 2026 10:44

Development

Successfully merging this pull request may close these issues.

DRA instructions missing safe-to-evict annotation on nvidia-driver-installer, preventing down scaling
