feat: Vulkan vGPU support — auto-mount HAMi Vulkan implicit layer manifest into containers (#118)

100milliongold wants to merge 7 commits into Project-HAMi:main
Conversation
Signed-off-by: Jea-Eok-Kim <je.kim@xiilab.com>
The device-plugin advertises `vgpu-memory` at one device per MiB, which on nodes with 40+ GiB of GPU memory (e.g. RTX 6000 Ada × 2 == ~92 K devices) exceeds the kubelet's 4 MiB gRPC `ListAndWatch` receive limit. The kubelet then drops the advertisement, and the node's Allocatable for `volcano.sh/vgpu-memory` is reported as 0 while `vgpu-number` / `vgpu-cores` are correct. The Volcano scheduler's capacity plugin treats this as "queue resource quota insufficient".

The fix already exists in the codebase (`gpuMemoryFactor` in the device-config ConfigMap, default 1), but operators hitting the 0 issue have nothing in the README or docs that points at it. This commit adds:

- README Troubleshooting section: symptoms, root cause, fix steps, unit-change warning, and a note that `Allocate` emits `CUDA_DEVICE_MEMORY_LIMIT_<i>` in MiB regardless, so hard memory enforcement (CUDA + Vulkan via HAMi-core) is unaffected.
- ConfigMap inline comment in both plugin yamls describing when to raise `gpuMemoryFactor`.
- `doc/vulkan-vgpu.md`: Vulkan side note explaining the unit mapping (`vgpu-memory: 4` == 4 GiB at `gpuMemoryFactor=1024`).
- `examples/vulkan-pod.yaml`: clarify that the limit unit depends on `gpuMemoryFactor`.

No code change. Default behavior (`gpuMemoryFactor=1`) is preserved.

Signed-off-by: Jea-Eok-Kim <je.kim@xiilab.com>
### Follow-up: bump libvgpu submodule again to pick up Step D cherry-picks

After Step D verification on ws-node074, two more libvgpu commits were cherry-picked into HAMi-core. Without these,

**What this PR does**

**Verification**

ws-node074 `isaac-launchable-0`: 4-path PASS confirmed against an out-of-band-installed

Submodule SHA:
## Summary
Adds Vulkan vGPU support to `volcano-vgpu-device-plugin`. When a pod opts into Vulkan partitioning, the device-plugin auto-mounts the HAMi Vulkan implicit layer manifest into the container, so the Vulkan loader picks up the layer that enforces per-pod memory limits on `vkAllocateMemory`.

This is the Volcano-side counterpart to Project-HAMi/HAMi#1803 and Project-HAMi/HAMi-core#182. The actual memory enforcement happens in the HAMi-core Vulkan layer; this PR just makes sure the manifest reaches the container.
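A pod opts in via the `hami.io/vulkan: "true"` annotation described below. A minimal sketch of such a pod — image, values, and the memory unit are illustrative (the unit depends on `gpuMemoryFactor`); `examples/vulkan-pod.yaml` in this PR is the authoritative version:

```yaml
# Illustrative Vulkan-partitioned pod; see examples/vulkan-pod.yaml in this PR.
apiVersion: v1
kind: Pod
metadata:
  name: vulkan-demo
  annotations:
    hami.io/vulkan: "true"        # opt-in: webhook injects HAMI_VULKAN_ENABLE=1
spec:
  containers:
  - name: app
    image: my-vulkan-app:latest   # placeholder image
    resources:
      limits:
        volcano.sh/vgpu-number: "1"
        volcano.sh/vgpu-memory: "4"   # unit depends on gpuMemoryFactor
```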
## Why
Vulkan workloads (Isaac Sim, ray tracing, GPU-accelerated rendering) currently bypass HAMi's per-container memory limit because allocations go through `vkAllocateMemory` / the NVIDIA Vulkan ICD, not the CUDA driver path. We hit this in production with Isaac Sim — Kit allocated several GB through Vulkan, ignored the requested partition, and OOM'd the host.

The fix needs three coordinated layers: HAMi-core's Vulkan layer (PR linked above), the admission webhook env injection (HAMi PR linked above), and this PR, which makes the device-plugin actually deliver the implicit-layer manifest into the container.
## What changed
- `libvgpu` submodule → `Project-HAMi/HAMi-core@vulkan-layer` (companion PR #182). The submodule now ships `etc/vulkan/implicit_layer.d/hami.json` next to `libvgpu.so`.
- `docker/Dockerfile.ubuntu20.04`: install `libvulkan-dev` in the `nvidia_builder` stage so the HAMi-core Vulkan layer can compile against the Vulkan headers; `COPY` the `hami.json` manifest into the runtime image at `/k8s-vgpu/lib/nvidia/vulkan/implicit_layer.d/hami.json`.
- `volcano-vgpu-device-plugin{,-cdi}.yml`: `cp -f` → `cp -rf` so the new `vulkan/implicit_layer.d/` subdirectory is copied to the host (not just the top-level `.so` files). Backwards compatible — `cp -rf` handles plain files identically.
- `pkg/plugin/vulkan.go`: new `buildVulkanManifestMount(hostHookPath)` helper. Returns a single `Mount{}` for `/etc/vulkan/implicit_layer.d/hami.json` when the host file exists; returns nil otherwise. Idempotent and side-effect-free on nodes without the manifest.
- `pkg/plugin/server.go`: appends the resulting mount to the `Allocate` response.
- `pkg/plugin/server_vulkan_test.go`: two cases — `Present` (manifest exists → mount appended) and `Absent` (manifest missing → no-op).
- `examples/vulkan-pod.yaml` and `doc/vulkan-vgpu.md`: example pod manifest and documentation.

## How it works
- The HAMi-core Vulkan layer intercepts `vkAllocateMemory` and enforces the per-pod budget. The implicit-layer manifest is gated by `enable_environment: HAMI_VULKAN_ENABLE=1`.
- The admission webhook (HAMi PR above) injects `HAMI_VULKAN_ENABLE=1` and merges `graphics` into `NVIDIA_DRIVER_CAPABILITIES` for pods that carry the `hami.io/vulkan: "true"` annotation.

Pods without the annotation get neither the env nor the layer activation — the manifest's `enable_environment` guard means the layer doesn't load even if the file is present.

## Compatibility / Breaking changes
- The Vulkan path is gated by `enable_environment` and the per-pod budget — neither activates without the pod opting in.
- The `cp -f` → `cp -rf` change in the postStart hook is backwards compatible; nodes that already have the previous lib directory get the new `vulkan/` subdirectory added on the next pod restart.
- On nodes without the manifest, `buildVulkanManifestMount` returns nil, so the `Allocate` response is unchanged. No pod-startup blocker.

## Test plan
- `go test ./pkg/plugin/...` — Vulkan unit tests (`Present` / `Absent`) pass.
- `make build` succeeds with the updated submodule.
- `make image` produces a runtime image containing `/k8s-vgpu/lib/nvidia/vulkan/implicit_layer.d/hami.json`.
- A pod annotated `hami.io/vulkan: "true"` has `/etc/vulkan/implicit_layer.d/hami.json` mounted; a pod without the annotation does not (or has it mounted but the layer doesn't load — same outcome).
- Run with `nvidia.com/gpumemlimit` + the annotation; the Kit boot log reports the configured partition size and the workload is held to it.

## Notes for reviewers
- The `libvgpu` submodule points at `Project-HAMi/HAMi-core@vulkan-layer` (PR #182). This PR cannot be merged until that one lands; happy to coordinate.
- The `pkg/plugin/vulkan.go` helper is intentionally tiny so MIG / non-MIG / CDI paths can call it identically. (The HAMi PR #1803 follows the same pattern with its own helper.)
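For reviewers unfamiliar with implicit layers: a Vulkan implicit-layer manifest gated by `enable_environment` looks roughly like this. The layer name, library path, versions, and the `disable_environment` variable are illustrative — the authoritative `hami.json` ships in the HAMi-core submodule:

```json
{
  "file_format_version": "1.0.0",
  "layer": {
    "name": "VK_LAYER_HAMI_vgpu",
    "type": "GLOBAL",
    "library_path": "./libvgpu.so",
    "api_version": "1.3.0",
    "implementation_version": "1",
    "description": "Per-pod vkAllocateMemory budget enforcement (illustrative)",
    "enable_environment": { "HAMI_VULKAN_ENABLE": "1" },
    "disable_environment": { "HAMI_VULKAN_DISABLE": "1" }
  }
}
```

Because `enable_environment` is set, the Vulkan loader skips this implicit layer unless `HAMI_VULKAN_ENABLE=1` is present in the process environment — which is exactly the guard the webhook opt-in relies on.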