TODO (after the time-slicing feature is added)
kubectl get gpu should show the total number of GPUs, how many are in use, and across which namespaces.
Before
NAME   READY   REASON   DRIVER VERSION   NODES READY   AGE
gpu    True    Ready    590              1             5m
After (with time-slicing, shipped together)
NAME   READY   REASON   DRIVER VERSION   NODES READY   TOTAL GPUs   ALLOCATED   AGE
gpu    True    Ready    590              3             12           8           5m
TOTAL GPUs
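Read from the nvidia.com/gpu.count node label, which NVIDIA's GPU Feature Discovery sets automatically once the driver is installed, and summed across all GPU nodes. We don't set this value; we only read it. No user action is needed. When time-slicing is active, NVIDIA advertises virtual (shared) GPUs instead of physical ones. Note that, per NVIDIA's time-slicing documentation, the gpu.count label keeps the physical count: the replica factor is published separately as nvidia.com/gpu.replicas, and the multiplied total appears in the node's nvidia.com/gpu capacity. For this column to reflect the sharing configuration, it would need to read node capacity (or multiply count by replicas) rather than gpu.count alone.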
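The label summation described above can be sketched roughly as follows. This is only an illustration, not the controller's actual code: it assumes the nodes are available as plain dictionaries shaped like the items of `kubectl get nodes -o json`, and the function name `total_gpus` and the sample node names are hypothetical. It reads the nvidia.com/gpu.count label only and ignores any replica factor.

```python
GPU_COUNT_LABEL = "nvidia.com/gpu.count"


def total_gpus(nodes):
    """Sum the nvidia.com/gpu.count label over every node that carries it.

    `nodes` is a list of dicts shaped like Kubernetes Node objects
    (hypothetical input; a real controller would use an API client).
    """
    total = 0
    for node in nodes:
        labels = node.get("metadata", {}).get("labels") or {}
        value = labels.get(GPU_COUNT_LABEL)
        if value is None:
            continue  # no label: not a GPU node, contributes nothing
        total += int(value)  # label values are always strings
    return total


# Hypothetical cluster: three GPU nodes with 4 GPUs each, one CPU-only node.
example_nodes = [
    {"metadata": {"name": "gpu-a", "labels": {GPU_COUNT_LABEL: "4"}}},
    {"metadata": {"name": "gpu-b", "labels": {GPU_COUNT_LABEL: "4"}}},
    {"metadata": {"name": "gpu-c", "labels": {GPU_COUNT_LABEL: "4"}}},
    {"metadata": {"name": "cpu-a", "labels": {}}},
]

print(total_gpus(example_nodes))  # prints 12
```

With the sample data, the result matches the TOTAL GPUs value of 12 across 3 nodes in the example output.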
ALLOCATED
Computed by listing all running pods across all namespaces and summing their nvidia.com/gpu resource requests. A pod requesting nvidia.com/gpu: 2 contributes 2 to the count. We don't set this value either; we derive it from cluster state on every reconcile. It reflects current demand, not capacity.
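As a rough sketch of the computation described above, the following sums nvidia.com/gpu requests over running pods and also keeps a per-namespace breakdown, since the stated goal mentions namespaces. It assumes pods are plain dictionaries shaped like the items of `kubectl get pods -A -o json`. The function name `allocated_gpus` and the sample pods are hypothetical, and only regular containers are counted (init containers are ignored for simplicity).

```python
from collections import defaultdict

GPU_RESOURCE = "nvidia.com/gpu"


def allocated_gpus(pods):
    """Return (total, per_namespace) GPU requests for pods in phase Running.

    `pods` is a list of dicts shaped like Kubernetes Pod objects
    (hypothetical input; a real controller would use an API client).
    """
    per_namespace = defaultdict(int)
    for pod in pods:
        if pod.get("status", {}).get("phase") != "Running":
            continue  # pending, succeeded and failed pods hold no GPUs here
        namespace = pod.get("metadata", {}).get("namespace", "default")
        for container in pod.get("spec", {}).get("containers", []):
            requests = (container.get("resources") or {}).get("requests") or {}
            per_namespace[namespace] += int(requests.get(GPU_RESOURCE, 0))
    # Drop namespaces whose running pods request no GPUs at all.
    per_namespace = {ns: n for ns, n in per_namespace.items() if n > 0}
    return sum(per_namespace.values()), per_namespace


def _pod(namespace, phase, gpus=None):
    """Build a minimal single-container pod dict for the example below."""
    requests = {} if gpus is None else {GPU_RESOURCE: str(gpus)}
    return {
        "metadata": {"namespace": namespace},
        "status": {"phase": phase},
        "spec": {"containers": [{"resources": {"requests": requests}}]},
    }


example_pods = [
    _pod("training", "Running", 2),
    _pod("training", "Running", 4),
    _pod("inference", "Running", 2),
    _pod("inference", "Pending", 1),  # not running: ignored
    _pod("web", "Running"),           # no GPU request: ignored
]

total, by_namespace = allocated_gpus(example_pods)
print(total, by_namespace)  # prints 8 {'training': 6, 'inference': 2}
```

The per-namespace dictionary is one possible way to surface "across which namespaces"; how that breakdown would be displayed is left open here.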