24 changes: 23 additions & 1 deletion pkg/resourcemonitor/noderesourcesaggregator.go
@@ -239,8 +239,30 @@ func getContainerDevicesFromAllocatableResources(availRes *podresourcesapi.Alloc
// updateAvailable computes the actually available resources.
// This function assumes the available resources are initialized to be equal to the allocatable.
func (noderesourceData *nodeResources) updateAvailable(numaData map[int]map[corev1.ResourceName]*resourceData, ri ResourceInfo) {
	resName := string(ri.Name)
	resMap, ok := noderesourceData.resourceID2NUMAID[resName]
	if !ok {
		resMap = make(map[string]int)
		for _, numaNodeID := range ri.NumaNodeIds {
			if _, ok := numaData[numaNodeID]; !ok {
				klog.InfoS("failed to find NUMA node ID under the node topology", "numaID", numaNodeID)
				continue
			}

			if _, ok := numaData[numaNodeID][ri.Name]; !ok {
				klog.InfoS("failed to find resource under the node topology", "resourceName", ri.Name)
				continue
			}
Comment on lines +252 to +255

Copilot AI Jan 15, 2026

This check verifies if the resource exists in numaData before adding it to the resourceID2NUMAID map, but the resource may legitimately not exist yet in numaData if this is a newly registered device plugin. The check at line 252 will cause the function to skip adding new resources to the map, which defeats the purpose of this PR. The perNuma/numaData structure is built from perNUMAAllocatable which is initialized only at startup from GetAllocatableResources, so new resources won't be present there until the aggregator is recreated. This condition should be reconsidered or removed.
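The concern above can be reproduced in isolation. The following is a minimal sketch, not the project's code: `buildResMap` and its simplified `numaData` type are hypothetical stand-ins, and it assumes, as the comment states, that `numaData` only holds resources known at startup.

```go
package main

import "fmt"

// buildResMap is a hypothetical reduction of the guarded loop in the diff.
// numaData maps NUMA node ID -> resource name -> placeholder value.
func buildResMap(numaData map[int]map[string]struct{}, resName string, numaIDs []int, ids []string) map[string]int {
	resMap := make(map[string]int)
	for _, numaID := range numaIDs {
		// Same shape as the check under review: skip the node when the
		// resource is not already present in numaData.
		if _, ok := numaData[numaID][resName]; !ok {
			continue
		}
		for _, id := range ids {
			resMap[id] = numaID
		}
	}
	return resMap
}

func main() {
	// Assumption: only "cpu" was known when numaData was built.
	numaData := map[int]map[string]struct{}{0: {"cpu": {}}}

	// A resource registered later never passes the guard, so no IDs are recorded.
	fmt.Println(len(buildResMap(numaData, "example.com/dev", []int{0}, []string{"dev0"}))) // 0
}
```

If that assumption holds, the resulting map for a newly registered resource stays empty, which is the behavior the comment flags.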

Copilot uses AI. Check for mistakes.

			for _, resID := range ri.Data {

Copilot AI Jan 15, 2026

The nested loop structure incorrectly maps all resource IDs to each NUMA node in sequence, causing the last NUMA node to overwrite previous mappings. If ri.NumaNodeIds contains multiple nodes [0, 1] and ri.Data contains ['dev0', 'dev1'], both devices will incorrectly be mapped only to the last NUMA node (1). The ResourceInfo structure doesn't provide device-to-node pairing, so this approach cannot correctly establish the mapping. Consider whether the ResourceInfo needs to be enhanced to include explicit device-to-NUMA-node pairing, or if the mapping logic needs to rely on the original topology information from the Pod Resources API.

Suggested change
for _, resID := range ri.Data {
for _, resID := range ri.Data {
// Only set the mapping if this resource ID has not been mapped yet.
if _, exists := resMap[resID]; exists {
continue
}
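To make the overwrite concrete, here is a small sketch comparing the loop from the diff with the suggested change. `resourceInfo` is a hypothetical, simplified stand-in for `ResourceInfo`, and the function names are illustrative only.

```go
package main

import "fmt"

// resourceInfo is a hypothetical stand-in for ResourceInfo; only the two
// fields used by the nested loop are modeled.
type resourceInfo struct {
	NumaNodeIds []int
	Data        []string
}

// lastWins mirrors the nested loop in the diff: every resource ID is
// reassigned on each outer iteration, so the last NUMA node ID sticks.
func lastWins(ri resourceInfo) map[string]int {
	resMap := make(map[string]int)
	for _, numaNodeID := range ri.NumaNodeIds {
		for _, resID := range ri.Data {
			resMap[resID] = numaNodeID
		}
	}
	return resMap
}

// firstWins mirrors the suggested change: IDs that are already mapped are
// skipped, so the first NUMA node ID sticks instead.
func firstWins(ri resourceInfo) map[string]int {
	resMap := make(map[string]int)
	for _, numaNodeID := range ri.NumaNodeIds {
		for _, resID := range ri.Data {
			if _, exists := resMap[resID]; exists {
				continue
			}
			resMap[resID] = numaNodeID
		}
	}
	return resMap
}

func main() {
	ri := resourceInfo{NumaNodeIds: []int{0, 1}, Data: []string{"dev0", "dev1"}}
	fmt.Println(lastWins(ri))  // map[dev0:1 dev1:1]
	fmt.Println(firstWins(ri)) // map[dev0:0 dev1:0]
}
```

Note that the suggested guard only changes which node wins: both devices end up on node 0 rather than node 1. Neither variant recovers a per-device pairing, which is consistent with the comment's point that the input does not carry one.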

				resMap[resID] = numaNodeID
			}
		}

		noderesourceData.resourceID2NUMAID[resName] = resMap
	}

	for _, resID := range ri.Data {
		resName := string(ri.Name)
		resMap, ok := noderesourceData.resourceID2NUMAID[resName]
		if !ok {
Comment on lines 266 to 267

Copilot AI Jan 15, 2026

This lookup is redundant since resMap was already retrieved (and potentially populated) at lines 243-263. The variable resMap is still in scope and can be reused directly, eliminating the need for this duplicate lookup.
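One way to avoid the second lookup is a get-or-create helper that is called once before the loop. This is only a sketch: `getOrCreate` is a hypothetical name, and the cache type is assumed to match the `map[string]int` values seen in the diff.

```go
package main

import "fmt"

// getOrCreate returns the per-resource ID map, creating and caching an
// empty one on first use so callers never need a second lookup.
func getOrCreate(cache map[string]map[string]int, resName string) map[string]int {
	resMap, ok := cache[resName]
	if !ok {
		resMap = make(map[string]int)
		cache[resName] = resMap
	}
	return resMap
}

func main() {
	cache := make(map[string]map[string]int)

	// Fetch once, then reuse the same map inside the loop.
	resMap := getOrCreate(cache, "example.com/dev")
	for i, resID := range []string{"dev0", "dev1"} {
		resMap[resID] = i
	}
	fmt.Println(cache["example.com/dev"]["dev1"]) // 1
}
```

Because Go maps are reference types, writes through `resMap` are visible via the cache without storing the map back.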

			klog.InfoS("unknown resource", "resourceName", ri.Name)