[nvidia_stable-11.0] NVIDIA: SAUCE: WAR: hw/vfio: Retry dma_map to bypass PFNMAP #20

Open
NathanChenNVIDIA wants to merge 1 commit into NVIDIA:nvidia_stable-11.0 from NathanChenNVIDIA:bypass_pfnmap_11.0-stable

Conversation

@NathanChenNVIDIA
Collaborator

Summary

EGM uses file-backed guest RAM (/dev/egm*), which selects the vfio_container_dma_map() path in hw/vfio/container.c. When this mapping fails on a kernel that lacks dmabuf support, return success so that setup proceeds with the PFNMAP-bypass WAR (NVIDIA: SAUCE: WAR: iommufd/pages: Bypass PFNMAP), which stands in for dmabuf support. This patch is not required when dmabuf support is present.

Testing

Launch a 4-GPU EGM VM on GB200 with linux-nvidia 6.17.0-1018-nvidia-64k, then verify nvidia-smi and the CUDA samples.

The dma_map_file pathway doesn't fit into the kernel WAR that bypasses
PFNMAP, resulting in dma_map failures.

Only the dma_map pathway could work. So retry with that upon a failure.

Keep this WAR until the kernel WAR for PFNMAP is lifted.

Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
(cherry picked from commit faff92d https://github.com/nvmochs/QEMU/tree/stable101_smmuv3-accel-07212025_egm)
[nathanc: Moved changes from hw/vfio/container-base.c to hw/vfio/container.c for 11.0 base]
Signed-off-by: Nathan Chen <nathanc@nvidia.com>
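
For readers who want the shape of the retry described in the commit message, here is a minimal sketch of a "try the file-based map, fall back to the address-based map" helper. It is not the actual change in hw/vfio/container.c and uses no QEMU APIs: `struct map_req`, `map_fn`, `dma_map_with_fallback()` and the `stub_*` callbacks are all hypothetical names invented for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mapping request; these are not QEMU's real structures. */
struct map_req {
    uint64_t iova;   /* I/O virtual address to map at */
    uint64_t size;   /* length of the mapping in bytes */
    void *vaddr;     /* host virtual address of the RAM block */
    int fd;          /* backing file descriptor, or -1 if not file backed */
    bool readonly;
};

/* Hypothetical backend callback: returns 0 on success, negative errno on failure. */
typedef int (*map_fn)(const struct map_req *req);

/*
 * Try the file-based path first when the region is file backed. If that
 * fails, retry with the address-based path and report its result instead.
 */
static int dma_map_with_fallback(const struct map_req *req,
                                 map_fn map_file, map_fn map_vaddr)
{
    assert(req != NULL && map_vaddr != NULL);

    if (req->fd >= 0 && map_file != NULL) {
        int ret = map_file(req);
        if (ret == 0) {
            return 0;
        }
        /* WAR: ignore the file-path error and fall through to the retry. */
    }
    return map_vaddr(req);
}

/* Stub backends (assumptions) that simulate success and failure. */
static int stub_map_file_ok(const struct map_req *req)
{
    (void)req;
    return 0;
}

static int stub_map_file_fails(const struct map_req *req)
{
    (void)req;
    return -EFAULT;
}

static int stub_map_vaddr_ok(const struct map_req *req)
{
    (void)req;
    return 0;
}

static int stub_map_vaddr_fails(const struct map_req *req)
{
    (void)req;
    return -ENOMEM;
}
```

In this sketch the caller sees an error only if the fallback path also fails, so a file-path failure on its own is never fatal. Whether the real patch logs or discards the first error is not shown in this PR description.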
@NathanChenNVIDIA NathanChenNVIDIA requested a review from nvmochs June 6, 2026 00:54