Skip to content

NVIDIA: SAUCE: Patch NVMe/NVMeoF driver to support GDS on Linux 7.0 Kernel#373

Closed
sourabgupta3 wants to merge 1 commit into
NVIDIA:26.04_linux-nvidiafrom
sourabgupta3:sg_26.04_linux-nvidia-nvme-fix
Closed

NVIDIA: SAUCE: Patch NVMe/NVMeoF driver to support GDS on Linux 7.0 Kernel#373
sourabgupta3 wants to merge 1 commit into
NVIDIA:26.04_linux-nvidiafrom
sourabgupta3:sg_26.04_linux-nvidia-nvme-fix

Conversation

@sourabgupta3

Copy link
Copy Markdown

With this change, the NVMe and NVMeoF driver would be enabled to
support GPUDirectStorage(GDS). NVMe driver introduced a way to
use the blk_rq_dma_map API to DMA map requests instead of
scatter gather lists. With these changes, GDS path also adopts
a similar framework where we introduce blk based
APIs(nvfs_blk_rq_dma_map_iter_start and nvfs_blk_rq_dma_map_iter_next)
to map a DMA request. The NVMeoF path remains the same as previous releases.

Signed-off-by: Sourab Gupta sougupta@nvidia.com
Reviewed-by: Kiran Modukuri kmodukuri@nvidia.com

@nirmoy

nirmoy commented Apr 16, 2026

Copy link
Copy Markdown
Collaborator

@sourabgupta3 I would squash these two patches together as these are not upstreamed patches so keeps patch list small.

@sourabgupta3

Copy link
Copy Markdown
Author

@nirmoy For context, on 6.17 these two went as separate commits as we discovered a bug later, that's why I kept it this way to be consistent with 6.17. Do you still want to squash them? I am okay with either option.

@jamieNguyenNVIDIA

Copy link
Copy Markdown
Collaborator

@nirmoy For context, on 6.17 these two went as separate commits as we discovered a bug later, that's why I kept it this way to be consistent with 6.17. Do you still want to squash them? I am okay with either option.

I think you should still squash them. It avoids leaving in a known bugged commit and would mean that you'd only need to pick a single patch into the next HWE kernel.

@nvmochs

nvmochs commented Apr 16, 2026

Copy link
Copy Markdown
Collaborator

@nirmoy For context, on 6.17 these two went as separate commits as we discovered a bug later, that's why I kept it this way to be consistent with 6.17. Do you still want to squash them? I am okay with either option.

6.17-HWE will be going away in a few months; this LTS will be supported for 10+ years. We should squash.

@sourabgupta3 sourabgupta3 force-pushed the sg_26.04_linux-nvidia-nvme-fix branch from 0afef66 to fd03681 Compare April 16, 2026 18:25
@sourabgupta3

Copy link
Copy Markdown
Author

@jamieNguyenNVIDIA @nirmoy Done!

@nvmochs

nvmochs commented Apr 16, 2026

Copy link
Copy Markdown
Collaborator

I confirmed that the squashed patches contain the same content as the version in 6.17-HWE with these differences:

  1. IOD_NVFS_IO = 1U << 31 to avoid enum conflicts
  2. u8 → u32 flags to accommodate the new bit width
  3. Local iod declaration in nvme_pci_prp_iter_next due to changed function scope
  4. blk_rq_dma_map_iter_next API dropped dma_state parameter
  5. nvme_pci_setup_data_prp restructured around nvme_pci_prp_save_mapping

No issues or concerns from me.

Acked-by: Matthew R. Ochs <mochs@nvidia.com>


@sourabgupta3 - Please also submit this change against the 26.04_linux-nvidia-bos branch (which will be the first "HWE" in 26.04). Thanks!

Comment thread drivers/nvme/host/pci.c
struct nvme_request req;
struct nvme_command cmd;
u8 flags;
u32 flags;

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At linux-nvidia-6.17 this u8 flags? is this a new fix?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this is new as there wasn't any space left to accomodate NVFS flags. But I think I should add this also under #ifdef CONFIG_NVFS.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's needed because Sourab needed to add IOD_NVFS_IO to nvme_iod_flags and it was maxed out on bits that fit within a single byte. So he expanded the field to a large width (an integer) and used a very high bit number (the last bit of an integer) to avoid future conflicts when rebasing. This is the only place that uses these flags and I confirmed that there are no local variables that make a copy of the field (which would have potentially also required a fixup if they too used a single byte for storage).

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for the replies.

@nvmochs

nvmochs commented Apr 16, 2026

Copy link
Copy Markdown
Collaborator

I confirmed that the squashed patches contain the same content as the version in 6.17-HWE with these differences:

  1. IOD_NVFS_IO = 1U << 31 to avoid enum conflicts
  2. u8 → u32 flags to accommodate the new bit width
  3. Local iod declaration in nvme_pci_prp_iter_next due to changed function scope
  4. blk_rq_dma_map_iter_next API dropped dma_state parameter
  5. nvme_pci_setup_data_prp restructured around nvme_pci_prp_save_mapping

No issues or concerns from me.

Acked-by: Matthew R. Ochs <mochs@nvidia.com>

@sourabgupta3 - Please also submit this change against the 26.04_linux-nvidia-bos branch (which will be the first "HWE" in 26.04). Thanks!

@sourabgupta3 Actually, there was 1 nit I had that I forgot about...

There are some instances of trailing whitespace added by this PR. Can these be addressed?

@sourabgupta3 sourabgupta3 force-pushed the sg_26.04_linux-nvidia-nvme-fix branch 2 times, most recently from 21af88d to 93bd3d6 Compare April 16, 2026 23:49
@sourabgupta3

Copy link
Copy Markdown
Author

@nvmochs I cleaned up some now. Does this look okay?

@nvmochs

nvmochs commented Apr 16, 2026

Copy link
Copy Markdown
Collaborator

@nvmochs I cleaned up some now. Does this look okay?

It looks great, no more trailing whitespace.

One more tiny nit, in the commit title, can you change "Linux 6.17 Kernel" -> "Linux 7.0 Kernel"?

@sourabgupta3 sourabgupta3 force-pushed the sg_26.04_linux-nvidia-nvme-fix branch from 93bd3d6 to 9582112 Compare April 16, 2026 23:56
@sourabgupta3

Copy link
Copy Markdown
Author

Done @nvmochs

@sourabgupta3

Copy link
Copy Markdown
Author

@nvmochs PR against 26.04_linux-nvidia-bos: #376

@clsotog clsotog self-requested a review April 17, 2026 05:13

@clsotog clsotog left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Acked-by: Carol L Soto <csoto@nvidia.com>

@nirmoy

nirmoy commented Apr 17, 2026

Copy link
Copy Markdown
Collaborator

Acked-by: Nirmoy Das <nirmoyd@nvidia.com>

@jamieNguyenNVIDIA jamieNguyenNVIDIA left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

General comment.

nvfs-dma.c, nvfs-rdma.c, and nvfs.h all appear to have space-indented code:

  • nvfs-dma.c lines 31–47: REGISTER_FUNC and UNREGISTER_FUNC bodies use spaces
  • nvfs-rdma.c lines 32–47: same functions as above
  • nvfs.h lines 55–68: nvfs_count_ops() and nvfs_get_ops() bodies use spaces
  • nvfs-dma.h lines 38–47: nvme_nvfs_unmap_sgls inner block uses spaces

The nvfs-dma.h line 174 (else with space indent) is also in nvme_nvfs_map_data.

Can you change these all to tabs?

Comment thread drivers/nvme/host/pci.c
if (!nvfs_ops->nvfs_blk_rq_dma_map_iter_next(req, dma_dev,
&iod->dma_state, iter))
return false;
iod->dma_vecs[iod->nr_dma_vecs].addr = iter->addr;

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't tell if this matters or not, but if mempool_alloc() fails here, we'll return and not set iod->dma_vecs[0].addr = iter->addr;. Later, when nvme_nvfs_unmap_prps() is called, it will exit early at the if (!iod->dma_vecs) check without looping through iod->nr_dma_vecs (and calling nvfs_dma_unmap_page()). Could we have a leak because of this?

Comment thread drivers/nvme/host/pci.c
iod->dma_vecs = mempool_alloc(nvmeq->dev->dmavec_mempool, GFP_ATOMIC);
if (!iod->dma_vecs)
return BLK_STS_RESOURCE;
iod->dma_vecs[0].addr = iter->addr;

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to the previous comment in the PRP path: if dma_pool_alloc() fails here, we return before writing any SGL entries and before iod->cmd.common.dptr.sgl is populated. When nvme_nvfs_unmap_sgls() is called with nr_descriptors == 0, it will call nvfs_dma_unmap_page() using the addr/len from iod->cmd.common.dptr.sgl, which was never set. Could there be a leak because of this?

Comment thread drivers/nvme/host/nvfs-rdma.h Outdated
return 0;
}

return ret;

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This return looks to be unreachable. Seems like you can drop the else case.

@sourabgupta3 sourabgupta3 force-pushed the sg_26.04_linux-nvidia-nvme-fix branch from 9582112 to 825c167 Compare April 17, 2026 20:27
@github-actions

github-actions Bot commented Apr 17, 2026

Copy link
Copy Markdown
Contributor

✅ Patchscan: No Missing Fixes

All cherry-picked commits have been checked — no missing upstream fixes found.

@nvmochs

nvmochs commented Apr 22, 2026

Copy link
Copy Markdown
Collaborator

@sourabgupta3 Can you please review/respond to Jamie's comments and questions above? Thanks!

@sourabgupta3

Copy link
Copy Markdown
Author

@nvmochs I already responded to Jamie's comment last week. I am in the middle of testing the fix.

@jamieNguyenNVIDIA

Copy link
Copy Markdown
Collaborator

@nvmochs I already responded to Jamie's comment last week. I am in the middle of testing the fix.

FYI: I don't see responses on this PR.

@sourabgupta3

Copy link
Copy Markdown
Author

@jamieNguyenNVIDIA @nvmochs It looks like my comment is stuck in pending state. This is what I have commented:
Both the comments look valid. Nice catch! Unfortunately, I don't have a setup to test these now. I will work on the fixes next week and submit.

Do you know why could it be in pending state?

@sourabgupta3 sourabgupta3 force-pushed the sg_26.04_linux-nvidia-nvme-fix branch from 825c167 to 09c15e6 Compare April 22, 2026 01:49
@jamieNguyenNVIDIA

Copy link
Copy Markdown
Collaborator

Hi @sourabgupta3: looks like there are still spaces instead of tabs at:

  1. nvfs-rdma.c:27
  2. nvfs.h:78–127

My other comments appear to have been addressed.

@sourabgupta3 sourabgupta3 force-pushed the sg_26.04_linux-nvidia-nvme-fix branch from 09c15e6 to fb14aa9 Compare April 22, 2026 04:13
@sourabgupta3

Copy link
Copy Markdown
Author

@jamieNguyenNVIDIA Addressed other comments as well.

Comment thread drivers/nvme/host/rdma.c
if (ret)
return -ENOMEM;

#ifdef CONFIG_NVFS

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The spacing still looks off here:

The #ifdef CONFIG_NVFS block added to rdma.c (inside nvme_rdma_dma_map_req) uses mostly spaces where the surrounding code uses tabs:

  drivers/nvme/host/rdma.c:1490-1500:                       
  #ifdef CONFIG_NVFS          
          {                                           ← 8 spaces                                                        
          bool is_nvfs_io = false;                    ← 8 spaces                                                        
          ret = nvme_rdma_nvfs_map_data(...);         ← 8 spaces                                                        
          if (is_nvfs_io) {                           ← 8 spaces                                                        
                if (ret)                            ← tab + 8 spaces (!)                                                
                       goto out_free_table;         ← tab + 15 spaces (!)                                               
                  return 0;                           ← 16 spaces                                                       
        }                                           ← tab                                                             
          }                                           ← 8 spaces                                                        
  #endif                                                                                                              
        req->data_sgl.nents = ...                   ← tab (surrounding code) 

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The spacing still looks off here:

The #ifdef CONFIG_NVFS block added to rdma.c (inside nvme_rdma_dma_map_req) uses mostly spaces where the surrounding code uses tabs:

  drivers/nvme/host/rdma.c:1490-1500:                       
  #ifdef CONFIG_NVFS          
          {                                           ← 8 spaces                                                        
          bool is_nvfs_io = false;                    ← 8 spaces                                                        
          ret = nvme_rdma_nvfs_map_data(...);         ← 8 spaces                                                        
          if (is_nvfs_io) {                           ← 8 spaces                                                        
                if (ret)                            ← tab + 8 spaces (!)                                                
                       goto out_free_table;         ← tab + 15 spaces (!)                                               
                  return 0;                           ← 16 spaces                                                       
        }                                           ← tab                                                             
          }                                           ← 8 spaces                                                        
  #endif                                                                                                              
        req->data_sgl.nents = ...                   ← tab (surrounding code) 

@sourabgupta3 - Can you review and reply to this comment from Jamie? (I believe this is the final issue before we plan to merge)

…ernel

BugLink: https://bugs.launchpad.net/bugs/2134960

With this change, the NVMe and NVMeoF driver would be enabled to
support GPUDirectStorage(GDS). NVMe driver introduced a way to
use the blk_rq_dma_map API to DMA map requests instead of
scatter gather lists. With these changes, GDS path also adopts
a similar framework where we introduce blk based
APIs(nvfs_blk_rq_dma_map_iter_start and nvfs_blk_rq_dma_map_iter_next)
to map a DMA request. The NVMeoF path remains the same as previous releases.

Signed-off-by: Sourab Gupta <sougupta@nvidia.com>
Reviewed-by: Kiran Modukuri <kmodukuri@nvidia.com>
@sourabgupta3 sourabgupta3 force-pushed the sg_26.04_linux-nvidia-nvme-fix branch from fb14aa9 to 28be368 Compare April 24, 2026 22:44
@github-actions

Copy link
Copy Markdown
Contributor

PR Validation Report

Patchscan ✅ No Missing Fixes

All cherry-picked commits checked — no missing upstream fixes found.

PR Lint ❌ Errors found

Details
Checking 1 commits...

Cherry-pick digest:
┌──────────────┬───────────────────────────────────────────────┬────────────┬─────────┬───────────────────────────┐
│ Local        │ Referenced upstream / Patch subject           │ Patch-ID   │ Subject │ SoB chain                 │
├──────────────┼───────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ 28be36885ae9 │ [SAUCE] patch nvme/nvmeof driver to support g │ N/A        │ N/A     │ sougupta                  │
└──────────────┴───────────────────────────────────────────────┴────────────┴─────────┴───────────────────────────┘

Lint: all checks passed.

PR metadata:
W: PR title missing [<branch>] prefix: "NVIDIA: SAUCE: Patch NVMe/NVMeoF driver to support GDS on Linux 7.0 Kernel"
E: PR targets 26.04_linux-nvidia but body has no 'BugLink:' or 'LP:' https://bugs.launchpad.net/... line

@nvmochs

nvmochs commented Apr 24, 2026

Copy link
Copy Markdown
Collaborator

I reviewed the latest commit, no further issues from me.

Acked-by: Matthew R. Ochs <mochs@nvidia.com>

@jamieNguyenNVIDIA

Copy link
Copy Markdown
Collaborator

Acked-by: Jamie Nguyen <jamien@nvidia.com>

@sourabgupta3 : Please apply the same changes to pr#376. Thanks!

@sourabgupta3

Copy link
Copy Markdown
Author

@jamieNguyenNVIDIA I did apply the changes to 376 as well, you don't see it there?

@jamieNguyenNVIDIA

Copy link
Copy Markdown
Collaborator

@jamieNguyenNVIDIA I did apply the changes to 376 as well, you don't see it there?

I do see them there -- sorry!

@jamieNguyenNVIDIA

Copy link
Copy Markdown
Collaborator

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants