NVIDIA: SAUCE: Patch NVMe/NVMeoF driver to support GDS on Linux 7.0 Kernel#373
Conversation
|
@sourabgupta3 I would squash these two patches together as these are not upstreamed patches so keeps patch list small. |
|
@nirmoy For context, on 6.17 these two went as separate commits as we discovered a bug later, that's why I kept it this way to be consistent with 6.17. Do you still want to squash them? I am okay with either option. |
I think you should still squash them. It avoids leaving in a known bugged commit and would mean that you'd only need to pick a single patch into the next HWE kernel. |
6.17-HWE will be going away in a few months; this LTS will be supported for 10+ years. We should squash. |
0afef66 to
fd03681
Compare
|
@jamieNguyenNVIDIA @nirmoy Done! |
|
I confirmed that the squashed patches contain the same content as the version in 6.17-HWE with these differences:
No issues or concerns from me.
@sourabgupta3 - Please also submit this change against the 26.04_linux-nvidia-bos branch (which will be the first "HWE" in 26.04). Thanks! |
| struct nvme_request req; | ||
| struct nvme_command cmd; | ||
| u8 flags; | ||
| u32 flags; |
There was a problem hiding this comment.
At linux-nvidia-6.17 this u8 flags? is this a new fix?
There was a problem hiding this comment.
Yes, this is new as there wasn't any space left to accomodate NVFS flags. But I think I should add this also under #ifdef CONFIG_NVFS.
There was a problem hiding this comment.
It's needed because Sourab needed to add IOD_NVFS_IO to nvme_iod_flags and it was maxed out on bits that fit within a single byte. So he expanded the field to a large width (an integer) and used a very high bit number (the last bit of an integer) to avoid future conflicts when rebasing. This is the only place that uses these flags and I confirmed that there are no local variables that make a copy of the field (which would have potentially also required a fixup if they too used a single byte for storage).
@sourabgupta3 Actually, there was 1 nit I had that I forgot about... There are some instances of trailing whitespace added by this PR. Can these be addressed? |
21af88d to
93bd3d6
Compare
|
@nvmochs I cleaned up some now. Does this look okay? |
It looks great, no more trailing whitespace. One more tiny nit, in the commit title, can you change "Linux 6.17 Kernel" -> "Linux 7.0 Kernel"? |
93bd3d6 to
9582112
Compare
|
Done @nvmochs |
clsotog
left a comment
There was a problem hiding this comment.
Acked-by: Carol L Soto <csoto@nvidia.com>
|
|
jamieNguyenNVIDIA
left a comment
There was a problem hiding this comment.
General comment.
nvfs-dma.c, nvfs-rdma.c, and nvfs.h all appear to have space-indented code:
- nvfs-dma.c lines 31–47: REGISTER_FUNC and UNREGISTER_FUNC bodies use spaces
- nvfs-rdma.c lines 32–47: same functions as above
- nvfs.h lines 55–68: nvfs_count_ops() and nvfs_get_ops() bodies use spaces
- nvfs-dma.h lines 38–47: nvme_nvfs_unmap_sgls inner block uses spaces
The nvfs-dma.h line 174 (else with space indent) is also in nvme_nvfs_map_data.
Can you change these all to tabs?
| if (!nvfs_ops->nvfs_blk_rq_dma_map_iter_next(req, dma_dev, | ||
| &iod->dma_state, iter)) | ||
| return false; | ||
| iod->dma_vecs[iod->nr_dma_vecs].addr = iter->addr; |
There was a problem hiding this comment.
I can't tell if this matters or not, but if mempool_alloc() fails here, we'll return and not set iod->dma_vecs[0].addr = iter->addr;. Later, when nvme_nvfs_unmap_prps() is called, it will exit early at the if (!iod->dma_vecs) check without looping through iod->nr_dma_vecs (and calling nvfs_dma_unmap_page()). Could we have a leak because of this?
| iod->dma_vecs = mempool_alloc(nvmeq->dev->dmavec_mempool, GFP_ATOMIC); | ||
| if (!iod->dma_vecs) | ||
| return BLK_STS_RESOURCE; | ||
| iod->dma_vecs[0].addr = iter->addr; |
There was a problem hiding this comment.
Similar to the previous comment in the PRP path: if dma_pool_alloc() fails here, we return before writing any SGL entries and before iod->cmd.common.dptr.sgl is populated. When nvme_nvfs_unmap_sgls() is called with nr_descriptors == 0, it will call nvfs_dma_unmap_page() using the addr/len from iod->cmd.common.dptr.sgl, which was never set. Could there be a leak because of this?
| return 0; | ||
| } | ||
|
|
||
| return ret; |
There was a problem hiding this comment.
This return looks to be unreachable. Seems like you can drop the else case.
9582112 to
825c167
Compare
✅ Patchscan: No Missing FixesAll cherry-picked commits have been checked — no missing upstream fixes found. |
|
@sourabgupta3 Can you please review/respond to Jamie's comments and questions above? Thanks! |
|
@nvmochs I already responded to Jamie's comment last week. I am in the middle of testing the fix. |
FYI: I don't see responses on this PR. |
|
@jamieNguyenNVIDIA @nvmochs It looks like my comment is stuck in pending state. This is what I have commented: Do you know why could it be in pending state? |
825c167 to
09c15e6
Compare
|
Hi @sourabgupta3: looks like there are still spaces instead of tabs at:
My other comments appear to have been addressed. |
09c15e6 to
fb14aa9
Compare
|
@jamieNguyenNVIDIA Addressed other comments as well. |
| if (ret) | ||
| return -ENOMEM; | ||
|
|
||
| #ifdef CONFIG_NVFS |
There was a problem hiding this comment.
The spacing still looks off here:
The #ifdef CONFIG_NVFS block added to rdma.c (inside nvme_rdma_dma_map_req) uses mostly spaces where the surrounding code uses tabs:
drivers/nvme/host/rdma.c:1490-1500:
#ifdef CONFIG_NVFS
{ ← 8 spaces
bool is_nvfs_io = false; ← 8 spaces
ret = nvme_rdma_nvfs_map_data(...); ← 8 spaces
if (is_nvfs_io) { ← 8 spaces
if (ret) ← tab + 8 spaces (!)
goto out_free_table; ← tab + 15 spaces (!)
return 0; ← 16 spaces
} ← tab
} ← 8 spaces
#endif
req->data_sgl.nents = ... ← tab (surrounding code)
There was a problem hiding this comment.
The spacing still looks off here:
The #ifdef CONFIG_NVFS block added to rdma.c (inside nvme_rdma_dma_map_req) uses mostly spaces where the surrounding code uses tabs:
drivers/nvme/host/rdma.c:1490-1500: #ifdef CONFIG_NVFS { ← 8 spaces bool is_nvfs_io = false; ← 8 spaces ret = nvme_rdma_nvfs_map_data(...); ← 8 spaces if (is_nvfs_io) { ← 8 spaces if (ret) ← tab + 8 spaces (!) goto out_free_table; ← tab + 15 spaces (!) return 0; ← 16 spaces } ← tab } ← 8 spaces #endif req->data_sgl.nents = ... ← tab (surrounding code)
@sourabgupta3 - Can you review and reply to this comment from Jamie? (I believe this is the final issue before we plan to merge)
…ernel BugLink: https://bugs.launchpad.net/bugs/2134960 With this change, the NVMe and NVMeoF driver would be enabled to support GPUDirectStorage(GDS). NVMe driver introduced a way to use the blk_rq_dma_map API to DMA map requests instead of scatter gather lists. With these changes, GDS path also adopts a similar framework where we introduce blk based APIs(nvfs_blk_rq_dma_map_iter_start and nvfs_blk_rq_dma_map_iter_next) to map a DMA request. The NVMeoF path remains the same as previous releases. Signed-off-by: Sourab Gupta <sougupta@nvidia.com> Reviewed-by: Kiran Modukuri <kmodukuri@nvidia.com>
fb14aa9 to
28be368
Compare
PR Validation ReportPatchscan ✅ No Missing FixesAll cherry-picked commits checked — no missing upstream fixes found. PR Lint ❌ Errors foundDetailsChecking 1 commits... Cherry-pick digest: ┌──────────────┬───────────────────────────────────────────────┬────────────┬─────────┬───────────────────────────┐ │ Local │ Referenced upstream / Patch subject │ Patch-ID │ Subject │ SoB chain │ ├──────────────┼───────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤ │ 28be36885ae9 │ [SAUCE] patch nvme/nvmeof driver to support g │ N/A │ N/A │ sougupta │ └──────────────┴───────────────────────────────────────────────┴────────────┴─────────┴───────────────────────────┘ Lint: all checks passed. PR metadata: W: PR title missing [<branch>] prefix: "NVIDIA: SAUCE: Patch NVMe/NVMeoF driver to support GDS on Linux 7.0 Kernel" E: PR targets 26.04_linux-nvidia but body has no 'BugLink:' or 'LP:' https://bugs.launchpad.net/... line |
|
I reviewed the latest commit, no further issues from me.
|
|
@sourabgupta3 : Please apply the same changes to pr#376. Thanks! |
|
@jamieNguyenNVIDIA I did apply the changes to 376 as well, you don't see it there? |
I do see them there -- sorry! |
With this change, the NVMe and NVMeoF driver would be enabled to
support GPUDirectStorage(GDS). NVMe driver introduced a way to
use the blk_rq_dma_map API to DMA map requests instead of
scatter gather lists. With these changes, GDS path also adopts
a similar framework where we introduce blk based
APIs(nvfs_blk_rq_dma_map_iter_start and nvfs_blk_rq_dma_map_iter_next)
to map a DMA request. The NVMeoF path remains the same as previous releases.
Signed-off-by: Sourab Gupta sougupta@nvidia.com
Reviewed-by: Kiran Modukuri kmodukuri@nvidia.com