cuda.core.system: Add basic Nvlink and Utilization support #1918

Open
mdboom wants to merge 3 commits into NVIDIA:main from mdboom:cuda-core-system-jupyterlab-nvdashboard

@mdboom mdboom commented Apr 15, 2026

These APIs are needed by rapidsai/jupyterlab-nvdashboard and rapidsai/rapids-cli.

@mdboom mdboom self-assigned this Apr 15, 2026
@mdboom mdboom added the cuda.core Everything related to the cuda.core module label Apr 15, 2026
@mdboom mdboom added this to the cuda.core v1.0.0 milestone Apr 15, 2026
@mdboom mdboom force-pushed the cuda-core-system-jupyterlab-nvdashboard branch from ac86822 to 039013e Compare April 20, 2026 18:24
@mdboom mdboom requested a review from rparolin April 20, 2026 18:25

@rwgk rwgk left a comment

Generated with the help of Cursor GPT-5.4 Extra High Fast

Manually verified.


Medium: Invalid NVLink indices are accepted and fail late

Device.nvlink() currently accepts negative or out-of-range link indices and
returns NvlinkInfo without validating them first. That differs from existing
indexed accessors such as Device.fan(), which validate eagerly. In practice,
device.nvlink(-1) constructs successfully and only fails later when a
property such as .version is accessed, which turns a basic argument error
into a delayed runtime failure.

Relevant paths:

  • cuda_core/cuda/core/system/_device.pyx:585
  • cuda_core/cuda/core/system/_device.pyx:683
  • cuda_core/cuda/core/system/_nvlink.pxi
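
The eager check described above could look roughly like the sketch below. It is plain Python rather than the Cython in `_device.pyx`, and `DeviceSketch`, `NvlinkInfoSketch`, and `link_count` are hypothetical stand-ins, not names from this PR. Raising `ValueError` is also an assumption; the real accessor may prefer a different exception.

```python
class NvlinkInfoSketch:
    """Hypothetical stand-in for NvlinkInfo; stores an already-validated index."""

    def __init__(self, device, link):
        self.device = device
        self.link = link


class DeviceSketch:
    """Hypothetical stand-in for Device with a known number of NVLink links."""

    def __init__(self, link_count):
        # Assumption: the number of links is known when nvlink() is called.
        self.link_count = link_count

    def nvlink(self, link):
        # Validate eagerly so a bad index fails here,
        # not later when a property such as .version is read.
        if not 0 <= link < self.link_count:
            raise ValueError(
                f"NVLink index {link} is out of range [0, {self.link_count})"
            )
        return NvlinkInfoSketch(self, link)
```

With this shape, `DeviceSketch(4).nvlink(-1)` raises immediately instead of returning an object that fails on first use.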

Low: NvlinkInfo.version documents a non-existent return type

The public enum exported by cuda.core.system is NvlinkVersion, and the API
index plus tests use that spelling, but NvlinkInfo.version is annotated and
documented as NvLinkVersion. That leaks a wrong type name into the generated
help/doc output and points users at a symbol that does not exist.

Relevant paths:

  • cuda_core/cuda/core/system/_nvlink.pxi:21
  • cuda_core/docs/source/api.rst:225
  • cuda_core/tests/system/test_system_device.py:747
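
A mismatch like this can be caught mechanically. The sketch below is a minimal plain-Python illustration, assuming the return annotation is stored as a string; how Cython records annotations in a `.pxi` file may differ. `EXPORTED`, `MisspelledInfo`, `FixedInfo`, and `return_annotation_is_exported` are hypothetical names, and the enum member is a placeholder.

```python
import enum


class NvlinkVersion(enum.Enum):
    """Stand-in for the exported enum; the member is a placeholder."""

    PLACEHOLDER = 0


# Names assumed to be publicly exported by the package.
EXPORTED = {"NvlinkVersion": NvlinkVersion}


class MisspelledInfo:
    @property
    def version(self) -> "NvLinkVersion":  # capital L: not an exported name
        return NvlinkVersion.PLACEHOLDER


class FixedInfo:
    @property
    def version(self) -> "NvlinkVersion":
        return NvlinkVersion.PLACEHOLDER


def return_annotation_is_exported(cls, prop_name, exported):
    """True if the property's string return annotation names an exported symbol."""
    getter = getattr(cls, prop_name).fget
    return getter.__annotations__["return"] in exported
```

Here the check returns `False` for `MisspelledInfo` and `True` for `FixedInfo`, which is the distinction the finding is about.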

Low: NvlinkInfo.state has no direct test coverage

The new test_nvlink() checks construction of NvlinkInfo and accesses
.version, but it never reads .state. As a result, the wrapper path behind
NvlinkInfo.state has no direct coverage even on systems where the test does
not skip.

Relevant paths:

  • cuda_core/cuda/core/system/_nvlink.pxi:35
  • cuda_core/tests/system/test_system_device.py:734
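
One way to close this gap is to read every public property in the test, not only `.version`. The sketch below shows the idea on a stand-in object; `read_nvlink_properties` and `fake_info` are hypothetical, and a real test would get the object from `device.nvlink(...)` on hardware that supports NVLink. The return types of the properties are deliberately not asserted here, since only their names are given above.

```python
from types import SimpleNamespace


def read_nvlink_properties(info, names=("version", "state")):
    """Read each named property once so every wrapper path is exercised."""
    return {name: getattr(info, name) for name in names}


# Hypothetical stand-in with placeholder values for both properties.
fake_info = SimpleNamespace(version="version-placeholder", state="state-placeholder")
values = read_nvlink_properties(fake_info)
```

If a property is missing or its wrapper raises, the helper fails at the `getattr` call, so the test reports the broken property directly.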

mdboom commented Apr 21, 2026

Thanks for having your agent fight with my agent, @rwgk. ;)

@mdboom mdboom requested a review from rwgk April 21, 2026 15:59