
Commit 1a04c76 ("merge main")
2 parents: 3736541 + 291ed64

13 files changed: 88 additions & 82 deletions

.github/workflows/backport.yml
Lines changed: 2 additions & 3 deletions

```diff
@@ -14,7 +14,8 @@ jobs:
   backport:
     name: Backport pull request
     if: ${{ github.repository_owner == 'nvidia' &&
-      github.event.pull_request.merged == true
+      github.event.pull_request.merged == true &&
+      contains( github.event.pull_request.labels.*.name, 'to-be-backported')
       }}
     runs-on: ubuntu-latest
     steps:
@@ -32,6 +33,4 @@ jobs:
         copy_assignees: true
         copy_labels_pattern: true
         copy_requested_reviewers: true
-        label_pattern: to-be-backported
         target_branches: ${{ env.OLD_BRANCH }}
-        conflict_resolution: draft_commit_conflicts
```
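The updated `if:` expression gates backports on three facts about the pull request: repository owner, merged state, and the presence of the `to-be-backported` label. A minimal Python sketch of the same predicate (a hypothetical helper for illustration, not part of the workflow):

```python
def should_backport(repository_owner, merged, label_names):
    """Mirror the workflow condition:
    github.repository_owner == 'nvidia' &&
    github.event.pull_request.merged == true &&
    contains(github.event.pull_request.labels.*.name, 'to-be-backported')
    """
    return (
        repository_owner == "nvidia"
        and merged
        and "to-be-backported" in label_names
    )
```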

.pre-commit-config.yaml
Lines changed: 10 additions & 1 deletion

```diff
@@ -1,4 +1,13 @@
-# Copyright (c) 2024, NVIDIA CORPORATION.
+# Copyright (c) 2024-2025, NVIDIA CORPORATION.
+ci:
+  autofix_commit_msg: |
+    [pre-commit.ci] auto code formatting
+  autofix_prs: false
+  autoupdate_branch: ''
+  autoupdate_commit_msg: '[pre-commit.ci] pre-commit autoupdate'
+  autoupdate_schedule: quarterly
+  skip: []
+  submodules: false
 
 repos:
 - repo: https://github.com/astral-sh/ruff-pre-commit
```

cuda_bindings/README.md
Lines changed: 8 additions & 8 deletions

```diff
@@ -19,9 +19,9 @@ Differences between these options are described in [Installation](https://nvidia
 CUDA Python is supported on all platforms that CUDA is supported. Specific dependencies are as follows:
 
 * Driver: Linux (450.80.02 or later) Windows (456.38 or later)
-* CUDA Toolkit 12.0 to 12.6
+* CUDA Toolkit 12.x
 
-Only the NVRTC redistributable component is required from the CUDA Toolkit. [CUDA Toolkit Documentation](https://docs.nvidia.com/cuda/index.html) Installation Guides can be used for guidance. Note that the NVRTC component in the Toolkit can be obtained via PYPI, Conda or Local Installer.
+Only the NVRTC and nvJitLink redistributable components are required from the CUDA Toolkit, which can be obtained via PyPI, Conda, or local installers (as described in the CUDA Toolkit [Windows](https://docs.nvidia.com/cuda/cuda-installation-guide-microsoft-windows/index.html) and [Linux](https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html) Installation Guides).
 
 ### Supported Python Versions
 
@@ -63,8 +63,8 @@ Latest dependencies can be found in [requirements.txt](https://github.com/NVIDIA
 
 Multiple testing options are available:
 
-* Cython Unit Tests
 * Python Unit Tests
+* Cython Unit Tests
 * Samples
 * Benchmark
 
@@ -73,18 +73,18 @@ Multiple testing options are available:
 Responsible for validating different binding usage patterns. Unit test `test_kernelParams.py` is particularly special since it demonstrates various approaches in setting up kernel launch parameters.
 
 To run these tests:
-* `python -m pytest tests/` against local builds
+* `python -m pytest tests/` against editable installations
 * `pytest tests/` against installed packages
 
 ### Cython Unit Tests
 
-Cython tests are located in `tests/cython` and need to be built. Furthermore they need CUDA Toolkit headers matching the major-minor of CUDA Python. To build them:
+Cython tests are located in `tests/cython` and need to be built. These builds have the same CUDA Toolkit header requirements as [Installing from Source](https://nvidia.github.io/cuda-python/cuda-bindings/latest/install.html#requirements) where the major.minor version must match `cuda.bindings`. To build them:
 
 1. Setup environment variable `CUDA_HOME` with the path to the CUDA Toolkit installation.
 2. Run `build_tests` script located in `test/cython` appropriate to your platform. This will both cythonize the tests and build them.
 
 To run these tests:
-* `python -m pytest tests/cython/` against local builds
+* `python -m pytest tests/cython/` against editable installations
 * `pytest tests/cython/` against installed packages
 
 ### Samples
@@ -102,13 +102,13 @@ In addition, extra examples are included:
 wrappers of the driver API.
 
 To run these samples:
-* `python -m pytest tests/cython/` against local builds
+* `python -m pytest tests/cython/` against editable installations
 * `pytest tests/cython/` against installed packages
 
 ### Benchmark (WIP)
 
 Benchmarks were used for performance analysis during initial release of CUDA Python. Today they need to be updated the 12.x toolkit and are work in progress.
 
 The intended way to run these benchmarks was:
-* `python -m pytest --benchmark-only benchmark/` against local builds
+* `python -m pytest --benchmark-only benchmark/` against editable installations
 * `pytest --benchmark-only benchmark/` against installed packages
```
cuda_bindings/docs/source/environment_variables.md
Lines changed: 13 additions & 0 deletions

```diff
@@ -0,0 +1,13 @@
+# Environment Variables
+
+## Build-Time Environment Variables
+
+- `CUDA_HOME` or `CUDA_PATH`: Specifies the location of the CUDA Toolkit.
+
+- `CUDA_PYTHON_PARSER_CACHING` : bool, toggles the caching of parsed header files during the cuda-bindings build process. If caching is enabled (`CUDA_PYTHON_PARSER_CACHING` is True), the cache path is set to ./cache_<library_name>, where <library_name> is derived from the cuda toolkit libraries used to build cuda-bindings.
+
+- `CUDA_PYTHON_PARALLEL_LEVEL` (previously `PARALLEL_LEVEL`) : int, sets the number of threads used in the compilation of extension modules. Not setting it or setting it to 0 would disable parallel builds.
+
+## Runtime Environment Variables
+
+- `CUDA_PYTHON_CUDA_PER_THREAD_DEFAULT_STREAM` : When set to 1, the default stream is the per-thread default stream. When set to 0, the default stream is the legacy default stream. This defaults to 0, for the legacy default stream. See [Stream Synchronization Behavior](https://docs.nvidia.com/cuda/cuda-runtime-api/stream-sync-behavior.html) for an explanation of the legacy and per-thread default streams.
```
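The runtime flag above is a 0/1 switch. A hypothetical helper (not part of cuda-bindings) illustrating the documented semantics, where unset or `0` selects the legacy default stream:

```python
import os


def use_per_thread_default_stream(environ=os.environ):
    # "1" -> per-thread default stream; "0", unset, or anything else -> legacy.
    return environ.get("CUDA_PYTHON_CUDA_PER_THREAD_DEFAULT_STREAM", "0") == "1"
```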

cuda_bindings/docs/source/index.rst
Lines changed: 1 addition & 0 deletions

```diff
@@ -9,6 +9,7 @@
    overview.md
    motivation.md
    release.md
+   environment_variables.md
    api.rst
 
 
```

cuda_bindings/docs/source/install.md
Lines changed: 21 additions & 59 deletions

````diff
@@ -2,91 +2,53 @@
 
 ## Runtime Requirements
 
-CUDA Python is supported on all platforms that CUDA is supported. Specific
-dependencies are as follows:
+`cuda.bindings` supports the same platforms as CUDA. Runtime dependencies are:
 
 * Driver: Linux (450.80.02 or later) Windows (456.38 or later)
-* CUDA Toolkit 12.0 to 12.6
+* CUDA Toolkit 12.x
 
-```{note} Only the NVRTC redistributable component is required from the CUDA Toolkit. [CUDA Toolkit Documentation](https://docs.nvidia.com/cuda/index.html) Installation Guides can be used for guidance. Note that the NVRTC component in the Toolkit can be obtained via PYPI, Conda or Local Installer.
+```{note}
+Only the NVRTC and nvJitLink redistributable components are required from the CUDA Toolkit, which can be obtained via PyPI, Conda, or local installers (as described in the CUDA Toolkit [Windows](https://docs.nvidia.com/cuda/cuda-installation-guide-microsoft-windows/index.html) and [Linux](https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html) Installation Guides).
 ```
 
 ## Installing from PyPI
 
-```{code-block} shell
-pip install cuda-python
+```console
+$ pip install cuda-python
 ```
 
 ## Installing from Conda
 
-```{code-block} shell
-conda install -c nvidia cuda-python
+```console
+$ conda install -c conda-forge cuda-python
 ```
 
-Conda packages are assigned a dependency to CUDA Toolkit:
-
-* cuda-cudart (Provides CUDA headers to enable writting NVRTC kernels with CUDA types)
-* cuda-nvrtc (Provides NVRTC shared library)
-
 ## Installing from Source
 
-### Build Requirements
+### Requirements
 
-* CUDA Toolkit headers
-* Cython
-* pyclibrary
+* CUDA Toolkit headers[^1]
 
-Remaining build and test dependencies are outlined in [requirements.txt](https://github.com/NVIDIA/cuda-python/blob/main/requirements.txt)
+[^1]: User projects that `cimport` CUDA symbols in Cython must also use CUDA Toolkit (CTK) types as provided by the `cuda.bindings` major.minor version. This results in CTK headers becoming a transitive dependency of downstream projects through CUDA Python.
 
-The version of CUDA Toolkit headers must match the major.minor of CUDA Python. Note that minor version compatibility will still be maintained.
+Source builds require that the provided CUDA headers are of the same major.minor version as the `cuda.bindings` you're trying to build. Despite this requirement, note that the minor version compatibility is still maintained. Use the `CUDA_HOME` (or `CUDA_PATH`) environment variable to specify the location of your headers. For example, if your headers are located in `/usr/local/cuda/include`, then you should set `CUDA_HOME` with:
 
-During the build process, environment variable `CUDA_HOME` or `CUDA_PATH` are used to find the location of CUDA headers. In particular, if your headers are located in path `/usr/local/cuda/include`, then you should set `CUDA_HOME` as follows:
-
-```
-export CUDA_HOME=/usr/local/cuda
+```console
+$ export CUDA_HOME=/usr/local/cuda
 ```
 
-### In-place
+See [Environment Variables](environment_variables.md) for a description of other build-time environment variables.
 
-To compile the extension in-place, run:
-
-```{code-block} shell
-python setup.py build_ext --inplace
+```{note}
+Only `cydriver`, `cyruntime` and `cynvrtc` are impacted by the header requirement.
 ```
 
-To compile for debugging the extension modules with gdb, pass the `--debug`
-argument to setup.py.
-
-### Develop
+### Editable Install
 
 You can use
 
-```{code-block} shell
-pip install -e .
-```
-
-to install the module as editible in your current Python environment (e.g. for
-testing of porting other libraries to use the binding).
-
-## Build the Docs
-
-```{code-block} shell
-conda env create -f docs_src/environment-docs.yml
-conda activate cuda-python-docs
+```console
+$ pip install -v -e .
 ```
-Then compile and install `cuda-python` following the steps above.
 
-```{code-block} shell
-cd docs_src
-make html
-open build/html/index.html
-```
-
-### Publish the Docs
-
-```{code-block} shell
-git checkout gh-pages
-cd docs_src
-make html
-cp -a build/html/. ../docs/
-```
+to install the module as editable in your current Python environment (e.g. for testing of porting other libraries to use the binding).
````
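The major.minor matching rule for source builds can be sketched as a simple version check (hypothetical helper names, not part of the package):

```python
def _major_minor(version: str):
    # Extract the (major, minor) pair from a dotted version string,
    # e.g. "12.6.1" -> (12, 6).
    parts = version.split(".")
    return int(parts[0]), int(parts[1])


def headers_match(bindings_version: str, ctk_header_version: str) -> bool:
    # A source build of cuda.bindings X.Y requires CTK headers X.Y.
    return _major_minor(bindings_version) == _major_minor(ctk_header_version)
```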

cuda_bindings/docs/source/module/nvjitlink.rst
Lines changed: 5 additions & 0 deletions

```diff
@@ -1,6 +1,11 @@
 nvjitlink
 =========
 
+Note
+----
+
+The nvjitlink bindings are not supported on nvJitLink installations <12.3. Ensure the installed CUDA toolkit's nvJitLink version is >=12.3.
+
 Functions
 ---------
 
```
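The version floor stated in the new note can be expressed as a simple comparison (a hypothetical check for illustration, not the actual gating code in the bindings):

```python
def nvjitlink_supported(major: int, minor: int) -> bool:
    # The nvjitlink bindings require nvJitLink >= 12.3.
    return (major, minor) >= (12, 3)
```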

cuda_bindings/setup.py
Lines changed: 11 additions & 2 deletions

```diff
@@ -15,6 +15,7 @@
 import sys
 import sysconfig
 import tempfile
+from warnings import warn
 
 from Cython import Tempita
 from Cython.Build import cythonize
@@ -32,7 +33,15 @@
     raise RuntimeError("Environment variable CUDA_HOME or CUDA_PATH is not set")
 
 CUDA_HOME = CUDA_HOME.split(os.pathsep)
-nthreads = int(os.environ.get("PARALLEL_LEVEL", "0") or "0")
+if os.environ.get("PARALLEL_LEVEL") is not None:
+    warn(
+        "Environment variable PARALLEL_LEVEL is deprecated. Use CUDA_PYTHON_PARALLEL_LEVEL instead",
+        DeprecationWarning,
+        stacklevel=1,
+    )
+    nthreads = int(os.environ.get("PARALLEL_LEVEL", "0"))
+else:
+    nthreads = int(os.environ.get("CUDA_PYTHON_PARALLEL_LEVEL", "0") or "0")
 PARSER_CACHING = os.environ.get("CUDA_PYTHON_PARSER_CACHING", False)
 PARSER_CACHING = bool(PARSER_CACHING)
 
@@ -80,7 +89,7 @@
 found_values = []
 
 include_path_list = [os.path.join(path, "include") for path in CUDA_HOME]
-print(f'Parsing headers in "{include_path_list}" (Caching {PARSER_CACHING})')
+print(f'Parsing headers in "{include_path_list}" (Caching = {PARSER_CACHING})')
 for library, header_list in header_dict.items():
     header_paths = []
     for header in header_list:
```
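The renamed-variable fallback in this hunk honors the deprecated `PARALLEL_LEVEL` with a warning before consulting the new name. A self-contained sketch of the same logic, with the environment passed explicitly for testability (an assumed simplification, not the repo's exact code):

```python
import warnings


def resolve_nthreads(environ):
    # Prefer the deprecated PARALLEL_LEVEL if set, warning about the rename.
    if environ.get("PARALLEL_LEVEL") is not None:
        warnings.warn(
            "Environment variable PARALLEL_LEVEL is deprecated. "
            "Use CUDA_PYTHON_PARALLEL_LEVEL instead",
            DeprecationWarning,
            stacklevel=2,
        )
        return int(environ["PARALLEL_LEVEL"])
    # Otherwise use the new name; empty or unset means 0 (no parallel build).
    return int(environ.get("CUDA_PYTHON_PARALLEL_LEVEL", "0") or "0")
```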

cuda_core/README.md
Lines changed: 1 addition & 1 deletion

```diff
@@ -39,5 +39,5 @@ for more details, including how to sign your commits.
 ## Testing
 
 To run these tests:
-* `python -m pytest tests/` against local builds
+* `python -m pytest tests/` against editable installations
 * `pytest tests/` against installed packages
```

cuda_core/cuda/core/experimental/_stream.py
Lines changed: 14 additions & 5 deletions

```diff
@@ -5,6 +5,7 @@
 from __future__ import annotations
 
 import os
+import warnings
 import weakref
 from dataclasses import dataclass
 from typing import TYPE_CHECKING, Optional, Tuple, Union
@@ -87,9 +88,19 @@ def _init(obj=None, *, options: Optional[StreamOptions] = None):
         if obj is not None and options is not None:
             raise ValueError("obj and options cannot be both specified")
         if obj is not None:
-            if not hasattr(obj, "__cuda_stream__"):
-                raise ValueError
-            info = obj.__cuda_stream__
+            try:
+                info = obj.__cuda_stream__()
+            except AttributeError as e:
+                raise TypeError(f"{type(obj)} object does not have a '__cuda_stream__' method") from e
+            except TypeError:
+                info = obj.__cuda_stream__
+                warnings.simplefilter("once", DeprecationWarning)
+                warnings.warn(
+                    "Implementing __cuda_stream__ as an attribute is deprecated; it must be implemented as a method",
+                    stacklevel=3,
+                    category=DeprecationWarning,
+                )
+
         assert info[0] == 0
         self._mnff.handle = cuda.CUstream(info[1])
         # TODO: check if obj is created under the current context/device
@@ -132,7 +143,6 @@ def close(self):
         """
         self._mnff.close()
 
-    @property
     def __cuda_stream__(self) -> Tuple[int, int]:
         """Return an instance of a __cuda_stream__ protocol."""
         return (0, self.handle)
@@ -279,7 +289,6 @@ def from_handle(handle: int) -> Stream:
         """
 
         class _stream_holder:
-            @property
         def __cuda_stream__(self):
             return (0, handle)
 
```
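The protocol change above (a `__cuda_stream__` method instead of a property) implies a dispatch pattern on the consumer side: call it, and fall back to attribute access when the object still exposes the deprecated form. A standalone sketch with hypothetical example classes (an assumed simplification of the diff, not the library's code):

```python
import warnings


def get_stream_info(obj):
    try:
        # New protocol: __cuda_stream__ is a method returning (version, handle).
        return obj.__cuda_stream__()
    except AttributeError as e:
        raise TypeError(
            f"{type(obj)} object does not have a '__cuda_stream__' method"
        ) from e
    except TypeError:
        # Calling a non-callable (e.g. a tuple attribute) raises TypeError:
        # fall back to the deprecated attribute form.
        info = obj.__cuda_stream__
        warnings.warn(
            "Implementing __cuda_stream__ as an attribute is deprecated; "
            "it must be implemented as a method",
            DeprecationWarning,
            stacklevel=2,
        )
        return info


class NewStyle:
    def __cuda_stream__(self):
        return (0, 123)


class OldStyle:
    __cuda_stream__ = (0, 456)  # deprecated attribute form
```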
