@@ -50,29 +50,24 @@ New features
 - Added the DLPack C exchange API (``__dlpack_c_exchange_api__``) to
   :class:`~utils.StridedMemoryView`.
 
-- Added NVRTC precompiled header (PCH) runtime APIs to :class:`Program`:
-  :meth:`~Program.get_pch_create_status`, :meth:`~Program.get_pch_heap_size_required`,
-  :meth:`~Program.get_pch_heap_size` (static), and :meth:`~Program.set_pch_heap_size`
-  (static). Requires NVRTC 12.8+.
-
-- Added ``preferred_location_type`` option to :class:`ManagedMemoryResourceOptions`
-  for explicit control over the preferred location kind (``"device"``,
-  ``"host"``, or ``"host_numa"``). This enables NUMA-aware managed memory
-  pool placement. The existing ``preferred_location`` parameter retains full
-  backwards compatibility when ``preferred_location_type`` is not set.
-
-- Added :attr:`ManagedMemoryResource.preferred_location` property to query the
-  resolved preferred location of a managed memory pool. Returns ``None`` for no
-  preference, or a tuple such as ``("device", 0)``, ``("host", None)``, or
-  ``("host_numa", 3)``.
-
-- Added ``numa_id`` option to :class:`PinnedMemoryResourceOptions` for explicit
-  control over host NUMA node placement. When ``ipc_enabled=True`` and
-  ``numa_id`` is not set, the NUMA node is automatically derived from the
-  current CUDA device.
-
-- Added :attr:`PinnedMemoryResource.numa_id` property to query the host NUMA
-  node ID used for pool placement. Returns ``-1`` for OS-managed placement.
+- Added NVRTC precompiled header (PCH) support (CUDA 12.8+).
+  :class:`ProgramOptions` gains ``pch``, ``create_pch``, ``use_pch``,
+  ``pch_dir``, and related options. :attr:`Program.pch_status` reports the
+  PCH creation outcome, and ``compile()`` automatically resizes the NVRTC
+  PCH heap and retries when PCH creation fails due to heap exhaustion.
+
+- Added NUMA-aware managed memory pool placement.
+  :class:`ManagedMemoryResourceOptions` gains a ``preferred_location_type``
+  option (``"device"``, ``"host"``, or ``"host_numa"``), and
+  :attr:`ManagedMemoryResource.preferred_location` queries the resolved
+  location. The existing ``preferred_location`` parameter retains full
+  backwards compatibility.
+
+- Added NUMA-aware pinned memory pool placement.
+  :class:`PinnedMemoryResourceOptions` gains a ``numa_id`` option, and
+  :attr:`PinnedMemoryResource.numa_id` queries the host NUMA node ID used for
+  pool placement. When ``ipc_enabled=True`` and ``numa_id`` is not set, the
+  NUMA node is automatically derived from the current CUDA device.
 
 - Added support for CUDA 13.2.
 
@@ -91,23 +86,23 @@ Fixes and enhancements
 
 - Fixed managed memory buffers being misclassified as ``kDLCUDAHost`` in DLPack
   device mapping. They are now correctly reported as ``kDLCUDAManaged``.
-  (:issue:`1863`)
+  (`#1863 <https://github.com/NVIDIA/cuda-python/pull/1863>`__)
 - Fixed IPC-enabled pinned memory pools using a hardcoded NUMA node ID of ``0``
   instead of the NUMA node closest to the active CUDA device. On multi-NUMA
   systems where the device is attached to a non-zero host NUMA node, this could
-  cause pool creation or allocation failures. (:issue:`1603`)
+  cause pool creation or allocation failures. (`#1603 <https://github.com/NVIDIA/cuda-python/issues/1603>`__)
 - Fixed :attr:`DeviceMemoryResource.peer_accessible_by` returning stale results when wrapping
   a non-owned (default) memory pool. The property now always queries the CUDA driver for
-  non-owned pools, so multiple wrappers around the same pool see consistent state. (:issue:`1720`)
+  non-owned pools, so multiple wrappers around the same pool see consistent state. (`#1720 <https://github.com/NVIDIA/cuda-python/issues/1720>`__)
 - Fixed a bare ``except`` clause in stream acceptance that silently swallowed all exceptions,
   including ``KeyboardInterrupt`` and ``SystemExit``. Only the expected "protocol not
-  supported" case is now caught. (:issue:`1631`)
+  supported" case is now caught. (`#1631 <https://github.com/NVIDIA/cuda-python/issues/1631>`__)
 - :class:`~utils.StridedMemoryView` now validates strides at construction time so unsupported
-  layouts fail immediately instead of on first metadata access. (:issue:`1429`)
+  layouts fail immediately instead of on first metadata access. (`#1429 <https://github.com/NVIDIA/cuda-python/issues/1429>`__)
 - IPC file descriptor cleanup now uses a C++ ``shared_ptr`` with a POSIX deleter, avoiding
   cryptic errors when a :class:`DeviceMemoryResource` is destroyed during Python shutdown.
-- Improved error message when ``ManagedMemoryResource()`` is called without options on platforms
-  that lack a default managed memory pool (e.g. WSL2). (:issue:`1617`)
+- Improved error message when :class:`ManagedMemoryResource` is called without options on platforms
+  that lack a default managed memory pool (e.g. WSL2). (`#1617 <https://github.com/NVIDIA/cuda-python/issues/1617>`__)
 - Handle properties on core API objects now return ``None`` during Python shutdown instead of
   crashing.
 - Reduced Python overhead in :class:`Program` and :class:`Linker` by moving compilation and