.. currentmodule:: cuda.core
Added public access to default CUDA streams via the module-level constants
``LEGACY_DEFAULT_STREAM`` and ``PER_THREAD_DEFAULT_STREAM``. Users can now access default streams directly from the
``cuda.core`` namespace:

.. code-block:: python

    from cuda.core import LEGACY_DEFAULT_STREAM, PER_THREAD_DEFAULT_STREAM

    # Use legacy default stream (synchronizes with all blocking streams)
    LEGACY_DEFAULT_STREAM.sync()

    # Use per-thread default stream (non-blocking, thread-local)
    PER_THREAD_DEFAULT_STREAM.sync()

The legacy default stream synchronizes with all blocking streams in the same CUDA context, ensuring strict ordering but potentially limiting concurrency. The per-thread default stream is local to the calling thread and does not synchronize with other streams, enabling concurrent execution in multi-threaded applications.
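
The thread-local behavior described above could be exercised along the following lines. This is a minimal sketch, not an official usage pattern: the helper names ``sync_per_thread_default_stream`` and ``run_in_threads`` are hypothetical, and the sketch assumes that ``Device`` is importable from ``cuda.core`` and that each thread makes a device current before touching a stream.

.. code-block:: python

    import threading


    def sync_per_thread_default_stream():
        """Synchronize the calling thread's own default stream."""
        # Imported lazily so run_in_threads() stays usable without a GPU.
        from cuda.core import Device, PER_THREAD_DEFAULT_STREAM

        # Assumption: each thread sets a current device before using streams.
        Device().set_current()
        # Work enqueued by this thread on PER_THREAD_DEFAULT_STREAM would go here.
        PER_THREAD_DEFAULT_STREAM.sync()


    def run_in_threads(worker, num_threads=2):
        """Run ``worker`` once in each of ``num_threads`` threads and wait."""
        threads = [threading.Thread(target=worker) for _ in range(num_threads)]
        for thread in threads:
            thread.start()
        for thread in threads:
            thread.join()

On a machine with a CUDA-capable GPU, ``run_in_threads(sync_per_thread_default_stream)`` would have every thread wait only on its own default stream.
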
``LEGACY_DEFAULT_STREAM`` replaces the previous undocumented workaround of using
``Stream.from_handle(0)`` to access the legacy default stream.

Added the :func:`~cuda.core.utils.make_aligned_dtype` utility for creating structured NumPy dtypes with GPU-compatible alignment. Field offsets and the total
``itemsize`` are recomputed so that each field is naturally aligned and the structure size is a multiple of the largest member alignment. An explicit ``alignment`` can be requested and is stored in the dtype's metadata under the key ``"__cuda_alignment__"``. (Resolves :issue:`734`.)

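
To make the natural-alignment rule concrete, the sketch below uses plain NumPy only. It does not call ``make_aligned_dtype``, whose signature is not given here. The field names, the helper functions, and the alignment value ``16`` are illustrative assumptions. NumPy's ``align=True`` applies C-like padding, which may differ in detail from what ``make_aligned_dtype`` produces.

.. code-block:: python

    import numpy as np

    # Hypothetical record layout used only for illustration.
    FIELDS = [("flag", np.uint8), ("value", np.float64), ("count", np.int32)]

    # Packed layout: fields sit back to back with no padding.
    packed = np.dtype(FIELDS)

    # align=True pads fields the way a C compiler would.
    aligned = np.dtype(FIELDS, align=True)


    def field_offsets(dtype):
        """Return {field name: byte offset} for a structured dtype."""
        return {name: dtype.fields[name][1] for name in dtype.names}


    def is_naturally_aligned(dtype):
        """Check that every field offset is a multiple of the field's
        alignment and that itemsize is a multiple of the largest one."""
        members = [dtype.fields[name] for name in dtype.names]
        largest = max(member[0].alignment for member in members)
        fields_ok = all(member[1] % member[0].alignment == 0 for member in members)
        return fields_ok and dtype.itemsize % largest == 0


    # Plain NumPy can carry user metadata on a dtype. The key is the one named
    # in the text; the value 16 is an assumed example.
    tagged = np.dtype(FIELDS, align=True, metadata={"__cuda_alignment__": 16})

    print(field_offsets(packed), packed.itemsize)
    print(field_offsets(aligned), aligned.itemsize)
    print(is_naturally_aligned(packed), is_naturally_aligned(aligned))
    print(tagged.metadata["__cuda_alignment__"])

The packed dtype places ``value`` at byte offset 1, so it fails the check, while the padded dtype passes it.
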
None.