Avoid PDL race conditions by disabling __restrict__ when PDL is used (ggml-org#24030)
* Removes __restrict__ from PDL kernel headers because it is incompatible
  with PDL. Adds arch-based preprocessor directives in the kernel body that
  re-add __restrict__, to retain performance on older architectures.
* Simplifies new __restrict__ usage via macro
* Adds Hopper to the PDL __restrict__ fix.
Co-authored-by: Oliver Simons <osimons@nvidia.com>
---------
Co-authored-by: Oliver Simons <osimons@nvidia.com>
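
The pattern described in the bullets above could look roughly like the sketch below. It is not the code from this change: the names SKETCH_PDL_MIN_ARCH, SKETCH_RESTRICT, SKETCH_HD and scale_range are hypothetical, and the architecture threshold (900, i.e. Hopper) is an assumption based only on the note that Hopper is covered by the fix. The function signature carries no __restrict__; the qualifier is re-applied to local pointers inside the body through a macro that expands to nothing when compiling device code for an architecture at or above the threshold.

```cpp
#include <cstdio>

// Assumption: first architecture on which the PDL path is taken (900 = Hopper).
#define SKETCH_PDL_MIN_ARCH 900

// Hypothetical macro, not the identifier used in the actual change.
#if defined(__CUDA_ARCH__) && __CUDA_ARCH__ >= SKETCH_PDL_MIN_ARCH
#define SKETCH_RESTRICT               // PDL-capable device code: drop __restrict__
#else
#define SKETCH_RESTRICT __restrict__  // older architectures and host code keep it
#endif

// Lets the same sketch build with nvcc or with a plain GCC/Clang host compiler.
#ifdef __CUDACC__
#define SKETCH_HD __host__ __device__
#else
#define SKETCH_HD
#endif

// The signature has no __restrict__; it is added (or not) inside the body.
SKETCH_HD inline void scale_range(const float * x, float * y, float s, int begin, int end) {
    const float * SKETCH_RESTRICT xr = x;
    float       * SKETCH_RESTRICT yr = y;
    for (int i = begin; i < end; ++i) {
        yr[i] = s * xr[i];  // y[i] = s * x[i] over [begin, end)
    }
}
```

Keeping the condition in a single macro means the architecture check is written once and each kernel body only names the macro, which is one way to read the "simplifies via macro" step.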