Commit 57311e8
authored
## Summary
- Project Versioning: Sets the starting project version to 0.1.0.
- Code Shortcuts (Macros): Creates clean shorthand terms for CUDA
keywords (like wrapping __device__ into QX_DEVICE) to make writing GPU
kernels cleaner.
- Math & Memory Utilities: Adds fast math helpers for aligning memory,
rounding numbers, and calculating power-of-two boundaries quickly.
- Memory Optimization: Forces a 128-byte memory alignment to ensure the
GPU can read data as fast as possible (coalesced memory access).
- Automatic Error Checking: Introduces safety wrappers (CUDA_CHECK,
CUBLAS_CHECK, NCCL_CHECK) that instantly watch for crashes or failures
in Nvidia's core hardware and math libraries, making debugging much
easier.
0 file changed
0 commit comments