Commit 184b7ff

feat(core): add host, pinned, and device memory management utilities
- Implement memory wrappers for host (`malloc`), pinned host (`cudaMallocHost`), and aligned device (`cudaMalloc`) allocations.
- Enforce strict memory layout by rounding device allocation sizes up to a multiple of `QX_MEM_ALIGN`.
- Add `tensor_alloc_device` and `tensor_alloc_host` factory allocators with automatic initialization.
- Implement a unified `tensor_free` that safely deallocates tensors in any memory space.
- Add an async host-to-device copy routine (`tensor_h2d`).
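The rounding step mentioned above can be sketched in host-only C++, which also compiles unchanged under `nvcc`. This is an illustration, not the code from `memory.cuh`: the helper name `round_up_bytes` and the value 256 used for `kAssumedAlign` are assumptions, since the commit names `QX_MEM_ALIGN` but does not show its definition.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for QX_MEM_ALIGN; the real value is not shown in this commit.
constexpr std::size_t kAssumedAlign = 256;

// Round `bytes` up to the next multiple of `align`.
// `align` must be a non-zero power of two for the bit-mask trick to be valid.
// Note: the addition can wrap around if `bytes` is within `align` of SIZE_MAX.
inline std::size_t round_up_bytes(std::size_t bytes, std::size_t align) {
    assert(align != 0 && (align & (align - 1)) == 0);
    return (bytes + align - 1) & ~(align - 1);
}
```

Under these assumptions, a device allocator would pass `round_up_bytes(requested, kAssumedAlign)` to `cudaMalloc` instead of the raw request, so a 1000-byte request would reserve 1024 bytes.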
1 parent 98dce09 commit 184b7ff

1 file changed

Lines changed: 9 additions & 0 deletions

cuda/includes/utils.cuh

@@ -0,0 +1,9 @@
+#pragma once
+
+// Aggregator — include this one header to get the full Day 1 runtime.
+// Each sub-header is small and independently loadable.
+
+#include "common.h"    // macros, enums, error checks, dtype helpers
+#include "tensor.cuh"  // TensorShape, Tensor struct
+#include "memory.cuh"  // allocators, tensor_alloc_*, tensor_free, transfers
+#include "reduce.cuh"  // warpReduceSum/Max/Min, blockReduceSum/Max
