Commit a2839b4

ggml-cuda : add explicit casts to -INFINITY for float and half2 types
This commit adds explicit casts to float for -INFINITY.

The motivation for this is that in CUDA 11.8.0, the INFINITY macro is defined as a double (in a header provided by NVCC). This triggers a warning and hence causes a CI failure in whisper.cpp. I believe that this header might have been updated in CUDA 12, which is why we don't see this warning there.

Refs: https://github.com/ggml-org/whisper.cpp/actions/runs/25713948217/job/75500081939?pr=3803
Refs: ggml-org/llama.cpp#22824
1 parent 4babfd4 commit a2839b4

1 file changed

ggml/src/ggml-cuda/common.cuh

Lines changed: 2 additions & 2 deletions
@@ -582,9 +582,9 @@ template <typename T> struct block_reduce_policy<block_reduce_method::MAX, T> {
 
     static __device__ T sentinel() {
         if constexpr (std::is_same_v<T, float>) {
-            return -INFINITY;
+            return -(float)INFINITY;
         } else if constexpr (std::is_same_v<T, half2>) {
-            return make_half2(-INFINITY, -INFINITY);
+            return make_half2(__float2half(-(float)INFINITY), __float2half(-(float)INFINITY));
         } else {
             static_assert(ggml_cuda_dependent_false_v<T>, "Unsupported type for block reduce max");
         }
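
The effect of the cast can be sketched on the host in plain C++17, without a CUDA toolchain. This is only an illustration under stated assumptions: `DOUBLE_INFINITY`, `dependent_false_v`, and `max_sentinel` are hypothetical names that do not appear in ggml. `DOUBLE_INFINITY` stands in for a header that defines the infinity macro as a double, which is what the commit message reports for CUDA 11.8.0.

```cpp
#include <cmath>
#include <limits>
#include <type_traits>

// Hypothetical stand-in for a header whose infinity macro has type double
// (the situation the commit message describes for CUDA 11.8.0).
#define DOUBLE_INFINITY (std::numeric_limits<double>::infinity())

// Hypothetical helper so the static_assert below only fires when the
// unsupported branch is actually instantiated.
template <typename>
inline constexpr bool dependent_false_v = false;

// Host-side analogue of the float branch of sentinel() in the diff.
template <typename T>
T max_sentinel() {
    if constexpr (std::is_same_v<T, float>) {
        // The cast makes the operand a float before negation, so the
        // return statement involves no implicit double-to-float narrowing.
        return -(float)DOUBLE_INFINITY;
    } else {
        static_assert(dependent_false_v<T>, "Unsupported type for max sentinel");
    }
}

// Without the cast the expression is a double; with it, a float.
static_assert(std::is_same_v<decltype(-DOUBLE_INFINITY), double>,
              "uncast expression has type double");
static_assert(std::is_same_v<decltype(-(float)DOUBLE_INFINITY), float>,
              "cast expression has type float");
```

Calling `max_sentinel<float>()` yields negative infinity as a float, which compares below every finite float and so works as the identity element for a max reduction. The half2 branch of the diff follows the same idea but also converts each component with `__float2half`, which needs the CUDA headers and is therefore left out of this sketch.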
