Commit f166d18

Reset CUDA error after cudaFuncSetAttribute (#504)
This PR fixes a bug that occurs when `cudaFuncSetAttribute` is used to check whether enough shmem is available to fit the HLL sketch. If the sketch doesn't fit, the CUDA error returned by the function is sticky and may resurface in downstream calls, e.g., in Thrust. The fix is to flush the CUDA error by calling `cudaGetLastError` right after the call to `cudaFuncSetAttribute`. Big thanks to @gevtushenko for tracking down this bug.
1 parent 03304a3 commit f166d18

1 file changed

Lines changed: 6 additions & 3 deletions

include/cuco/detail/hyperloglog/hyperloglog_ref.cuh

@@ -550,9 +550,12 @@ class hyperloglog_ref {
   [[nodiscard]] __host__ constexpr bool try_reserve_shmem(Kernel kernel,
                                                           int shmem_bytes) const noexcept
   {
-    return cudaSuccess == cudaFuncSetAttribute(reinterpret_cast<void const*>(kernel),
-                                               cudaFuncAttributeMaxDynamicSharedMemorySize,
-                                               shmem_bytes);
+    bool const ret =
+      cudaSuccess == cudaFuncSetAttribute(reinterpret_cast<void const*>(kernel),
+                                          cudaFuncAttributeMaxDynamicSharedMemorySize,
+                                          shmem_bytes);
+    cudaGetLastError();  // flush CUDA error
+    return ret;
   }
 
   hasher hash_;  ///< Hash function used to hash items
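
To illustrate why the extra `cudaGetLastError()` call matters, here is a minimal host-only sketch that models the runtime's "last error" slot in plain C++ so it can run without a GPU. It is a toy model, not the CUDA runtime and not cuco code: the `toy_runtime` namespace, the `Status` enum, the 48 KiB limit, and every function name in it are assumptions made up for this example. Only the reset-on-read behavior of `cudaGetLastError` and the non-resetting behavior of `cudaPeekAtLastError` are taken from the real API.

```cpp
#include <cstddef>

// Hypothetical stand-in for cudaError_t; not part of CUDA or cuco.
enum class Status { success, invalid_value };

namespace toy_runtime {
// Models the runtime's record of the most recent failure.
Status last_error = Status::success;

// Arbitrary limit chosen for this model only.
constexpr std::size_t max_shmem_bytes = 48 * 1024;

// Models a failing attribute call: returns the error AND records it.
Status set_max_dynamic_shmem(std::size_t bytes)
{
  if (bytes > max_shmem_bytes) {
    last_error = Status::invalid_value;
    return Status::invalid_value;
  }
  return Status::success;
}

// Like cudaGetLastError: returns the recorded error and resets it.
Status get_last_error()
{
  Status const e = last_error;
  last_error     = Status::success;
  return e;
}

// Like cudaPeekAtLastError: returns the recorded error without resetting it.
Status peek_at_last_error() { return last_error; }
}  // namespace toy_runtime

// Shape of the code before the fix: the failure stays recorded.
bool try_reserve_leaky(std::size_t bytes)
{
  return toy_runtime::set_max_dynamic_shmem(bytes) == Status::success;
}

// Shape of the code after the fix: same result, recorded error flushed.
bool try_reserve_flushed(std::size_t bytes)
{
  bool const ok = toy_runtime::set_max_dynamic_shmem(bytes) == Status::success;
  (void)toy_runtime::get_last_error();  // flush the recorded error
  return ok;
}

// Models a downstream library that checks for a pending error.
bool downstream_call_succeeds()
{
  return toy_runtime::peek_at_last_error() == Status::success;
}
```

In this model, a failed `try_reserve_leaky` makes a later `downstream_call_succeeds()` report failure even though the caller already handled the `false` return value, while `try_reserve_flushed` returns the same result and leaves no pending error behind.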
