Commit 64ac9ab
authored
CUDA : Fix CUB's argsort when nrows % block_size == 0 CCCL < 3.1 (#21181)
* CUDA: Fix CUB's argsort when nrows % block_size == 0 CCCL < 3.1
We wrongly calculated offset_grid as `ceildiv(nrows, block_size)`,
while it must be `ceildiv(nrows + 1, block_size)`. As a consequence, we
had uninitialized values in `offset_iterator[nrows]` for the case when
`nrows % block_size == 0`.
Fixes #21162
* Reduce nrows in test case to 256, don't need 7681 parent cad2d38 commit 64ac9ab
2 files changed
Lines changed: 5 additions & 2 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
47 | 47 | | |
48 | 48 | | |
49 | 49 | | |
50 | | - | |
| 50 | + | |
| 51 | + | |
| 52 | + | |
51 | 53 | | |
52 | | - | |
| 54 | + | |
53 | 55 | | |
54 | 56 | | |
55 | 57 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
8424 | 8424 | | |
8425 | 8425 | | |
8426 | 8426 | | |
| 8427 | + | |
8427 | 8428 | | |
8428 | 8429 | | |
8429 | 8430 | | |
| |||
0 commit comments