Commit 7dd7bc1
Implement CUDA multipass for knn > GPU_MAX_SELECTION_K (#7381)
KNN search on the GPU fails silently when k is larger than the macro GPU_MAX_SELECTION_K, producing garbage output (all zeros, indices larger than the total number of points, or even negative indices). GPU_MAX_SELECTION_K is 2048 if CUDA_VERSION > 9000 and 1024 otherwise. The CPU KNN search has no such limit. To support larger k on the GPU without changing GPU_MAX_SELECTION_K, this commit implements a multipass algorithm that splits the KNN search into batches, each of size < GPU_MAX_SELECTION_K.

1 parent 55a51af commit 7dd7bc1
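The batching idea described in the commit message can be illustrated with a small CPU-side sketch. This is not the CUDA code from the commit: the names `MAX_SELECTION_K`, `select_k_limited`, and `knn_multipass` are hypothetical, the limit is shrunk to 4 so the passes are easy to follow, and the pass strategy (each pass selects the next-nearest candidates beyond the last one already returned) is an assumption about one way such a multipass search could work.

```python
import heapq

# Hypothetical stand-in for GPU_MAX_SELECTION_K (2048 or 1024 in the commit
# message); kept tiny here so that several passes are needed.
MAX_SELECTION_K = 4


def select_k_limited(keys, k):
    """Stand-in for a selection primitive that only supports k < MAX_SELECTION_K."""
    if k >= MAX_SELECTION_K:
        raise ValueError("k must be smaller than MAX_SELECTION_K")
    return heapq.nsmallest(k, keys)


def knn_multipass(points, query, k):
    """Return (indices, squared distances) of the k nearest points to query."""
    # One (squared distance, index) key per point; the index makes keys unique,
    # so ties in distance are broken deterministically.
    keys = [
        (sum((p - q) ** 2 for p, q in zip(point, query)), i)
        for i, point in enumerate(points)
    ]
    k = min(k, len(keys))
    batch = MAX_SELECTION_K - 1  # largest k a single pass may request
    result = []
    last = None  # key of the farthest neighbor selected so far
    while len(result) < k:
        # Only consider candidates strictly beyond what earlier passes returned.
        remaining = keys if last is None else [key for key in keys if key > last]
        chunk = select_k_limited(remaining, min(batch, k - len(result)))
        result.extend(chunk)
        last = chunk[-1]
    return [i for _, i in result], [d for d, _ in result]


if __name__ == "__main__":
    pts = [(float(i), 0.0) for i in range(10)]
    print(knn_multipass(pts, (0.0, 0.0), 8))
```

With `MAX_SELECTION_K = 4`, a request for k = 8 is served in three passes of 3, 3, and 2 neighbors, and the concatenated result matches a single brute-force sort by distance.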
4 files changed
Lines changed: 433 additions & 49 deletions