Commit 5f15f19
Speed up recall calculation in cuVS Bench for large top-K (#1816)
Currently, recall calculation in cuVS Bench runs an outer loop over the `k` ground-truth vector IDs and an inner loop over the `k` ANN result vector IDs, incrementing a counter whenever a result ID matches a ground-truth ID. This works well when `k` is small, but the complexity is `O(k^2)` per query.
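For illustration, the nested-loop approach described above might look like the following minimal sketch. The function name `count_matches_nested`, the `std::int64_t` ID type, and the early `break` are assumptions made for this example, not the actual cuVS Bench code.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: counts how many ground-truth IDs for one query also
// appear in the ANN results. Worst case is k * k comparisons per query.
std::size_t count_matches_nested(const std::vector<std::int64_t>& ground_truth,
                                 const std::vector<std::int64_t>& results)
{
  std::size_t matches = 0;
  for (std::int64_t gt_id : ground_truth) {  // outer loop: k ground-truth IDs
    for (std::int64_t res_id : results) {    // inner loop: k ANN result IDs
      if (gt_id == res_id) {
        ++matches;
        break;  // assumption: count each ground-truth ID at most once
      }
    }
  }
  return matches;
}
```

Dividing the returned count by `k` gives the recall for that query.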
When benchmarking use cases with large `k` values, the recall calculation becomes a bottleneck. A large `k` does not necessarily make search much slower, so recall is calculated about as many times as it would be for a small `k`, leading to unacceptable (or at least humanly unbearable) run times.
This update speeds up the recall calculation in the following ways:
1. Eager hashing of vector IDs
- During dataset construction, we populate a `std::unordered_map` of {vector_id, neighbor_rank} for each query. This step has complexity `O(k)` per query, and the hash maps are cached for all benchmark cases.
- During search, we look up each search result ID in the ground truth map to determine whether it is a true neighbor. This step has complexity `O(k)` per query.
2. Parallelizing hash map build and lookup
- We use basic threading to parallelize recall calculation at the query level (for ease of implementation and cache locality).
- Care is taken to avoid oversubscribing the CPU when benchmarking runs on multiple threads, e.g. in throughput mode.
3. Capping the total number of queries for which recall is calculated to about 10,000
- This avoids unbounded recall calculations if using large sets of queries and ground truths while performing many iterations of the benchmark case.
- The underlying assumption is that the sample of queries used for recall calculation is representative of the recall performance of the benchmark case tested.
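The three steps above can be sketched together as follows. This is a simplified illustration, not the code from this PR: the names `build_ground_truth_maps` and `recall_hashed`, the `std::int64_t` ID type, and the `max_queries` and `n_threads` parameters are all hypothetical. It also assumes result IDs are unique within a query and leaves the thread count to the caller, who would pick it so that the CPU is not oversubscribed.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <thread>
#include <unordered_map>
#include <vector>

using IdList  = std::vector<std::int64_t>;
using RankMap = std::unordered_map<std::int64_t, std::size_t>;  // {vector_id, neighbor_rank}

// Step 1: build one map per query from its ground-truth list, O(k) per query.
// The caller is expected to build these once and reuse them (the "cache").
std::vector<RankMap> build_ground_truth_maps(const std::vector<IdList>& ground_truth)
{
  std::vector<RankMap> maps(ground_truth.size());
  for (std::size_t q = 0; q < ground_truth.size(); ++q) {
    maps[q].reserve(ground_truth[q].size());
    for (std::size_t rank = 0; rank < ground_truth[q].size(); ++rank) {
      maps[q].emplace(ground_truth[q][rank], rank);
    }
  }
  return maps;
}

// Steps 2 and 3: O(k) lookups per query, queries split into contiguous chunks
// across threads, and the number of scored queries capped at max_queries.
double recall_hashed(const std::vector<RankMap>& gt_maps,
                     const std::vector<IdList>& results,
                     std::size_t k,
                     std::size_t max_queries = 10000,
                     unsigned n_threads      = 4)
{
  const std::size_t n_queries = std::min({gt_maps.size(), results.size(), max_queries});
  if (n_queries == 0 || k == 0) { return 0.0; }
  const std::size_t workers =
    std::max<std::size_t>(1, std::min<std::size_t>(n_threads, n_queries));

  std::vector<std::size_t> partial(workers, 0);  // one slot per thread, so no locking
  std::vector<std::thread> pool;
  for (std::size_t t = 0; t < workers; ++t) {
    pool.emplace_back([&, t] {
      const std::size_t begin = t * n_queries / workers;
      const std::size_t end   = (t + 1) * n_queries / workers;
      std::size_t local       = 0;
      for (std::size_t q = begin; q < end; ++q) {
        const std::size_t n = std::min(k, results[q].size());
        for (std::size_t i = 0; i < n; ++i) {
          if (gt_maps[q].count(results[q][i]) != 0) { ++local; }
        }
      }
      partial[t] = local;
    });
  }
  for (auto& th : pool) { th.join(); }

  std::size_t total = 0;
  for (std::size_t m : partial) { total += m; }
  return static_cast<double>(total) / static_cast<double>(n_queries * k);
}
```

Each thread writes only to its own slot in `partial`, which keeps the sketch free of locks. The stored `neighbor_rank` is not needed for a plain recall count; it is kept here only to mirror the map layout described above.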
Testing at k=15000, batch-size=500, iterations=20, cpu=AMD EPYC 7413 24 cores/48 threads:
- baseline wall time: 285 s
- improved wall time: 3.7 s
- Note that the wall time includes loading and running the benchmarks, which takes over 1 s for these settings.
- Also note that if the number of iterations is not specified, the benchmark runs for over 100 iterations. The baseline would then be much slower, because it calculates recall for far more than 10,000 queries.
- At k=10, wall times are 1.369 s (PR) vs. 1.362 s (baseline)
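To put the k=15000 numbers in perspective, the sketch below estimates raw operation counts for the test settings. It is a back-of-the-envelope model only: the `RecallCost` struct and `estimate_cost` function are hypothetical, the nested-loop count is a worst case, and hash map construction, constant factors, and cache effects are ignored.

```cpp
#include <cstdint>

// Rough operation counts; not a timing model.
struct RecallCost {
  std::uint64_t queries;          // queries scored for recall
  std::uint64_t nested_compares;  // worst case for nested loops: k * k per query
  std::uint64_t hash_lookups;     // hashed approach: k lookups per query
};

constexpr RecallCost estimate_cost(std::uint64_t k,
                                   std::uint64_t batch_size,
                                   std::uint64_t iterations)
{
  const std::uint64_t queries = batch_size * iterations;
  return RecallCost{queries, queries * k * k, queries * k};
}

// Settings from the test above: k=15000, batch-size=500, iterations=20.
constexpr RecallCost kTestCase = estimate_cost(15000, 500, 20);
```

Under these assumptions, `kTestCase.queries` is 10,000, the nested loops perform up to 2.25e12 comparisons, and the hashed approach performs 1.5e8 lookups. The measured wall times (285 s vs. 3.7 s) correspond to roughly a 77x end-to-end speedup, which also includes the fixed cost of loading and running the benchmark.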
Authors:
- James Xia (https://github.com/jamxia155)
- Anupam (https://github.com/aamijar)
Approvers:
- Artem M. Chirkin (https://github.com/achirkin)
URL: #1816
1 parent 74681b5, commit 5f15f19
2 files changed
Lines changed: 158 additions & 52 deletions