Add support for FP16/INT8 on GPU indexes#1183
Conversation
Signed-off-by: Mickael Ide <mide@nvidia.com>
Signed-off-by: Mickael Ide <mide@nvidia.com>
Signed-off-by: Mickael Ide <mide@nvidia.com>
Signed-off-by: Mickael Ide <mide@nvidia.com>
|
@lowener 🔍 Important: PR Classification Needed! For efficient project management and a seamless review process, it's essential to classify your PR correctly. Here's how:
For any PR outside the kind/improvement category, ensure you link to the associated issue using the format: “issue: #”. Thanks for your efforts and contribution to the community!. |
Signed-off-by: Mickael Ide <mide@nvidia.com>
| set(CUVS_VERSION "${RAPIDS_VERSION}") | ||
| set(CUVS_FORK "rapidsai") | ||
| set(CUVS_PINNED_TAG "branch-25.02") | ||
| set(CUVS_PINNED_TAG "branch-25.06") |
There was a problem hiding this comment.
has cuvs 25.06 been released already? If not, then is it safe to use it already?
There was a problem hiding this comment.
The official release for 25.06 is the 5th of June. So it will be safe to use from that day on.
| auto k = neighbors.extent(1); | ||
| auto k_tmp = k + k_offset; | ||
| if (refine_ratio > 1.0f) { | ||
| bool do_refine_step = refine_ratio > 1.0f; |
There was a problem hiding this comment.
is the refine index type the same as the coarse index type? If not, then 1.0f refine_ratio value is acceptable: imagine coarse index being fp16 and the fine index being fp32
There was a problem hiding this comment.
The refinement step will narrow the n_candidates neighbors selected by the ANN search down to k neighbors. If the refine_ratio is equal to 1.0 then there is no need to narrow it down because n_candidates == k
|
|
||
| if constexpr (index_kind == cuvs_proto::cuvs_index_kind::brute_force) { | ||
| if (k_tmp > k) { | ||
| for (auto i = 0; i < host_ids.extent(0); ++i) { |
There was a problem hiding this comment.
why is this logic no longer needed?
There was a problem hiding this comment.
This logic was introduced for post-filtering the brute-force algorithm. It now supports pre-filtering so that's why it is not required anymore.
I found also some comments left at the time it was introduced detailing it: #286 (comment)
Signed-off-by: Mickael Ide <mide@nvidia.com>
|
/lgtm |
|
issue #1201 |
|
issue: #1201 |
|
/kind improvement |
|
/approve |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: alexanderguzhva, lowener The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
issue: #1201