Add UDF Usage and Developer docs by divyegala · Pull Request #2030 · rapidsai/cuvs

divyegala · 2026-04-16T02:17:13Z

dantegd

Nice PR overall, just had a few comments

dantegd · 2026-04-16T20:40:23Z

+* Include ``<cuvs/neighbors/ivf_flat.hpp>`` and define a metric with ``CUVS_METRIC(MyName, { ... })``.
+  Set ``search_params.metric_udf`` to the string returned by ``MyName_udf()``.
+* Prefer the helpers documented next to the macro (``squared_diff``, ``abs_diff``, ``dot_product``,
+  ``point`` element access, and so on) so the same definition works across ``float``, ``int8_t`` /


"and so on" kinda leaves the user guessing, almost like "left as an exercise to the reader". A user writing a custom metric needs to know exactly what's available, a brief table listing all the helpers (name, signature, description) or at minimum a cross-reference to the header line numbers where they're defined would be helpful here.

dantegd · 2026-04-16T20:53:30Z

+   CUVS_METRIC(my_chebyshev, {
+     auto d = abs_diff(x, y);
+     acc    = (d > acc) ? d : acc;
+   })


Small thing, but this differs from the example in ivf_flat.hpp where we don't use the helper, would be good to make them match

dantegd · 2026-04-16T20:57:00Z

 } // namespace example::detail
 ```

+### Step 7b: Example — NVRTC UDFs for `compute_distance` and `apply_filter`


This is clever, however, the explanatory text leading into the code snippets is a really large wall of text that mixes:

Architectural rationale (why NVRTC and not static fatbins)

Implementation constraints (explicit instantiations, forwarding definitions)

Usage instructions (which planner API to call)

Which makes it really hard to read and follow, particularly for someone that has never used NVRTC or familiar with the UDF infra at all.

Consider restructuring into something like:

What you're building (1-2 sentences)

Architecture diagram or flow (entry kernel → forward declarations → NVRTC TU → explicit instantiations → planner registration)

Code with inline comments

Pitfalls/constraints as a separate callout

dantegd · 2026-04-16T20:57:55Z

+  void add_metric_udf_fragment(std::unique_ptr<UDFFatbinFragment> fragment)
+  {
+    add_fragment(std::move(fragment));
+  }
+
+  void add_filter_udf_fragment(std::unique_ptr<UDFFatbinFragment> fragment)
+  {
+    add_fragment(std::move(fragment));
+  }


These do the same and is just semantic sugar, no? Would be good to add a comment to state so for readability and documentation.

dantegd · 2026-04-16T20:59:06Z

+template <DistanceType Metric>
+constexpr auto get_metric_tag() {
+  if constexpr (Metric == DistanceType::Euclidean) return tag_metric_euclidean{};
+  else if constexpr (Metric == DistanceType::MetricUdf) return tag_metric_custom_udf{};
+}
+
+template <FilterType Filter>
+constexpr auto get_filter_tag() {
+  if constexpr (Filter == FilterType::None) return tag_filter_none{};
+  else if constexpr (Filter == FilterType::FilterUdf) return tag_filter_custom_udf{};
+}


I think this is similar to a comment I had in the UDF infra PR, if a new enum value is added later but these functions aren't updated, the compiler will give a confusing "no return" error rather than a clear message. Adding an else { static_assert(...); } fallback would be a good idea. Since this is documentation/example code, arguably it's even more important here to model best practices.

dantegd · 2026-04-16T21:00:40Z

+enum class DistanceType { Euclidean, MetricUdf };
+enum class FilterType { None, FilterUdf };


Suggested change

enum class DistanceType { Euclidean, MetricUdf };

enum class FilterType { None, FilterUdf };

// Extend the DistanceType / FilterType enums from Step 7:

enum class DistanceType { Euclidean, MetricUdf };

enum class FilterType { None, FilterUdf };

Small comment that might help readability

udf docs

5b13cb7

divyegala self-assigned this Apr 16, 2026

divyegala requested a review from a team as a code owner April 16, 2026 02:17

divyegala added doc Improvements or additions to documentation non-breaking Introduces a non-breaking change labels Apr 16, 2026

github-project-automation Bot added this to Unstructured Data Processing Apr 16, 2026

dantegd requested changes Apr 16, 2026

View reviewed changes

divyegala changed the title ~~Add UDF Usage and Devloper docs~~ Add UDF Usage and Developer docs Apr 16, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add UDF Usage and Developer docs#2030

Add UDF Usage and Developer docs#2030
divyegala wants to merge 1 commit intorapidsai:mainfrom
divyegala:udf-docs

divyegala commented Apr 16, 2026

Uh oh!

dantegd left a comment

Uh oh!

dantegd Apr 16, 2026

Uh oh!

dantegd Apr 16, 2026

Uh oh!

dantegd Apr 16, 2026

Uh oh!

dantegd Apr 16, 2026

Uh oh!

dantegd Apr 16, 2026

Uh oh!

dantegd Apr 16, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

		enum class DistanceType { Euclidean, MetricUdf };
		enum class FilterType { None, FilterUdf };

Conversation

divyegala commented Apr 16, 2026

Uh oh!

dantegd left a comment

Choose a reason for hiding this comment

Uh oh!

dantegd Apr 16, 2026

Choose a reason for hiding this comment

Uh oh!

dantegd Apr 16, 2026

Choose a reason for hiding this comment

Uh oh!

dantegd Apr 16, 2026

Choose a reason for hiding this comment

Uh oh!

dantegd Apr 16, 2026

Choose a reason for hiding this comment

Uh oh!

dantegd Apr 16, 2026

Choose a reason for hiding this comment

Uh oh!

dantegd Apr 16, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants