[REVIEW] Rewrite cuvs-sys build to discover pre-installed cuVS via cmake-package#2022
[REVIEW] Rewrite cuvs-sys build to discover pre-installed cuVS via cmake-package#2022yan-zaretskiy wants to merge 8 commits intorapidsai:mainfrom
Conversation
- Current build process works by manual CMake invocation, which is suboptimal. The cmake-package is an official Rust crate meant to be used in the build scripts for this purpose. The new scripts attempts the discovery in existing library path, via pip, or CMAKE_PREFIX_PATH - added dlpack to rust conda environment for bindgen - Added pre-generated bindings (src/bindings.rs) plus an optional "generate-bindings" cargo feature for opt-in regeneration via bindgen. To support this build scenario, also added the generate-bindings.sh script for CI staleness verification. This would allow us to build cuvs on machines that have no CUDA or cmake, e.g. docs.rs - Moved cuvs-sys dependency to [workspace.dependencies] to deduplicate version - Updated Rust edition to 2024 - "build.sh rust" now checks bindings staleness; the publish CI workflow no longer needs LIBCLANG_PATH
| find_package("dlpack").find().map_err(|e| match e { | ||
| CmakeError::CMakeNotFound | CmakeError::UnsupportedCMakeVersion => { | ||
| anyhow::anyhow!( | ||
| "CMake is not installed or too old (3.19+ required). Install CMake and try again." |
There was a problem hiding this comment.
Minimum required CMake version is 3.30.4. You shouldn't hard-code the version here at all as it is subject to change.
There was a problem hiding this comment.
Updated cuvs-sys/build.rs to it doesn't hard-code a CMake version in the error message.
|
|
||
| // Pip installs cmake configs under site-packages/libcuvs/lib64/cmake/cuvs/. | ||
| // If there's no pip cuVS, report the original system error. | ||
| let pip_cmake_dir = match pip_cuvs_cmake_dir() { |
There was a problem hiding this comment.
All searches for non explicit paths should be opt-in. So the user should be required to ask for pip version
instead of automatically searching for it
There was a problem hiding this comment.
Changed this to opt-in. Python discovery only runs when the LIBCUVS_USE_PYTHON=1 env var is set. Otherwise cuvs-sys uses the standard CMake discovery path. I also verified the Python discovery mechanism in a fresh venv with a pip-installed libcuvs wheel.
| } | ||
|
|
||
| fn find_cuvs_package(prefix: Option<PathBuf>) -> Result<cmake_package::CMakePackage> { | ||
| let mut builder = find_package("cuvs").components([CUVS_COMPONENT.to_owned()]); |
There was a problem hiding this comment.
We should explicitly specify the version of CUVS we want Cmake to find. Otherwise you could be finding CUVS 26.02 while using the rust bindings logic from 26.08+
There was a problem hiding this comment.
This has already been working as requested. I refactored the code to make this version checking more obvious when reading the code, it was buried too deep. The build script requires the underlying libcuvs_c's version to exactly match the crate version.
benfred
left a comment
There was a problem hiding this comment.
nice!
I think that checking in the generated bindings is definitely the way to go here - since this removes the need for clang etc to build the rust bindings, which should make building the rust code easier for your users. As part of this, I'd at least consider removing the clang dependencies from dependencies.yaml though -
| } | ||
| // Bindings are pre-generated and checked in at src/bindings.rs. | ||
| // Use `rust/scripts/generate-bindings.sh` to regenerate them. | ||
| include!("bindings.rs"); |
There was a problem hiding this comment.
Given that the bindings are checked in now - maybe we should go something like pub use bindings::* instead of including via the include! macro ? (or just have the generate_bindings.sh script write out this entire file instead ? )
There was a problem hiding this comment.
Switched this to a normal module re-export in rust/cuvs-sys/src/lib.rs, now it uses mod bindings; pub use bindings::*. We don't want the script to write the lib.rs directly, because we can't get crate level docs this way.
# Conflicts: # rust/cuvs/src/cagra/index.rs
a1bad50 to
b92301f
Compare
b92301f to
d502dcb
Compare
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughReplace CMake-based cuvs-sys packaging with workspace-managed discovery and checked-in bindings; add a bindings generator; bump Rust edition to 2024; remove Clang pins from Conda specs; update CI to CUDA 13.1.1 and drop LIBCLANG_PATH exports; assorted build script and formatting/import refinements across Rust sources. Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes 🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (1)
rust/cuvs-sys/build.rs (1)
77-78:⚠️ Potential issue | 🟠 MajorConstrain CMake discovery to the required cuVS version, not just post-validate it.
This still calls an unconstrained
find_package("cuvs")and only checksPACKAGE_VERSIONafter a package has already been selected. If an older cuVS appears earlier inCMAKE_PREFIX_PATH, the build can fail even when the exact required version is installed later in the search path.Please verify the
cmake-packageAPI for exact-version lookup and passrequired_versioninto discovery if supported.cmake-package crate Rust API find_package exact version requirementRead-only repo check:
#!/bin/bash # Expect: cuVS discovery passes the required package version into the CMake/package lookup, # not only into ensure_exact_cuvs_version after discovery. rg -n -C4 'find_package\("cuvs"\)|ensure_exact_cuvs_version|required_version' rust/cuvs-sys/build.rsAlso applies to: 125-128
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@rust/cuvs-sys/build.rs` around lines 77 - 78, The current find_cuvs_package implementation calls find_package("cuvs") without constraining the version and only validates PACKAGE_VERSION later in ensure_exact_cuvs_version; change it to pass the required version into the cmake-package discovery API (e.g. use the find_package variant that accepts a required_version or required flag) so the lookup itself requires the exact cuVS version, not just post-validated; update find_cuvs_package (and the other discovery call around ensure_exact_cuvs_version) to supply the required_version (derived from your CUVS version constant) to the cmake_package::find_package invocation and remove reliance on selecting an unconstrained package before checking PACKAGE_VERSION.
🧹 Nitpick comments (1)
rust/cuvs/build.rs (1)
8-17: Clean rewrite; consider a small note on single-path assumption.Switching from parsing
DEP_CUVS_CMAKE_LINKER_FLAGSto a direct-Wl,-rpath=$DEP_CUVS_LIBand removing the unconditionalunwrap()is a nice simplification and makesdocs.rs/CUDA-less builds robust. This assumesDEP_CUVS_LIBcontains a single directory containinglibcuvs/libcuvs_cand that any transitive runtime dependencies (e.g., CUDA runtime) are discoverable via system paths orrustc-link-searchemitted elsewhere — which matches the newcuvs-syscmake-package flow. Worth a one-line comment here documenting that contract so a future contributor doesn’t reintroduce multi-path parsing.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@rust/cuvs/build.rs` around lines 8 - 17, Add a one-line comment above add_runtime_search_path (or inside main) stating that DEP_CUVS_LIB is expected to be a single directory containing libcuvs/libcuvs_c and that transitive runtime deps (e.g., CUDA) must be resolvable via system paths or other rustc-link-search emissions; keep the simplified behavior of printing cargo:rustc-link-arg=-Wl,-rpath={lib_path} and the removed unwrap().
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@rust/scripts/generate-bindings.sh`:
- Around line 115-126: The current lookup for generated_file relies on GNU-only
find -printf and a hardcoded debug path; change it to search the entire
${target_dir} tree portably and pick the newest match: use find starting at
"${target_dir}" with -type f -name 'cuvs_bindings.rs' and -path
'*/cuvs-sys-*/out/*' and -print0, pipe to xargs -0 ls -1t (or ls -t) and take
the first line to set generated_file, preserving the existing existence check;
update the block that defines generated_file to this portable pipeline so it
works on BSD/macOS and for cross-compilation targets.
---
Duplicate comments:
In `@rust/cuvs-sys/build.rs`:
- Around line 77-78: The current find_cuvs_package implementation calls
find_package("cuvs") without constraining the version and only validates
PACKAGE_VERSION later in ensure_exact_cuvs_version; change it to pass the
required version into the cmake-package discovery API (e.g. use the find_package
variant that accepts a required_version or required flag) so the lookup itself
requires the exact cuVS version, not just post-validated; update
find_cuvs_package (and the other discovery call around
ensure_exact_cuvs_version) to supply the required_version (derived from your
CUVS version constant) to the cmake_package::find_package invocation and remove
reliance on selecting an unconstrained package before checking PACKAGE_VERSION.
---
Nitpick comments:
In `@rust/cuvs/build.rs`:
- Around line 8-17: Add a one-line comment above add_runtime_search_path (or
inside main) stating that DEP_CUVS_LIB is expected to be a single directory
containing libcuvs/libcuvs_c and that transitive runtime deps (e.g., CUDA) must
be resolvable via system paths or other rustc-link-search emissions; keep the
simplified behavior of printing cargo:rustc-link-arg=-Wl,-rpath={lib_path} and
the removed unwrap().
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: 27326598-c2a4-4bdd-95c8-5a0428266560
📒 Files selected for processing (41)
.github/workflows/publish-rust.yaml.pre-commit-config.yamlci/build_rust.shci/release/update-version.shconda/environments/rust_cuda-129_arch-aarch64.yamlconda/environments/rust_cuda-129_arch-x86_64.yamlconda/environments/rust_cuda-131_arch-aarch64.yamlconda/environments/rust_cuda-131_arch-x86_64.yamldependencies.yamlrust/Cargo.tomlrust/cuvs-sys/CMakeLists.txtrust/cuvs-sys/Cargo.tomlrust/cuvs-sys/RAPIDS_BRANCHrust/cuvs-sys/VERSIONrust/cuvs-sys/build.rsrust/cuvs-sys/get_dlpack.cmakerust/cuvs-sys/src/bindings.rsrust/cuvs-sys/src/lib.rsrust/cuvs/Cargo.tomlrust/cuvs/build.rsrust/cuvs/examples/cagra.rsrust/cuvs/src/brute_force.rsrust/cuvs/src/cagra/index.rsrust/cuvs/src/cagra/index_params.rsrust/cuvs/src/cagra/search_params.rsrust/cuvs/src/cluster/kmeans/mod.rsrust/cuvs/src/cluster/kmeans/params.rsrust/cuvs/src/distance/mod.rsrust/cuvs/src/dlpack.rsrust/cuvs/src/ivf_flat/index.rsrust/cuvs/src/ivf_flat/index_params.rsrust/cuvs/src/ivf_flat/search_params.rsrust/cuvs/src/ivf_pq/index.rsrust/cuvs/src/ivf_pq/index_params.rsrust/cuvs/src/ivf_pq/search_params.rsrust/cuvs/src/lib.rsrust/cuvs/src/resources.rsrust/cuvs/src/vamana/index.rsrust/cuvs/src/vamana/index_params.rsrust/rustfmt.tomlrust/scripts/generate-bindings.sh
💤 Files with no reviewable changes (8)
- rust/cuvs-sys/VERSION
- rust/cuvs-sys/RAPIDS_BRANCH
- conda/environments/rust_cuda-131_arch-aarch64.yaml
- rust/cuvs-sys/get_dlpack.cmake
- conda/environments/rust_cuda-131_arch-x86_64.yaml
- conda/environments/rust_cuda-129_arch-aarch64.yaml
- conda/environments/rust_cuda-129_arch-x86_64.yaml
- rust/cuvs-sys/CMakeLists.txt
|
The bot made a comment to "constrain CMake discovery to the required cuVS version, not just post-validate it." This does not work (that's what I did in the beginning), because the cmake handles the version requirement as a lower bound. But we want the strict equality check, hence I had to enforce it in the build script itself. |
06c22b0 to
099495b
Compare
There was a problem hiding this comment.
🧹 Nitpick comments (1)
dependencies.yaml (1)
261-281: Optional: deduplicate the arch matrices shared withrapids_build.The
specificblock (lines 269-281) is byte-identical torapids_build'sspecificblock (lines 248-260). However, the proposed anchor solution using&rapids_build_toolchain_specificat thespecific:level is not a standard pattern in other RAPIDS repositories—official repos (rapids-metadata, cucim, raft, cuml, cudf) do not rely on YAML anchors in dependencies.yaml. While standard YAML parsers would technically resolve such anchors, rapids-dependency-file-generator's schema expects explicit lists rather than aliases at thespecific:level, and this pattern lacks documented support. If proceeding with deduplication, test thoroughly after running rapids-dependency-file-generator to ensure the generated conda environments are correct, or consider an alternative approach (e.g., manual code comments linking the duplicated sections).🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@dependencies.yaml` around lines 261 - 281, The duplicated arch/matrix list appears in rust_build.specific and rapids_build.specific; do not replace it with a nonstandard YAML anchor at the top-level `specific:` (e.g., &rapids_build_toolchain_specific) because rapids-dependency-file-generator expects explicit lists; either keep the byte-identical `specific:` blocks as-is or, if you must deduplicate, move the anchor to the inner list level that exactly matches the generator's schema (anchor the matrix list itself), update both `rust_build` and `rapids_build` to reference that anchored list, and then run the rapids-dependency-file-generator and full conda env generation tests to verify correctness before merging.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@dependencies.yaml`:
- Around line 261-281: The duplicated arch/matrix list appears in
rust_build.specific and rapids_build.specific; do not replace it with a
nonstandard YAML anchor at the top-level `specific:` (e.g.,
&rapids_build_toolchain_specific) because rapids-dependency-file-generator
expects explicit lists; either keep the byte-identical `specific:` blocks as-is
or, if you must deduplicate, move the anchor to the inner list level that
exactly matches the generator's schema (anchor the matrix list itself), update
both `rust_build` and `rapids_build` to reference that anchored list, and then
run the rapids-dependency-file-generator and full conda env generation tests to
verify correctness before merging.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: 5bdc7dc3-64ae-4eae-a634-62890975bb5b
📒 Files selected for processing (6)
conda/environments/rust_cuda-129_arch-aarch64.yamlconda/environments/rust_cuda-129_arch-x86_64.yamlconda/environments/rust_cuda-131_arch-aarch64.yamlconda/environments/rust_cuda-131_arch-x86_64.yamldependencies.yamlrust/cuvs-sys/Cargo.toml
💤 Files with no reviewable changes (4)
- conda/environments/rust_cuda-131_arch-x86_64.yaml
- conda/environments/rust_cuda-129_arch-aarch64.yaml
- conda/environments/rust_cuda-129_arch-x86_64.yaml
- conda/environments/rust_cuda-131_arch-aarch64.yaml
5b6511f to
b604559
Compare
There was a problem hiding this comment.
🧹 Nitpick comments (1)
dependencies.yaml (1)
261-282: Consider reusingrapids_buildinstead of duplicating it asrust_build.The new
rust_buildgroup is essentially a copy ofrapids_build(lines 237–260) restricted tooutput_types: [conda]. Since therustfiles block (line 88) emits conda only, the packages materialized byrapids_buildandrust_buildare identical for this use case — the same cmake (*cmake_ver),c-compiler,cuda-nvcc,cxx-compiler,ninja, and the same per-archgcc_linux-*/sysroot_linux-*entries.If the intent was simply to stop pulling
clang/libclanginto the Rust env, that could be achieved by keeping therapids_buildinclude on therustfiles block and dropping theclanginclude (which the AI summary indicates has already been done), without introducing a parallel group that must be kept in sync going forward (e.g., future bumps tocmake_ver, gcc 14.* → 15.*, sysroot 2.28 → 2.xx will need two edits).If a separate group is desired for clarity/isolation, consider at least documenting why it exists and diverges from
rapids_build, so the two don't drift silently.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@dependencies.yaml` around lines 261 - 282, The new rust_build group duplicates rapids_build (same packages: *cmake_ver, c-compiler, cuda-nvcc, cxx-compiler, ninja and per-arch gcc_linux-*/sysroot_linux-*) causing maintenance drift; either remove rust_build and reuse rapids_build in the rust files block (then drop the clang/libclang include there to avoid pulling them into the Rust env) or, if you must keep a separate group, add a clear comment/documentation above rust_build explaining why it diverges and which packages differ so future bumps (cmake_ver, gcc version, sysroot) are not missed.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@dependencies.yaml`:
- Around line 261-282: The new rust_build group duplicates rapids_build (same
packages: *cmake_ver, c-compiler, cuda-nvcc, cxx-compiler, ninja and per-arch
gcc_linux-*/sysroot_linux-*) causing maintenance drift; either remove rust_build
and reuse rapids_build in the rust files block (then drop the clang/libclang
include there to avoid pulling them into the Rust env) or, if you must keep a
separate group, add a clear comment/documentation above rust_build explaining
why it diverges and which packages differ so future bumps (cmake_ver, gcc
version, sysroot) are not missed.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: fae3749e-e5e6-4451-8763-754a458fd9ce
📒 Files selected for processing (6)
conda/environments/rust_cuda-129_arch-aarch64.yamlconda/environments/rust_cuda-129_arch-x86_64.yamlconda/environments/rust_cuda-131_arch-aarch64.yamlconda/environments/rust_cuda-131_arch-x86_64.yamldependencies.yamlrust/cuvs-sys/Cargo.toml
💤 Files with no reviewable changes (4)
- conda/environments/rust_cuda-131_arch-aarch64.yaml
- conda/environments/rust_cuda-129_arch-x86_64.yaml
- conda/environments/rust_cuda-131_arch-x86_64.yaml
- conda/environments/rust_cuda-129_arch-aarch64.yaml
🚧 Files skipped from review as they are similar to previous changes (1)
- rust/cuvs-sys/Cargo.toml
Delete stale rust build files in CI
b604559 to
23330f5
Compare
LIBCLANG_PATH