Skip to content

CAGRA Bloom Filter#2236

Open
divyegala wants to merge 2 commits into
rapidsai:mainfrom
divyegala:cagra-bloom-filter
Open

CAGRA Bloom Filter#2236
divyegala wants to merge 2 commits into
rapidsai:mainfrom
divyegala:cagra-bloom-filter

Conversation

@divyegala

Copy link
Copy Markdown
Member

No description provided.

@divyegala divyegala self-assigned this Jun 10, 2026
@divyegala divyegala added the feature request New feature or request label Jun 10, 2026
@divyegala divyegala requested review from a team as code owners June 10, 2026 23:34
@divyegala divyegala added the non-breaking Introduces a non-breaking change label Jun 10, 2026
@coderabbitai

coderabbitai Bot commented Jun 10, 2026

Copy link
Copy Markdown

Review Change Stack

📝 Walkthrough

Summary by CodeRabbit

  • New Features
    • Added bloom-filter-based filtering support to CAGRA nearest-neighbor search
    • Added a new bloom filter example for CAGRA
    • Added bloom-vs-bitset filtering benchmark and a plotting script to visualize results
  • Bug Fixes
    • Fixed filter type selection so UDF sample filters now return the correct type id in JIT selection
  • Chores
    • Added cuco library dependency to enable bloom filter functionality

Walkthrough

This PR integrates cuco bloom filters into CAGRA search by adding a new bloom_filter struct, compile-time type detection, device-side membership kernels, JIT fragment routing, explicit kernel instantiation, runtime dispatch via dynamic_cast, CMake dependency linking, and complete example and benchmark implementations demonstrating construction, search, and performance comparison.

Changes

Bloom Filter Support for CAGRA Search

Layer / File(s) Summary
Public API: Bloom Filter Type Contract
cpp/include/cuvs/neighbors/common.hpp, cpp/include/cuvs/detail/jit_lto/common_fragments.hpp
Adds cuvs::neighbors::filtering::bloom_filter struct storing a device void pointer and optional filtering rate, with get_filter_type() returning FilterType::Bloom. Introduces tag_filter_bloom_filter JIT fragment tag.
Compile-time Bloom Detection and JIT Fragment Routing
cpp/src/neighbors/detail/cagra/cagra_filter_payload.hpp, cpp/src/neighbors/detail/cagra/shared_launcher_jit.hpp
Adds is_bloom_filter type trait and updates sample_filter_jit_tag to recognize bloom filters and map them to the bloom filter JIT fragment for both direct and nested filter types.
Device-side Bloom Filter Data and Kernel Implementation
cpp/src/neighbors/detail/sample_filter_data.cuh, cpp/src/neighbors/detail/cagra/jit_lto_kernels/sample_filter_impl.cuh
Defines bloom_filter_data_t template holding a cuco bloom filter reference, and implements the sample_filter_bloom_filter_impl device kernel that checks bloom filter membership with null-data passthrough.
CAGRA Payload and Filter Pointer Wiring
cpp/src/neighbors/detail/cagra/cagra_filter_payload.hpp
Updates fill_cagra_sample_filter and cagra_filter_data_ptr to handle bloom filters by extracting the filter_data pointer and treating bloom filters equivalently to UDF-style filters.
Kernel Instantiation and JIT Configuration
cpp/src/neighbors/detail/cagra/jit_lto_kernels/sample_filter_matrix.json, cpp/src/neighbors/detail/cagra/search_single_cta_inst.cu.in, cpp/src/neighbors/detail/cagra/search_multi_cta_inst.cu.in, cpp/src/neighbors/detail/cagra/search_single_cta_kernel_launcher_jit.cuh
Adds bloom_filter to the JSON filter matrix and defines bloom_filter_t type alias in both single and multi-CTA instantiation templates. Fixes UDF filter type-id assignment to explicitly return 3.
Runtime Search Dispatch for Bloom Filters
cpp/src/neighbors/cagra.cuh
Extends cagra::search with dynamic_cast-based branch detecting bloom_filter, clamping filtering_rate to [0.0f, 0.999f], and dispatching to search_with_filtering.
Build System: CMake cuco Dependency
cpp/CMakeLists.txt, examples/cpp/CMakeLists.txt
Adds cuco as a CPM-managed dependency and links cuco::cuco to both the core jit_lto_kernel_usage_requirements target and example/benchmark targets.
Example: CAGRA Bloom Filter Search
examples/cpp/src/cagra_bloom_filter_example.cu
Complete CUDA example constructing a CAGRA index, building a cuco bloom filter for even-ID filtering, executing multi-query search with bloom filtering enabled, and validating returned neighbors using cuco's contains_async API.
Benchmark Program: Bitset vs Bloom Filter Comparison
examples/cpp/src/cagra_filter_benchmark.cu
CUDA benchmark program evaluating CAGRA search latency and recall across bitset and bloom filter strategies with configurable dataset sizes, validity percentages, optional ground-truth computation, and CSV result output.
Benchmark Visualization: Python Plotting Tool
examples/cpp/plot_filter_benchmark.py
Python script generating multi-row latency and recall grids plus overview sweeps from benchmark CSV, with grouped bars, legends, and optional recall@k visualization via --with-recall flag.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes


Possibly related PRs

  • rapidsai/cuvs#2132: Establishes the UDF filter pattern that bloom filter integration reuses for payload routing in shared_launcher_jit.hpp and cagra_filter_payload.hpp.
  • rapidsai/cuvs#1807: Introduces the JIT-LTO CAGRA search infrastructure that this PR extends with bloom filter support in the fragment selection and routing pipeline.

Suggested labels

C++, benchmarking


Suggested reviewers

  • KyleFromNVIDIA
  • cjnolet
🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 22.22% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
Description check ❓ Inconclusive No description was provided by the author; the pull request lacks any accompanying explanation of the changes. Add a pull request description explaining the motivation, implementation approach, and key changes for bloom filter support in CAGRA.
✅ Passed checks (3 passed)
Check name Status Explanation
Title check ✅ Passed The title 'CAGRA Bloom Filter' directly summarizes the main change—adding bloom filter support to CAGRA search functionality across multiple files.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
examples/cpp/plot_filter_benchmark.py (1)

120-153: 💤 Low value

plot_panel_lines appears unused.

This function is defined but never called in the script. Consider removing it if not needed, or adding a comment indicating its intended future use.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@examples/cpp/plot_filter_benchmark.py` around lines 120 - 153, The function
plot_panel_lines is defined but is never called anywhere in the script. Either
remove this function entirely if it is not needed, or if it is intended for
future use, add a docstring or comment at the top of the function that clearly
explains its intended use case and why it is being retained. This will help
clarify whether the function is dead code or work-in-progress that should remain
in the codebase.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@examples/cpp/src/cagra_filter_benchmark.cu`:
- Around line 146-151: The `valid_ids_device` variable is destroyed when the if
block ends at line 151, but the `bloom.add_async()` call is asynchronous and may
still access this memory after it's freed. To fix this, either move the stream
synchronization call (currently at line 158) inside the if block before the
closing brace to ensure the async operation completes before valid_ids_device is
destroyed, or replace the `bloom.add_async()` call with the synchronous
`bloom.add()` method to avoid the race condition entirely.

---

Nitpick comments:
In `@examples/cpp/plot_filter_benchmark.py`:
- Around line 120-153: The function plot_panel_lines is defined but is never
called anywhere in the script. Either remove this function entirely if it is not
needed, or if it is intended for future use, add a docstring or comment at the
top of the function that clearly explains its intended use case and why it is
being retained. This will help clarify whether the function is dead code or
work-in-progress that should remain in the codebase.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 232cce46-f7f4-454f-98de-3f4904ed45c9

📥 Commits

Reviewing files that changed from the base of the PR and between 21e5f20 and 16946a4.

📒 Files selected for processing (3)
  • examples/cpp/CMakeLists.txt
  • examples/cpp/plot_filter_benchmark.py
  • examples/cpp/src/cagra_filter_benchmark.cu
🚧 Files skipped from review as they are similar to previous changes (1)
  • examples/cpp/CMakeLists.txt

Comment on lines +146 to +151
if (!valid_ids_host.empty()) {
rmm::device_uvector<key_type> valid_ids_device(valid_ids_host.size(), stream);
raft::copy(valid_ids_device.data(), valid_ids_host.data(), valid_ids_host.size(), stream);
bloom.add_async(
valid_ids_device.data(), valid_ids_device.data() + valid_ids_device.size(), stream);
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Use-after-free: valid_ids_device destroyed before async operation completes.

valid_ids_device is destroyed when the if block ends (line 151), but bloom.add_async is asynchronous. The device memory may be freed while the kernel is still accessing it. The sync_stream at line 158 is outside this block, so it does not protect this memory.

🐛 Proposed fix: move sync inside the if-block
   if (!valid_ids_host.empty()) {
     rmm::device_uvector<key_type> valid_ids_device(valid_ids_host.size(), stream);
     raft::copy(valid_ids_device.data(), valid_ids_host.data(), valid_ids_host.size(), stream);
     bloom.add_async(
       valid_ids_device.data(), valid_ids_device.data() + valid_ids_device.size(), stream);
+    raft::resource::sync_stream(res);
   }

Alternatively, use the synchronous bloom.add() instead of add_async().

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if (!valid_ids_host.empty()) {
rmm::device_uvector<key_type> valid_ids_device(valid_ids_host.size(), stream);
raft::copy(valid_ids_device.data(), valid_ids_host.data(), valid_ids_host.size(), stream);
bloom.add_async(
valid_ids_device.data(), valid_ids_device.data() + valid_ids_device.size(), stream);
}
if (!valid_ids_host.empty()) {
rmm::device_uvector<key_type> valid_ids_device(valid_ids_host.size(), stream);
raft::copy(valid_ids_device.data(), valid_ids_host.data(), valid_ids_host.size(), stream);
bloom.add_async(
valid_ids_device.data(), valid_ids_device.data() + valid_ids_device.size(), stream);
raft::resource::sync_stream(res);
}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@examples/cpp/src/cagra_filter_benchmark.cu` around lines 146 - 151, The
`valid_ids_device` variable is destroyed when the if block ends at line 151, but
the `bloom.add_async()` call is asynchronous and may still access this memory
after it's freed. To fix this, either move the stream synchronization call
(currently at line 158) inside the if block before the closing brace to ensure
the async operation completes before valid_ids_device is destroyed, or replace
the `bloom.add_async()` call with the synchronous `bloom.add()` method to avoid
the race condition entirely.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

feature request New feature or request non-breaking Introduces a non-breaking change

Projects

Status: No status

Development

Successfully merging this pull request may close these issues.

1 participant