5 changes: 3 additions & 2 deletions c/src/neighbors/brute_force.cpp
@@ -1,6 +1,6 @@

  /*
- * SPDX-FileCopyrightText: Copyright (c) 2024-2025, NVIDIA CORPORATION.
+ * SPDX-FileCopyrightText: Copyright (c) 2024-2026, NVIDIA CORPORATION.
  * SPDX-License-Identifier: Apache-2.0
  */

@@ -10,6 +10,7 @@

  #include <raft/core/error.hpp>
  #include <raft/core/mdspan_types.hpp>
+ #include <raft/core/numpy_serializer.hpp>
  #include <raft/core/resources.hpp>
  #include <raft/core/serialize.hpp>

@@ -240,7 +241,7 @@ extern "C" cuvsError_t cuvsBruteForceDeserialize(cuvsResources_t res,
  if (!is) { RAFT_FAIL("Cannot open file %s", filename); }
  char dtype_string[4];
  is.read(dtype_string, 4);
- auto dtype = raft::detail::numpy_serializer::parse_descr(std::string(dtype_string, 4));
+ auto dtype = raft::numpy_serializer::parse_descr(std::string(dtype_string, 4));
Comment on lines 242 to +244

⚠️ Potential issue | 🟠 Major

Guard against truncated header reads before parsing dtype.

Line 243 reads 4 bytes but never checks success. On short/corrupt files, Line 244 may parse uninitialized bytes.

💡 Proposed fix
-    char dtype_string[4];
-    is.read(dtype_string, 4);
-    auto dtype = raft::numpy_serializer::parse_descr(std::string(dtype_string, 4));
+    char dtype_string[4]{};
+    if (!is.read(dtype_string, sizeof(dtype_string))) {
+      RAFT_FAIL("Invalid or truncated index header in file %s", filename);
+    }
+    auto dtype =
+      raft::numpy_serializer::parse_descr(std::string(dtype_string, sizeof(dtype_string)));
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@c/src/neighbors/brute_force.cpp` around lines 242 - 244, The code reads 4
bytes into dtype_string using is.read and immediately calls
raft::numpy_serializer::parse_descr on those bytes; however it doesn't check
whether the read succeeded. Update the block around dtype_string/is.read to
verify the stream read (e.g., check is.gcount() == 4 or is.fail()/is.good())
before calling raft::numpy_serializer::parse_descr, and on failure handle the
truncated header by returning an error/throwing an exception or logging and
aborting the parse so parse_descr never receives uninitialized data.
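
A minimal standalone sketch of the guard described above, assuming nothing from cuVS or RAFT: `read_exact` is a hypothetical helper name, and `std::runtime_error` stands in for `RAFT_FAIL` so that the snippet compiles on its own. It only illustrates the pattern of checking the stream before the bytes are handed to a parser.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical helper, not part of cuVS or RAFT: read exactly N bytes or throw.
// std::runtime_error stands in for RAFT_FAIL in this standalone sketch.
template <std::size_t N>
std::string read_exact(std::istream& is)
{
  std::array<char, N> buf{};  // value-initialized, so no indeterminate bytes
  if (!is.read(buf.data(), static_cast<std::streamsize>(N))) {
    throw std::runtime_error("truncated header: expected " + std::to_string(N) +
                             " bytes, got " + std::to_string(is.gcount()));
  }
  return std::string(buf.data(), N);
}

// Usage: a complete prefix is returned as-is, a short one raises.
inline void read_exact_example()
{
  std::istringstream complete(std::string("<f4\0", 4));
  assert(read_exact<4>(complete).size() == 4);

  std::istringstream truncated(std::string("<f"));
  bool threw = false;
  try {
    read_exact<4>(truncated);
  } catch (const std::runtime_error&) {
    threw = true;
  }
  assert(threw);
}
```

In the C API the returned string would then be passed to `raft::numpy_serializer::parse_descr`, so the parser never sees a partially filled buffer.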


index->dtype.bits = dtype.itemsize * 8;
if (dtype.kind == 'f' && dtype.itemsize == 4) {
3 changes: 2 additions & 1 deletion c/src/neighbors/cagra.cpp
@@ -9,6 +9,7 @@

  #include <raft/core/error.hpp>
  #include <raft/core/mdspan_types.hpp>
+ #include <raft/core/numpy_serializer.hpp>
  #include <raft/core/resources.hpp>
  #include <raft/core/serialize.hpp>

@@ -875,7 +876,7 @@ extern "C" cuvsError_t cuvsCagraDeserialize(cuvsResources_t res,
  if (!is) { RAFT_FAIL("Cannot open file %s", filename); }
  char dtype_string[4];
  is.read(dtype_string, 4);
- auto dtype = raft::detail::numpy_serializer::parse_descr(std::string(dtype_string, 4));
+ auto dtype = raft::numpy_serializer::parse_descr(std::string(dtype_string, 4));

Comment on lines 877 to 880

⚠️ Potential issue | 🟠 Major

Guard the dtype prefix read before parse_descr.

Line 878 reads 4 bytes but does not validate read length before parsing on Line 879. Corrupt/truncated files can produce invalid dtype decoding.

Proposed fix
-    char dtype_string[4];
-    is.read(dtype_string, 4);
-    auto dtype = raft::numpy_serializer::parse_descr(std::string(dtype_string, 4));
+    char dtype_string[4] = {};
+    is.read(dtype_string, sizeof(dtype_string));
+    RAFT_EXPECTS(is.gcount() == static_cast<std::streamsize>(sizeof(dtype_string)),
+                 "Failed to read dtype header from %s",
+                 filename);
+    auto dtype =
+      raft::numpy_serializer::parse_descr(std::string(dtype_string, sizeof(dtype_string)));
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@c/src/neighbors/cagra.cpp` around lines 877 - 880, The code reads 4 bytes
into dtype_string then calls raft::numpy_serializer::parse_descr without
validating the read; first check the read succeeded (e.g., verify is.read(...)
and that is.gcount() == 4 or that the stream is in a good state) before
constructing std::string(dtype_string, 4) and calling
raft::numpy_serializer::parse_descr; if the read fails/returns fewer than 4
bytes, handle the error (throw, return an error code, or log and abort) instead
of passing potentially truncated data to parse_descr.
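
The proposed fixes in this review alternate between testing the stream state (`!is.read(...)`) and comparing `is.gcount()` against the expected size. For `std::istream::read` the two signals agree: when end-of-file is reached before the requested count, the stream sets both `eofbit` and `failbit`, and `gcount()` reports the bytes actually extracted. The small probe below (all names hypothetical) shows both signals side by side.

```cpp
#include <cassert>
#include <ios>
#include <sstream>
#include <string>

struct read_outcome {
  bool stream_ok;         // stream state after read(): false once failbit is set
  std::streamsize count;  // gcount() after read(): bytes actually extracted
};

// Hypothetical probe: try to read 4 bytes from `bytes` and report both signals.
inline read_outcome try_read4(const std::string& bytes)
{
  std::istringstream is(bytes);
  char buf[4] = {};
  is.read(buf, sizeof(buf));
  return {static_cast<bool>(is), is.gcount()};
}

inline void try_read4_example()
{
  const read_outcome full = try_read4(std::string("<f4\0", 4));
  const read_outcome shrt = try_read4("<f");
  assert(full.stream_ok && full.count == 4);   // complete read: both signals pass
  assert(!shrt.stream_ok && shrt.count == 2);  // short read: both signals fail
}
```

Either check is therefore sufficient for a fixed-size read; choosing one form and using it at every call site keeps the deserialize paths uniform.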

index->dtype.bits = dtype.itemsize * 8;
if (dtype.kind == 'f' && dtype.itemsize == 4) {
5 changes: 3 additions & 2 deletions c/src/neighbors/ivf_flat.cpp
@@ -1,6 +1,6 @@

  /*
- * SPDX-FileCopyrightText: Copyright (c) 2024-2025, NVIDIA CORPORATION.
+ * SPDX-FileCopyrightText: Copyright (c) 2024-2026, NVIDIA CORPORATION.
  * SPDX-License-Identifier: Apache-2.0
  */

@@ -9,6 +9,7 @@

  #include <raft/core/error.hpp>
  #include <raft/core/mdspan_types.hpp>
+ #include <raft/core/numpy_serializer.hpp>
  #include <raft/core/resources.hpp>
  #include <raft/core/serialize.hpp>
  #include <raft/util/cudart_utils.hpp>
@@ -301,7 +302,7 @@ extern "C" cuvsError_t cuvsIvfFlatDeserialize(cuvsResources_t res,
  if (!is) { RAFT_FAIL("Cannot open file %s", filename); }
  char dtype_string[4];
  is.read(dtype_string, 4);
- auto dtype = raft::detail::numpy_serializer::parse_descr(std::string(dtype_string, 4));
+ auto dtype = raft::numpy_serializer::parse_descr(std::string(dtype_string, 4));

Comment on lines 303 to 306

⚠️ Potential issue | 🟠 Major

Fail early on short dtype-header reads.

Line 304 reads the dtype prefix without checking byte count before parsing on Line 305. This should be validated to handle malformed files safely.

Proposed fix
-    char dtype_string[4];
-    is.read(dtype_string, 4);
-    auto dtype = raft::numpy_serializer::parse_descr(std::string(dtype_string, 4));
+    char dtype_string[4] = {};
+    is.read(dtype_string, sizeof(dtype_string));
+    RAFT_EXPECTS(is.gcount() == static_cast<std::streamsize>(sizeof(dtype_string)),
+                 "Failed to read dtype header from %s",
+                 filename);
+    auto dtype =
+      raft::numpy_serializer::parse_descr(std::string(dtype_string, sizeof(dtype_string)));
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@c/src/neighbors/ivf_flat.cpp` around lines 303 - 306, The code reads four
bytes into dtype_string then calls raft::numpy_serializer::parse_descr without
verifying the read succeeded; validate the read length immediately after
is.read(dtype_string, 4) (e.g., check is.gcount() == 4 or test stream state like
if (!is || is.gcount() != 4)) and on short reads/failure throw or return a clear
error (or set an error status) before calling
raft::numpy_serializer::parse_descr, so malformed or truncated files are handled
safely.

index->dtype.bits = dtype.itemsize * 8;
if (dtype.kind == 'f' && dtype.itemsize == 4) {
7 changes: 4 additions & 3 deletions c/src/neighbors/mg_cagra.cpp
@@ -1,5 +1,5 @@
  /*
- * SPDX-FileCopyrightText: Copyright (c) 2025, NVIDIA CORPORATION.
+ * SPDX-FileCopyrightText: Copyright (c) 2025-2026, NVIDIA CORPORATION.
  * SPDX-License-Identifier: Apache-2.0
  */

@@ -10,6 +10,7 @@
  #include <cuvs/neighbors/common.hpp>
  #include <dlpack/dlpack.h>
  #include <raft/core/error.hpp>
+ #include <raft/core/numpy_serializer.hpp>
  #include <raft/core/serialize.hpp>

  #include "../core/exceptions.hpp"
@@ -401,7 +402,7 @@ extern "C" cuvsError_t cuvsMultiGpuCagraDeserialize(cuvsResources_t res,
  if (!is) { RAFT_FAIL("Cannot open file %s", filename); }
  char dtype_string[4];
  is.read(dtype_string, 4);
- auto dtype = raft::detail::numpy_serializer::parse_descr(std::string(dtype_string, 4));
+ auto dtype = raft::numpy_serializer::parse_descr(std::string(dtype_string, 4));
  is.close();
Comment on lines 403 to 406

⚠️ Potential issue | 🟠 Major

Validate dtype header reads before parsing.

On Line 404 and Line 435, the 4-byte read is not validated. Truncated files can pass partial/garbage bytes into parse_descr, causing invalid dtype dispatch.

Proposed fix
-    char dtype_string[4];
-    is.read(dtype_string, 4);
-    auto dtype = raft::numpy_serializer::parse_descr(std::string(dtype_string, 4));
+    char dtype_string[4] = {};
+    is.read(dtype_string, sizeof(dtype_string));
+    RAFT_EXPECTS(is.gcount() == static_cast<std::streamsize>(sizeof(dtype_string)),
+                 "Failed to read dtype header from %s",
+                 filename);
+    auto dtype =
+      raft::numpy_serializer::parse_descr(std::string(dtype_string, sizeof(dtype_string)));
@@
-    char dtype_string[4];
-    is.read(dtype_string, 4);
-    auto dtype = raft::numpy_serializer::parse_descr(std::string(dtype_string, 4));
+    char dtype_string[4] = {};
+    is.read(dtype_string, sizeof(dtype_string));
+    RAFT_EXPECTS(is.gcount() == static_cast<std::streamsize>(sizeof(dtype_string)),
+                 "Failed to read dtype header from %s",
+                 filename);
+    auto dtype =
+      raft::numpy_serializer::parse_descr(std::string(dtype_string, sizeof(dtype_string)));

Also applies to: 434-437

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@c/src/neighbors/mg_cagra.cpp` around lines 403 - 406, The 4-byte dtype header
read into dtype_string before calling raft::numpy_serializer::parse_descr is not
validated; check the istream read result (e.g., is.read(...) and then
is.gcount() == 4 or !is.fail()) and handle short/truncated reads by
logging/throwing/returning an error instead of calling parse_descr with partial
data; apply the same validation to the other occurrence that reads 4 bytes so
neither is.read -> parse_descr path can receive garbage.
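
To make the "invalid dtype dispatch" concern concrete, the sketch below models the 4-byte prefix round trip: the serialize paths in this PR write the dtype string resized to 4 bytes, and the C API later dispatches on `dtype.kind` and `dtype.itemsize`. Everything here is an illustration under stated assumptions. `dtype_prefix`, `make_prefix` and `parse_prefix` are hypothetical names, the assumed layout is byte-order character, kind character, decimal item size, then `'\0'` padding, and `parse_prefix` is a simplified stand-in, not `raft::numpy_serializer::parse_descr`.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical stand-in for the parsed descriptor. Only `kind` and `itemsize`
// appear in the diff; `byteorder` is an assumption of this sketch.
struct dtype_prefix {
  char byteorder;
  char kind;
  unsigned itemsize;
};

// Writer side, mirroring the serialize() hunks: the descriptor string is
// resized to exactly 4 bytes, which pads a 3-character descriptor with '\0'.
inline std::string make_prefix(std::string descr)
{
  descr.resize(4);
  return descr;
}

// Simplified reader, NOT raft::numpy_serializer::parse_descr. Assumed layout:
// byte-order char, kind char, decimal item size, then '\0' padding.
inline bool parse_prefix(const std::string& s, dtype_prefix& out)
{
  if (s.size() != 4) { return false; }
  if (s[0] != '<' && s[0] != '>' && s[0] != '|' && s[0] != '=') { return false; }
  if (!std::isalpha(static_cast<unsigned char>(s[1]))) { return false; }
  unsigned size = 0;
  std::size_t i = 2;
  for (; i < s.size() && std::isdigit(static_cast<unsigned char>(s[i])); ++i) {
    size = size * 10 + static_cast<unsigned>(s[i] - '0');
  }
  if (size == 0) { return false; }  // no digits, or a zero item size
  for (; i < s.size(); ++i) {
    if (s[i] != '\0') { return false; }  // only NUL padding may follow
  }
  out = dtype_prefix{s[0], s[1], size};
  return true;
}

inline void prefix_example()
{
  dtype_prefix d{};
  assert(parse_prefix(make_prefix("<f4"), d));
  assert(d.kind == 'f' && d.itemsize * 8 == 32);   // same test the C API dispatches on
  assert(!parse_prefix(std::string(4, '\0'), d));  // zero-filled buffer is rejected
}
```

A strict reader like this rejects a zero-filled or partially filled buffer outright, but that property belongs to the sketch. Whether the real parser rejects such input is not shown in this PR, which is why the read itself should be validated first.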


index->dtype.bits = dtype.itemsize * 8;
@@ -432,7 +433,7 @@ extern "C" cuvsError_t cuvsMultiGpuCagraDistribute(cuvsResources_t res,
  if (!is) { RAFT_FAIL("Cannot open file %s", filename); }
  char dtype_string[4];
  is.read(dtype_string, 4);
- auto dtype = raft::detail::numpy_serializer::parse_descr(std::string(dtype_string, 4));
+ auto dtype = raft::numpy_serializer::parse_descr(std::string(dtype_string, 4));
  is.close();

index->dtype.bits = dtype.itemsize * 8;
7 changes: 4 additions & 3 deletions c/src/neighbors/mg_ivf_flat.cpp
@@ -1,5 +1,5 @@
  /*
- * SPDX-FileCopyrightText: Copyright (c) 2025, NVIDIA CORPORATION.
+ * SPDX-FileCopyrightText: Copyright (c) 2025-2026, NVIDIA CORPORATION.
  * SPDX-License-Identifier: Apache-2.0
  */

@@ -10,6 +10,7 @@
  #include <cuvs/neighbors/ivf_flat.hpp>
  #include <dlpack/dlpack.h>
  #include <raft/core/error.hpp>
+ #include <raft/core/numpy_serializer.hpp>
  #include <raft/core/serialize.hpp>

  #include "../core/exceptions.hpp"
@@ -398,7 +399,7 @@ extern "C" cuvsError_t cuvsMultiGpuIvfFlatDeserialize(cuvsResources_t res,
  if (!is) { RAFT_FAIL("Cannot open file %s", filename); }
  char dtype_string[4];
  is.read(dtype_string, 4);
- auto dtype = raft::detail::numpy_serializer::parse_descr(std::string(dtype_string, 4));
+ auto dtype = raft::numpy_serializer::parse_descr(std::string(dtype_string, 4));
  is.close();
Comment on lines 400 to 403

⚠️ Potential issue | 🟠 Major

Validate header read length before calling parse_descr in both paths.

Both blocks parse the dtype descriptor immediately after read(...) without checking stream state. Truncated files can lead to undefined behavior due to partially initialized buffers.

💡 Proposed fix (apply in both deserialize and distribute)
-    char dtype_string[4];
-    is.read(dtype_string, 4);
-    auto dtype = raft::numpy_serializer::parse_descr(std::string(dtype_string, 4));
+    char dtype_string[4]{};
+    if (!is.read(dtype_string, sizeof(dtype_string))) {
+      RAFT_FAIL("Invalid or truncated index header in file %s", filename);
+    }
+    auto dtype =
+      raft::numpy_serializer::parse_descr(std::string(dtype_string, sizeof(dtype_string)));

Also applies to: 431-434

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@c/src/neighbors/mg_ivf_flat.cpp` around lines 400 - 403, The code reads 4
bytes into dtype_string and calls raft::numpy_serializer::parse_descr without
validating the read; update both the deserialize and distribute code paths to
check is.read(...) and the stream state (e.g., verify gcount() == 4 or
is.good()/is) before constructing std::string(dtype_string,4) and calling
parse_descr, and handle truncated reads by returning/throwing an error or
reporting via existing error path; reference the dtype_string buffer, the
is.read call, and the parse_descr invocation so you change both occurrences
(around the calls near deserialize and distribute).


index->dtype.bits = dtype.itemsize * 8;
@@ -429,7 +430,7 @@ extern "C" cuvsError_t cuvsMultiGpuIvfFlatDistribute(cuvsResources_t res,
  if (!is) { RAFT_FAIL("Cannot open file %s", filename); }
  char dtype_string[4];
  is.read(dtype_string, 4);
- auto dtype = raft::detail::numpy_serializer::parse_descr(std::string(dtype_string, 4));
+ auto dtype = raft::numpy_serializer::parse_descr(std::string(dtype_string, 4));
  is.close();

index->dtype.bits = dtype.itemsize * 8;
5 changes: 3 additions & 2 deletions c/src/neighbors/mg_ivf_pq.cpp
@@ -1,5 +1,5 @@
  /*
- * SPDX-FileCopyrightText: Copyright (c) 2025, NVIDIA CORPORATION.
+ * SPDX-FileCopyrightText: Copyright (c) 2025-2026, NVIDIA CORPORATION.
  * SPDX-License-Identifier: Apache-2.0
  */

@@ -10,6 +10,7 @@
  #include <cuvs/neighbors/ivf_pq.hpp>
  #include <dlpack/dlpack.h>
  #include <raft/core/error.hpp>
+ #include <raft/core/numpy_serializer.hpp>
  #include <raft/core/serialize.hpp>

  #include "../core/exceptions.hpp"
@@ -390,7 +391,7 @@ extern "C" cuvsError_t cuvsMultiGpuIvfPqDeserialize(cuvsResources_t res,
  if (!is) { RAFT_FAIL("Cannot open file %s", filename); }
  char dtype_string[4];
  is.read(dtype_string, 4);
- auto dtype = raft::detail::numpy_serializer::parse_descr(std::string(dtype_string, 4));
+ auto dtype = raft::numpy_serializer::parse_descr(std::string(dtype_string, 4));
  is.close();
Comment on lines 392 to 395

⚠️ Potential issue | 🟠 Major

Validate read() result before dtype parsing.

Line 393 does not verify that 4 bytes were actually read. A short read can propagate invalid bytes into parse_descr on Line 394.

Proposed fix
-    char dtype_string[4];
-    is.read(dtype_string, 4);
-    auto dtype = raft::numpy_serializer::parse_descr(std::string(dtype_string, 4));
+    char dtype_string[4] = {};
+    is.read(dtype_string, sizeof(dtype_string));
+    RAFT_EXPECTS(is.gcount() == static_cast<std::streamsize>(sizeof(dtype_string)),
+                 "Failed to read dtype header from %s",
+                 filename);
+    auto dtype =
+      raft::numpy_serializer::parse_descr(std::string(dtype_string, sizeof(dtype_string)));
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@c/src/neighbors/mg_ivf_pq.cpp` around lines 392 - 395, The code calls is.read
into dtype_string and immediately hands those bytes to
raft::numpy_serializer::parse_descr without checking the read result; update the
logic around dtype_string/is.read to validate that 4 bytes were actually read
(e.g., check is.gcount() == 4 and/or is.good()/is.fail()) before calling
parse_descr, and handle the short-read error path (return an error, throw, or
log and exit) so parse_descr is never called with incomplete data; reference the
dtype_string buffer, the is.read call, and raft::numpy_serializer::parse_descr
when making the change and ensure is.close() still runs in all code paths.
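
On the requirement that `is.close()` still runs in all code paths: a `std::ifstream` closes its file in its destructor, so an exception thrown between opening and the explicit `is.close()` does not leak the handle. The sketch below assumes a hypothetical helper `read_dtype_prefix` and uses `std::runtime_error` in place of `RAFT_FAIL`; the scratch file name is also invented for the example.

```cpp
#include <cassert>
#include <cstdio>
#include <fstream>
#include <stdexcept>
#include <string>

// Hypothetical helper: return the 4-byte prefix of `filename`, throw on failure.
inline std::string read_dtype_prefix(const std::string& filename)
{
  std::ifstream is(filename, std::ios::in | std::ios::binary);
  if (!is) { throw std::runtime_error("Cannot open file " + filename); }
  char buf[4] = {};
  if (!is.read(buf, sizeof(buf))) {
    // `is` is closed by its destructor while the stack unwinds.
    throw std::runtime_error("Invalid or truncated index header in file " + filename);
  }
  return std::string(buf, sizeof(buf));
  // No explicit close is required here either: the destructor handles it.
}

inline void read_dtype_prefix_example()
{
  const std::string path = "dtype_prefix_example.bin";  // hypothetical scratch file
  {
    std::ofstream os(path, std::ios::out | std::ios::binary);
    os.write("<f", 2);  // deliberately truncated header
  }
  bool threw = false;
  try {
    read_dtype_prefix(path);
  } catch (const std::runtime_error&) {
    threw = true;
  }
  assert(threw);
  std::remove(path.c_str());
}
```

Scoping the stream inside a helper like this also removes the need to repeat the open, read, validate and close sequence in each deserialize and distribute entry point.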


index->dtype.bits = dtype.itemsize * 8;
6 changes: 3 additions & 3 deletions cpp/cmake/thirdparty/get_raft.cmake
@@ -1,13 +1,13 @@
  # =============================================================================
  # cmake-format: off
- # SPDX-FileCopyrightText: Copyright (c) 2023-2025, NVIDIA CORPORATION.
+ # SPDX-FileCopyrightText: Copyright (c) 2023-2026, NVIDIA CORPORATION.
  # SPDX-License-Identifier: Apache-2.0
  # cmake-format: on

  # Use RAPIDS_VERSION_MAJOR_MINOR from rapids_config.cmake
  set(RAFT_VERSION "${RAPIDS_VERSION_MAJOR_MINOR}")
- set(RAFT_FORK "rapidsai")
- set(RAFT_PINNED_TAG "${rapids-cmake-checkout-tag}")
+ set(RAFT_FORK "julianmi")
+ set(RAFT_PINNED_TAG "expose-public-npy-helpers")
Comment on lines +9 to +10

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "Checking upstream dependency status (rapidsai/raft PR `#3003`)..."
curl -s https://api.github.com/repos/rapidsai/raft/pulls/3003 \
  | jq '{state, merged, merge_commit_sha, head_repo: .head.repo.full_name, head_ref: .head.ref, base_ref: .base.ref}'

echo
echo "Checking whether expose-public-npy-helpers exists in official and fork repos..."
git ls-remote --heads https://github.com/rapidsai/raft.git expose-public-npy-helpers || true
git ls-remote --tags  https://github.com/rapidsai/raft.git expose-public-npy-helpers || true
git ls-remote --heads https://github.com/julianmi/raft.git expose-public-npy-helpers || true
git ls-remote --tags  https://github.com/julianmi/raft.git expose-public-npy-helpers || true

Repository: rapidsai/cuvs

Length of output: 407


🏁 Script executed:

cat -n cpp/cmake/thirdparty/get_raft.cmake

Repository: rapidsai/cuvs

Length of output: 3417


🏁 Script executed:

# Search for any comments or related discussions about this fork usage
rg "expose-public-npy-helpers|RAFT_FORK|julianmi" --type cmake --type markdown -C 3

Repository: rapidsai/cuvs

Length of output: 9046


🏁 Script executed:

# Check git history for when this fork was introduced and any commit messages
git log --oneline -20 -- cpp/cmake/thirdparty/get_raft.cmake

Repository: rapidsai/cuvs

Length of output: 109


🏁 Script executed:

# Search for any open or recent PRs that might explain the fork usage
git log --all --oneline --grep="raft\|RAFT" | head -30

Repository: rapidsai/cuvs

Length of output: 1759


Avoid hard-pinning production builds to a personal RAFT fork/ref.

This introduces a supply-chain risk for CI and releases if that fork/ref is rewritten or removed. The expose-public-npy-helpers branch currently exists in julianmi/raft, but pinning to a personal fork without a documented integration path or timeline creates long-term dependency fragility. Prefer official rapidsai/raft by default, and gate fork/ref overrides behind explicit opt-in CMake cache vars for temporary cross-PR testing.

Suggested hardening
-set(RAFT_FORK "julianmi")
-set(RAFT_PINNED_TAG "expose-public-npy-helpers")
+set(RAFT_FORK "rapidsai")
+set(RAFT_PINNED_TAG "${rapids-cmake-checkout-tag}")
+
+set(CUVS_RAFT_FORK_OVERRIDE "" CACHE STRING "Optional RAFT fork override for temporary testing")
+set(CUVS_RAFT_TAG_OVERRIDE  "" CACHE STRING "Optional RAFT tag/branch override for temporary testing")
+
+if(CUVS_RAFT_FORK_OVERRIDE)
+  set(RAFT_FORK "${CUVS_RAFT_FORK_OVERRIDE}")
+endif()
+if(CUVS_RAFT_TAG_OVERRIDE)
+  set(RAFT_PINNED_TAG "${CUVS_RAFT_TAG_OVERRIDE}")
+endif()
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/cmake/thirdparty/get_raft.cmake` around lines 9 - 10, the CMake variables
RAFT_FORK and RAFT_PINNED_TAG are hard-pinned to a personal fork and branch.
Restore the previous defaults (RAFT_FORK "rapidsai" and RAFT_PINNED_TAG
"${rapids-cmake-checkout-tag}") so that CI and releases build against the
official rapidsai/raft repository. If a fork is still needed for cross-PR
testing, expose the override through explicit opt-in CMake CACHE variables set
via -D on the command line (e.g., CUVS_RAFT_FORK_OVERRIDE and
CUVS_RAFT_TAG_OVERRIDE, as in the suggested hardening), and document that any
personal-fork override is temporary.


function(find_and_configure_raft)
set(oneValueArgs VERSION FORK PINNED_TAG BUILD_STATIC_DEPS ENABLE_NVTX ENABLE_MNMG_DEPENDENCIES CLONE_ON_PIN)
7 changes: 4 additions & 3 deletions cpp/include/cuvs/neighbors/cagra.hpp
@@ -16,6 +16,7 @@
  #include <raft/core/host_mdspan.hpp>
  #include <raft/core/mdspan.hpp>
  #include <raft/core/mdspan_types.hpp>
+ #include <raft/core/numpy_serializer.hpp>
  #include <raft/core/resource/stream_view.hpp>
  #include <raft/core/serialize.hpp>

@@ -764,7 +765,7 @@ struct index : cuvs::neighbors::index {
  if (lseek(fd.get(), 0, SEEK_SET) == -1) {
  RAFT_FAIL("Failed to seek to beginning of dataset file");
  }
- auto header = raft::detail::numpy_serializer::read_header(stream);
+ auto header = raft::numpy_serializer::read_header(stream);
RAFT_EXPECTS(header.shape.size() == 2,
"Dataset file should be 2D, got %zu dimensions",
header.shape.size());
@@ -799,7 +800,7 @@ struct index : cuvs::neighbors::index {
  if (lseek(fd.get(), 0, SEEK_SET) == -1) {
  RAFT_FAIL("Failed to seek to beginning of graph file");
  }
- auto header = raft::detail::numpy_serializer::read_header(stream);
+ auto header = raft::numpy_serializer::read_header(stream);
RAFT_EXPECTS(
header.shape.size() == 2, "Graph file should be 2D, got %zu dimensions", header.shape.size());

@@ -840,7 +841,7 @@ struct index : cuvs::neighbors::index {
  if (lseek(fd.get(), 0, SEEK_SET) == -1) {
  RAFT_FAIL("Failed to seek to beginning of mapping file");
  }
- auto header = raft::detail::numpy_serializer::read_header(stream);
+ auto header = raft::numpy_serializer::read_header(stream);
RAFT_EXPECTS(header.shape.size() == 1,
"Mapping file should be 1D, got %zu dimensions",
header.shape.size());
9 changes: 5 additions & 4 deletions cpp/include/cuvs/util/file_io.hpp
@@ -5,6 +5,7 @@
  #pragma once

  #include <raft/core/error.hpp>
+ #include <raft/core/numpy_serializer.hpp>
  #include <raft/core/serialize.hpp>

#include <algorithm>
@@ -187,12 +188,12 @@ std::pair<file_descriptor, size_t> create_numpy_file(const std::string& path,
  file_descriptor fd(path, O_CREAT | O_RDWR | O_TRUNC, 0644);

  // Build header
- const auto dtype = raft::detail::numpy_serializer::get_numpy_dtype<T>();
- const bool fortran_order = false;
- const raft::detail::numpy_serializer::header_t header = {dtype, fortran_order, shape};
+ const auto dtype = raft::numpy_serializer::get_numpy_dtype<T>();
+ const bool fortran_order = false;
+ const raft::numpy_serializer::header_t header = {dtype, fortran_order, shape};

  std::stringstream ss;
- raft::detail::numpy_serializer::write_header(ss, header);
+ raft::numpy_serializer::write_header(ss, header);
std::string header_str = ss.str();
size_t header_size = header_str.size();

5 changes: 3 additions & 2 deletions cpp/src/neighbors/brute_force_serialize.cu
@@ -1,11 +1,12 @@
  /*
- * SPDX-FileCopyrightText: Copyright (c) 2024, NVIDIA CORPORATION.
+ * SPDX-FileCopyrightText: Copyright (c) 2024-2026, NVIDIA CORPORATION.
  * SPDX-License-Identifier: Apache-2.0
  */

  #include <cuvs/neighbors/brute_force.hpp>
  #include <raft/core/copy.cuh>
  #include <raft/core/host_mdarray.hpp>
+ #include <raft/core/numpy_serializer.hpp>
  #include <raft/core/resources.hpp>
  #include <raft/core/serialize.hpp>

@@ -24,7 +25,7 @@ void serialize(raft::resources const& handle,
  RAFT_LOG_DEBUG(
  "Saving brute force index, size %zu, dim %u", static_cast<size_t>(index.size()), index.dim());

- auto dtype_string = raft::detail::numpy_serializer::get_numpy_dtype<T>().to_string();
+ auto dtype_string = raft::numpy_serializer::get_numpy_dtype<T>().to_string();
dtype_string.resize(4);
os << dtype_string;

5 changes: 3 additions & 2 deletions cpp/src/neighbors/detail/cagra/cagra_build.cuh
@@ -16,6 +16,7 @@
  #include <raft/core/host_mdarray.hpp>
  #include <raft/core/host_mdspan.hpp>
  #include <raft/core/logger.hpp>
+ #include <raft/core/numpy_serializer.hpp>
  #include <raft/core/resource/cuda_stream.hpp>
  #include <raft/util/integer_utils.hpp>

@@ -726,14 +727,14 @@ void ace_load_partition_dataset_from_disk(
  std::ifstream is(reordered_dataset_path, std::ios::in | std::ios::binary);
  if (!is) { RAFT_FAIL("Cannot open file %s", reordered_dataset_path.c_str()); }
  auto start_pos = is.tellg();
- raft::detail::numpy_serializer::read_header(is);
+ raft::numpy_serializer::read_header(is);
  core_header_size = static_cast<size_t>(is.tellg() - start_pos);
  }
  {
  std::ifstream is(augmented_dataset_path, std::ios::in | std::ios::binary);
  if (!is) { RAFT_FAIL("Cannot open file %s", augmented_dataset_path.c_str()); }
  auto start_pos = is.tellg();
- raft::detail::numpy_serializer::read_header(is);
+ raft::numpy_serializer::read_header(is);
augmented_header_size = static_cast<size_t>(is.tellg() - start_pos);
}

3 changes: 2 additions & 1 deletion cpp/src/neighbors/detail/cagra/cagra_serialize.cuh
@@ -11,6 +11,7 @@
  #include <raft/core/logger.hpp>
  #include <raft/core/mdarray.hpp>
  #include <raft/core/mdspan_types.hpp>
+ #include <raft/core/numpy_serializer.hpp>
  #include <raft/core/resource/cuda_stream.hpp>
  #include <raft/core/serialize.hpp>
  #include <raft/util/cudart_utils.hpp>
@@ -54,7 +55,7 @@ void serialize(raft::resources const& res,
  RAFT_LOG_DEBUG(
  "Saving CAGRA index, size %zu, dim %u", static_cast<size_t>(index_.size()), index_.dim());

- std::string dtype_string = raft::detail::numpy_serializer::get_numpy_dtype<T>().to_string();
+ std::string dtype_string = raft::numpy_serializer::get_numpy_dtype<T>().to_string();
dtype_string.resize(4);
os << dtype_string;

8 changes: 4 additions & 4 deletions cpp/src/neighbors/detail/hnsw.hpp
@@ -14,9 +14,9 @@
  #include <cuvs/util/file_io.hpp>

  #include <raft/core/copy.cuh>
- #include <raft/core/detail/mdspan_numpy_serializer.hpp>
  #include <raft/core/host_mdspan.hpp>
  #include <raft/core/logger.hpp>
+ #include <raft/core/numpy_serializer.hpp>
  #include <raft/core/pinned_mdarray.hpp>
  #include <raft/util/cudart_utils.hpp>

@@ -399,7 +399,7 @@ void serialize_to_hnswlib_from_disk(raft::resources const& res,
  std::ifstream graph_stream(graph_path, std::ios::binary);
  RAFT_EXPECTS(graph_stream.good(), "Failed to open graph file: %s", graph_path.c_str());

- auto header = raft::detail::numpy_serializer::read_header(graph_stream);
+ auto header = raft::numpy_serializer::read_header(graph_stream);
graph_header_size = static_cast<size_t>(graph_stream.tellg());
RAFT_EXPECTS(
header.shape.size() == 2, "Graph file should be 2D, got %zu dimensions", header.shape.size());
@@ -419,7 +419,7 @@ void serialize_to_hnswlib_from_disk(raft::resources const& res,
  std::ifstream dataset_stream(dataset_path, std::ios::binary);
  RAFT_EXPECTS(dataset_stream.good(), "Failed to open dataset file: %s", dataset_path.c_str());

- auto header = raft::detail::numpy_serializer::read_header(dataset_stream);
+ auto header = raft::numpy_serializer::read_header(dataset_stream);
dataset_header_size = static_cast<size_t>(dataset_stream.tellg());
RAFT_EXPECTS(header.shape.size() == 2,
"Dataset file should be 2D, got %zu dimensions",
@@ -439,7 +439,7 @@ void serialize_to_hnswlib_from_disk(raft::resources const& res,
  std::ifstream mapping_stream(mapping_path, std::ios::binary);
  RAFT_EXPECTS(mapping_stream.good(), "Failed to open mapping file: %s", mapping_path.c_str());

- auto header = raft::detail::numpy_serializer::read_header(mapping_stream);
+ auto header = raft::numpy_serializer::read_header(mapping_stream);
label_header_size = static_cast<size_t>(mapping_stream.tellg());
RAFT_EXPECTS(header.shape.size() == 1,
"Mapping file should be 1D, got %zu dimensions",
4 changes: 2 additions & 2 deletions cpp/src/neighbors/ivf_flat/ivf_flat_serialize.cuh
@@ -11,8 +11,8 @@
  #include <cuvs/neighbors/ivf_flat.hpp>

  #include <raft/core/copy.cuh>
- #include <raft/core/detail/mdspan_numpy_serializer.hpp>
  #include <raft/core/mdarray.hpp>
+ #include <raft/core/numpy_serializer.hpp>
  #include <raft/core/resource/cuda_stream.hpp>
  #include <raft/core/serialize.hpp>
  #include <raft/util/pow2_utils.cuh>
@@ -44,7 +44,7 @@ void serialize(raft::resources const& handle, std::ostream& os, const index<T, I
  RAFT_LOG_DEBUG(
  "Saving IVF-Flat index, size %zu, dim %u", static_cast<size_t>(index_.size()), index_.dim());

- std::string dtype_string = raft::detail::numpy_serializer::get_numpy_dtype<T>().to_string();
+ std::string dtype_string = raft::numpy_serializer::get_numpy_dtype<T>().to_string();
dtype_string.resize(4);
os << dtype_string;

3 changes: 2 additions & 1 deletion cpp/src/neighbors/mg/snmg.cuh
@@ -8,6 +8,7 @@
  #include <raft/core/copy.cuh>
  #include <raft/core/device_mdspan.hpp>
  #include <raft/core/host_mdspan.hpp>
+ #include <raft/core/numpy_serializer.hpp>
  #include <raft/core/resource/multi_gpu.hpp>
  #include <raft/core/resource/nccl_comm.hpp>
  #include <raft/core/serialize.hpp>
@@ -738,7 +739,7 @@ void serialize(const raft::resources& clique,
  std::ofstream of(filename, std::ios::out | std::ios::binary);
  if (!of) { RAFT_FAIL("Cannot open file %s", filename.c_str()); }

- std::string dtype_string = raft::detail::numpy_serializer::get_numpy_dtype<T>().to_string();
+ std::string dtype_string = raft::numpy_serializer::get_numpy_dtype<T>().to_string();
dtype_string.resize(4);
of << dtype_string;
