Add HNSW Layered Index Support #2148

Open
julianmi wants to merge 18 commits into NVIDIA:main from julianmi:hnsw-layered-index

@julianmi julianmi commented Jun 1, 2026

Contributor

The disk-backed ACE algorithm partitions the dataset, so the CAGRA graph it builds uses the reordered index space. Building an HNSW index with hnsw::from_cagra therefore consumes both the reordered dataset and the CAGRA graph. Downstream consumers building an HNSW index would need the reordered dataset, which is typically large whenever the disk-backed ACE algorithm is required. Building only the layers of the HNSW index, without the dataset, and shipping just those layers to the search node minimizes network transfers for downstream consumers that have the original dataset available locally. The hnsw::deserialize step then takes the layered index and combines it with the local dataset to form an hnswlib-compatible search index.

Artifact Layout

hnsw_index.cuvs
  fixed_header
    magic = CUVS_HNSW_LAYERED
    version
    metadata_offset
    metadata_size

  metadata_binary
    dataset shape, dtype, metric
    hnsw parameters
    section sizes
    upper-layer descriptors

  levels
    uint8[n_rows]
    indexed by original dataset row ID

  base_nodes
    uint32[n_rows]
    maps each base topology row to original row ID

  base_links
    n_rows fixed-size hnswlib-ready rows
    [count:uint32][neighbors:uint32[maxM0]]
    neighbors are original row IDs

  upper_nodes
    concatenated uint32 original row IDs for layers 1..maxlevel

  upper_links
    fixed-size hnswlib-ready rows
    [count:uint32][neighbors:uint32[maxM]]
    neighbors are original row IDs
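
To make the fixed-size link rows above concrete, here is a minimal sketch of how one row could be sized and packed. The helper names `link_row_bytes` and `pack_link_row` are hypothetical and not taken from this PR; the sketch assumes a 32-bit count field and zero padding, following the `[count:uint32][neighbors:uint32[maxM0]]` notation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Bytes per fixed-size row: [count:uint32][neighbors:uint32[max_degree]].
inline std::size_t link_row_bytes(std::size_t max_degree)
{
  return sizeof(std::uint32_t) * (1 + max_degree);
}

// Hypothetical helper: pack a neighbor list into one zero-padded row.
inline std::vector<std::uint8_t> pack_link_row(const std::vector<std::uint32_t>& neighbors,
                                               std::size_t max_degree)
{
  assert(neighbors.size() <= max_degree);
  std::vector<std::uint8_t> row(link_row_bytes(max_degree), 0);
  const auto count = static_cast<std::uint32_t>(neighbors.size());
  // The count comes first, followed by the neighbor IDs; unused slots stay zero.
  std::memcpy(row.data(), &count, sizeof(count));
  if (!neighbors.empty()) {
    std::memcpy(row.data() + sizeof(count),
                neighbors.data(),
                neighbors.size() * sizeof(std::uint32_t));
  }
  return row;
}
```

Because every row has the same size, the byte offset of row `i` within a section is simply `i * link_row_bytes(max_degree)`, which is what allows purely sequential writes on the build node.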

Layered HNSW Serialization

The layered serializer creates hnsw_index.cuvs from the disk-backed ACE graph.

  1. Create the .cuvs file and write the fixed header and metadata.
  2. Generate HNSW levels in original ID space.
  3. Write levels sequentially.
  4. Read dataset_mapping.npy sequentially into reordered_to_original.
  5. Read cagra_graph.npy source-sequentially in ACE reordered row order.
  6. For each ACE graph row:
    • write base_nodes[row] = reordered_to_original[ace_reordered_row]
    • convert each neighbor from ACE reordered ID to original ID
    • write a padded hnswlib-ready row to base_links[row]
  7. Gather promoted vectors from the original dataset.
  8. Build upper-layer graphs using temporary HNSW promoted order.
  9. Write upper_nodes and upper_links with node IDs and neighbor IDs converted back to original IDs.

This keeps remapping, link padding, and upper-layer KNN work on the build node.
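
Step 6 of the list above can be sketched as follows. This is an illustration only: `remapped_row` and `remap_ace_row` are hypothetical names, and in-memory vectors stand in for the contents of dataset_mapping.npy and one row of cagra_graph.npy.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Result of remapping one ACE graph row (hypothetical type).
struct remapped_row {
  std::uint32_t original_id;             // value written to base_nodes[row]
  std::vector<std::uint32_t> neighbors;  // neighbor IDs in original ID space
};

// Convert one ACE-reordered graph row into original ID space.
// reordered_to_original[i] is the original row ID of ACE-reordered row i.
inline remapped_row remap_ace_row(std::size_t ace_row,
                                  const std::vector<std::uint32_t>& ace_neighbors,
                                  const std::vector<std::uint32_t>& reordered_to_original)
{
  remapped_row out;
  out.original_id = reordered_to_original.at(ace_row);
  out.neighbors.reserve(ace_neighbors.size());
  for (auto n : ace_neighbors) {
    // .at() rejects neighbor IDs that fall outside the mapping.
    out.neighbors.push_back(reordered_to_original.at(n));
  }
  return out;
}
```

The row itself stays at its ACE position in the file; only the IDs it carries change, which is why the search node can later place it by `original_id` without any remapping table.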

Deserialization

The search node reads:

  • hnsw_index.cuvs
  • the external original-order dataset from index_params.dataset_path

The loader:

  1. Reads the fixed header and metadata.
  2. Validates artifact shape, section sizes, and dataset shape.
  3. Reads levels sequentially.
  4. Allocates hnswlib storage.
  5. Reads the external dataset sequentially in original row order.
  6. Initializes hnswlib with:
    • internal ID = original ID
    • label = original ID
    • level = levels[original_id]
  7. Reads base_nodes and base_links sequentially.
  8. Copies each base link row into get_linklist0(base_node_id).
  9. Reads upper_nodes and upper_links sequentially by layer.
  10. Copies each upper link row into get_linklist(node_id, level).

The search node does no graph remapping, no level generation, no link padding, and no KNN work.
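
Steps 7 and 8 of the loader amount to a scatter: rows arrive in file order and are copied to the slot of the node ID stored alongside them. The sketch below illustrates that idea under simplifying assumptions; `link_storage` is a hypothetical stand-in for hnswlib's level-0 link storage, and `scatter_batch` is not a function from this PR.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <vector>

// Stand-in for in-memory link storage: one fixed-size row per internal ID.
struct link_storage {
  std::size_t row_bytes;
  std::vector<std::uint8_t> bytes;
  link_storage(std::size_t n_rows, std::size_t row_bytes_)
    : row_bytes(row_bytes_), bytes(n_rows * row_bytes_, 0)
  {
  }
  std::uint8_t* row(std::size_t id) { return bytes.data() + id * row_bytes; }
  std::size_t n_rows() const { return row_bytes == 0 ? 0 : bytes.size() / row_bytes; }
};

// Copy a sequentially read batch of rows into the slots named by node_ids.
inline void scatter_batch(link_storage& dst,
                          const std::vector<std::uint32_t>& node_ids,
                          const std::vector<std::uint8_t>& link_rows)
{
  if (link_rows.size() != node_ids.size() * dst.row_bytes) {
    throw std::invalid_argument("batch size does not match row size");
  }
  for (std::size_t i = 0; i < node_ids.size(); ++i) {
    if (node_ids[i] >= dst.n_rows()) { throw std::out_of_range("node ID outside storage"); }
    std::memcpy(dst.row(node_ids[i]), link_rows.data() + i * dst.row_bytes, dst.row_bytes);
  }
}
```

The reads stay sequential and only the in-memory writes are random, which matches the access pattern the description aims for.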

Disk Access Patterns

Build node:

  • Sequential scan of the original dataset for ACE partitioning.
  • Buffered partition writes for reordered and augmented datasets.
  • Contiguous per-partition reads from reordered_dataset.npy and augmented_dataset.npy.
  • Source-sequential reads from cagra_graph.npy when creating the final layered artifact.
  • Sequential writes to hnsw_index.cuvs.

Search node:

  • Sequential reads from hnsw_index.cuvs.
  • Sequential reads from the external original-order dataset.
  • Scatter writes only into in-memory hnswlib link storage by original ID.

Runtime Requirements

Only hnsw_index.cuvs is copied to the search node. ACE temporary files remain build-node-only.

The search node must have the original dataset in original row order and must provide that path through index_params.dataset_path.

Misc

Unifies the logging format of the ACE algorithm.

copy-pr-bot Bot commented Jun 1, 2026

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

@tfeher tfeher requested a review from mfoerste4 June 1, 2026 09:49

@mfoerste4 mfoerste4 left a comment

Contributor

Thanks @julianmi for the PR. The structure looks good to me. Only a few comments that are not necessarily actionable.

Comment on lines +31 to +33
} else if (conf.at("hierarchy") == "gpu_layered_on_disk" ||
           conf.at("hierarchy") == "gpu_layered" || conf.at("hierarchy") == "layered") {
  hnsw_params.hierarchy = cuvs::neighbors::hnsw::HnswHierarchy::GPU_LAYERED_ON_DISK;

Contributor

Are these all just synonyms for GPU_LAYERED_ON_DISK?

Contributor Author

Yes, I've removed them to better align with the other formats.

"Layered HNSW artifact '%s' does not exist.",
src_artifact.c_str());

copy_file_overwrite(src_artifact, std::filesystem::path(file));

Contributor

Why do we copy here instead of moving (like in the cagra_ace block below)?

Contributor Author

Good catch, thanks. I've changed this to a move and now reuse the same helper in the cagra_ace block.
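
For context, an overwrite-safe move along these lines could look like the sketch below. This is not the helper from the PR; `move_file_overwrite` is a hypothetical name, and the sketch assumes C++17 `std::filesystem`.

```cpp
#include <cassert>
#include <filesystem>
#include <fstream>
#include <string>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical helper: move src to dst, replacing dst if it already exists.
inline void move_file_overwrite(const fs::path& src, const fs::path& dst)
{
  std::error_code ec;
  // rename is cheap when both paths live on the same filesystem.
  fs::rename(src, dst, ec);
  if (!ec) { return; }
  // Fallback (for example across filesystems): copy over the destination,
  // then remove the source. These overloads throw on failure.
  fs::copy_file(src, dst, fs::copy_options::overwrite_existing);
  fs::remove(src);
}
```

Moving avoids a second full copy of a potentially large artifact on the build node, which is presumably the motivation behind the question above.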

Comment thread cpp/src/neighbors/detail/hnsw.hpp Outdated
Comment on lines +648 to +663
inline auto json_parse_double(const std::string& json, const std::string& key) -> double
{
  auto pos = json_find_key(json, key);
  auto colon = json.find(':', pos);
  RAFT_EXPECTS(colon != std::string::npos, "Malformed JSON near key '%s'", key.c_str());
  auto begin = json.find_first_of("0123456789-.", colon + 1);
  auto end = json.find_first_not_of("0123456789-.eE+", begin);
  RAFT_EXPECTS(begin != std::string::npos, "Malformed double JSON value for key '%s'", key.c_str());
  return std::stod(json.substr(begin, end - begin));
}

inline auto json_parse_layer_field(const std::string& object, const std::string& key) -> size_t
{
  return json_parse_size(object, key);
}

Contributor

If we use JSON here, why don't we use a common JSON utility for parsing? This whole section feels very verbose for a standard format. Is there a cleaner alternative?

Contributor Author

This would pull the JSON dependency (nlohmann/json) into libcuvs, where it is currently used only in the benchmarks. I've switched to a binary header format instead, given its simplicity. Let me know what you think.
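
As an illustration of what a dependency-free binary header can look like, here is a minimal sketch. The field names follow the artifact layout in the PR description, but the field widths, the host-endian encoding, and the names `fixed_header`, `write_header`, and `read_header` are assumptions, not the PR's actual format.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <istream>
#include <ostream>
#include <sstream>
#include <stdexcept>

constexpr char kMagic[]         = "CUVS_HNSW_LAYERED";
constexpr std::size_t kMagicLen = sizeof(kMagic) - 1;  // exclude the trailing NUL

// Hypothetical fixed header; widths are assumptions.
struct fixed_header {
  std::uint32_t version         = 1;
  std::uint64_t metadata_offset = 0;
  std::uint64_t metadata_size   = 0;
};

template <typename T>
void write_pod(std::ostream& os, const T& v)
{
  os.write(reinterpret_cast<const char*>(&v), sizeof(T));
}

template <typename T>
T read_pod(std::istream& is)
{
  T v{};
  is.read(reinterpret_cast<char*>(&v), sizeof(T));
  if (!is) { throw std::runtime_error("truncated header"); }
  return v;
}

// Fields are written one by one so struct padding never reaches the file.
inline void write_header(std::ostream& os, const fixed_header& h)
{
  os.write(kMagic, kMagicLen);
  write_pod(os, h.version);
  write_pod(os, h.metadata_offset);
  write_pod(os, h.metadata_size);
}

inline fixed_header read_header(std::istream& is)
{
  char magic[kMagicLen];
  is.read(magic, kMagicLen);
  if (!is || std::memcmp(magic, kMagic, kMagicLen) != 0) {
    throw std::runtime_error("bad magic");
  }
  fixed_header h;
  h.version         = read_pod<std::uint32_t>(is);
  h.metadata_offset = read_pod<std::uint64_t>(is);
  h.metadata_size   = read_pod<std::uint64_t>(is);
  return h;
}
```

A `std::stringstream` is enough to round-trip such a header in a unit test, and checking the magic up front rejects unrelated files before any section offsets are trusted.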

Comment on lines +907 to +916
const auto current_node_bytes = current_batch_size * sizeof(IdxT);
const auto current_link_bytes = current_batch_size * base_link_row_bytes;
cuvs::util::write_large_file(output_fd,
                             base_node_buffer.data(),
                             current_node_bytes,
                             base_nodes_offset + source_start * sizeof(IdxT));
cuvs::util::write_large_file(output_fd,
                             base_link_buffer.data(),
                             current_link_bytes,
                             base_links_offset + source_start * base_link_row_bytes);

Contributor

I see that the graph connections / links are shifted to reflect the original ordering, but I don't see the rows/nodes themselves being reordered. Is this intended?

Contributor Author

Yes. The idea is to write the rows sequentially on the build node and scatter them into memory on read. I've added a comment.

Comment on lines +477 to +484
template <typename T, typename IdxT, typename Callback>
void build_hnsw_upper_layer_graphs(
  raft::resources const& res,
  raft::host_matrix_view<const T, int64_t, raft::row_major> promoted_dataset,
  const hnsw_level_plan& plan,
  size_t M,
  cuvs::distance::DistanceType metric,
  Callback&& callback)

Contributor

Good to have this extracted as utility.

Comment on lines +1127 to +1142
for (int64_t batch_idx = 0; batch_idx < static_cast<int64_t>(current_batch_size);
     ++batch_idx) {
  const auto row = batch_start + static_cast<size_t>(batch_idx);
  node_buffer[batch_idx] = static_cast<IdxT>(hierarchy.order[start_idx + row]);
  auto* link_row = link_buffer.data() + batch_idx * metadata.upper_link_row_bytes;
  hnswlib::linklistsizeint list_count =
    static_cast<hnswlib::linklistsizeint>(layer.degree);
  std::memcpy(link_row, &list_count, sizeof(list_count));
  auto* dst = reinterpret_cast<IdxT*>(link_row + sizeof(hnswlib::linklistsizeint));
  if (layer.degree > 0) {
    auto* src = host_neighbors.data_handle() + row * layer.degree;
    for (size_t j = 0; j < layer.degree; ++j) {
      dst[j] = static_cast<IdxT>(hierarchy.order[src[j] + start_idx]);
    }
  }
}

Contributor

Ok, here we seem to write both in the modified data order (not the original) and by level connections instead of by row. I assume this gets reordered upon deserialization.

Contributor Author

Yes. I've added a comment.

Comment on lines +2326 to +2329
auto ll0 = appr_algo->get_linklist0(node_id);
memcpy(ll0,
       base_link_buffer.data() + batch_idx * metadata.base_link_row_bytes,
       metadata.base_link_row_bytes);

Contributor

Ah ok - this is where we re-order upon insertion.

Comment on lines +2388 to +2392
  auto ll = appr_algo->get_linklist(node_id, layer.level);
  memcpy(ll,
         link_buffer.data() + batch_idx * metadata.upper_link_row_bytes,
         metadata.upper_link_row_bytes);
}

Contributor

I see, this way we can re-arrange the levels upon load as well. I guess this works as long as we can provide our own deserialize-loader and don't have to mimic the serialized file structure of the original format.

But is this really the use-case? Do we eventually need to support a disk-layered+dataset->disk conversion?

Contributor Author

Yes, this minimizes the I/O for reading and constructing the HNSW index in memory. However, the same approach can be used for the memory-bound case where we need to support the disk-layered+dataset->disk conversion you've mentioned. I'm working on a follow-up PR that enables it. The idea is to reorder the base links to original ID space and interleave the dataset vectors by either:

  1. Simple scatter, if we have enough host memory.
  2. File-backed mmap: probably very slow.
  3. Write [id][link row] records to temporary files in buckets, then replay each bucket and store it in a small per-bucket scatter buffer. The base section has to be written and re-read, but all disk I/O is sequential. This seems to be the most promising approach when memory is constrained. Please let me know if you have other ideas.
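
Strategy (3) can be sketched roughly as below. This is only an in-memory simulation of the idea, not the follow-up implementation: vectors stand in for the temporary bucket files, and `link_record` and `bucketed_reorder` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

struct link_record {
  std::uint32_t id;                // original row ID
  std::vector<std::uint32_t> row;  // link row payload
};

// Reorder records into original-ID order using buckets of `bucket_rows`
// consecutive IDs, so only one bucket needs to be resident at a time.
inline std::vector<std::vector<std::uint32_t>> bucketed_reorder(
  const std::vector<link_record>& records, std::size_t n_rows, std::size_t bucket_rows)
{
  assert(bucket_rows > 0);
  const std::size_t n_buckets = (n_rows + bucket_rows - 1) / bucket_rows;
  std::vector<std::vector<link_record>> buckets(n_buckets);
  // Pass 1: append each record to its bucket (a sequential write per bucket file).
  for (const auto& rec : records) {
    buckets.at(rec.id / bucket_rows).push_back(rec);
  }
  std::vector<std::vector<std::uint32_t>> out;
  out.reserve(n_rows);
  // Pass 2: replay one bucket at a time through a small scatter buffer,
  // then emit the buffer sequentially.
  for (std::size_t b = 0; b < n_buckets; ++b) {
    const std::size_t first = b * bucket_rows;
    const std::size_t count = std::min(bucket_rows, n_rows - first);
    std::vector<std::vector<std::uint32_t>> scatter(count);
    for (const auto& rec : buckets[b]) {
      scatter.at(rec.id - first) = rec.row;
    }
    for (auto& row : scatter) {
      out.push_back(std::move(row));
    }
  }
  return out;
}
```

Peak memory for the scatter step is bounded by one bucket rather than the whole base section, at the cost of writing and re-reading every record once.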

Contributor Author

I've added #2241 on top of this that implements strategies (1) and (3).

@julianmi julianmi marked this pull request as ready for review June 12, 2026 13:25
@julianmi julianmi requested review from a team as code owners June 12, 2026 13:25

coderabbitai Bot commented Jun 12, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Walkthrough

This PR adds GPU_LAYERED_ON_DISK support for HNSW, including public API changes, layered artifact serialization and deserialization, benchmark save handling, ACE build updates, tests, and a new example.

Changes

Layered HNSW disk-backed flow

Layer: API and benchmark contracts
File(s): cpp/include/cuvs/neighbors/hnsw.hpp, cpp/bench/ann/src/cuvs/cuvs_cagra_hnswlib.cu
Summary: Adds HnswHierarchy::GPU_LAYERED_ON_DISK, index_params::dataset_path, layered benchmark parsing, default dataset path selection, and prefixed ACE parameter forwarding.

Layer: Layered artifact format and load path
File(s): cpp/src/neighbors/detail/hnsw.hpp
Summary: Adds layered artifact metadata and file-format helpers, writes layered artifacts from disk-backed CAGRA, blocks standard file-descriptor loading for the layered hierarchy, and routes serialize/deserialize through the layered artifact path.

Layer: Benchmark save relocation
File(s): cpp/bench/ann/src/cuvs/cuvs_cagra_hnswlib_wrapper.h
Summary: Adds overwrite-safe file movement and uses it for layered and ACE save-path relocation.

Layer: ACE build logging and validation
File(s): cpp/src/neighbors/detail/cagra/cagra_build.cuh
Summary: Updates ACE build progress reporting, logging, warning text, memory reporting, and parameter validation across disk and partition-processing paths.

Layer: Tests for layered artifact flow
File(s): cpp/tests/neighbors/ann_hnsw_ace.cuh, cpp/tests/neighbors/ann_hnsw_ace/test_*.cu
Summary: Adds layered build/deserialize/search coverage, corruption and mismatch failures, layered inputs, and parameterized test instantiations for multiple type combinations.

Layer: Example usage flow
File(s): examples/cpp/src/hnsw_ace_layered_example.cu, examples/cpp/CMakeLists.txt
Summary: Adds a new layered HNSW example target that writes dataset input, builds a layered artifact, deserializes it, and runs search with optional quantization.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Suggested labels

feature request, C++

Suggested reviewers

  • KyleFromNVIDIA
  • dantegd
  • msarahan
🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage: ⚠️ Warning. Docstring coverage is 3.70%, which is insufficient. The required threshold is 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

  • Title check: ✅ Passed. The title clearly summarizes the main change: adding layered HNSW index support.
  • Description check: ✅ Passed. The description is detailed and directly matches the layered HNSW artifact workflow described by the changeset.
  • Linked Issues check: ✅ Passed. Check skipped because no linked issues were found for this pull request.
  • Out of Scope Changes check: ✅ Passed. Check skipped because no linked issues were found for this pull request.

Comment @coderabbitai help to get the list of available commands.

julianmi added 2 commits June 22, 2026 08:33
- hnsw.hpp: adopt upstream's serialize_to_hnswlib_batched / from_inmem core and
  auto-selecting build(); re-apply layered helpers, serialize_to_layered_hnsw_from_disk,
  deserialize_layered_hnsw, and GPU_LAYERED_ON_DISK routing in from_cagra/serialize/
  deserialize/build. Tighten the in-memory spill guard to NONE/GPU only.
- cagra_build.cuh: take upstream's refined ACE memory-limit clamping and logging.
- bench: adopt upstream hnsw_params.M; forward ace_* overrides via graph_build_params.
- ACE tests: keep both the layered and in-memory disk-spill test cases.
- examples: keep both HNSW_ACE_LAYERED and HNSW_OPENAI examples.

@coderabbitai coderabbitai Bot left a comment

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
cpp/src/neighbors/detail/hnsw.hpp (2)

2396-2405: 🩺 Stability & Availability | 🟠 Major | ⚡ Quick win

HIGH: Validate enterpoint_node before storing it in hnswlib state.

Have you considered range-checking metadata.enterpoint_node against [0, n_rows) during deserialize? A malformed layered artifact can currently survive load with an invalid entrypoint and then walk hnswlib with an out-of-bounds internal id on the first search.

Suggested fix
   RAFT_EXPECTS(metadata.n_rows > 0, "Layered HNSW artifact must contain at least one row");
   RAFT_EXPECTS(metadata.dim > 0, "Layered HNSW artifact must contain at least one dimension");
+  RAFT_EXPECTS(metadata.enterpoint_node >= 0 &&
+                 static_cast<size_t>(metadata.enterpoint_node) < metadata.n_rows,
+               "Layered HNSW artifact enterpoint_node (%d) is outside [0, %zu)",
+               metadata.enterpoint_node,
+               metadata.n_rows);
   RAFT_EXPECTS(static_cast<size_t>(dim) == metadata.dim,
                "Layered HNSW artifact dim (%zu) does not match requested dim (%d)",
                metadata.dim,
                dim);

As per coding guidelines, Input validation must check for negative or invalid dimensions, null pointers, and invalid parameter combinations before GPU operations.

Also applies to: 2499-2504

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cpp/src/neighbors/detail/hnsw.hpp` around lines 2396 - 2405, Add a
deserialize-time validation for metadata.enterpoint_node in the layered HNSW
load path before it is written into hnswlib state. In the same checks near the
existing RAFT_EXPECTS guards in the layered HNSW deserialization logic, verify
that enterpoint_node is within [0, metadata.n_rows) and reject malformed
artifacts early, using the relevant deserialize/load helper and the hnswlib
state initialization code that stores the entrypoint.

Source: Coding guidelines


727-735: 🗄️ Data Integrity & Integration | 🟠 Major | ⚡ Quick win

HIGH: Reject .npy datasets whose dtype or memory order does not match the layered artifact.

Have you considered carrying the .npy header’s dtype/layout through open_npy_file() and validating it here? Right now only the shape is checked, so a same-shaped but wrong-typed or Fortran-ordered dataset will be streamed as row-major T bytes and silently corrupt the reconstructed index. As per coding guidelines, Data layout (row-major vs column-major) must be explicitly verified and handled in memory access and Data format parameters (row-major vs column-major, memory layout) must be explicit in function signatures and documentation; ambiguous data layout assumptions should be clarified or eliminated.

Also applies to: 2428-2459
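
To illustrate the kind of check being requested, the sketch below validates the `descr` and `fortran_order` entries of a .npy header dictionary string. It is not the PR's `open_npy_file()`; the helper names are hypothetical, and the parsing assumes the simple single-quoted dictionary that NumPy writes for non-structured dtypes.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Extract the quoted value of 'descr' from a .npy header dictionary string.
inline std::string npy_descr(const std::string& header)
{
  const std::string key = "'descr':";
  const auto pos = header.find(key);
  if (pos == std::string::npos) { throw std::invalid_argument("missing 'descr'"); }
  const auto open = header.find('\'', pos + key.size());
  const auto close =
    (open == std::string::npos) ? std::string::npos : header.find('\'', open + 1);
  if (close == std::string::npos) { throw std::invalid_argument("malformed 'descr'"); }
  return header.substr(open + 1, close - open - 1);
}

// Read the 'fortran_order' flag (Python literal True or False).
inline bool npy_fortran_order(const std::string& header)
{
  const std::string key = "'fortran_order':";
  const auto pos = header.find(key);
  if (pos == std::string::npos) { throw std::invalid_argument("missing 'fortran_order'"); }
  const auto val = header.find_first_not_of(' ', pos + key.size());
  if (val == std::string::npos) { throw std::invalid_argument("malformed 'fortran_order'"); }
  if (header.compare(val, 4, "True") == 0) { return true; }
  if (header.compare(val, 5, "False") == 0) { return false; }
  throw std::invalid_argument("malformed 'fortran_order'");
}

// Reject anything that is not the expected dtype in row-major (C) order.
inline void expect_row_major(const std::string& header, const std::string& expected_descr)
{
  if (npy_descr(header) != expected_descr) { throw std::invalid_argument("dtype mismatch"); }
  if (npy_fortran_order(header)) {
    throw std::invalid_argument("Fortran-ordered data is not supported");
  }
}
```

For a float32 dataset on a little-endian host, the expected `descr` would be `<f4`; a same-shaped file with any other descriptor or with `fortran_order` set to `True` is rejected before any bytes are streamed.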

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cpp/src/neighbors/detail/hnsw.hpp` around lines 727 - 735, open_npy_file()
currently returns only the shape, so layered_artifact::load_from_npz() can
accept same-shaped but wrong-typed or Fortran-ordered .npy data and corrupt the
index. Update open_npy_file() to carry the numpy header’s dtype and
layout/stride metadata alongside shape, then validate those fields in the
layered artifact loading path before streaming bytes into T. Use the existing
symbols open_npy_file(), npy_file, and layered_artifact::load_from_npz() to
enforce that only the expected row-major dtype/layout is accepted, and reject
mismatches early.

Source: Coding guidelines

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: b15a3474-efb8-43fe-8b06-6a6283890e30

📥 Commits

Reviewing files that changed from the base of the PR and between 0e2d458 and 4513f6f.

📒 Files selected for processing (10)
  • cpp/bench/ann/src/cuvs/cuvs_cagra_hnswlib.cu
  • cpp/bench/ann/src/cuvs/cuvs_cagra_hnswlib_wrapper.h
  • cpp/src/neighbors/detail/cagra/cagra_build.cuh
  • cpp/src/neighbors/detail/hnsw.hpp
  • cpp/tests/neighbors/ann_hnsw_ace.cuh
  • cpp/tests/neighbors/ann_hnsw_ace/test_float_uint32_t.cu
  • cpp/tests/neighbors/ann_hnsw_ace/test_half_uint32_t.cu
  • cpp/tests/neighbors/ann_hnsw_ace/test_int8_t_uint32_t.cu
  • cpp/tests/neighbors/ann_hnsw_ace/test_uint8_t_uint32_t.cu
  • examples/cpp/CMakeLists.txt
🚧 Files skipped from review as they are similar to previous changes (6)
  • cpp/tests/neighbors/ann_hnsw_ace/test_float_uint32_t.cu
  • examples/cpp/CMakeLists.txt
  • cpp/bench/ann/src/cuvs/cuvs_cagra_hnswlib.cu
  • cpp/tests/neighbors/ann_hnsw_ace.cuh
  • cpp/bench/ann/src/cuvs/cuvs_cagra_hnswlib_wrapper.h
  • cpp/src/neighbors/detail/cagra/cagra_build.cuh

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@cpp/include/cuvs/util/numpy_dtype.hpp`:
- Around line 24-35: The public API in numpy_dtype.hpp is missing required
Doxygen coverage, so add complete Doxygen comments for numpy_dtype_string,
make_numpy_header_from_dtype, and make_numpy_header_string, including parameter
descriptions, return values, and any side effects/contract details; place the
docs directly above the function declarations and ensure the public header
exposes the serialized-header behavior clearly for downstream users.
- Around line 24-35: These new public helpers expose stateful STL types in the
cuVS API, so either move numpy_dtype_string and make_numpy_header_from_dtype out
of the public header or change their interfaces to use stateless inputs instead
of std::string and std::vector<size_t>. Update the API surface in
cpp/include/cuvs/util/numpy_dtype.hpp and the related helpers around the same
area to rely on POD-style parameters, raft::resources, pointers, or
mdspan-compatible inputs so the public-header contract is preserved.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 80b27030-35d3-4c2c-9a98-ec15b4faa45a

📥 Commits

Reviewing files that changed from the base of the PR and between 6d9b2c6 and 2bf44ea.

📒 Files selected for processing (10)
  • cpp/include/cuvs/util/file_io.hpp
  • cpp/include/cuvs/util/numpy_dtype.hpp
  • cpp/src/neighbors/brute_force_serialize.cu
  • cpp/src/neighbors/detail/cagra/cagra_serialize.cuh
  • cpp/src/neighbors/detail/hnsw.hpp
  • cpp/src/neighbors/ivf_flat/ivf_flat_serialize.cuh
  • cpp/src/neighbors/ivf_sq/ivf_sq_serialize.cuh
  • cpp/src/neighbors/mg/snmg.cuh
  • cpp/src/util/serialize_validation.hpp
  • examples/cpp/CMakeLists.txt
🚧 Files skipped from review as they are similar to previous changes (2)
  • examples/cpp/CMakeLists.txt
  • cpp/src/neighbors/detail/hnsw.hpp

Comment thread cpp/include/cuvs/util/numpy_dtype.hpp
@julianmi julianmi requested a review from a team as a code owner June 24, 2026 13:40
…ed-index

# Conflicts:
#	cpp/include/cuvs/util/file_io.hpp
#	cpp/src/neighbors/brute_force_serialize.cu
#	cpp/src/neighbors/detail/cagra/cagra_serialize.cuh
#	cpp/src/neighbors/ivf_flat/ivf_flat_serialize.cuh
#	cpp/src/neighbors/mg/snmg.cuh