Materialize HNSW Layered Index to hnswlib Index #2241

Draft
julianmi wants to merge 12 commits into NVIDIA:main from julianmi:materialize-layered-index

@julianmi julianmi commented Jun 12, 2026

#2148 added an HNSW layered index that compacts the HNSW index by stripping out the dataset. This is especially useful when the index is built on an accelerated machine and moved over a network to a search node. This proposal adds cuvs::neighbors::hnsw::materialize_to_hnswlib, a file-to-file API that converts a GPU_LAYERED_ON_DISK artifact (graph topology only, stored in ACE/build order) plus its original-ID-ordered dataset into a standard hnswlib index file.

Public API (cpp/include/cuvs/neighbors/hnsw.hpp)

I don't think a similar API exists in either cuVS or hnswlib. I've therefore aligned the naming with the existing serialize_to_hnswlib and used "materialize" to make clear that the result is an hnswlib index file. I'm open to other naming suggestions.

struct materialize_params {
  std::string dataset_path;          // original-ID-ordered vectors (.npy or ANN *.bin)
  double      max_host_memory_gb = 0; // 0 => single in-memory reorder; >0 => bucketed temp files
  int         num_threads        = 0; // 0 => max threads
};

void materialize_to_hnswlib(raft::resources const& res,
                            const materialize_params& params,
                            const std::string& layered_artifact_path,
                            const std::string& output_path,
                            int dim,
                            cuvs::distance::DistanceType metric);

A single entry point covers all dtypes: the element type (float, half, uint8_t, int8_t) is
read from the artifact header and dispatched internally.
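
As an illustration of what a single entry point with internal dispatch can look like, here is a minimal, self-contained sketch. It is not the PR's code: the `elem_type` tags, `half_placeholder`, `vector_bytes`, and `dispatch_vector_bytes` are hypothetical names, and the real artifact header encoding is not shown in this description.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>

// Hypothetical tags for the element type stored in the artifact header.
enum class elem_type : std::uint8_t { f32, f16, u8, i8 };

// Stand-in for a 16-bit float type; only its size matters in this sketch.
struct half_placeholder {
  std::uint16_t bits;
};

// Typed worker: bytes occupied by one vector of `dim` elements of type T.
template <typename T>
std::size_t vector_bytes(std::size_t dim)
{
  return dim * sizeof(T);
}

// Untyped entry point: map the runtime tag to a template instantiation.
std::size_t dispatch_vector_bytes(elem_type t, std::size_t dim)
{
  switch (t) {
    case elem_type::f32: return vector_bytes<float>(dim);
    case elem_type::f16: return vector_bytes<half_placeholder>(dim);
    case elem_type::u8: return vector_bytes<std::uint8_t>(dim);
    case elem_type::i8: return vector_bytes<std::int8_t>(dim);
  }
  throw std::invalid_argument("unknown element type in artifact header");
}
```

The caller never names the element type; it only passes whatever tag was read from the file, which is what keeps the public signature free of template parameters.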

Implementation (cpp/src/neighbors/detail/hnsw.hpp)

The goal is to minimize the random access needed for the ID reordering. Random access is cheap in main memory but expensive for disk I/O. Thus, two passes are used:

  • Phase 0 – setup: validate header/descriptors/levels, validate dataset shape, derive the hnswlib
    layout constants from a dummy single-element index, compute the output layout, posix_fallocate
    the output, and write the native hnswlib header.
  • Phase 1/2 – base region: reorder base links from ACE order to ID order and interleave dataset
    vectors, emitting [level-0 link block | vector | label] sequentially. Two strategies:
    • single in-memory scatter when the base topology fits max_host_memory_gb;
    • bucketed temporary files (id_record_spiller) otherwise: sequential append + sequential replay
      per bucket, peak RAM bounded near the budget.
  • Phase 3 – upper region: transpose the upper layers into per-element link lists and append them
    sequentially, with the same in-memory / bucketed fallback.
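
The in-memory scatter for the base region can be sketched as follows. This is a simplified illustration, not the PR's implementation: `scatter_base_links` and `ace_to_id` are hypothetical names, link blocks are assumed to have a fixed width `degree`, and neighbor entries are assumed to be build (ACE) positions that must be remapped through the same table.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Reorder fixed-width level-0 link blocks from build (ACE) order to
// original-ID order. ace_to_id[p] is the original ID of the element stored
// at build position p (assumed to be a permutation of 0..n-1).
std::vector<std::uint32_t> scatter_base_links(const std::vector<std::uint32_t>& links_ace,
                                              const std::vector<std::uint32_t>& ace_to_id,
                                              std::size_t degree)
{
  const std::size_t n = ace_to_id.size();
  assert(links_ace.size() == n * degree);

  std::vector<std::uint32_t> links_id(n * degree);
  for (std::size_t p = 0; p < n; ++p) {  // sequential read in build order
    // Random write, but only into RAM, never to disk.
    const std::size_t dst = static_cast<std::size_t>(ace_to_id[p]) * degree;
    for (std::size_t k = 0; k < degree; ++k) {
      // Remap each neighbor from build position to original ID.
      links_id[dst + k] = ace_to_id[links_ace[p * degree + k]];
    }
  }
  return links_id;  // can now be streamed out sequentially in ID order
}
```

Once the blocks sit in ID order in memory, the output file can be written front to back while the dataset is read front to back, so neither file sees a seek.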

The output is the exact upstream layout (header + level-0 array + per-element link lists), loadable
by hnswlib::loadIndex and by cuvs::neighbors::hnsw::deserialize with hierarchy == CPU.
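
For orientation, the per-element level-0 record in upstream hnswlib can be described with a few offsets. The sketch below mirrors the formulas in hnswlib's `HierarchicalNSW` constructor (32-bit neighbor IDs and link count, `size_t` labels); the struct and function names are hypothetical, and the PR itself derives these constants from a dummy single-element index rather than recomputing them like this.

```cpp
#include <cstddef>
#include <cstdint>

// Offsets inside one level-0 record: [link block | vector | label].
struct level0_layout {
  std::size_t size_links_level0;      // link count + max_m0 neighbor slots
  std::size_t offset_data;            // where the vector starts
  std::size_t label_offset;           // where the label starts
  std::size_t size_data_per_element;  // stride between consecutive records
};

level0_layout make_level0_layout(std::size_t max_m0, std::size_t dim, std::size_t elem_bytes)
{
  level0_layout l{};
  l.size_links_level0     = max_m0 * sizeof(std::uint32_t) + sizeof(std::uint32_t);
  l.offset_data           = l.size_links_level0;
  l.label_offset          = l.offset_data + dim * elem_bytes;
  l.size_data_per_element = l.label_offset + sizeof(std::size_t);
  return l;
}

// Byte offset of element `id` within the level-0 array (file header excluded).
std::size_t level0_record_offset(const level0_layout& l, std::size_t id)
{
  return id * l.size_data_per_element;
}
```

Because the stride is constant, the level-0 array size is known before any link is written, which is what makes preallocating the output in the setup phase possible.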

Details

  • Peak host RAM ≈ one reorder bucket (base_links_bytes / num_buckets) + small (~64 MiB) streaming
    buffers + levels (n bytes) + the upper locator. With the in-memory pass, RAM ≈ the base topology
    only; setting a budget lowers it further.
  • Disk I/O is fully sequential: dataset read once, output written once, base topology read once
    (in-memory) or read once + temp write/read once (bucketed), and upper read once + written once.
  • All four dtypes are tested in cpp/tests/neighbors/ann_hnsw_ace*.
  • The example in examples/cpp/src/hnsw_ace_layered_example.cu was updated to use the materialization path.
  • Adds the C, Java, and Python bindings.
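
To make the peak-RAM estimate above concrete, here is one way a bucket count could be chosen from the memory budget and how records could be routed to buckets. This is a sketch under stated assumptions, not the PR's logic: `choose_num_buckets` and `bucket_of` are hypothetical helpers, the budget is interpreted as GiB, buckets cover contiguous ID ranges, and the streaming buffers, levels array, and upper locator are ignored.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>

// Smallest bucket count such that one bucket of base links fits the budget.
// A budget of 0 (or a topology that already fits) yields a single bucket,
// which corresponds to the in-memory pass.
std::size_t choose_num_buckets(std::uint64_t base_links_bytes, double max_host_memory_gb)
{
  if (max_host_memory_gb <= 0.0) { return 1; }
  const double budget_bytes = max_host_memory_gb * 1024.0 * 1024.0 * 1024.0;
  const double ratio        = static_cast<double>(base_links_bytes) / budget_bytes;
  return ratio <= 1.0 ? 1 : static_cast<std::size_t>(std::ceil(ratio));
}

// Route a record to the bucket that owns its destination ID range, so that
// replaying buckets 0, 1, 2, ... emits records in ascending ID order.
std::size_t bucket_of(std::size_t id, std::size_t n, std::size_t num_buckets)
{
  assert(num_buckets > 0 && id < n);
  const std::size_t ids_per_bucket = (n + num_buckets - 1) / num_buckets;
  return id / ids_per_bucket;
}
```

With a 10 GiB base topology and a 4 GiB budget this gives three buckets, so each replay holds roughly a third of the links in memory at a time.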

copy-pr-bot Bot commented Jun 12, 2026

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.
