Merge branch 'branch-24.12' into diskann-wrapper
tarang-jain authored Dec 2, 2024
2 parents 82e21e8 + 5062594 commit 4b396d7
Showing 100 changed files with 7,213 additions and 433 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/pr.yaml
Expand Up @@ -88,7 +88,7 @@ jobs:
with:
build_type: pull-request
enable_check_symbols: true
symbol_exclusions: (void (thrust::|cub::)|raft_cutlass)
symbol_exclusions: (void (thrust::|cub::))
conda-python-build:
needs: conda-cpp-build
secrets: inherit
2 changes: 1 addition & 1 deletion .github/workflows/test.yaml
Expand Up @@ -23,7 +23,7 @@ jobs:
date: ${{ inputs.date }}
sha: ${{ inputs.sha }}
enable_check_symbols: true
symbol_exclusions: (void (thrust::|cub::)|raft_cutlass)
symbol_exclusions: (void (thrust::|cub::))
conda-cpp-tests:
secrets: inherit
uses: rapidsai/shared-workflows/.github/workflows/[email protected]
1 change: 1 addition & 0 deletions .gitignore
Expand Up @@ -75,6 +75,7 @@ compile_commands.json
.clangd/

# serialized ann indexes
brute_force_index
cagra_index
ivf_flat_index
ivf_pq_index
11 changes: 10 additions & 1 deletion README.md
Expand Up @@ -35,6 +35,7 @@ Finally, faster vector search enables interactions between dense vectors and gra

Below are some common use-cases for vector search


- ### Semantic search
- Generative AI & Retrieval augmented generation (RAG)
- Recommender systems
Expand Down Expand Up @@ -68,6 +69,14 @@ There are several benefits to using cuVS and GPUs for vector search, including

In addition to the items above, cuVS takes on the burden of keeping non-trivial accelerated code up to date as new NVIDIA architectures and CUDA versions are released. This provides a delightful development experience, guaranteeing that any libraries, databases, or applications built on top of it will always be getting the best performance and scale.

## cuVS Technology Stack

cuVS is built on top of the RAPIDS RAFT library of high performance machine learning primitives and provides all the necessary routines for vector search and clustering on the GPU.

![cuVS is built on top of low-level CUDA libraries and provides many important routines that enable vector search and clustering on the GPU](img/tech_stack.png "cuVS Technology Stack")



## Installing cuVS

cuVS comes with pre-built packages that can be installed through [conda](https://conda.io/projects/conda/en/latest/user-guide/getting-started.html#managing-python) and [pip](https://pip.pypa.io/en/stable/). Different packages are available for the different languages supported by cuVS:
Expand Down Expand Up @@ -233,7 +242,7 @@ If you are interested in contributing to the cuVS library, please read our [Cont

For the interested reader, many of the accelerated implementations in cuVS are also based on research papers which can provide a lot more background. We also ask you to please cite the corresponding algorithms by referencing them in your own research.
- [CAGRA: Highly Parallel Graph Construction and Approximate Nearest Neighbor Search](https://arxiv.org/abs/2308.15136)
- [Top-K Algorithms on GPU: A Comprehensive Study and New Methods](https://dl.acm.org/doi/10.1145/3581784.3607062>)
- [Top-K Algorithms on GPU: A Comprehensive Study and New Methods](https://dl.acm.org/doi/10.1145/3581784.3607062)
- [Fast K-NN Graph Construction by GPU Based NN-Descent](https://dl.acm.org/doi/abs/10.1145/3459637.3482344?casa_token=O_nan1B1F5cAAAAA:QHWDEhh0wmd6UUTLY9_Gv6c3XI-5DXM9mXVaUXOYeStlpxTPmV3nKvABRfoivZAaQ3n8FWyrkWw>)
- [cuSLINK: Single-linkage Agglomerative Clustering on the GPU](https://arxiv.org/abs/2306.16354)
- [GPU Semiring Primitives for Sparse Neighborhood Methods](https://arxiv.org/abs/2104.06357)
5 changes: 4 additions & 1 deletion ci/build_wheel_cuvs.sh
Expand Up @@ -3,6 +3,8 @@

set -euo pipefail

package_dir="python/cuvs"

case "${RAPIDS_CUDA_VERSION}" in
12.*)
EXTRA_CMAKE_ARGS=";-DUSE_CUDA_MATH_WHEELS=ON"
Expand All @@ -15,4 +17,5 @@ esac
# Set up skbuild options. Enable sccache in skbuild config options
export SKBUILD_CMAKE_ARGS="-DDETECT_CONDA_ENV=OFF;-DFIND_CUVS_CPP=OFF${EXTRA_CMAKE_ARGS}"

ci/build_wheel.sh cuvs python/cuvs
ci/build_wheel.sh cuvs ${package_dir}
ci/validate_wheel.sh ${package_dir} final_dist
35 changes: 35 additions & 0 deletions ci/validate_wheel.sh
@@ -0,0 +1,35 @@
#!/bin/bash
# Copyright (c) 2024, NVIDIA CORPORATION.

set -euo pipefail

package_dir=$1
wheel_dir_relative_path=$2

RAPIDS_CUDA_MAJOR="${RAPIDS_CUDA_VERSION%%.*}"

# some packages are much larger on CUDA 11 than on CUDA 12
if [[ "${RAPIDS_CUDA_MAJOR}" == "11" ]]; then
PYDISTCHECK_ARGS=(
--max-allowed-size-compressed '1.4G'
)
else
PYDISTCHECK_ARGS=(
--max-allowed-size-compressed '950M'
)
fi

cd "${package_dir}"

rapids-logger "validate packages with 'pydistcheck'"

pydistcheck \
--inspect \
"${PYDISTCHECK_ARGS[@]}" \
"$(echo ${wheel_dir_relative_path}/*.whl)"

rapids-logger "validate packages with 'twine'"

twine check \
--strict \
"$(echo ${wheel_dir_relative_path}/*.whl)"
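The size-limit selection in `ci/validate_wheel.sh` relies on the `%%.*` parameter expansion to reduce a full CUDA version to its major component. The sketch below isolates that step so it can be tried without `pydistcheck` or a built wheel. The function name `max_wheel_size_for` is hypothetical and is not part of the commit; the two limits are the ones shown in the script above.

```shell
#!/bin/bash
# Illustrative sketch only: max_wheel_size_for is a hypothetical helper,
# not a function defined anywhere in this commit.
set -euo pipefail

max_wheel_size_for() {
  local cuda_version="$1"
  # "%%.*" strips the longest suffix that starts with a dot,
  # so "11.8.0" becomes "11" and "12.5.1" becomes "12".
  local cuda_major="${cuda_version%%.*}"

  # Same branch as validate_wheel.sh: CUDA 11 wheels get the larger limit.
  if [[ "${cuda_major}" == "11" ]]; then
    echo "1.4G"
  else
    echo "950M"
  fi
}

max_wheel_size_for "11.8.0"   # prints 1.4G
max_wheel_size_for "12.5.1"   # prints 950M
```

A version string with no dot (for example `11`) is left unchanged by the expansion, so it still selects the CUDA 11 limit.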
2 changes: 1 addition & 1 deletion conda/environments/all_cuda-118_arch-aarch64.yaml
Expand Up @@ -15,7 +15,7 @@ dependencies:
- cmake>=3.26.4,!=3.30.0
- cuda-nvtx=11.8
- cuda-profiler-api=11.8.86
- cuda-python>=11.7.1,<12.0a0
- cuda-python>=11.7.1,<12.0a0,<=11.8.3
- cuda-version=11.8
- cudatoolkit
- cupy>=12.0.0
2 changes: 1 addition & 1 deletion conda/environments/all_cuda-118_arch-x86_64.yaml
Expand Up @@ -15,7 +15,7 @@ dependencies:
- cmake>=3.26.4,!=3.30.0
- cuda-nvtx=11.8
- cuda-profiler-api=11.8.86
- cuda-python>=11.7.1,<12.0a0
- cuda-python>=11.7.1,<12.0a0,<=11.8.3
- cuda-version=11.8
- cudatoolkit
- cupy>=12.0.0
2 changes: 1 addition & 1 deletion conda/environments/all_cuda-125_arch-aarch64.yaml
Expand Up @@ -17,7 +17,7 @@ dependencies:
- cuda-nvcc
- cuda-nvtx-dev
- cuda-profiler-api
- cuda-python>=12.0,<13.0a0
- cuda-python>=12.0,<13.0a0,<=12.6.0
- cuda-version=12.5
- cupy>=12.0.0
- cxx-compiler
2 changes: 1 addition & 1 deletion conda/environments/all_cuda-125_arch-x86_64.yaml
Expand Up @@ -17,7 +17,7 @@ dependencies:
- cuda-nvcc
- cuda-nvtx-dev
- cuda-profiler-api
- cuda-python>=12.0,<13.0a0
- cuda-python>=12.0,<13.0a0,<=12.6.0
- cuda-version=12.5
- cupy>=12.0.0
- cxx-compiler
2 changes: 1 addition & 1 deletion conda/environments/bench_ann_cuda-118_arch-aarch64.yaml
Expand Up @@ -15,7 +15,7 @@ dependencies:
- cmake>=3.26.4,!=3.30.0
- cuda-nvtx=11.8
- cuda-profiler-api=11.8.86
- cuda-python>=11.7.1,<12.0a0
- cuda-python>=11.7.1,<12.0a0,<=11.8.3
- cuda-version=11.8
- cudatoolkit
- cxx-compiler
2 changes: 1 addition & 1 deletion conda/environments/bench_ann_cuda-118_arch-x86_64.yaml
Expand Up @@ -15,7 +15,7 @@ dependencies:
- cmake>=3.26.4,!=3.30.0
- cuda-nvtx=11.8
- cuda-profiler-api=11.8.86
- cuda-python>=11.7.1,<12.0a0
- cuda-python>=11.7.1,<12.0a0,<=11.8.3
- cuda-version=11.8
- cudatoolkit
- cxx-compiler
2 changes: 1 addition & 1 deletion conda/environments/bench_ann_cuda-125_arch-aarch64.yaml
Expand Up @@ -17,7 +17,7 @@ dependencies:
- cuda-nvcc
- cuda-nvtx-dev
- cuda-profiler-api
- cuda-python>=12.0,<13.0a0
- cuda-python>=12.0,<13.0a0,<=12.6.0
- cuda-version=12.5
- cxx-compiler
- cython>=3.0.0
2 changes: 1 addition & 1 deletion conda/environments/bench_ann_cuda-125_arch-x86_64.yaml
Expand Up @@ -17,7 +17,7 @@ dependencies:
- cuda-nvcc
- cuda-nvtx-dev
- cuda-profiler-api
- cuda-python>=12.0,<13.0a0
- cuda-python>=12.0,<13.0a0,<=12.6.0
- cuda-version=12.5
- cxx-compiler
- cython>=3.0.0
8 changes: 5 additions & 3 deletions conda/recipes/cuvs/meta.yaml
Expand Up @@ -26,6 +26,7 @@ build:
- {{ compiler('cuda') }}
- cuda-cudart-dev
{% endif %}
- cuda-python

requirements:
build:
Expand All @@ -42,10 +43,10 @@ requirements:
- {{ stdlib("c") }}
host:
{% if cuda_major == "11" %}
- cuda-python >=11.7.1,<12.0a0
- cuda-python >=11.7.1,<12.0a0,<=11.8.3
- cudatoolkit
{% else %}
- cuda-python >=12.0,<13.0a0
- cuda-python >=12.0,<13.0a0,<=12.6.0
- cuda-cudart-dev
{% endif %}
- cuda-version ={{ cuda_version }}
Expand All @@ -60,13 +61,14 @@ requirements:
- {{ pin_compatible('cuda-version', max_pin='x', min_pin='x') }}
{% if cuda_major == "11" %}
- cudatoolkit
- cuda-python >=11.7.1,<12.0a0,<=11.8.3
{% else %}
- cuda-cudart
- cuda-python >=12.0,<13.0a0,<=12.6.0
{% endif %}
- pylibraft {{ minor_version }}
- libcuvs {{ version }}
- python x.x
- cuda-python
- numpy >=1.23,<3.0a0

tests:
4 changes: 4 additions & 0 deletions cpp/CMakeLists.txt
Expand Up @@ -369,7 +369,9 @@ if(BUILD_SHARED_LIBS)
src/distance/detail/fused_distance_nn.cu
src/distance/distance.cu
src/distance/pairwise_distance.cu
src/distance/sparse_distance.cu
src/neighbors/brute_force.cu
src/neighbors/brute_force_serialize.cu
src/neighbors/cagra_build_float.cu
src/neighbors/cagra_build_half.cu
src/neighbors/cagra_build_int8.cu
Expand Down Expand Up @@ -436,6 +438,7 @@ if(BUILD_SHARED_LIBS)
src/neighbors/nn_descent.cu
src/neighbors/nn_descent_float.cu
src/neighbors/nn_descent_half.cu
src/neighbors/nn_descent_index.cpp
src/neighbors/nn_descent_int8.cu
src/neighbors/nn_descent_uint8.cu
src/neighbors/reachability.cu
Expand All @@ -448,6 +451,7 @@ if(BUILD_SHARED_LIBS)
src/neighbors/refine/detail/refine_host_int8_t_float.cpp
src/neighbors/refine/detail/refine_host_uint8_t_float.cpp
src/neighbors/sample_filter.cu
src/neighbors/sparse_brute_force.cu
src/neighbors/vamana_build_float.cu
src/neighbors/vamana_build_uint8.cu
src/neighbors/vamana_build_int8.cu
81 changes: 81 additions & 0 deletions cpp/include/cuvs/distance/distance.hpp
Expand Up @@ -20,6 +20,7 @@

#include <cstdint>
#include <cuda_fp16.h>
#include <raft/core/device_csr_matrix.hpp>
#include <raft/core/device_mdspan.hpp>
#include <raft/core/resources.hpp>

Expand Down Expand Up @@ -331,6 +332,86 @@ void pairwise_distance(
cuvs::distance::DistanceType metric,
float metric_arg = 2.0f);

/**
* @brief Compute sparse pairwise distances between x and y, using the provided
* input configuration and distance function.
*
* @code{.cpp}
* #include <raft/core/device_resources.hpp>
* #include <raft/core/device_csr_matrix.hpp>
* #include <raft/core/device_mdspan.hpp>
*
* int x_n_rows = 100000;
* int y_n_rows = 50000;
* int n_cols = 10000;
*
* raft::device_resources handle;
* auto x = raft::make_device_csr_matrix<float>(handle, x_n_rows, n_cols);
* auto y = raft::make_device_csr_matrix<float>(handle, y_n_rows, n_cols);
*
* ...
* // populate data
* ...
*
* auto out = raft::make_device_matrix<float>(handle, x_n_rows, y_n_rows);
* auto metric = cuvs::distance::DistanceType::L2Expanded;
* cuvs::distance::pairwise_distance(handle, x.view(), y.view(), out.view(), metric);
* @endcode
*
* @param[in] handle raft::resources
* @param[in] x raft::device_csr_matrix_view
* @param[in] y raft::device_csr_matrix_view
* @param[out] dist raft::device_matrix_view dense matrix
* @param[in] metric distance metric to use
* @param[in] metric_arg metric argument (used for Minkowski distance)
*/
void pairwise_distance(raft::resources const& handle,
raft::device_csr_matrix_view<const float, int, int, int> x,
raft::device_csr_matrix_view<const float, int, int, int> y,
raft::device_matrix_view<float, int, raft::row_major> dist,
cuvs::distance::DistanceType metric,
float metric_arg = 2.0f);

/**
* @brief Compute sparse pairwise distances between x and y, using the provided
* input configuration and distance function.
*
* @code{.cpp}
* #include <raft/core/device_resources.hpp>
* #include <raft/core/device_csr_matrix.hpp>
* #include <raft/core/device_mdspan.hpp>
*
* int x_n_rows = 100000;
* int y_n_rows = 50000;
* int n_cols = 10000;
*
* raft::device_resources handle;
* auto x = raft::make_device_csr_matrix<double>(handle, x_n_rows, n_cols);
* auto y = raft::make_device_csr_matrix<double>(handle, y_n_rows, n_cols);
*
* ...
* // populate data
* ...
*
* auto out = raft::make_device_matrix<double>(handle, x_n_rows, y_n_rows);
* auto metric = cuvs::distance::DistanceType::L2Expanded;
* cuvs::distance::pairwise_distance(handle, x.view(), y.view(), out.view(), metric);
* @endcode
*
* @param[in] handle raft::resources
* @param[in] x raft::device_csr_matrix_view
* @param[in] y raft::device_csr_matrix_view
* @param[out] dist raft::device_matrix_view dense matrix
* @param[in] metric distance metric to use
* @param[in] metric_arg metric argument (used for Minkowski distance)
*/
void pairwise_distance(raft::resources const& handle,
raft::device_csr_matrix_view<const double, int, int, int> x,
raft::device_csr_matrix_view<const double, int, int, int> y,
raft::device_matrix_view<double, int, raft::row_major> dist,
cuvs::distance::DistanceType metric,
float metric_arg = 2.0f);

/** @} */ // end group pairwise_distance_runtime

}; // namespace cuvs::distance