[RELEASE] raft v23.10 #1863

Merged
merged 89 commits into from
Oct 11, 2023

raydouglass
Member

❄️ Code freeze for branch-23.10 and v23.10 release

What does this mean?

Only critical/hotfix level issues should be merged into branch-23.10 until release (merging of this PR).

What is the purpose of this PR?

  • Update documentation
  • Allow testing for the new release
  • Enable a means to merge branch-23.10 into main for the release

raydouglass and others added 30 commits July 20, 2023 16:47
Forward-merge branch-23.08 to branch-23.10
Forward-merge branch-23.08 to branch-23.10
Forward-merge branch-23.08 to branch-23.10
Forward-merge branch-23.08 to branch-23.10
Forward-merge branch-23.08 to branch-23.10
Forward-merge branch-23.08 to branch-23.10
Forward-merge branch-23.08 to branch-23.10
Forward-merge branch-23.08 to branch-23.10
Forward-merge branch-23.08 to branch-23.10
Forward-merge branch-23.08 to branch-23.10
Forward-merge branch-23.08 to branch-23.10
Forward-merge branch-23.08 to branch-23.10
Forward-merge branch-23.08 to branch-23.10
Forward-merge branch-23.08 to branch-23.10
The CUTLASS-based kernels were disabled on CTK 12. This PR re-enables them.

Authors:
  - Allard Hendriksen (https://github.com/ahendriksen)

Approvers:
  - Corey J. Nolet (https://github.com/cjnolet)

URL: #1702
The previous image put a box around all elements and called it RAFT, which was incorrect. The new image puts a box only around the RAFT elements and is also higher resolution.

Authors:
  - Nathan Stephens (https://github.com/nwstephens)

Approvers:
  - Corey J. Nolet (https://github.com/cjnolet)

URL: #1710
- Update algorithm names to reflect `scripts/ann-benchmarks/algos.yaml`: [`raft_ivf_flat`, `raft_ivf_pq`, `raft_cagra`].
- Add a `-f` short argument to our script, for consistency with the C++ error message that asks for the `-f` flag:
https://github.com/rapidsai/raft/blob/00f30006ecb64873edab16ad8c9fb9d532ff166e/cpp/bench/ann/src/common/benchmark.hpp#L72
- Build the raft_cagra index only once and run the three different algorithms on it, instead of building the same CAGRA index three times.

Authors:
  - Micka (https://github.com/lowener)
  - Corey J. Nolet (https://github.com/cjnolet)

Approvers:
  - Corey J. Nolet (https://github.com/cjnolet)

URL: #1696
The rest of RAPIDS was updated a while ago, but raft was missed.

Authors:
  - Vyas Ramasubramani (https://github.com/vyasr)
  - Corey J. Nolet (https://github.com/cjnolet)

Approvers:
  - Robert Maynard (https://github.com/robertmaynard)
  - Bradley Dice (https://github.com/bdice)
  - Ray Douglass (https://github.com/raydouglass)
  - Corey J. Nolet (https://github.com/cjnolet)

URL: #1677
Authors:
  - Corey J. Nolet (https://github.com/cjnolet)
  - Nathan Stephens (https://github.com/nwstephens)

Approvers:
  - Dante Gama Dessavre (https://github.com/dantegd)

URL: #1713
The raft Python code appears to need no updates to build without warnings (with the exception of upstream cuda-python issues that we expect to be fixed in an upcoming release).

Authors:
  - Vyas Ramasubramani (https://github.com/vyasr)

Approvers:
  - Ben Frederickson (https://github.com/benfred)
  - Ray Douglass (https://github.com/raydouglass)

URL: #1688
- Various documentation updates on C++ and Python doc, mainly for raft::neighbors
- Add QPS vs Recall plot

Authors:
  - Micka (https://github.com/lowener)

Approvers:
  - GALI PREM SAGAR (https://github.com/galipremsagar)
  - Corey J. Nolet (https://github.com/cjnolet)

URL: #1717
This PR updates build.sh to enable NEIGHBORS_ANN_CAGRA_TEST.

Authors:
  - tsuki (https://github.com/enp1s0)
  - Corey J. Nolet (https://github.com/cjnolet)

Approvers:
  - Bradley Dice (https://github.com/bdice)
  - Corey J. Nolet (https://github.com/cjnolet)

URL: #1724
Use correct types for indices pointers and indices when creating the cusparse
descriptor from a device_csr_matrix_view.

Authors:
  - Simon Adorf (https://github.com/csadorf)
  - Corey J. Nolet (https://github.com/cjnolet)

Approvers:
  - Corey J. Nolet (https://github.com/cjnolet)

URL: #1680
tfeher and others added 14 commits September 22, 2023 10:15
Example Jupyter notebook demonstrating usage of the IVF-Flat API.

Authors:
  - Tamas Bela Feher (https://github.com/tfeher)

Approvers:
  - Artem M. Chirkin (https://github.com/achirkin)
  - Corey J. Nolet (https://github.com/cjnolet)

URL: #1758
`python/raft-ann-bench/pyproject.toml` was missed in `update-version.sh`.

This PR refactors the script to update every `pyproject.toml` file, so future ones are captured as well.
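
The idea of updating every `pyproject.toml` rather than a hard-coded list can be sketched as follows. This is a minimal Python illustration, not the contents of `update-version.sh`: the function name `update_pyproject_versions`, the regular expression, and the assumption that each file carries a plain `version = "..."` line are all hypothetical.

```python
import re
import tempfile
from pathlib import Path

# Assumption: each pyproject.toml has a line of the form `version = "..."`.
VERSION_LINE = re.compile(r'^version\s*=\s*"[^"]*"', flags=re.MULTILINE)


def update_pyproject_versions(repo_root: Path, new_version: str) -> list[Path]:
    """Rewrite the first version line in every pyproject.toml under repo_root."""
    updated = []
    # rglob discovers files recursively, so newly added packages are picked up
    # without having to edit this function.
    for path in sorted(repo_root.rglob("pyproject.toml")):
        text = path.read_text()
        new_text, count = VERSION_LINE.subn(
            lambda _match: f'version = "{new_version}"', text, count=1
        )
        if count:
            path.write_text(new_text)
            updated.append(path)
    return updated
```

Because the files are discovered by name, a package added later is handled with no further change to the script.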

Authors:
   - Ray Douglass (https://github.com/raydouglass)

Approvers:
   - AJ Schmidt (https://github.com/ajschmidt8)
   - Jake Awe (https://github.com/AyodeAwe)
RMM was listed as a `host` dependency but not as a `run` dependency, so installing `raft-ann-bench` did not automatically install `RMM` for end users.

Authors:
  - Dante Gama Dessavre (https://github.com/dantegd)

Approvers:
  - Ray Douglass (https://github.com/raydouglass)
  - Corey J. Nolet (https://github.com/cjnolet)

URL: #1838
A number of improvements and fixes for the million-scale datasets that the Python benchmarking package supports by default.

Authors:
  - Dante Gama Dessavre (https://github.com/dantegd)

Approvers:
  - Divye Gala (https://github.com/divyegala)

URL: #1844
Running the CAGRA benchmarks on large datasets could trigger out-of-memory (OOM) errors on the GPU. The cause was holding multiple copies of the dataset in GPU memory. This PR fixes it as follows:

* Free the existing memory for the dataset/graph before allocating new memory during update_dataset/update_graph
* On deserialize, don't allocate GPU memory for the dataset if the serialized index doesn't contain it
* Don't call update_dataset repeatedly with the same dataset in the benchmarking code
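
The effect of the first point, freeing before allocating, can be illustrated with a toy memory pool. This is a sketch under stated assumptions rather than RAFT code: `ToyDevicePool`, `ToyIndex`, and `update_dataset_naive` are hypothetical names, and byte counts stand in for real GPU buffers.

```python
class ToyDevicePool:
    """Tracks current and peak usage of a pretend GPU memory pool."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.in_use = 0
        self.peak = 0

    def allocate(self, nbytes: int) -> int:
        if self.in_use + nbytes > self.capacity:
            raise MemoryError(
                f"out of memory: need {nbytes}, free {self.capacity - self.in_use}"
            )
        self.in_use += nbytes
        self.peak = max(self.peak, self.in_use)
        return nbytes

    def free(self, nbytes: int) -> None:
        self.in_use -= nbytes


class ToyIndex:
    """Holds one dataset copy, represented only by its size in bytes."""

    def __init__(self, pool: ToyDevicePool):
        self.pool = pool
        self.dataset_bytes = 0

    def update_dataset_naive(self, nbytes: int) -> None:
        # Allocates the new copy while the old one is still resident,
        # so two copies briefly coexist.
        new_bytes = self.pool.allocate(nbytes)
        self.pool.free(self.dataset_bytes)
        self.dataset_bytes = new_bytes

    def update_dataset(self, nbytes: int) -> None:
        # Releases the old copy first, so at most one copy is resident.
        self.pool.free(self.dataset_bytes)
        self.dataset_bytes = 0
        self.dataset_bytes = self.pool.allocate(nbytes)
```

With a pool of 100 bytes and a 60-byte dataset, a second call to `update_dataset_naive` raises `MemoryError`, while repeated calls to `update_dataset` keep the peak at 60 bytes.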

Authors:
  - Ben Frederickson (https://github.com/benfred)

Approvers:
  - Corey J. Nolet (https://github.com/cjnolet)

URL: #1832
This PR adds the pre-filtering feature to the CAGRA search implementations.

Rel: taken over from #1765

## Algorithm
The pre-filtering algorithm removes a node that should not appear in the final result only after it has served as a parent node. This way, filtered-out nodes still take part in the graph traversal, which avoids potential performance degradation.

## Changes
- Add a filtering operation on a parent node after the internal top-M buffer candidate calculation.
- Add a filtering operation on the result buffer before the results are stored in device memory.
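
The two filtering points above can be sketched with a toy, single-threaded graph search in Python. This is an illustration of the idea only, not the CUDA kernels: the function `filtered_graph_search`, its parameters, and the 1-D distance are hypothetical simplifications.

```python
def filtered_graph_search(graph, points, query, start, k, buffer_size, is_allowed):
    """Greedy graph search in which rejected nodes may still act as parents.

    graph:      dict mapping node id -> list of neighbor ids
    points:     list of 1-D coordinates, indexed by node id (toy data)
    is_allowed: predicate returning False for nodes excluded from the results
    """
    def dist(node):
        return abs(points[node] - query)

    buffer = [start]    # candidate buffer (the "top-M" list), sorted by distance
    expanded = set()    # nodes that have already served as a parent
    seen = {start}      # nodes ever inserted, so removed nodes never return

    while True:
        pending = [n for n in buffer if n not in expanded]
        if not pending:
            break
        parent = pending[0]  # closest candidate that has not been expanded
        expanded.add(parent)
        for neighbor in graph[parent]:
            if neighbor not in seen:
                seen.add(neighbor)
                buffer.append(neighbor)
        # First filtering point: drop a rejected node only after it was a parent.
        if not is_allowed(parent):
            buffer.remove(parent)
        buffer = sorted(buffer, key=dist)[:buffer_size]

    # Second filtering point: clean the result buffer before returning it.
    return [n for n in buffer if is_allowed(n)][:k]


# A chain 0-1-2-3-4 where only the end points are allowed in the results.
chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
coords = [0.0, 1.0, 2.0, 3.0, 4.0]
allowed = {0, 4}
result = filtered_graph_search(
    chain, coords, query=4.0, start=0, k=1, buffer_size=4,
    is_allowed=lambda n: n in allowed,
)
```

In this example the search can only reach node 4 by walking through the rejected nodes 1, 2, and 3, which is why they are removed after expansion rather than before.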

Authors:
  - tsuki (https://github.com/enp1s0)
  - Micka (https://github.com/lowener)
  - Corey J. Nolet (https://github.com/cjnolet)

Approvers:
  - Tamas Bela Feher (https://github.com/tfeher)
  - Micka (https://github.com/lowener)
  - Corey J. Nolet (https://github.com/cjnolet)

URL: #1811
This PR adds a C++ example program that demonstrates the usage of IVF-Flat vector search.

Authors:
  - Tamas Bela Feher (https://github.com/tfeher)

Approvers:
  - Artem M. Chirkin (https://github.com/achirkin)
  - Corey J. Nolet (https://github.com/cjnolet)

URL: #1828
- [x] Build Time comparison of end-to-end RAFT CAGRA+nn-descent against cuANN CAGRA+nn-descent
- [x] Recall comparison of RAFT nn-descent against cuANN nn-descent
- [x] RAFT types/APIs in ported code from cuANN
- [x] End-to-end CAGRA+nn-descent tests
- [x] Docs and code examples
- [x] Add `graph_build_algo` build param to CAGRA ann-bench for benchmarking builds with IVF-PQ or NN-Descent
- [x] All-neighbors knn graph nn-descent tests against brute-force knn

Recall Value comparison of RAFT nn-descent vs cuANN nn-descent
```
Dataset              (graph_degree, intermediate_degree)  Iterations  cuANN Recall  RAFT Recall
sift-128-euclidean   (64, 98)                             15          0.9265991875  0.9471194688
sift-128-euclidean   (64, 98)                             50          0.9831858594  0.9783938594
deep-image-96-inner  (64, 98)                             50          0.9806211946  0.9801508853
```
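
The PR text does not say how these recall values were computed. A common definition, assumed here, is the fraction of ground-truth neighbors (for example from brute-force knn) that also appear in the approximate graph. The helper `recall` below is a hypothetical sketch of that definition, not the test code from this PR.

```python
def recall(found, truth):
    """Fraction of ground-truth neighbor ids recovered, over all rows.

    found: list of neighbor-id lists produced by the approximate method
    truth: list of neighbor-id lists from an exact method, one per row
    """
    hits = 0
    total = 0
    for found_row, truth_row in zip(found, truth):
        hits += len(set(found_row) & set(truth_row))
        total += len(truth_row)
    return hits / total


approx_neighbors = [[1, 2, 3], [4, 5, 6]]
exact_neighbors = [[1, 2, 9], [4, 5, 6]]
score = recall(approx_neighbors, exact_neighbors)  # 5 of 6 neighbors match
```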

Authors:
  - Divye Gala (https://github.com/divyegala)
  - Ray Wang (https://github.com/RayWang96)
  - Corey J. Nolet (https://github.com/cjnolet)

Approvers:
  - Ray Wang (https://github.com/RayWang96)
  - Ray Douglass (https://github.com/raydouglass)
  - Tamas Bela Feher (https://github.com/tfeher)
  - Corey J. Nolet (https://github.com/cjnolet)

URL: #1748
This PR changes the rule for selecting a CAGRA search mode to follow the one described in the [CAGRA paper](https://arxiv.org/abs/2308.15136).

Authors:
  - tsuki (https://github.com/enp1s0)
  - Tamas Bela Feher (https://github.com/tfeher)

Approvers:
  - Tamas Bela Feher (https://github.com/tfeher)
  - Corey J. Nolet (https://github.com/cjnolet)

URL: #1830
This PR removes block size template parameters from CAGRA search kernel functions to reduce the library size and build time.

rel: #1459

Authors:
  - tsuki (https://github.com/enp1s0)

Approvers:
  - Tamas Bela Feher (https://github.com/tfeher)

URL: #1740
This adds an index class to match the ANN methods. This allows us to precompute norms for the dataset in `brute_force::build` and then use them in `brute_force::search` - meaning we don't have to compute norms for the entire dataset on every query.
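
A minimal pure-Python sketch of the build/search split follows. It assumes squared L2 distance in its expanded form, ||x - q||^2 = ||x||^2 - 2 x.q + ||q||^2, so that the dataset term ||x||^2 can be computed once. The class `ToyBruteForceIndex` is hypothetical and only mirrors the idea, not the signatures of `brute_force::build` or `brute_force::search`.

```python
class ToyBruteForceIndex:
    """Stores the dataset together with its precomputed squared norms."""

    def __init__(self, dataset):
        # "build" step: the norms are computed once and reused by every query.
        self.dataset = dataset
        self.norms = [sum(x * x for x in row) for row in dataset]

    def search(self, query, k):
        # "search" step: only the query norm and dot products are computed here.
        query_norm = sum(x * x for x in query)
        scored = []
        for i, row in enumerate(self.dataset):
            dot = sum(a * b for a, b in zip(row, query))
            scored.append((self.norms[i] - 2.0 * dot + query_norm, i))
        scored.sort()
        return [i for _, i in scored[:k]]


index = ToyBruteForceIndex([[0.0, 0.0], [1.0, 0.0], [5.0, 5.0]])
nearest = index.search([0.9, 0.1], k=2)
```

Without the stored norms, each call to `search` would have to recompute ||x||^2 for every dataset row.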

Authors:
  - Ben Frederickson (https://github.com/benfred)
  - Corey J. Nolet (https://github.com/cjnolet)

Approvers:
  - Corey J. Nolet (https://github.com/cjnolet)

URL: #1817
This PR adds some [devcontainers](https://containers.dev/) to help simplify building the RAFT C++ and Python libraries.

It also adds an optional job to the `pr.yaml` to [build the RAFT libs in each devcontainer](https://github.com/trxcllnt/raft/blob/fea/devcontainers/.github/workflows/pr.yaml#L96-L101), so the build caches are populated for devs by CI.

A devcontainer can be launched by clicking the "Reopen in Container" button that VSCode shows when opening the repo (or by using the "Rebuild and Reopen in Container" command from the command palette):
![image](https://user-images.githubusercontent.com/178183/221771999-97ab29d5-e718-4e5f-b32f-2cdd51bba25c.png)

Clicking this button will cause VSCode to prompt the user to select one of these devcontainer variants:
![image](https://github.com/rapidsai/rmm/assets/178183/68d4b264-4fc2-4008-92b6-cb4bdd19b29f)

On startup, the devcontainer creates or updates the conda/pip environment using `raft/dependencies.yaml`. The envs/package caches are cached on the host via volume mounts, which are described in more detail in [`.devcontainer/README.md`](https://github.com/trxcllnt/raft/blob/fea/devcontainers/.devcontainer/README.md).

The container includes convenience functions to clean, configure, and build the various RAFT components:

```shell
$ clean-raft-cpp # only cleans the C++ build dir
$ clean-pylibraft-python # only cleans the Python build dir
$ clean-raft # cleans both C++ and Python build dirs

$ configure-raft-cpp # only configures raft C++ lib

$ build-raft-cpp # only builds raft C++ lib
$ build-pylibraft-python # only builds raft Python lib
$ build-raft # builds both C++ and Python libs
```

* The C++ build script is a small wrapper around `cmake -S ~/raft/cpp -B ~/raft/cpp/build` and `cmake --build ~/raft/cpp/build`
* The Python build script is a small wrapper around `pip install --editable ~/raft/cpp`

Unlike `build.sh`, these convenience scripts *don't* install the libraries after building them. Instead, they automatically inject the correct arguments to build the C++ libraries from source and use their build dirs as package roots:

```shell
$ cmake -S ~/raft/cpp -B ~/raft/cpp/build
# CMAKE_ARGS is injected automatically by the convenience scripts
$ CMAKE_ARGS="-Draft_ROOT=~/raft/cpp/build" \
  pip install -e ~/raft/cpp
```

Authors:
  - Paul Taylor (https://github.com/trxcllnt)

Approvers:
  - Corey J. Nolet (https://github.com/cjnolet)
  - Jake Awe (https://github.com/AyodeAwe)

URL: #1791
@raydouglass raydouglass requested review from a team as code owners September 28, 2023 14:57
@copy-pr-bot

copy-pr-bot bot commented Sep 28, 2023

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

galipremsagar and others added 5 commits September 29, 2023 10:20
This PR pins `dask` and `distributed` to `2023.9.2` for `23.10` release.

xref: rapidsai/cudf#14225

Authors:
   - GALI PREM SAGAR (https://github.com/galipremsagar)

Approvers:
   - Ray Douglass (https://github.com/raydouglass)
   - Peter Andreas Entschev (https://github.com/pentschev)
   - Ben Frederickson (https://github.com/benfred)
This PR fixes a bug in the filtering operations in the CAGRA multi-kernel search implementation. This bug caused the test of #1837 to fail.

Authors:
   - tsuki (https://github.com/enp1s0)

Approvers:
   - Micka (https://github.com/lowener)
   - Corey J. Nolet (https://github.com/cjnolet)
…orms (#1865)

This makes the faiss integration substantially easier, since we can just use the existing norms that have already been calculated in GpuDistanceParams::vectorNorms, rather than requiring an owned copy that lives in the brute force index.

Authors:
   - Ben Frederickson (https://github.com/benfred)

Approvers:
   - Corey J. Nolet (https://github.com/cjnolet)
Closes #1600.
To be merged after #1803 and #1811
This PR adds `bitset_filter` to filter an index with a bitset.
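
The idea of filtering by bitset can be sketched in Python: one bit per dataset row, where a set bit means the row may appear in search results. The names `ToyBitset` and `make_bitset_filter`, the 32-bit word size, and the "set bit means keep" convention are assumptions for illustration and are not taken from the RAFT implementation.

```python
class ToyBitset:
    """Fixed-size bitset stored as a list of 32-bit words."""

    WORD_BITS = 32

    def __init__(self, n_bits, default=True):
        n_words = (n_bits + self.WORD_BITS - 1) // self.WORD_BITS
        fill = 0xFFFFFFFF if default else 0
        self.n_bits = n_bits
        self.words = [fill] * n_words

    def set(self, index, value):
        word, bit = divmod(index, self.WORD_BITS)
        if value:
            self.words[word] |= 1 << bit
        else:
            self.words[word] &= ~(1 << bit)

    def test(self, index):
        word, bit = divmod(index, self.WORD_BITS)
        return bool((self.words[word] >> bit) & 1)


def make_bitset_filter(bitset):
    """Return a predicate that keeps a sample only if its bit is set."""
    return lambda sample_index: bitset.test(sample_index)


bits = ToyBitset(40)        # all 40 rows allowed by default
bits.set(3, False)          # exclude row 3
bits.set(35, False)         # exclude row 35 (lives in the second word)
keep = make_bitset_filter(bits)
excluded = [i for i in range(40) if not keep(i)]
```

A predicate of this shape could be passed to a search routine as the per-sample filter, at a cost of one bit of memory per dataset row.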

Authors:
   - Micka (https://github.com/lowener)
   - tsuki (https://github.com/enp1s0)
   - Corey J. Nolet (https://github.com/cjnolet)

Approvers:
   - Corey J. Nolet (https://github.com/cjnolet)
@raydouglass raydouglass merged commit 51f52c1 into main Oct 11, 2023
2 checks passed