[RELEASE] raft v24.10 #2460

Merged
merged 40 commits into from
Oct 9, 2024


raydouglass

❄️ Code freeze for branch-24.10 and v24.10 release

What does this mean?

Only critical/hotfix level issues should be merged into branch-24.10 until release (merging of this PR).

What is the purpose of this PR?

  • Update documentation
  • Allow testing for the new release
  • Enable a means to merge branch-24.10 into main for the release

raydouglass and others added 30 commits July 19, 2024 17:26
Forward-merge branch-24.08 into branch-24.10
Forward-merge branch-24.08 into branch-24.10
Forward-merge branch-24.08 into branch-24.10
Forward-merge branch-24.08 into branch-24.10
Forward-merge branch-24.08 into branch-24.10
Forward-merge branch-24.08 into branch-24.10
Contributes to rapidsai/build-planning#77.

Follow-up to rapidsai/devcontainers#338

Proposes updating the pip devcontainers to use v1.17.0, as part of an effort to use that version across RAPIDS.

Authors:
  - James Lamb (https://github.com/jameslamb)

Approvers:
  - Ray Douglass (https://github.com/raydouglass)
  - Peter Andreas Entschev (https://github.com/pentschev)

URL: #2401
Forward-merge branch-24.08 into branch-24.10
This PR improves `update-version.sh` by clarifying suffix handling and ucxx/ucx-py version handling.

Authors:
  - Bradley Dice (https://github.com/bdice)

Approvers:
  - James Lamb (https://github.com/jameslamb)

URL: #2408
…rsion (#2406)

Contributes to rapidsai/build-planning#58.

`scikit-build-core==0.10.0` was released today (https://github.com/scikit-build/scikit-build-core/releases/tag/v0.10.0), and wheel-building configurations across RAPIDS are incompatible with it.

This proposes upgrading to that version and fixing configuration here in a way that:

* is compatible with that new `scikit-build-core` version
* takes advantage of the forward-compatibility mechanism (`minimum-version`) that `scikit-build-core` provides, to reduce the risk of needing to do this again in the future

Authors:
  - James Lamb (https://github.com/jameslamb)

Approvers:
  - https://github.com/jakirkham

URL: #2406
This PR updates pre-commit hooks to the latest versions that are supported without causing style check errors.

Authors:
  - Kyle Edwards (https://github.com/KyleFromNVIDIA)

Approvers:
  - James Lamb (https://github.com/jameslamb)

URL: #2409
- distance supports half-float
- SDDMM supports half-float
- gemm supports multi-type compose
- transpose & copy support half
- random supports half

Authors:
  - rhdong (https://github.com/rhdong)

Approvers:
  - Corey J. Nolet (https://github.com/cjnolet)

URL: #2382
Use CUDA math wheels to reduce wheel size by not statically linking CUDA math libraries.

Contributes to rapidsai/build-planning#35

Authors:
  - Kyle Edwards (https://github.com/KyleFromNVIDIA)

Approvers:
  - Robert Maynard (https://github.com/robertmaynard)
  - Bradley Dice (https://github.com/bdice)
  - James Lamb (https://github.com/jameslamb)

URL: #2415
This PR implements batched NN Descent, which reduces device memory usage for large datasets (specifically when the dataset is kept on host).
`index_params` now has:
- `n_clusters`: number of clusters to make. A larger value reduces device memory usage. The default is 1, in which case batched NND is not used.

### Notes
- The batching approach may, in rare cases, leave duplicate indices in the knn graph, because distances calculated for the same pair can differ slightly. After sorting by distance, copies of the same index can end up far apart, which makes it difficult to get unique indices (uniqueness is checked by looking at the 2 indices before the current one).
  - handled by adding a `max_duplicates` for `check_unique_indices` in tests
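
The note above can be illustrated with a minimal Python sketch. This is not the `check_unique_indices` test helper from the PR; the function names, the `window` parameter, and the sample row are hypothetical and only show why a 2-entry lookback can miss a duplicate that sorting has pushed far away, and how a `max_duplicates` tolerance would absorb it.

```python
def count_windowed_duplicates(row, window=2):
    """Count entries that repeat one of the `window` entries just before them."""
    dupes = 0
    for i, idx in enumerate(row):
        start = max(0, i - window)
        if idx in row[start:i]:
            dupes += 1
    return dupes


def count_all_duplicates(row):
    """Count every repeated index in the row, regardless of position."""
    return len(row) - len(set(row))


def row_is_acceptable(row, max_duplicates=0):
    """Hypothetical test-side check: tolerate up to `max_duplicates` repeats."""
    return count_all_duplicates(row) <= max_duplicates


# Neighbor indices for one point, already sorted by distance (made-up data).
# Index 7 repeats within the lookback window; index 3 repeats outside it.
row = [7, 3, 7, 9, 5, 3]
print(count_windowed_duplicates(row), count_all_duplicates(row))
```

In this sketch the windowed pass sees only the nearby repeat, so a test that demands strictly unique rows would need a tolerance for the repeat it cannot remove.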

### Benchmarks
- Dataset for NND (no batch and batch) is on host
- Dataset for brute force knn is on device (but still won't be able to run with large datasets even if the data is put on the host because it brings the entire dataset to device anyway)
- The dataset is just a slice of the wiki-all dataset (88M, 768) to test for different sizes
<img width="773" alt="Screenshot 2024-08-02 at 8 35 58 AM" src="https://github.com/user-attachments/assets/2d2d4bb2-a7a9-4731-9b3d-7d0eccf875ea">

Authors:
  - Jinsol Park (https://github.com/jinsolp)

Approvers:
  - Divye Gala (https://github.com/divyegala)
  - Corey J. Nolet (https://github.com/cjnolet)

URL: #2403
This PR removes the NumPy<2 pin. NumPy 2 is expected to work for
RAPIDS projects once CuPy 13.3.0 is released (CuPy 13.2.0 had
some issues that prevented its use with NumPy 2).

Authors:
  - Sebastian Berg (https://github.com/seberg)
  - https://github.com/jakirkham

Approvers:
  - James Lamb (https://github.com/jameslamb)
  - https://github.com/jakirkham

URL: #2414
Contributes to rapidsai/build-planning#88

Finishes the work of dropping Python 3.9 support.

This project stopped building / testing against Python 3.9 as of rapidsai/shared-workflows#235.
This PR updates configuration and docs to reflect that.

## Notes for Reviewers

### How I tested this

Checked that there were no remaining uses like this:

```shell
git grep -E '3\.9'
git grep '39'
git grep 'py39'
```

And similar for variations on Python 3.8 (to catch things that were missed the last time this was done).

Authors:
  - James Lamb (https://github.com/jameslamb)

Approvers:
  - https://github.com/jakirkham

URL: #2417
This PR updates rapidsai/pre-commit-hooks to version 0.4.0.

Authors:
  - Kyle Edwards (https://github.com/KyleFromNVIDIA)

Approvers:
  - James Lamb (https://github.com/jameslamb)

URL: #2420
Contributes to rapidsai/build-planning#40

This PR adds support for Python 3.12.

## Notes for Reviewers

This is part of ongoing work to add Python 3.12 support across RAPIDS.
It temporarily introduces a build/test matrix including Python 3.12, from rapidsai/shared-workflows#213.

A follow-up PR will revert back to pointing at the `branch-24.10` branch of `shared-workflows` once all
RAPIDS repos have added Python 3.12 support.

### This will fail until all dependencies have been updated to Python 3.12

CI here is expected to fail until all of this project's upstream dependencies support Python 3.12.

This can be merged whenever all CI jobs are passing.

Authors:
  - James Lamb (https://github.com/jameslamb)

Approvers:
  - Bradley Dice (https://github.com/bdice)

URL: #2428
`raft::ceildiv` is also being replaced with `raft::div_rounding_up_safe` to avoid including CUDA headers when not needed.
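
For readers unfamiliar with the two helpers, here is a rough Python sketch of the arithmetic involved. Only the names come from the message above; the bodies are assumptions and are not raft's C++ implementations. Both compute a division rounded up for a non-negative dividend and a positive divisor. The first form adds `b - 1` before dividing, which can overflow in fixed-width integer types; the second avoids that intermediate sum.

```python
def ceildiv(a, b):
    """Classic round-up division: relies on the intermediate sum a + b - 1."""
    return (a + b - 1) // b


def div_rounding_up_safe(dividend, divisor):
    """Round-up division without the intermediate sum (assumed behavior)."""
    quotient, remainder = divmod(dividend, divisor)
    return quotient + (1 if remainder else 0)


# Typical use: number of blocks needed to cover n items with a fixed block size.
n_items, block_size = 1000, 256
print(ceildiv(n_items, block_size), div_rounding_up_safe(n_items, block_size))
```

Python integers do not overflow, so the two functions always agree here; the difference only matters in languages with fixed-width integers.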

Authors:
  - Micka (https://github.com/lowener)

Approvers:
  - rhdong (https://github.com/rhdong)
  - Corey J. Nolet (https://github.com/cjnolet)

URL: #2429
We need to update flake8 to fix a false-positive that appears with older flake8 versions on Python 3.12.

Authors:
  - Bradley Dice (https://github.com/bdice)

Approvers:
  - James Lamb (https://github.com/jameslamb)

URL: #2435
* Update fmt (to 11.0.2) and spdlog (to 1.14.1).

* use rmm and ucxx CI artifacts

* try using librmm wheels

* try again to use rmm wheels

* rapids-get-pr-wheel-artifact was missing RAPIDS_PY_WHEEL_NAME

* ok you do not need to provide the Python slug yourself for rapids-get-pr-wheel-artifact

* constraints need 'file://' protocol

* try suppressing unreachable-code diagnostics from nvcc (this should be narrowed down / upstreamed before merging)

* fix checks

* Revert "try suppressing unreachable-code diagnostics from nvcc (this should be narrowed down / upstreamed before merging)"

This reverts commit 3ba2201.

* copyright

* move rapids-cmake overrides [skip ci]

* kick off a build

* fix dependency graph

* devcontainer

* run all CI

* remove testing-only changes [skip ci]
galipremsagar and others added 9 commits September 25, 2024 18:49
In cudf we have observed a ~10% speedup of pytest suite execution by switching the pytest traceback style to `--tb=native`:

```
currently:

102474 passed, 2117 skipped, 902 xfailed in 892.16s (0:14:52)

--tb=short:

102474 passed, 2117 skipped, 902 xfailed in 898.99s (0:14:58)

--tb=no:

102474 passed, 2117 skipped, 902 xfailed in 815.98s (0:13:35)

--tb=native:

102474 passed, 2117 skipped, 902 xfailed in 820.92s (0:13:40)
```
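
As a quick check of the numbers above, the relative change in wall time can be computed directly. This is a small sketch; the `percent_reduction` helper is illustrative and not part of either repo. By this measure `--tb=native` cuts the run time by about 8% relative to the current setting.

```python
# Wall times in seconds, copied from the pytest summaries above.
timings = {
    "currently": 892.16,
    "--tb=short": 898.99,
    "--tb=no": 815.98,
    "--tb=native": 820.92,
}


def percent_reduction(baseline, candidate):
    """Percentage of the baseline wall time saved by the candidate run."""
    return 100.0 * (baseline - candidate) / baseline


baseline = timings["currently"]
for flag, seconds in timings.items():
    if flag != "currently":
        print(f"{flag}: {percent_reduction(baseline, seconds):.1f}% faster")
```

A negative value means the run was slower than the baseline, which is the case for `--tb=short` in these measurements.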

This PR makes a similar change to the `raft` repo.

xref: rapidsai/cudf#16851

Authors:
  - GALI PREM SAGAR (https://github.com/galipremsagar)

Approvers:
  - Corey J. Nolet (https://github.com/cjnolet)

URL: #2446
Linked issue #2450

Authors:
  - Divye Gala (https://github.com/divyegala)

Approvers:
  - Corey J. Nolet (https://github.com/cjnolet)

URL: #2453
This change ensures `rmm-cuXX` is not in the package dependencies when building the pylibraft wheel with `-C "rapidsai.matrix-entry=cuda_suffixed=false"`.

Authors:
  - Paul Taylor (https://github.com/trxcllnt)
  - James Lamb (https://github.com/jameslamb)

Approvers:
  - James Lamb (https://github.com/jameslamb)

URL: #2440
…2439)

- This PR is a part of the feature that applies the prefilter brute-force in Cagra.

Authors:
  - rhdong (https://github.com/rhdong)
  - Corey J. Nolet (https://github.com/cjnolet)

Approvers:
  - Micka (https://github.com/lowener)

URL: #2439
Contributes to rapidsai/build-planning#102

Some RAPIDS libraries are using `ncclCommSplit()`, which was introduced in `nccl==2.18.1.1`. This is part of a series of PRs across RAPIDS updating libraries' pins to `nccl>=2.18.1.1` to ensure they get a new-enough version that supports that.

Authors:
  - James Lamb (https://github.com/jameslamb)

Approvers:
  - Vyas Ramasubramani (https://github.com/vyasr)

URL: #2443
Follow-up to #2443

As part of the work to support NumPy 2 across RAPIDS, we found reason to upgrade some libraries like `cugraph` to slightly newer NCCL (`>=2.19`). Context: rapidsai/build-planning#102 (comment)

This applies that same bump here, to keep the range of NCCL versions consistent across RAPIDS.
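
To see how the two pins relate, here is a minimal Python sketch of a lower-bound check on dotted release numbers. The helper names are hypothetical, and the comparison is simplified: it handles only plain numeric versions such as `2.18.1.1` and is not a full version-specifier implementation.

```python
from itertools import zip_longest


def parse_release(version):
    """Split a plain dotted version like '2.18.1.1' into integer components."""
    return tuple(int(part) for part in version.split("."))


def satisfies_minimum(version, minimum):
    """Return True if `version` >= `minimum`, padding missing components with 0."""
    pairs = zip_longest(parse_release(version), parse_release(minimum), fillvalue=0)
    for have, need in pairs:
        if have != need:
            return have > need
    return True


# Any version allowed by the newer pin (>=2.19) also satisfies the older one.
print(satisfies_minimum("2.19", "2.18.1.1"))
# A version allowed by the older pin can fall below the newer one.
print(satisfies_minimum("2.18.3.1", "2.19"))
```

Because every version at or above 2.19 is also at or above 2.18.1.1, the bump narrows the allowed range without losing the `ncclCommSplit()` guarantee described in #2443.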

Authors:
  - James Lamb (https://github.com/jameslamb)

Approvers:
  - https://github.com/jakirkham
  - Corey J. Nolet (https://github.com/cjnolet)

URL: #2458
I opted to deprecate just the necessary pieces, such as the `index` classes, instead of deprecating every single function.

Authors:
  - Corey J. Nolet (https://github.com/cjnolet)

Approvers:
  - Ben Frederickson (https://github.com/benfred)

URL: #2448
@raydouglass raydouglass requested review from a team as code owners September 27, 2024 14:36
@raydouglass raydouglass requested review from bdice and removed request for a team September 27, 2024 14:36

copy-pr-bot bot commented Sep 27, 2024

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@raydouglass raydouglass merged commit 226c82e into main Oct 9, 2024
4 of 5 checks passed