CI/CMake: Windows CUDA 12.x builds downgrade packages #1008

Closed
akx opened this issue Feb 1, 2024 · 5 comments

Comments

@akx
Contributor

akx commented Feb 1, 2024

It looks like the Windows CUDA 12.x builds end up doing extra work: they first install the version 12 CUDA bits and then downgrade them in the "CUDA Toolkit" step; see e.g. https://github.com/TimDettmers/bitsandbytes/actions/runs/7738001447/job/21097984338:

Channels:
 - pytorch
 - nvidia
 - conda-forge
 - defaults
Platform: win-64
Collecting package metadata (repodata.json): ...working... done
Solving environment: ...working... done

## Package Plan ##

  environment location: C:\Users\runneradmin\miniconda3\envs\bnb-env

  added / updated specs:
    - pytorch-cuda=11.8


The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    cuda-cudart-11.8.89        |                0         1.4 MB  nvidia
    cuda-cudart-dev-11.8.89    |                0         723 KB  nvidia
    cuda-cupti-11.8.87         |                0        11.5 MB  nvidia
    cuda-libraries-11.8.0      |                0           1 KB  nvidia
    cuda-libraries-dev-11.8.0  |                0           1 KB  nvidia
    cuda-nvrtc-11.8.89         |                0        72.1 MB  nvidia
    cuda-nvrtc-dev-11.8.89     |                0        16.1 MB  nvidia
    cuda-nvtx-11.8.86          |                0          43 KB  nvidia
    cuda-runtime-11.8.0        |                0           1 KB  nvidia
    libcublas-11.11.3.6        |                0          33 KB  nvidia
    libcublas-dev-11.11.3.6    |                0       375.9 MB  nvidia
    libcufft-10.9.0.58         |                0           6 KB  nvidia
    libcufft-dev-10.9.0.58     |                0       144.6 MB  nvidia
    libcusolver-11.4.1.48      |                0          29 KB  nvidia
    libcusolver-dev-11.4.1.48  |                0        94.1 MB  nvidia
    libcusparse-11.7.5.86      |                0          13 KB  nvidia
    libcusparse-dev-11.7.5.86  |                0       175.7 MB  nvidia
    libnpp-11.8.0.86           |                0         294 KB  nvidia
    libnpp-dev-11.8.0.86       |                0       143.2 MB  nvidia
    libnvjpeg-11.9.0.86        |                0           4 KB  nvidia
    libnvjpeg-dev-11.9.0.86    |                0         1.9 MB  nvidia
    pytorch-2.2.0              |py3.11_cuda11.8_cudnn8_0        1.42 GB  pytorch
    pytorch-cuda-11.8          |       h24eeafa_5           4 KB  pytorch
    ------------------------------------------------------------
                                           Total:        2.43 GB

The following packages will be DOWNGRADED:

  cuda-cudart                                    12.1.105-0 --> 11.8.89-0 
  cuda-cudart-dev                                12.1.105-0 --> 11.8.89-0 
  cuda-cupti                                     12.1.105-0 --> 11.8.87-0 
  cuda-libraries                                   12.1.0-0 --> 11.8.0-0 
  cuda-libraries-dev                               12.1.0-0 --> 11.8.0-0 
  cuda-nvrtc                                     12.1.105-0 --> 11.8.89-0 
  cuda-nvrtc-dev                                 12.1.105-0 --> 11.8.89-0 
  cuda-nvtx                                      12.1.105-0 --> 11.8.86-0 
  cuda-runtime                                     12.1.0-0 --> 11.8.0-0 
  libcublas                                     12.1.0.26-0 --> 11.11.3.6-0 
  libcublas-dev                                 12.1.0.26-0 --> 11.11.3.6-0 
  libcufft                                       11.0.2.4-0 --> 10.9.0.58-0 
  libcufft-dev                                   11.0.2.4-0 --> 10.9.0.58-0 
  libcusolver                                   11.4.4.55-0 --> 11.4.1.48-0 
  libcusolver-dev                               11.4.4.55-0 --> 11.4.1.48-0 
  libcusparse                                   12.0.2.55-0 --> 11.7.5.86-0 
  libcusparse-dev                               12.0.2.55-0 --> 11.7.5.86-0 
  libnpp                                        12.0.2.50-0 --> 11.8.0.86-0 
  libnpp-dev                                    12.0.2.50-0 --> 11.8.0.86-0 
  libnvjpeg                                     12.1.1.14-0 --> 11.9.0.86-0 
  libnvjpeg-dev                                 12.1.1.14-0 --> 11.9.0.86-0 
  pytorch                    2.2.0-py3.11_cuda12.1_cudnn8_0 --> 2.2.0-py3.11_cuda11.8_cudnn8_0 
  pytorch-cuda                              12.1-hde6ce7c_5 --> 11.8-h24eeafa_5 



Downloading and Extracting Packages: ...working... done
Preparing transaction: ...working... done
Verifying transaction: ...working... done
Executing transaction: ...working... done
Channels:
 - nvidia/label/cuda-11.8.0
 - pytorch
 - nvidia
 - conda-forge
 - defaults
Platform: win-64
Collecting package metadata (repodata.json): ...working... done
Solving environment: ...working... done

I know there's a whole bunch more work to do to clean up and optimize the CI CMake builds, but just wanted to jot this down :)
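
For illustration, one way to avoid the detour might be to install PyTorch together with the pinned pytorch-cuda version in a single solve, so the solver never pulls the CUDA 12.1 packages in the first place. A rough sketch, assuming the workflow keeps using setup-miniconda; the step name and matrix variable below are placeholders, not the actual workflow contents:

    # Hypothetical step -- name and matrix variable are placeholders.
    # Idea: install PyTorch and the pinned pytorch-cuda version in one solve
    # instead of creating a CUDA 12.1 env and downgrading it afterwards.
    - name: Install PyTorch with the matching CUDA toolkit
      shell: bash -el {0}
      run: |
        conda install -y -c pytorch -c nvidia \
          pytorch "pytorch-cuda=${{ matrix.cuda-version }}"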

cc @wkpark

@wkpark
Contributor

wkpark commented Feb 1, 2024

The current CI builds both CUDA 11.8 and CUDA 12.1: (ubuntu, windows) x (cuda11, cuda12) x (py310, py311) = 8 builds in total (a rough sketch of this matrix follows below).

  • The default CUDA version is 12.1 (initialized by the conda-incubator/setup-miniconda@v3.0.1 step).
  • When matrix.cuda-version == 11.8, the "CUDA Toolkit" step then downgrades CUDA to 11.8.

You can check which CUDA version CMake actually used in the log of the CMake config step.
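
For context, the build matrix described above presumably looks something like this in the workflow (key names are my guess, not copied from the actual YAML):

    strategy:
      matrix:
        os: [ubuntu-latest, windows-latest]
        cuda-version: ["11.8", "12.1"]
        python-version: ["3.10", "3.11"]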

@akx
Contributor Author

akx commented Feb 1, 2024

@wkpark #1011 simplifies this to 4 builds (Ubuntu, Windows) x (CUDA 11, CUDA 12) since a cp310 wheel should work fine on Python 3.11 too (see #1010 for more discussion).

Maybe the initial mamba setup step should skip installing CUDA entirely?
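
Something along these lines, perhaps, where the initial step only creates a bare environment and the CUDA bits are installed once, already pinned to the target version (the setup-miniconda inputs are from its docs; the rest is a guess, and the nvidia/label channel matches the one already used in the log above):

    - uses: conda-incubator/setup-miniconda@v3.0.1
      with:
        activate-environment: bnb-env
        python-version: "3.10"
        # no environment-file here, so nothing CUDA-related is solved yet

    - name: Install pinned CUDA toolkit
      shell: bash -el {0}
      run: conda install -y -c "nvidia/label/cuda-${{ matrix.cuda-version }}.0" cuda-toolkit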

@wkpark
Contributor

wkpark commented Feb 1, 2024

@wkpark #1011 simplifies this to 4 builds (Ubuntu, Windows) x (CUDA 11, CUDA 12) since a cp310 wheel should work fine on Python 3.11 too (see #1010 for more discussion).

Thanks for the information!

Maybe the initial mamba setup step should skip installing CUDA entirely?

The mamba/miniconda init step seems slow on Windows (on Ubuntu this step is much faster and does not try to install CUDA). I suspect that some package listed in environment-bnb.yml has a dependency on CUDA.

@Titus-von-Koeller
Collaborator

cc @matthewdouglas

@matthewdouglas
Member

@Titus-von-Koeller I think this can be closed. The workflow has since been changed: it no longer uses miniconda and installs only one CUDA toolkit per run. The Linux builds use Docker containers specific to each toolkit version. And #1111 should have taken care of #1010 as well.
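
For reference, a per-toolkit container job roughly takes this shape (the image tag is a real nvidia/cuda tag, but the job layout is illustrative, not copied from the actual workflow):

    build-linux-cuda118:
      runs-on: ubuntu-latest
      container:
        image: nvidia/cuda:11.8.0-devel-ubuntu22.04
      steps:
        - uses: actions/checkout@v4
        # assumes cmake and git are available in the image or installed in a prior step
        - name: Configure and build
          run: |
            cmake -B build -DCMAKE_BUILD_TYPE=Release .
            cmake --build build -j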
