RuntimeError: CUDA Setup failed despite GPU being available. #1434

Open
kaijun123 opened this issue Dec 2, 2024 · 4 comments

kaijun123 commented Dec 2, 2024

System Info

OS: Ubuntu 24.04.1 LTS
Python: Python 3.10.15

nvcc:
NVIDIA (R) Cuda compiler driver
Built on Thu_Sep_12_02:18:05_PDT_2024
Cuda compilation tools, release 12.6, V12.6.77
Build cuda_12.6.r12.6/compiler.34841621_0

Packages in environment at:

# Name                    Version                   Build  Channel
_libgcc_mutex             0.1                        main  
_openmp_mutex             5.1                       1_gnu  
_sysroot_linux-64_curr_repodata_hack 3                   haa98f57_10  
accelerate                0.21.0                   pypi_0    pypi
aiofiles                  24.1.0                   pypi_0    pypi
aiohappyeyeballs          2.4.3                    pypi_0    pypi
aiohttp                   3.10.10                  pypi_0    pypi
aiosignal                 1.3.1                    pypi_0    pypi
altair                    5.4.1                    pypi_0    pypi
anyio                     4.6.2.post1              pypi_0    pypi
async-timeout             4.0.3                    pypi_0    pypi
attrs                     24.2.0                   pypi_0    pypi
backoff                   2.2.1                    pypi_0    pypi
binutils_impl_linux-64    2.40                 h5293946_0  
binutils_linux-64         2.40.0               hc2dff05_1  
bitsandbytes              0.41.0                   pypi_0    pypi
bzip2                     1.0.8                h5eee18b_6  
ca-certificates           2024.9.24            h06a4308_0  
certifi                   2024.8.30                pypi_0    pypi
charset-normalizer        3.4.0                    pypi_0    pypi
click                     8.1.7                    pypi_0    pypi
contourpy                 1.3.0                    pypi_0    pypi
cuda                      12.6.2                        0    nvidia
cuda-cccl_linux-64        12.6.77                       0    nvidia
cuda-command-line-tools   12.6.2                        0    nvidia
cuda-compiler             12.6.2                        0    nvidia
cuda-crt-dev_linux-64     12.6.77                       0    nvidia
cuda-crt-tools            12.6.77                       0    nvidia
cuda-cudart               12.6.77                       0    nvidia
cuda-cudart-dev           12.6.77                       0    nvidia
cuda-cudart-dev_linux-64  12.6.77                       0    nvidia
cuda-cudart-static        12.6.77                       0    nvidia
cuda-cudart-static_linux-64 12.6.77                       0    nvidia
cuda-cudart_linux-64      12.6.77                       0    nvidia
cuda-cuobjdump            12.6.77                       0    nvidia
cuda-cupti                12.6.80                       0    nvidia
cuda-cupti-dev            12.6.80                       0    nvidia
cuda-cuxxfilt             12.6.77                       0    nvidia
cuda-driver-dev           12.6.77                       0    nvidia
cuda-driver-dev_linux-64  12.6.77                       0    nvidia
cuda-gdb                  12.6.77                       0    nvidia
cuda-libraries            12.6.2                        0    nvidia
cuda-libraries-dev        12.6.2                        0    nvidia
cuda-nsight               12.6.77                       0    nvidia
cuda-nvcc                 12.6.77                       0    nvidia
cuda-nvcc-dev_linux-64    12.6.77                       0    nvidia
cuda-nvcc-impl            12.6.77                       0    nvidia
cuda-nvcc-tools           12.6.77                       0    nvidia
cuda-nvcc_linux-64        12.6.77                       0    nvidia
cuda-nvdisasm             12.6.77                       0    nvidia
cuda-nvml-dev             12.6.77                       2    nvidia
cuda-nvprof               12.6.80                       0    nvidia
cuda-nvprune              12.6.77                       0    nvidia
cuda-nvrtc                12.6.77                       0    nvidia
cuda-nvrtc-dev            12.6.77                       0    nvidia
cuda-nvtx                 12.6.77                       0    nvidia
cuda-nvvm-dev_linux-64    12.6.77                       0    nvidia
cuda-nvvm-impl            12.6.77                       0    nvidia
cuda-nvvm-tools           12.6.77                       0    nvidia
cuda-nvvp                 12.6.80                       0    nvidia
cuda-opencl               12.6.77                       0    nvidia
cuda-opencl-dev           12.6.77                       0    nvidia
cuda-profiler-api         12.6.77                       0    nvidia
cuda-runtime              12.6.2                        0    nvidia
cuda-sanitizer-api        12.6.77                       0    nvidia
cuda-toolkit              12.6.2                        0    nvidia
cuda-tools                12.6.2                        0    nvidia
cuda-version              12.6                          3    nvidia
cuda-visual-tools         12.6.2                        0    nvidia
cycler                    0.12.1                   pypi_0    pypi
dbus                      1.13.18              hb2f20db_0  
distro                    1.9.0                    pypi_0    pypi
einops                    0.6.1                    pypi_0    pypi
einops-exts               0.0.4                    pypi_0    pypi
exceptiongroup            1.2.2                    pypi_0    pypi
expat                     2.6.3                h6a678d5_0  
fastapi                   0.115.2                  pypi_0    pypi
ffmpy                     0.4.0                    pypi_0    pypi
filelock                  3.16.1                   pypi_0    pypi
fontconfig                2.14.1               h55d465d_3  
fonttools                 4.54.1                   pypi_0    pypi
freetype                  2.12.1               h4a9f257_0  
frozenlist                1.4.1                    pypi_0    pypi
fsspec                    2024.10.0                pypi_0    pypi
gcc_impl_linux-64         11.2.0               h1234567_1  
gcc_linux-64              11.2.0               h5c386dc_1  
gds-tools                 1.11.1.6                      0    nvidia
glib                      2.78.4               h6a678d5_0  
glib-tools                2.78.4               h6a678d5_0  
gmp                       6.2.1                h295c915_3  
gradio                    3.35.2                   pypi_0    pypi
gradio-client             0.2.9                    pypi_0    pypi
gxx_impl_linux-64         11.2.0               h1234567_1  
gxx_linux-64              11.2.0               hc2dff05_1  
h11                       0.14.0                   pypi_0    pypi
httpcore                  0.17.3                   pypi_0    pypi
httpx                     0.24.0                   pypi_0    pypi
huggingface-hub           0.26.1                   pypi_0    pypi
icu                       73.1                 h6a678d5_0  
idna                      3.10                     pypi_0    pypi
importlib-resources       6.4.5                    pypi_0    pypi
jinja2                    3.1.4                    pypi_0    pypi
joblib                    1.4.2                    pypi_0    pypi
jsonschema                4.23.0                   pypi_0    pypi
jsonschema-specifications 2024.10.1                pypi_0    pypi
kernel-headers_linux-64   3.10.0              h57e8cba_10  
kiwisolver                1.4.7                    pypi_0    pypi
latex2mathml              3.77.0                   pypi_0    pypi
ld_impl_linux-64          2.40                 h12ee557_0  
libcublas                 12.6.3.3                      0    nvidia
libcublas-dev             12.6.3.3                      0    nvidia
libcufft                  11.3.0.4                      0    nvidia
libcufft-dev              11.3.0.4                      0    nvidia
libcufile                 1.11.1.6                      0    nvidia
libcufile-dev             1.11.1.6                      0    nvidia
libcurand                 10.3.7.77                     0    nvidia
libcurand-dev             10.3.7.77                     0    nvidia
libcusolver               11.7.1.2                      0    nvidia
libcusolver-dev           11.7.1.2                      0    nvidia
libcusparse               12.5.4.2                      0    nvidia
libcusparse-dev           12.5.4.2                      0    nvidia
libffi                    3.4.4                h6a678d5_1  
libgcc-devel_linux-64     11.2.0               h1234567_1  
libgcc-ng                 11.2.0               h1234567_1  
libglib                   2.78.4               hdc74915_0  
libgomp                   11.2.0               h1234567_1  
libiconv                  1.16                 h5eee18b_3  
libnpp                    12.3.1.54                     0    nvidia
libnpp-dev                12.3.1.54                     0    nvidia
libnvfatbin               12.6.77                       0    nvidia
libnvfatbin-dev           12.6.77                       0    nvidia
libnvjitlink              12.6.77                       0    nvidia
libnvjitlink-dev          12.6.77                       0    nvidia
libnvjpeg                 12.3.3.54                     0    nvidia
libnvjpeg-dev             12.3.3.54                     0    nvidia
libpng                    1.6.39               h5eee18b_0  
libstdcxx-devel_linux-64  11.2.0               h1234567_1  
libstdcxx-ng              11.2.0               h1234567_1  
libuuid                   1.41.5               h5eee18b_0  
libxcb                    1.15                 h7f8727e_0  
libxkbcommon              1.0.1                h097e994_2  
libxml2                   2.13.1               hfdd30dd_2  
linkify-it-py             2.0.3                    pypi_0    pypi
llava-med                 1.5.0                    pypi_0    pypi
markdown-it-py            2.2.0                    pypi_0    pypi
markdown2                 2.5.1                    pypi_0    pypi
markupsafe                3.0.2                    pypi_0    pypi
matplotlib                3.9.2                    pypi_0    pypi
mdit-py-plugins           0.3.3                    pypi_0    pypi
mdurl                     0.1.2                    pypi_0    pypi
mpmath                    1.3.0                    pypi_0    pypi
multidict                 6.1.0                    pypi_0    pypi
narwhals                  1.10.0                   pypi_0    pypi
ncurses                   6.4                  h6a678d5_0  
networkx                  3.4.2                    pypi_0    pypi
nibabel                   5.3.1                    pypi_0    pypi
nsight-compute            2024.3.2.3                    0    nvidia
nspr                      4.35                 h6a678d5_0  
nss                       3.89.1               h6a678d5_0  
numpy                     2.1.2                    pypi_0    pypi
nvidia-cublas-cu12        12.4.5.8                 pypi_0    pypi
nvidia-cuda-cupti-cu12    12.4.127                 pypi_0    pypi
nvidia-cuda-nvrtc-cu12    12.4.127                 pypi_0    pypi
nvidia-cuda-runtime-cu12  12.4.127                 pypi_0    pypi
nvidia-cudnn-cu12         9.1.0.70                 pypi_0    pypi
nvidia-cufft-cu12         11.2.1.3                 pypi_0    pypi
nvidia-curand-cu12        10.3.5.147               pypi_0    pypi
nvidia-cusolver-cu12      11.6.1.9                 pypi_0    pypi
nvidia-cusparse-cu12      12.3.1.170               pypi_0    pypi
nvidia-nccl-cu12          2.21.5                   pypi_0    pypi
nvidia-nvjitlink-cu12     12.4.127                 pypi_0    pypi
nvidia-nvtx-cu12          12.4.127                 pypi_0    pypi
openai                    1.12.0                   pypi_0    pypi
openssl                   3.0.15               h5eee18b_0  
orjson                    3.10.9                   pypi_0    pypi
packaging                 24.1                     pypi_0    pypi
pandas                    2.2.3                    pypi_0    pypi
pcre2                     10.42                hebb0a14_1  
peft                      0.4.0                    pypi_0    pypi
pillow                    11.0.0                   pypi_0    pypi
pip                       24.2            py310h06a4308_0  
propcache                 0.2.0                    pypi_0    pypi
protobuf                  5.28.2                   pypi_0    pypi
psutil                    6.1.0                    pypi_0    pypi
pydantic                  1.10.18                  pypi_0    pypi
pydub                     0.25.1                   pypi_0    pypi
pygments                  2.18.0                   pypi_0    pypi
pyparsing                 3.2.0                    pypi_0    pypi
python                    3.10.15              he870216_1  
python-dateutil           2.9.0.post0              pypi_0    pypi
python-multipart          0.0.12                   pypi_0    pypi
pytz                      2024.2                   pypi_0    pypi
pyyaml                    6.0.2                    pypi_0    pypi
readline                  8.2                  h5eee18b_0  
referencing               0.35.1                   pypi_0    pypi
regex                     2024.9.11                pypi_0    pypi
requests                  2.32.3                   pypi_0    pypi
rpds-py                   0.20.0                   pypi_0    pypi
safetensors               0.4.5                    pypi_0    pypi
scikit-learn              1.2.2                    pypi_0    pypi
scipy                     1.14.1                   pypi_0    pypi
semantic-version          2.10.0                   pypi_0    pypi
sentencepiece             0.1.99                   pypi_0    pypi
setuptools                75.1.0          py310h06a4308_0  
shortuuid                 1.0.13                   pypi_0    pypi
six                       1.16.0                   pypi_0    pypi
sniffio                   1.3.1                    pypi_0    pypi
sqlite                    3.45.3               h5eee18b_0  
starlette                 0.40.0                   pypi_0    pypi
svgwrite                  1.4.3                    pypi_0    pypi
sympy                     1.13.1                   pypi_0    pypi
sysroot_linux-64          2.17                h57e8cba_10  
threadpoolctl             3.5.0                    pypi_0    pypi
tiktoken                  0.8.0                    pypi_0    pypi
timm                      0.9.12                   pypi_0    pypi
tk                        8.6.14               h39e8969_0  
tokenizers                0.15.2                   pypi_0    pypi
torch                     2.5.0                    pypi_0    pypi
torchvision               0.20.0                   pypi_0    pypi
tqdm                      4.66.5                   pypi_0    pypi
transformers              4.36.2                   pypi_0    pypi
triton                    3.1.0                    pypi_0    pypi
typing-extensions         4.12.2                   pypi_0    pypi
tzdata                    2024.2                   pypi_0    pypi
uc-micro-py               1.0.3                    pypi_0    pypi
urllib3                   2.2.3                    pypi_0    pypi
uvicorn                   0.32.0                   pypi_0    pypi
wavedrom                  2.0.3.post3              pypi_0    pypi
websockets                13.1                     pypi_0    pypi
wheel                     0.44.0          py310h06a4308_0  
xz                        5.4.6                h5eee18b_1  
yarl                      1.15.5                   pypi_0    pypi
zlib                      1.2.13               h5eee18b_1  
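
The list above contains two CUDA runtimes: `cuda-cudart` 12.6.77 from the nvidia conda channel and `nvidia-cuda-runtime-cu12` 12.4.127 from pip. A minimal sketch for spotting this kind of mix in `conda list` output is below. The function name and the two-package filter are illustrative assumptions, not part of conda or bitsandbytes.

```python
from collections import defaultdict

# Illustrative filter: the conda and pip packages that each ship a CUDA runtime.
RUNTIME_PACKAGES = {"cuda-cudart", "nvidia-cuda-runtime-cu12"}


def cuda_runtime_versions(conda_list_text):
    """Group CUDA runtime packages from `conda list` output by major.minor version."""
    groups = defaultdict(list)
    for line in conda_list_text.splitlines():
        # Skip blank lines and the "# Name  Version  Build  Channel" header.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 2:
            continue
        name, version = fields[0], fields[1]
        if name in RUNTIME_PACKAGES:
            major_minor = ".".join(version.split(".")[:2])
            groups[major_minor].append((name, version))
    return dict(groups)


sample = """\
cuda-cudart               12.6.77                       0    nvidia
nvidia-cuda-runtime-cu12  12.4.127                 pypi_0    pypi
torch                     2.5.0                    pypi_0    pypi
"""

for version, packages in sorted(cuda_runtime_versions(sample).items()):
    print(version, packages)
```

More than one key in the result means the environment carries more than one CUDA runtime version.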

Reproduction

Error message:

================================================================================
WARNING: Manual override via BNB_CUDA_VERSION env variable detected!
BNB_CUDA_VERSION=XXX can be used to load a bitsandbytes version that is different from the PyTorch CUDA version.
If this was unintended set the BNB_CUDA_VERSION variable to an empty string: export BNB_CUDA_VERSION=
If you use the manual override make sure the right libcudart.so is in your LD_LIBRARY_PATH
For example by adding the following to your .bashrc: export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:<path_to_cuda_dir/lib64
Loading CUDA version: BNB_CUDA_VERSION=126
================================================================================
  warn((f'\n\n{"="*80}\n'
False

===================================BUG REPORT===================================
/home/anaconda3/envs/llava-med/lib/python3.10/site-packages/bitsandbytes/cuda_setup/main.py:166: UserWarning: Welcome to bitsandbytes. For bug reports, please run
python -m bitsandbytes
  warn(msg)
================================================================================
/home/anaconda3/envs/llava-med/lib/python3.10/site-packages/bitsandbytes/cuda_setup/main.py:166: UserWarning: /home/anaconda3/envs/llava-med did not contain ['libcudart.so', 'libcudart.so.11.0', 'libcudart.so.12.0'] as expected! Searching further paths...
  warn(msg)
/home/anaconda3/envs/llava-med/lib/python3.10/site-packages/bitsandbytes/cuda_setup/main.py:166: UserWarning: /usr/local/cuda-12.6/lib64 did not contain ['libcudart.so', 'libcudart.so.11.0', 'libcudart.so.12.0'] as expected! Searching further paths...
  warn(msg)
The following directories listed in your path were found to be non-existent: {PosixPath('-DCMAKE_LINKER=/home/anaconda3/envs/llava-med/bin/x86_64-conda-linux-gnu-ld -DCMAKE_STRIP=/home/anaconda3/envs/llava-med/bin/x86_64-conda-linux-gnu-strip')}
The following directories listed in your path were found to be non-existent: {PosixPath('-fvisibility-inlines-hidden -std=c++17 -fmessage-length=0 -march=nocona -mtune=haswell -ftree-vectorize -fPIC -fstack-protector-strong -fno-plt -O2 -ffunction-sections -pipe -isystem /home/anaconda3/envs/llava-med/include  -I/home/anaconda3/envs/llava-med/targets/x86_64-linux/include  -L/home/anaconda3/envs/llava-med/targets/x86_64-linux/lib -L/home/anaconda3/envs/llava-med/targets/x86_64-linux/lib/stubs')}
The following directories listed in your path were found to be non-existent: {PosixPath('-fvisibility-inlines-hidden -std=c++17 -fmessage-length=0 -march=nocona -mtune=haswell -ftree-vectorize -fPIC -fstack-protector-all -fno-plt -Og -g -Wall -Wextra -fvar-tracking-assignments -ffunction-sections -pipe -isystem /home/anaconda3/envs/llava-med/include')}
The following directories listed in your path were found to be non-existent: {PosixPath('-Wl,-O2 -Wl,--sort-common -Wl,--as-needed -Wl,-z,relro -Wl,-z,now -Wl,--disable-new-dtags -Wl,--gc-sections -Wl,-rpath,/home/anaconda3/envs/llava-med/lib -Wl,-rpath-link,/home/anaconda3/envs/llava-med/lib -L/home/anaconda3/envs/llava-med/lib  -L/home/anaconda3/envs/llava-med/targets/x86_64-linux/lib -L/home/anaconda3/envs/llava-med/targets/x86_64-linux/lib/stubs')}
The following directories listed in your path were found to be non-existent: {PosixPath('-march=nocona -mtune=haswell -ftree-vectorize -fPIC -fstack-protector-all -fno-plt -Og -g -Wall -Wextra -fvar-tracking-assignments -ffunction-sections -pipe -isystem /home/anaconda3/envs/llava-med/include')}
The following directories listed in your path were found to be non-existent: {PosixPath(' -ccbin=/home/anaconda3/envs/llava-med/bin/x86_64-conda-linux-gnu-c++')}
The following directories listed in your path were found to be non-existent: {PosixPath('-DNDEBUG -D_FORTIFY_SOURCE=2 -O2 -isystem /home/anaconda3/envs/llava-med/include  -I/home/anaconda3/envs/llava-med/targets/x86_64-linux/include  -L/home/anaconda3/envs/llava-med/targets/x86_64-linux/lib -L/home/anaconda3/envs/llava-med/targets/x86_64-linux/lib/stubs')}
The following directories listed in your path were found to be non-existent: {PosixPath('-D_DEBUG -D_FORTIFY_SOURCE=2 -Og -isystem /home/anaconda3/envs/llava-med/include')}
The following directories listed in your path were found to be non-existent: {PosixPath('//debuginfod.ubuntu.com '), PosixPath('https')}
The following directories listed in your path were found to be non-existent: {PosixPath('-march=nocona -mtune=haswell -ftree-vectorize -fPIC -fstack-protector-strong -fno-plt -O2 -ffunction-sections -pipe -isystem /home/anaconda3/envs/llava-med/include  -I/home/anaconda3/envs/llava-med/targets/x86_64-linux/include  -L/home/anaconda3/envs/llava-med/targets/x86_64-linux/lib -L/home/anaconda3/envs/llava-med/targets/x86_64-linux/lib/stubs')}
CUDA_SETUP: WARNING! libcudart.so not found in any environmental path. Searching in backup paths...
DEBUG: Possible options found for libcudart.so: {PosixPath('/usr/local/cuda/lib64/libcudart.so')}
CUDA SETUP: PyTorch settings found: CUDA_VERSION=124, Highest Compute Capability: 8.9.
CUDA SETUP: To manually override the PyTorch CUDA version please see:https://github.com/TimDettmers/bitsandbytes/blob/main/how_to_use_nonpytorch_cuda.md
CUDA SETUP: Required library version not found: libbitsandbytes_cuda124.so. Maybe you need to compile it from source?
CUDA SETUP: Defaulting to libbitsandbytes_cpu.so...

================================================ERROR=====================================
CUDA SETUP: CUDA detection failed! Possible reasons:
1. You need to manually override the PyTorch CUDA version. Please see: "https://github.com/TimDettmers/bitsandbytes/blob/main/how_to_use_nonpytorch_cuda.md
2. CUDA driver not installed
3. CUDA not installed
4. You have multiple conflicting CUDA libraries
5. Required library not pre-compiled for this bitsandbytes release!
CUDA SETUP: If you compiled from source, try again with `make CUDA_VERSION=DETECTED_CUDA_VERSION` for example, `make CUDA_VERSION=113`.
CUDA SETUP: The CUDA version for the compile might depend on your conda install. Inspect CUDA version via `conda list | grep cuda`.
================================================================================

CUDA SETUP: Something unexpected happened. Please compile from source:
git clone https://github.com/TimDettmers/bitsandbytes.git
cd bitsandbytes
CUDA_VERSION=124
python setup.py install
CUDA SETUP: Setup Failed!
Traceback (most recent call last):
  File "/home/anaconda3/envs/llava-med/lib/python3.10/runpy.py", line 187, in _run_module_as_main
    mod_name, mod_spec, code = _get_module_details(mod_name, _Error)
  File "/home/anaconda3/envs/llava-med/lib/python3.10/runpy.py", line 146, in _get_module_details
    return _get_module_details(pkg_main_name, error)
  File "/home/anaconda3/envs/llava-med/lib/python3.10/runpy.py", line 110, in _get_module_details
    __import__(pkg_name)
  File "/home/anaconda3/envs/llava-med/lib/python3.10/site-packages/bitsandbytes/__init__.py", line 6, in <module>
    from . import cuda_setup, utils, research
  File "/home/anaconda3/envs/llava-med/lib/python3.10/site-packages/bitsandbytes/research/__init__.py", line 1, in <module>
    from . import nn
  File "/home/anaconda3/envs/llava-med/lib/python3.10/site-packages/bitsandbytes/research/nn/__init__.py", line 1, in <module>
    from .modules import LinearFP8Mixed, LinearFP8Global
  File "/home/anaconda3/envs/llava-med/lib/python3.10/site-packages/bitsandbytes/research/nn/modules.py", line 8, in <module>
    from bitsandbytes.optim import GlobalOptimManager
  File "/home/anaconda3/envs/llava-med/lib/python3.10/site-packages/bitsandbytes/optim/__init__.py", line 6, in <module>
    from bitsandbytes.cextension import COMPILED_WITH_CUDA
  File "/home/anaconda3/envs/llava-med/lib/python3.10/site-packages/bitsandbytes/cextension.py", line 20, in <module>
    raise RuntimeError('''
RuntimeError: 
        CUDA Setup failed despite GPU being available. Please run the following command to get more information:

        python -m bitsandbytes

        Inspect the output of the command and see if you can locate CUDA libraries. You might need to add them
        to your LD_LIBRARY_PATH. If you suspect a bug, please take the information from python -m bitsandbytes
        and open an issue at: https://github.com/TimDettmers/bitsandbytes/issues
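
The log shows bitsandbytes looking for `libbitsandbytes_cuda124.so` and then falling back to `libbitsandbytes_cpu.so`. A rough sketch for checking which native libraries the installed package actually ships is below. It assumes the file-name pattern seen in that one log line; the helper names are hypothetical. `importlib.util.find_spec` only locates the package, so the failing import is not triggered.

```python
import importlib.util
from pathlib import Path


def expected_binary(torch_cuda_version):
    """Map a version such as '12.4' to the file name pattern seen in the log."""
    return "libbitsandbytes_cuda{}.so".format(torch_cuda_version.replace(".", ""))


def shipped_binaries(package_dir):
    """List the native libraries present in a bitsandbytes install directory."""
    return sorted(p.name for p in Path(package_dir).glob("libbitsandbytes_*.so"))


if __name__ == "__main__":
    # Locate the package on disk without importing it.
    spec = importlib.util.find_spec("bitsandbytes")
    if spec is None or not spec.submodule_search_locations:
        print("bitsandbytes is not installed in this environment")
    else:
        package_dir = Path(list(spec.submodule_search_locations)[0])
        wanted = expected_binary("12.4")  # from CUDA_VERSION=124 in the log
        found = shipped_binaries(package_dir)
        print("wanted:", wanted)
        print("present:", wanted in found)
        print("shipped:", found)
```

If the wanted file is absent from the shipped list, that matches reason 5 in the error output (library not pre-compiled for this release).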

Steps to reproduce:

  • I am trying to run the LLaVA-Med model on my machine. The repo is here: https://github.com/microsoft/LLaVA-Med

  • I followed the instructions and installed the necessary dependencies:

git clone https://github.com/microsoft/LLaVA-Med.git
cd LLaVA-Med
conda create -n llava-med python=3.10 -y
conda activate llava-med
pip install --upgrade pip  # enable PEP 660 support
pip install -e .
  • Create a Python file in the LLaVA-Med directory with the following code:
from llava.model.builder import load_pretrained_model
tokenizer, model, image_processor, context_len = load_pretrained_model(
    model_path='microsoft/llava-med-v1.5-mistral-7b',
    model_base=None,
    model_name='llava-med-v1.5-mistral-7b',
    # device='cpu'
)
  • Set the environment variables to point to the correct CUDA libraries:
export BNB_CUDA_VERSION=126
export CUDA_HOME=/usr/local/cuda-12.6
export LD_LIBRARY_PATH=/usr/local/cuda-12.6/lib64

To verify that the library files are in the expected location:

/usr/local/cuda-12.6/lib64$ ls
cmake                    libcublas.so.12.6.3.3  libcufft_static_nocallback.a  libcupti.so.12               libcusolverMg.so.11.7.1.2  libnvblas.so.12          libnvperf_host_static.a       libnvToolsExt.so
libaccinj64.so           libcublas_static.a     libcufftw.so                  libcupti.so.2024.3.2         libcusolver.so             libnvblas.so.12.6.3.3    libnvperf_target.so           libnvToolsExt.so.1
libaccinj64.so.12.6      libcudadevrt.a         libcufftw.so.11               libcupti_static.a            libcusolver.so.11          libnvfatbin.so           libnvptxcompiler_static.a     libnvToolsExt.so.1.0.0
libaccinj64.so.12.6.80   libcudart.so           libcufftw.so.11.3.0.4         libcurand.so                 libcusolver.so.11.7.1.2    libnvfatbin.so.12        libnvrtc-builtins.so          libOpenCL.so
libcheckpoint.so         libcudart.so.12        libcufftw_static.a            libcurand.so.10              libcusolver_static.a       libnvfatbin.so.12.6.77   libnvrtc-builtins.so.12.6     libOpenCL.so.1
libcublasLt.so           libcudart.so.12.6.77   libcufilt.a                   libcurand.so.10.3.7.77       libcusparse.so             libnvfatbin_static.a     libnvrtc-builtins.so.12.6.77  libOpenCL.so.1.0
libcublasLt.so.12        libcudart_static.a     libcuinj64.so                 libcurand_static.a           libcusparse.so.12          libnvJitLink.so          libnvrtc-builtins_static.a    libOpenCL.so.1.0.0
libcublasLt.so.12.6.3.3  libcufft.so            libcuinj64.so.12.6            libcusolver_lapack_static.a  libcusparse.so.12.5.4.2    libnvJitLink.so.12       libnvrtc.so                   libpcsamplingutil.so
libcublasLt_static.a     libcufft.so.11         libcuinj64.so.12.6.80         libcusolver_metis_static.a   libcusparse_static.a       libnvJitLink.so.12.6.77  libnvrtc.so.12                stubs
libcublas.so             libcufft.so.11.3.0.4   libculibos.a                  libcusolverMg.so             libmetis_static.a          libnvJitLink_static.a    libnvrtc.so.12.6.77
libcublas.so.12          libcufft_static.a      libcupti.so                   libcusolverMg.so.11          libnvblas.so               libnvperf_host.so        libnvrtc_static.a

The same error occurs when I set LD_LIBRARY_PATH=/usr/local/cuda/lib64
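
The listing shows `libcudart.so`, `libcudart.so.12` and `libcudart.so.12.6.77` in `/usr/local/cuda-12.6/lib64`, yet the log reports that this directory did not contain the expected names. One way to cross-check is to ask the Python process itself what it sees on `LD_LIBRARY_PATH`. The sketch below does that independently of bitsandbytes; `find_cudart` is a hypothetical helper, not a bitsandbytes function.

```python
import os
from pathlib import Path


def find_cudart(ld_library_path):
    """Map each existing directory on the search path to the libcudart files in it."""
    hits = {}
    for entry in ld_library_path.split(os.pathsep):
        # Ignore empty entries and directories that do not exist.
        if not entry or not Path(entry).is_dir():
            continue
        hits[entry] = sorted(p.name for p in Path(entry).glob("libcudart.so*"))
    return hits


if __name__ == "__main__":
    search_path = os.environ.get("LD_LIBRARY_PATH", "")
    for directory, names in find_cudart(search_path).items():
        print(directory, names or "no libcudart found")
```

Run it from the same shell, with the same conda environment active, as the failing script, so the environment variables are the ones the failing process inherits.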

Expected behavior

I should be able to download the model with its pre-trained weights without any errors. Instead, bitsandbytes raises the error shown above.

KevKibe commented Dec 3, 2024

Ran into the same problem.
Package version: 0.42.0
nvcc:

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Wed_Sep_21_10:33:58_PDT_2022
Cuda compilation tools, release 11.8, V11.8.89
Build cuda_11.8.r11.8/compiler.31833905_0

OS: Ubuntu 22.04.2
Python: 3.10.12
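
The nvcc output above reports release 11.8, while the log further down reports `CUDA_VERSION=124` for PyTorch. A small sketch for pulling the release number out of `nvcc --version` text, so it can be compared against the PyTorch value, is below. The helper name is hypothetical, and the regular expression assumes the "release X.Y" wording shown in the output above.

```python
import re
from typing import Optional


def nvcc_release(nvcc_output: str) -> Optional[str]:
    """Return the 'major.minor' release from `nvcc --version` text, or None."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    if match is None:
        return None
    return "{}.{}".format(match.group(1), match.group(2))


sample = """\
nvcc: NVIDIA (R) Cuda compiler driver
Cuda compilation tools, release 11.8, V11.8.89
Build cuda_11.8.r11.8/compiler.31833905_0
"""

toolkit = nvcc_release(sample)
pytorch_tag = "124"  # taken from "CUDA_VERSION=124" in the log
print("nvcc release:", toolkit)
print("matches PyTorch:", toolkit is not None and toolkit.replace(".", "") == pytorch_tag)
```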

!python -m bitsandbytes

False

===================================BUG REPORT===================================
/usr/local/lib/python3.10/dist-packages/bitsandbytes/cuda_setup/main.py:167: UserWarning: Welcome to bitsandbytes. For bug reports, please run

python -m bitsandbytes


  warn(msg)
================================================================================
The following directories listed in your path were found to be non-existent: {PosixPath('/usr/local/nvidia/lib'), PosixPath('/usr/local/nvidia/lib64')}
/usr/local/lib/python3.10/dist-packages/bitsandbytes/cuda_setup/main.py:167: UserWarning: /usr/local/nvidia/lib:/usr/local/nvidia/lib64 did not contain ['libcudart.so', 'libcudart.so.11.0', 'libcudart.so.12.0'] as expected! Searching further paths...
  warn(msg)
The following directories listed in your path were found to be non-existent: {PosixPath('module'), PosixPath('//matplotlib_inline.backend_inline')}
The following directories listed in your path were found to be non-existent: {PosixPath('SHA256'), PosixPath('F4B90V9yLQOFOvRtA/Vg5y9w5pdWE1UUnE7Lr7qzp2U [email protected]')}
CUDA_SETUP: WARNING! libcudart.so not found in any environmental path. Searching in backup paths...
/usr/local/lib/python3.10/dist-packages/bitsandbytes/cuda_setup/main.py:167: UserWarning: Found duplicate ['libcudart.so', 'libcudart.so.11.0', 'libcudart.so.12.0'] files: {PosixPath('/usr/local/cuda/lib64/libcudart.so'), PosixPath('/usr/local/cuda/lib64/libcudart.so.11.0')}.. We select the PyTorch default libcudart.so, which is {torch.version.cuda},but this might missmatch with the CUDA version that is needed for bitsandbytes.To override this behavior set the BNB_CUDA_VERSION=<version string, e.g. 122> environmental variableFor example, if you want to use the CUDA version 122BNB_CUDA_VERSION=122 python ...OR set the environmental variable in your .bashrc: export BNB_CUDA_VERSION=122In the case of a manual override, make sure you set the LD_LIBRARY_PATH, e.g.export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda-11.2
  warn(msg)
DEBUG: Possible options found for libcudart.so: {PosixPath('/usr/local/cuda/lib64/libcudart.so'), PosixPath('/usr/local/cuda/lib64/libcudart.so.11.0')}
CUDA SETUP: PyTorch settings found: CUDA_VERSION=124, Highest Compute Capability: 8.6.
CUDA SETUP: To manually override the PyTorch CUDA version please see:https://github.com/TimDettmers/bitsandbytes/blob/main/how_to_use_nonpytorch_cuda.md
CUDA SETUP: Required library version not found: libbitsandbytes_cuda124.so. Maybe you need to compile it from source?
CUDA SETUP: Defaulting to libbitsandbytes_cpu.so...

================================================ERROR=====================================
CUDA SETUP: CUDA detection failed! Possible reasons:
1. You need to manually override the PyTorch CUDA version. Please see: "https://github.com/TimDettmers/bitsandbytes/blob/main/how_to_use_nonpytorch_cuda.md
2. CUDA driver not installed
3. CUDA not installed
4. You have multiple conflicting CUDA libraries
5. Required library not pre-compiled for this bitsandbytes release!
CUDA SETUP: If you compiled from source, try again with `make CUDA_VERSION=DETECTED_CUDA_VERSION` for example, `make CUDA_VERSION=113`.
CUDA SETUP: The CUDA version for the compile might depend on your conda install. Inspect CUDA version via `conda list | grep cuda`.
================================================================================

CUDA SETUP: Something unexpected happened. Please compile from source:
git clone https://github.com/TimDettmers/bitsandbytes.git
cd bitsandbytes
CUDA_VERSION=124
python setup.py install
CUDA SETUP: Setup Failed!
Traceback (most recent call last):
  File "/usr/lib/python3.10/runpy.py", line 187, in _run_module_as_main
    mod_name, mod_spec, code = _get_module_details(mod_name, _Error)
  File "/usr/lib/python3.10/runpy.py", line 146, in _get_module_details
    return _get_module_details(pkg_main_name, error)
  File "/usr/lib/python3.10/runpy.py", line 110, in _get_module_details
    __import__(pkg_name)
  File "/usr/local/lib/python3.10/dist-packages/bitsandbytes/__init__.py", line 6, in <module>
    from . import cuda_setup, utils, research
  File "/usr/local/lib/python3.10/dist-packages/bitsandbytes/research/__init__.py", line 1, in <module>
    from . import nn
  File "/usr/local/lib/python3.10/dist-packages/bitsandbytes/research/nn/__init__.py", line 1, in <module>
    from .modules import LinearFP8Mixed, LinearFP8Global
  File "/usr/local/lib/python3.10/dist-packages/bitsandbytes/research/nn/modules.py", line 8, in <module>
    from bitsandbytes.optim import GlobalOptimManager
  File "/usr/local/lib/python3.10/dist-packages/bitsandbytes/optim/__init__.py", line 6, in <module>
    from bitsandbytes.cextension import COMPILED_WITH_CUDA
  File "/usr/local/lib/python3.10/dist-packages/bitsandbytes/cextension.py", line 20, in <module>
    raise RuntimeError('''
RuntimeError: 
        CUDA Setup failed despite GPU being available. Please run the following command to get more information:

        python -m bitsandbytes

        Inspect the output of the command and see if you can locate CUDA libraries. You might need to add them
        to your LD_LIBRARY_PATH. If you suspect a bug, please take the information from python -m bitsandbytes
        and open an issue at: https://github.com/TimDettmers/bitsandbytes/issues

nvidia-smi

Tue Dec  3 09:42:42 2024       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.90.07              Driver Version: 550.90.07      CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA RTX A4000               On  |   00000000:83:00.0 Off |                  Off |
| 41%   50C    P8             17W /  140W |       4MiB /  16376MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                                                         
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+

Reproduction

from transformers import BitsAndBytesConfig, WhisperForConditionalGeneration

# Requesting 8-bit quantization is what triggers the bitsandbytes import.
quantization_config = BitsAndBytesConfig(load_in_8bit=True)
model = WhisperForConditionalGeneration.from_pretrained(
    "openai/whisper-small",
    quantization_config=quantization_config,
    device_map="auto",
)

@kaijun123
Author

I was previously using bitsandbytes 0.41.0, but the problem seems to be resolved with the latest bitsandbytes version. This is despite the model in my case (llava-med) specifying that bitsandbytes 0.41.0 should be used.

I ran the following commands to install bitsandbytes 0.44.1: pip uninstall bitsandbytes, followed by pip install bitsandbytes
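
For anyone checking whether an installed bitsandbytes build would hit the same "Required library version not found" message, here is a minimal sketch. It assumes the shared-library naming seen in the log above (CUDA_VERSION=124 maps to libbitsandbytes_cuda124.so) and that the binaries sit directly in the package directory; the helper names are hypothetical and not part of bitsandbytes, and variants with other suffixes are not handled.

```python
from pathlib import Path
from typing import List, Optional


def expected_binary_name(cuda_version: Optional[str]) -> str:
    """Map a torch.version.cuda string such as '12.4' to the library name
    seen in the log (assumed naming: libbitsandbytes_cuda124.so)."""
    if cuda_version is None:
        # CPU-only PyTorch builds report no CUDA version.
        return "libbitsandbytes_cpu.so"
    major, minor = cuda_version.split(".")[:2]
    return f"libbitsandbytes_cuda{major}{minor}.so"


def shipped_binaries(package_dir: Path) -> List[str]:
    """List the libbitsandbytes_*.so files found directly in package_dir."""
    return sorted(p.name for p in package_dir.glob("libbitsandbytes_*.so"))


def has_matching_binary(package_dir: Path, cuda_version: Optional[str]) -> bool:
    """True if the directory contains the binary expected for cuda_version."""
    return expected_binary_name(cuda_version) in shipped_binaries(package_dir)


if __name__ == "__main__":
    import importlib.util

    # find_spec locates the package without importing it, so the
    # RuntimeError raised at import time is not triggered here.
    spec = importlib.util.find_spec("bitsandbytes")
    try:
        import torch
    except ImportError:
        torch = None

    if spec is None or spec.origin is None or torch is None:
        print("bitsandbytes or torch is not installed in this environment")
    else:
        pkg_dir = Path(spec.origin).parent
        print("PyTorch CUDA version:", torch.version.cuda)
        print("Expected binary:", expected_binary_name(torch.version.cuda))
        print("Shipped binaries:", shipped_binaries(pkg_dir))
        print("Match found:", has_matching_binary(pkg_dir, torch.version.cuda))
```

If "Match found" is False, the installed release probably does not ship a binary for the CUDA version PyTorch was built with, which is consistent with upgrading bitsandbytes resolving the error here.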

@Veda0718

Veda0718 commented Dec 7, 2024

I downloaded the Linux version from the following link, and it resolved the error.
https://pypi.org/project/bitsandbytes/#files

@KevKibe

KevKibe commented Dec 7, 2024

Resolved, thanks for the help @kaijun123
