python -m bitsandbytes · #1423

Open
gunnusravani opened this issue Nov 21, 2024 · 1 comment
System Info

```
Thu Nov 21 17:46:23 2024
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.54.03              Driver Version: 535.54.03    CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M          | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap          |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA A100 80GB PCIe          Off | 00000000:17:00.0 Off |                    0 |
| N/A   66C    P0     80W / 300W          | 11726MiB / 81920MiB  |     91%      Default |
|                                         |                      |             Disabled |
+-----------------------------------------+----------------------+----------------------+
|   1  NVIDIA A100 80GB PCIe          Off | 00000000:31:00.0 Off |                    0 |
| N/A   57C    P0    345W / 300W          | 26308MiB / 81920MiB  |     97%      Default |
|                                         |                      |             Disabled |
+-----------------------------------------+----------------------+----------------------+
|   2  NVIDIA A100 80GB PCIe          Off | 00000000:4B:00.0 Off |                    0 |
| N/A   56C    P0     88W / 300W          |  5252MiB / 81920MiB  |     19%      Default |
|                                         |                      |             Disabled |
+-----------------------------------------+----------------------+----------------------+
|   3  NVIDIA A100 80GB PCIe          Off | 00000000:CA:00.0 Off |                    0 |
| N/A   32C    P0     61W / 300W          | 47449MiB / 81920MiB  |      0%      Default |
|                                         |                      |             Disabled |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                          GPU Memory   |
|        ID   ID                                                           Usage        |
|=======================================================================================|
|    0   N/A  N/A      53845      C   /usr/bin/python                          414MiB   |
|    0   N/A  N/A      78243      C   /usr/bin/python                        10732MiB   |
|    0   N/A  N/A      93569      C   ...niconda3/envs/mixtralkit/bin/python   554MiB   |
|    1   N/A  N/A      46294      C   /usr/bin/python                        25738MiB   |
|    1   N/A  N/A      93569      C   ...niconda3/envs/mixtralkit/bin/python   552MiB   |
|    2   N/A  N/A      53845      C   /usr/bin/python                         4734MiB   |
|    2   N/A  N/A      93569      C   ...niconda3/envs/mixtralkit/bin/python   552MiB   |
|    3   N/A  N/A      93569      C   ...niconda3/envs/mixtralkit/bin/python 47436MiB   |
+---------------------------------------------------------------------------------------+
```

Reproduction

```
False

===================================BUG REPORT===================================
/data/nlp/sravani_g/miniconda3/envs/mixtralkit/lib/python3.10/site-packages/bitsandbytes/cuda_setup/main.py:167: UserWarning: Welcome to bitsandbytes. For bug reports, please run

python -m bitsandbytes

  warn(msg)
/data/nlp/sravani_g/miniconda3/envs/mixtralkit/lib/python3.10/site-packages/bitsandbytes/cuda_setup/main.py:167: UserWarning: /data/nlp/sravani_g/miniconda3/envs/mixtralkit did not contain ['libcudart.so', 'libcudart.so.11.0', 'libcudart.so.12.0'] as expected! Searching further paths...
  warn(msg)
The following directories listed in your path were found to be non-existent: {PosixPath('//matplotlib_inline.backend_inline'), PosixPath('module')}
CUDA_SETUP: WARNING! libcudart.so not found in any environmental path. Searching in backup paths...
/data/nlp/sravani_g/miniconda3/envs/mixtralkit/lib/python3.10/site-packages/bitsandbytes/cuda_setup/main.py:167: UserWarning: Found duplicate ['libcudart.so', 'libcudart.so.11.0', 'libcudart.so.12.0'] files: {PosixPath('/usr/local/cuda/lib64/libcudart.so'), PosixPath('/usr/local/cuda/lib64/libcudart.so.11.0')}.. We select the PyTorch default libcudart.so, which is {torch.version.cuda}, but this might mismatch with the CUDA version that is needed for bitsandbytes. To override this behavior set the BNB_CUDA_VERSION=<version string, e.g. 122> environmental variable. For example, if you want to use the CUDA version 122: BNB_CUDA_VERSION=122 python ... OR set the environmental variable in your .bashrc: export BNB_CUDA_VERSION=122. In the case of a manual override, make sure you set the LD_LIBRARY_PATH, e.g. export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda-11.2
  warn(msg)
DEBUG: Possible options found for libcudart.so: {PosixPath('/usr/local/cuda/lib64/libcudart.so'), PosixPath('/usr/local/cuda/lib64/libcudart.so.11.0')}
CUDA SETUP: PyTorch settings found: CUDA_VERSION=124, Highest Compute Capability: 8.0.
CUDA SETUP: To manually override the PyTorch CUDA version please see: https://github.com/TimDettmers/bitsandbytes/blob/main/how_to_use_nonpytorch_cuda.md
CUDA SETUP: Required library version not found: libbitsandbytes_cuda124.so. Maybe you need to compile it from source?
CUDA SETUP: Defaulting to libbitsandbytes_cpu.so...

================================================ERROR=====================================
CUDA SETUP: CUDA detection failed! Possible reasons:

1. You need to manually override the PyTorch CUDA version. Please see: https://github.com/TimDettmers/bitsandbytes/blob/main/how_to_use_nonpytorch_cuda.md
2. CUDA driver not installed
3. CUDA not installed
4. You have multiple conflicting CUDA libraries
5. Required library not pre-compiled for this bitsandbytes release!
CUDA SETUP: If you compiled from source, try again with make CUDA_VERSION=DETECTED_CUDA_VERSION, for example, make CUDA_VERSION=113.
CUDA SETUP: The CUDA version for the compile might depend on your conda install. Inspect CUDA version via conda list | grep cuda.
================================================================================

CUDA SETUP: Something unexpected happened. Please compile from source:
git clone https://github.com/TimDettmers/bitsandbytes.git
cd bitsandbytes
CUDA_VERSION=124
python setup.py install
CUDA SETUP: Setup Failed!
Traceback (most recent call last):
  File "/data/nlp/sravani_g/miniconda3/envs/mixtralkit/lib/python3.10/runpy.py", line 187, in _run_module_as_main
    mod_name, mod_spec, code = _get_module_details(mod_name, _Error)
  File "/data/nlp/sravani_g/miniconda3/envs/mixtralkit/lib/python3.10/runpy.py", line 146, in _get_module_details
    return _get_module_details(pkg_main_name, error)
  File "/data/nlp/sravani_g/miniconda3/envs/mixtralkit/lib/python3.10/runpy.py", line 110, in _get_module_details
    __import__(pkg_name)
  File "/data/nlp/sravani_g/miniconda3/envs/mixtralkit/lib/python3.10/site-packages/bitsandbytes/__init__.py", line 6, in <module>
    from . import cuda_setup, utils, research
  File "/data/nlp/sravani_g/miniconda3/envs/mixtralkit/lib/python3.10/site-packages/bitsandbytes/research/__init__.py", line 1, in <module>
    from . import nn
  File "/data/nlp/sravani_g/miniconda3/envs/mixtralkit/lib/python3.10/site-packages/bitsandbytes/research/nn/__init__.py", line 1, in <module>
    from .modules import LinearFP8Mixed, LinearFP8Global
  File "/data/nlp/sravani_g/miniconda3/envs/mixtralkit/lib/python3.10/site-packages/bitsandbytes/research/nn/modules.py", line 8, in <module>
    from bitsandbytes.optim import GlobalOptimManager
  File "/data/nlp/sravani_g/miniconda3/envs/mixtralkit/lib/python3.10/site-packages/bitsandbytes/optim/__init__.py", line 6, in <module>
    from bitsandbytes.cextension import COMPILED_WITH_CUDA
  File "/data/nlp/sravani_g/miniconda3/envs/mixtralkit/lib/python3.10/site-packages/bitsandbytes/cextension.py", line 20, in <module>
    raise RuntimeError('''
RuntimeError:
CUDA Setup failed despite GPU being available. Please run the following command to get more information:

    python -m bitsandbytes

Inspect the output of the command and see if you can locate CUDA libraries. You might need to add them
to your LD_LIBRARY_PATH. If you suspect a bug, please take the information from python -m bitsandbytes
and open an issue at: https://github.com/TimDettmers/bitsandbytes/issues
```
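For context, the manual override that the warning text describes can be sketched as follows. The version string and library path below are examples only (my assumption, not from this issue) and must match the CUDA toolkit actually installed on the machine:

```shell
# Tell bitsandbytes which CUDA version's binary to load
# (e.g. 122 for CUDA 12.2; pick the version you actually have installed).
export BNB_CUDA_VERSION=122

# Make sure the matching CUDA runtime libraries are on the loader path.
# The cuda-12.2 path here is illustrative -- adjust to your install.
export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/local/cuda-12.2/lib64"

# Then re-run the diagnostic:
# python -m bitsandbytes
```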

Expected behavior

I am fine-tuning the Llama 2 7B model and ran into this error after executing the following code:
```python
from peft import LoraConfig, get_peft_model

config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=[
        "q_proj",
        "k_proj",
        "v_proj",
        "o_proj",
        "gate_proj",
        "up_proj",
        "down_proj",
        "lm_head",
    ],
    bias="none",
    lora_dropout=0.05,  # Conventional
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, config)
print_trainable_parameters(model)
```
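The `print_trainable_parameters` helper is not shown in the snippet; a minimal sketch of the usual implementation (my assumption, not code from this issue) is:

```python
def print_trainable_parameters(model):
    """Print how many parameters are trainable vs. total.

    Works with any object exposing named_parameters(), e.g. a
    torch.nn.Module or a PEFT-wrapped model.
    """
    trainable, total = 0, 0
    for _, param in model.named_parameters():
        n = param.numel()
        total += n
        if param.requires_grad:
            trainable += n
    print(
        f"trainable params: {trainable} || all params: {total} "
        f"|| trainable%: {100 * trainable / total:.2f}"
    )
```

With LoRA applied, only the adapter weights have `requires_grad=True`, so the printed trainable percentage should be a small fraction of the total.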

@LIDALT commented Dec 9, 2024

Same error here. Have you fixed it?
