
Using Venv or Dockerisation #80

Closed

bkutasi opened this issue Oct 1, 2023 · 5 comments
@bkutasi
bkutasi commented Oct 1, 2023

Hi, terrific work as always. I was wondering if we could use venvs (or maybe even Docker) for running exllamav2.

A lot of tools in the generative AI space have some kind of option for self-containment (e.g., oobabooga, ComfyUI).
I ran into a lot of trouble after cleaning the NVIDIA bloat off my system. I tried running it without nvcc installed system-wide, but even using a venv that contains all the required dependencies leads to missing files during compilation.
nvcc is available inside my env.

nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Tue_May__3_18:49:52_PDT_2022
Cuda compilation tools, release 11.7, V11.7.64
Build cuda_11.7.r11.7/compiler.31294372_0

The error:

RuntimeError: Error building extension 'exllamav2_ext': [1/1] c++ ext.o pack_tensor.cuda.o quantize.cuda.o q_matrix.cuda.o q_attn.cuda.o q_mlp.cuda.o q_gemm.cuda.o rms_norm.cuda.o rope.cuda.o quantize_func.o sampling.o -shared -L/media/bkutasi/60824A4F824A29BC/Other_projects/exllamav2/installer_files/env/lib/python3.10/site-packages/torch/lib -lc10 -lc10_cuda -ltorch_cpu -ltorch_cuda -ltorch -ltorch_python -L/media/bkutasi/60824A4F824A29BC/Other_projects/exllamav2/installer_files/env/lib64 -lcudart -o exllamav2_ext.so
FAILED: exllamav2_ext.so 
c++ ext.o pack_tensor.cuda.o quantize.cuda.o q_matrix.cuda.o q_attn.cuda.o q_mlp.cuda.o q_gemm.cuda.o rms_norm.cuda.o rope.cuda.o quantize_func.o sampling.o -shared -L/media/bkutasi/60824A4F824A29BC/Other_projects/exllamav2/installer_files/env/lib/python3.10/site-packages/torch/lib -lc10 -lc10_cuda -ltorch_cpu -ltorch_cuda -ltorch -ltorch_python -L/media/bkutasi/60824A4F824A29BC/Other_projects/exllamav2/installer_files/env/lib64 -lcudart -o exllamav2_ext.so
/usr/bin/ld: cannot find -lcudart: No such file or directory
collect2: error: ld returned 1 exit status
ninja: build stopped: subcommand failed.
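
For context, a quick way to check whether the linker can see libcudart at all (standard Linux tooling; the search paths below are guesses and will vary by install):

# list the libraries the dynamic linker knows about, filtered for the CUDA runtime
ldconfig -p | grep libcudart

# search the usual install locations for the runtime the build links against
find /usr/local/cuda* /usr/lib -name "libcudart*" 2>/dev/null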
@SinanAkkoyun
Contributor

I would advise you to use miniconda3 in your local environment.

After installation, create a Python 3.10 or 3.11 env: conda create -n exllamav2 python=3.10

If you are on NVIDIA and not AMD, it suffices to just run pip install -r requirements.txt, and everything should work fine as long as nvcc is installed on your system. If not, you can still install all the NVIDIA libraries through conda.
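
For reference, a minimal sketch of pulling the CUDA toolkit into the conda env itself (the channel label and package name are assumptions based on NVIDIA's conda packaging; match the version to your torch build):

# create and activate the env
conda create -n exllamav2 python=3.10
conda activate exllamav2

# assumption: NVIDIA's conda channel provides a versioned cuda-toolkit package
conda install -c "nvidia/label/cuda-11.7.0" cuda-toolkit

# then install the Python dependencies as usual
pip install -r requirements.txt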

turboderp/exllama#192 (comment)

If you still run into issues please let me know.
(Regarding Docker: it should be easier; make sure to use the right NVIDIA container and just do the above without conda.)
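
As a rough sketch, assuming the nvidia/cuda devel image (the tag and package list are assumptions; a devel image is needed so nvcc is present inside the container for the JIT build):

# start an interactive CUDA container with GPU access
docker run --rm -it --gpus all nvidia/cuda:11.7.1-devel-ubuntu22.04 bash

# then, inside the container:
apt-get update && apt-get install -y python3 python3-pip git
git clone https://github.com/turboderp/exllamav2
cd exllamav2 && pip3 install -r requirements.txt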

I hope this helps

@SinanAkkoyun
Contributor

Also, are you running on WSL? Sometimes you need to link libraries manually there (although the above setup works fine for me).
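
A hedged sketch of what that manual linking can look like (the paths are assumptions; check where your toolkit actually landed):

# assumption: toolkit under /usr/local/cuda; expose it to the compiler and linker
export CUDA_HOME=/usr/local/cuda
export LD_LIBRARY_PATH=$CUDA_HOME/lib64:$LD_LIBRARY_PATH

# on WSL the driver libraries live under /usr/lib/wsl/lib; add them too if needed
export LD_LIBRARY_PATH=/usr/lib/wsl/lib:$LD_LIBRARY_PATH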

@bkutasi
Author

bkutasi commented Oct 2, 2023

Hey, thanks for the comment. I'm on Ubuntu Server 22. I did exactly what you suggested: I tried miniconda and venv, but it failed to compile. I suspect the CUDA toolkit is definitely needed for compilation, because after I installed the latest CUDA using apt (following the NVIDIA docs) it worked. If I get the precompiled package, it will probably run in my environments.
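
For anyone hitting the same thing, the apt route looks roughly like this (the keyring URL and package name are taken from NVIDIA's install docs for Ubuntu 22.04 x86_64 and may differ for your release):

# add NVIDIA's CUDA repository signing key
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-keyring_1.1-1_all.deb
sudo dpkg -i cuda-keyring_1.1-1_all.deb
sudo apt-get update

# installs nvcc and the CUDA libraries system-wide
sudo apt-get install -y cuda-toolkit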
I will try to look into getting a docker running, I already have the nvidia container toolkit set up for other projects.
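
A quick sanity check that the container toolkit exposes the GPU (the image tag is an assumption; pick one matching your driver):

# should print the same nvidia-smi table you see on the host
docker run --rm --gpus all nvidia/cuda:11.7.1-base-ubuntu22.04 nvidia-smi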

@SinanAkkoyun
Contributor

You could also try installing the prebuilt wheels that are published; look in the README for more information.
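
For example (the filename below is a placeholder; substitute the actual wheel from the releases page that matches your Python and CUDA versions):

# hypothetical wheel name; copy the real URL/filename from the GitHub releases page
pip install exllamav2-<version>+cu117-cp310-cp310-linux_x86_64.whl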

@guialfaro053
When I got the same error message, doing this on Ubuntu helped: Here
