
Using Venv or Dockerisation #80

Closed

bkutasi opened this issue Oct 1, 2023 · 5 comments
@bkutasi
bkutasi commented Oct 1, 2023

Hi, terrific work as always. I was wondering if we could use venvs (or maybe even Docker) for running exllamav2.

A lot of tools in the generative AI space have some kind of option for self-containment (e.g., oobabooga, ComfyUI).
I ran into a lot of trouble after cleaning the NVIDIA bloat off my system. I tried running it without nvcc installed system-wide, but even using a venv that contains all the required dependencies leads to missing files during compilation.
nvcc is available inside my env.

nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Tue_May__3_18:49:52_PDT_2022
Cuda compilation tools, release 11.7, V11.7.64
Build cuda_11.7.r11.7/compiler.31294372_0

The error:

RuntimeError: Error building extension 'exllamav2_ext': [1/1] c++ ext.o pack_tensor.cuda.o quantize.cuda.o q_matrix.cuda.o q_attn.cuda.o q_mlp.cuda.o q_gemm.cuda.o rms_norm.cuda.o rope.cuda.o quantize_func.o sampling.o -shared -L/media/bkutasi/60824A4F824A29BC/Other_projects/exllamav2/installer_files/env/lib/python3.10/site-packages/torch/lib -lc10 -lc10_cuda -ltorch_cpu -ltorch_cuda -ltorch -ltorch_python -L/media/bkutasi/60824A4F824A29BC/Other_projects/exllamav2/installer_files/env/lib64 -lcudart -o exllamav2_ext.so
FAILED: exllamav2_ext.so 
c++ ext.o pack_tensor.cuda.o quantize.cuda.o q_matrix.cuda.o q_attn.cuda.o q_mlp.cuda.o q_gemm.cuda.o rms_norm.cuda.o rope.cuda.o quantize_func.o sampling.o -shared -L/media/bkutasi/60824A4F824A29BC/Other_projects/exllamav2/installer_files/env/lib/python3.10/site-packages/torch/lib -lc10 -lc10_cuda -ltorch_cpu -ltorch_cuda -ltorch -ltorch_python -L/media/bkutasi/60824A4F824A29BC/Other_projects/exllamav2/installer_files/env/lib64 -lcudart -o exllamav2_ext.so
/usr/bin/ld: cannot find -lcudart: No such file or directory
collect2: error: ld returned 1 exit status
ninja: build stopped: subcommand failed.
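
For context, a quick way to check whether the linker can see libcudart at all (standard Linux tooling; the search paths below are guesses and will vary by install):

# list the libraries the dynamic linker knows about, filtered for the CUDA runtime
ldconfig -p | grep libcudart

# search the usual install locations for the runtime the build links against
find /usr/local/cuda* /usr/lib -name "libcudart*" 2>/dev/null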
@SinanAkkoyun
Contributor

I would advise you to use miniconda3 in your local environment.

After installation, create a Python 3.10 or 3.11 env: conda create -n exllamav2 python=3.10

If you are on NVIDIA and not AMD, it suffices to just run pip install -r requirements.txt, and everything should work fine as long as nvcc is installed on your system. If not, you can still install all the NVIDIA libraries through conda.
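
For reference, a minimal sketch of pulling the CUDA toolkit into the conda env itself (the channel label and package name are assumptions based on NVIDIA's conda packaging; match the version to your torch build):

# create and activate the env
conda create -n exllamav2 python=3.10
conda activate exllamav2

# assumption: NVIDIA's conda channel provides a versioned cuda-toolkit package
conda install -c "nvidia/label/cuda-11.7.0" cuda-toolkit

# then install the Python dependencies as usual
pip install -r requirements.txt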

turboderp/exllama#192 (comment)

If you still run into issues please let me know.
(Regarding Docker: it should be easier; make sure to use the right NVIDIA container and just do the above without conda.)
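
As a rough sketch, assuming the nvidia/cuda devel image (the tag and package list are assumptions; a devel image is needed so nvcc is present inside the container for the JIT build):

# start an interactive CUDA container with GPU access
docker run --rm -it --gpus all nvidia/cuda:11.7.1-devel-ubuntu22.04 bash

# then, inside the container:
apt-get update && apt-get install -y python3 python3-pip git
git clone https://github.com/turboderp/exllamav2
cd exllamav2 && pip3 install -r requirements.txt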

I hope this helps

@SinanAkkoyun
Contributor

Also, are you running on WSL? Sometimes you need to link libraries manually there (although the above setup works fine for me).
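
A hedged sketch of what that manual linking can look like (the paths are assumptions; check where your toolkit actually landed):

# assumption: toolkit under /usr/local/cuda; expose it to the compiler and linker
export CUDA_HOME=/usr/local/cuda
export LD_LIBRARY_PATH=$CUDA_HOME/lib64:$LD_LIBRARY_PATH

# on WSL the driver libraries live under /usr/lib/wsl/lib; add them too if needed
export LD_LIBRARY_PATH=/usr/lib/wsl/lib:$LD_LIBRARY_PATH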

@bkutasi
Author

bkutasi commented Oct 2, 2023

Hey, thanks for the comment. I'm on Ubuntu Server 22. I did exactly what you suggested: I tried miniconda and venv, but it failed to compile. I suspect the CUDA toolkit is definitely needed for compilation, because after I installed the latest CUDA using apt (following the NVIDIA docs) it worked. If I get the precompiled package, it will probably run in my environments.
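
For anyone hitting the same thing, the apt route looks roughly like this (the keyring URL and package name are taken from NVIDIA's install docs for Ubuntu 22.04 x86_64 and may differ for your release):

# add NVIDIA's CUDA repository signing key
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-keyring_1.1-1_all.deb
sudo dpkg -i cuda-keyring_1.1-1_all.deb
sudo apt-get update

# installs nvcc and the CUDA libraries system-wide
sudo apt-get install -y cuda-toolkit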
I will try to look into getting a docker running, I already have the nvidia container toolkit set up for other projects.
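
A quick sanity check that the container toolkit exposes the GPU (the image tag is an assumption; pick one matching your driver):

# should print the same nvidia-smi table you see on the host
docker run --rm --gpus all nvidia/cuda:11.7.1-base-ubuntu22.04 nvidia-smi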

@SinanAkkoyun
Contributor

You could also try installing the prebuilt wheels that are published; look in the README for more information.
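
For example (the filename below is a placeholder; substitute the actual wheel from the releases page that matches your Python and CUDA versions):

# hypothetical wheel name; copy the real URL/filename from the GitHub releases page
pip install exllamav2-<version>+cu117-cp310-cp310-linux_x86_64.whl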

@guialfaro053
When I got the same error message, doing this on Ubuntu helped: Here
