Cannot compile exllama_ext on ROCm #131

Closed
fgdfgfthgr-fox opened this issue Jul 4, 2023 · 9 comments

@fgdfgfthgr-fox

I am using oobabooga's webui, which includes exllama. I cloned exllama into the repositories directory, installed the dependencies, and was ready to compile it. However, my system fails to compile exllama_ext.
My system information:

System:
  Kernel: 5.15.0-75-generic x86_64 bits: 64 compiler: gcc v: 11.3.0 Desktop: Cinnamon 5.6.8
    tk: GTK 3.24.33 wm: muffin dm: LightDM Distro: Linux Mint 21.1 Vera base: Ubuntu 22.04 jammy
Machine:
  Type: Desktop Mobo: Micro-Star model: B550M PRO-VDH (MS-7C95) v: 1.0
    serial: <superuser required> UEFI: American Megatrends LLC. v: 2.E0 date: 03/06/2023
CPU:
  Info: 6-core model: AMD Ryzen 5 5500 bits: 64 type: MT MCP arch: Zen 3 rev: 0 cache: L1: 384 KiB
    L2: 3 MiB L3: 16 MiB
  Speed (MHz): avg: 1843 high: 2787 min/max: 1400/3600 boost: enabled cores: 1: 2787 2: 1661
    3: 1818 4: 1851 5: 1967 6: 1669 7: 1814 8: 1639 9: 1621 10: 1797 11: 1699 12: 1802
    bogomips: 86230
  Flags: avx avx2 ht lm nx pae sse sse2 sse3 sse4_1 sse4_2 sse4a ssse3 svm
Graphics:
  Device-1: AMD Vega 20 [Radeon VII] driver: amdgpu v: 5.18.13 pcie: speed: 8 GT/s lanes: 16
    ports: active: HDMI-A-1 empty: DP-1,DP-2,DP-3 bus-ID: 12:00.0 chip-ID: 1002:66af
  Display: x11 server: X.Org v: 1.21.1.4 driver: X: loaded: amdgpu,ati
    unloaded: fbdev,modesetting,radeon,vesa gpu: amdgpu display-ID: :0 screens: 1
  Screen-1: 0 s-res: 1920x1200 s-dpi: 96
  Monitor-1: HDMI-A-0 mapped: HDMI-A-1 model: Philips 240B res: 1920x1200 dpi: 94
    diag: 612mm (24.1")
  OpenGL: renderer: AMD Radeon VII (vega20 LLVM 15.0.7 DRM 3.48 5.15.0-75-generic)
    v: 4.6 Mesa 23.1.2 direct render: Yes

My command and the error output:

(textgen) fgdfgfthgr@fgdfgfthgr-MS-7C95:/mnt/7018F20D48B6C548/text-generation-webui/repositories/exllama$ python test_benchmark_inference.py -d /mnt/7018F20D48B6C548/gptq-llama30b-128g/llama-30b-4bit-128g.safetensors -p -ppl
Successfully preprocessed all matching files.
Traceback (most recent call last):
  File "/home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1893, in _run_ninja_build
    subprocess.run(
  File "/home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/subprocess.py", line 526, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/mnt/7018F20D48B6C548/text-generation-webui/repositories/exllama/test_benchmark_inference.py", line 1, in <module>
    from model import ExLlama, ExLlamaCache, ExLlamaConfig
  File "/mnt/7018F20D48B6C548/text-generation-webui/repositories/exllama/model.py", line 12, in <module>
    import cuda_ext
  File "/mnt/7018F20D48B6C548/text-generation-webui/repositories/exllama/cuda_ext.py", line 43, in <module>
    exllama_ext = load(
  File "/home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1284, in load
    return _jit_compile(
  File "/home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1509, in _jit_compile
    _write_ninja_file_and_build_library(
  File "/home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1624, in _write_ninja_file_and_build_library
    _run_ninja_build(
  File "/home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1909, in _run_ninja_build
    raise RuntimeError(message) from e
RuntimeError: Error building extension 'exllama_ext': [1/6] c++ -MMD -MF exllama_ext_hip.o.d -DTORCH_EXTENSION_NAME=exllama_ext -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/mnt/7018F20D48B6C548/text-generation-webui/repositories/exllama/exllama_ext -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/TH -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/THC -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/THH -isystem /opt/rocm-5.4.2/include -isystem /opt/rocm-5.4.2/miopen/include -isystem /opt/rocm-5.4.2/hip/include -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++17 -O3 -c /mnt/7018F20D48B6C548/text-generation-webui/repositories/exllama/exllama_ext/exllama_ext_hip.cpp -o exllama_ext_hip.o -fPIC -D__HIP_PLATFORM_HCC__=1 -DUSE_ROCM=1
FAILED: exllama_ext_hip.o 
c++ -MMD -MF exllama_ext_hip.o.d -DTORCH_EXTENSION_NAME=exllama_ext -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/mnt/7018F20D48B6C548/text-generation-webui/repositories/exllama/exllama_ext -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/TH -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/THC -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/THH -isystem /opt/rocm-5.4.2/include -isystem /opt/rocm-5.4.2/miopen/include -isystem /opt/rocm-5.4.2/hip/include -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++17 -O3 -c /mnt/7018F20D48B6C548/text-generation-webui/repositories/exllama/exllama_ext/exllama_ext_hip.cpp -o exllama_ext_hip.o -fPIC -D__HIP_PLATFORM_HCC__=1 -DUSE_ROCM=1
In file included from /mnt/7018F20D48B6C548/text-generation-webui/repositories/exllama/exllama_ext/exllama_ext_hip.cpp:4:
/home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/ATen/hip/HIPContext.h:7:10: fatal error: hipsparse/hipsparse.h: No such file or directory
    7 | #include <hipsparse/hipsparse.h>
      |          ^~~~~~~~~~~~~~~~~~~~~~~
compilation terminated.
[2/6] /opt/rocm-5.4.2/bin/hipcc  -DWITH_HIP -DTORCH_EXTENSION_NAME=exllama_ext -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/mnt/7018F20D48B6C548/text-generation-webui/repositories/exllama/exllama_ext -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/TH -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/THC -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/THH -isystem /opt/rocm-5.4.2/include -isystem /opt/rocm-5.4.2/miopen/include -isystem /opt/rocm-5.4.2/hip/include -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++17 -O3 -fPIC -D__HIP_PLATFORM_HCC__=1 -DUSE_ROCM=1 -DCUDA_HAS_FP16=1 -D__HIP_NO_HALF_OPERATORS__=1 -D__HIP_NO_HALF_CONVERSIONS__=1 -lineinfo -U__HIP_NO_HALF_CONVERSIONS__ -O3 --amdgpu-target=gfx900 --amdgpu-target=gfx906 --amdgpu-target=gfx908 --amdgpu-target=gfx90a --amdgpu-target=gfx1030 -fno-gpu-rdc -c /mnt/7018F20D48B6C548/text-generation-webui/repositories/exllama/exllama_ext/hip_func/q4_mlp.hip -o q4_mlp.cuda.o 
FAILED: q4_mlp.cuda.o 
/opt/rocm-5.4.2/bin/hipcc  -DWITH_HIP -DTORCH_EXTENSION_NAME=exllama_ext -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/mnt/7018F20D48B6C548/text-generation-webui/repositories/exllama/exllama_ext -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/TH -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/THC -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/THH -isystem /opt/rocm-5.4.2/include -isystem /opt/rocm-5.4.2/miopen/include -isystem /opt/rocm-5.4.2/hip/include -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++17 -O3 -fPIC -D__HIP_PLATFORM_HCC__=1 -DUSE_ROCM=1 -DCUDA_HAS_FP16=1 -D__HIP_NO_HALF_OPERATORS__=1 -D__HIP_NO_HALF_CONVERSIONS__=1 -lineinfo -U__HIP_NO_HALF_CONVERSIONS__ -O3 --amdgpu-target=gfx900 --amdgpu-target=gfx906 --amdgpu-target=gfx908 --amdgpu-target=gfx90a --amdgpu-target=gfx1030 -fno-gpu-rdc -c /mnt/7018F20D48B6C548/text-generation-webui/repositories/exllama/exllama_ext/hip_func/q4_mlp.hip -o q4_mlp.cuda.o 
Warning: The --amdgpu-target option has been deprecated and will be removed in the future.  Use --offload-arch instead.
Warning: The --amdgpu-target option has been deprecated and will be removed in the future.  Use --offload-arch instead.
Warning: The --amdgpu-target option has been deprecated and will be removed in the future.  Use --offload-arch instead.
Warning: The --amdgpu-target option has been deprecated and will be removed in the future.  Use --offload-arch instead.
Warning: The --amdgpu-target option has been deprecated and will be removed in the future.  Use --offload-arch instead.
clang-15: warning: -lineinfo: 'linker' input unused [-Wunused-command-line-argument]
In file included from /mnt/7018F20D48B6C548/text-generation-webui/repositories/exllama/exllama_ext/hip_func/q4_mlp.hip:3:
In file included from /mnt/7018F20D48B6C548/text-generation-webui/repositories/exllama/exllama_ext/hip_func/../hip_func/q4_mlp.cuh:8:
/home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/ATen/hip/HIPContext.h:7:10: fatal error: 'hipsparse/hipsparse.h' file not found
#include <hipsparse/hipsparse.h>
         ^~~~~~~~~~~~~~~~~~~~~~~
1 error generated when compiling for gfx1030.
[3/6] /opt/rocm-5.4.2/bin/hipcc  -DWITH_HIP -DTORCH_EXTENSION_NAME=exllama_ext -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/mnt/7018F20D48B6C548/text-generation-webui/repositories/exllama/exllama_ext -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/TH -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/THC -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/THH -isystem /opt/rocm-5.4.2/include -isystem /opt/rocm-5.4.2/miopen/include -isystem /opt/rocm-5.4.2/hip/include -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++17 -O3 -fPIC -D__HIP_PLATFORM_HCC__=1 -DUSE_ROCM=1 -DCUDA_HAS_FP16=1 -D__HIP_NO_HALF_OPERATORS__=1 -D__HIP_NO_HALF_CONVERSIONS__=1 -lineinfo -U__HIP_NO_HALF_CONVERSIONS__ -O3 --amdgpu-target=gfx900 --amdgpu-target=gfx906 --amdgpu-target=gfx908 --amdgpu-target=gfx90a --amdgpu-target=gfx1030 -fno-gpu-rdc -c /mnt/7018F20D48B6C548/text-generation-webui/repositories/exllama/exllama_ext/hip_func/q4_matmul.hip -o q4_matmul.cuda.o 
FAILED: q4_matmul.cuda.o 
/opt/rocm-5.4.2/bin/hipcc  -DWITH_HIP -DTORCH_EXTENSION_NAME=exllama_ext -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/mnt/7018F20D48B6C548/text-generation-webui/repositories/exllama/exllama_ext -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/TH -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/THC -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/THH -isystem /opt/rocm-5.4.2/include -isystem /opt/rocm-5.4.2/miopen/include -isystem /opt/rocm-5.4.2/hip/include -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++17 -O3 -fPIC -D__HIP_PLATFORM_HCC__=1 -DUSE_ROCM=1 -DCUDA_HAS_FP16=1 -D__HIP_NO_HALF_OPERATORS__=1 -D__HIP_NO_HALF_CONVERSIONS__=1 -lineinfo -U__HIP_NO_HALF_CONVERSIONS__ -O3 --amdgpu-target=gfx900 --amdgpu-target=gfx906 --amdgpu-target=gfx908 --amdgpu-target=gfx90a --amdgpu-target=gfx1030 -fno-gpu-rdc -c /mnt/7018F20D48B6C548/text-generation-webui/repositories/exllama/exllama_ext/hip_func/q4_matmul.hip -o q4_matmul.cuda.o 
Warning: The --amdgpu-target option has been deprecated and will be removed in the future.  Use --offload-arch instead.
Warning: The --amdgpu-target option has been deprecated and will be removed in the future.  Use --offload-arch instead.
Warning: The --amdgpu-target option has been deprecated and will be removed in the future.  Use --offload-arch instead.
Warning: The --amdgpu-target option has been deprecated and will be removed in the future.  Use --offload-arch instead.
Warning: The --amdgpu-target option has been deprecated and will be removed in the future.  Use --offload-arch instead.
clang-15: warning: -lineinfo: 'linker' input unused [-Wunused-command-line-argument]
In file included from /mnt/7018F20D48B6C548/text-generation-webui/repositories/exllama/exllama_ext/hip_func/q4_matmul.hip:3:
In file included from /mnt/7018F20D48B6C548/text-generation-webui/repositories/exllama/exllama_ext/hip_func/../hip_func/q4_matmul.cuh:9:
/home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/ATen/hip/HIPContext.h:7:10: fatal error: 'hipsparse/hipsparse.h' file not found
#include <hipsparse/hipsparse.h>
         ^~~~~~~~~~~~~~~~~~~~~~~
1 error generated when compiling for gfx1030.
[4/6] /opt/rocm-5.4.2/bin/hipcc  -DWITH_HIP -DTORCH_EXTENSION_NAME=exllama_ext -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/mnt/7018F20D48B6C548/text-generation-webui/repositories/exllama/exllama_ext -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/TH -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/THC -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/THH -isystem /opt/rocm-5.4.2/include -isystem /opt/rocm-5.4.2/miopen/include -isystem /opt/rocm-5.4.2/hip/include -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++17 -O3 -fPIC -D__HIP_PLATFORM_HCC__=1 -DUSE_ROCM=1 -DCUDA_HAS_FP16=1 -D__HIP_NO_HALF_OPERATORS__=1 -D__HIP_NO_HALF_CONVERSIONS__=1 -lineinfo -U__HIP_NO_HALF_CONVERSIONS__ -O3 --amdgpu-target=gfx900 --amdgpu-target=gfx906 --amdgpu-target=gfx908 --amdgpu-target=gfx90a --amdgpu-target=gfx1030 -fno-gpu-rdc -c /mnt/7018F20D48B6C548/text-generation-webui/repositories/exllama/exllama_ext/hip_func/q4_attn.hip -o q4_attn.cuda.o 
FAILED: q4_attn.cuda.o 
/opt/rocm-5.4.2/bin/hipcc  -DWITH_HIP -DTORCH_EXTENSION_NAME=exllama_ext -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/mnt/7018F20D48B6C548/text-generation-webui/repositories/exllama/exllama_ext -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/TH -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/THC -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/THH -isystem /opt/rocm-5.4.2/include -isystem /opt/rocm-5.4.2/miopen/include -isystem /opt/rocm-5.4.2/hip/include -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++17 -O3 -fPIC -D__HIP_PLATFORM_HCC__=1 -DUSE_ROCM=1 -DCUDA_HAS_FP16=1 -D__HIP_NO_HALF_OPERATORS__=1 -D__HIP_NO_HALF_CONVERSIONS__=1 -lineinfo -U__HIP_NO_HALF_CONVERSIONS__ -O3 --amdgpu-target=gfx900 --amdgpu-target=gfx906 --amdgpu-target=gfx908 --amdgpu-target=gfx90a --amdgpu-target=gfx1030 -fno-gpu-rdc -c /mnt/7018F20D48B6C548/text-generation-webui/repositories/exllama/exllama_ext/hip_func/q4_attn.hip -o q4_attn.cuda.o 
Warning: The --amdgpu-target option has been deprecated and will be removed in the future.  Use --offload-arch instead.
Warning: The --amdgpu-target option has been deprecated and will be removed in the future.  Use --offload-arch instead.
Warning: The --amdgpu-target option has been deprecated and will be removed in the future.  Use --offload-arch instead.
Warning: The --amdgpu-target option has been deprecated and will be removed in the future.  Use --offload-arch instead.
Warning: The --amdgpu-target option has been deprecated and will be removed in the future.  Use --offload-arch instead.
clang-15: warning: -lineinfo: 'linker' input unused [-Wunused-command-line-argument]
In file included from /mnt/7018F20D48B6C548/text-generation-webui/repositories/exllama/exllama_ext/hip_func/q4_attn.hip:3:
In file included from /mnt/7018F20D48B6C548/text-generation-webui/repositories/exllama/exllama_ext/hip_func/../hip_func/q4_mlp.cuh:8:
/home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/ATen/hip/HIPContext.h:7:10: fatal error: 'hipsparse/hipsparse.h' file not found
#include <hipsparse/hipsparse.h>
         ^~~~~~~~~~~~~~~~~~~~~~~
1 error generated when compiling for gfx1030.
[5/6] /opt/rocm-5.4.2/bin/hipcc  -DWITH_HIP -DTORCH_EXTENSION_NAME=exllama_ext -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/mnt/7018F20D48B6C548/text-generation-webui/repositories/exllama/exllama_ext -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/TH -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/THC -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/THH -isystem /opt/rocm-5.4.2/include -isystem /opt/rocm-5.4.2/miopen/include -isystem /opt/rocm-5.4.2/hip/include -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++17 -O3 -fPIC -D__HIP_PLATFORM_HCC__=1 -DUSE_ROCM=1 -DCUDA_HAS_FP16=1 -D__HIP_NO_HALF_OPERATORS__=1 -D__HIP_NO_HALF_CONVERSIONS__=1 -lineinfo -U__HIP_NO_HALF_CONVERSIONS__ -O3 --amdgpu-target=gfx900 --amdgpu-target=gfx906 --amdgpu-target=gfx908 --amdgpu-target=gfx90a --amdgpu-target=gfx1030 -fno-gpu-rdc -c /mnt/7018F20D48B6C548/text-generation-webui/repositories/exllama/exllama_ext/hip_func/half_matmul.hip -o half_matmul.cuda.o 
FAILED: half_matmul.cuda.o 
/opt/rocm-5.4.2/bin/hipcc  -DWITH_HIP -DTORCH_EXTENSION_NAME=exllama_ext -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/mnt/7018F20D48B6C548/text-generation-webui/repositories/exllama/exllama_ext -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/TH -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/THC -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/THH -isystem /opt/rocm-5.4.2/include -isystem /opt/rocm-5.4.2/miopen/include -isystem /opt/rocm-5.4.2/hip/include -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++17 -O3 -fPIC -D__HIP_PLATFORM_HCC__=1 -DUSE_ROCM=1 -DCUDA_HAS_FP16=1 -D__HIP_NO_HALF_OPERATORS__=1 -D__HIP_NO_HALF_CONVERSIONS__=1 -lineinfo -U__HIP_NO_HALF_CONVERSIONS__ -O3 --amdgpu-target=gfx900 --amdgpu-target=gfx906 --amdgpu-target=gfx908 --amdgpu-target=gfx90a --amdgpu-target=gfx1030 -fno-gpu-rdc -c /mnt/7018F20D48B6C548/text-generation-webui/repositories/exllama/exllama_ext/hip_func/half_matmul.hip -o half_matmul.cuda.o 
Warning: The --amdgpu-target option has been deprecated and will be removed in the future.  Use --offload-arch instead.
Warning: The --amdgpu-target option has been deprecated and will be removed in the future.  Use --offload-arch instead.
Warning: The --amdgpu-target option has been deprecated and will be removed in the future.  Use --offload-arch instead.
Warning: The --amdgpu-target option has been deprecated and will be removed in the future.  Use --offload-arch instead.
Warning: The --amdgpu-target option has been deprecated and will be removed in the future.  Use --offload-arch instead.
clang-15: warning: -lineinfo: 'linker' input unused [-Wunused-command-line-argument]
In file included from /mnt/7018F20D48B6C548/text-generation-webui/repositories/exllama/exllama_ext/hip_func/half_matmul.hip:3:
In file included from /mnt/7018F20D48B6C548/text-generation-webui/repositories/exllama/exllama_ext/hip_func/../hip_func/half_matmul.cuh:8:
/home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/ATen/hip/HIPContext.h:7:10: fatal error: 'hipsparse/hipsparse.h' file not found
#include <hipsparse/hipsparse.h>
         ^~~~~~~~~~~~~~~~~~~~~~~
1 error generated when compiling for gfx1030.
ninja: build stopped: subcommand failed.

My conda environment:

# packages in environment at /home/fgdfgfthgr/anaconda3/envs/textgen:
#
# Name                    Version                   Build  Channel
_libgcc_mutex             0.1                        main  
_openmp_mutex             5.1                       1_gnu  
accelerate                0.20.3                   pypi_0    pypi
aiofiles                  23.1.0                   pypi_0    pypi
aiohttp                   3.8.4                    pypi_0    pypi
aiosignal                 1.3.1                    pypi_0    pypi
altair                    5.0.1                    pypi_0    pypi
anyio                     3.7.0                    pypi_0    pypi
async-timeout             4.0.2                    pypi_0    pypi
attrs                     23.1.0                   pypi_0    pypi
auto-gptq                 0.2.2+cu117              pypi_0    pypi
bitsandbytes              0.39.1                   pypi_0    pypi
bzip2                     1.0.8                h7b6447c_0  
ca-certificates           2023.01.10           h06a4308_0  
certifi                   2022.12.7                pypi_0    pypi
charset-normalizer        2.1.1                    pypi_0    pypi
click                     8.1.3                    pypi_0    pypi
cmake                     3.25.0                   pypi_0    pypi
colorama                  0.4.6                    pypi_0    pypi
contourpy                 1.0.7                    pypi_0    pypi
cycler                    0.11.0                   pypi_0    pypi
datasets                  2.12.0                   pypi_0    pypi
dill                      0.3.6                    pypi_0    pypi
diskcache                 5.6.1                    pypi_0    pypi
einops                    0.6.1                    pypi_0    pypi
exceptiongroup            1.1.1                    pypi_0    pypi
exllama                   0.0.5+cu117              pypi_0    pypi
fastapi                   0.95.2                   pypi_0    pypi
ffmpy                     0.3.0                    pypi_0    pypi
filelock                  3.9.0                    pypi_0    pypi
flexgen                   0.1.7                    pypi_0    pypi
fonttools                 4.39.4                   pypi_0    pypi
frozenlist                1.3.3                    pypi_0    pypi
fsspec                    2023.5.0                 pypi_0    pypi
gradio                    3.33.1                   pypi_0    pypi
gradio-client             0.2.5                    pypi_0    pypi
h11                       0.14.0                   pypi_0    pypi
httpcore                  0.17.2                   pypi_0    pypi
httpx                     0.24.1                   pypi_0    pypi
huggingface-hub           0.14.1                   pypi_0    pypi
idna                      3.4                      pypi_0    pypi
jinja2                    3.1.2                    pypi_0    pypi
jsonschema                4.17.3                   pypi_0    pypi
kiwisolver                1.4.4                    pypi_0    pypi
ld_impl_linux-64          2.38                 h1181459_1  
libffi                    3.4.4                h6a678d5_0  
libgcc-ng                 11.2.0               h1234567_1  
libgomp                   11.2.0               h1234567_1  
libstdcxx-ng              11.2.0               h1234567_1  
libuuid                   1.41.5               h5eee18b_0  
linkify-it-py             2.0.2                    pypi_0    pypi
lit                       15.0.7                   pypi_0    pypi
llama-cpp-python          0.1.66                   pypi_0    pypi
markdown                  3.4.3                    pypi_0    pypi
markdown-it-py            2.2.0                    pypi_0    pypi
markupsafe                2.1.2                    pypi_0    pypi
matplotlib                3.7.1                    pypi_0    pypi
mdit-py-plugins           0.3.3                    pypi_0    pypi
mdurl                     0.1.2                    pypi_0    pypi
mpmath                    1.2.1                    pypi_0    pypi
multidict                 6.0.4                    pypi_0    pypi
multiprocess              0.70.14                  pypi_0    pypi
ncurses                   6.4                  h6a678d5_0  
networkx                  3.0                      pypi_0    pypi
ninja                     1.11.1                   pypi_0    pypi
numpy                     1.24.3                   pypi_0    pypi
openssl                   1.1.1t               h7f8727e_0  
orjson                    3.8.14                   pypi_0    pypi
packaging                 23.1                     pypi_0    pypi
pandas                    2.0.2                    pypi_0    pypi
peft                      0.4.0.dev0               pypi_0    pypi
pillow                    9.5.0                    pypi_0    pypi
pip                       23.0.1          py310h06a4308_0  
psutil                    5.9.5                    pypi_0    pypi
pulp                      2.7.0                    pypi_0    pypi
pyarrow                   12.0.0                   pypi_0    pypi
pydantic                  1.10.8                   pypi_0    pypi
pydub                     0.25.1                   pypi_0    pypi
pygments                  2.15.1                   pypi_0    pypi
pyparsing                 3.0.9                    pypi_0    pypi
pyrsistent                0.19.3                   pypi_0    pypi
python                    3.10.11              h7a1cb2a_2  
python-dateutil           2.8.2                    pypi_0    pypi
python-multipart          0.0.6                    pypi_0    pypi
pytorch-triton-rocm       2.0.1                    pypi_0    pypi
pytz                      2023.3                   pypi_0    pypi
pyyaml                    6.0                      pypi_0    pypi
quant-cuda                0.0.0                    pypi_0    pypi
readline                  8.2                  h5eee18b_0  
regex                     2023.5.5                 pypi_0    pypi
requests                  2.28.1                   pypi_0    pypi
responses                 0.18.0                   pypi_0    pypi
rouge                     1.0.1                    pypi_0    pypi
safetensors               0.3.1                    pypi_0    pypi
scipy                     1.10.1                   pypi_0    pypi
semantic-version          2.10.0                   pypi_0    pypi
sentencepiece             0.1.99                   pypi_0    pypi
setuptools                67.8.0          py310h06a4308_0  
six                       1.16.0                   pypi_0    pypi
sniffio                   1.3.0                    pypi_0    pypi
sqlite                    3.41.2               h5eee18b_0  
starlette                 0.27.0                   pypi_0    pypi
sympy                     1.11.1                   pypi_0    pypi
tk                        8.6.12               h1ccaba5_0  
tokenizers                0.13.3                   pypi_0    pypi
toolz                     0.12.0                   pypi_0    pypi
torch                     2.0.1+rocm5.4.2          pypi_0    pypi
torchaudio                2.0.2+rocm5.4.2          pypi_0    pypi
torchvision               0.15.2+rocm5.4.2          pypi_0    pypi
tqdm                      4.65.0                   pypi_0    pypi
transformers              4.30.2                   pypi_0    pypi
typing-extensions         4.6.3                    pypi_0    pypi
tzdata                    2023.3                   pypi_0    pypi
uc-micro-py               1.0.2                    pypi_0    pypi
urllib3                   1.26.13                  pypi_0    pypi
uvicorn                   0.22.0                   pypi_0    pypi
websockets                11.0.3                   pypi_0    pypi
wheel                     0.38.4          py310h06a4308_0  
xxhash                    3.2.0                    pypi_0    pypi
xz                        5.4.2                h5eee18b_0  
yarl                      1.9.2                    pypi_0    pypi
zlib                      1.2.13               h5eee18b_0  

I saw a similar issue in #7, but it wasn't clear how it was solved there.

@fgdfgfthgr-fox
Author

Just to add: if I install into a brand-new conda environment and strictly follow the instructions in the README, I get this:

(exllama) fgdfgfthgr@fgdfgfthgr-MS-7C95:/mnt/7018F20D48B6C548/exllama$ python test_benchmark_inference.py -d /mnt/7018F20D48B6C548/text-generation-webui/models/gptq-llama30b-128g -p -ppl
No ROCm runtime is found, using ROCM_HOME='/opt/rocm-5.4.2'
No CUDA runtime is found, using CUDA_HOME='/usr/local/cuda'
Traceback (most recent call last):
  File "/mnt/7018F20D48B6C548/exllama/test_benchmark_inference.py", line 1, in <module>
    from model import ExLlama, ExLlamaCache, ExLlamaConfig
  File "/mnt/7018F20D48B6C548/exllama/model.py", line 12, in <module>
    import cuda_ext
  File "/mnt/7018F20D48B6C548/exllama/cuda_ext.py", line 43, in <module>
    exllama_ext = load(
  File "/home/fgdfgfthgr/anaconda3/envs/exllama/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1301, in load
    return _jit_compile(
  File "/home/fgdfgfthgr/anaconda3/envs/exllama/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1524, in _jit_compile
    _write_ninja_file_and_build_library(
  File "/home/fgdfgfthgr/anaconda3/envs/exllama/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1626, in _write_ninja_file_and_build_library
    _write_ninja_file_to_build_library(
  File "/home/fgdfgfthgr/anaconda3/envs/exllama/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 2030, in _write_ninja_file_to_build_library
    cuda_flags = common_cflags + COMMON_NVCC_FLAGS + _get_cuda_arch_flags()
  File "/home/fgdfgfthgr/anaconda3/envs/exllama/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1796, in _get_cuda_arch_flags
    arch_list[-1] += '+PTX'
IndexError: list index out of range
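
The last frame indexes `arch_list[-1]`, so the list of detected GPU architectures must have been empty when that line ran, which fits the two "No ... runtime is found" lines at the top of the output. Below is a minimal sketch of that failure mode. It is not PyTorch's actual code: `add_ptx_suffix` is a hypothetical helper and the architecture strings are purely illustrative.

```python
def add_ptx_suffix(arch_list):
    """Append '+PTX' to the last architecture, mirroring the failing line."""
    if not arch_list:
        # Hypothetical guard: name the real problem instead of raising IndexError.
        raise RuntimeError("no GPU architectures detected; is a GPU visible to PyTorch?")
    result = list(arch_list)  # copy so the caller's list is left untouched
    result[-1] += "+PTX"
    return result


if __name__ == "__main__":
    print(add_ptx_suffix(["7.0", "8.6"]))
    try:
        [][-1]  # the same operation on an empty list, as in the traceback
    except IndexError as exc:
        print("IndexError:", exc)
```

In other words, the IndexError is probably a symptom of no usable GPU runtime being found rather than a bug in exllama itself.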

@jmoney7823956789378

The -d option should be used with the model's folder, rather than the model file. It also seems like you're missing some of the rocm hip libraries.
Are you on ROCm 5.4.2?
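
One way to answer the version question without guessing is to compare the ROCm tag on the installed PyTorch wheel with the ROCm directory the build picks up. Here is a rough sketch that uses the strings from the package list and the log above. Both function names are hypothetical, and in a live environment the first string would come from `torch.__version__`.

```python
import re


def rocm_version_from_torch(torch_version):
    """Return the ROCm version encoded in a PyTorch version string, or None."""
    match = re.search(r"\+rocm(\d+(?:\.\d+)*)", torch_version)
    return match.group(1) if match else None


def rocm_version_from_home(rocm_home):
    """Return the version suffix of a path such as '/opt/rocm-5.4.2', or None."""
    match = re.search(r"rocm-(\d+(?:\.\d+)*)/?$", rocm_home)
    return match.group(1) if match else None


if __name__ == "__main__":
    wheel = rocm_version_from_torch("2.0.1+rocm5.4.2")  # from `conda list`
    system = rocm_version_from_home("/opt/rocm-5.4.2")  # from the ROCM_HOME line
    print(wheel, system, wheel == system)
```

A mismatch between the two values does not prove anything by itself, but it is worth mentioning when reporting a build failure.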

@ardfork
Contributor

ardfork commented Jul 4, 2023

From the error, it seems like you are missing hipSPARSE on your system. I wasn't able to check whether it is available in your distro's repos. If it is not, the easiest solution is probably to use a ROCm container with docker or podman.
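
A quick way to confirm this is to look for the header named in the compiler error under the ROCm include directories. The following is a small sketch, assuming ROCm lives under the `ROCM_HOME` reported earlier (`/opt/rocm-5.4.2`). `find_header` is a hypothetical helper, and the directory list should be adjusted to match your install.

```python
from pathlib import Path


def find_header(header, include_dirs):
    """Return the first existing path to `header` under `include_dirs`, else None."""
    for directory in include_dirs:
        candidate = Path(directory) / header
        if candidate.is_file():
            return candidate
    return None


if __name__ == "__main__":
    # Assumed include directories; adjust to the ROCm version actually installed.
    rocm_includes = ["/opt/rocm-5.4.2/include", "/opt/rocm-5.4.2/hip/include"]
    found = find_header("hipsparse/hipsparse.h", rocm_includes)
    if found:
        print("found:", found)
    else:
        print("hipsparse/hipsparse.h not found; hipSPARSE headers are likely not installed")
```

If the header is reported as missing, installing your distro's hipSPARSE development package (if it provides one) or using a ROCm container should supply it.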

@fgdfgfthgr-fox
Author

> The -d option should be used with the model's folder, rather than the model file. It also seems like you're missing some of the rocm hip libraries. Are you on ROCm 5.4.2?

> From the error, it seems like you are missing hipSPARSE on your system. I wasn't able to check whether it is available in your distro's repos. If it is not, the easiest solution is probably to use a ROCm container with docker or podman.

Hi there, I am on ROCm 5.4.2, installed from my distro's repo. However, the README does say that the docker image currently only supports NVIDIA GPUs...

@fgdfgfthgr-fox
Author

I just updated ROCm to 5.6.0. It still outputs the same error message.

(textgen) fgdfgfthgr@fgdfgfthgr-MS-7C95:/mnt/7018F20D48B6C548/exllama$ python test_benchmark_inference.py -d /mnt/7018F20D48B6C548/text-generation-webui/models/gptq-llama30b-128g -p -ppl
Successfully preprocessed all matching files.
Traceback (most recent call last):
  File "/home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1893, in _run_ninja_build
    subprocess.run(
  File "/home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/subprocess.py", line 526, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/mnt/7018F20D48B6C548/exllama/test_benchmark_inference.py", line 1, in <module>
    from model import ExLlama, ExLlamaCache, ExLlamaConfig
  File "/mnt/7018F20D48B6C548/exllama/model.py", line 12, in <module>
    import cuda_ext
  File "/mnt/7018F20D48B6C548/exllama/cuda_ext.py", line 43, in <module>
    exllama_ext = load(
  File "/home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1284, in load
    return _jit_compile(
  File "/home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1509, in _jit_compile
    _write_ninja_file_and_build_library(
  File "/home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1624, in _write_ninja_file_and_build_library
    _run_ninja_build(
  File "/home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1909, in _run_ninja_build
    raise RuntimeError(message) from e
RuntimeError: Error building extension 'exllama_ext': [1/12] c++ -MMD -MF rep_penalty.o.d -DTORCH_EXTENSION_NAME=exllama_ext -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/mnt/7018F20D48B6C548/exllama/exllama_ext -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/TH -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/THC -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/THH -isystem /opt/rocm-5.6.0/include -isystem /opt/rocm-5.6.0/miopen/include -isystem /opt/rocm-5.6.0/hip/include -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++17 -O3 -c /mnt/7018F20D48B6C548/exllama/exllama_ext/cpu_func/rep_penalty.cpp -o rep_penalty.o -fPIC -D__HIP_PLATFORM_HCC__=1 -DUSE_ROCM=1
[2/12] /opt/rocm-5.6.0/bin/hipcc  -DWITH_HIP -DTORCH_EXTENSION_NAME=exllama_ext -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/mnt/7018F20D48B6C548/exllama/exllama_ext -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/TH -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/THC -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/THH -isystem /opt/rocm-5.6.0/include -isystem /opt/rocm-5.6.0/miopen/include -isystem /opt/rocm-5.6.0/hip/include -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++17 -O3 -fPIC -D__HIP_PLATFORM_HCC__=1 -DUSE_ROCM=1 -DCUDA_HAS_FP16=1 -D__HIP_NO_HALF_OPERATORS__=1 -D__HIP_NO_HALF_CONVERSIONS__=1 -lineinfo -U__HIP_NO_HALF_CONVERSIONS__ -O3 --amdgpu-target=gfx900 --amdgpu-target=gfx906 --amdgpu-target=gfx908 --amdgpu-target=gfx90a --amdgpu-target=gfx1030 -fno-gpu-rdc -c /mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/half_matmul.hip -o half_matmul.cuda.o 
FAILED: half_matmul.cuda.o 
/opt/rocm-5.6.0/bin/hipcc  -DWITH_HIP -DTORCH_EXTENSION_NAME=exllama_ext -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/mnt/7018F20D48B6C548/exllama/exllama_ext -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/TH -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/THC -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/THH -isystem /opt/rocm-5.6.0/include -isystem /opt/rocm-5.6.0/miopen/include -isystem /opt/rocm-5.6.0/hip/include -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++17 -O3 -fPIC -D__HIP_PLATFORM_HCC__=1 -DUSE_ROCM=1 -DCUDA_HAS_FP16=1 -D__HIP_NO_HALF_OPERATORS__=1 -D__HIP_NO_HALF_CONVERSIONS__=1 -lineinfo -U__HIP_NO_HALF_CONVERSIONS__ -O3 --amdgpu-target=gfx900 --amdgpu-target=gfx906 --amdgpu-target=gfx908 --amdgpu-target=gfx90a --amdgpu-target=gfx1030 -fno-gpu-rdc -c /mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/half_matmul.hip -o half_matmul.cuda.o 
Warning: The --amdgpu-target option has been deprecated and will be removed in the future.  Use --offload-arch instead.
Warning: The --amdgpu-target option has been deprecated and will be removed in the future.  Use --offload-arch instead.
Warning: The --amdgpu-target option has been deprecated and will be removed in the future.  Use --offload-arch instead.
Warning: The --amdgpu-target option has been deprecated and will be removed in the future.  Use --offload-arch instead.
Warning: The --amdgpu-target option has been deprecated and will be removed in the future.  Use --offload-arch instead.
clang-16: warning: -lineinfo: 'linker' input unused [-Wunused-command-line-argument]
In file included from /mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/half_matmul.hip:3:
In file included from /mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/../hip_func/half_matmul.cuh:8:
/home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/ATen/hip/HIPContext.h:7:10: fatal error: 'hipsparse/hipsparse.h' file not found
#include <hipsparse/hipsparse.h>
         ^~~~~~~~~~~~~~~~~~~~~~~
1 error generated when compiling for gfx1030.
[3/12] /opt/rocm-5.6.0/bin/hipcc  -DWITH_HIP -DTORCH_EXTENSION_NAME=exllama_ext -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/mnt/7018F20D48B6C548/exllama/exllama_ext -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/TH -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/THC -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/THH -isystem /opt/rocm-5.6.0/include -isystem /opt/rocm-5.6.0/miopen/include -isystem /opt/rocm-5.6.0/hip/include -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++17 -O3 -fPIC -D__HIP_PLATFORM_HCC__=1 -DUSE_ROCM=1 -DCUDA_HAS_FP16=1 -D__HIP_NO_HALF_OPERATORS__=1 -D__HIP_NO_HALF_CONVERSIONS__=1 -lineinfo -U__HIP_NO_HALF_CONVERSIONS__ -O3 --amdgpu-target=gfx900 --amdgpu-target=gfx906 --amdgpu-target=gfx908 --amdgpu-target=gfx90a --amdgpu-target=gfx1030 -fno-gpu-rdc -c /mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/q4_matmul.hip -o q4_matmul.cuda.o 
FAILED: q4_matmul.cuda.o 
/opt/rocm-5.6.0/bin/hipcc  -DWITH_HIP -DTORCH_EXTENSION_NAME=exllama_ext -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/mnt/7018F20D48B6C548/exllama/exllama_ext -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/TH -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/THC -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/THH -isystem /opt/rocm-5.6.0/include -isystem /opt/rocm-5.6.0/miopen/include -isystem /opt/rocm-5.6.0/hip/include -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++17 -O3 -fPIC -D__HIP_PLATFORM_HCC__=1 -DUSE_ROCM=1 -DCUDA_HAS_FP16=1 -D__HIP_NO_HALF_OPERATORS__=1 -D__HIP_NO_HALF_CONVERSIONS__=1 -lineinfo -U__HIP_NO_HALF_CONVERSIONS__ -O3 --amdgpu-target=gfx900 --amdgpu-target=gfx906 --amdgpu-target=gfx908 --amdgpu-target=gfx90a --amdgpu-target=gfx1030 -fno-gpu-rdc -c /mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/q4_matmul.hip -o q4_matmul.cuda.o 
Warning: The --amdgpu-target option has been deprecated and will be removed in the future.  Use --offload-arch instead.
Warning: The --amdgpu-target option has been deprecated and will be removed in the future.  Use --offload-arch instead.
Warning: The --amdgpu-target option has been deprecated and will be removed in the future.  Use --offload-arch instead.
Warning: The --amdgpu-target option has been deprecated and will be removed in the future.  Use --offload-arch instead.
Warning: The --amdgpu-target option has been deprecated and will be removed in the future.  Use --offload-arch instead.
clang-16: warning: -lineinfo: 'linker' input unused [-Wunused-command-line-argument]
In file included from /mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/q4_matmul.hip:3:
In file included from /mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/../hip_func/q4_matmul.cuh:9:
/home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/ATen/hip/HIPContext.h:7:10: fatal error: 'hipsparse/hipsparse.h' file not found
#include <hipsparse/hipsparse.h>
         ^~~~~~~~~~~~~~~~~~~~~~~
1 error generated when compiling for gfx1030.
[4/12] c++ -MMD -MF exllama_ext_hip.o.d -DTORCH_EXTENSION_NAME=exllama_ext -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/mnt/7018F20D48B6C548/exllama/exllama_ext -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/TH -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/THC -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/THH -isystem /opt/rocm-5.6.0/include -isystem /opt/rocm-5.6.0/miopen/include -isystem /opt/rocm-5.6.0/hip/include -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++17 -O3 -c /mnt/7018F20D48B6C548/exllama/exllama_ext/exllama_ext_hip.cpp -o exllama_ext_hip.o -fPIC -D__HIP_PLATFORM_HCC__=1 -DUSE_ROCM=1
FAILED: exllama_ext_hip.o 
c++ -MMD -MF exllama_ext_hip.o.d -DTORCH_EXTENSION_NAME=exllama_ext -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/mnt/7018F20D48B6C548/exllama/exllama_ext -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/TH -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/THC -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/THH -isystem /opt/rocm-5.6.0/include -isystem /opt/rocm-5.6.0/miopen/include -isystem /opt/rocm-5.6.0/hip/include -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++17 -O3 -c /mnt/7018F20D48B6C548/exllama/exllama_ext/exllama_ext_hip.cpp -o exllama_ext_hip.o -fPIC -D__HIP_PLATFORM_HCC__=1 -DUSE_ROCM=1
In file included from /mnt/7018F20D48B6C548/exllama/exllama_ext/exllama_ext_hip.cpp:4:
/home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/ATen/hip/HIPContext.h:7:10: fatal error: hipsparse/hipsparse.h: No such file or directory
    7 | #include <hipsparse/hipsparse.h>
      |          ^~~~~~~~~~~~~~~~~~~~~~~
compilation terminated.
[5/12] /opt/rocm-5.6.0/bin/hipcc  -DWITH_HIP -DTORCH_EXTENSION_NAME=exllama_ext -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/mnt/7018F20D48B6C548/exllama/exllama_ext -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/TH -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/THC -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/THH -isystem /opt/rocm-5.6.0/include -isystem /opt/rocm-5.6.0/miopen/include -isystem /opt/rocm-5.6.0/hip/include -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++17 -O3 -fPIC -D__HIP_PLATFORM_HCC__=1 -DUSE_ROCM=1 -DCUDA_HAS_FP16=1 -D__HIP_NO_HALF_OPERATORS__=1 -D__HIP_NO_HALF_CONVERSIONS__=1 -lineinfo -U__HIP_NO_HALF_CONVERSIONS__ -O3 --amdgpu-target=gfx900 --amdgpu-target=gfx906 --amdgpu-target=gfx908 --amdgpu-target=gfx90a --amdgpu-target=gfx1030 -fno-gpu-rdc -c /mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/q4_mlp.hip -o q4_mlp.cuda.o 
FAILED: q4_mlp.cuda.o 
/opt/rocm-5.6.0/bin/hipcc  -DWITH_HIP -DTORCH_EXTENSION_NAME=exllama_ext -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/mnt/7018F20D48B6C548/exllama/exllama_ext -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/TH -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/THC -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/THH -isystem /opt/rocm-5.6.0/include -isystem /opt/rocm-5.6.0/miopen/include -isystem /opt/rocm-5.6.0/hip/include -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++17 -O3 -fPIC -D__HIP_PLATFORM_HCC__=1 -DUSE_ROCM=1 -DCUDA_HAS_FP16=1 -D__HIP_NO_HALF_OPERATORS__=1 -D__HIP_NO_HALF_CONVERSIONS__=1 -lineinfo -U__HIP_NO_HALF_CONVERSIONS__ -O3 --amdgpu-target=gfx900 --amdgpu-target=gfx906 --amdgpu-target=gfx908 --amdgpu-target=gfx90a --amdgpu-target=gfx1030 -fno-gpu-rdc -c /mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/q4_mlp.hip -o q4_mlp.cuda.o 
Warning: The --amdgpu-target option has been deprecated and will be removed in the future.  Use --offload-arch instead.
Warning: The --amdgpu-target option has been deprecated and will be removed in the future.  Use --offload-arch instead.
Warning: The --amdgpu-target option has been deprecated and will be removed in the future.  Use --offload-arch instead.
Warning: The --amdgpu-target option has been deprecated and will be removed in the future.  Use --offload-arch instead.
Warning: The --amdgpu-target option has been deprecated and will be removed in the future.  Use --offload-arch instead.
clang-16: warning: -lineinfo: 'linker' input unused [-Wunused-command-line-argument]
In file included from /mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/q4_mlp.hip:3:
In file included from /mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/../hip_func/q4_mlp.cuh:8:
/home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/ATen/hip/HIPContext.h:7:10: fatal error: 'hipsparse/hipsparse.h' file not found
#include <hipsparse/hipsparse.h>
         ^~~~~~~~~~~~~~~~~~~~~~~
1 error generated when compiling for gfx1030.
[6/12] /opt/rocm-5.6.0/bin/hipcc  -DWITH_HIP -DTORCH_EXTENSION_NAME=exllama_ext -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/mnt/7018F20D48B6C548/exllama/exllama_ext -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/TH -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/THC -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/THH -isystem /opt/rocm-5.6.0/include -isystem /opt/rocm-5.6.0/miopen/include -isystem /opt/rocm-5.6.0/hip/include -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++17 -O3 -fPIC -D__HIP_PLATFORM_HCC__=1 -DUSE_ROCM=1 -DCUDA_HAS_FP16=1 -D__HIP_NO_HALF_OPERATORS__=1 -D__HIP_NO_HALF_CONVERSIONS__=1 -lineinfo -U__HIP_NO_HALF_CONVERSIONS__ -O3 --amdgpu-target=gfx900 --amdgpu-target=gfx906 --amdgpu-target=gfx908 --amdgpu-target=gfx90a --amdgpu-target=gfx1030 -fno-gpu-rdc -c /mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/q4_attn.hip -o q4_attn.cuda.o 
FAILED: q4_attn.cuda.o 
/opt/rocm-5.6.0/bin/hipcc  -DWITH_HIP -DTORCH_EXTENSION_NAME=exllama_ext -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/mnt/7018F20D48B6C548/exllama/exllama_ext -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/TH -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/THC -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/THH -isystem /opt/rocm-5.6.0/include -isystem /opt/rocm-5.6.0/miopen/include -isystem /opt/rocm-5.6.0/hip/include -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++17 -O3 -fPIC -D__HIP_PLATFORM_HCC__=1 -DUSE_ROCM=1 -DCUDA_HAS_FP16=1 -D__HIP_NO_HALF_OPERATORS__=1 -D__HIP_NO_HALF_CONVERSIONS__=1 -lineinfo -U__HIP_NO_HALF_CONVERSIONS__ -O3 --amdgpu-target=gfx900 --amdgpu-target=gfx906 --amdgpu-target=gfx908 --amdgpu-target=gfx90a --amdgpu-target=gfx1030 -fno-gpu-rdc -c /mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/q4_attn.hip -o q4_attn.cuda.o 
Warning: The --amdgpu-target option has been deprecated and will be removed in the future.  Use --offload-arch instead.
Warning: The --amdgpu-target option has been deprecated and will be removed in the future.  Use --offload-arch instead.
Warning: The --amdgpu-target option has been deprecated and will be removed in the future.  Use --offload-arch instead.
Warning: The --amdgpu-target option has been deprecated and will be removed in the future.  Use --offload-arch instead.
Warning: The --amdgpu-target option has been deprecated and will be removed in the future.  Use --offload-arch instead.
clang-16: warning: -lineinfo: 'linker' input unused [-Wunused-command-line-argument]
In file included from /mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/q4_attn.hip:3:
In file included from /mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/../hip_func/q4_mlp.cuh:8:
/home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/ATen/hip/HIPContext.h:7:10: fatal error: 'hipsparse/hipsparse.h' file not found
#include <hipsparse/hipsparse.h>
         ^~~~~~~~~~~~~~~~~~~~~~~
1 error generated when compiling for gfx1030.
[7/12] /opt/rocm-5.6.0/bin/hipcc  -DWITH_HIP -DTORCH_EXTENSION_NAME=exllama_ext -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/mnt/7018F20D48B6C548/exllama/exllama_ext -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/TH -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/THC -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/THH -isystem /opt/rocm-5.6.0/include -isystem /opt/rocm-5.6.0/miopen/include -isystem /opt/rocm-5.6.0/hip/include -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++17 -O3 -fPIC -D__HIP_PLATFORM_HCC__=1 -DUSE_ROCM=1 -DCUDA_HAS_FP16=1 -D__HIP_NO_HALF_OPERATORS__=1 -D__HIP_NO_HALF_CONVERSIONS__=1 -lineinfo -U__HIP_NO_HALF_CONVERSIONS__ -O3 --amdgpu-target=gfx900 --amdgpu-target=gfx906 --amdgpu-target=gfx908 --amdgpu-target=gfx90a --amdgpu-target=gfx1030 -fno-gpu-rdc -c /mnt/7018F20D48B6C548/exllama/exllama_ext/hip_buffers.hip -o hip_buffers.cuda.o 
Warning: The --amdgpu-target option has been deprecated and will be removed in the future.  Use --offload-arch instead.
Warning: The --amdgpu-target option has been deprecated and will be removed in the future.  Use --offload-arch instead.
Warning: The --amdgpu-target option has been deprecated and will be removed in the future.  Use --offload-arch instead.
Warning: The --amdgpu-target option has been deprecated and will be removed in the future.  Use --offload-arch instead.
Warning: The --amdgpu-target option has been deprecated and will be removed in the future.  Use --offload-arch instead.
clang-16: warning: -lineinfo: 'linker' input unused [-Wunused-command-line-argument]
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_buffers.hip:29:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipSetDevice(_device);
    ^~~~~~~~~~~~ ~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_buffers.hip:31:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipStreamCreate(&alt_stream_1);
    ^~~~~~~~~~~~~~~ ~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_buffers.hip:32:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipStreamCreate(&alt_stream_2);
    ^~~~~~~~~~~~~~~ ~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_buffers.hip:33:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipStreamCreate(&alt_stream_3);
    ^~~~~~~~~~~~~~~ ~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_buffers.hip:34:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipEventCreate(&alt_stream_1_done);
    ^~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_buffers.hip:35:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipEventCreate(&alt_stream_2_done);
    ^~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_buffers.hip:36:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipEventCreate(&alt_stream_3_done);
    ^~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_buffers.hip:41:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipStreamDestroy(alt_stream_1);
    ^~~~~~~~~~~~~~~~ ~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_buffers.hip:42:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipStreamDestroy(alt_stream_2);
    ^~~~~~~~~~~~~~~~ ~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_buffers.hip:43:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipStreamDestroy(alt_stream_3);
    ^~~~~~~~~~~~~~~~ ~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_buffers.hip:44:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipEventDestroy(alt_stream_1_done);
    ^~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_buffers.hip:45:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipEventDestroy(alt_stream_2_done);
    ^~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_buffers.hip:46:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipEventDestroy(alt_stream_3_done);
    ^~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_buffers.hip:54:9: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
        hipMemsetAsync(temp_zeros_float, 0, max_zeros_float * sizeof(float));
        ^~~~~~~~~~~~~~
14 warnings generated when compiling for gfx1030.
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_buffers.hip:29:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipSetDevice(_device);
    ^~~~~~~~~~~~ ~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_buffers.hip:31:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipStreamCreate(&alt_stream_1);
    ^~~~~~~~~~~~~~~ ~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_buffers.hip:32:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipStreamCreate(&alt_stream_2);
    ^~~~~~~~~~~~~~~ ~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_buffers.hip:33:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipStreamCreate(&alt_stream_3);
    ^~~~~~~~~~~~~~~ ~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_buffers.hip:34:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipEventCreate(&alt_stream_1_done);
    ^~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_buffers.hip:35:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipEventCreate(&alt_stream_2_done);
    ^~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_buffers.hip:36:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipEventCreate(&alt_stream_3_done);
    ^~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_buffers.hip:41:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipStreamDestroy(alt_stream_1);
    ^~~~~~~~~~~~~~~~ ~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_buffers.hip:42:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipStreamDestroy(alt_stream_2);
    ^~~~~~~~~~~~~~~~ ~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_buffers.hip:43:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipStreamDestroy(alt_stream_3);
    ^~~~~~~~~~~~~~~~ ~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_buffers.hip:44:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipEventDestroy(alt_stream_1_done);
    ^~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_buffers.hip:45:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipEventDestroy(alt_stream_2_done);
    ^~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_buffers.hip:46:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipEventDestroy(alt_stream_3_done);
    ^~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_buffers.hip:54:9: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
        hipMemsetAsync(temp_zeros_float, 0, max_zeros_float * sizeof(float));
        ^~~~~~~~~~~~~~
14 warnings generated when compiling for gfx900.
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_buffers.hip:29:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipSetDevice(_device);
    ^~~~~~~~~~~~ ~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_buffers.hip:31:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipStreamCreate(&alt_stream_1);
    ^~~~~~~~~~~~~~~ ~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_buffers.hip:32:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipStreamCreate(&alt_stream_2);
    ^~~~~~~~~~~~~~~ ~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_buffers.hip:33:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipStreamCreate(&alt_stream_3);
    ^~~~~~~~~~~~~~~ ~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_buffers.hip:34:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipEventCreate(&alt_stream_1_done);
    ^~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_buffers.hip:35:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipEventCreate(&alt_stream_2_done);
    ^~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_buffers.hip:36:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipEventCreate(&alt_stream_3_done);
    ^~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_buffers.hip:41:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipStreamDestroy(alt_stream_1);
    ^~~~~~~~~~~~~~~~ ~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_buffers.hip:42:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipStreamDestroy(alt_stream_2);
    ^~~~~~~~~~~~~~~~ ~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_buffers.hip:43:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipStreamDestroy(alt_stream_3);
    ^~~~~~~~~~~~~~~~ ~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_buffers.hip:44:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipEventDestroy(alt_stream_1_done);
    ^~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_buffers.hip:45:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipEventDestroy(alt_stream_2_done);
    ^~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_buffers.hip:46:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipEventDestroy(alt_stream_3_done);
    ^~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_buffers.hip:54:9: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
        hipMemsetAsync(temp_zeros_float, 0, max_zeros_float * sizeof(float));
        ^~~~~~~~~~~~~~
14 warnings generated when compiling for gfx906.
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_buffers.hip:29:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipSetDevice(_device);
    ^~~~~~~~~~~~ ~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_buffers.hip:31:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipStreamCreate(&alt_stream_1);
    ^~~~~~~~~~~~~~~ ~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_buffers.hip:32:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipStreamCreate(&alt_stream_2);
    ^~~~~~~~~~~~~~~ ~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_buffers.hip:33:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipStreamCreate(&alt_stream_3);
    ^~~~~~~~~~~~~~~ ~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_buffers.hip:34:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipEventCreate(&alt_stream_1_done);
    ^~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_buffers.hip:35:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipEventCreate(&alt_stream_2_done);
    ^~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_buffers.hip:36:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipEventCreate(&alt_stream_3_done);
    ^~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_buffers.hip:41:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipStreamDestroy(alt_stream_1);
    ^~~~~~~~~~~~~~~~ ~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_buffers.hip:42:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipStreamDestroy(alt_stream_2);
    ^~~~~~~~~~~~~~~~ ~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_buffers.hip:43:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipStreamDestroy(alt_stream_3);
    ^~~~~~~~~~~~~~~~ ~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_buffers.hip:44:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipEventDestroy(alt_stream_1_done);
    ^~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_buffers.hip:45:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipEventDestroy(alt_stream_2_done);
    ^~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_buffers.hip:46:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipEventDestroy(alt_stream_3_done);
    ^~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_buffers.hip:54:9: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
        hipMemsetAsync(temp_zeros_float, 0, max_zeros_float * sizeof(float));
        ^~~~~~~~~~~~~~
14 warnings generated when compiling for gfx908.
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_buffers.hip:29:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipSetDevice(_device);
    ^~~~~~~~~~~~ ~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_buffers.hip:31:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipStreamCreate(&alt_stream_1);
    ^~~~~~~~~~~~~~~ ~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_buffers.hip:32:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipStreamCreate(&alt_stream_2);
    ^~~~~~~~~~~~~~~ ~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_buffers.hip:33:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipStreamCreate(&alt_stream_3);
    ^~~~~~~~~~~~~~~ ~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_buffers.hip:34:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipEventCreate(&alt_stream_1_done);
    ^~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_buffers.hip:35:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipEventCreate(&alt_stream_2_done);
    ^~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_buffers.hip:36:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipEventCreate(&alt_stream_3_done);
    ^~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_buffers.hip:41:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipStreamDestroy(alt_stream_1);
    ^~~~~~~~~~~~~~~~ ~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_buffers.hip:42:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipStreamDestroy(alt_stream_2);
    ^~~~~~~~~~~~~~~~ ~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_buffers.hip:43:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipStreamDestroy(alt_stream_3);
    ^~~~~~~~~~~~~~~~ ~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_buffers.hip:44:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipEventDestroy(alt_stream_1_done);
    ^~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_buffers.hip:45:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipEventDestroy(alt_stream_2_done);
    ^~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_buffers.hip:46:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipEventDestroy(alt_stream_3_done);
    ^~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_buffers.hip:54:9: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
        hipMemsetAsync(temp_zeros_float, 0, max_zeros_float * sizeof(float));
        ^~~~~~~~~~~~~~
14 warnings generated when compiling for gfx90a.
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_buffers.hip:29:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipSetDevice(_device);
    ^~~~~~~~~~~~ ~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_buffers.hip:31:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipStreamCreate(&alt_stream_1);
    ^~~~~~~~~~~~~~~ ~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_buffers.hip:32:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipStreamCreate(&alt_stream_2);
    ^~~~~~~~~~~~~~~ ~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_buffers.hip:33:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipStreamCreate(&alt_stream_3);
    ^~~~~~~~~~~~~~~ ~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_buffers.hip:34:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipEventCreate(&alt_stream_1_done);
    ^~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_buffers.hip:35:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipEventCreate(&alt_stream_2_done);
    ^~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_buffers.hip:36:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipEventCreate(&alt_stream_3_done);
    ^~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_buffers.hip:41:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipStreamDestroy(alt_stream_1);
    ^~~~~~~~~~~~~~~~ ~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_buffers.hip:42:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipStreamDestroy(alt_stream_2);
    ^~~~~~~~~~~~~~~~ ~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_buffers.hip:43:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipStreamDestroy(alt_stream_3);
    ^~~~~~~~~~~~~~~~ ~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_buffers.hip:44:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipEventDestroy(alt_stream_1_done);
    ^~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_buffers.hip:45:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipEventDestroy(alt_stream_2_done);
    ^~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_buffers.hip:46:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipEventDestroy(alt_stream_3_done);
    ^~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_buffers.hip:54:9: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
        hipMemsetAsync(temp_zeros_float, 0, max_zeros_float * sizeof(float));
        ^~~~~~~~~~~~~~
14 warnings generated when compiling for host.
[8/12] /opt/rocm-5.6.0/bin/hipcc  -DWITH_HIP -DTORCH_EXTENSION_NAME=exllama_ext -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/mnt/7018F20D48B6C548/exllama/exllama_ext -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/TH -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/THC -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/THH -isystem /opt/rocm-5.6.0/include -isystem /opt/rocm-5.6.0/miopen/include -isystem /opt/rocm-5.6.0/hip/include -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++17 -O3 -fPIC -D__HIP_PLATFORM_HCC__=1 -DUSE_ROCM=1 -DCUDA_HAS_FP16=1 -D__HIP_NO_HALF_OPERATORS__=1 -D__HIP_NO_HALF_CONVERSIONS__=1 -lineinfo -U__HIP_NO_HALF_CONVERSIONS__ -O3 --amdgpu-target=gfx900 --amdgpu-target=gfx906 --amdgpu-target=gfx908 --amdgpu-target=gfx90a --amdgpu-target=gfx1030 -fno-gpu-rdc -c /mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/column_remap.hip -o column_remap.cuda.o 
Warning: The --amdgpu-target option has been deprecated and will be removed in the future.  Use --offload-arch instead.
Warning: The --amdgpu-target option has been deprecated and will be removed in the future.  Use --offload-arch instead.
Warning: The --amdgpu-target option has been deprecated and will be removed in the future.  Use --offload-arch instead.
Warning: The --amdgpu-target option has been deprecated and will be removed in the future.  Use --offload-arch instead.
Warning: The --amdgpu-target option has been deprecated and will be removed in the future.  Use --offload-arch instead.
clang-16: warning: -lineinfo: 'linker' input unused [-Wunused-command-line-argument]
In file included from /mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/column_remap.hip:4:
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/../util_hip.cuh:44:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipDeviceSynchronize();
    ^~~~~~~~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/../util_hip.cuh:58:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipDeviceSynchronize();
    ^~~~~~~~~~~~~~~~~~~~
2 warnings generated when compiling for gfx1030.
In file included from /mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/column_remap.hip:4:
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/../util_hip.cuh:44:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipDeviceSynchronize();
    ^~~~~~~~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/../util_hip.cuh:58:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipDeviceSynchronize();
    ^~~~~~~~~~~~~~~~~~~~
2 warnings generated when compiling for gfx900.
In file included from /mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/column_remap.hip:4:
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/../util_hip.cuh:44:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipDeviceSynchronize();
    ^~~~~~~~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/../util_hip.cuh:58:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipDeviceSynchronize();
    ^~~~~~~~~~~~~~~~~~~~
2 warnings generated when compiling for gfx906.
In file included from /mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/column_remap.hip:4:
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/../util_hip.cuh:44:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipDeviceSynchronize();
    ^~~~~~~~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/../util_hip.cuh:58:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipDeviceSynchronize();
    ^~~~~~~~~~~~~~~~~~~~
2 warnings generated when compiling for gfx908.
In file included from /mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/column_remap.hip:4:
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/../util_hip.cuh:44:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipDeviceSynchronize();
    ^~~~~~~~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/../util_hip.cuh:58:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipDeviceSynchronize();
    ^~~~~~~~~~~~~~~~~~~~
2 warnings generated when compiling for gfx90a.
In file included from /mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/column_remap.hip:4:
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/../util_hip.cuh:44:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipDeviceSynchronize();
    ^~~~~~~~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/../util_hip.cuh:58:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipDeviceSynchronize();
    ^~~~~~~~~~~~~~~~~~~~
2 warnings generated when compiling for host.
[9/12] /opt/rocm-5.6.0/bin/hipcc  -DWITH_HIP -DTORCH_EXTENSION_NAME=exllama_ext -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/mnt/7018F20D48B6C548/exllama/exllama_ext -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/TH -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/THC -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/THH -isystem /opt/rocm-5.6.0/include -isystem /opt/rocm-5.6.0/miopen/include -isystem /opt/rocm-5.6.0/hip/include -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++17 -O3 -fPIC -D__HIP_PLATFORM_HCC__=1 -DUSE_ROCM=1 -DCUDA_HAS_FP16=1 -D__HIP_NO_HALF_OPERATORS__=1 -D__HIP_NO_HALF_CONVERSIONS__=1 -lineinfo -U__HIP_NO_HALF_CONVERSIONS__ -O3 --amdgpu-target=gfx900 --amdgpu-target=gfx906 --amdgpu-target=gfx908 --amdgpu-target=gfx90a --amdgpu-target=gfx1030 -fno-gpu-rdc -c /mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/q4_matrix.hip -o q4_matrix.cuda.o 
Warning: The --amdgpu-target option has been deprecated and will be removed in the future.  Use --offload-arch instead.
Warning: The --amdgpu-target option has been deprecated and will be removed in the future.  Use --offload-arch instead.
Warning: The --amdgpu-target option has been deprecated and will be removed in the future.  Use --offload-arch instead.
Warning: The --amdgpu-target option has been deprecated and will be removed in the future.  Use --offload-arch instead.
Warning: The --amdgpu-target option has been deprecated and will be removed in the future.  Use --offload-arch instead.
clang-16: warning: -lineinfo: 'linker' input unused [-Wunused-command-line-argument]
In file included from /mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/q4_matrix.hip:5:
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/../util_hip.cuh:44:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipDeviceSynchronize();
    ^~~~~~~~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/../util_hip.cuh:58:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipDeviceSynchronize();
    ^~~~~~~~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/q4_matrix.hip:46:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipSetDevice(device);
    ^~~~~~~~~~~~ ~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/q4_matrix.hip:109:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipMalloc(&cuda_new_qweight, height / 8 * width * sizeof(uint32_t));
    ^~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/q4_matrix.hip:110:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipMalloc(&cuda_x_map, height * sizeof(uint32_t));  // TODO: Should probably be allocated in PyTorch
    ^~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/q4_matrix.hip:145:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipMemcpyAsync(cuda_x_map, cpu_x_map, height * sizeof(uint32_t), hipMemcpyHostToDevice);
    ^~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/q4_matrix.hip:161:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipMemcpyAsync(cuda_qweight, cuda_new_qweight, height / 8 * width * sizeof(uint32_t), hipMemcpyDeviceToDevice);
    ^~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/q4_matrix.hip:165:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipDeviceSynchronize();
    ^~~~~~~~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/q4_matrix.hip:166:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipFree(cuda_new_qweight);
    ^~~~~~~ ~~~~~~~~~~~~~~~~
9 warnings generated when compiling for gfx1030.
In file included from /mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/q4_matrix.hip:5:
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/../util_hip.cuh:44:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipDeviceSynchronize();
    ^~~~~~~~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/../util_hip.cuh:58:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipDeviceSynchronize();
    ^~~~~~~~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/q4_matrix.hip:46:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipSetDevice(device);
    ^~~~~~~~~~~~ ~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/q4_matrix.hip:109:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipMalloc(&cuda_new_qweight, height / 8 * width * sizeof(uint32_t));
    ^~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/q4_matrix.hip:110:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipMalloc(&cuda_x_map, height * sizeof(uint32_t));  // TODO: Should probably be allocated in PyTorch
    ^~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/q4_matrix.hip:145:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipMemcpyAsync(cuda_x_map, cpu_x_map, height * sizeof(uint32_t), hipMemcpyHostToDevice);
    ^~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/q4_matrix.hip:161:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipMemcpyAsync(cuda_qweight, cuda_new_qweight, height / 8 * width * sizeof(uint32_t), hipMemcpyDeviceToDevice);
    ^~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/q4_matrix.hip:165:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipDeviceSynchronize();
    ^~~~~~~~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/q4_matrix.hip:166:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipFree(cuda_new_qweight);
    ^~~~~~~ ~~~~~~~~~~~~~~~~
9 warnings generated when compiling for gfx900.
In file included from /mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/q4_matrix.hip:5:
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/../util_hip.cuh:44:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipDeviceSynchronize();
    ^~~~~~~~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/../util_hip.cuh:58:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipDeviceSynchronize();
    ^~~~~~~~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/q4_matrix.hip:46:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipSetDevice(device);
    ^~~~~~~~~~~~ ~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/q4_matrix.hip:109:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipMalloc(&cuda_new_qweight, height / 8 * width * sizeof(uint32_t));
    ^~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/q4_matrix.hip:110:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipMalloc(&cuda_x_map, height * sizeof(uint32_t));  // TODO: Should probably be allocated in PyTorch
    ^~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/q4_matrix.hip:145:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipMemcpyAsync(cuda_x_map, cpu_x_map, height * sizeof(uint32_t), hipMemcpyHostToDevice);
    ^~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/q4_matrix.hip:161:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipMemcpyAsync(cuda_qweight, cuda_new_qweight, height / 8 * width * sizeof(uint32_t), hipMemcpyDeviceToDevice);
    ^~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/q4_matrix.hip:165:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipDeviceSynchronize();
    ^~~~~~~~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/q4_matrix.hip:166:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipFree(cuda_new_qweight);
    ^~~~~~~ ~~~~~~~~~~~~~~~~
9 warnings generated when compiling for gfx906.
In file included from /mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/q4_matrix.hip:5:
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/../util_hip.cuh:44:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipDeviceSynchronize();
    ^~~~~~~~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/../util_hip.cuh:58:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipDeviceSynchronize();
    ^~~~~~~~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/q4_matrix.hip:46:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipSetDevice(device);
    ^~~~~~~~~~~~ ~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/q4_matrix.hip:109:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipMalloc(&cuda_new_qweight, height / 8 * width * sizeof(uint32_t));
    ^~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/q4_matrix.hip:110:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipMalloc(&cuda_x_map, height * sizeof(uint32_t));  // TODO: Should probably be allocated in PyTorch
    ^~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/q4_matrix.hip:145:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipMemcpyAsync(cuda_x_map, cpu_x_map, height * sizeof(uint32_t), hipMemcpyHostToDevice);
    ^~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/q4_matrix.hip:161:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipMemcpyAsync(cuda_qweight, cuda_new_qweight, height / 8 * width * sizeof(uint32_t), hipMemcpyDeviceToDevice);
    ^~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/q4_matrix.hip:165:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipDeviceSynchronize();
    ^~~~~~~~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/q4_matrix.hip:166:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipFree(cuda_new_qweight);
    ^~~~~~~ ~~~~~~~~~~~~~~~~
9 warnings generated when compiling for gfx908.
In file included from /mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/q4_matrix.hip:5:
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/../util_hip.cuh:44:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipDeviceSynchronize();
    ^~~~~~~~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/../util_hip.cuh:58:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipDeviceSynchronize();
    ^~~~~~~~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/q4_matrix.hip:46:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipSetDevice(device);
    ^~~~~~~~~~~~ ~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/q4_matrix.hip:109:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipMalloc(&cuda_new_qweight, height / 8 * width * sizeof(uint32_t));
    ^~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/q4_matrix.hip:110:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipMalloc(&cuda_x_map, height * sizeof(uint32_t));  // TODO: Should probably be allocated in PyTorch
    ^~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/q4_matrix.hip:145:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipMemcpyAsync(cuda_x_map, cpu_x_map, height * sizeof(uint32_t), hipMemcpyHostToDevice);
    ^~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/q4_matrix.hip:161:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipMemcpyAsync(cuda_qweight, cuda_new_qweight, height / 8 * width * sizeof(uint32_t), hipMemcpyDeviceToDevice);
    ^~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/q4_matrix.hip:165:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipDeviceSynchronize();
    ^~~~~~~~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/q4_matrix.hip:166:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipFree(cuda_new_qweight);
    ^~~~~~~ ~~~~~~~~~~~~~~~~
9 warnings generated when compiling for gfx90a.
In file included from /mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/q4_matrix.hip:5:
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/../util_hip.cuh:44:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipDeviceSynchronize();
    ^~~~~~~~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/../util_hip.cuh:58:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipDeviceSynchronize();
    ^~~~~~~~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/q4_matrix.hip:46:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipSetDevice(device);
    ^~~~~~~~~~~~ ~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/q4_matrix.hip:109:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipMalloc(&cuda_new_qweight, height / 8 * width * sizeof(uint32_t));
    ^~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/q4_matrix.hip:110:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipMalloc(&cuda_x_map, height * sizeof(uint32_t));  // TODO: Should probably be allocated in PyTorch
    ^~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/q4_matrix.hip:145:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipMemcpyAsync(cuda_x_map, cpu_x_map, height * sizeof(uint32_t), hipMemcpyHostToDevice);
    ^~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/q4_matrix.hip:161:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipMemcpyAsync(cuda_qweight, cuda_new_qweight, height / 8 * width * sizeof(uint32_t), hipMemcpyDeviceToDevice);
    ^~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/q4_matrix.hip:165:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipDeviceSynchronize();
    ^~~~~~~~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/q4_matrix.hip:166:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipFree(cuda_new_qweight);
    ^~~~~~~ ~~~~~~~~~~~~~~~~
9 warnings generated when compiling for host.
[10/12] /opt/rocm-5.6.0/bin/hipcc  -DWITH_HIP -DTORCH_EXTENSION_NAME=exllama_ext -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/mnt/7018F20D48B6C548/exllama/exllama_ext -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/TH -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/THC -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/THH -isystem /opt/rocm-5.6.0/include -isystem /opt/rocm-5.6.0/miopen/include -isystem /opt/rocm-5.6.0/hip/include -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++17 -O3 -fPIC -D__HIP_PLATFORM_HCC__=1 -DUSE_ROCM=1 -DCUDA_HAS_FP16=1 -D__HIP_NO_HALF_OPERATORS__=1 -D__HIP_NO_HALF_CONVERSIONS__=1 -lineinfo -U__HIP_NO_HALF_CONVERSIONS__ -O3 --amdgpu-target=gfx900 --amdgpu-target=gfx906 --amdgpu-target=gfx908 --amdgpu-target=gfx90a --amdgpu-target=gfx1030 -fno-gpu-rdc -c /mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/rope.hip -o rope.cuda.o 
Warning: The --amdgpu-target option has been deprecated and will be removed in the future.  Use --offload-arch instead.
Warning: The --amdgpu-target option has been deprecated and will be removed in the future.  Use --offload-arch instead.
Warning: The --amdgpu-target option has been deprecated and will be removed in the future.  Use --offload-arch instead.
Warning: The --amdgpu-target option has been deprecated and will be removed in the future.  Use --offload-arch instead.
Warning: The --amdgpu-target option has been deprecated and will be removed in the future.  Use --offload-arch instead.
clang-16: warning: -lineinfo: 'linker' input unused [-Wunused-command-line-argument]
In file included from /mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/rope.hip:4:
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/../util_hip.cuh:44:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipDeviceSynchronize();
    ^~~~~~~~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/../util_hip.cuh:58:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipDeviceSynchronize();
    ^~~~~~~~~~~~~~~~~~~~
2 warnings generated when compiling for gfx1030.
In file included from /mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/rope.hip:4:
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/../util_hip.cuh:44:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipDeviceSynchronize();
    ^~~~~~~~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/../util_hip.cuh:58:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipDeviceSynchronize();
    ^~~~~~~~~~~~~~~~~~~~
2 warnings generated when compiling for gfx900.
In file included from /mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/rope.hip:4:
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/../util_hip.cuh:44:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipDeviceSynchronize();
    ^~~~~~~~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/../util_hip.cuh:58:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipDeviceSynchronize();
    ^~~~~~~~~~~~~~~~~~~~
2 warnings generated when compiling for gfx906.
In file included from /mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/rope.hip:4:
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/../util_hip.cuh:44:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipDeviceSynchronize();
    ^~~~~~~~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/../util_hip.cuh:58:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipDeviceSynchronize();
    ^~~~~~~~~~~~~~~~~~~~
2 warnings generated when compiling for gfx908.
In file included from /mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/rope.hip:4:
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/../util_hip.cuh:44:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipDeviceSynchronize();
    ^~~~~~~~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/../util_hip.cuh:58:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipDeviceSynchronize();
    ^~~~~~~~~~~~~~~~~~~~
2 warnings generated when compiling for gfx90a.
In file included from /mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/rope.hip:4:
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/../util_hip.cuh:44:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipDeviceSynchronize();
    ^~~~~~~~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/../util_hip.cuh:58:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipDeviceSynchronize();
    ^~~~~~~~~~~~~~~~~~~~
2 warnings generated when compiling for host.
[11/12] /opt/rocm-5.6.0/bin/hipcc  -DWITH_HIP -DTORCH_EXTENSION_NAME=exllama_ext -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/mnt/7018F20D48B6C548/exllama/exllama_ext -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/TH -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/THC -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/lib/python3.10/site-packages/torch/include/THH -isystem /opt/rocm-5.6.0/include -isystem /opt/rocm-5.6.0/miopen/include -isystem /opt/rocm-5.6.0/hip/include -isystem /home/fgdfgfthgr/anaconda3/envs/textgen/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++17 -O3 -fPIC -D__HIP_PLATFORM_HCC__=1 -DUSE_ROCM=1 -DCUDA_HAS_FP16=1 -D__HIP_NO_HALF_OPERATORS__=1 -D__HIP_NO_HALF_CONVERSIONS__=1 -lineinfo -U__HIP_NO_HALF_CONVERSIONS__ -O3 --amdgpu-target=gfx900 --amdgpu-target=gfx906 --amdgpu-target=gfx908 --amdgpu-target=gfx90a --amdgpu-target=gfx1030 -fno-gpu-rdc -c /mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/rms_norm.hip -o rms_norm.cuda.o 
Warning: The --amdgpu-target option has been deprecated and will be removed in the future.  Use --offload-arch instead.
Warning: The --amdgpu-target option has been deprecated and will be removed in the future.  Use --offload-arch instead.
Warning: The --amdgpu-target option has been deprecated and will be removed in the future.  Use --offload-arch instead.
Warning: The --amdgpu-target option has been deprecated and will be removed in the future.  Use --offload-arch instead.
Warning: The --amdgpu-target option has been deprecated and will be removed in the future.  Use --offload-arch instead.
clang-16: warning: -lineinfo: 'linker' input unused [-Wunused-command-line-argument]
In file included from /mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/rms_norm.hip:5:
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/../util_hip.cuh:44:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipDeviceSynchronize();
    ^~~~~~~~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/../util_hip.cuh:58:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipDeviceSynchronize();
    ^~~~~~~~~~~~~~~~~~~~
2 warnings generated when compiling for gfx1030.
In file included from /mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/rms_norm.hip:5:
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/../util_hip.cuh:44:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipDeviceSynchronize();
    ^~~~~~~~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/../util_hip.cuh:58:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipDeviceSynchronize();
    ^~~~~~~~~~~~~~~~~~~~
2 warnings generated when compiling for gfx900.
In file included from /mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/rms_norm.hip:5:
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/../util_hip.cuh:44:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipDeviceSynchronize();
    ^~~~~~~~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/../util_hip.cuh:58:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipDeviceSynchronize();
    ^~~~~~~~~~~~~~~~~~~~
2 warnings generated when compiling for gfx906.
In file included from /mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/rms_norm.hip:5:
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/../util_hip.cuh:44:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipDeviceSynchronize();
    ^~~~~~~~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/../util_hip.cuh:58:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipDeviceSynchronize();
    ^~~~~~~~~~~~~~~~~~~~
2 warnings generated when compiling for gfx908.
In file included from /mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/rms_norm.hip:5:
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/../util_hip.cuh:44:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipDeviceSynchronize();
    ^~~~~~~~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/../util_hip.cuh:58:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipDeviceSynchronize();
    ^~~~~~~~~~~~~~~~~~~~
2 warnings generated when compiling for gfx90a.
In file included from /mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/rms_norm.hip:5:
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/../util_hip.cuh:44:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipDeviceSynchronize();
    ^~~~~~~~~~~~~~~~~~~~
/mnt/7018F20D48B6C548/exllama/exllama_ext/hip_func/../util_hip.cuh:58:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    hipDeviceSynchronize();
    ^~~~~~~~~~~~~~~~~~~~
2 warnings generated when compiling for host.
ninja: build stopped: subcommand failed.

@fgdfgfthgr-fox
Author

It seems like you are missing hipSPARSE on your system.

I then added the hipsparse-dev package using apt install hipsparse. It doesn't seem to make a difference; it still fails to build the exllama_ext extension.

@ardfork
Contributor

ardfork commented Jul 5, 2023

The error is quite clear: fatal error: 'hipsparse/hipsparse.h' file not found.

Since it is finding hipcc, I don't think the problem is locating your ROCm directory. Verify that you actually have /opt/rocm-5.4.2/include/hipblas/hipblas.h or /opt/rocm-5.6.0/include/hipblas/hipblas.h. If not, find which package provides it, or use a ROCm container.
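
A minimal sketch of that check, assuming the ROCm roots are the two directories named above and that headers live under `<root>/include/`. The helper name `check_rocm_header` is made up for illustration; the header list combines the one from the error message with the one mentioned in this comment.

```shell
#!/bin/sh
# Hypothetical helper: report whether a header exists under a ROCm root.
# Usage: check_rocm_header <rocm_root> <relative_header>
check_rocm_header() {
    rocm_root="$1"
    rel_header="$2"
    if [ -f "$rocm_root/include/$rel_header" ]; then
        echo "found: $rocm_root/include/$rel_header"
        return 0
    fi
    echo "missing: $rocm_root/include/$rel_header"
    return 1
}

# Roots are assumptions taken from the paths in this thread; adjust to your install.
for root in /opt/rocm-5.4.2 /opt/rocm-5.6.0; do
    for header in hipsparse/hipsparse.h hipblas/hipblas.h; do
        # "|| true" keeps the loop going when a header is absent.
        check_rocm_header "$root" "$header" || true
    done
done
```

If a header is reported missing, a tool such as `apt-file search hipsparse/hipsparse.h` (on Debian/Ubuntu, if `apt-file` is installed) can help find the package that ships it.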

@fgdfgfthgr-fox
Author

Alright, I think I got it running successfully after installing hipsparse, rocThrust and rocPRIM.

@ardfork
Contributor

ardfork commented Jul 5, 2023

Most distros should have a package group like rocm-hip-sdk; at least the AMD repo for Ubuntu/RHEL/SUSE and the Arch repo name it that way. That group installs all the packages needed to compile and run projects using ROCm. It's easier to install that than to install all of the needed ROCm packages manually, since it's about 30 packages in total.
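
As a rough sketch, the install command would look something like the following, depending on the package manager. This assumes the repository that provides rocm-hip-sdk is already configured; the helper name `rocm_sdk_install_cmd` and the mapping of distros to `dnf`/`zypper` are assumptions for illustration. The script only prints the command; it does not run it.

```shell
#!/bin/sh
# Hypothetical helper: print (not run) the install command for the
# rocm-hip-sdk group, given a package manager name.
rocm_sdk_install_cmd() {
    case "$1" in
        apt)    echo "sudo apt install rocm-hip-sdk" ;;
        dnf)    echo "sudo dnf install rocm-hip-sdk" ;;
        zypper) echo "sudo zypper install rocm-hip-sdk" ;;
        pacman) echo "sudo pacman -S rocm-hip-sdk" ;;
        *)      echo "unknown package manager: $1" >&2; return 1 ;;
    esac
}

# Pick the first package manager found on this system and show its command.
for pm in apt dnf zypper pacman; do
    if command -v "$pm" >/dev/null 2>&1; then
        rocm_sdk_install_cmd "$pm"
        break
    fi
done
```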
