KoboldCPP-ROCm-v1.43.2
- Added a few updates like MMQ tuning for RDNA1, RDNA2, & RDNA3 GPUs
- Separated hipBLAS references from CuBLAS
- Fixed up some pyinstaller files
- Attempted adding RX 6700XT support.
Note: if you get a runtime error about customtkinter when starting the exe, try using the files directly from koboldcpp-rocm_precompiled.zip, or use those files to build the exe yourself. If you instead get an error that customtkinter was not found, run pip install customtkinter in your command prompt. customtkinter is picky, and apparently building an exe with it requires a special build environment when distributing to other people.
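If you run koboldcpp.py directly rather than the exe, a quick way to check whether customtkinter is visible to your Python install is sketched below. This is only a minimal check using the standard library; the helper name module_available is made up for illustration and is not part of koboldcpp.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if Python can locate the named top-level module."""
    # find_spec returns None when a top-level module cannot be found.
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    if module_available("customtkinter"):
        print("customtkinter is installed")
    else:
        print("customtkinter not found; try: pip install customtkinter")
```

Run it with the same Python interpreter you use to start koboldcpp.py, since pip may install into a different one.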
You need ROCm to build it, but not to run it: https://rocm.docs.amd.com/en/latest/deploy/windows/quick_start.html
Compiled for the GPUs that have Tensile libraries / are marked as supported: gfx906, gfx1030, gfx1100, gfx1101, gfx1102
6700XT users: I've tried including gfx1031 support (6700XT), but I can't test it. The files marked -with_gfx1031 are for you. To do this, I had to copy all the gfx1030 files and rename the copies to gfx1031. If you want to try this yourself, I included a zip file, gfx1031_files.zip, containing all the gfx1030 files renamed to gfx1031; just drag its rocblas folder into /koboldcpp-rocm/ after copying koboldcpp_hipblas.dll to the main koboldcpp-rocm folder. If you don't want to build the exe, you can try adding those files to the base ROCm rocblas folder here: C:\Program Files\AMD\ROCm\5.5\bin\rocblas\library
I'm looking into better ways to do this.
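The copy-and-rename step described above can be sketched in Python as follows. This is an untested illustration, not the script used for this release: the function name clone_gfx_files is hypothetical, and it assumes every file that needs duplicating has gfx1030 in its file name.

```python
import shutil
from pathlib import Path


def clone_gfx_files(library_dir, src_arch="gfx1030", dst_arch="gfx1031"):
    """Copy every file whose name contains src_arch to a sibling named for dst_arch.

    Existing destination files are left alone. Returns the list of new paths.
    """
    library = Path(library_dir)
    created = []
    # sorted() snapshots the listing, so files created below are not revisited.
    for src in sorted(library.iterdir()):
        if not src.is_file() or src_arch not in src.name:
            continue
        dst = src.with_name(src.name.replace(src_arch, dst_arch))
        if dst.exists():
            continue  # never overwrite a file that is already there
        shutil.copy2(src, dst)
        created.append(dst)
    return created
```

For example, clone_gfx_files(r"C:\Program Files\AMD\ROCm\5.5\bin\rocblas\library") would duplicate the gfx1030 files in the base ROCm folder mentioned above. The originals are kept, so gfx1030 cards are unaffected.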
To run, open the exe, or start it from the command line.
Example:
./koboldcpp_rocm.exe --usecublas normal mmq --threads 1 --stream --contextsize 4096 --usemirostat 2 6 0.1 --gpulayers 45 C:\Users\YellowRose\llama-2-7b-chat.Q8_0.gguf
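If you would rather launch from a script than retype the command, the example above can be assembled as an argument list. This is a minimal sketch: the flags and default values are copied from the example command, while the function names build_launch_args and launch are made up for illustration.

```python
import subprocess


def build_launch_args(model_path, gpulayers=45, contextsize=4096,
                      threads=1, exe="./koboldcpp_rocm.exe"):
    """Return the example command as a list suitable for subprocess.run."""
    return [
        exe,
        "--usecublas", "normal", "mmq",
        "--threads", str(threads),
        "--stream",
        "--contextsize", str(contextsize),
        "--usemirostat", "2", "6", "0.1",
        "--gpulayers", str(gpulayers),
        model_path,
    ]


def launch(model_path, **options):
    """Start koboldcpp and wait for it to exit; raises if it exits with an error."""
    return subprocess.run(build_launch_args(model_path, **options), check=True)
```

Passing the arguments as a list avoids quoting problems when the model path contains spaces. Lower gpulayers if the model does not fit in your GPU's memory.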
Building it yourself:
This site may be useful: it has some patches for Windows ROCm that help with compilation. I used them, but I'm not sure they are necessary. https://streamhpc.com/blog/2023-08-01/how-to-get-full-cmake-support-for-amd-hip-sdk-on-windows-including-patches/
Build command used (ROCm Required):
cd koboldcpp-rocm
mkdir build && cd build
cmake .. -G "Ninja" -DCMAKE_BUILD_TYPE=Release -DLLAMA_HIPBLAS=ON -DCMAKE_C_COMPILER="C:/Program Files/AMD/ROCm/5.5/bin/clang.exe" -DCMAKE_CXX_COMPILER="C:/Program Files/AMD/ROCm/5.5/bin/clang++.exe" -DAMDGPU_TARGETS="gfx803;gfx900;gfx906;gfx908;gfx90a;gfx1010;gfx1030;gfx1031;gfx1032;gfx1100;gfx1101;gfx1102"
cmake --build . -j 6
(-j 6 means use 6 CPU cores; if you have more or fewer, feel free to change the number to speed things up)
That puts koboldcpp_hipblas.dll inside of .\koboldcpp-rocm\build\bin
Copy koboldcpp_hipblas.dll to the main koboldcpp-rocm folder.
You can then run koboldcpp.py right away, like this:
python koboldcpp.py --usecublas normal mmq --threads 1 --stream --contextsize 4096 --usemirostat 2 6 0.1 --gpulayers 45 C:\Users\YellowRose\llama-2-7b-chat.Q8_0.gguf
To make it into an exe, run make_pyinstaller_exe_rocm_only.bat, which will attempt to build the exe for you and place it in /koboldcpp-rocm/dists/
Once it finishes, kobold_rocm_only.exe is built!
If you'd like to do a full-feature build with the OPENBLAS and CLBLAST backends, you'll need w64devkit. Once downloaded, open w64devkit.exe, cd into the folder, and run make LLAMA_OPENBLAS=1 LLAMA_CLBLAST=1 -j4 to build the rest of the backend files.
Once they're all built, you should be able to just run make_pyinst_rocm_hybrid_henk_yellow.bat, and it'll bundle the files together into koboldcpp_rocm.exe in the \koboldcpp-rocm\dists folder.