KoboldCPP-ROCm-v1.43.2

@YellowRoseCx YellowRoseCx released this 16 Sep 03:27
  • Added a few updates like MMQ tuning for RDNA1, RDNA2, & RDNA3 GPUs
  • Separated hipBLAS references from CuBLAS
  • Fixed up some pyinstaller files
  • Attempted adding RX 6700XT support.

Note: if you get a runtime error about customtkinter when starting the exe, try using the files directly from koboldcpp-rocm_precompiled.zip, or use those files to build the exe yourself. If you instead get an error saying customtkinter was not found, try running pip install customtkinter in your command prompt. customtkinter is picky, and building an exe with it apparently requires a special build environment when distributing to other people.


You need ROCm to build it, but not to run it: https://rocm.docs.amd.com/en/latest/deploy/windows/quick_start.html

Compiled for the GPUs that have Tensile Libraries marked as supported: gfx906, gfx1030, gfx1100, gfx1101, gfx1102
6700XT users: I've tried including gfx1031 (6700XT) support, but I can't test it. The files marked -with_gfx1031 are for you. To do this, I had to copy all the gfx1030 files and rename the copies to gfx1031. If you want to try this yourself, I've included a zip file, gfx1031_files.zip, containing all the gfx1030 files renamed to gfx1031; just drag its rocblas folder into /koboldcpp-rocm/ after copying koboldcpp_hipblas.dll to the main koboldcpp-rocm folder. If you don't want to build the exe, you can try adding those files to the base ROCm rocblas folder here: C:\Program Files\AMD\ROCm\5.5\bin\rocblas\library. I'm looking into better ways to do this.
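The copy-and-rename step described above can be sketched in Python. This is only an illustration, not the script used for this release: the function name clone_gfx_files is made up here, and it assumes the rocblas library files carry the architecture name (gfx1030) somewhere in their filenames.

```python
import shutil
from pathlib import Path


def clone_gfx_files(library_dir, src_arch="gfx1030", dst_arch="gfx1031"):
    """Copy each file whose name contains src_arch to a sibling named with dst_arch.

    Hypothetical helper: originals are kept, existing files are never
    overwritten, and the list of newly created paths is returned.
    """
    library_dir = Path(library_dir)
    created = []
    # sorted() snapshots the directory listing before any copies are made.
    for src in sorted(library_dir.iterdir()):
        if not src.is_file() or src_arch not in src.name:
            continue
        dst = src.with_name(src.name.replace(src_arch, dst_arch))
        if dst.exists():
            continue  # leave anything that is already there alone
        shutil.copy2(src, dst)
        created.append(dst)
    return created
```

You would call it on a copy of the rocblas library folder first, for example clone_gfx_files(r"C:\Program Files\AMD\ROCm\5.5\bin\rocblas\library"). Writing into Program Files will likely need an elevated prompt, and as noted above, whether the renamed files actually work on a 6700XT is untested.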


To run, open the exe, or start it from the command line.
Example:
./koboldcpp_rocm.exe --usecublas normal mmq --threads 1 --stream --contextsize 4096 --usemirostat 2 6 0.1 --gpulayers 45 C:\Users\YellowRose\llama-2-7b-chat.Q8_0.gguf


Building it yourself:

This site may be useful: it has some patches for Windows ROCm that help with compilation. I used them, but I'm not sure they're necessary. https://streamhpc.com/blog/2023-08-01/how-to-get-full-cmake-support-for-amd-hip-sdk-on-windows-including-patches/

Build command used (ROCm Required):

cd koboldcpp-rocm
mkdir build && cd build

cmake .. -G "Ninja" -DCMAKE_BUILD_TYPE=Release -DLLAMA_HIPBLAS=ON -DCMAKE_C_COMPILER="C:/Program Files/AMD/ROCm/5.5/bin/clang.exe" -DCMAKE_CXX_COMPILER="C:/Program Files/AMD/ROCm/5.5/bin/clang++.exe" -DAMDGPU_TARGETS="gfx803;gfx900;gfx906;gfx908;gfx90a;gfx1010;gfx1030;gfx1031;gfx1032;gfx1100;gfx1101;gfx1102"

cmake --build . -j 6

(-j 6 means use 6 CPU cores; if you have more or fewer, feel free to change it to speed things up)
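If you'd rather not hand-edit the target list or the -j value, the two cmake commands above can be assembled from Python. This is a rough sketch: the helper names configure_command and build_command are invented for illustration, the ROCm path is the 5.5 default used above, and the job count simply falls back to whatever os.cpu_count() reports.

```python
import os

# GPU targets copied from the build command above; trim the list to build less.
DEFAULT_TARGETS = [
    "gfx803", "gfx900", "gfx906", "gfx908", "gfx90a", "gfx1010",
    "gfx1030", "gfx1031", "gfx1032", "gfx1100", "gfx1101", "gfx1102",
]


def configure_command(rocm_bin="C:/Program Files/AMD/ROCm/5.5/bin",
                      targets=DEFAULT_TARGETS, source_dir=".."):
    """Return the cmake configure step as an argument list (hypothetical helper)."""
    return [
        "cmake", source_dir, "-G", "Ninja",
        "-DCMAKE_BUILD_TYPE=Release",
        "-DLLAMA_HIPBLAS=ON",
        f"-DCMAKE_C_COMPILER={rocm_bin}/clang.exe",
        f"-DCMAKE_CXX_COMPILER={rocm_bin}/clang++.exe",
        # CMake expects list values separated by semicolons.
        "-DAMDGPU_TARGETS=" + ";".join(targets),
    ]


def build_command(jobs=None, build_dir="."):
    """Return the cmake build step; jobs defaults to the detected CPU count."""
    if jobs is None:
        jobs = os.cpu_count() or 1
    return ["cmake", "--build", build_dir, "-j", str(jobs)]
```

Run from inside the build folder, each list can be passed to subprocess.run(cmd, check=True). Because the arguments are passed as a list, the spaces in "Program Files" and the semicolons in the target list need no extra quoting.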

That puts koboldcpp_hipblas.dll inside .\koboldcpp-rocm\build\bin
Copy koboldcpp_hipblas.dll to the main koboldcpp-rocm folder.
You can then run koboldcpp.py right away, like this:
python koboldcpp.py --usecublas normal mmq --threads 1 --stream --contextsize 4096 --usemirostat 2 6 0.1 --gpulayers 45 C:\Users\YellowRose\llama-2-7b-chat.Q8_0.gguf
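The copy-then-run steps above could also be scripted. The sketch below is an assumption-laden illustration rather than part of the release: install_hipblas_dll and launch_args are made-up names, the DLL location is the build\bin path mentioned above, and the flags are just the ones from the example command.

```python
import shutil
import sys
from pathlib import Path


def install_hipblas_dll(repo_dir):
    """Copy build/bin/koboldcpp_hipblas.dll into the main repo folder (hypothetical helper)."""
    repo_dir = Path(repo_dir)
    src = repo_dir / "build" / "bin" / "koboldcpp_hipblas.dll"
    if not src.is_file():
        raise FileNotFoundError(f"build output not found: {src}")
    dst = repo_dir / src.name
    shutil.copy2(src, dst)
    return dst


def launch_args(model_path, gpulayers=45, contextsize=4096, threads=1):
    """Build the example koboldcpp.py command line from above as an argument list."""
    return [
        sys.executable, "koboldcpp.py",
        "--usecublas", "normal", "mmq",
        "--threads", str(threads),
        "--stream",
        "--contextsize", str(contextsize),
        "--usemirostat", "2", "6", "0.1",
        "--gpulayers", str(gpulayers),
        str(model_path),
    ]
```

After install_hipblas_dll("koboldcpp-rocm") succeeds, something like subprocess.run(launch_args(model), cwd="koboldcpp-rocm") would start it. The right --gpulayers value depends on your model and VRAM, so treat 45 as the example's number, not a recommendation.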

To make it into an exe, run make_pyinstaller_exe_rocm_only.bat,
which will attempt to build the exe for you and place it in /koboldcpp-rocm/dists/
Once it finishes, kobold_rocm_only.exe is built!


If you'd like to do a full-feature build with the OPENBLAS and CLBLAST backends, you'll need w64devkit. Once it's downloaded, open w64devkit.exe, cd into the koboldcpp-rocm folder, and run make LLAMA_OPENBLAS=1 LLAMA_CLBLAST=1 -j4 to build the rest of the backend files.

Once they're all built, you should be able to just run make_pyinst_rocm_hybrid_henk_yellow.bat, and it'll bundle the files together into koboldcpp_rocm.exe in the \koboldcpp-rocm\dists folder.