Releases: YellowRoseCx/koboldcpp-rocm
KoboldCPP-v1.52.2.yr0-ROCm
https://github.com/LostRuins/koboldcpp/releases/tag/v1.52.2
- NEW: Added a new bare-bones KoboldCpp NoScript WebUI, which does not require Javascript to work. It should be W3C HTML compliant and should run on every browser from the last 20 years, even text-based ones like Lynx (e.g. in a terminal over SSH). It is accessible by default at /noscript, e.g. http://localhost:5001/noscript . This can be helpful when running KoboldCpp from systems that do not support a modern browser with Javascript.
- Partial per-layer KV offloading is now merged for CUDA. Important: this means that the number of layers you can offload to GPU might be reduced, as each layer now takes up more space. To avoid per-layer KV offloading, use the --usecublas lowvram option (equivalent to -nkvo in llama.cpp; shown in the sketch after this list). Fully offloaded models should behave the same as before.
- The /api/extra/tokencount endpoint now also returns an array of token IDs from the tokenizer in the response body (see the sketch after this list).
- Merged support for QWEN and Mixtral from upstream. Note: Mixtral seems to perform large-batch prompt processing extremely slowly; this is probably an implementation issue. For now, you might have better luck using --noblas or setting --blasbatchsize -1 when running Mixtral.
- Selecting a .kcpps in the GUI when choosing a model will load the model specified inside that config file instead.
- Added the Mamba Multitool script (from @henk717). This is a shell script that can be used in Linux to setup an environment with all dependencies required for building and running KoboldCpp on Linux.
- Improved KCPP Embedded Horde Worker fault tolerance: it should now gracefully back off for increasing durations whenever it encounters errors polling AI Horde, and will automatically recover from up to 24 hours of Horde downtime.
- Added a new field to the /api/extra/perf endpoint showing the number of Horde Worker errors; this can be used to monitor whether your embedded Horde worker has gone down (also shown in the sketch below).
- Pulled other fixes and improvements from upstream, updated Kobold Lite, added asynchronous file autosaves (thanks @aleksusklim), various other improvements.
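As a concrete shell sketch of the new options and endpoints above: the model filenames and prompt are placeholders, and the exact request shapes for the two API endpoints (POST with a "prompt" field for tokencount, plain GET for perf) are assumptions based on this changelog, not confirmed API documentation.

# launch with the KV cache kept out of VRAM, per the offloading note above
./koboldcpp.exe --usecublas lowvram --gpulayers 45 model.Q8_0.gguf
# Mixtral workaround: disable large-batch prompt processing
./koboldcpp.exe --blasbatchsize -1 mixtral.Q4_K_M.gguf
# browse the NoScript WebUI from a text-only terminal
lynx http://localhost:5001/noscript
# count tokens; the response should include the count plus the new token ID array
curl -s -X POST http://localhost:5001/api/extra/tokencount -H "Content-Type: application/json" -d '{"prompt":"Hello world"}'
# performance stats, now including the Horde Worker error counter
curl -s http://localhost:5001/api/extra/perf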
Hotfix 1.52.1: Fixed 'not enough memory' loading errors for large (20B+) models. See #563
NEW: Added Linux PyInstaller binaries
Hotfix 1.52.2: Merged fixes for Mixtral prompt processing.

To use, download and run koboldcpp.exe, which is a one-file PyInstaller build.
If you don't need CUDA, you can use koboldcpp_nocuda.exe which is much smaller.
If you're using AMD, you can try koboldcpp_rocm at YellowRoseCx's fork here.

Run it from the command line with the desired launch parameters (see --help), or manually select the model in the GUI. Once the model is loaded, you can connect at the address below (or use the full KoboldAI client):
http://localhost:5001/
For more information, be sure to run the program from command line with the --help flag.
Windows KoboldCPP-ROCm v1.43.exe
Windows Compiled KoboldCPP with ROCm support!
I want to thank @LostRuins for making KoboldCPP and for general guidance, @henk717 for all his dedication to KoboldAI that brought us here in the first place, and @SlyEcho, who originally started the ROCm port of llama.cpp.
You need ROCm to build it, but not to run it: https://rocm.docs.amd.com/en/latest/deploy/windows/quick_start.html
Compiled for the GPUs that have Tensile libraries / are marked as supported: gfx906, gfx1030, gfx1100, gfx1101, gfx1102
To run, open the exe, or start it via the command line.
Example:
./koboldcpp_rocm.exe --usecublas normal mmq --threads 1 --stream --contextsize 4096 --usemirostat 2 6 0.1 --gpulayers 45 C:\Users\YellowRose\llama-2-7b-chat.Q8_0.gguf
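(In this example, --usecublas normal mmq selects the hipBLAS backend with quantized matrix-multiply kernels, --contextsize 4096 sets the maximum context length, --usemirostat 2 6 0.1 enables Mirostat 2.0 sampling with tau=6 and eta=0.1, --gpulayers 45 offloads 45 layers to the GPU, and --stream enables streamed output; substitute the path to your own .gguf model.)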
This site may be useful; it has some patches for Windows ROCm that I used to help with compilation, though I'm not sure whether they're necessary: https://streamhpc.com/blog/2023-08-01/how-to-get-full-cmake-support-for-amd-hip-sdk-on-windows-including-patches/
Build commands used (ROCm required):
cd koboldcpp-rocm
mkdir build && cd build
cmake .. -G "Ninja" -DCMAKE_BUILD_TYPE=Release -DLLAMA_HIPBLAS=ON -DCMAKE_C_COMPILER="C:/Program Files/AMD/ROCm/5.5/bin/clang.exe" -DCMAKE_CXX_COMPILER="C:/Program Files/AMD/ROCm/5.5/bin/clang++.exe" -DAMDGPU_TARGETS="gfx906;gfx1030;gfx1100;gfx1101;gfx1102"
cmake --build . -j 6
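(For reference: -DLLAMA_HIPBLAS=ON enables the hipBLAS/ROCm backend, -DAMDGPU_TARGETS lists the GPU architectures to generate kernels for, and the clang/clang++ paths point at the compilers bundled with ROCm 5.5; adjust those paths if your ROCm is a different version or installed elsewhere.)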
That puts koboldcpp_cublas.dll inside .\koboldcpp-rocm\build\bin.
Copy koboldcpp_cublas.dll to the main koboldcpp-rocm folder.
(You can run koboldcpp.py right away like this.)
To make it into an exe, we use make_pyinst_rocm_hybrid_henk_yellow.bat.
That file is set up to add CLBlast and OpenBLAS too; you can either remove those lines so it's just this code:
cd /d "%~dp0"
copy "C:\Program Files\AMD\ROCm\5.5\bin\hipblas.dll" .\ /Y
copy "C:\Program Files\AMD\ROCm\5.5\bin\rocblas.dll" .\ /Y
xcopy /E /I "C:\Program Files\AMD\ROCm\5.5\bin\rocblas" .\rocblas\
PyInstaller --noconfirm --onefile --collect-all customtkinter --clean --console --icon ".\niko.ico" --add-data "./klite.embd;." --add-data "./koboldcpp_cublas.dll;." --add-data "./hipblas.dll;." --add-data "./rocblas.dll;." --add-data "./rwkv_vocab.embd;." --add-data "./rocblas;." --add-data "C:/Windows/System32/msvcp140.dll;." --add-data "C:/Windows/System32/vcruntime140_1.dll;." "./koboldcpp.py" -n "koboldcppRocm.exe"
Or you can download w64devkit, cd into the folder, and run make LLAMA_OPENBLAS=1 LLAMA_CLBLAST=1 -j4 to build the rest of the backend files.
Once they're all built, you should be able to run make_pyinst_rocm_hybrid_henk_yellow.bat as-is, and it will bundle the files together into koboldcppRocm.exe in the \koboldcpp-rocm\dists folder.
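As a quick sanity check (assuming the default paths above), the bundled exe should print its launch options when run from the repo folder:

.\dists\koboldcppRocm.exe --help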
KoboldCPP-v1.52.1.yr0-ROCm
Merge remote-tracking branch 'upstream/concedo'
KoboldCPP-v1.52.yr0-ROCm
Various new features including new model Mixtral support
Mixtral-Kcpp-v1.52.RC1.yr1-ROCm FanService Ed.
Unofficial release candidate build containing experimental features and Mixtral Model support
KoboldCPP-v1.51.1.yr1-ROCm
Now includes the full build featuring hipBLAS (ROCm), CLBlast, OpenBLAS, No BLAS, and two backup backends
KoboldCPP-v1.50.1.yr1-ROCm
Merge remote-tracking branch 'upstream/concedo'
KoboldCPP-v1.50.yr0-ROCm
Merge remote-tracking branch 'upstream/concedo'
KoboldCPP-1.48.1.yr2-ROCm
Merge remote-tracking branch 'upstream/concedo'
KoboldCPP-v1.47.2.yr0-ROCm
Merge remote-tracking branch 'upstream/concedo'