26 Feb 08:15

a6e7e7b

KoboldCPP-v1.59.1.yr1-ROCm

Upstream Changelog:

Added --nocertify mode which allows you to disable SSL certificate checking on your embedded Horde worker. This can help bypass some SSL certificate errors.

Fixed pre-gguf models loading with incorrect thread counts. This issue affected the past 2 versions.

Added build target for Old CPU (NoAVX2) Vulkan support.

Fixed Cloudflare remotetunnel URLs not displaying on RunPod.

Reverted CLBlast back to 1.6.0, pending CNugteren/CLBlast#533 and other correctness fixes.

Smartcontext toggle is now hidden when contextshift toggle is on.

Various improvements and bugfixes merged from upstream, including Google Gemma support.

Bugfixes and updates for Kobold Lite

Changed Makefile build flags, fixed tooltips, and merged IQ3_S support.

For a full Linux build, make sure you have the OpenBLAS and CLBlast packages installed:
For Arch Linux: install cblas, openblas, and clblast.
For Debian: install libclblast-dev and libopenblas-dev.
Then run make LLAMA_HIPBLAS=1 LLAMA_OPENBLAS=1 LLAMA_CLBLAST=1 -j4

If you're using NVIDIA, you can try koboldcpp.exe at LostRuins' upstream repo.
If you don't need CUDA, you can use koboldcpp_nocuda.exe, which is much smaller, also at LostRuins' repo.
To use on Linux, clone the repo and build with make LLAMA_HIPBLAS=1 -j4

Run it from the command line with the desired launch parameters (see --help), or manually select the model in the GUI.
Once the model has loaded, you can connect at this address (or use the full KoboldAI client):
http://localhost:5001
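
As a rough sketch of connecting from a script instead of a browser, the snippet below builds a generation request against the address above. The `/api/v1/generate` path, the `prompt` and `max_length` fields, and the helper name `build_generate_request` are assumptions that these notes do not state; check them against the API documentation served by your build.

```python
import json
import urllib.request

BASE_URL = "http://localhost:5001"  # address given in the release notes


def build_generate_request(prompt, max_length=80, base_url=BASE_URL):
    """Build (but do not send) a POST request for text generation.

    Assumption: the server exposes a KoboldAI-style /api/v1/generate
    endpoint that accepts a JSON body with "prompt" and "max_length".
    """
    body = json.dumps({"prompt": prompt, "max_length": max_length}).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + "/api/v1/generate",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

With a model loaded, the request can be sent with `urllib.request.urlopen(build_generate_request("Once upon a time"))` and the JSON response read from the returned object.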

For more information, be sure to run the program from command line with the --help flag.


25 Feb 21:27

github-actions

v1.59.yr1-ROCm

5acb4e2

KoboldCPP-v1.59.yr1-ROCm

update build file to add vulkan


18 Feb 05:53

github-actions

v1.58.yr0-ROCm

b9860f7

KoboldCPP-v1.58.yr0-ROCm

KoboldCpp-1.58.yr0-ROCm

Upstream Changelog:

Added a toggle for row split mode with CUDA multigpu. The split mode now defaults to layer split. On the command line, add rowsplit to --usecublas to enable row split mode; in the GUI launcher, it is a checkbox toggle.

Multiple bugfixes: fixed benchmark command, fixed SSL streaming issues, fixed some SSE formatting with OAI endpoints.

Made context shifting more forgiving when determining eligibility.

Upgraded CLBlast to the latest version, which should give a modest prompt processing speedup when using CL.

Various improvements and bugfixes merged from upstream.

Updated Kobold Lite with many improvements and new features:

New: Integrated 'AI Vision' for images. This uses AI Horde or a local A1111 endpoint to perform image interrogation, allowing the AI to recognize and interpret uploaded or generated images. It should provide an option for multimodality similar to llava, although not as precise. Click on any image to enable it within Lite. This functionality is not provided by KCPP itself.

New: Importing characters from Pygmalion.Chat is now supported in Lite; select it from scenarios.

Added option to run Lite in background. It plays a dynamically generated silent audio sound. This should prevent the browser tab from hibernating.

Fixed printable view, persisted streaming text on error, and fixed instruct timestamps.

Added "Auto" option for idle responses.

Allow importing images into story from local disk

Multiple minor formatting and bug fixes.

For more information, be sure to run the program from command line with the --help flag.


11 Feb 03:57

github-actions

v1.57.1.yr1-ROCm

ae6ece1

KoboldCPP-v1.57.1.yr1-ROCm

KoboldCpp-1.57.1.yr1-ROCm

Windows build does not contain the Vulkan backend yet.

Experimental ROCm Support for Windows was added for the following GPUs thanks to @harish0201 and @jasyuiop:

Desktop GPUs	Laptop GPUs
AMD Radeon PRO W6600	AMD Radeon PRO W6600M
AMD Radeon PRO W6600X	AMD Radeon PRO W6600X
AMD Radeon RX 6600	AMD Radeon RX 6600S
AMD Radeon RX 6600 XT	AMD Radeon RX 6700S
AMD Radeon RX 6650 XT	AMD Radeon RX 6800S
AMD Radeon RX 6700	AMD Radeon RX 6650M
AMD Radeon RX 6700 XT	AMD Radeon RX 6650M XT
AMD Radeon RX 6750 XT	AMD Radeon RX 6700M
AMD Radeon RX 6750 GRE 10 GB	AMD Radeon RX 6800M
AMD Radeon RX 6750 GRE 12 GB	AMD Radeon RX 6850M XT

Upstream Changelog:

Added a benchmarking feature with --benchmark, which automatically runs a benchmark with your provided settings, outputting run parameters, timing and speed information as well as testing for coherence, and exiting on completion. You can provide a filename e.g. --benchmark result.csv and it will write CSV formatted data appended to that file.

Added temperature Quad-Sampling (set via the API with the smoothing_factor parameter), a PR from @AAbushady (credits: @kalomaze).

Improved timing displays. The seed used is now displayed, and llama.cpp styled timings are shown when run in --debugmode. These timings will appear faster because they do not include overheads and measure only specific eval functions.

Improved abort generation behavior (allows a second user to abort while in the queue).

Vulkan enhancements from @0cc4m merged: APU memory handling and multigpu. To use multigpu, you can now specify additional IDs, for example --usevulkan 0 2 3, which will use the GPUs with IDs 0, 2, and 3. Allocation is determined by --tensor_split. Multigpu for Vulkan is currently configurable via the command line only; the GUI launcher does not allow selecting multiple devices for Vulkan.
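
To illustrate how the multigpu flags fit together, here is a small sketch that assembles a launch command. The `python koboldcpp.py` entry point, the `--model` flag, the helper name `vulkan_launch_args`, and the idea of passing one --tensor_split ratio per selected GPU are assumptions made for illustration; confirm the exact syntax with --help.

```python
def vulkan_launch_args(model_path, gpu_ids, tensor_split=None):
    """Return an argv list for a multi-GPU Vulkan launch.

    Assumptions: the program is started as `python koboldcpp.py`, the model
    is passed with --model, and --tensor_split takes one ratio per GPU id.
    """
    if not gpu_ids:
        raise ValueError("at least one GPU id is required")
    args = ["python", "koboldcpp.py", "--model", str(model_path)]
    # --usevulkan accepts several device IDs, e.g. "--usevulkan 0 2 3"
    args += ["--usevulkan"] + [str(i) for i in gpu_ids]
    if tensor_split is not None:
        if len(tensor_split) != len(gpu_ids):
            raise ValueError("need one tensor_split ratio per GPU id")
        args += ["--tensor_split"] + [str(r) for r in tensor_split]
    return args
```

The returned list can be handed to `subprocess.run` as is; the model file name is whatever you would normally pass on the command line.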

Various improvements and bugfixes merged from upstream.

Updated Kobold Lite with many improvements and new features:

NEW: The Aesthetic UI is now available for Story and Adventure modes as well!

Added "AI Impersonate" feature for Instruct mode.

Smoothing factor added, can be configured in dynamic temperature panel.

Added a toggle to enable printable view (unlock vertical scrolling).

Added a toggle to inject timestamps, allowing the AI to be aware of time passing.

Persist API info for A1111 and XTTS, allows specifying custom negative prompts for image gen, allows specifying custom horde keys in KCPP mode.

Fixes for XTTS to handle devices with over 100 voices, and also adds an option to narrate dialogue only.

Toggle to request A1111 backend to save generated images to disk.

Fix for chub.ai card fetching.

Hotfix 1.57.1: Fixed some crashes and fixed multigpu for Vulkan.

For more information, be sure to run the program from command line with the --help flag.

Contributors

harish0201, 0cc4m, and 3 other contributors


31 Jan 05:55

github-actions

v1.56.yr1-ROCm

d0d4c80

KoboldCPP-v1.56.yr1-ROCm | Test Build

Pre-release

Test build to try adding AMD Radeon™ RX 6700XT, 6750XT, 6700M, and 6800M support for Windows


28 Jan 05:14

github-actions

v1.56.yr0-ROCm

e0a3aa3

KoboldCPP-v1.56.yr0-ROCm

Windows build does not contain the Vulkan backend yet.

NEW: Added early support for new Vulkan GPU backend by @0cc4m. You can try it out with the command --usevulkan (gpu id) or via the GUI launcher. Now included with the Windows and Linux prebuilt binaries.

Updated and merged the new GGML backend rework from upstream. This update includes many extensive fixes, improvements and changes across over a hundred commits. Support for earlier non-gguf models has been preserved via a fossilized earlier version of the library. Please open an issue if you encounter problems. The Wiki and Readme have been updated too.

Added support for setting dynatemp_exponent, which previously defaulted to 1.0. Supported over the API and in Lite.

Fixed issues with Linux CUDA on Pascal, added more flags to handle conda and colab builds correctly.

Added support for Old CPU fallbacks (NoAVX2 and Failsafe modes) in build targets in the Linux prebuilt binary (and koboldcpp.sh)

Added missing 48k context option, fixed clearing file selection, better abort handling support, fixed aarch64 termux builds, various other fixes.

Updated Kobold Lite with many improvements and new features:

NEW: Added XTTS API Server support (Local AI powered text-to-speech).

Added option to let AI impersonate you for a turn in a chat.

HD image generation options.

Added popup-on-complete browser notification options.

Improved DynaTemp wizard, added options to set exponent

Bugfixes, padding adjustments, A1111 parameter fixes, image color fixes for invert color mode.

Contributors

0cc4m


11 Jan 00:36

github-actions

v1.55.yr0-ROCm

cdb2b73

KoboldCPP-v1.55.yr0-ROCm

Added Dynamic Temperature (DynaTemp), which is specified by a Temperature Value and a Temperature Range (Credits: @kalomaze). When used, the actual temperature is adjusted automatically within Temperature ± DynaTempRange. For example, setting temperature=0.4 and dynatemp_range=0.1 will result in a minimum temp of 0.3 and a maximum of 0.5. For ease of use, a UI to select the min and max temperature for DynaTemp directly is also provided in Lite. Either input works and automatically updates the other.
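
The arithmetic behind the two equivalent inputs can be sketched as follows. The function names are hypothetical and this is only the conversion between the two forms, not KoboldCpp's or Lite's actual code.

```python
def dynatemp_bounds(temperature, dynatemp_range):
    """Convert (temperature, range) into (min_temp, max_temp)."""
    return temperature - dynatemp_range, temperature + dynatemp_range


def dynatemp_from_bounds(min_temp, max_temp):
    """Convert (min_temp, max_temp) back into (temperature, range)."""
    temperature = (min_temp + max_temp) / 2
    dynatemp_range = (max_temp - min_temp) / 2
    return temperature, dynatemp_range
```

For the example in the notes, `dynatemp_bounds(0.4, 0.1)` gives roughly 0.3 and 0.5 (up to floating-point rounding), and `dynatemp_from_bounds(0.3, 0.5)` recovers roughly 0.4 and 0.1.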

Try to reuse cloudflared file when running remote tunnel, but also handle if cloudflared fails to download correctly.

Added a field to show the most recently used seed in the perf endpoint

Switched cuda pool malloc back to the old implementation

Updated Lite, added support for DynaTemp

Merged new improvements and fixes from upstream llama.cpp

Various minor fixes.

To use on Windows, download and run koboldcpp_rocm.exe, which is a one-file pyinstaller, OR download koboldcpp_rocm_files.zip and run python koboldcpp.py (additional Python pip modules might need to be installed, like customtkinter and tk or python-tk).
To use on Linux, clone the repo and build with make LLAMA_HIPBLAS=1 -j4 (-j4 can be adjusted to your number of CPU threads for faster build times)
For a full Linux build, make sure you have the OpenBLAS and CLBlast packages installed:
For Arch Linux: install cblas, openblas, and clblast.
For Debian: install libclblast-dev and libopenblas-dev.
Then run make LLAMA_HIPBLAS=1 LLAMA_OPENBLAS=1 LLAMA_CLBLAST=1 -j4

Contributors

kalomaze


02 Jan 03:59

github-actions

v1.54.yr0-ROCm

2e41f66

KoboldCPP-v1.54.yr0-ROCm

koboldcpp-1.54-ROCm

Merge with @LostRuins latest upstream update

welcome to 2024 edition

Added logit_bias support for both the OpenAI and Kobold APIs. It accepts a dictionary of key-value pairs giving the token IDs (int) and the logit bias (float) to apply to each token. The object format is the same as, and compatible with, the official OpenAI implementation, though token IDs are model specific. (thanks @DebuggingLife46)
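
A minimal sketch of what such a request body might look like. The token IDs below are made up (real IDs depend on the model's tokenizer), and the `prompt` and `max_tokens` fields and the helper name `completion_payload` follow the OpenAI completion format as an assumption rather than anything stated in these notes.

```python
import json


def completion_payload(prompt, logit_bias, max_tokens=50):
    """Build an OpenAI-style completion body with a logit_bias map.

    JSON object keys must be strings, so integer token IDs are converted.
    """
    bias = {str(int(tok)): float(val) for tok, val in logit_bias.items()}
    return {"prompt": prompt, "max_tokens": max_tokens, "logit_bias": bias}


# Hypothetical token IDs: discourage token 1234, mildly favour token 5678.
example_body = json.dumps(completion_payload("Hello", {1234: -100, 5678: 2.5}))
```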

Updated Lite, added support for custom background images (thanks @Ar57m), and added customizable settings for stepcount and cfgscale for Horde/A1111 image generation.

Added mouseover tooltips for all labels in the GUI launcher.

Cleaned up and simplified the UI of the quick launch tab in the GUI launcher, some advanced options moved to other tabs.

Bug fixes for garbled output in Termux with q5k Phi

Fixed paged memory fallback when pinned memory alloc fails while not using mmap.

Attempt to fix on-exit segfault on some Linux systems.

Updated KAI United class.py, added new parameters.

Makefile fix for Linux CI build using conda (thanks @henk717)

Merged new improvements and fixes from upstream llama.cpp (includes VMM pool support)

Included prebuilt binary for no-cuda Linux as well.

Various minor fixes.

To use on Windows, download and run the koboldcpp_rocm.exe, which is a one-file pyinstaller OR download koboldcpp_rocm_files.zip and run python koboldcpp.py
If you're using NVIDIA, you can try koboldcpp.exe at LostRuins' upstream repo.
If you don't need CUDA, you can use koboldcpp_nocuda.exe, which is much smaller, also at LostRuins' repo.
To use on Linux, clone the repo and build with make LLAMA_HIPBLAS=1 -j4

Contributors

henk717, LostRuins, and Ar57m


23 Dec 09:35

github-actions

v1.53.yr0-ROCm

b85d59e

KoboldCPP-v1.53.yr0-ROCm

koboldcpp-1.53-ROCm

Merge with @LostRuins latest upstream update

Added support for SSL. You can now import your own SSL certificate and serve KoboldCpp over HTTPS with --ssl [cert.pem] [key.pem] or via the GUI. The .pem files must be unencrypted. For your own self-signed certificate, you can generate them with OpenSSL, e.g. openssl req -x509 -newkey rsa:4096 -keyout key.pem -out cert.pem -sha256 -days 365 -config openssl.cnf -nodes (the location of openssl.cnf might differ between Linux distros; try searching for it with locate openssl.cnf).
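
Since the key must be unencrypted (the -nodes option in the command above is what leaves it that way), a quick pre-flight check can save a failed launch. This is a hedged sketch with a hypothetical helper name; it only looks for the standard PEM markers of a passphrase-protected key and is not part of KoboldCpp.

```python
def pem_key_is_encrypted(pem_text):
    """Return True if a PEM private key appears to be passphrase-protected."""
    return (
        # PKCS#8 encrypted keys use a dedicated header line
        "-----BEGIN ENCRYPTED PRIVATE KEY-----" in pem_text
        # traditional OpenSSL keys carry an encryption header inside the block
        or "Proc-Type: 4,ENCRYPTED" in pem_text
    )
```

To use it, read the contents of key.pem as text and pass them in; a True result means the key should be decrypted or regenerated before it is given to --ssl.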

Added support for presence penalty (alternative rep pen) over the KAI API and in Lite. If Presence Penalty is set over the OpenAI API, and rep_pen is not set, then rep_pen will be set to a default of 1.0 instead of 1.1. Both penalties can be used together, although this is probably not a good idea.
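
The defaulting rule described above can be written out as a short sketch. The function name and the use of a plain dictionary for request parameters are illustrative assumptions, not the actual KoboldCpp implementation.

```python
def effective_rep_pen(params):
    """Choose rep_pen for an OpenAI-API request per the rule in the notes."""
    if "rep_pen" in params:
        return params["rep_pen"]  # an explicit value always wins
    if "presence_penalty" in params:
        return 1.0  # presence penalty given, rep_pen omitted
    return 1.1  # usual default
```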

Added fixes for Broken Pipe error, thanks @mahou-shoujo.

Added fixes for aborting ongoing connections while streaming in SillyTavern.

Merged upstream support for Phi models and speedups for Mixtral

The default non-blas batch size for GGUF models is now increased from 8 to 32.

Merged HIPBlas fixes from @YellowRoseCx

Fixed an issue with building convert tools in 1.52

To use, download and run the koboldcpp_rocm.exe, which is a one-file pyinstaller.
If you're using NVIDIA, you can try koboldcpp.exe at LostRuins' upstream repo.
If you don't need CUDA, you can use koboldcpp_nocuda.exe, which is much smaller, also at LostRuins' repo.

Run it from the command line with the desired launch parameters (see --help), or manually select the model in the GUI.
Once the model has loaded, you can connect at this address (or use the full KoboldAI client):
http://localhost:5001/

For more information, be sure to run the program from command line with the --help flag.

Contributors

mahou-shoujo, LostRuins, and YellowRoseCx


19 Dec 08:02

github-actions

v1.52.2.yr1-ROCm

031c60b

KoboldCPP-v1.52.2.yr1-ROCm

Add --checkforupdates argument
If enabled, --checkforupdates fetches the KoboldCpp-ROCm release page (via the GitHub API) once at startup over HTTPS, compares the latest version number with the current version number, and notifies the user if a new version is available.
A GUI button is shown on the Network tab. Disabled by default.
hipBLAS autopicking and hipBLAS .kcpps bug fixes
Fixed a mistake preventing hipBLAS from being autopicked on startup
Fixed a bug where importing a .kcpps file with the backend "Use hipBLAS (ROCm)" did not select "Use hipBLAS (ROCm)".
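
For the --checkforupdates behaviour described in this release, a version comparison along these lines is one way it could work. This is a sketch under assumptions: the tag pattern is inferred from the release names on this page, the `releases/latest` endpoint and `tag_name` field come from the public GitHub REST API, and the function names are hypothetical rather than taken from the KoboldCpp-ROCm source.

```python
import json
import re
import urllib.request

RELEASES_URL = (
    "https://api.github.com/repos/YellowRoseCx/koboldcpp-rocm/releases/latest"
)


def parse_tag(tag):
    """Turn a tag like 'v1.59.1.yr1-ROCm' into ((1, 59, 1), 1)."""
    m = re.match(r"v?(\d+(?:\.\d+)*)\.yr(\d+)", tag)
    if m is None:
        raise ValueError("unrecognized tag: " + tag)
    upstream = tuple(int(part) for part in m.group(1).split("."))
    return upstream, int(m.group(2))


def is_newer(latest_tag, current_tag):
    """True if latest_tag denotes a later release than current_tag."""
    return parse_tag(latest_tag) > parse_tag(current_tag)


def fetch_latest_tag(url=RELEASES_URL, timeout=10):
    """Fetch the newest release tag once over HTTPS (needs network access)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)["tag_name"]
```

A startup check would then be `is_newer(fetch_latest_tag(), current_tag)`, where `current_tag` is the tag of the running build, followed by a notice to the user when it returns True.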


Releases: YellowRoseCx/koboldcpp-rocm
