Inability to use capabilities of dGPU, CLBlast(Old CPU) + other suggestions. #1272
Comments
Comparison of Vulkan and Old CPU with 1K context.
CLBlast with 0 layers doesn't work at all?
Yeah, absolutely, neither with HD Graphics 4000 nor with AMD Accelerated Parallel Processing (Oland). It worked with versions earlier than 1.7X, but only with HD Graphics 4000.
CLBlast (Old CPU) with 0 layers (1.79.1)
Same error with Oland.
Also, true or not, I noticed that 1.79.1 is less creative compared to 1.69.1. Whether it is because of DRY or XTC, I don't know, but the outputs are always different compared to 1.69.1, even if I disable DRY and XTC.
The 2 versions shouldn't have any difference in creativity. 1.80 was just released, you can try that.
After comprehensive testing of 1.80: the same errors, but a noticeable performance boost (especially with longer contexts).
Also, to clear up any confusion about creativity, I'll provide all the fine-tuned custom settings for the UI and the saved Character Card, so you'll be able to reproduce all the errors related to creativity.
All the errors happen when you try to change DRY, even if Mult./Base/A.Len are 0, and it cannot be disabled either from the UI or from command-line arguments; the same goes for others like XTC, Top-K, etc.
Also, GPU utilization is the same 0% with 1.80; only about ~40MB of GPU memory is used, and 100% of the CPU:
Also, Vulkan with more than 0 layers crashes:
It will only use noavx2.dll if you selected the "old cpu" option. If you have avx2 support you should not use that!
Then what other option do I have? CPU (Old CPU) works, but CLBlast (Old CPU) doesn't: it uses koboldcpp_clblast.dll even with the --noavx2 --nommap --usecpu flags, so the koboldcpp_clblast_noavx2.dll library is completely unusable.
If I delete koboldcpp_clblast.dll and rename koboldcpp_clblast_noavx2.dll to koboldcpp_clblast.dll, it works surprisingly well (only 256 context used for testing):
I've tested CLBlast with the forced library (koboldcpp_clblast_noavx2.dll), and it's a bit faster than Vulkan, which confirms the 0% GPU utilization and shows how much faster direct use of OpenCL is when the Intel(R) Core(TM) i3-3120M CPU @ 2.50GHz is selected:
1.80.1: same errors, and CLBlast still works, but only with the library replacement:
@LostRuins tell me which tools would be useful for debugging these errors so you can see the exact problem, because the 'Debug Mode' option in KoboldAI's own UI does not provide enough detail.
@LostRuins So you've completely ignored all my messages, which already answered exactly what I'm about to type again. Even when I launched with --showgui --noavx2, it still uses koboldcpp_clblast.dll instead of koboldcpp_clblast_noavx2.dll.
If the KoboldCPP GUI uses noavx2=false even with the flags set, then the issue is in the GUI itself. It still works if I replace the library.
No, I am not ignoring your messages. I'm saying the behavior when you run with The txt file you are sending me does not seem to match the command lines you have sent. Somehow, you seem to be running a benchmark? Are you loading another config file by mistake? That will override the flags you set.
@LostRuins No. Even with noavx2=false, Vulkan (Old CPU) will use koboldcpp_vulkan_noavx2.dll, but CLBlast (Old CPU) will use koboldcpp_clblast.dll, even if I use --noavx2. It will ONLY use koboldcpp_clblast_noavx2.dll if I run directly from the terminal without using any UI.
@LostRuins You can directly reproduce this error if you run with these flags: --showgui --noavx2
@LostRuins WITHOUT config.
Still koboldcpp_clblast.dll, ALWAYS koboldcpp_clblast.dll with the UI.
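For readers following along, the behaviour being reported can be pictured with a minimal sketch. It assumes, purely for illustration, that a launcher derives the library name from a backend name plus a single `noavx2` boolean; `pick_library` and its parameters are hypothetical names and not KoboldCpp's actual code. The CLBlast DLL names are the ones quoted in this thread.

```python
def pick_library(backend: str, noavx2: bool) -> str:
    """Return the DLL name for a backend.

    Hypothetical helper for illustration only; not taken from KoboldCpp.
    """
    if backend not in ("clblast", "vulkan"):
        raise ValueError(f"unknown backend: {backend}")
    # The "Old CPU" variants in this thread carry a _noavx2 suffix.
    suffix = "_noavx2" if noavx2 else ""
    return f"koboldcpp_{backend}{suffix}.dll"


# What the reporter expects when --noavx2 is passed:
print(pick_library("clblast", noavx2=True))
# What the reporter observes from the GUI, as if the flag had been dropped:
print(pick_library("clblast", noavx2=False))
```

Under this assumption, the symptom in the thread corresponds to the GUI path calling the selection with `noavx2=False` regardless of the command-line flag, while the terminal path passes the flag through.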
@LostRuins Also, can you add a "Break" button to forcefully stop a prompt from the browser frontend?
Okay I think I see the bug. I will do a new build.
@LostRuins Awesome. What do you think about adding toggleable functions like DRY, XTC, etc. to the UI and/or flags? With some models I don't use things like DRY, XTC, Top-K, Top-P, or Temperature, and disabling some of them might increase performance, especially on low-end machines. Also, what can I do to provide you with enough info to fix the lack of GPU utilization on my Vulkan device?
I still can't use my Vulkan device, even with 1.80.1: it uses only about ~20MB of GPU memory and still runs on the CPU only, as I mentioned earlier.
@Luro223 fix is up, please try latest version 1.80.3
Meanwhile, what error does Vulkan give you when you try to use it with offloaded layers?
@LostRuins Thanks, CLBlast works from the terminal, as well as with custom settings:
Edit: Sorry, the errors are the same as those attached from 1.80.1:
@LostRuins More layers, same errors.
@LostRuins Any news? Or maybe additional tools I could use to get more debug info out of these errors?
You need to disable quantized KV cache. It's not supported with Vulkan.
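A constraint like this one can be checked before launching. The sketch below is illustrative only: `check_settings`, the `backend` string, and the convention that `quant_kv == 0` means "quantized KV cache off" are all assumptions made for the example, not KoboldCpp's real option handling.

```python
def check_settings(backend: str, quant_kv: int) -> list[str]:
    """Return human-readable problems found in a launch configuration.

    Hypothetical pre-flight check; quant_kv == 0 is assumed to mean
    that the quantized KV cache is disabled.
    """
    problems = []
    if backend == "vulkan" and quant_kv != 0:
        # Constraint stated in this thread for the versions discussed.
        problems.append("quantized KV cache is not supported with Vulkan; disable it")
    return problems


print(check_settings("vulkan", quant_kv=2))
print(check_settings("vulkan", quant_kv=0))
```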
@LostRuins After some tests I noticed that the GPU kind of works with KV off, but max GPU utilization was 64%, and with KV2 it's significantly faster than Vulkan, even with 5 layers (it crashes with more layers).
Why are you using BlasBatchSize = -1? That basically negates the prompt processing speedup of the GPU.
Yeah, it worked: with a test context size of 256 I got <60 sec. instead of 100. But after experimenting with BLAS it crashed midway at BLAS 512, and now I can't run with any BLAS settings; only no BLAS with 0 layers works (will show later):
Another error, similar to the first one, but it crashed even with BLAS 256. As with the previous one, the Vulkan device becomes completely unusable, even after a full dGPU driver reload; only a full reboot helps:
Also tested with 1 layer and no BLAS, and with 0 layers and BLAS 32; only 0 layers with no BLAS works after the error (midway crash):
@LostRuins After long testing I've learned that I only have 1GB of VRAM, so such powerful capabilities will be limited by VRAM... Anyway, I've managed to launch it, and here are the results:
You can select a blasbatchsize from the supported list; using a custom value apart from the above is not supported at this time. You can try either 128 or 256.
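As a rough illustration of this advice, a custom batch size could be snapped down to the nearest supported value before launching. The `SUPPORTED_BLAS_BATCH_SIZES` tuple below is an assumption built only from values mentioned in this thread (32, 128, 256, 512); the real list depends on the KoboldCpp version, and `nearest_supported_batch_size` is a hypothetical helper.

```python
# Assumed list for illustration; check the actual supported values
# in the KoboldCpp version you are running.
SUPPORTED_BLAS_BATCH_SIZES = (32, 128, 256, 512)


def nearest_supported_batch_size(requested: int) -> int:
    """Round a requested batch size down to the closest supported value."""
    candidates = [s for s in SUPPORTED_BLAS_BATCH_SIZES if s <= requested]
    if not candidates:
        raise ValueError(f"{requested} is below the smallest supported size")
    return max(candidates)


print(nearest_supported_batch_size(200))
print(nearest_supported_batch_size(768))
```

Rounding down rather than up is a deliberate choice here: on a card with 1GB of VRAM, a smaller batch is the safer fallback.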
@LostRuins OK, I'll wait for custom values then, so I'll be able to use more than 512MB or 768MB.
@LostRuins After upgrading to 1.82, CLBlast (Older CPU) is way too slow, even 2X slower compared to Failsafe: CLBlast NoAVX2 (Old CPU)-I3-3120M-1.81.1.txt
@LostRuins Another issue with 1.82.2:
@LostRuins Also, with MMAP unchecked it shows:
This is now fixed in 1.82.3
Hello, dear developer of KoboldAI CPP.
I've been using 1.79.1 since its release. I have an i3-3120M with HD Graphics 4000 and an HD 8700M. Since 1.79.1 I have finally managed to use Vulkan with the HD 8700M, and it is a bit faster now, but I still can't use the dGPU's capabilities: KoboldAI CPP still only uses the CPU and only a small amount of the GPU's memory. It also crashes if I use more than 0 GPU layers on Vulkan.
CLBlast worked earlier with HD Graphics 4000, but with 1.7X it stopped working; AMD Accelerated Parallel Processing never actually worked with CLBlast.
Using --sdvaeauto slightly increases performance, I'll show the results later.
Edit: --sdvaeauto slightly increases performance, but only in rare cases; I need to test more.
Also, is it possible for you to add toggleable DRY, XTC, Temperature, Top-P, and Top-K, either from the command-line interface or the GUI? I don't use any of these with some models because I think some of them might affect performance even when not used (0 value), especially XTC.