Still running in CPU #604

Open
mmw1984 opened this issue Aug 6, 2024 · 20 comments
Comments

@mmw1984

mmw1984 commented Aug 6, 2024

Screenshot_2024-08-05-22-01-16-01_507096bf411ffee187df405bf527ca60
Sorry that the screenshot is not in English

@mmw1984
Author

mmw1984 commented Aug 6, 2024

Screenshot_2024-08-05-22-01-10-46_507096bf411ffee187df405bf527ca60
Device: OnePlus Ace 2 Pro
Chipset: Snapdragon 8 Gen 2
System: Android 14
RAM: 16GB

@mmw1984 mmw1984 changed the title from "Still running in GPU" to "Still running in CPU" Aug 6, 2024
@danemadsen
Member

fixed

@ILOVEPIE

No, it's not. The new changes include the x86_64 Linux build of the Vulkan SDK, not the correct ARM version, so maid_llm doesn't compile with Vulkan support.
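One way to confirm an architecture mismatch like this is to read the `e_machine` field from the ELF header of the library the build picked up. The following is a minimal sketch, not part of the maid build; the function names `elf_machine` and `machine_name` are hypothetical, and it only handles little-endian ELF files.

```cpp
#include <cstdint>
#include <fstream>
#include <string>

// e_machine values defined by the ELF specification.
constexpr std::uint16_t kEmX86_64 = 62;    // EM_X86_64
constexpr std::uint16_t kEmAArch64 = 183;  // EM_AARCH64

// Hypothetical helper: returns the e_machine field of an ELF file, or 0 if
// the file cannot be read or is not a little-endian ELF file.
std::uint16_t elf_machine(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    unsigned char hdr[20] = {};
    if (!in.read(reinterpret_cast<char*>(hdr), sizeof hdr)) return 0;
    // ELF magic: 0x7f 'E' 'L' 'F'
    if (hdr[0] != 0x7f || hdr[1] != 'E' || hdr[2] != 'L' || hdr[3] != 'F') return 0;
    // hdr[5] is EI_DATA; 1 means little-endian.
    if (hdr[5] != 1) return 0;
    // e_machine is a 16-bit field at byte offset 18 in both ELF32 and ELF64.
    return static_cast<std::uint16_t>(hdr[18] | (hdr[19] << 8));
}

// Hypothetical helper: human-readable name for the two machines of interest.
const char* machine_name(std::uint16_t machine) {
    switch (machine) {
        case kEmX86_64:  return "x86_64";
        case kEmAArch64: return "aarch64";
        default:         return "other/unknown";
    }
}
```

Calling `machine_name(elf_machine(path))` on the Vulkan library the CI job downloaded would print `x86_64` for a desktop Linux SDK build, where an Android arm64 target needs `aarch64`.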

@danemadsen danemadsen reopened this Aug 24, 2024
@danemadsen
Member

Aw, that's no fun.

@danemadsen
Member

@ILOVEPIE any idea how to install that in GitHub Actions?

@ILOVEPIE

> @ILOVEPIE any idea how to install that in GitHub Actions?

The only thing I can think of off the top of my head is installing the Dart bindings for Vulkan, but I'm not sure if that'll work. I'll do a little more research.

@ILOVEPIE

Apparently the Vulkan headers and libraries come as part of the Android SDK. I need to figure out where Flutter puts that so that you can adjust your CMake options to point to that.

@danemadsen
Member

No, because the GitLab pipeline compiles it with Vulkan fine.

Download this build here:
https://gitlab.com/mobile-artificial-intelligence/maid/-/jobs/7663138914

It's an issue with the GitHub Action that's causing it to not detect the Vulkan headers.

@danemadsen
Member

danemadsen commented Aug 26, 2024

Never mind, it's not working there either.

@ILOVEPIE

OK, I figured it out... probably. The docs on this are terrible, but I have an idea now. I'll test it on my fork, then PR the changes.

@danemadsen
Member

Ok, much appreciated if you can get it working.

@ILOVEPIE

> Ok, much appreciated if you can get it working.

Assuming I don't need to make any other minor adjustments to the build file, and the binary doesn't crash, I should be able to send that PR over shortly. I've already got it detecting the Android Vulkan library. There were some outdated paths in the CMakeLists, though, which caused the build to fail when Vulkan was available. So, fingers crossed that it doesn't crash when I install and test it.

@ILOVEPIE

Ugh... The Vulkan C++ headers aren't in the NDK, only the C ones.

@danemadsen
Member

You got this bro, I believe in you.

@ILOVEPIE

> You got this bro, I believe in you.

I have some good news and some bad news. I got the Vulkan support compiling. I haven't tested it yet, but I got it to find the headers and library in the compile phase. The problem is that it's going to require a higher minimum version of Android than we currently require: the llama.cpp Vulkan implementation (for the version of llama.cpp we're currently using) requires at least Vulkan 1.1 support (maybe higher), which means Android 14 minimum. The only solution I can come up with is to ship both a CPU-based version of the library and a GPU-based version, detect whether the user has Vulkan support, and use the CPU library if they don't. We should also give users an option to toggle GPU acceleration off, because phone SoC manufacturers aren't known for their GPUs and GPU drivers. What I mean is that phones tend to have fairly buggy GPUs, so we should at least have the option to turn the acceleration off.
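The detect-and-fall-back idea could be sketched roughly as follows. This is an illustration under stated assumptions, not maid's or llama.cpp's actual code: the names `Backend`, `make_version`, `choose_backend`, and `query_vulkan_version` are hypothetical, and the check only asks the system Vulkan loader for its instance-level version through the real `vkEnumerateInstanceVersion` entry point. A production check would also need to enumerate physical devices, since a device can support a lower version than the loader reports.

```cpp
#include <cstdint>
#include <dlfcn.h>

enum class Backend { Cpu, Vulkan };

// Vulkan packs versions as (major << 22) | (minor << 12) | patch.
constexpr std::uint32_t make_version(std::uint32_t major, std::uint32_t minor,
                                     std::uint32_t patch = 0) {
    return (major << 22) | (minor << 12) | patch;
}

// Pure decision logic: use the GPU only if the user allows it and the loader
// reports at least the required version. A version of 0 means "no Vulkan".
constexpr Backend choose_backend(std::uint32_t loader_version,
                                 std::uint32_t required_version,
                                 bool user_allows_gpu) {
    return (user_allows_gpu && loader_version >= required_version)
               ? Backend::Vulkan
               : Backend::Cpu;
}

// Asks the Vulkan loader for its instance-level version.
// Returns 0 if the loader library cannot be opened or the query fails.
// Assumption: the loader is named "libvulkan.so", as on Android.
std::uint32_t query_vulkan_version(const char* lib = "libvulkan.so") {
    void* handle = dlopen(lib, RTLD_NOW | RTLD_LOCAL);
    if (!handle) return 0;
    // Simplified signature of vkEnumerateInstanceVersion:
    // VkResult (an int-sized enum) f(uint32_t* pApiVersion).
    using EnumerateVersionFn = std::int32_t (*)(std::uint32_t*);
    auto fn = reinterpret_cast<EnumerateVersionFn>(
        dlsym(handle, "vkEnumerateInstanceVersion"));
    // Loaders that lack this symbol predate Vulkan 1.1, so treat them as 1.0.
    std::uint32_t version = make_version(1, 0);
    if (fn && fn(&version) != 0) version = 0;  // non-zero VkResult: query failed
    dlclose(handle);
    return version;
}
```

At startup the app could call `choose_backend(query_vulkan_version(), make_version(1, 1), gpu_toggle)`, where `gpu_toggle` stands for the proposed user setting, and load the CPU or Vulkan build of the library accordingly. Keeping the decision in a pure function makes the fallback rule easy to test without a GPU.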

@danemadsen
Member

danemadsen commented Aug 27, 2024

> The only solution I can come up with is having both a CPU-based version of the library and a GPU-based version of the library, and we detect if the user has Vulkan support and use the CPU library if they don't.

It's probably best to just ship two releases: one for Vulkan / Android 14 and one for versions below 14. Obviously it would be best to allow the user to switch between GPU / CPU (and eventually NPU), but for now shipping two APKs / bundles is probably a faster solution.

@ILOVEPIE

>> The only solution I can come up with is having both a CPU-based version of the library and a GPU-based version of the library, and we detect if the user has Vulkan support and use the CPU library if they don't.

> It's probably best to just ship two releases: one for Vulkan / Android 14 and one for versions below 14. Obviously it would be best to allow the user to switch between GPU / CPU (and eventually NPU), but for now shipping two APKs / bundles is probably a faster solution.

To be honest, I think it's about the same amount of work to do either option. So I'd rather do the more permanent solution.

@ILOVEPIE

Just about to test the Vulkan build.

@ILOVEPIE

ILOVEPIE commented Sep 2, 2024

I've been trying to figure out why the Vulkan build is crashing, and I'm not exactly sure. I'm getting some sort of weird illegal signal on ARM. I'm not too familiar with the ARM architecture, so I'm not sure if this indicates an illegal instruction, some type of illegal register value, or something else.
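To narrow down which kind of fault this is, the signal number and its `si_code` can be decoded with standard POSIX constants. The sketch below is a generic diagnostic aid, not something from the maid code base; `describe_fault`, `on_fault`, and `install_fault_reporter` are hypothetical names. Android's crash dump in logcat usually reports the same signal and code, so this is mainly useful where that output is not available.

```cpp
#include <csignal>
#include <cstring>
#include <initializer_list>
#include <unistd.h>

// Hypothetical helper: maps a signal number and si_code to a short description.
const char* describe_fault(int signo, int code) {
    switch (signo) {
        case SIGILL:
            switch (code) {
                case ILL_ILLOPC: return "SIGILL: illegal opcode";
                case ILL_ILLOPN: return "SIGILL: illegal operand";
                case ILL_PRVOPC: return "SIGILL: privileged opcode";
                case ILL_COPROC: return "SIGILL: coprocessor error";
                default:         return "SIGILL: other illegal-instruction fault";
            }
        case SIGBUS:
            return code == BUS_ADRALN ? "SIGBUS: misaligned address"
                                      : "SIGBUS: other bus error";
        case SIGSEGV:
            return code == SEGV_MAPERR ? "SIGSEGV: address not mapped"
                                       : "SIGSEGV: other fault (e.g. permissions)";
        default:
            return "unrecognised signal";
    }
}

// Handler: reports the decoded fault using write(), which is async-signal-safe.
void on_fault(int signo, siginfo_t* info, void*) {
    const char* msg = describe_fault(signo, info->si_code);
    ssize_t ignored = write(STDERR_FILENO, msg, std::strlen(msg));
    ignored = write(STDERR_FILENO, "\n", 1);
    (void)ignored;
    // SA_RESETHAND has already restored the default action, so returning
    // re-runs the faulting instruction and the process crashes as usual.
}

// Installs the reporter for the three fault signals decoded above.
void install_fault_reporter() {
    struct sigaction sa {};
    sa.sa_sigaction = on_fault;
    sa.sa_flags = SA_SIGINFO | SA_RESETHAND;
    sigemptyset(&sa.sa_mask);
    for (int signo : {SIGILL, SIGBUS, SIGSEGV}) sigaction(signo, &sa, nullptr);
}
```

A `SIGILL` with `ILL_ILLOPC` would point to an instruction the CPU does not implement (for example, code built for a newer ARM extension), whereas `SIGBUS` with `BUS_ADRALN` would point to a misaligned access.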

@ILOVEPIE

ILOVEPIE commented Sep 3, 2024

I'm going to attempt to make an x86_64 build of the app to see if that will elucidate anything.
