-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Add rocm builds and documentation #1012
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please ensure that the pull request (PR) focuses on a single part. I would suggest that you only modify the Docker-related CI configuration files (.yml/.Dockerfile) in this PR to facilitate a smoother merge.
Well I mean documentation seems to be a pretty important part of it and I wouldn't think there are many issues there. Sure I can split it again, but I'm sinking many hours into testing and splitting this, there's a limit somewhere to what I can manage as well. This already consumed most of my Sunday now. |
I fully understand and would like to express my gratitude once again for your contribution :) I mainly refer to certain changes (e.g., Rust 1.73 -> 1.8, manylinux build). These are all valuable, but they require careful testing to ensure that everything still works. The documentation part is fine to retain; I will provide comments once we reach a good point in the remaing part. |
Yeah sure I can just downgrade that again. Thought as docker uses latest stable anyway, might as well update that. |
Well the manylinux image for rocm now seems to run into a disk space limit, could you maybe try and delete some caches or so? In theory it's much smaller (< 6GB) than the cuda (~10GB) image, which is why I'm slightly confused about it... Maybe it's running an parallel on the same VM? |
I'll suggest skip manylinux build atm, and checkin docker image in this PR. |
I think it's just a fluke in the build and the build has to be restarted (as it worked before with a bigger container). Can't do that myself though. Need to update anyway, so we'll see if it wants to run this time. |
# Conflicts: # .github/workflows/release.yml # crates/llama-cpp-bindings/src/llama.rs
Found the error: The size on docker hub is compressed and apparently ROCm is quite big, but compresses super well plus the manylinux and pytorch also added magma and MLOpen to it... Got everything working by installing only the minimum needed on the fly. The biggest problem is that this adds ~4min to the ROCm build and I think the ROCm repo might also be slower than today sometimes... As a solution I have prepared this: https://github.com/cromefire/hipblas-manylinux. Super basic docker build setup, but if you want me to switch to that, I'd recommend we move the repo to this org, so it's easier to maintain (basically to only thing required would be to upgrade to the latest ROCm version from time to time) as I probably won't always remember to upgrade it to the latest ROCm version. |
Also Windows build is possible as well (HIP install for windows can be found here), but I'll probably just open an issue for that, because I don't think I have the bandwidth right now to deal with windows, maybe one day... |
Hey - I reverted most docker related part - given the manylinux build now works and I’m thinking of re-organize the docker image build to be around these manylinux binaries directly - will do it as a followup from my side. The other part LGTM, thanks for your effort! |
Yeah as said though, we probably want to transfer the image repo for ROCm to this org, so you don't have to wait on me for any upgrades and it's all under your control. |
And also please at least restore the Dockerfile themselves for manual building, as they are an easy way for people to build their own optimized images quickly. For ROCm this is important as you may have seen these AMDGPU_TARGETS, and while it includes most GPUs, it doesn't include all, so having a way for people to just build it quickly for their specific GPU would be really important, that you I always built and tested all of this as well. |
Also because everything, including compilers, is old and crusty you'd probably actually want to have an optimized build in docker which can use all the newer tools there, including the latest compiler optimizations and mitigations, after all, you don't have to worry about compatibility. |
It would also enable you to adjust the build, so it builds the ROCm libraries from source (they are open source after all), so you could build everything just for your GPU, which would probably shrink the container from like 10GB+ to more like 1-2GB. |
Since the user need to build and test it manually - I'll prefer just leave it in source code.
Let me think about this a bit more - but I'm still a bit leaning to just copy manylinux binary to containers to make build process simpler |
Yeah, not saying you have to base your official docker builds on it, but it'd be good to at least have the Dockerfile there for people to use to build specific builds or just for development. As said I basically never compiled it on bare metal as I initially had some build issues and in docker it just worked. (And you'd also have to install ROCm on your machine directly) |
That's a fair point - essentially it served as devcontainer. |
Something like that, that's the nice thing with that, you don't need to install anything for it, you also don't even need to mess around with devcontainers, which are still a bit finicky on JetBrains IDEs, you just say |
Fixes #636
Proper display in Telemetry and UI not included, see #902 for that.