Update Dockerfile to use devel image for compatibility #2848
base: main
Conversation
This PR solves my issue described here, thank you!
I just noticed that you changed the implementation between me building the image and my previous message. Should I test again?
Not yet, it is still broken! I just pushed to try building it on my server, but so far it is not working. I will ping you when it is 💯
@KreshLaDoge I tried multiple variants of using only [...] Any ideas about how to get it to work, or how to reduce the size of the final image?
Do you have a way of reproducing the error more quickly than building the entire image from scratch every time? And can you get to the exact compiler error? If you could share that, I might take a look.
As previously suggested, the fix cannot be accepted as-is. It bloats the image way too much (20 GB vs 12 GB). First we need to reproduce the failure locally, then figure out why the hell Triton wants to recompile something (not a kernel obviously, since it tries
This would be a workaround, but it does not solve the underlying bug. If you look at #2838, Triton finds the directory with
Besides the above, it would also be useful to post the contents of the following files on the host system:
(Or, if they don't exist, check in these directories for a JSON file for NVIDIA.)
What does this PR do?
The TGI server fails to start due to missing Python headers during the compilation of Triton indexing kernels. The solution is to change the base image to nvidia/cuda:12.4.1-devel-ubuntu22.04 to match the builder image, ensuring the necessary headers are included.
This change increases the image size but resolves the startup issue.
This pull request addresses issue #2838.
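For anyone who wants to check for the missing-header condition without rebuilding the whole image, here is a minimal sketch that can be run inside an already-built container. It only checks whether `Python.h` is present in the interpreter's include directory and whether a C compiler is on `PATH`. The function names `python_header_status` and `find_c_compiler` are made up for this example, and the assumption that the startup failure comes down to these two conditions is taken from the description above, not verified against Triton's build code.

```python
import os
import shutil
import sysconfig


def python_header_status():
    """Return (include_dir, found), where found says whether Python.h exists.

    Hypothetical helper for this sketch; not part of TGI or Triton.
    """
    include_dir = sysconfig.get_paths()["include"]
    header = os.path.join(include_dir, "Python.h")
    return include_dir, os.path.isfile(header)


def find_c_compiler(candidates=("cc", "gcc", "clang")):
    """Return the path of the first C compiler found on PATH, or None.

    The candidate list is an assumption; adjust it to the image being tested.
    """
    for name in candidates:
        path = shutil.which(name)
        if path is not None:
            return path
    return None


if __name__ == "__main__":
    include_dir, found = python_header_status()
    print(f"Python include dir: {include_dir}")
    print("Python.h: found" if found else "Python.h: missing")
    compiler = find_c_compiler()
    print(f"C compiler: {compiler if compiler else 'none found on PATH'}")
```

Running this in both the runtime-based and the devel-based image should show whether the headers are the only difference that matters, which may help narrow down a smaller fix than switching the whole base image.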