# Update Ubuntu version to 24.04 #52
Hi @adamoutler - thank you for raising your issue. The error message does not look like a Python version issue to me, but I will take a note to check the compatibility of this plug-in.
The plugin dev identified it as a Python version issue and recommended Ubuntu 23+.
Thanks - I have now read the conversation you had with mamel16. Upgrading the Python version requires a lot of testing on my end - this application is a leaning tower of machine learning dependencies - but I will roll it into some other major updates that I have in mind. Then again, sometimes it just works right away! You are welcome to modify the Dockerfile's base image if you'd like to give it a go.
I understand. I attempted to do so myself already, and many of the pip packages have been moved into Ubuntu package repositories. While I'm no stranger to dependency management, and I do like the stability of package managers over individual packaging, I'm positive there are differences that will hinder me further. I may have some time to continue later. Even building this on a 20-thread processor took a very long time. Not sure what I can do to speed it up. Any recommendations on making a smaller build from your multi-stage Dockerfile?
Glad that you're trying it out! Please feel welcome to share your progress and experience. Note that Docker builds cache their steps by default and will only re-build if something changes - bear this in mind when you're tweaking things, and try to put test variations and experiments after the longer build steps. The default image takes about five minutes to build on my 5950X - I would expect your times to be similar to that.
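To illustrate the layer-caching advice: a minimal sketch only (the package names and file names here are placeholders, not the project's actual build steps):

```dockerfile
FROM ubuntu:24.04

# Slow, stable steps first - these layers are cached and skipped on
# every rebuild as long as their lines do not change.
RUN apt-get update && apt-get install --no-install-recommends -y \
    build-essential python3-dev python3-venv

# Fast-changing experiments last - editing these lines only
# invalidates the layers below them, not the expensive ones above.
COPY experiment.patch /tmp/
RUN echo "apply tweaks here"
```

Re-running `docker build` after touching only the final lines should reuse the cached apt layer and finish in seconds rather than minutes.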
So it looks like we're basically required to let Ubuntu manage the virtual environment in 24.04.

```dockerfile
####################
### BUILD IMAGES ###
####################
# COMMON
FROM ubuntu:24.04 AS app_base
# Pre-reqs
RUN apt-get update && apt-get install --no-install-recommends -y \
    git vim build-essential python3-dev python3-venv python3-pip \
    python3-virtualenv
# Instantiate venv and pre-activate
RUN virtualenv /venv
# Credit, Itamar Turner-Trauring: https://pythonspeed.com/articles/activate-virtualenv-dockerfile/
ENV VIRTUAL_ENV=/venv
RUN python3 -m venv $VIRTUAL_ENV
ENV PATH="$VIRTUAL_ENV/bin:$PATH"
...
```

The only changes required were to update the base image, and then use the apt-packaged venv tooling (`python3-venv`/`python3-virtualenv`) shown above.
Looks like it's expecting 3.10 but getting 3.10.9?
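The mismatch above has the shape of a string-equality version check. A small illustration (a hypothetical helper, not the plugin's actual code) of why comparing `"3.10.9"` against `"3.10"` as strings fails, and how comparing major.minor tuples avoids it:

```python
def matches_minor(version: str, major: int, minor: int) -> bool:
    # Compare only the major.minor components, ignoring the patch level.
    parts = version.split(".")
    return (int(parts[0]), int(parts[1])) == (major, minor)

# Naive string equality rejects patch releases such as 3.10.9:
naive_ok = ("3.10.9" == "3.10")             # False
robust_ok = matches_minor("3.10.9", 3, 10)  # True
```

In real code, `sys.version_info[:2] == (3, 10)` expresses the same tuple comparison for the running interpreter.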
Hi @adamoutler, thank you for trying the update and sharing your results - it's useful to know your experience with it. Seems like we did not get lucky with just changing the base image - it was worth a shot, though! I'm not sure why the venv would need to be managed by Ubuntu... I would prefer to keep it via pip. I'll have a look at installing a different Python version in a venv, and also consider using a conda-forge environment instead. The first option is a smaller change, the second option is a larger rework.
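One common way to pin a specific Python inside a venv on Ubuntu is the deadsnakes PPA - a hedged sketch only, not the project's decided approach (the PPA availability for a given Ubuntu release and the chosen Python version are assumptions):

```dockerfile
FROM ubuntu:24.04 AS app_base
# deadsnakes publishes alternative CPython builds for Ubuntu
RUN apt-get update && apt-get install --no-install-recommends -y \
    software-properties-common \
 && add-apt-repository ppa:deadsnakes/ppa \
 && apt-get update && apt-get install --no-install-recommends -y \
    python3.11 python3.11-venv
# Create the venv with that specific interpreter, then pre-activate it
ENV VIRTUAL_ENV=/venv
RUN python3.11 -m venv $VIRTUAL_ENV
ENV PATH="$VIRTUAL_ENV/bin:$PATH"
```

The venv then decouples the application's interpreter from whatever `python3` the distro ships by default.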
The reason for the apt install is that, when working directly on the OS, you must now either use a virtual environment or pass `--break-system-packages` to pip - PEP 668 marks the system Python as externally managed.
I'm not quite sure I follow - isn't that the same as the current approach in the Dockerfile?
No. https://github.com/Atinoda/text-generation-webui-docker/blob/master/Dockerfile#L11 |
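The restriction being discussed is PEP 668's externally-managed-environment marker, which is what forces the venv-or-`--break-system-packages` choice. A small stdlib sketch (the function names are mine, for illustration) for checking both conditions from inside an interpreter:

```python
import sys
import sysconfig
from pathlib import Path

def in_virtualenv() -> bool:
    # Inside a venv, sys.prefix points at the venv directory while
    # sys.base_prefix still points at the base interpreter.
    return sys.prefix != sys.base_prefix

def externally_managed() -> bool:
    # PEP 668: the distro drops an EXTERNALLY-MANAGED marker file into
    # the stdlib directory; pip refuses system-wide installs when it
    # exists (unless --break-system-packages is passed).
    marker = Path(sysconfig.get_path("stdlib")) / "EXTERNALLY-MANAGED"
    return marker.exists()
```

On Ubuntu 24.04's system Python, `externally_managed()` should be true; inside the image's `/venv`, `in_virtualenv()` is true and pip installs work normally.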
Thanks for clarifying - I can see the distinction now. Seems like there will be a need to manage simultaneous 3.10 and 3.11 environments starting from v1.8 of textgen - it'll be interesting navigating that!
Is there any requirement to use 3.10? Seems like 3.11 is the way. Actually, I was also wondering: is there any requirement to use Ubuntu? It might be cleaner to start with an Alpine container and build in exactly what is required. I'm considering giving this a go.
The release notes mention that 3.10 is required for TensorRT-LLM - apparently, it's the new and fastest backend in the project. I haven't had time to test it yet though.

Regarding Ubuntu as a base: I used to have an NVIDIA machine learning image as the base - which produced even larger images - so Ubuntu was the slimming-down effort at the time, plus the benefits of generalisation. I'm a big fan of Alpine-based images for embedded systems and micro-services, and use them regularly there - and you raise valid points about their strengths.

The build times can indeed be slow and painful for this image and its variants - that was one of my main motivations for setting up the project and pushing the pre-built versions to Docker Hub. Textgen is a fast-moving and cutting-edge project rather than mature and stable software - slimming its deployment down and tuning dependencies when they might change tomorrow is making a rod for your own back. Rather, the goal is to ease its deployment, increase accessibility, and get people up and running quickly - but still offer the option to get hands-on and build or tweak it yourself. None of these things are a requirement as such - very few things in life are - but that's my current thinking and motivation: spend more of my finite resources on developing features and less on minimising the image footprint.
I added TensorRT-LLM support in the latest release, but there's not much available for it at the moment. It's also limited to NVIDIA hardware only. Therefore, I'm not going to consider its integration a stumbling block in the move to Python 3.11. The next update will investigate making the shift to 3.11, checking whether there are any other dependency issues or hidden gotchas! TensorRT-LLM will either be dropped or held in a separate variant if the transition is successful.
The below only argues against Alpine, which AFAIK has already been ruled out - I just wanted to add some context as to why it'd be a bad fit.
Alpine has been known to have a variety of problems that are not fun to troubleshoot. DNS and glibc vs musl are common issues, along with memory leaks (and higher usage in general, IIRC) and often slower performance (try building a Rust project and notice it can be 2-3x slower). The smaller size is often very minor if you build the image properly - I've made images with Fedora as a base that are only slightly larger than the equivalent Alpine image.

In regards to Python, you can find resources online sharing woes specific to Alpine - runtime performance is one, IIRC. Make sure you properly test/benchmark such a switch before adopting it. It's easy to follow advice parroted online that suggests Alpine for small sizes and a smaller attack surface - which is easy to observe and reason about - rather than the more project-specific concerns and networking quirks you can run into that waste time troubleshooting the cause.

### Image weight

Besides, the bulk of the image weight comes from the Python dependencies themselves:

```
$ dua /venv/lib/python3.10/site-packages
206.89 MiB sudachidict_core
353.01 MiB exllamav2_ext.cpython-310-x86_64-linux-gnu.so
389.21 MiB bitsandbytes
418.73 MiB triton
587.74 MiB llama_cpp_cuda
633.33 MiB flash_attn_2_cuda.cpython-310-x86_64-linux-gnu.so
642.19 MiB llama_cpp_cuda_tensorcores
  1.48 GiB torch
  2.74 GiB nvidia
  9.62 GiB total

# Total image size (rounded up):
$ du -shx --bytes --si /
11G

$ dua /venv/lib/python3.10/site-packages/torch/lib
453.07 MiB libtorch_cpu.so
815.36 MiB libtorch_cuda.so
  1.38 GiB total

$ dua /venv/lib/python3.10/site-packages/nvidia
 94.25 MiB curand
185.40 MiB cufft
185.51 MiB cusolver
209.34 MiB nccl
252.91 MiB cusparse
594.99 MiB cublas
  1.10 GiB cudnn
  2.74 GiB total
```

Some of it is the wider CUDA support bundled into a single image, while a fair amount comes from bundling the other variety of supported options. So I guess there is not much that can be done about that when the image itself is meant to be for convenience. Alpine isn't likely to save much there, presumably, unless these files are larger than they need to be due to how they were compiled.

### Rant - non-root example

This is not unlike some advice to adopt non-root users for containers. Root in a container is not equivalent to root on the Docker host, despite what some might think - there are already constraints in place. Often the exploits that are referred to rely on conditions that come from misconfiguration. There is some valid reasoning to prefer non-root, so that a user does not have to drop capabilities by default themselves. My issue with non-root being adopted for security reasons is when the project then works around the lack of needed capabilities that root would have provided (or would require opt-in via config, if non-default caps capable of more damage are needed) by granting an executable those capabilities - that defeats the purpose (this is often done with `setcap`).
@polarathene - thank you for your interesting comment! It is a helpful perspective that reinforces my decision to stay away from Alpine in this case. Your commentary regarding non-root vs root containers was good to read. I run a lot of services and there's a mix of the two - root tends to be much easier to set up and manage. My feeling is that non-root maybe has fewer security foot-guns for a casual user, but it also tends to bring a lot more complexity in day-to-day operations. I had been considering refactoring this image as non-root, but that particular task has been pushed even further down the list now! Besides, it's already complicated enough for people to get their hardware acceleration working properly.
Ubuntu 22.04 LTS comes with Python 3.10 and is superseded by 24.04 LTS, which ships a newer Python. Ubuntu 22.04 with Python 3.10 is causing issues with LLM_Web_search. According to the LLM_Web_search dev, LLM_Web_search requires Python 3.11, and Python 3.10 will not work. An update to Ubuntu 23.04 or later is required for plugins used in textgen webui.