Add ollama model server example #46
Conversation
cc @slemeur
Thanks for the ping @MichaelClifford!
One caveat worth mentioning with this approach is that it has the user download the models to their host machine using ollama's CLI tooling. That means they already have ollama running on their machine, so why use the containerized version at all? I think this experience could be improved by baking the model-pulling steps into the AI studio somewhere.
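For reference, a minimal sketch of that host-side flow as I understand it (the model name, paths, and read-only mount are illustrative assumptions, not taken from the PR):

```shell
# Pull the model on the host with ollama's own CLI (requires ollama installed there).
ollama pull mistral

# Then run the containerized server with the host's model store mounted in
# (read-only here, matching how we currently mount model files).
podman run -d -p 11434:11434 \
  -v ~/.ollama:/root/.ollama:ro \
  docker.io/ollama/ollama:latest
```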
Needs a rebase.
Force-pushed 4f88b5b to 9b2f81a
Force-pushed 8ce63e6 to 83f38cb
LGTM
Signed-off-by: Michael Clifford <[email protected]>
If you reorder the steps and run "podman run" first, then do an "ollama pull" inside the container, you can use the ollama binary inside the container for everything. It's probably better to avoid having to keep two separate binaries up to date. You don't need to install ollama directly on your host machine to use it (and in fact not doing so can be kind of nice: you can do things rootless, etc.). The only thing you need to install is Nvidia drivers/container toolkit/etc., and only if you are using Nvidia. I do have some ideas in this space, partially related to the podman-ollama project on my GitHub account, that I'd like to share with you sometime. Another thing to check out is Universal Blue's integration: they start the ollama container as a quadlet. I think there are some advantages to trying to do this in a daemonless way, though.
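A minimal sketch of that reordered, container-only flow (the container name, named volume, and model are illustrative assumptions):

```shell
# Start the containerized server first; keep the model store in a named volume.
podman run -d --name ollama \
  -p 11434:11434 \
  -v ollama-models:/root/.ollama \
  docker.io/ollama/ollama:latest

# Pull models with the ollama binary that already ships inside the container,
# so nothing ollama-specific needs to be installed on the host.
podman exec ollama ollama pull mistral
```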
I have some MRs open in Ollama upstream as well that help install Ollama in a containerised way: https://github.com/ollama/ollama/pulls/ericcurtin. There is a PR backlog in Ollama, though.
Fully agree. My comment was mainly an artifact of how we were (and are) managing models: via volume mounts from the host machine without write permissions. That means there would need to be a separate model-management path to deal with ollama's registry. Mainly, I think we can probably come up with a better way to manage model files in general.
Sounds good. Want to open a new issue to start that discussion?
Linked discussion: |
This PR is intended as an example of how we could integrate ollama into our project. Open to any questions or discussions that arise from this PR 😄

- Adds a new model_services directory where we can store information about the different model services available.
- Adds model_services/ollama, which includes an extremely simple Containerfile that calls the existing ollama/ollama:latest, as well as a README.md describing how this model service can be used (a hypothetical usage sketch follows below).
- Updates chatbot-langchain/chatbot_ui.py so that it works well with either the llamacpp or the ollama model service.
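To make that concrete, here is a hedged sketch of how such a model service might be built and exercised; the directory layout, image tag, and model name are assumptions for illustration, not taken from the PR or its README:

```shell
# Build the image from the Containerfile added under model_services/ollama.
podman build -t ollama-model-service model_services/ollama

# Run the model service on ollama's default port.
podman run -d -p 11434:11434 ollama-model-service

# Once a model has been pulled, a quick smoke test against Ollama's REST API:
curl http://localhost:11434/api/generate \
  -d '{"model": "mistral", "prompt": "Hello", "stream": false}'
```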