Add ollama model server example #46
Conversation
cc @slemeur
Thanks for the ping @MichaelClifford!
One caveat worth mentioning with this approach is that it has the user download the models to their host machine using ollama's CLI tooling. That means they already have ollama running on their machine, so why use the containerized version at all? I think this experience could be improved by baking the model-pulling steps into the AI studio somewhere.
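For reference, a minimal sketch of that host-side flow as I understand it (the model name, paths, and read-only mount are illustrative assumptions, not taken from the PR):

```shell
# Pull the model on the host with ollama's own CLI (requires ollama installed there).
ollama pull mistral

# Then run the containerized server with the host's model store mounted in
# (read-only here, matching how we currently mount model files).
podman run -d -p 11434:11434 \
  -v ~/.ollama:/root/.ollama:ro \
  docker.io/ollama/ollama:latest
```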
Needs a rebase.
Force-pushed 4f88b5b to 9b2f81a
Force-pushed 8ce63e6 to 83f38cb
LGTM
Signed-off-by: Michael Clifford <[email protected]>
If you reorder the steps and run "podman run" first, then do an "ollama pull" inside the container, you can use the ollama binary inside the container for everything. It's probably better to avoid having to keep two separate binaries up to date. You don't need to install ollama directly on your host machine to use it (and in fact not doing so can be kind of nice: you can do things rootless, etc.). The only thing you need to install is Nvidia drivers/container toolkit/etc., and only if you are using Nvidia. I do have some ideas in this space, partially related to the podman-ollama project on my GitHub account, that I'd like to share with you sometime. Another thing to check out is Universal Blue's integration: they start the ollama container as a quadlet. I think there are some advantages to trying to do this in a daemonless way, though.
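A minimal sketch of that reordered, container-only flow (the container name, named volume, and model are illustrative assumptions):

```shell
# Start the containerized server first; keep the model store in a named volume.
podman run -d --name ollama \
  -p 11434:11434 \
  -v ollama-models:/root/.ollama \
  docker.io/ollama/ollama:latest

# Pull models with the ollama binary that already ships inside the container,
# so nothing ollama-specific needs to be installed on the host.
podman exec ollama ollama pull mistral
```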
I have some MRs open in Ollama upstream as well that help install Ollama in a containerised way: https://github.com/ollama/ollama/pulls/ericcurtin. There is a PR backlog in Ollama, though.
Fully agree. My comment was mainly an artifact of how we were (and are) managing models: via volume mounts from the host machine without write permissions. That means there would need to be a separate model-management path to deal with ollama's registry. Mainly, I think we can probably come up with a better way to manage model files in general.
Sounds good. Want to open a new issue to start that discussion?
Linked discussion: |
This PR is intended as an example of how we could integrate ollama into our project. Open to any questions or discussions that arise from this PR 😄

- Adds a new model_services directory where we can store information about the different model services available.
- Adds model_services/ollama, which includes an extremely simple Containerfile that calls the existing ollama/ollama:latest, as well as a README.md describing how this model service can be used (a hypothetical usage sketch follows below).
- Updates chatbot-langchain/chatbot_ui.py so that it works well with either the llamacpp or the ollama model service.
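To make that concrete, here is a hedged sketch of how such a model service might be built and exercised; the directory layout, image tag, and model name are assumptions for illustration, not taken from the PR or its README:

```shell
# Build the image from the Containerfile added under model_services/ollama.
podman build -t ollama-model-service model_services/ollama

# Run the model service on ollama's default port.
podman run -d -p 11434:11434 ollama-model-service

# Once a model has been pulled, a quick smoke test against Ollama's REST API:
curl http://localhost:11434/api/generate \
  -d '{"model": "mistral", "prompt": "Hello", "stream": false}'
```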