update mistral model to K_M and image updates to ai-lab #123

Merged (1 commit) on Mar 29, 2024
2 changes: 1 addition & 1 deletion .github/workflows/model_servers.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -49,7 +49,7 @@ jobs:

- name: Download model
working-directory: ./model_servers/llamacpp_python/
run: make llama-2-7b-chat.Q5_K_S.gguf
run: make mistral-7b-instruct-v0.1.Q4_K_M.gguf

- name: Set up Python
uses: actions/[email protected]
Expand Down
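The CI job's "Download model" step now pulls the Mistral Q4_K_M build instead of Llama 2. Reproducing that step locally is a two-liner — a sketch assuming you start from the repository root, as the step's working-directory implies:

```bash
# Mirror the CI "Download model" step (run from the repository root).
cd model_servers/llamacpp_python/
make mistral-7b-instruct-v0.1.Q4_K_M.gguf   # roughly a 4 GB pull from Hugging Face
```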
10 changes: 4 additions & 6 deletions ai-lab-recipes-images.md
Original file line number Diff line number Diff line change
@@ -1,8 +1,8 @@
## Images (x86_64, aarch64) currently built from GH Actions in this repository

- quay.io/redhat-et/locallm-model-service:latest
- quay.io/ai-lab/llamacpp-python:latest
- quay.io/redhat-et/locallm-text-summarizer:latest
- quay.io/redhat-et/locallm-chatbot:latest
- quay.io/ai-lab/chatbot:latest
- quay.io/redhat-et/locallm-rag:latest
- quay.io/redhat-et/locallm-codegen:latest
- quay.io/redhat-et/locallm-chromadb:latest
Expand All @@ -11,9 +11,7 @@

## Model Images (x86_64, aarch64) currently in `quay.io/redhat-et/locallm-*`

- quay.io/redhat-et/locallm-llama-2-7b:latest
- [model download link](https://huggingface.co/TheBloke/Llama-2-7B-Chat-GGUF/resolve/main/llama-2-7b-chat.Q5_K_S.gguf)
- quay.io/redhat-et/locallm-mistral-7b-gguf:latest
- [model download link](https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.1-GGUF/resolve/main/mistral-7b-instruct-v0.1.Q4_K_S.gguf)
- quay.io/ai-lab/mistral-7b-instruct:latest
- [model download link](https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.1-GGUF/resolve/main/mistral-7b-instruct-v0.1.Q4_K_M.gguf)
- quay.io/redhat-et/locallm-codellama-7b-gguf:latest
- [model download link](https://huggingface.co/TheBloke/CodeLlama-7B-Instruct-GGUF/resolve/main/codellama-7b-instruct.Q4_K_M.gguf)
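The model images bundle the GGUF file under /model/ (the deployments later in this PR install /model/mistral-7b-instruct-v0.1.Q4_K_M.gguf out of an init container). If you only want the file, one hedged approach is to copy it out of the image without starting anything — the container command and target directory here are placeholders:

```bash
# Pull the model image and copy the GGUF out of it without running a service.
podman pull quay.io/ai-lab/mistral-7b-instruct:latest
mkdir -p models
cid=$(podman create quay.io/ai-lab/mistral-7b-instruct:latest sh)  # command is a placeholder; nothing runs
podman cp "$cid":/model/mistral-7b-instruct-v0.1.Q4_K_M.gguf ./models/
podman rm "$cid"
```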
5 changes: 4 additions & 1 deletion model_servers/llamacpp_python/Makefile
Original file line number Diff line number Diff line change
Expand Up @@ -5,13 +5,16 @@ build:
llama-2-7b-chat.Q5_K_S.gguf:
curl -s -S -L -f https://huggingface.co/TheBloke/Llama-2-7B-Chat-GGUF/resolve/main/llama-2-7b-chat.Q5_K_S.gguf -z $@ -o [email protected] && mv -f [email protected] $@ 2>/dev/null || rm -f [email protected] $@

mistral-7b-instruct-v0.1.Q4_K_M.gguf:
curl -s -S -L -f https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.1-GGUF/resolve/main/mistral-7b-instruct-v0.1.Q4_K_M.gguf -z $@ -o [email protected] && mv -f [email protected] $@ 2>/dev/null || rm -f [email protected] $@

.PHONY: install
install:
pip install -r tests/requirements-test.txt

.PHONY: run
run:
podman run -it -d -p 8001:8001 -v ./models:/locallm/models:ro,Z -e MODEL_PATH=models/llama-2-7b-chat.Q5_K_S.gguf -e HOST=0.0.0.0 -e PORT=8001 --net=host ghcr.io/redhat-et/model_servers
podman run -it -d -p 8001:8001 -v ./models:/locallm/models:ro,Z -e MODEL_PATH=models/mistral-7b-instruct-v0.1.Q4_K_M.gguf -e HOST=0.0.0.0 -e PORT=8001 --net=host ghcr.io/redhat-et/model_servers

.PHONY: test
test:
Expand Down
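The new download target copies the pattern of the existing Llama 2 one, and the curl invocation is worth unpacking. The same pattern as a standalone, annotated sketch:

```bash
# Annotated version of the Makefile's download recipe:
#   -s -S  : silent, but still report errors
#   -L -f  : follow redirects; fail on HTTP errors instead of saving an error page
#   -z FILE: only transfer if the remote copy is newer than FILE
# Downloading to a .tmp file and renaming it into place on success means an
# interrupted transfer never leaves a truncated model behind.
URL=https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.1-GGUF/resolve/main/mistral-7b-instruct-v0.1.Q4_K_M.gguf
OUT=mistral-7b-instruct-v0.1.Q4_K_M.gguf
curl -s -S -L -f "$URL" -z "$OUT" -o "$OUT.tmp" && mv -f "$OUT.tmp" "$OUT" || rm -f "$OUT.tmp"
```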
4 changes: 2 additions & 2 deletions model_servers/llamacpp_python/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -20,7 +20,7 @@ At the time of this writing, 2 models are known to work with this service
- **Llama2-7b**
- Download URL: [https://huggingface.co/TheBloke/Llama-2-7B-Chat-GGUF/resolve/main/llama-2-7b-chat.Q5_K_S.gguf](https://huggingface.co/TheBloke/Llama-2-7B-Chat-GGUF/resolve/main/llama-2-7b-chat.Q5_K_S.gguf)
- **Mistral-7b**
- Download URL: [https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.1-GGUF/resolve/main/mistral-7b-instruct-v0.1.Q4_K_S.gguf](https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.1-GGUF/resolve/main/mistral-7b-instruct-v0.1.Q4_K_S.gguf)
- Download URL: [https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.1-GGUF/resolve/main/mistral-7b-instruct-v0.1.Q4_K_M.gguf](https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.1-GGUF/resolve/main/mistral-7b-instruct-v0.1.Q4_K_M.gguf)

```bash
cd ../models
Expand All @@ -29,7 +29,7 @@ cd ../
```
or
```bash
make -f Makefile models/llama-2-7b-chat.Q5_K_S.gguf
make -f Makefile models/mistral-7b-instruct-v0.1.Q4_K_M.gguf
```

### Deploy Model Service
Expand Down
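With the model downloaded, the Makefile's run target (shown above) starts the service on port 8001. A quick probe — assuming the llama-cpp-python server's OpenAI-compatible API, which this diff does not itself spell out:

```bash
# Expect a JSON model list if the service came up with the Mistral model.
curl -s http://localhost:8001/v1/models
```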
2 changes: 1 addition & 1 deletion model_servers/llamacpp_python/tests/conftest.py
Original file line number Diff line number Diff line change
Expand Up @@ -12,7 +12,7 @@
)
],
extra_environment_variables={
"MODEL_PATH": "models/llama-2-7b-chat.Q5_K_S.gguf",
"MODEL_PATH": "models/mistral-7b-instruct-v0.1.Q4_K_M.gguf",
"HOST": "0.0.0.0",
"PORT": "8001"
},
Expand Down
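The test fixture now points the container at the Mistral file, so the model has to exist where MODEL_PATH expects it before the suite runs. A sketch of the local loop — that `make test` drives pytest is inferred from the requirements file name, since the target's body is collapsed in this view:

```bash
cd model_servers/llamacpp_python/
make mistral-7b-instruct-v0.1.Q4_K_M.gguf                            # fetch the model
mkdir -p models && mv mistral-7b-instruct-v0.1.Q4_K_M.gguf models/  # match the fixture's MODEL_PATH
make install   # pip install -r tests/requirements-test.txt
make test      # assumed to invoke the pytest suite
```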
2 changes: 1 addition & 1 deletion model_servers/llamacpp_python/tooling_options.ipynb
Original file line number Diff line number Diff line change
Expand Up @@ -23,7 +23,7 @@
"This notebook assumes that the playground image is running locally. Once built, you can use the below to start the model service image. \n",
"\n",
"```bash\n",
"podman run -it -p 8000:8000 -v <YOUR-LOCAL-PATH>/locallm/models:/locallm/models:Z -e MODEL_PATH=models/llama-2-7b-chat.Q5_K_S.gguf playground\n",
"podman run -it -p 8000:8000 -v <YOUR-LOCAL-PATH>/locallm/models:/locallm/models:Z -e MODEL_PATH=models/mistral-7b-instruct-v0.1.Q4_K_M.gguf playground\n",
"```"
]
},
Expand Down
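Once the playground container from the notebook is up on port 8000, a single request confirms the new model is being served — again assuming the OpenAI-compatible chat endpoint:

```bash
curl -s http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "In one sentence, what is a GGUF file?"}]}'
```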
4 changes: 2 additions & 2 deletions models/Containerfile
Original file line number Diff line number Diff line change
@@ -1,9 +1,9 @@
#https://huggingface.co/TheBloke/Llama-2-7B-Chat-GGUF/resolve/main/llama-2-7b-chat.Q5_K_S.gguf
#https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.1-GGUF/resolve/main/mistral-7b-instruct-v0.1.Q4_K_S.gguf
#https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.1-GGUF/resolve/main/mistral-7b-instruct-v0.1.Q4_K_M.gguf
#https://huggingface.co/TheBloke/CodeLlama-7B-Instruct-GGUF/resolve/main/codellama-7b-instruct.Q4_K_M.gguf
#https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-small.bin
# podman build --build-arg MODEL_URL=https://... -t quay.io/yourimage .
FROM registry.access.redhat.com/ubi9/ubi-micro:9.3-13
ARG MODEL_URL
ARG MODEL_URL=https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.1-GGUF/resolve/main/mistral-7b-instruct-v0.1.Q4_K_M.gguf
WORKDIR /model
ADD $MODEL_URL .
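Defaulting MODEL_URL means a bare build now produces the Mistral image, while other models still override the argument exactly as the Containerfile's own comment shows. A sketch — the tags are illustrative:

```bash
# Default build: bakes the Mistral Q4_K_M model into the image.
podman build -t quay.io/ai-lab/mistral-7b-instruct:latest models/

# Other models: override MODEL_URL, as before.
podman build \
  --build-arg MODEL_URL=https://huggingface.co/TheBloke/CodeLlama-7B-Instruct-GGUF/resolve/main/codellama-7b-instruct.Q4_K_M.gguf \
  -t quay.io/redhat-et/locallm-codellama-7b-gguf:latest models/
```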
4 changes: 2 additions & 2 deletions recipes/natural_language_processing/chatbot/ai-lab.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -15,7 +15,7 @@ application:
- amd64
ports:
- 8001
image: quay.io/redhat-et/locallm-model-service:latest
image: quay.io/ai-lab/llamacppp-python:latest
- name: streamlit-chat-app
contextdir: .
containerfile: builds/Containerfile
Expand All @@ -24,4 +24,4 @@ application:
- amd64
ports:
- 8501
image: quay.io/redhat-et/locallm-chatbot:latest
image: quay.io/ai-lab/chatbot:latest
chatbot deployment manifest (file path not shown)

@@ -8,7 +8,7 @@ spec:
   initContainers:
     - name: model-file
       image: quay.io/ai-lab/mistral-7b-instruct:latest
-      command: ['/usr/bin/install', "/model/mistral-7b-instruct-v0.1.Q4_K_S.gguf", "/shared/"]
+      command: ['/usr/bin/install', "/model/mistral-7b-instruct-v0.1.Q4_K_M.gguf", "/shared/"]
       volumeMounts:
         - name: model-file
           mountPath: /shared
@@ -29,7 +29,7 @@ spec:
         - name: PORT
           value: 8001
         - name: MODEL_PATH
-          value: /model/mistral-7b-instruct-v0.1.Q4_K_S.gguf
+          value: /model/mistral-7b-instruct-v0.1.Q4_K_M.gguf
       image: quay.io/ai-lab/llamacpp-python:latest
       name: chatbot-model-service
       ports:
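The init container's install command and the server's MODEL_PATH move to the Q4_K_M filename together; if either lagged behind, the service would point at a file the init container never staged. To try the manifest locally — the path below is a guess, since this view dropped the filename:

```bash
# Run the chatbot deployment under Podman's Kubernetes support (path assumed).
podman kube play recipes/natural_language_processing/chatbot/kubernetes/deployment.yaml
```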
codegen ai-lab.yaml (file path not shown)

@@ -15,7 +15,7 @@ application:
         - amd64
       ports:
         - 8001
-      image: quay.io/redhat-et/locallm-model-service:latest
+      image: quay.io/ai-lab/llamacpp-python:latest
     - name: codegen-app
       contextdir: .
       containerfile: builds/Containerfile
codegen image unit (file path not shown)

@@ -3,5 +3,5 @@ WantedBy=codegen.service

 [Image]
 Image=quay.io/redhat-et/locallm-codellama-7b-gguf:latest
-Image=quay.io/redhat-et/locallm-model-service:latest
+Image=quay.io/ai-lab/llamacpp-python:latest
 Image=quay.io/redhat-et/locallm-codegen:latest
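The unit now references the renamed server image alongside the model and app images. Pre-pulling the same set by hand is a quick way to sanity-check that all three references resolve:

```bash
podman pull quay.io/redhat-et/locallm-codellama-7b-gguf:latest
podman pull quay.io/ai-lab/llamacpp-python:latest
podman pull quay.io/redhat-et/locallm-codegen:latest
```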
4 changes: 2 additions & 2 deletions recipes/natural_language_processing/rag/ai-lab.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -15,7 +15,7 @@ application:
- amd64
ports:
- 8001
image: quay.io/redhat-et/locallm-model-service:latest
image: quay.io/ai-lab/llamacpp-python:latest
- name: chromadb-server
contextdir: ../../../vector_dbs/chromadb
containerfile: Containerfile
Expand All @@ -34,4 +34,4 @@ application:
- amd64
ports:
- 8501
image: quay.io/redhat-et/locallm-rag:latest
image: quay.io/redhat-et/locallm-rag:latest
4 changes: 2 additions & 2 deletions recipes/natural_language_processing/summarizer/ai-lab.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -15,7 +15,7 @@ application:
- amd64
ports:
- 8001
image: quay.io/redhat-et/locallm-model-service:latest
image: quay.io/ai-lab/llamacpp-python:latest
- name: streamlit-summary-app
contextdir: .
containerfile: builds/Containerfile
Expand All @@ -24,4 +24,4 @@ application:
- amd64
ports:
- 8501
image: quay.io/redhat-et/locallm-text-summarizer:latest
image: quay.io/redhat-et/locallm-text-summarizer:latest
summarizer image unit (file path not shown)

@@ -2,6 +2,6 @@
 WantedBy=summarizer.service

 [Image]
-Image=quay.io/redhat-et/locallm-mistral-7b-gguf:latest
-Image=quay.io/redhat-et/locallm-model-service:latest
+Image=quay.io/ai-lab/mistral-7b-instruct:latest
+Image=quay.io/ai-lab/llamacpp-python:latest
 Image=quay.io/redhat-et/locallm-text-summarizer:latest
summarizer deployment manifest (file path not shown)

@@ -7,8 +7,8 @@ metadata:
 spec:
   initContainers:
     - name: model-file
-      image: quay.io/redhat-et/locallm-mistral-7b-gguf:latest
-      command: ['/usr/bin/install', "/model/mistral-7b-instruct-v0.1.Q4_K_S.gguf", "/shared/"]
+      image: quay.io/ai-lab/mistral-7b-instruct:latest
+      command: ['/usr/bin/install', "/model/mistral-7b-instruct-v0.1.Q4_K_M.gguf", "/shared/"]
       volumeMounts:
         - name: model-file
           mountPath: /shared
@@ -29,8 +29,8 @@ spec:
         - name: PORT
           value: 8001
         - name: MODEL_PATH
-          value: /model/mistral-7b-instruct-v0.1.Q4_K_S.gguf
-      image: quay.io/redhat-et/locallm-model-service:latest
+          value: /model/mistral-7b-instruct-v0.1.Q4_K_M.gguf
+      image: quay.io/ai-lab/llamacpp-python:latest
       name: summarizer-model-service
       ports:
         - containerPort: 8001