Merge pull request #119 from sallyom/embed-workload-bootable-containers
Embed workload bootable containers
rhatdan authored Mar 29, 2024
2 parents 205d2e2 + 84c3b77 commit 07fe7d2
Showing 8 changed files with 126 additions and 12 deletions.
3 changes: 1 addition & 2 deletions README.md
@@ -19,7 +19,7 @@ However, each sample application can be paired with a variety of model servers.

Learn how to build and run the llamacpp_python model server by following the [llamacpp_python model server README.](/model_servers/llamacpp_python/README.md).

-## Current Recipes:
+## Current Recipes

There are several sample applications in this repository. They live in the [recipes](./recipes) folder.
They fall under the categories:
@@ -36,7 +36,6 @@ Many sample applications utilize the [Streamlit UI](https://docs.streamlit.io/).
Learn how to build and run each application by visiting each of the categories above. For example
the [chatbot recipe](./recipes/natural_language_processing/chatbot).

-
## Current Locallm Images built from this repository

Images for many sample applications and models are available in `quay.io`. All currently built images are tracked in
@@ -0,0 +1,27 @@
# In this example, an AI-powered sample application is embedded as a systemd service
# by placing podman quadlet files in /usr/share/containers/systemd

FROM quay.io/centos-bootc/centos-bootc:stream9
# Build like this:
# podman build --build-arg "sshpubkey=$(cat ~/.ssh/mykey.pub)" -t quay.io/exampleos/example-image .
# Substitute YOUR public key below; the holder of the matching private key will have root access
ARG sshpubkey
ARG MODEL_SERVER_IMAGE=quay.io/redhat-et/locallm-model-service:latest

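# Point sshd at per-user key files under /usr/etc-system and install the build-arg key for root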
RUN mkdir /usr/etc-system && \
echo 'AuthorizedKeysFile /usr/etc-system/%u.keys' >> /etc/ssh/sshd_config.d/30-auth-system.conf && \
echo $sshpubkey > /usr/etc-system/root.keys && chmod 0600 /usr/etc-system/root.keys

RUN dnf install -y vim && dnf clean all

# Chatbot sample application
COPY quadlet/chatbot.kube.example /usr/share/containers/systemd/chatbot.kube
COPY quadlet/chatbot.yaml /usr/share/containers/systemd/chatbot.yaml
COPY quadlet/chatbot.image /usr/share/containers/systemd/chatbot.image

# Pre-load the workload images.
# Comment out the pull commands below to keep the bootc image smaller;
# with the quadlet .image file above, the images will be pulled on boot if not pre-loaded here.
RUN podman pull quay.io/redhat-et/locallm-mistral-7b-gguf:latest
RUN podman pull quay.io/redhat-et/locallm-chatbot:latest
RUN podman pull $MODEL_SERVER_IMAGE
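For reference, the quadlet `.image` file copied above declares an image for systemd to pull at boot. A minimal sketch of what such a file can contain (illustrative; the `chatbot.image` shipped with this commit may reference a different image):

```ini
[Image]
# Pull this image at boot if it is not already present in local storage
Image=quay.io/redhat-et/locallm-mistral-7b-gguf:latest
```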
@@ -0,0 +1,88 @@
## Embed workloads (AI sample applications) in a bootable container image

### Create a custom centos-bootc:stream9 image

* [Containerfile](./Containerfile) - embeds an LLM-powered sample chat application.

Details on the application can be found in the [chatbot/README.md](../README.md). By default, this Containerfile includes a model server
that runs on CPU only; no additional GPU drivers or toolkits are embedded. With additional build args, you can substitute the
llamacpp_python model-server image for one that includes GPU drivers and toolkits. When building for GPU-enabled systems, the `FROM`
base image must also be replaced with one that has the necessary kernel drivers and toolkits. For an example of an NVIDIA/CUDA base image,
see the [NVIDIA bootable image example](https://gitlab.com/bootc-org/examples/-/tree/main/nvidia?ref_type=heads).

To pre-pull the workload images, you must build on the same architecture you are building for.
If you are not pre-pulling the workload images, you can cross-build (i.e., build from a Mac for an x86_64 system).
To build the derived bootc image for x86_64 architecture, run the following:

```bash
cd recipes/natural_language_processing/chatbot

# for CPU powered sample LLM application
# to switch to aarch64 platform, pass --platform linux/arm64
podman build --build-arg "sshpubkey=$(cat ~/.ssh/id_rsa.pub)" \
--cap-add SYS_ADMIN \
--platform linux/amd64 \
-t quay.io/yourrepo/youros:tag .

# for GPU powered sample LLM application with llamacpp cuda model server
podman build --build-arg "sshpubkey=$(cat ~/.ssh/id_rsa.pub)" \
--build-arg "model-server-image="quay.io/redhat-et/locallm-llamacpp-cuda-model-server:latest" \
--from <YOUR BOOTABLE IMAGE WITH NVIDIA/CUDA> \
--cap-add SYS_ADMIN \
--platform linux/amd64 \
-t quay.io/yourrepo/youros:tag .
podman push quay.io/yourrepo/youros:tag
```
### Update a bootc-enabled system with the new derived image
To build a disk image from an OCI bootable image, refer to [bootc-org/examples](https://gitlab.com/bootc-org/examples).
For this example, we assume a bootc-enabled system is already running.
In that case, `bootc switch` can be used to update the system to target the new bootable OCI image with embedded workloads.
SSH into the bootc-enabled system and run:
```bash
bootc switch quay.io/yourrepo/youros:tag
```
The necessary image layers will be downloaded from the OCI registry, and the system will prompt you to reboot into the new operating system.
From this point, any subsequent modifications pushed to the `quay.io/yourrepo/youros:tag` OCI image can be applied to your OS with:
```bash
bootc upgrade
```
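To check which image the host is currently booted from and whether a new deployment is staged, the bootc CLI provides a status command:
```bash
# Show the booted, staged, and rollback deployments known to bootc
bootc status
```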
### Accessing the embedded workloads
The chatbot can be accessed by visiting port `8501` of the running bootc system.
The workloads run as systemd services, generated from podman quadlet files placed at `/usr/share/containers/systemd/` on the bootc system.
For more information about running containerized applications as systemd services with podman, refer to this
[podman quadlet post](https://www.redhat.com/sysadmin/quadlet-podman) or the [podman documentation](https://podman.io/docs).
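A quadlet `.kube` file is a small systemd-style unit that points podman at a Kubernetes YAML. As a sketch of the shape of `chatbot.kube` (the real file is shipped with this commit):
```ini
[Unit]
Description=Chatbot sample application

[Kube]
# Kubernetes YAML describing the chatbot pod
Yaml=/usr/share/containers/systemd/chatbot.yaml

[Install]
WantedBy=multi-user.target
```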
To monitor the sample application, SSH into the bootc system and run:
```bash
systemctl status chatbot
```
You can also view the pods and containers that are managed with systemd by running:
```bash
podman pod list
podman ps -a
```
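Service logs are available through the journal; the unit name matches the quadlet file name:
```bash
# Follow the chatbot service logs
journalctl -u chatbot -f
```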
To stop the sample applications, SSH into the bootc system and run:
```bash
systemctl stop chatbot
```
To run the sample application _not_ as a systemd service, stop the service, then
run the appropriate commands for the application you have embedded. For the chatbot:
```bash
podman kube play /usr/share/containers/systemd/chatbot.yaml
```
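When finished, the pod created by `podman kube play` can be torn down with the same YAML file:
```bash
podman kube down /usr/share/containers/systemd/chatbot.yaml
```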
@@ -0,0 +1,10 @@
### Run chatbot as a systemd service

```bash
cp chatbot.yaml /usr/share/containers/systemd/chatbot.yaml
cp chatbot.kube.example /usr/share/containers/systemd/chatbot.kube
cp chatbot.image /usr/share/containers/systemd/chatbot.image
# optional: verify that quadlet can generate the service files
/usr/libexec/podman/quadlet --dryrun
systemctl daemon-reload
systemctl start chatbot
```
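Once the service is active, a quick smoke test can confirm the application is answering (assuming the chatbot UI is published on port 8501, as configured in `chatbot.yaml`):
```bash
curl -sf http://localhost:8501 >/dev/null && echo "chatbot is up"
```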
10 changes: 0 additions & 10 deletions recipes/natural_language_processing/chatbot/quadlet/README.md

This file was deleted.
