feat!: add optional k3s-cuda base image flavors #118

Closed
wants to merge 10 commits into from
Empty file removed .github/.gitkeep
Empty file.
29 changes: 13 additions & 16 deletions .github/workflows/build-test.yaml
@@ -7,35 +7,32 @@ on:
- "docs/**"
- "CODEOWNERS"

permissions:
id-token: write
contents: read

jobs:
test-clean-install:
runs-on: ubuntu-latest

permissions:
id-token: write
contents: read

strategy:
matrix:
image: ["rancher/k3s"]
version: ["v1.29.8-k3s1", "v1.30.4-k3s1", "v1.31.0-k3s1"]
k3s_image_repository: ["rancher/k3s"]
k3s_image_version:
["v1.28.8-k3s1", "v1.29.8-k3s1", "v1.30.4-k3s1", "v1.31.0-k3s1"]

steps:
- uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4
- uses: actions/checkout@eef61447b9ff4aafe5dcd4e0bbf5d482be7e7871 # v4

- name: Setup UDS
if: always()
uses: defenseunicorns/uds-common/.github/actions/setup@e3008473beab00b12a94f9fcc7340124338d5c08 # v0.13.1
with:
username: ${{secrets.IRON_BANK_ROBOT_USERNAME}}
password: ${{secrets.IRON_BANK_ROBOT_PASSWORD}}

# Step is not currently being used, could be uncommented if custom image support is needed in the future
# - name: Build the custom k3s image
# if: ${{matrix.image}} != "rancher/k3s"
# run: uds run build-image --set VERSION=${{matrix.version}} --no-progress
registry1Username: ${{ secrets.IRON_BANK_ROBOT_USERNAME }}
registry1Password: ${{ secrets.IRON_BANK_ROBOT_PASSWORD }}
ghToken: ${{ secrets.GITHUB_TOKEN }}

- name: Create and deploy the uds-k3d package
run: uds run --set IMAGE_NAME=${{matrix.image}} --set VERSION=${{matrix.version}} --no-progress
run: uds run --set K3S_IMAGE_VERSION=${{matrix.k3s_image_version}} --set K3S_IMAGE_REPOSITORY=${{matrix.k3s_image_repository}} --no-progress

- name: Validate uds-k3d package
run: uds run validate --no-progress
43 changes: 0 additions & 43 deletions .github/workflows/publish-image.yaml

This file was deleted.

73 changes: 60 additions & 13 deletions .github/workflows/tag-and-release.yaml
@@ -7,8 +7,10 @@ on:

jobs:
tag-new-version:
permissions: write-all
runs-on: ubuntu-latest

permissions: write-all

outputs:
release_created: ${{ steps.release-flag.outputs.release_created }}
steps:
@@ -25,28 +27,73 @@ jobs:
if: ${{ needs.tag-new-version.outputs.release_created == 'true'}}
runs-on: ubuntu-latest

strategy:
matrix:
k3s_image_repository: ["rancher/k3s"]
k3s_image_version:
["v1.28.8-k3s1", "v1.29.8-k3s1", "v1.30.4-k3s1", "v1.31.0-k3s1"]

permissions:
contents: read
packages: write

steps:
- uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4
- uses: actions/checkout@eef61447b9ff4aafe5dcd4e0bbf5d482be7e7871 # v4

- name: Setup UDS
if: always()
uses: defenseunicorns/uds-common/.github/actions/setup@e3008473beab00b12a94f9fcc7340124338d5c08 # v0.13.1
with:
username: ${{secrets.IRON_BANK_ROBOT_USERNAME}}
password: ${{secrets.IRON_BANK_ROBOT_PASSWORD}}
registry1Username: ${{ secrets.IRON_BANK_ROBOT_USERNAME }}
registry1Password: ${{ secrets.IRON_BANK_ROBOT_PASSWORD }}
ghToken: ${{ secrets.GITHUB_TOKEN }}

- name: Publish the base capability
run: |
uds zarf package create --confirm -a arm64 -o oci://ghcr.io/defenseunicorns/packages \
--set K3S_IMAGE_REPOSITORY=${{ matrix.k3s_image_repository }} \
--set K3S_IMAGE_VERSION=${{ matrix.k3s_image_version }}

uds zarf package create --confirm -a amd64 -o oci://ghcr.io/defenseunicorns/packages \
--set K3S_IMAGE_REPOSITORY=${{ matrix.k3s_image_repository }} \
--set K3S_IMAGE_VERSION=${{ matrix.k3s_image_version }}

publish-uds-cuda-package:
needs: tag-new-version
if: ${{ needs.tag-new-version.outputs.release_created == 'true'}}
runs-on: ubuntu-latest

strategy:
matrix:
k3s_image_repository: ["rancher/k3s"]
k3s_image_version:
["v1.28.8-k3s1", "v1.29.8-k3s1", "v1.30.4-k3s1", "v1.31.0-k3s1"]
cuda_image_version:
[
11.8.0-base-ubuntu22.04,
12.1.0-base-ubuntu22.04,
12.5.0-base-ubuntu22.04,
]

steps:
- uses: actions/checkout@eef61447b9ff4aafe5dcd4e0bbf5d482be7e7871 # v4

- name: Login to GHCR
uses: docker/login-action@9780b0c442fbb1117ed29e0efdff1e18412f7567 # v3
- uses: docker/setup-buildx-action@8026d2bc3645ea78b0d2544766a1225eb5691f89 # v3.7.0

- name: Setup UDS
uses: defenseunicorns/uds-common/.github/actions/setup@e3008473beab00b12a94f9fcc7340124338d5c08 # v0.13.1
with:
registry: ghcr.io
username: dummy
password: ${{ secrets.GITHUB_TOKEN }}
registry1Username: ${{ secrets.IRON_BANK_ROBOT_USERNAME }}
registry1Password: ${{ secrets.IRON_BANK_ROBOT_PASSWORD }}
ghToken: ${{ secrets.GITHUB_TOKEN }}

- name: Publish the CUDA K3s image
run: |
uds run publish-cuda-image \
--set K3S_IMAGE_REPOSITORY=${{ matrix.k3s_image_repository }} \
--set K3S_IMAGE_VERSION="${{ matrix.k3s_image_version }}" \
--set CUDA_IMAGE_VERSION="${{ matrix.cuda_image_version }}" \
--no-progress

- name: Publish the capability
- name: Publish the CUDA capability
run: |
uds zarf package create --confirm -a arm64 -o oci://ghcr.io/defenseunicorns/packages
uds zarf package create --confirm -a amd64 -o oci://ghcr.io/defenseunicorns/packages
uds zarf package create --confirm -a amd64 -o oci://ghcr.io/defenseunicorns/packages --flavor cuda
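
For a sense of scale, the `publish-uds-cuda-package` matrix above is a cross product of its axes, so it fans out into one job per combination. The sketch below only mirrors that expansion locally; `expand_matrix` is a hypothetical helper written for illustration and is not part of the workflow or of any GitHub tooling.

```python
from itertools import product

# Axis values copied from the publish-uds-cuda-package matrix above.
k3s_image_repository = ["rancher/k3s"]
k3s_image_version = ["v1.28.8-k3s1", "v1.29.8-k3s1", "v1.30.4-k3s1", "v1.31.0-k3s1"]
cuda_image_version = [
    "11.8.0-base-ubuntu22.04",
    "12.1.0-base-ubuntu22.04",
    "12.5.0-base-ubuntu22.04",
]


def expand_matrix(**axes):
    """Hypothetical helper: return one dict per combination of axis values."""
    names = list(axes)
    return [dict(zip(names, values)) for values in product(*axes.values())]


jobs = expand_matrix(
    k3s_image_repository=k3s_image_repository,
    k3s_image_version=k3s_image_version,
    cuda_image_version=cuda_image_version,
)

# 1 repository x 4 K3s versions x 3 CUDA versions
print(len(jobs))  # 12
```

Each resulting dictionary corresponds to the `matrix.*` values that one job would see.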
13 changes: 9 additions & 4 deletions README.md
@@ -16,16 +16,17 @@ sudo ssh -N -L 80:localhost:80 -L 443:localhost:443 -L 6550:localhost:6550 <your
> [!NOTE]
> UDS K3d intentionally does not address airgap concerns for K3d or the load balancer logic deployed in this package. This allows running `zarf init` or deploying a Zarf Init Package via a UDS Bundle after the UDS K3d environment is deployed.

## Prerequisites
## Pre-Requisites

- [UDS CLI](https://github.com/defenseunicorns/uds-cli/blob/main/README.md#install) & [K3d](https://k3d.io/#installation) using the versions specified in the [uds-common repo](https://github.com/defenseunicorns/uds-common/blob/main/README.md#supported-tool-versions)
- [Docker](https://docs.docker.com/get-docker/) or [Podman](https://podman.io/getting-started/installation) for running K3d
- See the [GPU Configuration](./docs/GPU.md) information for more details on enabling NVIDIA GPU support within the cluster

## Deploy

<!-- x-release-please-start-version -->

`uds zarf package deploy oci://defenseunicorns/uds-k3d:0.9.0`
`uds zarf package deploy oci://ghcr.io/defenseunicorns/packages/uds-k3d:0.9.0`

<!-- x-release-please-end -->

@@ -53,13 +54,13 @@ k3d cluster start uds

## Additional Info

You can set extra k3d args by setting the deploy-time ZARF_VAR_K3D_EXTRA_ARGS. See below `zarf-config.yaml` example k3d args:
You can set extra K3d arguments with the deploy-time variable `ZARF_VAR_K3D_EXTRA_ARGS`. See the example `zarf-config.yaml` below:

```yaml
package:
deploy:
set:
k3d_extra_args: "--k3s-arg --gpus=1 --k3s-arg --<arg2>=<value>"
k3d_extra_args: --k3s-arg "--<arg2>=<value>@server:*" --gpus=all
```
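
As a rough illustration of why the quoting in the newer `k3d_extra_args` value matters, the sketch below splits that string the way a POSIX shell would. That the package passes the value through shell-style word splitting is an assumption here, not something the README states.

```python
import shlex

# Value copied from the zarf-config.yaml example above. Shell-style splitting
# of this string is an assumption made for illustration.
k3d_extra_args = '--k3s-arg "--<arg2>=<value>@server:*" --gpus=all'

tokens = shlex.split(k3d_extra_args)
for token in tokens:
    print(token)
```

Under that assumption, the quoted `--<arg2>=<value>@server:*` stays a single argument to `--k3s-arg`, and `--gpus=all` is a separate K3d flag rather than a K3s argument.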

### Configure MinIO
@@ -69,3 +70,7 @@ package:
### DNS Assumptions

- [DNS Assumptions](docs/DNS.md)

### Enabling GPU Support

- [GPU Workload Configuration](docs/GPU.md)
5 changes: 0 additions & 5 deletions docker/Dockerfile

This file was deleted.

35 changes: 35 additions & 0 deletions docker/Dockerfile.gpu
@@ -0,0 +1,35 @@
ARG K3S_REPOSITORY="rancher/k3s"
ARG K3S_TAG="v1.30.4-k3s1"
ARG CUDA_TAG="12.1.0-base-ubuntu22.04"

FROM $K3S_REPOSITORY:$K3S_TAG AS k3s

FROM nvidia/cuda:$CUDA_TAG

# Install the NVIDIA container toolkit
RUN apt-get update && \
apt-get install -y curl && \
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg && \
curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
tee /etc/apt/sources.list.d/nvidia-container-toolkit.list && \
apt-get update && \
apt-get install -y nvidia-container-toolkit-base nvidia-container-toolkit nvidia-container-runtime util-linux && \
nvidia-ctk runtime configure --runtime=containerd

COPY --from=k3s --exclude=/bin/ / /
COPY --from=k3s /bin /bin

VOLUME /var/lib/kubelet
VOLUME /var/lib/rancher/k3s
VOLUME /var/lib/cni
VOLUME /var/log

# Resolve fsnotify issues
RUN sysctl -w fs.inotify.max_user_watches=100000 && \
sysctl -w fs.inotify.max_user_instances=100000

ENV PATH="$PATH:/bin/aux"

ENTRYPOINT ["/bin/k3s"]
CMD ["agent"]
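
The three `ARG` values at the top of this Dockerfile are what a build would override for each K3s and CUDA combination. A minimal sketch of assembling such a `docker build` invocation follows; the image name `example/k3s-cuda`, the tag layout, and the use of the repository root as build context are all hypothetical and are not taken from this PR's `publish-cuda-image` task.

```python
def cuda_build_command(k3s_tag, cuda_tag,
                       k3s_repository="rancher/k3s",
                       image="example/k3s-cuda"):
    """Build an argv list for `docker build` against docker/Dockerfile.gpu.

    `image` and the tag layout are hypothetical placeholders.
    """
    tag = f"{image}:{k3s_tag}-cuda-{cuda_tag}"
    return [
        "docker", "build",
        "--file", "docker/Dockerfile.gpu",
        # These names match the ARG declarations in the Dockerfile.
        "--build-arg", f"K3S_REPOSITORY={k3s_repository}",
        "--build-arg", f"K3S_TAG={k3s_tag}",
        "--build-arg", f"CUDA_TAG={cuda_tag}",
        "--tag", tag,
        ".",  # build context: assumed to be the repository root
    ]


command = cuda_build_command("v1.30.4-k3s1", "12.1.0-base-ubuntu22.04")
print(" ".join(command))
```

The list form can be handed to `subprocess.run(command, check=True)` without going through a shell, which avoids quoting problems in the tag values.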