Add QLoRA

RareCompute · Dec 6, 2024 · 26ba623
1 parent a65400f
commit 26ba623

Showing 6 changed files with 124 additions and 0 deletions.
diff --git a/apps/qlora/Dockerfile b/apps/qlora/Dockerfile
@@ -0,0 +1,79 @@
+FROM docker.io/library/python:3.9-slim-bookworm
+
+LABEL \
+  maintainer="Liana64" \
+  org.opencontainers.image.source="https://github.com/artidoro/qlora"
+
+ARG TARGETPLATFORM
+ARG VERSION
+ARG CHANNEL
+ARG DEBIAN_FRONTEND=noninteractive
+
+ENV \
+  NVIDIA_DRIVER_CAPABILITIES="compute,video,utility,graphics" \
+  PATH="/opt/venv/bin:$PATH" \
+  UMASK="0002" \
+  LANG=C.UTF-8 \
+  TZ="Etc/UTC" \
+  USERNAME=rare \
+  UID=900 \
+  GID=900 \
+  PYTHONDONTWRITEBYTECODE=1 \
+  PYTHONUNBUFFERED=1 \
+  PYTHONFAULTHANDLER=1 \
+  PIP_ROOT_USER_ACTION=ignore \
+  PIP_NO_CACHE_DIR=1 \
+  PIP_DISABLE_PIP_VERSION_CHECK=1 \
+  PIP_BREAK_SYSTEM_PACKAGES=1 \
+  UV_HTTP_TIMEOUT=1000
+
+USER root
+WORKDIR /app
+
+RUN \
+  groupadd --gid ${GID} ${USERNAME} \
+  && useradd --uid ${UID} --gid ${GID} --create-home --shell /bin/bash ${USERNAME} \
+  && apt-get update && apt-get install -y --no-install-recommends \
+  curl unzip build-essential catatonit jq \
+  gnupg ca-certificates lsb-release \
+  nano vim tree git \
+  # -----------------------------------
+  # TODO: Build images with extras
+  # -----------------------------------
+  # htop tmux psmisc \
+  # socat rsync aria2 openssh-server \
+  # -----------------------------------
+  # TODO: Build images with RDMA & InfiniBand
+  # -----------------------------------
+  # libibverbs1 librdmacm1 \
+  # -----------------------------------
+  && git clone https://github.com/artidoro/qlora.git /tmp/app \
+  && cp -R /tmp/app/LICENSE /app \
+  && cp -R /tmp/app/qlora.py /app \
+  && cp -R /tmp/app/data /app \
+  && cp -R /tmp/app/eval /app \
+  && cp -R /tmp/app/examples /app \
+  && cp -R /tmp/app/scripts /app \
+  && printf "UpdateMethod=docker\nBranch=master\nPackageVersion=%s\nPackageAuthor=[RareCompute](https://github.com/RareCompute)\n" "${VERSION}" > /app/package_info \
+  && chown -R ${UID}:${GID} /app && chmod -R 755 /app \
+  && curl -LsSf https://astral.sh/uv/0.5.6/install.sh | sh \
+  && . $HOME/.local/bin/env \
+  && uv venv --no-python-downloads /opt/venv \
+  && . /opt/venv/bin/activate \
+  && cd /tmp/app \
+  && uv pip install torch torchvision torchaudio \
+  && uv pip install -r /tmp/app/requirements.txt \
+  && chown -R ${UID}:${GID} /opt/venv && chmod -R 755 /opt/venv \
+  && apt-get purge -y build-essential \
+  && apt-get autoremove -y \
+  && apt-get clean \
+  && rm -rf /root/.cache /var/lib/apt/lists/* /tmp/* /var/tmp/* \
+  && chsh -s /bin/bash
+
+COPY --chown=${UID}:${GID} ./apps/qlora/entrypoint.sh /entrypoint.sh
+RUN chmod -R 755 /entrypoint.sh
+
+USER ${USERNAME}
+WORKDIR /app
+
+ENTRYPOINT ["/usr/bin/catatonit", "--", "/entrypoint.sh"]
diff --git a/apps/qlora/README.md b/apps/qlora/README.md
@@ -0,0 +1,15 @@
+# QLoRA
+
+From the [upstream repository](https://github.com/artidoro/qlora):
+
+> We present QLoRA, an efficient finetuning approach that reduces memory usage enough to finetune a 65B parameter model on a single 48GB GPU while preserving full 16-bit finetuning task performance. QLoRA backpropagates gradients through a frozen, 4-bit quantized pretrained language model into Low Rank Adapters (LoRA). Our best model family, which we name Guanaco, outperforms all previous openly released models on the Vicuna benchmark, reaching 99.3% of the performance level of ChatGPT while only requiring 24 hours of finetuning on a single GPU. QLoRA introduces a number of innovations to save memory without sacrificing performance: (a) 4-bit NormalFloat (NF4), a new data type that is information theoretically optimal for normally distributed weights (b) Double Quantization to reduce the average memory footprint by quantizing the quantization constants, and (c) Paged Optimizers to manage memory spikes. We use QLoRA to finetune more than 1,000 models, providing a detailed analysis of instruction following and chatbot performance across 8 instruction datasets, multiple model types (LLaMA, T5), and model scales that would be infeasible to run with regular finetuning (e.g. 33B and 65B parameter models). Our results show that QLoRA finetuning on a small high-quality dataset leads to state-of-the-art results, even when using smaller models than the previous SoTA. We provide a detailed analysis of chatbot performance based on both human and GPT-4 evaluations showing that GPT-4 evaluations are a cheap and reasonable alternative to human evaluation. Furthermore, we find that current chatbot benchmarks are not trustworthy to accurately evaluate the performance levels of chatbots. We release all of our models and code, including CUDA kernels for 4-bit training.
+
+## Environment variables
+
+You can configure the Docker image using the environment variables below:
+
+| Environment Variable | CLI Flag | Type      | Default Value | Description                |
+| -------------------- | -------- | --------- | ------------- | -------------------------- |
+| `USERNAME`           | N/A      | `STR`     | `rare`        | Username for the container |
+| `UID`                | N/A      | `INTEGER` | `900`         | UID for the container      |
+| `GID`                | N/A      | `INTEGER` | `900`         | GID for the container      |
diff --git a/apps/qlora/ci/goss.yaml b/apps/qlora/ci/goss.yaml
@@ -0,0 +1,5 @@
+---
+# yaml-language-server: $schema=https://raw.githubusercontent.com/goss-org/goss/master/docs/schema.yaml
+file:
+  /app/LICENSE:
+    exists: true
diff --git a/apps/qlora/ci/latest.sh b/apps/qlora/ci/latest.sh
@@ -0,0 +1,7 @@
+#!/usr/bin/env bash
+git clone --quiet https://github.com/artidoro/qlora.git /tmp/qlora
+pushd /tmp/qlora > /dev/null || exit
+version=$(git rev-list --count --first-parent HEAD)
+popd > /dev/null || exit
+rm -rf /tmp/qlora
+printf "1.0.%d" "${version}"
diff --git a/apps/qlora/entrypoint.sh b/apps/qlora/entrypoint.sh
@@ -0,0 +1,7 @@
+#!/usr/bin/env bash
+
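+# Aliases are not expanded in non-interactive shells, and `exec` would bypass one anyway,
+# so call the interpreter directly.
+exec \
+  python /app/qlora.py \
+  "$@"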
diff --git a/apps/qlora/metadata.yaml b/apps/qlora/metadata.yaml
@@ -0,0 +1,11 @@
+---
+#yamllint disable
+app: qlora
+semver: true
+channels:
+  - name: stable
+    platforms: ["linux/amd64", "linux/arm64"]
+    stable: true
+    tests:
+      enabled: false
+      type: cli
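
Note on `apps/qlora/entrypoint.sh`: bash does not expand aliases in non-interactive shells unless `expand_aliases` is set, so an entrypoint script should invoke the interpreter directly rather than go through an alias. The sketch below illustrates that behaviour in isolation. It assumes `bash` is on `PATH`, and `qlora_demo` is a hypothetical alias name used only for this illustration, not something defined by the image.

```shell
#!/usr/bin/env bash
# A two-line script: define an alias, then try to use it on the next line.
# `qlora_demo` is a hypothetical name chosen for this demonstration only.
script='alias qlora_demo="echo alias-ran"
qlora_demo'

# Default non-interactive bash: the alias is defined but never expanded,
# so the second line is looked up as a command and is not found.
if output=$(bash -c "$script" 2>/dev/null); then status=0; else status=$?; fi

# Same script with alias expansion switched on explicitly.
if output_expanded=$(bash -c "shopt -s expand_aliases
$script" 2>/dev/null); then status_expanded=0; else status_expanded=$?; fi

echo "default:        status=$status output='$output'"
echo "expand_aliases: status=$status_expanded output='$output_expanded'"
```

In the default case the exit status is 127 (command not found). Even with `expand_aliases` enabled, `exec qlora` would still fail, because only the first word of a command is checked for aliases and that word is `exec`.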