Commit

Add no AVX2 variant for default-nvidia
Atinoda committed Jul 26, 2024
1 parent 714f6a3 commit 4676bf2
Showing 2 changed files with 25 additions and 0 deletions.
24 changes: 24 additions & 0 deletions Dockerfile
@@ -48,6 +48,21 @@ FROM app_nvidia AS app_nvidia_x
RUN chmod +x /scripts/build_extensions.sh && \
. /scripts/build_extensions.sh

# Base No AVX2
FROM app_base AS app_nvidia_noavx2
# Install pytorch for CUDA 12.1
RUN pip3 install torch==2.2.2 torchvision==0.17.2 torchaudio==2.2.2 \
--index-url https://download.pytorch.org/whl/cu121
# Install oobabooga/text-generation-webui
RUN ls /app
RUN pip3 install -r /app/requirements_noavx2.txt

# Extended No AVX2
FROM app_nvidia_noavx2 AS app_nvidia_noavx2_x
# Install extensions
RUN chmod +x /scripts/build_extensions.sh && \
. /scripts/build_extensions.sh


# ROCM [Untested. Widen your hardware support, AMD!]
# Base
@@ -159,6 +174,15 @@ RUN echo "Nvidia Extended" > /variant.txt
ENV EXTRA_LAUNCH_ARGS=""
CMD ["python3", "/app/server.py"]

# Extended without AVX2
FROM run_base AS default-nvidia-noavx2
# Copy venv
COPY --from=app_nvidia_noavx2_x $VIRTUAL_ENV $VIRTUAL_ENV
# Variant parameters
RUN echo "Nvidia Extended (No AVX2)" > /variant.txt
ENV EXTRA_LAUNCH_ARGS=""
CMD ["python3", "/app/server.py"]


# ROCM
# Base
1 change: 1 addition & 0 deletions README.md
@@ -38,6 +38,7 @@ Choose the desired variant by setting the image `:tag` in `docker-compose.yml` u
| Platform | Description |
|---|---|
| `*-nvidia` | CUDA 12.1 inference acceleration. |
| `*-nvidia-noavx2` | CUDA 12.1 inference acceleration with no AVX2 CPU instructions. *Typical use-case is legacy CPU with modern GPU.* |
| `*-cpu` | CPU-only inference. *Has become surprisingly fast since the early days!* |
| `*-rocm` | ROCM 5.6 inference acceleration. *Experimental and unstable.* |
| `*-arc` | Intel Arc XPU and oneAPI inference acceleration. **Not compatible with Intel integrated GPU (iGPU).** *Experimental and unstable.* |
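
To decide between `*-nvidia` and `*-nvidia-noavx2`, it can help to check whether the host CPU advertises AVX2. The following is a minimal sketch, not part of this repository: it assumes a Linux x86 host that exposes CPU flags in `/proc/cpuinfo`, and the helper name `pick_platform_suffix` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper (not part of the repository).
# Reads CPU flag text in Linux /proc/cpuinfo format from stdin and prints
# the platform suffix that matches the table above.
pick_platform_suffix() {
    # -w matches "avx2" only as a whole word, so "avx" or "avx512f" alone do not count.
    if grep -qw avx2; then
        echo "nvidia"
    else
        echo "nvidia-noavx2"
    fi
}

# On a Linux host, feed the real CPU flags to the helper.
if [ -r /proc/cpuinfo ]; then
    pick_platform_suffix < /proc/cpuinfo
fi
```

The printed suffix is only a hint for choosing the image `:tag`; the tag prefix still has to be picked separately.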