Add examples/gke/tgi-multi-lora-deployment (#102)
* Update `doc-build.yml` and `doc-pr-build.yml` triggers

To also include modifications to `examples`, `Makefile`, and
`docs/scripts`, or anything under `docs/`

* Fix a typo in the `kubectl wait` command

* Add `tgi-multi-lora-deployment` example

Still pending the official release of the latest TGI DLC on Google
Cloud

* Add `imgs` and update `README.md`

* Add latest example to existing listings

TODO(alvarobartt): automate this shortly as it's not so straightforward
:/

* Add `GITHUB_BRANCH` to generate working links

* Fix indentation to use 4 spaces instead

* Fix indentation when converting `!NOTE` blocks to `Tip`

* Fix `replacement` function in `auto-generate-examples.py`

* Update `README.md`

* Update `README.md`, `docs/source/resources.mdx` and `examples/{cloud-run,gke,vertex-ai}/README.md`

Update the example listings via `python
scripts/internal/update_example_tables.py` so that they are generated
automatically (respecting the previous content within each file),
sort them as Vertex AI > GKE > Cloud Run, and apply some more fixes
and improvements

* Add `scripts/internal/update_example_tables.py`

* Update `scripts/internal/update_example_tables.py`

To include the `examples/` prefix within the paths to the examples
referenced from the root directory, i.e. in the `README.md` file

* Update `README.md`

* Update `docs/source/resources.mdx`

* Fix `scripts/internal/update_example_tables.py`

* Update `README.md`

Fix `examples/<service>/examples/<service>/...`

* Update `README.md`, `docs/source/resources.mdx` and `examples/gke/README.md`

* Escape nested backticks
alvarobartt authored Oct 10, 2024
1 parent 392c15b commit ceec771
Showing 17 changed files with 766 additions and 79 deletions.
5 changes: 4 additions & 1 deletion .github/workflows/doc-build.yml
@@ -6,7 +6,10 @@ on:
- main
- doc-builder*
paths:
- docs/source/**
- docs/**
- examples/**/*.md
- examples/**/*.ipynb
- Makefile
- .github/workflows/doc-build.yml

jobs:
7 changes: 6 additions & 1 deletion .github/workflows/doc-pr-build.yml
@@ -3,7 +3,10 @@ name: Build PR Documentation
on:
pull_request:
paths:
- docs/source/**
- docs/**
- examples/**/*.md
- examples/**/*.ipynb
- Makefile
- .github/workflows/doc-pr-build.yml

concurrency:
@@ -20,3 +23,5 @@ jobs:
package_name: google-cloud
additional_args: --not_python_module
pre_command: cd Google-Cloud-Containers && make docs
env:
GITHUB_BRANCH: ${{ github.head_ref || github.ref_name }}
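The `GITHUB_BRANCH` env var above (the PR head ref, falling back to the pushed ref name) is what lets the docs build generate links that point at the branch being built rather than hardcoding `main`. A minimal sketch of how the docs script consumes it, using the image-path rewrite shown later in this commit's `auto-generate-examples.py` diff:

```python
import os
import re

# Fall back to "main" when the env var isn't set (e.g. local builds),
# mirroring the default added in docs/scripts/auto-generate-examples.py.
GITHUB_BRANCH = os.getenv("GITHUB_BRANCH", "main")


def rewrite_image_links(content: str, root: str) -> str:
    """Rewrite relative image paths to raw.githubusercontent.com URLs on the current branch."""
    return re.sub(
        r"\(\./(imgs|assets)/([^)]*\.png)\)",
        rf"(https://raw.githubusercontent.com/huggingface/Google-Cloud-Containers/{GITHUB_BRANCH}/"
        + root
        + r"/\1/\2)",
        content,
    )


print(rewrite_image_links("![arch](./imgs/diagram.png)", "examples/gke/tgi-multi-lora-deployment"))
```

With `GITHUB_BRANCH` unset this yields a `.../main/examples/gke/tgi-multi-lora-deployment/imgs/diagram.png` URL; on a PR build it substitutes the head branch instead.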
33 changes: 17 additions & 16 deletions README.md
@@ -44,27 +44,28 @@ The [`examples`](./examples) directory contains examples for using the container

| Service | Example | Title |
| --------- | ------------------------------------------------------------------------------------------------------------------------------------------ | --------------------------------------------------------------------------- |
| Vertex AI | [examples/vertex-ai/notebooks/trl-lora-sft-fine-tuning-on-vertex-ai](./examples/vertex-ai/notebooks/trl-lora-sft-fine-tuning-on-vertex-ai) | Fine-tune Gemma 2B with PyTorch Training DLC using SFT + LoRA on Vertex AI |
| Vertex AI | [examples/vertex-ai/notebooks/trl-full-sft-fine-tuning-on-vertex-ai](./examples/vertex-ai/notebooks/trl-full-sft-fine-tuning-on-vertex-ai) | Fine-tune Mistral 7B v0.3 with PyTorch Training DLC using SFT on Vertex AI |
| GKE | [examples/gke/trl-full-fine-tuning](./examples/gke/trl-full-fine-tuning) | Fine-tune Gemma 2B with PyTorch Training DLC using SFT on GKE |
| GKE | [examples/gke/trl-lora-fine-tuning](./examples/gke/trl-lora-fine-tuning) | Fine-tune Mistral 7B v0.3 with PyTorch Training DLC using SFT + LoRA on GKE |
| Vertex AI | [examples/vertex-ai/notebooks/trl-full-sft-fine-tuning-on-vertex-ai](./examples/vertex-ai/notebooks/trl-full-sft-fine-tuning-on-vertex-ai) | Fine-tune Mistral 7B v0.3 with PyTorch Training DLC using SFT on Vertex AI |
| Vertex AI | [examples/vertex-ai/notebooks/trl-lora-sft-fine-tuning-on-vertex-ai](./examples/vertex-ai/notebooks/trl-lora-sft-fine-tuning-on-vertex-ai) | Fine-tune Gemma 2B with PyTorch Training DLC using SFT + LoRA on Vertex AI |

### Inference Examples

| Service | Example | Title |
| --------- | ------------------------------------------------------------------------------------------------------------------------------------------------------ | ---------------------------------------------------------- |
| GKE | [examples/gke/tgi-deployment](./examples/gke/tgi-deployment) | Deploy Meta Llama 3 8B with TGI DLC on GKE |
| GKE | [examples/gke/tgi-from-gcs-deployment](./examples/gke/tgi-from-gcs-deployment) | Deploy Qwen2 7B with TGI DLC from GCS on GKE |
| GKE | [examples/gke/tgi-llama-405b-deployment](./examples/gke/tgi-llama-405b-deployment) | Deploy Llama 3.1 405B with TGI DLC on GKE |
| GKE | [examples/gke/tei-deployment](./examples/gke/tei-deployment) | Deploy Snowflake's Arctic Embed with TEI DLC on GKE |
| GKE | [examples/gke/tei-from-gcs-deployment](./examples/gke/tei-from-gcs-deployment) | Deploy BGE Base v1.5 with TEI DLC from GCS on GKE |
| Vertex AI | [examples/vertex-ai/notebooks/deploy-bert-on-vertex-ai](./examples/vertex-ai/notebooks/deploy-bert-on-vertex-ai) | Deploy BERT Models with PyTorch Inference DLC on Vertex AI |
| Vertex AI | [examples/vertex-ai/notebooks/deploy-embedding-on-vertex-ai](./examples/vertex-ai/notebooks/deploy-embedding-on-vertex-ai) | Deploy Embedding Models with TEI DLC on Vertex AI |
| Vertex AI | [examples/vertex-ai/notebooks/deploy-gemma-on-vertex-ai](./examples/vertex-ai/notebooks/deploy-gemma-on-vertex-ai) | Deploy Gemma 7B with TGI DLC on Vertex AI |
| Vertex AI | [examples/vertex-ai/notebooks/deploy-gemma-from-gcs-on-vertex-ai](./examples/vertex-ai/notebooks/deploy-gemma-from-gcs-on-vertex-ai) | Deploy Gemma 7B with TGI DLC from GCS on Vertex AI |
| Vertex AI | [examples/vertex-ai/notebooks/deploy-flux-on-vertex-ai](./examples/vertex-ai/notebooks/deploy-flux-on-vertex-ai) | Deploy FLUX with PyTorch Inference DLC on Vertex AI |
| Vertex AI | [examples/vertex-ai/notebooks/deploy-llama-3-1-405b-on-vertex-ai](./examples/vertex-ai/notebooks/deploy-llama-405b-on-vertex-ai/vertex-notebook.ipynb) | Deploy Meta Llama 3.1 405B with TGI DLC on Vertex AI |
| Cloud Run | [examples/cloud-run/tgi-deployment](./examples/cloud-run/tgi-deployment/README.md) | Deploy Meta Llama 3.1 with TGI DLC on Cloud Run |
| Service | Example | Title |
| --------- | ------------------------------------------------------------------------------------------------------------------------------------ | ------------------------------------------------------------- |
| Vertex AI | [examples/vertex-ai/notebooks/deploy-bert-on-vertex-ai](./examples/vertex-ai/notebooks/deploy-bert-on-vertex-ai) | Deploy BERT Models with PyTorch Inference DLC on Vertex AI |
| Vertex AI | [examples/vertex-ai/notebooks/deploy-embedding-on-vertex-ai](./examples/vertex-ai/notebooks/deploy-embedding-on-vertex-ai) | Deploy Embedding Models with TEI DLC on Vertex AI |
| Vertex AI | [examples/vertex-ai/notebooks/deploy-flux-on-vertex-ai](./examples/vertex-ai/notebooks/deploy-flux-on-vertex-ai) | Deploy FLUX with PyTorch Inference DLC on Vertex AI |
| Vertex AI | [examples/vertex-ai/notebooks/deploy-gemma-from-gcs-on-vertex-ai](./examples/vertex-ai/notebooks/deploy-gemma-from-gcs-on-vertex-ai) | Deploy Gemma 7B with TGI DLC from GCS on Vertex AI |
| Vertex AI | [examples/vertex-ai/notebooks/deploy-gemma-on-vertex-ai](./examples/vertex-ai/notebooks/deploy-gemma-on-vertex-ai) | Deploy Gemma 7B with TGI DLC on Vertex AI |
| Vertex AI | [examples/vertex-ai/notebooks/deploy-llama-3-1-405b-on-vertex-ai](./examples/vertex-ai/notebooks/deploy-llama-3-1-405b-on-vertex-ai) | Deploy Meta Llama 3.1 405B with TGI DLC on Vertex AI |
| GKE | [examples/gke/tei-from-gcs-deployment](./examples/gke/tei-from-gcs-deployment) | Deploy BGE Base v1.5 with TEI DLC from GCS on GKE |
| GKE | [examples/gke/tgi-multi-lora-deployment](./examples/gke/tgi-multi-lora-deployment) | Deploy Gemma2 with multiple LoRA adapters with TGI DLC on GKE |
| GKE | [examples/gke/tgi-llama-405b-deployment](./examples/gke/tgi-llama-405b-deployment) | Deploy Llama 3.1 405B with TGI DLC on GKE |
| GKE | [examples/gke/tgi-deployment](./examples/gke/tgi-deployment) | Deploy Meta Llama 3 8B with TGI DLC on GKE |
| GKE | [examples/gke/tgi-from-gcs-deployment](./examples/gke/tgi-from-gcs-deployment) | Deploy Qwen2 7B with TGI DLC from GCS on GKE |
| GKE | [examples/gke/tei-deployment](./examples/gke/tei-deployment) | Deploy Snowflake's Arctic Embed with TEI DLC on GKE |
| Cloud Run | [examples/cloud-run/tgi-deployment](./examples/cloud-run/tgi-deployment) | Deploy Meta Llama 3.1 8B with TGI DLC on Cloud Run |

### Evaluation

24 changes: 12 additions & 12 deletions docs/scripts/auto-generate-examples.py
@@ -1,6 +1,8 @@
import os
import re

GITHUB_BRANCH = os.getenv("GITHUB_BRANCH", "main")


def process_readme_files():
print("Processing README.md files from examples/gke and examples/cloud-run...")
@@ -35,37 +37,32 @@ def process_file(root, file, dir):
# Replace image and link paths
content = re.sub(
r"\(\./(imgs|assets)/([^)]*\.png)\)",
r"(https://raw.githubusercontent.com/huggingface/Google-Cloud-Containers/main/"
rf"(https://raw.githubusercontent.com/huggingface/Google-Cloud-Containers/{GITHUB_BRANCH}/"
+ root
+ r"/\1/\2)",
content,
)
content = re.sub(
r"\(\.\./([^)]+)\)",
r"(https://github.com/huggingface/Google-Cloud-Containers/tree/main/examples/"
rf"(https://github.com/huggingface/Google-Cloud-Containers/tree/{GITHUB_BRANCH}/examples/"
+ dir
+ r"/\1)",
content,
)
content = re.sub(
r"\(\.\/([^)]+)\)",
r"(https://github.com/huggingface/Google-Cloud-Containers/tree/main/"
rf"(https://github.com/huggingface/Google-Cloud-Containers/tree/{GITHUB_BRANCH}/"
+ root
+ r"/\1)",
content,
)

# Regular expression to match the specified blocks
pattern = r"> \[!(NOTE|WARNING)\]\n((?:> .*\n)+)"

def replacement(match):
block_type = match.group(1)
content = match.group(2)

# Remove '> ' from the beginning of each line and strip whitespace
lines = [
line.lstrip("> ").strip() for line in content.split("\n") if line.strip()
]
# Remove '> ' from the beginning of each line
lines = [line[2:] for line in content.split("\n") if line.strip()]

# Determine the Tip type
tip_type = " warning" if block_type == "WARNING" else ""
@@ -77,11 +74,14 @@ def replacement(match):

return new_block

# Regular expression to match the specified blocks
pattern = r"> \[!(NOTE|WARNING)\]\n((?:>.*(?:\n|$))+)"

# Perform the transformation
content = re.sub(pattern, replacement, content, flags=re.MULTILINE)

# Remove blockquotes
content = re.sub(r"^(>[ ]*)+", "", content, flags=re.MULTILINE)
# Remove any remaining '>' or '> ' at the beginning of lines
content = re.sub(r"^>[ ]?", "", content, flags=re.MULTILINE)

# Check for remaining relative paths
if re.search(r"\(\.\./|\(\./", content):
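The key fix in this diff is the alert-block pattern: `(?:>.*(?:\n|$))+` now matches quoted lines even when they are bare `>` with no trailing space, and the stripped lines are sliced with `line[2:]` instead of `lstrip("> ")` (which would also eat leading `>` characters inside the text). A self-contained sketch of the fixed conversion — note the exact `<Tip>` wrapper format is an assumption here, since the `new_block` construction falls outside the shown hunk:

```python
import re


def convert_alerts(content: str) -> str:
    """Convert GitHub-style > [!NOTE]/[!WARNING] alerts into doc-builder <Tip> blocks."""
    # Fixed pattern: matches '>' lines with or without a trailing space
    pattern = r"> \[!(NOTE|WARNING)\]\n((?:>.*(?:\n|$))+)"

    def replacement(match: re.Match) -> str:
        block_type = match.group(1)
        body = match.group(2)
        # Slice off the leading '> ' from each non-empty line (as in the fixed script)
        lines = [line[2:] for line in body.split("\n") if line.strip()]
        tip_type = " warning" if block_type == "WARNING" else ""
        # Assumed doc-builder output format for the converted block
        return f"<Tip{tip_type}>\n\n" + "\n".join(lines) + "\n\n</Tip>\n"

    content = re.sub(pattern, replacement, content, flags=re.MULTILINE)
    # Remove any remaining '>' or '> ' at the beginning of lines
    return re.sub(r"^>[ ]?", "", content, flags=re.MULTILINE)


print(convert_alerts("> [!WARNING]\n> Use a GPU node pool.\n"))
```

For the input above this produces a `<Tip warning>` block containing the quoted line with the blockquote markers stripped.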
49 changes: 24 additions & 25 deletions docs/source/resources.mdx
@@ -24,45 +24,44 @@ Learn how to use Hugging Face in Google Cloud by reading our blog posts, present

- [All examples](https://github.com/huggingface/Google-Cloud-Containers/tree/main/examples)

### GKE

- Training

- [Fine-tune Gemma 2B with PyTorch Training DLC using SFT on GKE](https://github.com/huggingface/Google-Cloud-Containers/blob/main/examples/gke/trl-full-fine-tuning)
- [Fine-tune Mistral 7B v0.3 with PyTorch Training DLC using SFT + LoRA on GKE](https://github.com/huggingface/Google-Cloud-Containers/blob/main/examples/gke/trl-lora-fine-tuning)
### Vertex AI

- Inference

- [Deploy Meta Llama 3 8B with TGI DLC on GKE](https://github.com/huggingface/Google-Cloud-Containers/tree/main/examples/gke/tgi-deployment)
- [Deploy Llama3 8B with TGI DLC on GKE](https://github.com/huggingface/Google-Cloud-Containers/tree/main/examples/gke/tgi-from-gcs-deployment)
- [Deploy Llama 3.1 405B with TGI DLC on GKE](https://github.com/huggingface/Google-Cloud-Containers/tree/main/examples/gke/tgi-from-gcs-deployment)
- [Deploy Snowflake's Arctic Embed with TEI DLC on GKE](https://github.com/huggingface/Google-Cloud-Containers/tree/main/examples/gke/tei-deployment)
- [Deploy BGE Base v1.5 with TEI DLC from GCS on GKE](https://github.com/huggingface/Google-Cloud-Containers/tree/main/examples/gke/tei-from-gcs-deployment)

### Vertex AI
- [Deploy BERT Models with PyTorch Inference DLC on Vertex AI](https://github.com/huggingface/Google-Cloud-Containers/tree/main/examples/vertex-ai/notebooks/deploy-bert-on-vertex-ai)
- [Deploy Embedding Models with TEI DLC on Vertex AI](https://github.com/huggingface/Google-Cloud-Containers/tree/main/examples/vertex-ai/notebooks/deploy-embedding-on-vertex-ai)
- [Deploy FLUX with PyTorch Inference DLC on Vertex AI](https://github.com/huggingface/Google-Cloud-Containers/tree/main/examples/vertex-ai/notebooks/deploy-flux-on-vertex-ai)
- [Deploy Gemma 7B with TGI DLC from GCS on Vertex AI](https://github.com/huggingface/Google-Cloud-Containers/tree/main/examples/vertex-ai/notebooks/deploy-gemma-from-gcs-on-vertex-ai)
- [Deploy Gemma 7B with TGI DLC on Vertex AI](https://github.com/huggingface/Google-Cloud-Containers/tree/main/examples/vertex-ai/notebooks/deploy-gemma-on-vertex-ai)
- [Deploy Meta Llama 3.1 405B with TGI DLC on Vertex AI](https://github.com/huggingface/Google-Cloud-Containers/tree/main/examples/vertex-ai/notebooks/deploy-llama-3-1-405b-on-vertex-ai)

- Training

- [Fine-tune Mistral 7B v0.3 with PyTorch Training DLC using SFT on Vertex AI](https://github.com/huggingface/Google-Cloud-Containers/blob/main/examples/vertex-ai/notebooks/trl-full-sft-fine-tuning-on-vertex-ai/vertex-notebook.ipynb)
- [Fine-tune Gemma 2B with PyTorch Training DLC using SFT + LoRA on Vertex AI](https://github.com/huggingface/Google-Cloud-Containers/blob/main/examples/vertex-ai/notebooks/trl-lora-sft-fine-tuning-on-vertex-ai/vertex-notebook.ipynb)
- [Fine-tune Gemma 2B with PyTorch Training DLC using SFT + LoRA on Vertex AI](https://github.com/huggingface/Google-Cloud-Containers/tree/main/examples/vertex-ai/notebooks/trl-lora-sft-fine-tuning-on-vertex-ai)
- [Fine-tune Mistral 7B v0.3 with PyTorch Training DLC using SFT on Vertex AI](https://github.com/huggingface/Google-Cloud-Containers/tree/main/examples/vertex-ai/notebooks/trl-full-sft-fine-tuning-on-vertex-ai)

- Inference
- Evaluation

- [Deploy BERT Models with PyTorch Inference DLC on Vertex AI](https://github.com/huggingface/Google-Cloud-Containers/tree/main/examples/vertex-ai/notebooks/deploy-bert-on-vertex-ai/vertex-notebook.ipynb)
- [Deploy Embedding Models with TEI DLC on Vertex AI](https://github.com/huggingface/Google-Cloud-Containers/tree/main/examples/vertex-ai/notebooks/deploy-embedding-on-vertex-ai/vertex-notebook.ipynb)
- [Deploy Gemma 7B with TGI DLC on Vertex AI](https://github.com/huggingface/Google-Cloud-Containers/tree/main/examples/vertex-ai/notebooks/deploy-gemma-on-vertex-ai/vertex-notebook.ipynb)
- [Deploy Gemma 7B with TGI DLC from GCS on Vertex AI](https://github.com/huggingface/Google-Cloud-Containers/tree/main/examples/vertex-ai/notebooks/deploy-gemma-from-gcs-on-vertex-ai/vertex-notebook.ipynb)
- [Deploy FLUX with PyTorch Inference DLC on Vertex AI](https://github.com/huggingface/Google-Cloud-Containers/tree/main/examples/vertex-ai/notebooks/deploy-flux-on-vertex-ai/vertex-notebook.ipynb)
- [Deploy Meta Llama 3.1 405B with TGI DLC on Vertex AI](https://github.com/huggingface/Google-Cloud-Containers/tree/main/examples/vertex-ai/notebooks/deploy-llama-3-1-405b-on-vertex-ai/vertex-notebook.ipynb)
- [Evaluate open LLMs with Vertex AI and Gemini](https://github.com/huggingface/Google-Cloud-Containers/tree/main/examples/vertex-ai/notebooks/evaluate-llms-with-vertex-ai)

### GKE

- Evaluation
- Inference

- [Evaluate open LLMs with Vertex AI and Gemini](https://github.com/huggingface/Google-Cloud-Containers/tree/main/examples/vertex-ai/notebooks/evaluate-llms-with-vertex-ai)
- [Deploy BGE Base v1.5 with TEI DLC from GCS on GKE](https://github.com/huggingface/Google-Cloud-Containers/tree/main/examples/gke/tei-from-gcs-deployment)
- [Deploy Gemma2 with multiple LoRA adapters with TGI DLC on GKE](https://github.com/huggingface/Google-Cloud-Containers/tree/main/examples/gke/tgi-multi-lora-deployment)
- [Deploy Llama 3.1 405B with TGI DLC on GKE](https://github.com/huggingface/Google-Cloud-Containers/tree/main/examples/gke/tgi-llama-405b-deployment)
- [Deploy Meta Llama 3 8B with TGI DLC on GKE](https://github.com/huggingface/Google-Cloud-Containers/tree/main/examples/gke/tgi-deployment)
- [Deploy Qwen2 7B with TGI DLC from GCS on GKE](https://github.com/huggingface/Google-Cloud-Containers/tree/main/examples/gke/tgi-from-gcs-deployment)
- [Deploy Snowflake's Arctic Embed with TEI DLC on GKE](https://github.com/huggingface/Google-Cloud-Containers/tree/main/examples/gke/tei-deployment)

- Training

- [Fine-tune Gemma 2B with PyTorch Training DLC using SFT on GKE](https://github.com/huggingface/Google-Cloud-Containers/tree/main/examples/gke/trl-full-fine-tuning)
- [Fine-tune Mistral 7B v0.3 with PyTorch Training DLC using SFT + LoRA on GKE](https://github.com/huggingface/Google-Cloud-Containers/tree/main/examples/gke/trl-lora-fine-tuning)

### (Preview) Cloud Run

- Inference

- [Deploy Meta Llama 3.1 with TGI DLC on Cloud Run](https://github.com/huggingface/Google-Cloud-Containers/tree/main/examples/cloud-run/tgi-deployment)
- [Deploy Meta Llama 3.1 8B with TGI DLC on Cloud Run](https://github.com/huggingface/Google-Cloud-Containers/tree/main/examples/cloud-run/tgi-deployment)
15 changes: 8 additions & 7 deletions examples/gke/README.md
@@ -11,10 +11,11 @@ This directory contains usage examples of the Hugging Face Deep Learning Contain

## Inference Examples

| Example | Title |
| -------------------------------------------------------- | --------------------------------------------------- |
| [tgi-deployment](./tgi-deployment) | Deploy Meta Llama 3 8B with TGI DLC on GKE |
| [tgi-from-gcs-deployment](./tgi-from-gcs-deployment) | Deploy Qwen2 7B with TGI DLC from GCS on GKE |
| [tgi-llama-405b-deployment](./tgi-llama-405b-deployment) | Deploy Llama 3.1 405B with TGI DLC on GKE |
| [tei-deployment](./tei-deployment) | Deploy Snowflake's Arctic Embed with TEI DLC on GKE |
| [tei-from-gcs-deployment](./tei-from-gcs-deployment) | Deploy BGE Base v1.5 with TEI DLC from GCS on GKE |
| Example | Title |
| -------------------------------------------------------- | ------------------------------------------------------------- |
| [tei-deployment](./tei-deployment) | Deploy Snowflake's Arctic Embed with TEI DLC on GKE |
| [tei-from-gcs-deployment](./tei-from-gcs-deployment) | Deploy BGE Base v1.5 with TEI DLC from GCS on GKE |
| [tgi-deployment](./tgi-deployment) | Deploy Meta Llama 3 8B with TGI DLC on GKE |
| [tgi-from-gcs-deployment](./tgi-from-gcs-deployment) | Deploy Qwen2 7B with TGI DLC from GCS on GKE |
| [tgi-llama-405b-deployment](./tgi-llama-405b-deployment) | Deploy Llama 3.1 405B with TGI DLC on GKE |
| [tgi-multi-lora-deployment](./tgi-multi-lora-deployment) | Deploy Gemma2 with multiple LoRA adapters with TGI DLC on GKE |
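Per the commit message, these tables are now regenerated via `scripts/internal/update_example_tables.py`, ordering services as Vertex AI > GKE > Cloud Run and sorting alphabetically within each. The actual script is not shown in this diff, so the following is a hypothetical sketch of that sorting and table-rendering logic:

```python
# Hypothetical sketch of the ordering described in the commit message;
# the real scripts/internal/update_example_tables.py is not shown in the diff.
SERVICE_ORDER = {"Vertex AI": 0, "GKE": 1, "Cloud Run": 2}


def sort_examples(rows):
    """Sort (service, path, title) rows: Vertex AI > GKE > Cloud Run, then by path."""
    return sorted(rows, key=lambda r: (SERVICE_ORDER[r[0]], r[1]))


def to_markdown_table(rows):
    """Render sorted rows as a Markdown table like the ones in README.md."""
    header = "| Service | Example | Title |\n| --- | --- | --- |"
    lines = [f"| {s} | [{p}](./{p}) | {t} |" for s, p, t in rows]
    return "\n".join([header, *lines])


rows = [
    ("GKE", "examples/gke/tgi-multi-lora-deployment", "Deploy Gemma2 with multiple LoRA adapters with TGI DLC on GKE"),
    ("Cloud Run", "examples/cloud-run/tgi-deployment", "Deploy Meta Llama 3.1 8B with TGI DLC on Cloud Run"),
    ("Vertex AI", "examples/vertex-ai/notebooks/deploy-gemma-on-vertex-ai", "Deploy Gemma 7B with TGI DLC on Vertex AI"),
]
print(to_markdown_table(sort_examples(rows)))
```

This ordering matches the regenerated tables in this commit: Vertex AI rows first, then GKE, then Cloud Run, alphabetical by example path within each service.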