Add examples/gke/tgi-multi-lora-deployment #102

Merged · 20 commits · Oct 10, 2024
5 changes: 4 additions & 1 deletion .github/workflows/doc-build.yml
@@ -6,7 +6,10 @@ on:
- main
- doc-builder*
paths:
- docs/source/**
- docs/**
- examples/**/*.md
- examples/**/*.ipynb
- Makefile
- .github/workflows/doc-build.yml

jobs:
7 changes: 6 additions & 1 deletion .github/workflows/doc-pr-build.yml
@@ -3,7 +3,10 @@ name: Build PR Documentation
on:
pull_request:
paths:
- docs/source/**
- docs/**
- examples/**/*.md
- examples/**/*.ipynb
- Makefile
- .github/workflows/doc-pr-build.yml

concurrency:
@@ -20,3 +23,5 @@ jobs:
package_name: google-cloud
additional_args: --not_python_module
pre_command: cd Google-Cloud-Containers && make docs
env:
GITHUB_BRANCH: ${{ github.head_ref || github.ref_name }}
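The `GITHUB_BRANCH` expression above resolves to the PR head ref when available and falls back to the push ref name otherwise. A minimal sketch of that resolution logic (the branch names used here are hypothetical):

```python
# Sketch of the workflow expression `${{ github.head_ref || github.ref_name }}`.
# `head_ref` is only populated on pull_request events; on push events it is
# empty, so the expression falls through to `ref_name`.
def resolve_branch(head_ref: str, ref_name: str) -> str:
    return head_ref or ref_name


print(resolve_branch("add-multi-lora-example", "refs/pull/102/merge"))  # PR build
print(resolve_branch("", "main"))  # push build
```

This is what lets PR documentation previews link back to files on the PR branch instead of `main`.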
35 changes: 17 additions & 18 deletions README.md
@@ -51,24 +51,23 @@ The [`examples`](./examples) directory contains examples for using the container

### Inference Examples


| Service | Example | Description |
| --------- | ------------------------------------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------- |
| GKE | [tgi-deployment](./examples/gke/tgi-deployment) | Deploying Llama3 8B with Text Generation Inference (TGI) on GKE. |
| GKE | [tgi-from-gcs-deployment](./examples/gke/tgi-from-gcs-deployment) | Deploying Qwen2 7B Instruct with Text Generation Inference (TGI) from a GCS Bucket on GKE. |
| GKE | [tei-deployment](./examples/gke/tei-deployment) | Deploying Snowflake's Arctic Embed (M) with Text Embeddings Inference (TEI) on GKE. |
| GKE | [tei-from-gcs-deployment](./examples/gke/tei-from-gcs-deployment) | Deploying BGE Base v1.5 (English) with Text Embeddings Inference (TEI) from a GCS Bucket on GKE. |
| Vertex AI | [deploy-bert-on-vertex-ai](./examples/vertex-ai/notebooks/deploy-bert-on-vertex-ai) | Deploying a BERT model for a text classification task using `huggingface-inference-toolkit` for a Custom Prediction Routine (CPR) on Vertex AI. |
| Vertex AI | [deploy-embedding-on-vertex-ai](./examples/vertex-ai/notebooks/deploy-embedding-on-vertex-ai) | Deploying an embedding model with Text Embeddings Inference (TEI) on Vertex AI. |
| Vertex AI | [deploy-gemma-on-vertex-ai](./examples/vertex-ai/notebooks/deploy-gemma-on-vertex-ai) | Deploying Gemma 7B Instruct with Text Generation Inference (TGI) on Vertex AI. |
| Vertex AI | [deploy-gemma-from-gcs-on-vertex-ai](./examples/vertex-ai/notebooks/deploy-gemma-from-gcs-on-vertex-ai) | Deploying Gemma 7B Instruct with Text Generation Inference (TGI) from a GCS Bucket on Vertex AI. |
| Vertex AI | [deploy-flux-on-vertex-ai](./examples/vertex-ai/notebooks/deploy-flux-on-vertex-ai) | Deploying FLUX with Hugging Face PyTorch DLCs for Inference on Vertex AI. |
| Vertex AI | [deploy-llama-3-1-405b-on-vertex-ai](./examples/vertex-ai/notebooks/deploy-llama-405b-on-vertex-ai/vertex-notebook.ipynb) | Deploying Meta Llama 3.1 405B in FP8 with Hugging Face DLC for TGI on Vertex AI. |
| Cloud Run | [tgi-deployment](./examples/cloud-run/tgi-deployment/README.md) | Deploying Meta Llama 3.1 8B with Text Generation Inference on Cloud Run. |

| Service | Example | Title |
| --------- | ------------------------------------------------------------------------------------------------------------------------------------------------------ | ------------------------------------------------------------- |
| GKE | [examples/gke/tgi-deployment](./examples/gke/tgi-deployment) | Deploy Meta Llama 3 8B with TGI DLC on GKE |
| GKE | [examples/gke/tgi-from-gcs-deployment](./examples/gke/tgi-from-gcs-deployment) | Deploy Qwen2 7B with TGI DLC from GCS on GKE |
| GKE | [examples/gke/tgi-multi-lora-deployment](./examples/gke/tgi-multi-lora-deployment) | Deploy Gemma2 with multiple LoRA adapters with TGI DLC on GKE |
| GKE | [examples/gke/tei-deployment](./examples/gke/tei-deployment) | Deploy Snowflake's Arctic Embed with TEI DLC on GKE |
| GKE | [examples/gke/tei-from-gcs-deployment](./examples/gke/tei-from-gcs-deployment) | Deploy BGE Base v1.5 with TEI DLC from GCS on GKE |
| Vertex AI | [examples/vertex-ai/notebooks/deploy-bert-on-vertex-ai](./examples/vertex-ai/notebooks/deploy-bert-on-vertex-ai) | Deploy BERT Models with PyTorch Inference DLC on Vertex AI |
| Vertex AI | [examples/vertex-ai/notebooks/deploy-embedding-on-vertex-ai](./examples/vertex-ai/notebooks/deploy-embedding-on-vertex-ai) | Deploy Embedding Models with TEI DLC on Vertex AI |
| Vertex AI | [examples/vertex-ai/notebooks/deploy-gemma-on-vertex-ai](./examples/vertex-ai/notebooks/deploy-gemma-on-vertex-ai) | Deploy Gemma 7B with TGI DLC on Vertex AI |
| Vertex AI | [examples/vertex-ai/notebooks/deploy-gemma-from-gcs-on-vertex-ai](./examples/vertex-ai/notebooks/deploy-gemma-from-gcs-on-vertex-ai) | Deploy Gemma 7B with TGI DLC from GCS on Vertex AI |
| Vertex AI | [examples/vertex-ai/notebooks/deploy-flux-on-vertex-ai](./examples/vertex-ai/notebooks/deploy-flux-on-vertex-ai) | Deploy FLUX with PyTorch Inference DLC on Vertex AI |
| Vertex AI | [examples/vertex-ai/notebooks/deploy-llama-3-1-405b-on-vertex-ai](./examples/vertex-ai/notebooks/deploy-llama-405b-on-vertex-ai/vertex-notebook.ipynb) | Deploy Meta Llama 3.1 405B with TGI DLC on Vertex AI |
| Cloud Run | [examples/cloud-run/tgi-deployment](./examples/cloud-run/tgi-deployment/README.md) | Deploy Meta Llama 3.1 with TGI DLC on Cloud Run |

### Evaluation

| Service | Example | Description |
| --------- | ------------------------------------------------------------------------------------------- | ----------------------------------------------- |
| Vertex AI | [evaluate-llms-with-vertex-ai](./examples/vertex-ai/notebooks/evaluate-llms-with-vertex-ai) | Evaluating open LLMs with Vertex AI and Gemini. |
| Service | Example | Title |
| --------- | ------------------------------------------------------------------------------------------------------------------------ | -------------------------------------------- |
| Vertex AI | [examples/vertex-ai/notebooks/evaluate-llms-with-vertex-ai](./examples/vertex-ai/notebooks/evaluate-llms-with-vertex-ai) | Evaluate open LLMs with Vertex AI and Gemini |
24 changes: 12 additions & 12 deletions docs/scripts/auto-generate-examples.py
@@ -1,6 +1,8 @@
import os
import re

GITHUB_BRANCH = os.getenv("GITHUB_BRANCH", "main")


def process_readme_files():
print("Processing README.md files from examples/gke and examples/cloud-run...")
@@ -35,37 +37,32 @@ def process_file(root, file, dir):
# Replace image and link paths
content = re.sub(
r"\(\./(imgs|assets)/([^)]*\.png)\)",
r"(https://raw.githubusercontent.com/huggingface/Google-Cloud-Containers/main/"
rf"(https://raw.githubusercontent.com/huggingface/Google-Cloud-Containers/{GITHUB_BRANCH}/"
+ root
+ r"/\1/\2)",
content,
)
content = re.sub(
r"\(\.\./([^)]+)\)",
r"(https://github.com/huggingface/Google-Cloud-Containers/tree/main/examples/"
rf"(https://github.com/huggingface/Google-Cloud-Containers/tree/{GITHUB_BRANCH}/examples/"
+ dir
+ r"/\1)",
content,
)
content = re.sub(
r"\(\.\/([^)]+)\)",
r"(https://github.com/huggingface/Google-Cloud-Containers/tree/main/"
rf"(https://github.com/huggingface/Google-Cloud-Containers/tree/{GITHUB_BRANCH}/"
+ root
+ r"/\1)",
content,
)
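Taken together, the three substitutions above can be sketched as a standalone helper using the same regexes; `GITHUB_BRANCH` defaults to `main` when the workflow does not export it, and the sample README string below is hypothetical:

```python
import os
import re

# Falls back to "main" outside CI, matching the script's default.
GITHUB_BRANCH = os.getenv("GITHUB_BRANCH", "main")


def rewrite_links(content: str, root: str, dir: str) -> str:
    """Rewrite relative image and file links to branch-aware GitHub URLs."""
    # Images under ./imgs or ./assets -> raw.githubusercontent.com URLs
    content = re.sub(
        r"\(\./(imgs|assets)/([^)]*\.png)\)",
        rf"(https://raw.githubusercontent.com/huggingface/Google-Cloud-Containers/{GITHUB_BRANCH}/"
        + root
        + r"/\1/\2)",
        content,
    )
    # ../ links -> tree URLs rooted at examples/<dir>
    content = re.sub(
        r"\(\.\./([^)]+)\)",
        rf"(https://github.com/huggingface/Google-Cloud-Containers/tree/{GITHUB_BRANCH}/examples/"
        + dir
        + r"/\1)",
        content,
    )
    # Remaining ./ links -> tree URLs rooted at the README's own directory
    content = re.sub(
        r"\(\.\/([^)]+)\)",
        rf"(https://github.com/huggingface/Google-Cloud-Containers/tree/{GITHUB_BRANCH}/"
        + root
        + r"/\1)",
        content,
    )
    return content


sample = "![d](./imgs/diagram.png) and [config](./config/deployment.yaml)"
print(rewrite_links(sample, "examples/gke/tgi-deployment", "gke"))
```

Ordering matters: the image pattern must run before the generic `./` pattern, or image links would be rewritten to `tree/` URLs instead of `raw.githubusercontent.com` ones.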

# Regular expression to match the specified blocks
pattern = r"> \[!(NOTE|WARNING)\]\n((?:> .*\n)+)"

def replacement(match):
block_type = match.group(1)
content = match.group(2)

# Remove '> ' from the beginning of each line and strip whitespace
lines = [
line.lstrip("> ").strip() for line in content.split("\n") if line.strip()
]
# Remove '> ' from the beginning of each line
lines = [line[2:] for line in content.split("\n") if line.strip()]

# Determine the Tip type
tip_type = " warning" if block_type == "WARNING" else ""
@@ -77,11 +74,14 @@ def replacement(match):

return new_block

# Regular expression to match the specified blocks
pattern = r"> \[!(NOTE|WARNING)\]\n((?:>.*(?:\n|$))+)"

# Perform the transformation
content = re.sub(pattern, replacement, content, flags=re.MULTILINE)

# Remove blockquotes
content = re.sub(r"^(>[ ]*)+", "", content, flags=re.MULTILINE)
# Remove any remaining '>' or '> ' at the beginning of lines
content = re.sub(r"^>[ ]?", "", content, flags=re.MULTILINE)

# Check for remaining relative paths
if re.search(r"\(\.\./|\(\./", content):
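The admonition handling after this change — the pattern now defined after `replacement`, quote markers stripped with `line[2:]`, and a final pass for stray `>` prefixes — can be sketched end to end. The exact `<Tip>` markup is built between the visible hunks, so the block construction below is an assumption:

```python
import re


def convert_admonitions(content: str) -> str:
    """Convert GitHub-flavored `> [!NOTE]` / `> [!WARNING]` alerts into
    doc-builder Tip blocks. The Tip markup emitted here is assumed; the
    PR's actual format string is outside this hunk."""

    def replacement(match):
        block_type = match.group(1)
        body = match.group(2)
        # Remove '> ' from the beginning of each non-empty line
        lines = [line[2:] for line in body.split("\n") if line.strip()]
        tip_type = " warning" if block_type == "WARNING" else ""
        return f"<Tip{tip_type}>\n\n" + "\n".join(lines) + "\n\n</Tip>\n"

    # Matches the alert marker plus every following blockquoted line,
    # including ones written as bare '>' without a trailing space
    pattern = r"> \[!(NOTE|WARNING)\]\n((?:>.*(?:\n|$))+)"
    content = re.sub(pattern, replacement, content, flags=re.MULTILINE)

    # Remove any remaining '>' or '> ' at the beginning of lines
    return re.sub(r"^>[ ]?", "", content, flags=re.MULTILINE)


print(convert_admonitions("> [!WARNING]\n> Be careful with GPU quotas.\n"))
```

Note the behavioral fix this hunk carries: the old pattern `(?:> .*\n)+` required a space after `>`, so alerts containing bare `>` continuation lines were only partially matched; the new `(?:>.*(?:\n|$))+` captures them too.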
8 changes: 4 additions & 4 deletions docs/source/resources.mdx
@@ -30,7 +30,8 @@ Learn how to use Hugging Face in Google Cloud by reading our blog posts, Google
- Inference

- [Deploy Meta Llama 3 8B with TGI DLC on GKE](https://github.com/huggingface/Google-Cloud-Containers/tree/main/examples/gke/tgi-deployment)
- [Deploying Llama3 8B with Text Generation Inference (TGI) on GKE](https://github.com/huggingface/Google-Cloud-Containers/tree/main/examples/gke/tgi-from-gcs-deployment)
- [Deploy Llama3 8B with TGI DLC on GKE](https://github.com/huggingface/Google-Cloud-Containers/tree/main/examples/gke/tgi-from-gcs-deployment)
- [Deploy Gemma2 with multiple LoRA adapters with TGI DLC on GKE](https://github.com/huggingface/Google-Cloud-Containers/tree/main/examples/gke/tgi-multi-lora-deployment)
- [Deploy Snowflake's Arctic Embed with TEI DLC on GKE](https://github.com/huggingface/Google-Cloud-Containers/tree/main/examples/gke/tei-deployment)
- [Deploy BGE Base v1.5 with TEI DLC from GCS on GKE](https://github.com/huggingface/Google-Cloud-Containers/tree/main/examples/gke/tei-from-gcs-deployment)

@@ -50,14 +51,13 @@ Learn how to use Hugging Face in Google Cloud by reading our blog posts, Google
- [Deploy FLUX with PyTorch Inference DLC on Vertex AI](https://github.com/huggingface/Google-Cloud-Containers/tree/main/examples/vertex-ai/notebooks/deploy-flux-on-vertex-ai/vertex-notebook.ipynb)
- [Deploy Meta Llama 3.1 405B with TGI DLC on Vertex AI](https://github.com/huggingface/Google-Cloud-Containers/tree/main/examples/vertex-ai/notebooks/deploy-llama-3-1-405b-on-vertex-ai/vertex-notebook.ipynb)


- Evaluation

- [Evaluating open LLMs with Vertex AI and Gemini](https://github.com/huggingface/Google-Cloud-Containers/tree/main/examples/vertex-ai/notebooks/evaluate-llms-with-vertex-ai)
- [Evaluate open LLMs with Vertex AI and Gemini](https://github.com/huggingface/Google-Cloud-Containers/tree/main/examples/vertex-ai/notebooks/evaluate-llms-with-vertex-ai)


### (Preview) Cloud Run

- Inference

- [Deploy Meta Llama 3.1 with TGI DLC on Cloud Run](https://github.com/huggingface/Google-Cloud-Containers/tree/main/examples/cloud-run/tgi-deployment)
- [Deploy Meta Llama 3.1 with TGI DLC on Cloud Run](https://github.com/huggingface/Google-Cloud-Containers/tree/main/examples/cloud-run/tgi-deployment)
13 changes: 7 additions & 6 deletions examples/gke/README.md
@@ -11,9 +11,10 @@ This directory contains usage examples of the Hugging Face Deep Learning Contain

## Inference Examples

| Example | Title |
| ---------------------------------------------------- | --------------------------------------------------- |
| [tgi-deployment](./tgi-deployment) | Deploy Meta Llama 3 8B with TGI DLC on GKE |
| [tgi-from-gcs-deployment](./tgi-from-gcs-deployment) | Deploy Qwen2 7B with TGI DLC from GCS on GKE |
| [tei-deployment](./tei-deployment) | Deploy Snowflake's Arctic Embed with TEI DLC on GKE |
| [tei-from-gcs-deployment](./tei-from-gcs-deployment) | Deploy BGE Base v1.5 with TEI DLC from GCS on GKE |
| Example | Title |
| -------------------------------------------------------- | ------------------------------------------------------------- |
| [tgi-deployment](./tgi-deployment) | Deploy Meta Llama 3 8B with TGI DLC on GKE |
| [tgi-from-gcs-deployment](./tgi-from-gcs-deployment) | Deploy Qwen2 7B with TGI DLC from GCS on GKE |
| [tgi-multi-lora-deployment](./tgi-multi-lora-deployment) | Deploy Gemma2 with multiple LoRA adapters with TGI DLC on GKE |
| [tei-deployment](./tei-deployment) | Deploy Snowflake's Arctic Embed with TEI DLC on GKE |
| [tei-from-gcs-deployment](./tei-from-gcs-deployment) | Deploy BGE Base v1.5 with TEI DLC from GCS on GKE |
2 changes: 1 addition & 1 deletion examples/gke/tgi-deployment/README.md
@@ -154,7 +154,7 @@ kubectl apply -f config/
> Alternatively, you can just wait for the deployment to be ready with the following command:
>
> ```bash
> kubectl wait --for=condition=Available --timeout=700s deployment/tei-deployment
> kubectl wait --for=condition=Available --timeout=700s deployment/tgi-deployment
> ```

## Inference with TGI