Add examples within the docs/ #88

Merged Sep 21, 2024 · 37 commits from add-notebooks-to-docs into main

Commits
e54ecc3
Add `examples` as embeded notebooks in the `docs/`
alvarobartt Sep 12, 2024
ab63909
Remove `notebooks_folder` and `convert_notebooks` args
alvarobartt Sep 12, 2024
2d6ba24
Update `docs/source/_toctree.yml` to include examples (WIP)
alvarobartt Sep 12, 2024
0af20f7
Add sample paths to examples (WIP)
alvarobartt Sep 12, 2024
7023ddc
Update `thumbnail.png`
alvarobartt Sep 12, 2024
e656722
Split partially `index.mdx` in `features.mdx` and `resources.mdx`
alvarobartt Sep 16, 2024
4956df9
Add `docs/source/containers/*.mdx` (WIP)
alvarobartt Sep 16, 2024
b79d02c
Update `docs/source/containers/available.mdx`
alvarobartt Sep 16, 2024
ca2bce9
Fix path to `scripts/upload_model_to_gcs.sh`
alvarobartt Sep 16, 2024
5bffa57
Clean `docs/source`
alvarobartt Sep 16, 2024
f0c0e42
Add `Makefile` to auto-generate docs from examples
alvarobartt Sep 16, 2024
6b9cd7c
Add `pre_command: make docs`
alvarobartt Sep 16, 2024
84a1bbf
(debug) Add `ls -la` before `make docs`
alvarobartt Sep 16, 2024
5a959a7
Fix `pre_command` to `cd` into `Google-Cloud-Containers` first
alvarobartt Sep 16, 2024
da1a42d
Include `examples/cloud-run` directory in `make docs`
alvarobartt Sep 16, 2024
47972aa
Merge branch 'main' into add-notebooks-to-docs
alvarobartt Sep 16, 2024
b6c6621
Remove extra empty `>` lines and add `make serve`
alvarobartt Sep 16, 2024
058dab0
Update `Makefile` and add `docs/sed/huggingface-tip.sed`
alvarobartt Sep 17, 2024
841bafe
Add `docs/scripts/auto-generate-examples.py`
alvarobartt Sep 17, 2024
403754f
Update `Makefile` and `docs/scripts/auto-generate-examples.py`
alvarobartt Sep 17, 2024
3ffef3c
Update "Examples" section ordering
alvarobartt Sep 17, 2024
6fc2c88
Remove emojis within `docs/source/_toctree.yml`
alvarobartt Sep 17, 2024
0fc35ea
Add `metadata` to every example under `examples`
alvarobartt Sep 18, 2024
d394b96
Update `docs/scripts/auto-generate-examples.py`
alvarobartt Sep 18, 2024
8fe51c8
Add `docs/scripts/auto-update-toctree.py`
alvarobartt Sep 18, 2024
416bf17
Add `docs/source/examples` to `.gitignore`
alvarobartt Sep 18, 2024
1e69fa0
Update comment parsing for Jupyter Notebooks
alvarobartt Sep 18, 2024
8670e93
Clean metadata from `.mdx` files (and remove if none)
alvarobartt Sep 18, 2024
02a6566
Set `isExpanded: true` for top level examples
alvarobartt Sep 18, 2024
c967f57
Merge branch 'main' into add-notebooks-to-docs
alvarobartt Sep 18, 2024
25885e9
Update `docs/source/containers/available.mdx`
alvarobartt Sep 18, 2024
1522c41
Fix typo in `youself`->`yourself`
alvarobartt Sep 18, 2024
89d17c3
Split example introduction from TL;DR
alvarobartt Sep 18, 2024
2283052
Apply suggestions from code review
alvarobartt Sep 19, 2024
620ad34
Update `containers/tgi/README.md`
alvarobartt Sep 19, 2024
5c27af3
Update and align example titles
alvarobartt Sep 19, 2024
49c9e48
Fix `title` for `/resources`
alvarobartt Sep 19, 2024
3 changes: 2 additions & 1 deletion .github/workflows/doc-build.yml
@@ -10,13 +10,14 @@ on:
      - .github/workflows/doc-build.yml

jobs:
  build:
    uses: huggingface/doc-builder/.github/workflows/build_main_documentation.yml@main
    with:
      commit_sha: ${{ github.sha }}
      package: Google-Cloud-Containers
      package_name: google-cloud
      additional_args: --not_python_module
      pre_command: cd Google-Cloud-Containers && make docs
    secrets:
      token: ${{ secrets.HUGGINGFACE_PUSH }}
      hf_token: ${{ secrets.HF_DOC_BUILD_PUSH }}
1 change: 1 addition & 0 deletions .github/workflows/doc-pr-build.yml
@@ -19,3 +19,4 @@ jobs:
      package: Google-Cloud-Containers
      package_name: google-cloud
      additional_args: --not_python_module
      pre_command: cd Google-Cloud-Containers && make docs
3 changes: 3 additions & 0 deletions .gitignore
@@ -161,3 +161,6 @@ cython_debug/

# .DS_Store files
.DS_Store

# Auto-generated docs
docs/source/examples/
33 changes: 33 additions & 0 deletions Makefile
@@ -0,0 +1,33 @@
.PHONY: docs clean serve help

docs: clean
	@echo "Processing README.md files from examples/gke, examples/cloud-run, and examples/vertex-ai..."
	@mkdir -p docs/source/examples
	@echo "Converting Jupyter Notebooks to MDX..."
	@doc-builder notebook-to-mdx examples/vertex-ai/notebooks/
	@echo "Auto-generating example files for documentation..."
	@python docs/scripts/auto-generate-examples.py
	@echo "Cleaning up generated Markdown Notebook files..."
	@find examples/vertex-ai/notebooks -name "vertex-notebook.md" -type f -delete
	@echo "Generating YAML tree structure and appending to _toctree.yml..."
	@python docs/scripts/auto-update-toctree.py
	@echo "YAML tree structure appended to docs/source/_toctree.yml"
	@echo "Documentation setup complete."

clean:
	@echo "Cleaning up generated documentation..."
	@rm -rf docs/source/examples
	@awk '/^# GENERATED CONTENT DO NOT EDIT!/,/^# END GENERATED CONTENT/{next} {print}' docs/source/_toctree.yml > docs/source/_toctree.yml.tmp && mv docs/source/_toctree.yml.tmp docs/source/_toctree.yml
	@echo "Cleaning up generated Markdown Notebook files (if any)..."
	@find examples/vertex-ai/notebooks -name "vertex-notebook.md" -type f -delete
	@echo "Cleanup complete."

serve:
	@echo "Serving documentation via doc-builder..."
	doc-builder preview gcloud docs/source --not_python_module

help:
	@echo "Usage:"
	@echo "  make docs  - Auto-generate the examples for the docs"
	@echo "  make clean - Remove the auto-generated docs"
	@echo "  make serve - Serve the docs locally via doc-builder"
	@echo "  make help  - Display this help message"
38 changes: 19 additions & 19 deletions README.md
@@ -42,25 +42,25 @@ The [`examples`](./examples) directory contains examples for using the container

### Training Examples

| Service | Example | Description |
| --------- | ------------------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------- |
| GKE | [trl-full-fine-tuning](./examples/gke/trl-full-fine-tuning) | Full SFT fine-tuning of Gemma 2B in a multi-GPU instance with TRL on GKE. |
| GKE | [trl-lora-fine-tuning](./examples/gke/trl-lora-fine-tuning) | LoRA SFT fine-tuning of Mistral 7B v0.3 in a single GPU instance with TRL on GKE. |
| Vertex AI | [trl-full-sft-fine-tuning-on-vertex-ai](./examples/vertex-ai/notebooks/trl-full-sft-fine-tuning-on-vertex-ai) | Full SFT fine-tuning of Mistral 7B v0.3 in a multi-GPU instance with TRL on Vertex AI. |
| Vertex AI | [trl-lora-sft-fine-tuning-on-vertex-ai](./examples/vertex-ai/notebooks/trl-lora-sft-fine-tuning-on-vertex-ai) | LoRA SFT fine-tuning of Mistral 7B v0.3 in a single GPU instance with TRL on Vertex AI. |
| Service | Example | Title |
| --------- | ------------------------------------------------------------------------------------------------------------------------------------------ | --------------------------------------------------------------------------- |
| GKE | [examples/gke/trl-full-fine-tuning](./examples/gke/trl-full-fine-tuning) | Fine-tune Gemma 2B with PyTorch Training DLC using SFT on GKE |
| GKE | [examples/gke/trl-lora-fine-tuning](./examples/gke/trl-lora-fine-tuning) | Fine-tune Mistral 7B v0.3 with PyTorch Training DLC using SFT + LoRA on GKE |
| Vertex AI | [examples/vertex-ai/notebooks/trl-full-sft-fine-tuning-on-vertex-ai](./examples/vertex-ai/notebooks/trl-full-sft-fine-tuning-on-vertex-ai) | Fine-tune Mistral 7B v0.3 with PyTorch Training DLC using SFT on Vertex AI |
| Vertex AI | [examples/vertex-ai/notebooks/trl-lora-sft-fine-tuning-on-vertex-ai](./examples/vertex-ai/notebooks/trl-lora-sft-fine-tuning-on-vertex-ai) | Fine-tune Gemma 2B with PyTorch Training DLC using SFT + LoRA on Vertex AI |

### Inference Examples

| Service | Example | Description |
| --------- | ------------------------------------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------- |
| GKE | [tgi-deployment](./examples/gke/tgi-deployment) | Deploying Llama3 8B with Text Generation Inference (TGI) on GKE. |
| GKE | [tgi-from-gcs-deployment](./examples/gke/tgi-from-gcs-deployment) | Deploying Qwen2 7B Instruct with Text Generation Inference (TGI) from a GCS Bucket on GKE. |
| GKE | [tei-deployment](./examples/gke/tei-deployment) | Deploying Snowflake's Arctic Embed (M) with Text Embeddings Inference (TEI) on GKE. |
| GKE | [tei-from-gcs-deployment](./examples/gke/tei-from-gcs-deployment) | Deploying BGE Base v1.5 (English) with Text Embeddings Inference (TEI) from a GCS Bucket on GKE. |
| Vertex AI | [deploy-bert-on-vertex-ai](./examples/vertex-ai/notebooks/deploy-bert-on-vertex-ai) | Deploying a BERT model for a text classification task using `huggingface-inference-toolkit` for a Custom Prediction Routine (CPR) on Vertex AI. |
| Vertex AI | [deploy-embedding-on-vertex-ai](./examples/vertex-ai/notebooks/deploy-embedding-on-vertex-ai) | Deploying an embedding model with Text Embeddings Inference (TEI) on Vertex AI. |
| Vertex AI | [deploy-gemma-on-vertex-ai](./examples/vertex-ai/notebooks/deploy-gemma-on-vertex-ai) | Deploying Gemma 7B Instruct with Text Generation Inference (TGI) on Vertex AI. |
| Vertex AI | [deploy-gemma-from-gcs-on-vertex-ai](./examples/vertex-ai/notebooks/deploy-gemma-from-gcs-on-vertex-ai) | Deploying Gemma 7B Instruct with Text Generation Inference (TGI) from a GCS Bucket on Vertex AI. |
| Vertex AI | [deploy-flux-on-vertex-ai](./examples/vertex-ai/notebooks/deploy-flux-on-vertex-ai) | Deploying FLUX with Hugging Face PyTorch DLCs for Inference on Vertex AI. |
| Vertex AI | [deploy-llama-3-1-405b-on-vertex-ai](./examples/vertex-ai/notebooks/deploy-llama-405b-on-vertex-ai/vertex-notebook.ipynb) | Deploying Meta Llama 3.1 405B in FP8 with Hugging Face DLC for TGI on Vertex AI. |
| Cloud Run | [tgi-deployment](./examples/cloud-run/tgi-deployment/README.md) | Deploying Meta Llama 3.1 8B with Text Generation Inference on Cloud Run. |
| Service | Example | Title |
| --------- | ------------------------------------------------------------------------------------------------------------------------------------------------------ | ---------------------------------------------------------- |
| GKE | [examples/gke/tgi-deployment](./examples/gke/tgi-deployment) | Deploy Meta Llama 3 8B with TGI DLC on GKE |
| GKE | [examples/gke/tgi-from-gcs-deployment](./examples/gke/tgi-from-gcs-deployment) | Deploy Qwen2 7B with TGI DLC from GCS on GKE |
| GKE | [examples/gke/tei-deployment](./examples/gke/tei-deployment) | Deploy Snowflake's Arctic Embed with TEI DLC on GKE |
| GKE | [examples/gke/tei-from-gcs-deployment](./examples/gke/tei-from-gcs-deployment) | Deploy BGE Base v1.5 with TEI DLC from GCS on GKE |
| Vertex AI | [examples/vertex-ai/notebooks/deploy-bert-on-vertex-ai](./examples/vertex-ai/notebooks/deploy-bert-on-vertex-ai) | Deploy BERT Models with PyTorch Inference DLC on Vertex AI |
| Vertex AI | [examples/vertex-ai/notebooks/deploy-embedding-on-vertex-ai](./examples/vertex-ai/notebooks/deploy-embedding-on-vertex-ai) | Deploy Embedding Models with TEI DLC on Vertex AI |
| Vertex AI | [examples/vertex-ai/notebooks/deploy-gemma-on-vertex-ai](./examples/vertex-ai/notebooks/deploy-gemma-on-vertex-ai) | Deploy Gemma 7B with TGI DLC on Vertex AI |
| Vertex AI | [examples/vertex-ai/notebooks/deploy-gemma-from-gcs-on-vertex-ai](./examples/vertex-ai/notebooks/deploy-gemma-from-gcs-on-vertex-ai) | Deploy Gemma 7B with TGI DLC from GCS on Vertex AI |
| Vertex AI | [examples/vertex-ai/notebooks/deploy-flux-on-vertex-ai](./examples/vertex-ai/notebooks/deploy-flux-on-vertex-ai) | Deploy FLUX with PyTorch Inference DLC on Vertex AI |
| Vertex AI | [examples/vertex-ai/notebooks/deploy-llama-3-1-405b-on-vertex-ai](./examples/vertex-ai/notebooks/deploy-llama-405b-on-vertex-ai/vertex-notebook.ipynb) | Deploy Meta Llama 3.1 405B with TGI DLC on Vertex AI |
| Cloud Run | [examples/cloud-run/tgi-deployment](./examples/cloud-run/tgi-deployment/README.md) | Deploy Meta Llama 3.1 with TGI DLC on Cloud Run |
8 changes: 4 additions & 4 deletions containers/tgi/README.md
@@ -16,7 +16,7 @@ Below you will find the instructions on how to run and test the TGI containers a

To run the Docker container on GPUs, you need to ensure that your hardware is supported (the NVIDIA drivers on your device need to be compatible with CUDA version 12.2 or higher) and install the NVIDIA Container Toolkit.

To find the supported models and hardware before running the TGI DLC, feel free to check the [TGI Documentation](https://huggingface.co/docs/text-generation-inference/supported_models).
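Before running the DLC, you can also verify that Docker can see your GPUs through the NVIDIA Container Toolkit — a minimal sanity check, with an illustrative CUDA image tag:

```bash
# Should print the `nvidia-smi` table listing your GPUs
docker run --rm --gpus all nvidia/cuda:12.2.0-base-ubuntu22.04 nvidia-smi
```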

### Run

@@ -51,7 +51,7 @@ Which returns the following output containing the optimal configuration for depl
Then you are ready to run the container as follows:

```bash
docker run --gpus all -ti --shm-size 1g -p 8080:8080 \
-e MODEL_ID=google/gemma-7b-it \
-e NUM_SHARD=4 \
-e HF_TOKEN=$(cat ~/.cache/huggingface/token) \
@@ -85,7 +85,7 @@ curl 0.0.0.0:8080/v1/chat/completions \

Which will start streaming the completion tokens for the given messages until the stop sequences are generated.

Alternatively, you can use the `/generate` endpoint instead, which expects the inputs to already be formatted according to the tokenizer requirements. This is more convenient when working with base models without a pre-defined chat template, or whenever you want to use a custom chat template, and can be used as follows:

```bash
curl 0.0.0.0:8080/generate \
@@ -108,7 +108,7 @@ curl 0.0.0.0:8080/generate \
> [!WARNING]
> Building the containers is not recommended since those are already built by Hugging Face and Google Cloud teams and provided openly, so the recommended approach is to use the pre-built containers available in [Google Cloud's Artifact Registry](https://console.cloud.google.com/artifacts/docker/deeplearning-platform-release/us/gcr.io) instead.

In order to build the TGI Docker container, you will need an instance with at least 4 NVIDIA GPUs available, with at least 24 GiB of VRAM each, since TGI needs to build and compile the kernels required for the optimized inference. Also note that the build process may take ~30 minutes to complete, depending on the instance's specifications.

```bash
docker build -t us-docker.pkg.dev/deeplearning-platform-release/gcr.io/huggingface-text-generation-inference-cu121.2-2.ubuntu2204.py310 -f containers/tgi/gpu/2.2.0/Dockerfile .
103 changes: 103 additions & 0 deletions docs/scripts/auto-generate-examples.py
@@ -0,0 +1,103 @@
import os
import re


def process_readme_files():
    print("Processing README.md files from examples/gke, examples/cloud-run, and examples/vertex-ai...")
    os.makedirs("docs/source/examples", exist_ok=True)

    for dir in ["gke", "cloud-run", "vertex-ai/notebooks"]:
        for root, _, files in os.walk(f"examples/{dir}"):
            for file in files:
                if file in ("README.md", "vertex-notebook.md"):
                    process_file(root, file, dir)


def process_file(root, file, dir):
    dir_name = dir if "/" not in dir else dir.replace("/", "-")

    file_path = os.path.join(root, file)
    subdir = root.replace(f"examples/{dir}/", "")
    base = os.path.basename(subdir)

    if file_path == f"examples/{dir}/README.md":
        target = f"docs/source/examples/{dir_name}-index.mdx"
    else:
        target = f"docs/source/examples/{dir_name}-{base}.mdx"

    print(f"Processing {file_path} to {target}")
    with open(file_path, "r") as f:
        content = f.read()

    # For Jupyter Notebooks, remove the HTML comment markers i.e. `<!--` and `-->`, but keep the metadata
    content = re.sub(r"<!-- (.*?) -->", r"\1", content, flags=re.DOTALL)

    # Replace relative image and link paths with absolute GitHub URLs
    content = re.sub(
        r"\(\./(imgs|assets)/([^)]*\.png)\)",
        r"(https://raw.githubusercontent.com/huggingface/Google-Cloud-Containers/main/"
        + root
        + r"/\1/\2)",
        content,
    )
    content = re.sub(
        r"\(\.\./([^)]+)\)",
        r"(https://github.com/huggingface/Google-Cloud-Containers/tree/main/examples/"
        + dir
        + r"/\1)",
        content,
    )
    content = re.sub(
        r"\(\.\/([^)]+)\)",
        r"(https://github.com/huggingface/Google-Cloud-Containers/tree/main/"
        + root
        + r"/\1)",
        content,
    )

    # Regular expression to match GitHub-style NOTE/WARNING admonition blocks
    pattern = r"> \[!(NOTE|WARNING)\]\n((?:> .*\n)+)"

    def replacement(match):
        block_type = match.group(1)
        content = match.group(2)

        # Remove '> ' from the beginning of each line and strip whitespace
        lines = [
            line.lstrip("> ").strip() for line in content.split("\n") if line.strip()
        ]

        # Determine the Tip type
        tip_type = " warning" if block_type == "WARNING" else ""

        # Construct the new block
        new_block = f"<Tip{tip_type}>\n\n"
        new_block += "\n".join(lines)
        new_block += "\n\n</Tip>\n"

        return new_block

    # Perform the transformation
    content = re.sub(pattern, replacement, content, flags=re.MULTILINE)

    # Remove blockquotes
    content = re.sub(r"^(>[ ]*)+", "", content, flags=re.MULTILINE)

    # Check for remaining relative paths
    if re.search(r"\(\.\./|\(\./", content):
        print("WARNING: Relative paths still exist in the processed file.")
        print(
            "The following lines contain relative paths, consider replacing those with GitHub URLs instead:"
        )
        for i, line in enumerate(content.split("\n"), 1):
            if re.search(r"\(\.\./|\(\./", line):
                print(f"{i}: {line}")
    else:
        print("No relative paths found in the processed file.")

    with open(target, "w") as f:
        f.write(content)


if __name__ == "__main__":
    process_readme_files()
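When run standalone from the repository root (the `make docs` target normally drives it after `doc-builder notebook-to-mdx`), the script writes one flat `.mdx` file per example; the output mapping below is inferred from the path logic above, not an exhaustive listing:

```bash
# Run the generator directly from the repository root
python docs/scripts/auto-generate-examples.py

# Inferred output layout, e.g.:
#   examples/gke/README.md                -> docs/source/examples/gke-index.mdx
#   examples/gke/tgi-deployment/README.md -> docs/source/examples/gke-tgi-deployment.mdx
#   examples/vertex-ai/notebooks/deploy-gemma-on-vertex-ai/vertex-notebook.md
#       -> docs/source/examples/vertex-ai-notebooks-deploy-gemma-on-vertex-ai.mdx
```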