Skip to content

Commit

Permalink
Add examples/gke/tgi-llama-405b-deployment (#103)
Browse files Browse the repository at this point in the history
* Add `tgi-llama-405b-deployment` example

* Add context for deprecated ingress annotation

* Fix `cURL` input payload

* Add Gemma Dev Day presentation link

* Update `README.md`, `examples/gke/README.md` and `docs/source/resources.mdx`

To include latest examples and update table indents
  • Loading branch information
alvarobartt authored Oct 8, 2024
1 parent 76251a4 commit 392c15b
Show file tree
Hide file tree
Showing 10 changed files with 425 additions and 28 deletions.
35 changes: 17 additions & 18 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -51,24 +51,23 @@ The [`examples`](./examples) directory contains examples for using the container

### Inference Examples


| Service | Example | Description |
| --------- | ------------------------------------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------- |
| GKE | [tgi-deployment](./examples/gke/tgi-deployment) | Deploying Llama3 8B with Text Generation Inference (TGI) on GKE. |
| GKE | [tgi-from-gcs-deployment](./examples/gke/tgi-from-gcs-deployment) | Deploying Qwen2 7B Instruct with Text Generation Inference (TGI) from a GCS Bucket on GKE. |
| GKE | [tei-deployment](./examples/gke/tei-deployment) | Deploying Snowflake's Arctic Embed (M) with Text Embeddings Inference (TEI) on GKE. |
| GKE | [tei-from-gcs-deployment](./examples/gke/tei-from-gcs-deployment) | Deploying BGE Base v1.5 (English) with Text Embeddings Inference (TEI) from a GCS Bucket on GKE. |
| Vertex AI | [deploy-bert-on-vertex-ai](./examples/vertex-ai/notebooks/deploy-bert-on-vertex-ai) | Deploying a BERT model for a text classification task using `huggingface-inference-toolkit` for a Custom Prediction Routine (CPR) on Vertex AI. |
| Vertex AI | [deploy-embedding-on-vertex-ai](./examples/vertex-ai/notebooks/deploy-embedding-on-vertex-ai) | Deploying an embedding model with Text Embeddings Inference (TEI) on Vertex AI. |
| Vertex AI | [deploy-gemma-on-vertex-ai](./examples/vertex-ai/notebooks/deploy-gemma-on-vertex-ai) | Deploying Gemma 7B Instruct with Text Generation Inference (TGI) on Vertex AI. |
| Vertex AI | [deploy-gemma-from-gcs-on-vertex-ai](./examples/vertex-ai/notebooks/deploy-gemma-from-gcs-on-vertex-ai) | Deploying Gemma 7B Instruct with Text Generation Inference (TGI) from a GCS Bucket on Vertex AI. |
| Vertex AI | [deploy-flux-on-vertex-ai](./examples/vertex-ai/notebooks/deploy-flux-on-vertex-ai) | Deploying FLUX with Hugging Face PyTorch DLCs for Inference on Vertex AI. |
| Vertex AI | [deploy-llama-3-1-405b-on-vertex-ai](./examples/vertex-ai/notebooks/deploy-llama-405b-on-vertex-ai/vertex-notebook.ipynb) | Deploying Meta Llama 3.1 405B in FP8 with Hugging Face DLC for TGI on Vertex AI. |
| Cloud Run | [tgi-deployment](./examples/cloud-run/tgi-deployment/README.md) | Deploying Meta Llama 3.1 8B with Text Generation Inference on Cloud Run. |

| Service | Example | Title |
| --------- | ------------------------------------------------------------------------------------------------------------------------------------------------------ | ---------------------------------------------------------- |
| GKE | [examples/gke/tgi-deployment](./examples/gke/tgi-deployment) | Deploy Meta Llama 3 8B with TGI DLC on GKE |
| GKE | [examples/gke/tgi-from-gcs-deployment](./examples/gke/tgi-from-gcs-deployment) | Deploy Qwen2 7B with TGI DLC from GCS on GKE |
| GKE | [examples/gke/tgi-llama-405b-deployment](./examples/gke/tgi-llama-405b-deployment) | Deploy Llama 3.1 405B with TGI DLC on GKE |
| GKE | [examples/gke/tei-deployment](./examples/gke/tei-deployment) | Deploy Snowflake's Arctic Embed with TEI DLC on GKE |
| GKE | [examples/gke/tei-from-gcs-deployment](./examples/gke/tei-from-gcs-deployment) | Deploy BGE Base v1.5 with TEI DLC from GCS on GKE |
| Vertex AI | [examples/vertex-ai/notebooks/deploy-bert-on-vertex-ai](./examples/vertex-ai/notebooks/deploy-bert-on-vertex-ai) | Deploy BERT Models with PyTorch Inference DLC on Vertex AI |
| Vertex AI | [examples/vertex-ai/notebooks/deploy-embedding-on-vertex-ai](./examples/vertex-ai/notebooks/deploy-embedding-on-vertex-ai) | Deploy Embedding Models with TEI DLC on Vertex AI |
| Vertex AI | [examples/vertex-ai/notebooks/deploy-gemma-on-vertex-ai](./examples/vertex-ai/notebooks/deploy-gemma-on-vertex-ai) | Deploy Gemma 7B with TGI DLC on Vertex AI |
| Vertex AI | [examples/vertex-ai/notebooks/deploy-gemma-from-gcs-on-vertex-ai](./examples/vertex-ai/notebooks/deploy-gemma-from-gcs-on-vertex-ai) | Deploy Gemma 7B with TGI DLC from GCS on Vertex AI |
| Vertex AI | [examples/vertex-ai/notebooks/deploy-flux-on-vertex-ai](./examples/vertex-ai/notebooks/deploy-flux-on-vertex-ai) | Deploy FLUX with PyTorch Inference DLC on Vertex AI |
| Vertex AI | [examples/vertex-ai/notebooks/deploy-llama-3-1-405b-on-vertex-ai](./examples/vertex-ai/notebooks/deploy-llama-405b-on-vertex-ai/vertex-notebook.ipynb) | Deploy Meta Llama 3.1 405B with TGI DLC on Vertex AI |
| Cloud Run | [examples/cloud-run/tgi-deployment](./examples/cloud-run/tgi-deployment/README.md) | Deploy Meta Llama 3.1 with TGI DLC on Cloud Run |

### Evaluation

| Service | Example | Description |
| --------- | ------------------------------------------------------------------------------------------- | ----------------------------------------------- |
| Vertex AI | [evaluate-llms-with-vertex-ai](./examples/vertex-ai/notebooks/evaluate-llms-with-vertex-ai) | Evaluating open LLMs with Vertex AI and Gemini. |
| Service | Example | Title |
| --------- | ------------------------------------------------------------------------------------------------------------------------ | -------------------------------------------- |
| Vertex AI | [examples/vertex-ai/notebooks/evaluate-llms-with-vertex-ai](./examples/vertex-ai/notebooks/evaluate-llms-with-vertex-ai) | Evaluate open LLMs with Vertex AI and Gemini |
13 changes: 9 additions & 4 deletions docs/source/resources.mdx
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
# 📄 Other Resources

Learn how to use Hugging Face in Google Cloud by reading our blog posts, Google documentation and examples below.
Learn how to use Hugging Face in Google Cloud by reading our blog posts, presentations, Google documentation and examples below.

## Blog posts

Expand All @@ -9,6 +9,10 @@ Learn how to use Hugging Face in Google Cloud by reading our blog posts, Google
- [Making thousands of open LLMs bloom in the Vertex AI Model Garden](https://huggingface.co/blog/google-cloud-model-garden)
- [Deploy Meta Llama 3.1 405B on Google Cloud Vertex AI](https://huggingface.co/blog/llama31-on-vertex-ai)

## Presentations

- [Gemma Developer Day in Tokyo | Unleashing Gemma in production with Hugging Face Text Generation Inference (TGI)](https://rsvp.withgoogle.com/events/gemma-dev-day_2024tokyo/sessions/hugging-face-transformers)

## Google Documentation

- [Google Cloud Hugging Face Deep Learning Containers](https://cloud.google.com/deep-learning-containers/docs/choosing-container#hugging-face)
Expand All @@ -30,7 +34,8 @@ Learn how to use Hugging Face in Google Cloud by reading our blog posts, Google
- Inference

- [Deploy Meta Llama 3 8B with TGI DLC on GKE](https://github.com/huggingface/Google-Cloud-Containers/tree/main/examples/gke/tgi-deployment)
- [Deploying Llama3 8B with Text Generation Inference (TGI) on GKE](https://github.com/huggingface/Google-Cloud-Containers/tree/main/examples/gke/tgi-from-gcs-deployment)
- [Deploy Llama3 8B with TGI DLC on GKE](https://github.com/huggingface/Google-Cloud-Containers/tree/main/examples/gke/tgi-from-gcs-deployment)
- [Deploy Llama 3.1 405B with TGI DLC on GKE](https://github.com/huggingface/Google-Cloud-Containers/tree/main/examples/gke/tgi-from-gcs-deployment)
- [Deploy Snowflake's Arctic Embed with TEI DLC on GKE](https://github.com/huggingface/Google-Cloud-Containers/tree/main/examples/gke/tei-deployment)
- [Deploy BGE Base v1.5 with TEI DLC from GCS on GKE](https://github.com/huggingface/Google-Cloud-Containers/tree/main/examples/gke/tei-from-gcs-deployment)

Expand All @@ -53,11 +58,11 @@ Learn how to use Hugging Face in Google Cloud by reading our blog posts, Google

- Evaluation

- [Evaluating open LLMs with Vertex AI and Gemini](https://github.com/huggingface/Google-Cloud-Containers/tree/main/examples/vertex-ai/notebooks/evaluate-llms-with-vertex-ai)
- [Evaluate open LLMs with Vertex AI and Gemini](https://github.com/huggingface/Google-Cloud-Containers/tree/main/examples/vertex-ai/notebooks/evaluate-llms-with-vertex-ai)


### (Preview) Cloud Run

- Inference

- [Deploy Meta Llama 3.1 with TGI DLC on Cloud Run](https://github.com/huggingface/Google-Cloud-Containers/tree/main/examples/cloud-run/tgi-deployment)
- [Deploy Meta Llama 3.1 with TGI DLC on Cloud Run](https://github.com/huggingface/Google-Cloud-Containers/tree/main/examples/cloud-run/tgi-deployment)
13 changes: 7 additions & 6 deletions examples/gke/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -11,9 +11,10 @@ This directory contains usage examples of the Hugging Face Deep Learning Contain

## Inference Examples

| Example | Title |
| ---------------------------------------------------- | --------------------------------------------------- |
| [tgi-deployment](./tgi-deployment) | Deploy Meta Llama 3 8B with TGI DLC on GKE |
| [tgi-from-gcs-deployment](./tgi-from-gcs-deployment) | Deploy Qwen2 7B with TGI DLC from GCS on GKE |
| [tei-deployment](./tei-deployment) | Deploy Snowflake's Arctic Embed with TEI DLC on GKE |
| [tei-from-gcs-deployment](./tei-from-gcs-deployment) | Deploy BGE Base v1.5 with TEI DLC from GCS on GKE |
| Example | Title |
| -------------------------------------------------------- | --------------------------------------------------- |
| [tgi-deployment](./tgi-deployment) | Deploy Meta Llama 3 8B with TGI DLC on GKE |
| [tgi-from-gcs-deployment](./tgi-from-gcs-deployment) | Deploy Qwen2 7B with TGI DLC from GCS on GKE |
| [tgi-llama-405b-deployment](./tgi-llama-405b-deployment) | Deploy Llama 3.1 405B with TGI DLC on GKE |
| [tei-deployment](./tei-deployment) | Deploy Snowflake's Arctic Embed with TEI DLC on GKE |
| [tei-from-gcs-deployment](./tei-from-gcs-deployment) | Deploy BGE Base v1.5 with TEI DLC from GCS on GKE |
Loading

0 comments on commit 392c15b

Please sign in to comment.