Add `examples/gke/tgi-multi-lora-deployment` #102

alvarobartt · 2024-09-26T19:40:03Z

Description

This PR adds an example on how to deploy TGI via the Hugging Face DLC for Gemma2 using multiple LoRA adapters for inference on a single NVIDIA L4 instance.
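For reviewers unfamiliar with multi-LoRA serving, here is a minimal sketch of how a client might pick an adapter per request once the deployment is up. It assumes TGI's `/generate` route and the `adapter_id` request parameter described in the multi-LoRA serving post linked below. The adapter IDs, the helper names `build_payload` and `generate`, and the base URL are placeholders for illustration, not the adapters or code from this PR.

```python
import json
import urllib.request

# Placeholder adapter IDs (assumption); the real adapters are the ones referenced in this PR.
ADAPTERS = ["my-org/adapter-a", "my-org/adapter-b", "my-org/adapter-c"]


def build_payload(prompt, adapter_id=None, max_new_tokens=128):
    """Build a TGI /generate request body, optionally targeting a LoRA adapter."""
    parameters = {"max_new_tokens": max_new_tokens}
    if adapter_id is not None:
        # When adapter_id is omitted, the request is served by the base model.
        parameters["adapter_id"] = adapter_id
    return {"inputs": prompt, "parameters": parameters}


def generate(base_url, payload, timeout=60):
    """POST the payload to a running TGI server (assumed reachable at base_url)."""
    request = urllib.request.Request(
        f"{base_url.rstrip('/')}/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read())["generated_text"]
```

With the service port-forwarded locally (for instance to `http://localhost:8080`, an assumed address), `generate("http://localhost:8080", build_payload("Hello", ADAPTERS[0]))` would route the prompt through the first adapter, provided that adapter was loaded when the server started.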

The three adapters have been fine-tuned in collaboration with @Jofthomas and can be found under the https://hf.co/google-cloud-partnership org on the Hub (still private, datasets can be moved there too):

cc @philschmid for a potential Cloud Tuesday post, @Jofthomas for his presentation at the upcoming Gemma Developer Day in Tokyo, and @pagezyhf for visibility on the example itself

And kudos to @Narsil for support on reviewing and merging huggingface/text-generation-inference#2567, and @datavistics et al for their post at https://huggingface.co/blog/multi-lora-serving

Additionally

This PR also includes the `scripts/internal/update_example_tables.py` script, which is used internally to generate the tables of examples across the different files in this repository; running it automatically is left for another PR.

This temporarily makes things easier to maintain: when adding a new example, one can just run `python scripts/internal/update_example_tables.py` to update those tables in the meantime.
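As a rough illustration of the idea (not the script added in this PR), a table generator of this kind could look like the sketch below. It assumes a flat `examples/<service>/<example>/README.md` layout, a `cloud-run` directory name, and the Vertex AI > GKE > Cloud Run ordering mentioned in the commit notes; `SERVICES`, `collect_examples`, and `render_table` are hypothetical names.

```python
from pathlib import Path

# Display order assumed from the commit notes: Vertex AI > GKE > Cloud Run.
# The directory names are assumptions about the repository layout.
SERVICES = {"vertex-ai": "Vertex AI", "gke": "GKE", "cloud-run": "Cloud Run"}


def collect_examples(examples_dir):
    """Return (service, example_name) pairs for every example that has a README.md."""
    root = Path(examples_dir)
    found = []
    for service in SERVICES:
        service_dir = root / service
        if not service_dir.is_dir():
            continue
        # Sort example directories by name so the output is stable across runs.
        for example in sorted(p for p in service_dir.iterdir() if p.is_dir()):
            if (example / "README.md").is_file():
                found.append((service, example.name))
    return found


def render_table(examples, prefix="examples"):
    """Render the pairs as a Markdown table with links relative to the repo root."""
    lines = ["| Service | Example |", "| --- | --- |"]
    for service, name in examples:
        path = f"{prefix}/{service}/{name}"
        lines.append(f"| {SERVICES[service]} | [{path}](./{path}) |")
    return "\n".join(lines)
```

Calling `print(render_table(collect_examples("examples")))` from the repository root would print a table that can be pasted into a README; replacing the existing table in place is left out of this sketch.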

HuggingFaceDocBuilderDev · 2024-09-26T19:41:17Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

…run,gke,vertex-ai}/README.md` Update example listing via `python scripts/internal/update_example_tables.py` so as to automatically generate those listings (respecting the previous content within the file), to sort those as Vertex AI > GKE > Cloud Run, and some more fixes and improvements

alvarobartt added 3 commits September 23, 2024 14:04

Update doc-build.yml and doc-pr-build.yml triggers

9a77caa

To also include modifications on `examples`, `Makefile`, and `docs/scripts` or anything under `docs/`

Fix kubectl wait command due to typo

451e734

Add tgi-multi-lora-deployment example

56f4bb2

Still pending on the official release of the latest TGI DLC on Google Cloud

alvarobartt added examples tgi blocked labels Sep 26, 2024

alvarobartt requested review from philschmid and pagezyhf September 26, 2024 19:40

alvarobartt self-assigned this Sep 26, 2024

alvarobartt added 6 commits September 27, 2024 09:58

Add imgs and update README.md

64c0676

Add latest example to existing listings

f8d3a68

TODO(alvarobartt): automate this shortly as it's not so straightforward :/

Add GITHUB_BRANCH to generate working links

37b2632

Fix indentation to use 4 spaces instead

be324ec

Fix indentation on !NOTE to Tip

d94875b

Fix replacement function in auto-generate-examples.py

2bb4cd0

alvarobartt commented Oct 1, 2024

examples/gke/tgi-multi-lora-deployment/README.md (outdated, resolved)

Update README.md

f279add

alvarobartt commented Oct 1, 2024

examples/gke/tgi-multi-lora-deployment/README.md (outdated, resolved)

alvarobartt added 4 commits October 8, 2024 12:26

Add scripts/internal/update_example_tables.py

1784a98

Update scripts/internal/update_example_tables.py

bb4257d

To include the `examples/` prefix within the paths to the examples from the root directory i.e. in the `README.md` file

Update README.md

0a4f61d

alvarobartt mentioned this pull request Oct 8, 2024

Add Llama 3.2 Visual example on both GKE and Vertex AI #106

Merged

alvarobartt added 6 commits October 8, 2024 18:24

Update docs/source/resources.mdx

034799d

Fix scripts/internal/update_example_tables.py

6abd0d5

Update README.md

3629c60

Fix `examples/<service>/examples/<service>/...`

Merge branch 'main' into multi-lora-example

5bbc245

Update README.md, docs/source/resources.mdx and `examples/gke/README.md`

6dfda06

Escape nested backticks

5754366

alvarobartt merged commit ceec771 into main Oct 10, 2024
1 check passed

alvarobartt deleted the multi-lora-example branch October 10, 2024 13:02
