Remove unused files, update transformers to 4.38.1 (#18)
* Remove unused files, update transformers to 4.38.1 and TGI to 1.4.2
shub-kris authored Feb 23, 2024
1 parent 882a1dd commit d81e9a2
Showing 14 changed files with 96 additions and 1,137 deletions.
20 changes: 10 additions & 10 deletions README.md
@@ -11,7 +11,7 @@ The [`examples`](./examples) directory contains examples for using the container
_Note: we added the latest TGI version as an example to the repository, which can be built with._

```bash
-docker build -t us-docker.pkg.dev/deeplearning-platform-release/gcr.io/huggingface-text-generation-inference-gpu.1.3.4 -f containers/tgi/gpu/1.3.4/Dockerfile .
+docker build -t us-docker.pkg.dev/deeplearning-platform-release/gcr.io/huggingface-text-generation-inference-gpu.1.4.2 -f containers/tgi/gpu/1.4.2/Dockerfile .
```

### Mistral 7B test
@@ -29,7 +29,7 @@ docker run --gpus all -ti -p 8080:80 \
-e NUM_SHARD=$num_shard \
-e MAX_INPUT_LENGTH=$max_input_length \
-e MAX_TOTAL_TOKENS=$max_total_tokens \
-us-docker.pkg.dev/deeplearning-platform-release/gcr.io/huggingface-text-generation-inference-gpu.1.3.4
+us-docker.pkg.dev/deeplearning-platform-release/gcr.io/huggingface-text-generation-inference-gpu.1.4.2
```

Send request:
@@ -41,10 +41,10 @@ curl 127.0.0.1:8080/generate \
-H 'Content-Type: application/json'
```
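For reference, a complete call to TGI's `/generate` endpoint looks roughly like the sketch below; the prompt and `max_new_tokens` value are illustrative:

```bash
# Illustrative prompt and parameters; TGI responds with {"generated_text": "..."}
curl 127.0.0.1:8080/generate \
    -X POST \
    -d '{"inputs":"What is Deep Learning?","parameters":{"max_new_tokens":50}}' \
    -H 'Content-Type: application/json'
```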

-### Golden Gate Test
+### Gemma Test

```bash
-model=gg-hf/golden-gate-7b
+model=google/gemma-7b
num_shard=1
max_input_length=512
max_total_tokens=1024
@@ -58,7 +58,7 @@ docker run --gpus all -ti -p 8080:80 \
-e MAX_TOTAL_TOKENS=$max_total_tokens \
-e MAX_BATCH_PREFILL_TOKENS=$max_batch_prefill_tokens \
-e HUGGING_FACE_HUB_TOKEN=$token \
-us-docker.pkg.dev/deeplearning-platform-release/gcr.io/huggingface-text-generation-inference-gpu.1.3.4
+us-docker.pkg.dev/deeplearning-platform-release/gcr.io/huggingface-text-generation-inference-gpu.1.4.2
```
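The run command above expects a `token` shell variable, since Gemma is a gated model; a minimal sketch, assuming you already have a Hugging Face access token with read access:

```bash
# Placeholder value; substitute your own Hugging Face read token
export token=hf_your_token_here
```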

Send request:
@@ -70,7 +70,7 @@ curl 127.0.0.1:8080/generate \
-H 'Content-Type: application/json'
```
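Before sending generation requests, you can also check that the server has finished loading the shards; TGI exposes `/info` and `/health` endpoints:

```bash
# Prints model id and server metadata once the model is loaded
curl -s 127.0.0.1:8080/info
# Prints 200 when the server is ready to serve requests
curl -s -o /dev/null -w '%{http_code}\n' 127.0.0.1:8080/health
```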

-For a Vertex AI example checkout [Deploy Golden Gate on Vertex AI](./examples//vertex-ai/deploy-golden-gate-on-vertex-ai.ipynb)
+For a Vertex AI example, check out [Deploy Gemma on Vertex AI](./examples/vertex-ai/notebooks/deploy-gemma-on-vertex-ai.ipynb)


## Configurations
@@ -87,13 +87,13 @@ After the containers are built, you can run the tests in the `tests` directory t

| Container Tag | Framework | Type | Accelerator |
| ----------------------------------------------------------------------------- | --------- | --------- | ----------- |
-| [pytorch-training-gpu.2.1.transformers.4.37.2.py310](./containers/pytorch/training/gpu/2.1/transformers/4.37.2/py310/Dockerfile) | Pytorch | training | GPU |
-| [text-generation-inference-gpu.1.3.4](https://github.com/huggingface/Google-Cloud-Containers/blob/main/containers/tgi/gpu/1.3.4/Dockerfile) | - | inference | GPU |
+| [pytorch-training-gpu.2.1.transformers.4.38.1.py310](./containers/pytorch/training/gpu/2.1/transformers/4.38.1/py310/Dockerfile) | PyTorch | training | GPU |
+| [text-generation-inference-gpu.1.4.2](./containers/tgi/gpu/1.4.2/Dockerfile) | - | inference | GPU |

## Directory Structure

-The container files are organized in a nested folder structure based on the container tag. For example, the Dockerfile for the container with the tag `pytorch-training-gpu.2.0.transformers.4.35.0.py310` is located at `pytorch/training/gpu/2.0/transformers/4.35.0/py310/Dockerfile`.
+The container files are organized in a nested folder structure based on the container tag. For example, the Dockerfile for the container with the tag `pytorch-training-gpu.2.1.transformers.4.38.1.py310` is located at `pytorch/training/gpu/2.1/transformers/4.38.1/py310/Dockerfile`.

## Updates

-When we update the transformers version, we add a new folder in the `transformers` directory. For example, if we update the transformers version to 4.36.0, we would add a new folder at `pytorch/training/gpu/2.0/transformers/4.36.0`.
+When we update the transformers version, we add a new folder in the `transformers` directory. For example, if we update the transformers version to 4.39.0, we would add a new folder at `pytorch/training/gpu/2.0/transformers/4.39.0`.
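For illustration, such a hypothetical future bump (4.39.0 is only an example version) would look roughly like this on disk, following the layout used elsewhere in this repository:

```bash
# Hypothetical update to transformers 4.39.0: copy the previous Dockerfile
mkdir -p containers/pytorch/training/gpu/2.1/transformers/4.39.0/py310
cp containers/pytorch/training/gpu/2.1/transformers/4.38.1/py310/Dockerfile \
   containers/pytorch/training/gpu/2.1/transformers/4.39.0/py310/Dockerfile
# ...then update the ARG TRANSFORMERS pin inside the new Dockerfile
```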
@@ -7,7 +7,7 @@ ARG DEBIAN_FRONTEND=noninteractive
# Versions
ARG CUDA="12"
ARG CUDNN="89"
-ARG TRANSFORMERS='4.37.2'
+ARG TRANSFORMERS='4.38.1'
ARG DIFFUSERS='0.26.1'
ARG DATASETS='2.16.1'
# jax and flax compatible with transformers ["jax>=0.4.1,<=0.4.13", "jaxlib>=0.4.1,<=0.4.13", "flax>=0.4.1,<=0.7.0"] as mentioned in setup.py
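Since the versions above are declared as plain build args, they can also be overridden at build time without editing the Dockerfile; a sketch (the image tag and Dockerfile path are illustrative):

```bash
# Override a pinned version at build time; tag and path are placeholders
docker build --build-arg TRANSFORMERS=4.38.1 \
    -t my-custom-training-image \
    -f path/to/Dockerfile .
```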
Empty file.
Empty file.

This file was deleted.

@@ -7,7 +7,7 @@ ARG DEBIAN_FRONTEND=noninteractive

# Versions
ARG FLASH_ATTN='2.5.2'
-ARG TRANSFORMERS='4.37.2'
+ARG TRANSFORMERS='4.38.1'
ARG DIFFUSERS='0.26.1'
ARG PEFT='0.8.2'
ARG TRL='0.7.10'
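To sanity-check the pinned libraries inside a built image, something along these lines should work (assuming the image's entrypoint allows passing a command; the tag matches the build command used later in this diff):

```bash
# Print the installed versions of the pinned libraries
docker run --rm pytorch-training-gpu.2.1.transformers.4.38.1.py310 \
    python -c "import transformers, peft, trl; print(transformers.__version__, peft.__version__, trl.__version__)"
```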
@@ -1,12 +1,11 @@
-# Fine-tune Gemma-2B on Vertex AI WorkBench
+# Fine-tune Gemma-7B on Vertex AI WorkBench

-This file contains step by step instructions on how to build a docker image and then run it to test the Gemma-2B model using the
+This file contains step-by-step instructions on how to build a Docker image and then run it to test the Gemma-7B model using the
[gemma-finetuning-clm-lora-sft.ipynb](https://github.com/huggingface/Google-Cloud-Containers/blob/main/examples/vertex-ai/gemma-finetuning-clm-lora-sft.ipynb) Notebook on a Vertex AI WorkBench instance and on your local machine.

## Pre-requisites:
-1. Access to [gg-hf](https://huggingface.co/gg-hf) on Hugging Face Hub in order to download the model and the tokenizer.
+1. Accept the Terms and Conditions on the [Hugging Face Hub](https://huggingface.co/google/gemma-7b) in order to download the model and the tokenizer.
2. Access to [Google-Cloud-Containers](https://github.com/huggingface/Google-Cloud-Containers) GitHub repository in order to access the docker file.
-3. Access to [new-model-addition-golden-gate](https://github.com/huggingface/new-model-addition-golden-gate/) GitHub repository in order to use transformer library with the gg-hf model integrated into it.


We use the [gemma-finetuning-clm-lora-sft.ipynb](https://github.com/huggingface/Google-Cloud-Containers/blob/main/examples/vertex-ai/gemma-finetuning-clm-lora-sft.ipynb) Notebook to test the model.
@@ -18,17 +17,10 @@ Use the following command to build the docker image. Make sure to replace the va
```bash
git clone https://github.com/huggingface/Google-Cloud-Containers
cd Google-Cloud-Containers
-export GITHUB_TOKEN=your-github-token
-docker build --secret id=GITHUB_TOKEN,env=GITHUB_TOKEN -t pytorch-training-gpu.2.1.transformers.4.38.0.dev0.py310 -f containers/pytorch/training/gpu/2.1/transformers/4.38.0.dev0/py310/Dockerfile .
+docker build -t pytorch-training-gpu.2.1.transformers.4.38.1.py310 -f containers/pytorch/training/gpu/2.1/transformers/4.38.1/py310/Dockerfile .
```

-For setting the value of `GITHUB_TOKEN` please follow the detailed instructions mentioned in the following links:
-- [Creating a fine-grained personal access token](https://docs.github.com/en/authentication/keeping-your-account-and-data-secure/managing-your-personal-access-tokens#creating-a-fine-grained-personal-access-token)
-- [Creating a personal access token (classic)](https://docs.github.com/en/authentication/keeping-your-account-and-data-secure/managing-your-personal-access-tokens#creating-a-personal-access-token-classic)

-## Using Vertex AI WorkBench Instance to fine-tune the Gemma-2B model
+## Using Vertex AI WorkBench Instance to fine-tune the Gemma-7B model

It consists of the following steps:
1. Push the docker image to the Google Cloud Artifact registry.
@@ -56,12 +48,12 @@ Now, you can push the image to the Google Cloud Artifact registry using the foll
REGION="us-central1"
DOCKER_ARTIFACT_REPO="deep-learning-images"
PROJECT_ID="gcp-project-id"
BASE_IMAGE="pytorch-training-gpu.2.1.transformers.4.38.0.dev0.py310"
BASE_IMAGE="pytorch-training-gpu.2.1.transformers.4.38.1.py310"
FRAMEWORK="pytorch"
TYPE="training"
ACCELERATOR="gpu"
FRAMEWORK_VERSION="2.1"
TRANSFORMERS_VERISON="4.38.0.dev0"
TRANSFORMERS_VERISON="4.38.1"
PYTHON_VERSION="py310"

SERVING_CONTAINER_IMAGE_URI="${REGION}-docker.pkg.dev/${PROJECT_ID}/${DOCKER_ARTIFACT_REPO}/huggingface-${FRAMEWORK}-${TYPE}-${ACCELERATOR}.${FRAMEWORK_VERSION}.transformers.${TRANSFORMERS_VERSION}.${PYTHON_VERSION}:latest"
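The steps elided from this hunk then tag and push the image using these variables; roughly (a sketch, assuming `gcloud` is authenticated and the Artifact Registry repository already exists):

```bash
# Let docker authenticate against the regional Artifact Registry
gcloud auth configure-docker ${REGION}-docker.pkg.dev
# Tag the locally built image and push it
docker tag ${BASE_IMAGE} ${SERVING_CONTAINER_IMAGE_URI}
docker push ${SERVING_CONTAINER_IMAGE_URI}
```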
@@ -109,7 +101,7 @@ We will use the Google Cloud CLI to create a Vertex AI WorkBench instance from a

```bash
gcloud notebooks instances create example-instance-1 \
---container-repository=us-central1-docker.pkg.dev/gcp-project-id/deep-learning-images/huggingface-pytorch-training-gpu.2.1.transformers.4.38.0.dev0.py310 \
+--container-repository=us-central1-docker.pkg.dev/gcp-project-id/deep-learning-images/huggingface-pytorch-training-gpu.2.1.transformers.4.38.1.py310 \
--container-tag=latest \
--machine-type=n1-standard-4 \
--location=us-central1-c \
@@ -137,7 +129,7 @@ Then, you can access the [gemma-finetuning-clm-lora-sft.ipynb](https://github.co
Make sure you have the [gemma-finetuning-clm-lora-sft.ipynb](https://github.com/huggingface/Google-Cloud-Containers/blob/main/examples/vertex-ai/gemma-finetuning-clm-lora-sft.ipynb) notebook on your local machine, as we are mounting the current directory into the Docker container.

```bash
-docker run -it --gpus all -p 8080:8080 -v $(pwd):/workspace pytorch-training-gpu.2.1.transformers.4.38.0.dev0.py310
+docker run -it --gpus all -p 8080:8080 -v $(pwd):/workspace pytorch-training-gpu.2.1.transformers.4.38.1.py310
```

Inside the docker container, you can run the following command to start the jupyter notebook:
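The exact command is cut off in this view; a minimal sketch, assuming the port should match the `-p 8080:8080` mapping above:

```bash
# Bind to all interfaces so the notebook is reachable from the host
jupyter notebook --allow-root --ip=0.0.0.0 --port=8080 --no-browser
```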