Setup

Initialize

  1. In the Google Cloud console, on the project selector page, select or create a Google Cloud project.

    1. Set the PROJECT_ID environment variable.
    export PROJECT_ID="<your-project-id>"
    2. Set the REGION environment variable. The default is "us-central1"; for now, the only other supported region is europe-west4. Learn more about Vertex AI regions.
    export REGION="us-central1"
  2. Make sure that billing is enabled for your Google Cloud project.

  3. Install the Google Cloud CLI.

  4. Initialize the gcloud CLI.

    gcloud init
  5. Enable the Vertex AI, Artifact Registry, Compute Engine, Cloud Build, Cloud Deploy and Notebooks APIs; a snippet to confirm they are enabled follows this list.

    gcloud services enable \
      aiplatform.googleapis.com \
      artifactregistry.googleapis.com \
      compute.googleapis.com \
      cloudbuild.googleapis.com \
      clouddeploy.googleapis.com  \
      notebooks.googleapis.com \
      --project $PROJECT_ID
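
If you want to confirm the APIs were enabled, you can list the project's enabled services and filter for one of them (an optional check):

    gcloud services list --enabled --project $PROJECT_ID \
      --filter="name:aiplatform.googleapis.com"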

Permissions

Make sure the default Compute Engine service account has sufficient permissions by granting it the roles below; a sketch for verifying the resulting bindings follows the list.

  1. Add the iam.serviceAccountUser role, which includes the iam.serviceAccounts.actAs permission needed to deploy to the runtime.

    gcloud iam service-accounts add-iam-policy-binding $(gcloud projects describe $PROJECT_ID \
    --format="value(projectNumber)")[email protected] \
    --member=serviceAccount:$(gcloud projects describe $PROJECT_ID \
    --format="value(projectNumber)")[email protected] \
    --role="roles/iam.serviceAccountUser" \
    --project=$PROJECT_ID
  2. Add the clouddeploy.jobRunner role.

    gcloud projects add-iam-policy-binding $PROJECT_ID \
    --member=serviceAccount:$(gcloud projects describe $PROJECT_ID \
    --format="value(projectNumber)")[email protected] \
    --role="roles/clouddeploy.jobRunner"
  3. Add the roles/clouddeploy.viewer role.

    gcloud projects add-iam-policy-binding $PROJECT_ID \
    --member=serviceAccount:$(gcloud projects describe $PROJECT_ID \
    --format="value(projectNumber)")[email protected] \
    --role="roles/clouddeploy.viewer"
  4. Add the roles/aiplatform.user role.

    gcloud projects add-iam-policy-binding $PROJECT_ID \
    --member=serviceAccount:$(gcloud projects describe $PROJECT_ID \
    --format="value(projectNumber)")[email protected] \
    --role="roles/aiplatform.user"
  5. Add the roles/storage.objectCreator role.

    gcloud projects add-iam-policy-binding $PROJECT_ID \
    --member=serviceAccount:$(gcloud projects describe $PROJECT_ID \
    --format="value(projectNumber)")[email protected] \
    --role="roles/storage.objectCreator"
  6. Add the roles/storage.objectViewer role.

    gcloud projects add-iam-policy-binding $PROJECT_ID \
    --member=serviceAccount:$(gcloud projects describe $PROJECT_ID \
    --format="value(projectNumber)")[email protected] \
    --role="roles/storage.objectViewer"

Storage

  1. Create a Cloud Storage bucket.

    1. Set the BUCKET_URI variable. Bucket names must be globally unique, so you will likely need to choose a different value; if you do, make sure to also update it in the configuration.
    export BUCKET_URI="gs://rlhf-artifacts"
    2. Create the Cloud Storage bucket used to store the tuning artifacts.
    gsutil mb -l $REGION -p $PROJECT_ID $BUCKET_URI
  2. Create a Vertex AI Pipeline Registry repository; a snippet verifying both resources follows this list.

    1. Set the PIPELINE_REGISTRY variable. If you change the value, make sure to also update it in the configuration.
    export PIPELINE_REGISTRY="rlhf-pipelines"
    2. Create the Artifact Registry repository backing the Vertex AI Pipeline Registry.
    gcloud artifacts repositories create $PIPELINE_REGISTRY \
    --location=$REGION \
    --repository-format=KFP
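
To sanity-check the storage setup, you can confirm that both resources exist (optional):

    # Show the bucket itself (-b) rather than its contents, then describe the repository.
    gsutil ls -b $BUCKET_URI
    gcloud artifacts repositories describe $PIPELINE_REGISTRY --location=$REGION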

Configuration

The table below shows the supported models, hardware and regions.

| large_model_reference | supported accelerator_type | supported region |
| --- | --- | --- |
| text-bison@001 | TPU_V3, NVIDIA_TESLA_A100 | europe-west4, us-central1 |
| chat-bison@001 | TPU_V3, NVIDIA_TESLA_A100 | europe-west4, us-central1 |
| t5-small | TPU_V3, NVIDIA_TESLA_A100 | europe-west4, us-central1 |
| t5-large | TPU_V3 | europe-west4 |
| t5-xl | TPU_V3 | europe-west4 |
| t5-xxl | TPU_V3 | europe-west4 |

Note that tuning jobs that run in:

  • us-central1 will use 8 NVIDIA A100 80 GB GPUs.
  • europe-west4 will use 64 TPU v3 chips.
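
Since the t5-large, t5-xl and t5-xxl models run only on TPU_V3 in europe-west4, it can help to derive REGION from the chosen model rather than set it by hand. A minimal sketch based on the table above (the LARGE_MODEL_REFERENCE variable is illustrative, not part of this setup):

    # Pin REGION for the TPU-only models; everything else keeps the current value.
    case "$LARGE_MODEL_REFERENCE" in
      t5-large|t5-xl|t5-xxl) export REGION="europe-west4" ;;
      *) export REGION="${REGION:-us-central1}" ;;
    esac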