Local development

Make sure the model (the `checkpoint` directory) is in this folder.
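A quick pre-flight check can catch a missing model before the server starts. This is a minimal sketch, assuming the model sits in a directory called `checkpoint` (the name used in the VM setup steps); `check_model_dir` is a hypothetical helper, not something this repo provides.

```shell
# check_model_dir is a hypothetical helper, not part of this repo.
# It succeeds only if the given directory exists and is not empty.
check_model_dir() {
    local dir="${1:-checkpoint}"   # "checkpoint" is an assumed default name
    if [ ! -d "$dir" ]; then
        echo "missing: $dir"
        return 1
    fi
    if [ -z "$(ls -A "$dir")" ]; then
        echo "empty: $dir"
        return 1
    fi
    echo "ok: $dir"
}
```

It could be chained in front of the start command, for example `check_model_dir checkpoint && gunicorn -b :8000 server:app`.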

# Install required libraries
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

# Start server
gunicorn -b :8000 server:app

Deploying model on GCP virtual machine

Steps to deploy

Create a Google Cloud Platform account
Install Cloud SDK

Follow the steps for installing Cloud SDK: https://cloud.google.com/sdk/install

After installing Cloud SDK, initialize it

# Run the following command to initialize
# This will ask you to link your Google Cloud Platform account
gcloud init

Create and configure GCP project

# This command creates a project with the ID `brain-opera-deployment`
gcloud projects create brain-opera-deployment

# Set your working project
gcloud config set project brain-opera-deployment

# If the above project ID is taken, choose a different one
# Note: project IDs must be unique across GCP

# Set compute zone
gcloud config set compute/zone asia-southeast1-b
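The value passed to `gcloud projects create` is the project ID. GCP requires it to be 6 to 30 characters long, made of lowercase letters, digits and hyphens, starting with a letter and not ending with a hyphen. The sketch below checks a candidate ID against that format locally before calling gcloud; `valid_project_id` is a hypothetical helper, and it cannot tell whether the ID is already taken.

```shell
# valid_project_id is a hypothetical helper for a local format check only.
# Pattern: one lowercase letter, then 4-28 lowercase letters, digits or
# hyphens, then a final lowercase letter or digit (6-30 characters total).
valid_project_id() {
    printf '%s\n' "$1" | grep -Eq '^[a-z][a-z0-9-]{4,28}[a-z0-9]$'
}
```

For example, `valid_project_id brain-opera-deployment && gcloud projects create brain-opera-deployment` only runs the create command when the format check passes.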

Set quota for GPU

First, open the Compute Engine tab in the Cloud Console and let it initialize. Wait for this to complete before continuing.

The default GPU quota for a GCP account with free credits is 0. A request for an increase in this quota is necessary to use GPUs.

Go to https://console.cloud.google.com/iam-admin/quotas. Make sure that brain-opera-deployment is chosen as the project in the header. Filter the metric by GPUs (all regions). If the limit is 0, tick the checkbox and click the Edit Quotas button at the top.

Fill in the necessary information and request that the limit be raised to 1. You will receive an email about the quota request. It usually takes a few hours for the request to be granted.
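The current limit can probably also be read from the command line instead of the console. The sketch below assumes that `gcloud compute project-info describe` lists the project quotas and that the GPU metric is named `GPUS_ALL_REGIONS`; `gpu_quota_limit` is a hypothetical helper that only filters that output.

```shell
# gpu_quota_limit is a hypothetical helper. It reads "METRIC LIMIT" pairs
# (whitespace or tab separated) on stdin and prints the limit recorded
# for GPUS_ALL_REGIONS, if that metric is present.
gpu_quota_limit() {
    awk '$1 == "GPUS_ALL_REGIONS" { print $2 }'
}

# Assumed usage (requires an initialized Cloud SDK):
# gcloud compute project-info describe --project brain-opera-deployment \
#     --flatten="quotas[]" --format="value(quotas.metric,quotas.limit)" \
#     | gpu_quota_limit
```

A printed value of 0 (or 0.0) would mean the quota request has not been granted yet.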

Set firewall rules

gcloud compute --project=brain-opera-deployment firewall-rules create brain-opera-port8000 \
        --direction=INGRESS \
        --priority=1000 \
        --network=default \
        --action=ALLOW \
        --rules=tcp:8000 \
        --source-ranges=0.0.0.0/0 \
        --target-tags=port8000
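The rule applies to instances tagged `port8000`, and `--source-ranges=0.0.0.0/0` admits traffic from any address. If only one machine needs to reach the server, the range could be narrowed to a single host. A minimal sketch, where `single_host_range` is a hypothetical helper and 203.0.113.7 is a placeholder address:

```shell
# single_host_range is a hypothetical helper. It turns an IPv4 address
# into a /32 CIDR range. This is a shape check only: it does not verify
# that each octet is at most 255.
single_host_range() {
    if printf '%s\n' "$1" | grep -Eq '^([0-9]{1,3}\.){3}[0-9]{1,3}$'; then
        printf '%s/32\n' "$1"
    else
        echo "not an IPv4 address: $1" >&2
        return 1
    fi
}

# Assumed usage, with a placeholder address:
# gcloud compute firewall-rules update brain-opera-port8000 \
#     --source-ranges="$(single_host_range 203.0.113.7)"
```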

Create VM

export IMAGE_FAMILY="tf-1-15-cu100"
export ZONE="asia-southeast1-b"
export INSTANCE_NAME="brain-opera-gpt2"
export INSTANCE_TYPE="n1-standard-2"

gcloud compute instances create $INSTANCE_NAME \
        --zone=$ZONE \
        --image-family=$IMAGE_FAMILY \
        --image-project=deeplearning-platform-release \
        --maintenance-policy=TERMINATE \
        --accelerator="type=nvidia-tesla-p4,count=1" \
        --machine-type=$INSTANCE_TYPE \
        --boot-disk-size=200GB \
        --metadata="install-nvidia-driver=True" \
        --tags=port8000
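Instance creation takes a little while, so it can help to wait for the VM to come up before copying files to it. The sketch below polls the instance status until it reads RUNNING; `wait_for_running` is a hypothetical helper, and the attempt count and delay are arbitrary defaults. A RUNNING status may not mean the NVIDIA driver installation requested via `install-nvidia-driver` has finished.

```shell
# wait_for_running is a hypothetical helper: it polls the instance status
# until it reads RUNNING, or gives up after a number of attempts.
# Arguments: instance name, zone, attempts (default 30), delay in seconds (default 10).
wait_for_running() {
    local name="$1"
    local zone="$2"
    local attempts="${3:-30}"
    local delay="${4:-10}"
    local i=0
    local status=""
    while [ "$i" -lt "$attempts" ]; do
        status="$(gcloud compute instances describe "$name" \
            --zone="$zone" --format='value(status)')" || status="UNKNOWN"
        if [ "$status" = "RUNNING" ]; then
            echo "RUNNING"
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    echo "timed out (last status: $status)"
    return 1
}
```

With the variables exported above, it could be called as `wait_for_running "$INSTANCE_NAME" "$ZONE"`.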

Set up VM

# Copy model from local directory to VM
gcloud compute scp --recurse ./checkpoint brain-opera-gpt2:~/checkpoint

# SSH into machine
gcloud compute --project "brain-opera-deployment" ssh --zone "asia-southeast1-b" "brain-opera-gpt2"

# Clone repo
git clone https://github.com/jonheng/brain-opera-gpt2-deployment.git

# Move model into repo
mv checkpoint/ brain-opera-gpt2-deployment/checkpoint

# Go to cloned repo
cd brain-opera-gpt2-deployment/

# Install python3-venv, enter Y when prompted
sudo apt-get install python3-venv

# Install required libraries
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

# Start server
gunicorn -b :8000 server:app

Final test

Check your application IP

gcloud compute instances list

Test that the connection works
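One way to test the connection is to send an HTTP request to port 8000 on the VM's external IP. The sketch below is an assumption-laden example: `external_ip`, `server_url` and `check_server` are hypothetical helpers, and the path `/` is a guess, since the routes defined in server.py are not listed here.

```shell
# external_ip is a hypothetical helper that asks gcloud for the external
# address of the VM. Arguments: instance name, zone.
external_ip() {
    gcloud compute instances describe "$1" --zone="$2" \
        --format='get(networkInterfaces[0].accessConfigs[0].natIP)'
}

# server_url builds the URL to test. The port defaults to 8000, the port
# gunicorn is bound to above. The path "/" is an assumption.
server_url() {
    printf 'http://%s:%s/\n' "$1" "${2:-8000}"
}

# check_server prints the HTTP status code returned for the given URL.
check_server() {
    curl -s -o /dev/null -w '%{http_code}' --max-time 10 "$1"
}

# Assumed usage:
# ip="$(external_ip brain-opera-gpt2 asia-southeast1-b)"
# check_server "$(server_url "$ip")"
```

Any HTTP status code, even 404, suggests the server is reachable through the firewall. A code of 000 means curl could not connect at all.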

Clean up

# To delete the VM instance
gcloud compute instances delete brain-opera-gpt2

# To delete the entire project
gcloud projects delete brain-opera-deployment
