
Commit
Merge pull request #67 from Vizzuality/SKY30-141
Analysis cloud function
Agnieszka Figiel authored Nov 22, 2023
2 parents 1b42f36 + 5cdb4bb commit c070dd2
Showing 15 changed files with 522 additions and 64 deletions.
52 changes: 40 additions & 12 deletions .github/workflows/deploy.yml
Original file line number Diff line number Diff line change
@@ -1,13 +1,3 @@
# This workflow builds and pushes a Docker container to Google Artifact Registry and deploys it on Cloud Run when a commit is pushed to the $default-branch branch
#
# Overview:
#
# 1. Authenticate to Google Cloud
# 2. Authenticate Docker to Artifact Registry
# 3. Build a docker container
# 4. Publish it to Google Artifact Registry
# 5. Deploy it to Cloud Run
#
# The workflow uses GH Secrets managed by Terraform:
# - GCP_PROJECT_ID
# - GCP_REGION
@@ -18,6 +8,7 @@
# - <environment>_CLIENT_SERVICE
# - <environment>_CMS_REPOSITORY
# - <environment>_CMS_SERVICE
# - <environment>_ANALYSIS_CF_NAME
#
# it also uses the following secrets not managed by Terraform:
# - <environment>_CLIENT_ENV
@@ -30,10 +21,11 @@ on:
branches:
- main
- develop
- infrastructure/setup

paths:
- 'frontend/**'
- 'cms/**'
- 'cloud_functions/**'
- '.github/workflows/*'

env:
@@ -139,7 +131,6 @@ jobs:
run: echo "environment=$ENVIRONMENT" >> $GITHUB_OUTPUT
id: extract_environment

#- name: Google Auth authentication via credentials json
- name: Google Auth
id: auth
uses: 'google-github-actions/auth@v1'
@@ -190,3 +181,40 @@ jobs:
- name: Show Output
run: echo ${{ steps.deploy.outputs.url }}

deploy_cloud_functions:
runs-on: ubuntu-latest
steps:
- name: Checkout
uses: actions/checkout@v4

- name: Extract branch name
shell: bash
run: echo "branch=${GITHUB_HEAD_REF:-${GITHUB_REF#refs/heads/}}" >> $GITHUB_OUTPUT
id: extract_branch

- name: Extract environment name
env:
ENVIRONMENT: ${{ steps.extract_branch.outputs.branch == 'main' && 'PRODUCTION' || 'STAGING' }}
run: echo "environment=$ENVIRONMENT" >> $GITHUB_OUTPUT
id: extract_environment

- name: Google Auth
id: auth
uses: 'google-github-actions/auth@v1'
with:
credentials_json: "${{ secrets[format('{0}_GCP_SA_KEY', steps.extract_environment.outputs.environment)] }}"
token_format: 'access_token'
- name: 'Set up Cloud SDK'
uses: 'google-github-actions/setup-gcloud@v1'
with:
version: '>= 363.0.0'
- name: 'Use gcloud CLI'
run: 'gcloud info'
- name: 'Deploy to gen2 cloud function'
env:
CLOUD_FUNCTION_NAME: ${{ secrets[format('{0}_ANALYSIS_CF_NAME', steps.extract_environment.outputs.environment)] }}
run: |
gcloud functions deploy ${{ env.CLOUD_FUNCTION_NAME }} \
--gen2 \
--region=${{ env.REGION }} \
--source=./cloud_functions/analysis \
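The branch-to-environment mapping used by both jobs above can be sketched in Python — a hypothetical equivalent of the workflow expression `${{ steps.extract_branch.outputs.branch == 'main' && 'PRODUCTION' || 'STAGING' }}`, not code from this repository:

```python
def environment_for_branch(branch: str) -> str:
    """Mirror of the GH Actions expression:
    branch == 'main' && 'PRODUCTION' || 'STAGING'."""
    return "PRODUCTION" if branch == "main" else "STAGING"

# The environment name is then interpolated into per-environment secret
# names, e.g. PRODUCTION_GCP_SA_KEY vs STAGING_GCP_SA_KEY.
assert environment_for_branch("main") == "PRODUCTION"
assert environment_for_branch("develop") == "STAGING"
```

Any branch other than `main` falls through to `STAGING`, which matches the workflow's behavior for `develop` and feature branches.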
32 changes: 32 additions & 0 deletions cloud_functions/analysis/connect_tcp.py
@@ -0,0 +1,32 @@
import os
import ssl
import sqlalchemy

def connect_tcp_socket() -> sqlalchemy.engine.base.Engine:
"""Initializes a TCP connection pool for a Cloud SQL instance of Postgres."""
# Note: Saving credentials in environment variables is convenient, but not
# secure - consider a more secure solution such as
# Cloud Secret Manager (https://cloud.google.com/secret-manager) to help
# keep secrets safe.
db_host = os.environ[
"DATABASE_HOST"
] # e.g. '127.0.0.1' ('172.17.0.1' if deployed to GAE Flex)
db_user = os.environ["DATABASE_USERNAME"] # e.g. 'my-db-user'
db_pass = os.environ["DATABASE_PASSWORD"] # e.g. 'my-db-password'
db_name = os.environ["DATABASE_NAME"] # e.g. 'my-database'
db_port = 5432 # e.g. 5432

pool = sqlalchemy.create_engine(
# Equivalent URL:
# postgresql+pg8000://<db_user>:<db_pass>@<db_host>:<db_port>/<db_name>
sqlalchemy.engine.url.URL.create(
drivername="postgresql+pg8000",
username=db_user,
password=db_pass,
host=db_host,
port=db_port,
database=db_name,
),
# ...
)
return pool
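The `os.environ[...]` lookups above raise a bare `KeyError` when a variable is unset. A minimal sketch of a stricter helper with a clearer failure message — the helper name and message are assumptions, not part of this commit:

```python
import os

def require_env(name: str) -> str:
    # Fail loudly with the variable name instead of a bare KeyError.
    value = os.environ.get(name)
    if value is None:
        raise RuntimeError(f"Missing required environment variable: {name}")
    return value

# Example: provide a value, then read it back.
os.environ["DATABASE_HOST"] = "127.0.0.1"
assert require_env("DATABASE_HOST") == "127.0.0.1"
```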
36 changes: 36 additions & 0 deletions cloud_functions/analysis/main.py
@@ -0,0 +1,36 @@
import functions_framework
import sqlalchemy

from connect_tcp import connect_tcp_socket

db = connect_tcp_socket()

@functions_framework.http
def index(request):
"""HTTP Cloud Function.
Args:
request (flask.Request): The request object.
<https://flask.palletsprojects.com/en/1.1.x/api/#incoming-request-data>
Returns:
The response text, or any set of values that can be turned into a
Response object using `make_response`
<https://flask.palletsprojects.com/en/1.1.x/api/#flask.make_response>.
Note:
For more information on how Flask integrates with Cloud
Functions, see the `Writing HTTP functions` page.
<https://cloud.google.com/functions/docs/writing/http#http_frameworks>
"""
return get_locations_stats(db)

def get_locations_stats(db: sqlalchemy.engine.base.Engine) -> dict:
with db.connect() as conn:
stmt = sqlalchemy.text(
"SELECT COUNT(*) FROM locations WHERE type=:type"
)
regions_count = conn.execute(stmt, parameters={"type": "region"}).scalar()
countries_count = conn.execute(stmt, parameters={"type": "country"}).scalar()

return {
"regions_count": regions_count,
"countries_count": countries_count
}
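The handler's query logic can be exercised without a real database by stubbing the engine. A hypothetical sketch — the inlined function mirrors `get_locations_stats` above, with `sqlalchemy.text` simplified to a plain string so the example is dependency-free:

```python
from unittest.mock import MagicMock

def get_locations_stats(db) -> dict:
    # Mirrors main.py, minus the sqlalchemy.text() wrapper.
    with db.connect() as conn:
        stmt = "SELECT COUNT(*) FROM locations WHERE type=:type"
        regions_count = conn.execute(stmt, parameters={"type": "region"}).scalar()
        countries_count = conn.execute(stmt, parameters={"type": "country"}).scalar()
    return {"regions_count": regions_count, "countries_count": countries_count}

# Stub the engine: db.connect() acts as a context manager yielding a
# connection whose execute(...).scalar() returns canned counts in call order.
db = MagicMock()
conn = db.connect.return_value.__enter__.return_value
conn.execute.return_value.scalar.side_effect = [8, 195]

assert get_locations_stats(db) == {"regions_count": 8, "countries_count": 195}
```

Because the two `execute(...).scalar()` calls share one mock, `side_effect` hands back the region count first and the country count second, matching the call order in the function body.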
3 changes: 3 additions & 0 deletions cloud_functions/analysis/requirements.txt
@@ -0,0 +1,3 @@
functions-framework
sqlalchemy
pg8000
97 changes: 67 additions & 30 deletions infrastructure/README.md
@@ -1,15 +1,16 @@
# Infrastructure

While the application can be deployed in any server configuration that supports the application's dependencies, this project includes a [Terraform](https://www.terraform.io/) project that you can use to easily and quickly deploy it using
[Google Cloud Platform](https://cloud.google.com/).
While the application can be deployed in any server configuration that supports the application's dependencies, this project includes:
- a [Terraform](https://www.terraform.io/) project that you can use to easily and quickly provision the resources and deploy the code using [Google Cloud Platform](https://cloud.google.com/),
- and a GH Actions workflow to deploy code updates.

![GCP infrastructure - GH Actions drawio](https://github.com/Vizzuality/skytruth-30x30/assets/134055/c20e52d4-89f0-42e2-be25-e6b76a3a4fe6)

## Dependencies

Here is the list of technical dependencies for deploying the SkyTruth 30x30 Dashboard app using these infrastructure
resources. Note that these requirements are for this particular deployment strategy, and not dependencies of the SkyTruth 30x30 Dashboard application itself - which can be deployed to other infrastructures.
Here is the list of technical dependencies for deploying the SkyTruth 30x30 Dashboard app using these infrastructure resources. Note that these requirements are for this particular deployment strategy, and not dependencies of the SkyTruth 30x30 Dashboard application itself - which can be deployed to other infrastructures.

Before proceeding, be sure you are familiar with all of these tools, as these instructions
will skip over the basics, and assume you are comfortable using all of them.
Before proceeding, be sure you are familiar with all of these tools, as these instructions will skip over the basics, and assume you are comfortable using all of them.

- [Google Cloud Platform](https://cloud.google.com)
- [Terraform](https://www.terraform.io/)
@@ -19,14 +20,13 @@ will skip over the basics, and assume you are comfortable using all of them.
- DNS management
- A purchased domain

## Structure
### Terraform project

This project has 2 main sections, each with a folder named after it. Each section is a Terraform project that logically depends on its predecessor. There is a 3rd component to this architecture, which is handled by Github Actions.
This project (in the infrastructure directory) has 2 main sections, each with a folder named after it. Each section is a Terraform project that logically depends on its predecessor. There is a 3rd component to this architecture, which is handled by Github Actions.

#### Remote state

Creates a [GCP Storage Bucket](https://cloud.google.com/storage/docs/json_api/v1/buckets)
that will store the Terraform remote state.
Creates a [GCP Storage Bucket](https://cloud.google.com/storage/docs/json_api/v1/buckets), which will store the Terraform remote state.

#### Base

@@ -36,14 +36,16 @@ These resources include, but are not limited to:

- Google Compute instance - bastion host to access the GCP infrastructure
- Artifact Registry, for docker image storage
- Cloud Run, to host the live applications
- Cloud Run, to host the client application and the API/CMS
- Cloud Functions, for serving the analysis results
- Cloud SQL, for relational data storage
- Networking resources
- Uptime monitoring
- Error reporting
- Service accounts and permissions
- GH Secrets

To apply this project, you will need the following GCP permissions. These could probably be further fleshed out to a
more restrictive set of permissions/roles, but this combination is known to work:
To apply this project, you will need the following GCP permissions. These could probably be further fleshed out to a more restrictive set of permissions/roles, but this combination is known to work:

- "Editor" role
- "Secret Manager Admin" role
@@ -53,37 +55,72 @@ more restrictive set of permissions/roles, but this combination is known to work:

The output values include access data for some of the resources above.

Please note, there are some actions that need to be carried out manually - you'll get a prompt from terraform with links to follow to complete the actions:
Please note, there are some actions that might need to be carried out manually - you'll get a prompt from terraform with links to follow to complete the actions, e.g.:
- Compute Engine API needs to be enabled

#### Github Actions

As part of this infrastructure, Github Actions are used to automatically build and push Docker images to [Artifact Registry](https://cloud.google.com/artifact-registry), and to deploy those images to CloudRun once they are pushed. Access by Github to GCP is configured through special authorization rules, automatically set up by the Terraform `base` project above.
These permissions are necessary for the service account that runs the deployment:
- "roles/iam.serviceAccountTokenCreator",
- "roles/iam.serviceAccountUser",
- "roles/run.developer",
- "roles/artifactregistry.reader",
- "roles/artifactregistry.writer"

There are 2 CloudRun instances, one for the client application and one for the API. Github Secrets are used to provide environment secrets to these instances. Some of the secrets are managed by terraform when provisioning resources (e.g. database credentials for the API). To make it clear, the respective GH Secrets are suffixed "TF_MANAGED".

## How to deploy
#### How to run

Deploying the included Terraform project is done in steps:

- Terraform `apply` the `Remote State` project.
- Terraform `apply` the `Base` project.
- Terraform `apply` the `Remote State` project. This needs to be done once, before applying the base project.
- Terraform `apply` the `Base` project. This needs to be repeated after any changes in the base project.

For both commands, please use `-var-file=vars/terraform.tfvars` to provide the necessary terraform variables.

For the latter step, you will also need to set 2 environment variables:
For the second command, you will also need to set 2 environment variables:
- GITHUB_TOKEN (your GH token)
- GITHUB_OWNER (Vizzuality)
to allow terraform to write to GH Secrets.

Please note: when provisioning for the first time in a clean project, amend the `cloudrun` module by uncommenting the image setting used for first-time deployment, which deploys a dummy "hello" image (actual application images will only be available in GAR once the infrastructure is provisioned and the GH Actions deployment has passed).

### Github Actions

As part of this infrastructure, Github Actions are used to automatically apply code updates for the client application, API/CMS and the cloud functions.

#### Building new code versions

Deployment to the CloudRun instances is accomplished by building Docker images and pushing them to [Artifact Registry](https://cloud.google.com/artifact-registry). When building the images, environment secrets are injected from GH Secrets as follows:
- for the client application:
- the following secrets set by terraform in STAGING_CLIENT_ENV_TF_MANAGED (in the format of an .env file):
- NEXT_PUBLIC_URL
- NEXT_PUBLIC_API_URL
- NEXT_PUBLIC_ANALYSIS_CF_URL
- NEXT_PUBLIC_ENVIRONMENT
- LOG_LEVEL
- additional secrets set manually in STAGING_CLIENT_ENV (copy to be managed in LastPass)
- for the CMS/API application:
- the following secrets set by terraform in STAGING_CMS_ENV_TF_MANAGED (in the format of an .env file):
- HOST
- PORT
- APP_KEYS
- API_TOKEN_SALT
- ADMIN_JWT_SECRET
- TRANSFER_TOKEN_SALT
- JWT_SECRET
- CMS_URL
- DATABASE_CLIENT
- DATABASE_HOST
- DATABASE_NAME
- DATABASE_USERNAME
- DATABASE_PASSWORD
- DATABASE_SSL

Deployment to the cloud function is accomplished by pushing the source code. Secrets and env vars are set via terraform.

The workflow is currently set up to deploy to the staging instance when merging to develop.

#### Service account permissions

Access by Github to GCP is configured through special authorization rules, automatically set up by the Terraform `base` project above.
These permissions are necessary for the service account that runs the deployment:
- "roles/iam.serviceAccountTokenCreator",
- "roles/iam.serviceAccountUser",
- "roles/run.developer",
- "roles/artifactregistry.reader",
- "roles/artifactregistry.writer",
- "roles/cloudfunctions.developer"

## Maintenance

### Connecting to the Cloud SQL databases
51 changes: 35 additions & 16 deletions infrastructure/base/.terraform.lock.hcl


