# Deploying sites and nodes

## Introduction

The stack is managed using the following tools: Ansible, Helm, kubectl, and the
AWS CLI. This document explains how to add new nodes and how to install and
upgrade sites.

## Installing/Updating sites

Each site is contained within a Kubernetes namespace, and each service is
installed via a Helm chart. To simplify management, shell scripts have been
created and are stored in the `/opt/helm_values/scripts` directory on the dc
server. The Helm configuration for each site is stored in a directory in
`/opt/helm_values`, with subdirectories for each service.

The file tree below shows the files used to manage the sites.
```
/opt/helm_values
├── dc
│   ├── activemq
│   │   └── affinity.yaml -> ../shared/affinity.yaml
│   ├── alpaca
│   │   └── affinity.yaml -> ../shared/affinity.yaml
│   ├── cantaloupe
│   │   ├── affinity.yaml -> ../shared/affinity.yaml
│   │   └── ingress.yaml -> ../shared/ingress.yaml
│   ├── charts.yaml
│   ├── clamav
│   │   └── affinity.yaml -> ../shared/affinity.yaml
│   ├── crayfish
│   │   └── affinity.yaml -> ../shared/affinity.yaml
│   ├── drupal
│   │   ├── affinity.yaml -> ../shared/affinity.yaml
│   │   ├── base.yaml
│   │   ├── ingress.yaml -> ../shared/ingress.yaml
│   │   ├── saml.yaml
│   │   └── values.yaml
│   ├── memcache
│   │   └── affinity.yaml -> ../shared/affinity.yaml
│   ├── postgres
│   │   └── affinity.yaml -> ../shared/affinity.yaml
│   ├── secrets
│   │   └── values.yaml
│   ├── shared
│   │   ├── affinity.yaml
│   │   └── ingress.yaml
│   └── solr
│       └── affinity.yaml -> ../shared/affinity.yaml
└── scripts
    ├── export-config.sh
    ├── fix-perms.sh
    ├── update-all.sh
    └── update-helm.sh
```

### Scripts

#### `update-helm.sh`

Given a service/Helm installation name, a Helm chart reference, and a
site/namespace, this script will upgrade or install that chart. For example,
running `./scripts/update-helm.sh cantaloupe dgi/cantaloupe dc` will install
the cantaloupe chart in the dc namespace. If the script detects any changes, it
will display them and prompt the user to apply them.
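
The invocation pattern is the same for every service; a usage sketch based on
the example above (both charts appear in `dc/charts.yaml`):

```bash
# Usage: update-helm.sh <installation-name> <chart-reference> <namespace>
./scripts/update-helm.sh cantaloupe dgi/cantaloupe dc
./scripts/update-helm.sh drupal dgi/drupal dc
```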

#### `update-all.sh`

This script runs `update-helm.sh` against all of a site's services. It builds
the list of services from the `$ns/charts.yaml` file, which contains an object
called `charts` whose keys are the service/installation names and whose values
contain a Helm chart reference.

For example, running `update-all.sh dc` will install all the services for the
dc site declared in the file `dc/charts.yaml`:

```yaml
charts:
  secrets:
    chart: dgi/aws-secrets
  activemq:
    chart: dgi/activemq
  alpaca:
    chart: dgi/alpaca
  cantaloupe:
    chart: dgi/cantaloupe
  clamav:
    chart: dgi/clamav
  memcache:
    chart: dgi/memcache
  postgres:
    chart: dgi/postgres
  drupal:
    chart: dgi/drupal
```
The set of charts installed should not have to change across sites.

#### `export-config.sh`

This script exports the config for the provided site and stores it in a
tarball.

For example, running `export-config.sh dc` will export the config to
`dc/config` and compress it to `dc/config.tar.gz`.
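
A typical run on the dc server would look like this (illustrative):

```bash
cd /opt/helm_values
./scripts/export-config.sh dc
# dc/config         exported configuration
# dc/config.tar.gz  compressed archive, ready to copy off the server
```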

#### `fix-perms.sh`

This is a helper that grants write access to the `microk8s` group so that all
Kubernetes admins can edit the files. It needs to be run with root privileges.
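
Assuming the script takes no arguments, a run would look like:

```bash
# Run as root so group ownership and permissions can be changed
sudo ./scripts/fix-perms.sh
```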

### Configuration

The Helm charts take in YAML files to customize what is deployed. Each site has
a directory containing subdirectories for each service where the YAML files are
stored. The `update-all.sh` and `update-helm.sh` scripts automatically
reference all the files in a service's directory.
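
Conceptually, the scripts assemble a `helm upgrade --install` call that passes
every values file in the service's directory. A hedged sketch for the dc site's
drupal service (the exact flags the scripts use are an assumption):

```bash
helm upgrade --install drupal dgi/drupal --namespace dc \
  --values dc/drupal/affinity.yaml \
  --values dc/drupal/base.yaml \
  --values dc/drupal/ingress.yaml \
  --values dc/drupal/saml.yaml \
  --values dc/drupal/values.yaml
```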

Some of the files are symlinks, to avoid repeating configuration that is used
by multiple services. These files are stored in `$ns/shared`.
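
For example, based on the file tree above, a service directory picks up the
shared affinity config with a relative symlink:

```bash
cd /opt/helm_values/dc/solr
ln -s ../shared/affinity.yaml affinity.yaml
```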

The sections below explain the required configurations.

#### Shared

The configurations in this directory are used by multiple services by being
symlinked.

`affinity.yaml` is required for all services that contain compute resources
(everything other than secrets). It is used to select the correct node for a
site to run on.

```yaml
---
nodeSelector:
  # Set to the hostname of the node the site should run on
  kubernetes.io/hostname: beln-arca-dc.dc.sfu.ca
```

`ingress.yaml` is used by the services which require ingress; currently that is
only cantaloupe and drupal. It needs to be updated with the hostname of the
site.

```yaml
ingress:
  # Set the hostname for the site
  - host: dc-i2.arcabc.ca
tls:
  issuer: letsencrypt-prod
```

#### Drupal

The drupal directory contains multiple values files. `affinity.yaml` and
`ingress.yaml` are the symlinks to the shared configs described above.

`base.yaml` contains configuration that should not need to change across sites
but is specific to the cluster.

```yaml
smtp:
  host: mailhost.sfu.ca
  port: "25"
rwxStorageClass: nfs-csi
fedoraVolume:
  nfs:
    path: /colo/arca_be_data/fedoraData
    readOnly: true
    server: bbysvm-tier1.its.sfu.ca
```

`values.yaml` contains configuration that will be updated for each site.

The two main values that will need to be updated are `FEDORA_OBJECT_PATH`,
which should point at the mounted path of the objectStore for the site, and
`image.tag`, which sets which version of the drupal image is deployed.

```yaml
additionalEnvVars:
  - name: http_proxy
    value: http://bby-vcontrol-proxy.its.sfu.ca:8080
  - name: https_proxy
    value: http://bby-vcontrol-proxy.its.sfu.ca:8080
  - name: no_proxy
    value: cluster.local,arca-stage.its.sfu.ca
  - name: LOG_DIR
    value: /opt/ingest_data/migration
  - name: DRUSH
    value: drush
  # Must be updated per site
  - name: FEDORA_OBJECT_PATH
    value: /data/fedora-data/dc/objectStore
image:
  pullSecret: regcred
  repository: 231224489621.dkr.ecr.us-east-1.amazonaws.com/drupal-bceln
  tag: 1.51.2
```

`saml.yaml` contains the configuration for SimpleSAMLphp and needs to be
configured per site to use SAML login. Due to the required AWS permissions, the
values for the DGI SSO will need to be provided by DGI.

#### Secrets

The secrets chart needs to be configured with the secrets prefix that was used
when running `gen-secrets.sh`:

```yaml
# Needs to be set for each site and match what was used in the secrets creation script
secretPrefix: prod/bceln
secretStore:
  create: false
  name: aws-secrets
  type: ClusterSecretStore
```

### Installation

#### Secret generation

The site requires secrets to be generated and stored in AWS. To generate the
secrets, use
[`gen-secrets.sh`](https://github.com/discoverygarden/helm-charts/blob/main/charts/aws-secrets/gen-secrets.sh)
from the DGI helm-charts repo.

To run the script on your workstation, you need access to the containerprod
account with permissions to create secrets.
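
Before generating anything, you can confirm your workstation credentials
resolve to the right account with a standard AWS CLI call:

```bash
# Should report an identity in the containerprod account
aws sts get-caller-identity
```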

Run the script with the secret prefix to generate the secrets.

Ex: `./gen-secrets.sh bceln/siteName`

#### Configuration and installation

On the dc server, cd to `/opt/helm_values` and copy an existing site's
directory as a starting point. Then update the site's configuration based on
the [Configuration](#configuration) section of this document.

To install the site run `./scripts/update-all.sh siteName`.

Ex: For a site named foo
```bash
cd /opt/helm_values
cp -r dc foo
# make required configuration changes
vim foo/shared/affinity.yaml
./scripts/update-all.sh foo
```

### Upgrading

Running `update-helm.sh` and `update-all.sh` will deploy the latest release to
a service. However, some more care is required when updating drupal.

Before updating drupal, the configuration should be exported and merged back
in. The process involves checking out the currently deployed tag locally,
replacing the drupal configs with what is running in production, and then
creating a pull request.

Example:
```bash
ssh beln-arca-dc.dc.sfu.ca
cd /opt/helm_values
# export the configs
./scripts/export-config.sh dc
# get the currently deployed tag
./scripts/get-tag.sh dc
exit
# cd to where the repo exists locally
cd bceln-drupal
# checkout the tag that is currently deployed into a prod reconcile branch
git checkout -b dc-reconcile vTagFromGetTag
# Delete existing config to get a clean diff
rm -rf config
# Get exported configs from the server and extract them locally.
scp beln-arca-dc.dc.sfu.ca:/opt/helm_values/dc/config.tar.gz .
tar -xf config.tar.gz
rm config.tar.gz
# Commit and push
git add config
git commit -m 'DC reconcile'
git push --set-upstream origin dc-reconcile
# Create the pull request, can also be done from the GitHub website
gh pr create --title="DC reconcile" --body="" --label="patch"
```

Once the pull request has been merged and the new image built, the drupal
installation can be updated.

When updating drupal, a backup of the database will be taken and config will be
imported.
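
Using `update-helm.sh` as described earlier, the drupal update for the dc site
would look like this (chart reference taken from `dc/charts.yaml`):

```bash
cd /opt/helm_values
./scripts/update-helm.sh drupal dgi/drupal dc
```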

## Deploying nodes

We use Ansible to deploy microk8s to new nodes. The playbook's repository
contains the full documentation on how to create a cluster from scratch. Since
the cluster has already been created, this document explains how to add
additional nodes to the existing cluster.

## Local requirements

Before running the playbook, the following requirements must be installed.

- [ansible](https://docs.ansible.com/ansible/latest/installation_guide/intro_installation.html)
- [Python boto3](https://boto3.amazonaws.com/v1/documentation/api/latest/guide/quickstart.html#installation)
- [aws cli](https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html)

Your workstation must also be configured to access the containerprod account
with read access to the required secrets. DGI will provide you with the account
access.

## Adding the Node with Ansible

First, make sure you have added your ssh keys to the bastion and the new node,
and that when you ssh in you are only prompted for an MFA code.

To add the node to the cluster in Ansible, add the hostname under
`[bceln_prod]` in `inventory/prod`.
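
The inventory entry is just the hostname appended to the group (the new
hostname below is a placeholder):

```ini
[bceln_prod]
beln-arca-dc.dc.sfu.ca
new.node.hostname
```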

To test the connection and confirm the node is in the correct group, run the
following command with the new node's hostname in place of dc's, entering your
password and MFA code when prompted. You should see a `SUCCESS` response like
the one below.

```bash
ansible -i inventory/prod -m ping --become --ask-become-pass 'bceln_prod:&beln-arca-dc.dc.sfu.ca'
BECOME password:
(user@bastion) Enter your SFU MFA code: 111111
beln-arca-dc.dc.sfu.ca | SUCCESS => {
    "changed": false,
    "ping": "pong"
}
```

To provision the node with Ansible, run the following, replacing the hostname
with that of the new node:

```bash
ansible-playbook microk8s.yml -i inventory/prod --diff --ask-become-pass -l new.node.hostname
```

Once microk8s has been provisioned, add the node to the cluster by running
`microk8s add-node` on the dc node, and running the generated `microk8s join`
command on the new node. To verify that the node has been added, run `kubectl
get nodes` from the dc node.
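
A sketch of that exchange (the join address and token below are placeholders;
use the exact command printed by `add-node`):

```bash
# On the dc node: generate a one-time join token
microk8s add-node

# On the new node: run the join command printed above
microk8s join 10.0.0.1:25000/<token-from-add-node>

# Back on the dc node: verify the new node appears and becomes Ready
kubectl get nodes
```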
