From 6d36349a346d2c4c92e4e02f5ce03a7bb11392ba Mon Sep 17 00:00:00 2001
From: Niall Thomson <thomson.niall@gmail.com>
Date: Tue, 26 Mar 2024 17:18:14 +0000
Subject: [PATCH] Updated content authoring and tests documentation

---
 docs/authoring_content.md | 195 ++++++++++++++++++++++++--------------
 docs/automated_tests.md   |  26 +----
 2 files changed, 127 insertions(+), 94 deletions(-)

diff --git a/docs/authoring_content.md b/docs/authoring_content.md
index 69ab68f05..6a974ada5 100644
--- a/docs/authoring_content.md
+++ b/docs/authoring_content.md
@@ -2,6 +2,19 @@
 
 This guide outlines how to author content for the workshop, whether adding new content or modifying existing content.
 
+<!-- toc -->
+
+1. [Pre-requisites](#pre-requisites)
+1. [Create a work branch](#create-a-work-branch)
+1. [Environment setup](#environment-setup)
+1. [Planning your content](#planning-your-content)
+1. [Writing the markdown](#writing-the-markdown)
+1. [Writing the Terraform](#writing-the-terraform)
+1. [Cleaning up your lab](#cleaning-up-your-lab)
+1. [Testing](#testing)
+1. [Raising a Pull Request](#raising-a-pull-request)
+<!-- tocstop -->
+
 ## Pre-requisites
 
 The following pre-requisites are necessary to work on the content:
@@ -19,24 +32,22 @@ The following pre-requisites are necessary to work on the content:
 
 The first step is to create a working branch to create the content. There are two ways to do this depending on your access level:
 
-1. If you have `write` access to this repository you can clone it locally create a new branch directly
-2. Otherwise fork the repository, clone it and create a new branch
+1. (Preferred) Fork the repository, clone it and create a new branch
+2. If you have `write` access to this repository you can clone it locally and create a new branch directly
 
 Modifications to the workshop will only be accepted via Pull Requests.
 
-## Writing content
-
-Once you have a working branch on your local machine you can start writing the workshop content. The Markdown files for the content are all contained in the `website` directory of the repository. This directory is structured using the standard [Docusaurus directory layout](https://docusaurus.io/docs/installation#project-structure). It is recommended to use other modules as guidelines for format and structure.
+## Environment setup
 
-Please see the [style guide](./style_guide.md) documentation for specific guidance on how to write content so it is consistent across the workshop content.
+To start developing you'll need to run some initial commands.
 
-As you write the content you can run a live local server that renders the final web site on your local machine. First install the dependencies by running the following command in the root of the repository.
+First install the dependencies by running the following command in the root of the repository.
 
 ```
 make install
 ```
 
-Then, run the following command to start the development server:
+Once this is complete you can run the following command to start the development server:
 
 ```
 make serve
@@ -48,65 +59,15 @@ You can then access the content at `http://localhost:3000`.
 
 As you make changes to the Markdown content the site will refresh automatically, you will not need to re-run the command to re-load.
 
-### What if I need to install a component in the EKS cluster?
-
-Where possible the workshop content aims to avoid having users install components in the EKS cluster using Helm charts, Kubernetes manifests or other means. The goal of the workshop is to teach learners how to use components, not how to install them. As such, the default choice should be to align with the existing patterns for pre-installing all components in the EKS cluster using automation.
-
-Where possible the preference is to use Terraform and EKS Blueprints addons to install dependencies like Helm charts in the EKS cluster. There are a [number of addons](https://github.com/aws-ia/terraform-aws-eks-blueprints/tree/main/modules/kubernetes-addons) packaged with EKS Blueprints which can be used if your particular component is supported. You can see examples of how to install these addons for workshop content [here](../terraform/modules/cluster/addons.tf).
-
-If the component you require is not already supported by EKS Blueprints you can create a custom addon within this repository. You can see an example of creating a custom addon module [here](../terraform/modules/addons/descheduler/main.tf) and it is installed [here](../terraform/modules/cluster/addons.tf).
-
-#### Helm chart versions
-
-In order to keep up with new versions of Helm charts being published there is an automated mechanism used to monitor all Helm charts used in the workshop content that will raise PRs when new versions are published.
-
-In addition to adding a component to Terraform as outlined in the previous section you must also do the following:
-
-- Edit the file `helm/charts.yaml` and specify the Helm repository, chart name etc.
-- Edit the file `terraform/modules/cluster/helm_versions.tf.json` and specify the initial version, note the map name must match the `name` field from `charts.yaml` for your chart.
-
-By default the automated system will look for the latest version of any charts added, but you can control this by using the `constraint` field, which uses the [NPM semantic versioning](https://docs.npmjs.com/about-semantic-versioning) constraint syntax. Please use this sparingly, as any constraints used will require additional maintenance overhead to keep updated. This should mainly be used for charts where:
-
-- The latest chart versions are incompatible with the version of EKS in the content
-- The content requires significant changes to bring it inline with a new version
-
-Example constraint in `helm/charts.yaml`:
-
-```
-...
-- name: aws-load-balancer-controller
-  repository: https://aws.github.io/eks-charts
-  chart: aws-load-balancer-controller
-  constraint: '>=1.4.0 <1.5.0'
-...
-```
-
-### What if I need to change the AWS infrastructure like VPC, EKS configuration etc?
-
-Any content changes are expected to be accompanied by the any corresponding infrastructure changes in the same Pull Request.
-
-All Terraform configuration resides in the `terraform` directory, and is structured as follows:
+Some additional setup is optional but will help your PR get merged with fewer issues:
 
-- `modules/cluster` contains resources related to VPC, EKS and those used by workloads in EKS (IAM roles)
-- `modules/ide` contains resources related to the Cloud9 IDE and its bootstrapping
-- `cluster-only` is a small wrapper around `modules/cluster`
-- `full` invokes both modules and and connects them together, providing all necessary resources
-
-### What if my content need a new tool installed for the workshop user?
-
-The workshop content has various tools and utilities that are necessary to for the learner to complete it, the primary example being `kubectl` along with supporting tools like `jq` and `curl`.
-
-See `environment/Dockerfile` and `environment/installer.sh` to configure the installed utilities.
-
-## Testing
-
-All changes should be tested before raising a PR against the repository. There are two ways to test which can be used at different stages of your authoring process.
+- Install pre-commit and run `pre-commit install` so that the pre-commit hooks are run. This will perform basic checks on your changes.
 
 ### Creating the infrastructure
 
-When creating your content you will want to test the commands you specify against infrastructure that mirrors what will be used in the actual workshop by learners. All infrastructure (VPC, EKS cluster etc) is expressed as Terraform configuration in the `terraform` directory.
+When creating your content you will want to test the commands you specify against infrastructure that mirrors what will be used in the actual workshop by learners. This can easily be done locally and will use the cluster configuration in `./cluster/eksctl/cluster.yaml`.
 
-Ensure that your AWS credentials are set so Terraform is able to authenticate against your IAM account. Terraform will pull credentials from your `~/.aws/credentials` and `~/.aws/config` folders. You can find instructions [here](https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-files.html).
+Ensure that your AWS credentials are set so eksctl is able to authenticate against your IAM account. It will source credentials following the standard mechanism used by the likes of the AWS CLI, which you can find documented [here](https://docs.aws.amazon.com/cli/latest/userguide/cli-chap-authentication.html).
 
 You can then use the following convenience command to create the infrastructure:
 
@@ -114,21 +75,21 @@ You can then use the following convenience command to create the infrastructure:
 make create-infrastructure
 ```
 
-If you make any changes to the Terraform as part of your workshop content as outlined above you can run this command repeatedly to update the infrastructure incrementally.
-
 Once you're finished with the test environment you can delete all of the infrastructure using the following convenience command:
 
 ```
 make destroy-infrastructure
 ```
 
-### Manual testing
+### Simulating the workshop environment
+
+While creating the content it's likely you'll need to test commands interactively. During a real workshop users would do this in the Cloud9 IDE, but that is a poor experience for developing content quickly because it is designed to refresh content automatically from GitHub. As a result it is recommended to _NOT use the Cloud9 IDE_ created by the CloudFormation in this repository and instead use the flow below.
 
-When in the process of creating the content its likely you'll need to be fairly interactive in testing commands etc. For this theres a mechanism to easily create an interactive shell with access to the EKS cluster created by the Terraform, as well as including all the necessary tools and utilities without installing them locally.
+The repository provides a mechanism to easily create an interactive shell with access to the EKS cluster created by `make create-infrastructure`. This shell will automatically pick up changes to the content on your local machine and mirrors the Cloud9 IDE used in a real workshop in terms of tools and setup.
 
 To use this utility you must:
 
-- Already have created the workshop infrastructure as outlined in the section above
+- Have already run `make create-infrastructure`
 - Have some AWS credentials available in your current shell session (ie. you `aws` CLI must work)
 
 The shell session created will have AWS credentials injected, so you will immediately be able to use the `aws` CLI and `kubectl` commands with no further configuration:
@@ -139,7 +100,7 @@ If using [finch CLI](https://github.com/runfinch/finch) instead of `docker` CLI
 export CONTAINER_CLI=finch
 ```
 
-Run the `make shell`
+Run `make shell`:
 
 ```bash
 ➜  eks-workshop-v2 git:(main) ✗ make shell
@@ -148,7 +109,7 @@ Generating temporary AWS credentials...
 Building container images...
 sha256:cd6a00c814bd8fad5fe3bdd47a03cffbb3a6597d233636ed11155137f1506aee
 Starting shell in container...
-Added new context arn:aws:eks:us-west-2:111111111:cluster/eksw-env-cluster-eks to /root/.kube/config
+Added new context arn:aws:eks:us-west-2:111111111:cluster/eks-workshop to /root/.kube/config
 [root@43267b0ac0c8 /]$ aws eks list-clusters
 {
     "clusters": [
@@ -175,12 +136,102 @@ Depending on your Docker/Finch version, you might need to add a flag to enable [
 
 If your AWS credentials expire you can `exit` and restart the shell, which will not affect your cluster.
 
+## Planning your content
+
+An EKS Workshop lab generally consists of several components:
+
+1. The markdown content in `.md` files that contains the commands to run and explanations for the user
+1. Kubernetes manifests that will be referenced in (1) and usually applied to the EKS cluster
+1. Terraform configuration to customize the lab environment, for example installing extra components in the EKS cluster or provisioning AWS resources like a DynamoDB table or S3 bucket
+1. A shell script which will be used behind the scenes to clean up any changes made to the environment outside of the Terraform in (3) during the course of your lab
+
+Before you begin writing your content it is wise to plan out which of these you will require for your lab. You should refer to existing labs to see examples of patterns that are similar to your scenario.
+
+## Writing the markdown
+
+Once you have a working branch on your local machine you can start writing the workshop content. The Markdown files for the content are all contained in the `website/docs` directory of the repository. This directory is structured using the standard [Docusaurus directory layout](https://docusaurus.io/docs/installation#project-structure). It is recommended to use other modules as guidelines for format and structure.
+
+Please see the [style guide](./style_guide.md) documentation for specific guidance on how to write content so it is consistent across the workshop content.
+
+As you write the content you can use the live local server started above to check that it displays correctly.
+
+### What if my content needs a new tool installed in the workshop IDE?
+
+The workshop content relies on various tools and utilities that are necessary for the learner to complete it, the primary example being `kubectl` along with supporting tools like `jq` and `curl`.
+
+See `lab/Dockerfile` and `lab/scripts/installer.sh` to configure the installed utilities.
+
+## Writing the Terraform
+
+If Terraform is needed it should be created at `./manifests/modules/<yourpath>/.workshop/terraform`. This Terraform will be automatically applied when the user runs `prepare-environment` and destroyed when they move to the next lab.
+
+You can use the directory `manifests/template/.workshop/terraform` as a starter example. The Terraform is treated as a module and the variables in that directory must match exactly in order to meet the "contract" with the rest of the framework. See `vars.tf` and `outputs.tf`.
+
+### Variables
+
+Certain variables will be provided by the code that invokes your Terraform lab module. You can review these in the `vars.tf` file mentioned above. They include values such as the EKS cluster name, the cluster version and an "addon context" object which contains values such as the EKS cluster endpoint and OIDC issuer URL.
+
+### Outputs
+
+One optional output is expected: `environment_variables`. This is a map of environment variables that will be added to the user's IDE shell. For example:
+
+```hcl
+output "environment_variables" {
+  description = "Environment variables to be added to the IDE shell"
+  value       = {
+    MY_ENVIRONMENT_VARIABLE = "abc1234"
+  }
+}
+```
+
+### Terraform best practices
+
+The following are best practices for writing Terraform for a lab:
+
+1. Any AWS infrastructure provisioned should include the EKS cluster name in its name to avoid affecting the automated tests
+1. It shouldn't take more than 60 seconds for the Terraform to complete. Remember: the user will be waiting
+1. Anything installed (addons, Helm charts) should be pinned to explicit versions to avoid unexpected breakages
+
+## Cleaning up your lab
+
+An important part of EKS Workshop is the ability to run labs in any order, and to switch between them with minimal effort. To accomplish this we need to be able to clean up a lab so that the workshop environment is in a known, consistent state before starting the next lab.
+
+The `prepare-environment` command helps orchestrate this clean up by:
+
+1. Resetting the sample application back to its initial state
+1. Resetting the EKS Managed Node Groups back to their initial size
+1. Destroying all resources created via the Terraform
+1. Running a cleanup script provided by the lab
+
+As a workshop author the main unit of work is the cleanup script, which should be created at `./manifests/modules/<yourpath>/.workshop/cleanup.sh`. This should clean up all resources and changes made to the cluster during your lab content **outside of the Terraform configuration**.
+
+Some common examples include:
+
+- Deleting Kubernetes resources applied by the user
+- Removing Helm charts installed by the user
+- Removing EKS addons installed by the user
+- Deleting additional EKS Managed Node Groups created by the user
+
+It is also important that resources be removed conditionally and that errors not be silently swallowed. Failures should bubble up to the user, since they mean the environment is in an inconsistent state and may need to be fixed. As a result it is considered best practice to check that resources exist before deleting them, since this is what allows users to switch labs at any point.
+
+## Testing
+
+All changes should be tested before raising a PR against the repository. There are two ways to test which can be used at different stages of your authoring process.
+
+### Manual testing
+
+Using the `make shell` mechanism outlined above you can manually run through your workshop steps.
+
 ### Automated testing
 
-There is also an automated testing capability provided with the workshop that allows testing of the entire workshop flow as a unit test suite. This is useful once your content is stable and has been manually tested.
+There is an automated testing capability provided with the workshop that allows testing of workshop labs as a unit test suite. This is useful once your content is stable and has been manually tested.
+
+**Your content must be able to be tested in an automated manner. If this is not possible then the content will be rejected due to maintenance burden.**
 
 See this [doc](./automated_tests.md) for more information on automated tests.
 
-## Raise a Pull Request
+## Raising a Pull Request
+
+Once your content is complete and has been tested appropriately please raise a Pull Request to the `main` branch. This will trigger review processes before the content is merged. All status checks must pass before the PR will be merged. If a check fails then please review the error and attempt to resolve it. If you need assistance then leave a comment on the PR.
 
-Once your content is completed and is tested appropriately please raise a Pull Request to the `main` branch. This will trigger review processes before the content is merged.
+Please read the PR template carefully as it will provide guidance on providing a proper title, labels etc.
diff --git a/docs/automated_tests.md b/docs/automated_tests.md
index 85ed0be04..99a0deaca 100644
--- a/docs/automated_tests.md
+++ b/docs/automated_tests.md
@@ -10,9 +10,10 @@ This section documents running the tests locally as well as via PRs.
 
 You should run the tests locally before raising a PR, this can be done with some convenience scripts.
 
-To use this utility you must:
+To use this utility make sure you have:
 
-- Have some AWS credentials available in your current shell session (ie. you `aws` CLI must work)
+- Run `make create-infrastructure` as outlined [here](./authoring_content.md#creating-the-infrastructure)
+- Run `make shell` as outlined [here](./authoring_content.md#simulating-the-workshop-environment)
 
 First, ensure you have the workshop infrastructure running in your AWS account:
 
@@ -30,7 +31,7 @@ Building container images...
 sha256:62fb5cc6e348d854a59a5e00429554eab770adc24150753df2e810355e678899
 sha256:a6b3c8675c79804587b5e4d5b8dfc5cfd6d01b40c89ee181ad662902e0cb650d
 Running test suite...
-Added new context arn:aws:eks:us-west-2:111111111:cluster/eksw-env-cluster-eks to /root/.kube/config
+Added new context arn:aws:eks:us-west-2:111111111:cluster/eks-workshop to /root/.kube/config
 ✔ Generating test cases
 ✔ Building Mocha suites
 
@@ -101,25 +102,6 @@ Finally, once you are done if needed you can destroy the infrastructure:
 make destroy-infrastructure
 ```
 
-### Pull Requests
-
-**Note:** This section is for repository maintainers
-
-By default the end-to-end tests will not run against a PR because:
-
-- The PR should be reviewed first
-- Tests take time to execute so only selective modules should be run
-
-The first thing to do is to assess which top-level modules should be tested, which are the directories contained in `website/docs` (`observability`, `autoscaling` etc). Each of the top level modules has a corresponding label which can be applied to PRs, for example if you want to test the Security module there is a label `content/security`, Observability has the label `content/observability` and so on.
-
-If you do not apply any content labels then only the Introduction and Cleanup modules will be run, which can be useful for just testing Terraform changes. Note: The Introduction and Cleanup modules will **always** be run.
-
-Once you have added the required `content/*` labels to the PR then apply the label `automation/e2e-tests`. This label is what triggers the tests to run against the PR, and it will detect all of the content labels previously added to determine what tests to run.
-
-The test suite will generally take **at least** 30 minutes to complete but will generally be more.
-
-As long as the `automation/e2e-tests` label is applied the test suite will re-run any time there is a push to the branch associated with the PR. Removing the label will stop this behavior.
-
 ## Writing tests
 
 The automated tests primarily work by extracting all of the `bash` code snippets out of the Markdown files and executing them in order. It is aware of `weight` and `sidebar_position` so does not need any further help to figure this out. This means that by adhering to the recommendations in the style guide your commands will have tests generated for them without any further action from you.