new: Added a new lab 'Mountpoint for S3' #1176

Merged 37 commits into main from s3module on Dec 20, 2024
9800fe4
Work in progress to create S3 workshop, pre-commit hooks skipped
Aug 23, 2024
4e47ce2
Work in progress to create S3 workshop, pre-commit hooks skipped
Aug 23, 2024
a788054
Aligned files with workshop
Aug 23, 2024
a5ef29b
Working through lab, need to fix mounting part
Sep 10, 2024
4760230
Experimenting with different deployments to scale replica
Sep 17, 2024
bf4e89e
Simplified file structure for S3 module
Sep 17, 2024
649b065
Merge branch 'aws-samples:main' into s3module
rchandra20 Sep 17, 2024
b45e5a7
Container permission issue is occurring that is not allowing the file…
Sep 25, 2024
2f0e5ad
Permissions issues fixed by renaming Terraform resources and able to …
Sep 26, 2024
e4b9566
Updated lab to reflect new paradigm of uploading to S3 bucket first a…
Sep 26, 2024
1e7f8c1
S3 Module Lab works correctly in this version of the repository
Oct 1, 2024
64b9635
Merge branch 'aws-samples:main' into s3module
rchandra20 Oct 1, 2024
a959450
Polished up README.md before submitting module for user testing
Oct 1, 2024
f8ec02f
Merge branch 'aws-samples:main' into s3module
rchandra20 Oct 1, 2024
7918f54
add mountpoint for s3 workshop content
pengc99 Oct 14, 2024
50e69c8
Update 30-persistent-object-storage-with-mountpoint-for-amazon-s3.md
pengc99 Oct 14, 2024
29740d6
Merge branch 'aws-samples:main' into s3module
rchandra20 Oct 15, 2024
5dbb80f
Minor edit to S3 lab cleanup file
Oct 15, 2024
4563021
updated revision
pengc99 Oct 15, 2024
065be36
updated assets-s3.webp
pengc99 Oct 15, 2024
89cf356
Update 20-introduction-to-mountpoint-for-amazon-s3.md
pengc99 Oct 28, 2024
e7fb902
Update 30-persistent-object-storage-with-mountpoint-for-amazon-s3.md
pengc99 Oct 28, 2024
8786a88
Merge branch 'aws-samples:main' into s3module
rchandra20 Oct 29, 2024
0db327c
Merge branch 'aws-samples:main' into s3module
rchandra20 Nov 4, 2024
c844386
Updated Markdown for 'mountpoint for s3' module to pass automated tests
Nov 4, 2024
527df7a
Removed unnecessary files
Nov 4, 2024
aae29c4
Making changes to fix PR issues
Nov 4, 2024
25ecd54
Ran prettier on index.md in lab
Nov 4, 2024
992276a
Merge branch 'aws-samples:main' into s3module
rchandra20 Nov 5, 2024
8a72068
Merging
Nov 5, 2024
08d93f9
Fixing spelling errors and Terraform validation errors
Nov 5, 2024
b9e0f10
Removed comments on s3pvclaim.yaml
Dec 16, 2024
191ff54
Changed from region to region=
Dec 16, 2024
8c17ec4
Trimmed down deployment section in 'Ephemeral Container Storage'
Dec 16, 2024
f924839
Used YAML UI component, described PVC, and trimmed deployment informa…
Dec 16, 2024
21364ae
Use AWS CLI instead of eksctl, general language review
niallthomson Dec 20, 2024
6dd1297
Fix formatting
niallthomson Dec 20, 2024
3 changes: 2 additions & 1 deletion .spelling
@@ -127,4 +127,5 @@ sheetal
joshi
keda
AIML
DCGM
Mountpoint
35 changes: 35 additions & 0 deletions manifests/modules/fundamentals/storage/s3/.workshop/cleanup.sh
@@ -0,0 +1,35 @@
#!/bin/bash

# Clean up anything the user has created after prepare-environment

set -e

logmessage "Deleting assets-images folder..."

# Delete local directory of image files
rm -rf ~/environment/assets-images/

logmessage "Scaling down assets deployment..."

# Scale down assets
kubectl scale -n assets --replicas=0 deployment/assets

# Check if the S3 CSI driver addon exists
addon_exists=$(aws eks list-addons --cluster-name $EKS_CLUSTER_NAME --query "addons[? @ == 'aws-mountpoint-s3-csi-driver']" --output text)

if [ ! -z "$addon_exists" ]; then
  # Delete the addon if it exists and wait for removal to complete
  logmessage "Deleting S3 CSI driver addon..."

  aws eks delete-addon --cluster-name $EKS_CLUSTER_NAME --addon-name aws-mountpoint-s3-csi-driver

  aws eks wait addon-deleted --cluster-name $EKS_CLUSTER_NAME --addon-name aws-mountpoint-s3-csi-driver
fi

logmessage "Deleting PV and PVC that were created..."

# Delete PVC
kubectl delete pvc s3-claim -n assets --ignore-not-found=true

# Delete PV
kubectl delete pv s3-pv --ignore-not-found=true
@@ -0,0 +1,51 @@
# Create S3 bucket
resource "aws_s3_bucket" "mountpoint_s3" {

bucket_prefix = "${var.addon_context.eks_cluster_id}-mountpoint-s3"
force_destroy = true
}

# Create S3 CSI Driver IAM Role and associated policy
module "mountpoint_s3_csi_driver_irsa" {
source = "terraform-aws-modules/iam/aws//modules/iam-role-for-service-accounts-eks"
version = "5.39.1"

# Create prefixes
role_name_prefix = "${var.addon_context.eks_cluster_id}-s3-csi-"
policy_name_prefix = "${var.addon_context.eks_cluster_id}-s3-csi-"

# IAM policy to attach to driver
attach_mountpoint_s3_csi_policy = true

mountpoint_s3_csi_bucket_arns = [aws_s3_bucket.mountpoint_s3.arn]
mountpoint_s3_csi_path_arns = ["${aws_s3_bucket.mountpoint_s3.arn}/*"]

oidc_providers = {
main = {
provider_arn = var.addon_context.eks_oidc_provider_arn
namespace_service_accounts = ["kube-system:s3-csi-driver-sa"]
}
}

tags = var.tags

force_detach_policies = true
}

resource "aws_iam_role_policy" "eks_workshop_ide_s3_put_access" {
name = "eks-workshop-ide-s3-put-access"
role = "eks-workshop-ide-role"

policy = <<EOF
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": "s3:PutObject",
"Resource": "${aws_s3_bucket.mountpoint_s3.arn}/*"
}
]
}
EOF
}
@@ -0,0 +1,8 @@
output "environment_variables" {
description = "Environment variables to be added to the IDE shell"
value = {
S3_CSI_ADDON_ROLE = module.mountpoint_s3_csi_driver_irsa.iam_role_arn
BUCKET_NAME = aws_s3_bucket.mountpoint_s3.id
EKS_CLUSTER_NAME = var.eks_cluster_id
}
}
@@ -0,0 +1,35 @@
# tflint-ignore: terraform_unused_declarations
variable "eks_cluster_id" {
description = "EKS cluster name"
type = string
}

# tflint-ignore: terraform_unused_declarations
variable "eks_cluster_version" {
description = "EKS cluster version"
type = string
}

# tflint-ignore: terraform_unused_declarations
variable "cluster_security_group_id" {
description = "EKS cluster security group ID"
type = any
}

# tflint-ignore: terraform_unused_declarations
variable "addon_context" {
description = "Addon context that can be passed directly to blueprints addon modules"
type = any
}

# tflint-ignore: terraform_unused_declarations
variable "tags" {
description = "Tags to apply to AWS resources"
type = any
}

# tflint-ignore: terraform_unused_declarations
variable "resources_precreated" {
description = "Have expensive resources been created already"
type = bool
}
@@ -0,0 +1,17 @@
apiVersion: apps/v1
kind: Deployment
metadata:
name: assets
spec:
replicas: 2
template:
spec:
containers:
- name: assets
volumeMounts:
- name: mountpoint-s3
mountPath: /mountpoint-s3
volumes:
- name: mountpoint-s3
persistentVolumeClaim:
claimName: s3-claim
@@ -0,0 +1,7 @@
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- ../../../../../base-application/assets
- s3pvclaim.yaml
patches:
- path: deployment.yaml
@@ -0,0 +1,34 @@
apiVersion: v1
kind: PersistentVolume
metadata:
name: s3-pv
spec:
capacity:
storage: 1Gi
accessModes:
- ReadWriteMany
mountOptions:
- allow-delete
- allow-other
- uid=999
- gid=999
- region=us-west-2
csi:
driver: s3.csi.aws.com
volumeHandle: s3-csi-driver-volume
volumeAttributes:
bucketName: $BUCKET_NAME
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: s3-claim
namespace: assets
spec:
accessModes:
- ReadWriteMany
storageClassName: ""
resources:
requests:
storage: 1Gi
volumeName: s3-pv
@@ -0,0 +1,101 @@
---
title: Ephemeral Container Storage
sidebar_position: 10
---

In this section, we'll explore how to handle storage in Kubernetes Deployments using a simple image hosting example. We'll start with an existing Deployment from our sample store application and modify it to serve as an image host. The assets microservice runs a web server on EKS, making it a good example for demonstrating Deployments, which enable **horizontal scaling** and **declarative state management** of Pods.

The assets component serves static product images from a container. These images are bundled into the container during the build process. However, this approach has a limitation: when new images are added to one container, they don't automatically appear in the others. To address this, we'll implement a solution using [Amazon S3](https://docs.aws.amazon.com/AmazonS3/latest/userguide/Welcome.html) and a Kubernetes [Persistent Volume](https://kubernetes.io/docs/concepts/storage/persistent-volumes/) to create a shared storage environment, allowing multiple web server containers to serve assets while scaling to meet demand.

Let's examine the current Deployment's volume configuration:

> **Contributor:** I think we can shorten the amount of information here, it's a lot not related to the volumes etc.
>
> **rchandra20** (Contributor Author) · Dec 16, 2024

```bash
$ kubectl describe deployment -n assets
Name:       assets
Namespace:  assets
[...]
  Containers:
   assets:
    Image:      public.ecr.aws/aws-containers/retail-store-sample-assets:0.4.0
    Port:       8080/TCP
    Host Port:  0/TCP
    Limits:
      memory:  128Mi
    Requests:
      cpu:     128m
      memory:  128Mi
    Liveness:  http-get http://:8080/health.html delay=0s timeout=1s period=3s #success=1 #failure=3
    Environment Variables from:
      assets  ConfigMap  Optional: false
    Environment:  <none>
    Mounts:
      /tmp from tmp-volume (rw)
  Volumes:
   tmp-volume:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:     Memory
    SizeLimit:  <unset>
[...]
```

Looking at the [`Volumes`](https://kubernetes.io/docs/concepts/storage/volumes/#emptydir-configuration-example) section, we can see that the Deployment currently uses an [`emptyDir` volume type](https://kubernetes.io/docs/concepts/storage/volumes/#emptydir) that exists only for the Pod's lifetime.

![Assets with emptyDir](./assets/assets-emptydir.webp)

An `emptyDir` volume is created when a Pod is assigned to a node and persists only while that Pod runs on that node. As its name suggests, the volume starts empty. While all containers within the Pod can read and write files in the `emptyDir` volume (even when mounted at different paths), **when a Pod is removed from a node for any reason, the data in the `emptyDir` is deleted permanently.** This makes `emptyDir` unsuitable for sharing data between multiple Pods in the same Deployment when that data needs to persist.
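
For reference, this is the general shape of an `emptyDir` volume in a Pod spec. This is a minimal sketch with illustrative names, not taken from the assets manifests:

```bash
$ kubectl apply -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: emptydir-example # illustrative name, not part of this lab
spec:
  containers:
    - name: app
      image: public.ecr.aws/docker/library/nginx:latest # any web server image works here
      volumeMounts:
        - name: scratch
          mountPath: /scratch # the volume appears at this path inside the container
  volumes:
    - name: scratch
      emptyDir:
        medium: Memory # optional: tmpfs-backed, like the tmp-volume above
        sizeLimit: 128Mi
EOF
```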

The container comes with some initial product images, which are copied during the build process to `/usr/share/nginx/html/assets`. We can verify this by running:

```bash
$ kubectl exec --stdin deployment/assets \
-n assets -- bash -c "ls /usr/share/nginx/html/assets/"
chrono_classic.jpg
gentleman.jpg
pocket_watch.jpg
smart_1.jpg
smart_2.jpg
wood_watch.jpg
```

To demonstrate the limitations of `emptyDir` storage, let's scale up the `assets` Deployment to multiple replicas:

```bash
$ kubectl scale -n assets --replicas=2 deployment/assets
deployment.apps/assets scaled

$ kubectl rollout status -n assets deployment/assets --timeout=60s
deployment "assets" successfully rolled out
```

Now, let's add a new product image called `divewatch.jpg` to the `/usr/share/nginx/html/assets` directory of the first Pod and verify it exists:

```bash
$ POD_NAME=$(kubectl -n assets get pods -o jsonpath='{.items[0].metadata.name}')
$ kubectl exec --stdin $POD_NAME \
-n assets -- bash -c 'touch /usr/share/nginx/html/assets/divewatch.jpg'
$ kubectl exec --stdin $POD_NAME \
-n assets -- bash -c 'ls /usr/share/nginx/html/assets'
chrono_classic.jpg
divewatch.jpg <-----------
gentleman.jpg
pocket_watch.jpg
smart_1.jpg
smart_2.jpg
wood_watch.jpg
```

Let's check if the new product image `divewatch.jpg` appears in the second Pod:

```bash
$ POD_NAME=$(kubectl -n assets get pods -o jsonpath='{.items[1].metadata.name}')
$ kubectl exec --stdin $POD_NAME \
-n assets -- bash -c 'ls /usr/share/nginx/html/assets'
chrono_classic.jpg
gentleman.jpg
pocket_watch.jpg
smart_1.jpg
smart_2.jpg
wood_watch.jpg
```

As we can see, `divewatch.jpg` doesn't exist in the second Pod. This demonstrates why we need a shared filesystem that persists across multiple Pods when scaling horizontally, allowing file updates without requiring redeployment.
@@ -0,0 +1,78 @@
---
title: Mountpoint for Amazon S3
sidebar_position: 20
---

Before proceeding with this section, it's important to understand the Kubernetes storage concepts (volumes, persistent volumes (PV), persistent volume claims (PVC), dynamic provisioning, and ephemeral storage) that were covered in the [Storage](../index.md) main section.

The [Mountpoint for Amazon S3 Container Storage Interface (CSI) Driver](https://github.com/awslabs/mountpoint-s3-csi-driver) enables Kubernetes applications to access Amazon S3 objects using a standard file system interface. Built on [Mountpoint for Amazon S3](https://github.com/awslabs/mountpoint-s3), the Mountpoint CSI driver exposes an Amazon S3 bucket as a storage volume that containers in your Kubernetes cluster can access. The driver implements the [CSI](https://github.com/container-storage-interface/spec/blob/master/spec.md) specification, allowing container orchestrators (CO) to manage storage volumes effectively.
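
To build intuition for what the driver does, the standalone `mount-s3` command from Mountpoint for Amazon S3 can expose a bucket as a local directory. A rough sketch follows (the bucket name and mount path are placeholders, and this is not a step in the lab):

```bash
# Mount a bucket at a local path, then browse its objects as files
$ mkdir /tmp/my-bucket
$ mount-s3 my-example-bucket /tmp/my-bucket
$ ls /tmp/my-bucket
```

The CSI driver performs an equivalent mount on each node on behalf of the Pods that reference the volume.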

The following architecture diagram illustrates how we will use Mountpoint for Amazon S3 as persistent storage for our EKS pods:

![Assets with S3](./assets/assets-s3.webp)

Let's begin by creating a staging directory for the images needed by our image hosting web application:

```bash
$ mkdir ~/environment/assets-images/
$ cd ~/environment/assets-images/
$ curl --remote-name-all https://raw.githubusercontent.com/aws-containers/retail-store-sample-app/main/src/assets/public/assets/{chrono_classic.jpg,gentleman.jpg,pocket_watch.jpg,smart_2.jpg,wood_watch.jpg}
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 98157 100 98157 0 0 242k 0 --:--:-- --:--:-- --:--:-- 242k
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 58439 100 58439 0 0 214k 0 --:--:-- --:--:-- --:--:-- 214k
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 58655 100 58655 0 0 260k 0 --:--:-- --:--:-- --:--:-- 260k
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 20795 100 20795 0 0 96273 0 --:--:-- --:--:-- --:--:-- 96273
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 43122 100 43122 0 0 244k 0 --:--:-- --:--:-- --:--:-- 243k
$ ls
chrono_classic.jpg gentleman.jpg pocket_watch.jpg smart_2.jpg wood_watch.jpg
```

Next, we'll copy these image assets to our S3 bucket using the `aws s3 cp` command:

```bash
$ cd ~/environment/
$ aws s3 cp ~/environment/assets-images/ s3://$BUCKET_NAME/ --recursive
upload: assets-images/smart_2.jpg to s3://eks-workshop-mountpoint-s320241014192132282600000002/smart_2.jpg
upload: assets-images/wood_watch.jpg to s3://eks-workshop-mountpoint-s320241014192132282600000002/wood_watch.jpg
upload: assets-images/gentleman.jpg to s3://eks-workshop-mountpoint-s320241014192132282600000002/gentleman.jpg
upload: assets-images/pocket_watch.jpg to s3://eks-workshop-mountpoint-s320241014192132282600000002/pocket_watch.jpg
upload: assets-images/chrono_classic.jpg to s3://eks-workshop-mountpoint-s320241014192132282600000002/chrono_classic.jpg
```

We can verify the uploaded objects in our bucket using the `aws s3 ls` command:

```bash
$ aws s3 ls $BUCKET_NAME
2024-10-14 19:29:05 98157 chrono_classic.jpg
2024-10-14 19:29:05 58439 gentleman.jpg
2024-10-14 19:29:05 58655 pocket_watch.jpg
2024-10-14 19:29:05 20795 smart_2.jpg
2024-10-14 19:29:05 43122 wood_watch.jpg
```

With our initial objects now in the Amazon S3 bucket, we'll configure the Mountpoint for Amazon S3 CSI driver to provide persistent, shared storage for our pods.

Let's install the Mountpoint for Amazon S3 CSI driver addon in our EKS cluster. This operation will take a few minutes to complete:

```bash
$ aws eks create-addon --cluster-name $EKS_CLUSTER_NAME --addon-name aws-mountpoint-s3-csi-driver \
--service-account-role-arn $S3_CSI_ADDON_ROLE
$ aws eks wait addon-active --cluster-name $EKS_CLUSTER_NAME --addon-name aws-mountpoint-s3-csi-driver
```

Once completed, we can verify what the addon created in our EKS cluster:

```bash
$ kubectl get daemonset s3-csi-node -n kube-system
NAME          DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR            AGE
s3-csi-node   3         3         3       3            3           kubernetes.io/os=linux   61s
```
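
As an additional check, we can query the addon status directly, which should report `ACTIVE` once the install completes (a sketch using the same addon name as above):

```bash
$ aws eks describe-addon --cluster-name $EKS_CLUSTER_NAME \
  --addon-name aws-mountpoint-s3-csi-driver --query 'addon.status' --output text
ACTIVE
```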