feat: Apache Superset feature support (#447)
Co-authored-by: Vara Bonthu <[email protected]>
Co-authored-by: avasu80 <[email protected]>
3 people authored Apr 29, 2024
1 parent 3c6f7ae · commit 99e5440
Showing 15 changed files with 1,636 additions and 0 deletions.
52 changes: 52 additions & 0 deletions analytics/terraform/superset-on-eks/README.md
@@ -0,0 +1,52 @@
## Requirements

For security reasons, the ALB is deployed as an internal load balancer; it can be switched to internet-facing during deployment if needed.
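
A quick way to verify which scheme was applied (a sketch, assuming the kubeconfig is already set up; `superset-ingress3` is the ingress name from `addons.tf`):

```sh
kubectl -n superset get ingress superset-ingress3 \
  -o jsonpath='{.metadata.annotations.alb\.ingress\.kubernetes\.io/scheme}'
```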
## Providers

| Name | Version |
|------|---------|
| <a name="provider_aws"></a> [aws](#provider\_aws) | 5.36.0 |
| <a name="provider_helm"></a> [helm](#provider\_helm) | 2.12.1 |
| <a name="provider_kubernetes"></a> [kubernetes](#provider\_kubernetes) | 2.25.2 |
| <a name="provider_null"></a> [null](#provider\_null) | 3.2.2 |

## Modules

| Name | Source | Version |
|------|--------|---------|
| <a name="module_ebs_csi_driver_irsa"></a> [ebs\_csi\_driver\_irsa](#module\_ebs\_csi\_driver\_irsa) | terraform-aws-modules/iam/aws//modules/iam-role-for-service-accounts-eks | ~> 5.20 |
| <a name="module_eks"></a> [eks](#module\_eks) | terraform-aws-modules/eks/aws | ~> 19.15 |
| <a name="module_eks_blueprints_addons"></a> [eks\_blueprints\_addons](#module\_eks\_blueprints\_addons) | aws-ia/eks-blueprints-addons/aws | ~> 1.2 |
| <a name="module_lb_role"></a> [lb\_role](#module\_lb\_role) | terraform-aws-modules/iam/aws//modules/iam-role-for-service-accounts-eks | 5.37.1 |
| <a name="module_vpc"></a> [vpc](#module\_vpc) | terraform-aws-modules/vpc/aws | ~> 5.0 |

## Resources

| Name | Type |
|------|------|
| [helm_release.alb_controller](https://registry.terraform.io/providers/hashicorp/helm/latest/docs/resources/release) | resource |
| [helm_release.superset](https://registry.terraform.io/providers/hashicorp/helm/latest/docs/resources/release) | resource |
| [kubernetes_ingress_class_v1.aws_alb](https://registry.terraform.io/providers/hashicorp/kubernetes/latest/docs/resources/ingress_class_v1) | resource |
| [kubernetes_ingress_v1.superset](https://registry.terraform.io/providers/hashicorp/kubernetes/latest/docs/resources/ingress_v1) | resource |
| [kubernetes_namespace.superset](https://registry.terraform.io/providers/hashicorp/kubernetes/latest/docs/resources/namespace) | resource |
| [kubernetes_service_account.service_account](https://registry.terraform.io/providers/hashicorp/kubernetes/latest/docs/resources/service_account) | resource |
| [null_resource.add_superset_repo](https://registry.terraform.io/providers/hashicorp/null/latest/docs/resources/resource) | resource |
| [null_resource.helm_update_repos](https://registry.terraform.io/providers/hashicorp/null/latest/docs/resources/resource) | resource |
| [aws_availability_zones.available](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/availability_zones) | data source |

## Inputs

| Name | Description | Type | Default | Required |
|------|-------------|------|---------|:--------:|
| <a name="input_eks_cluster_version"></a> [eks\_cluster\_version](#input\_eks\_cluster\_version) | EKS Cluster version | `string` | `"1.28"` | no |
| <a name="input_name"></a> [name](#input\_name) | Name of the VPC and EKS Cluster | `string` | `"superset-on-eks"` | no |
| <a name="input_region"></a> [region](#input\_region) | Region | `string` | `"us-east-1"` | no |
| <a name="input_secondary_cidr_blocks"></a> [secondary\_cidr\_blocks](#input\_secondary\_cidr\_blocks) | Secondary CIDR blocks to be attached to VPC | `list(string)` | <pre>[<br> "100.64.0.0/16"<br>]</pre> | no |
| <a name="input_vpc_cidr"></a> [vpc\_cidr](#input\_vpc\_cidr) | VPC CIDR. This should be a valid private (RFC 1918) CIDR range | `string` | `"10.1.0.0/21"` | no |

## Outputs

| Name | Description |
|------|-------------|
| <a name="output_configure_kubectl"></a> [configure\_kubectl](#output\_configure\_kubectl) | Configure kubectl: make sure you're logged in with the correct AWS profile, then run this command to update your kubeconfig |
| <a name="output_superset_url"></a> [superset\_url](#output\_superset\_url) | Superset URL: once the kubeconfig is configured as above, run this command to get the Superset URL |
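
With the default inputs above (region `us-east-1`, cluster name `superset-on-eks`), the two outputs resolve to commands along these lines (a sketch, not the literal output values):

```sh
# configure_kubectl: point kubectl at the new cluster
aws eks update-kubeconfig --region us-east-1 --name superset-on-eks

# superset_url: read the ALB hostname from the Superset ingress
kubectl -n superset get ingress superset-ingress3 \
  -o jsonpath='{.status.loadBalancer.ingress[0].hostname}'
```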
160 changes: 160 additions & 0 deletions analytics/terraform/superset-on-eks/addons.tf
@@ -0,0 +1,160 @@
#---------------------------------------------------------------
# GP3 Encrypted Storage Class
#---------------------------------------------------------------
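# The EKS-created gp2 StorageClass ships as the cluster default, so its
# default-class annotation is flipped to "false" first (force = true lets
# Terraform take ownership of an annotation it did not create); the encrypted
# gp3 class below then becomes the default.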
resource "kubernetes_annotations" "disable_gp2" {
annotations = {
"storageclass.kubernetes.io/is-default-class" : "false"
}
api_version = "storage.k8s.io/v1"
kind = "StorageClass"
metadata {
name = "gp2"
}
force = true

depends_on = [module.eks.eks_cluster_id]
}

resource "kubernetes_storage_class" "default_gp3" {
metadata {
name = "gp3"
annotations = {
"storageclass.kubernetes.io/is-default-class" : "true"
}
}

storage_provisioner = "ebs.csi.aws.com"
reclaim_policy = "Delete"
allow_volume_expansion = true
volume_binding_mode = "WaitForFirstConsumer"
parameters = {
fsType = "ext4"
encrypted = true
type = "gp3"
}

depends_on = [kubernetes_annotations.disable_gp2]
}

#---------------------------------------------------------------
# IRSA for EBS CSI Driver
#---------------------------------------------------------------
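# IRSA binds the kube-system/ebs-csi-controller-sa service account to an IAM
# role carrying the managed EBS CSI policy, so the driver can manage volumes
# without relying on node instance credentials.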
module "ebs_csi_driver_irsa" {
source = "terraform-aws-modules/iam/aws//modules/iam-role-for-service-accounts-eks"
version = "~> 5.20"
role_name_prefix = format("%s-%s", local.name, "ebs-csi-driver-")
attach_ebs_csi_policy = true
oidc_providers = {
main = {
provider_arn = module.eks.oidc_provider_arn
namespace_service_accounts = ["kube-system:ebs-csi-controller-sa"]
}
}
tags = local.tags
}

module "eks_blueprints_addons" {
source = "aws-ia/eks-blueprints-addons/aws"
version = "~> 1.2"

cluster_name = module.eks.cluster_name
cluster_endpoint = module.eks.cluster_endpoint
cluster_version = module.eks.cluster_version
oidc_provider_arn = module.eks.oidc_provider_arn

#---------------------------------------
# Amazon EKS Managed Add-ons
#---------------------------------------
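# preserve = true keeps an add-on's Kubernetes resources in the cluster when
# the EKS add-on itself is deleted, avoiding disruption during destroy.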
eks_addons = {
aws-ebs-csi-driver = {
service_account_role_arn = module.ebs_csi_driver_irsa.iam_role_arn
}
coredns = {
preserve = true
}
vpc-cni = {
preserve = true
}
kube-proxy = {
preserve = true
}
}

#---------------------------------------
# AWS Load Balancer Controller Add-on
#---------------------------------------
enable_aws_load_balancer_controller = true
# turn off the mutating webhook for services because we are using
# service.beta.kubernetes.io/aws-load-balancer-type: external
aws_load_balancer_controller = {
set = [{
name = "enableServiceMutatorWebhook"
value = "false"
}]
}

tags = local.tags
}

module "eks_data_addons" {
source = "aws-ia/eks-data-addons/aws"
version = "~> 1.31.5" # update this to the latest/desired version as needed

oidc_provider_arn = module.eks.oidc_provider_arn

#---------------------------------------
# AWS Apache Superset Add-on
#---------------------------------------
enable_superset = true
superset_helm_config = {
values = [templatefile("${path.module}/helm-values/superset-values.yaml", {})]
}
depends_on = [module.eks_blueprints_addons]

}

#------------------------------------------------------------
# Create AWS Application Load Balancer with Ingress
#------------------------------------------------------------
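# The IngressClass maps Ingress objects with ingressClassName "aws-alb" to the
# AWS Load Balancer Controller (ingress.k8s.aws/alb), which provisions the ALB
# fronting the Superset service on port 8088.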
resource "kubernetes_ingress_class_v1" "aws_alb" {
metadata {
name = "aws-alb"
}

spec {
controller = "ingress.k8s.aws/alb"
}

depends_on = [module.eks.cluster_id]
}

resource "kubernetes_ingress_v1" "superset" {
metadata {
name = "superset-ingress3"
namespace = "superset"
annotations = {
"alb.ingress.kubernetes.io/scheme" = "internet-facing"
"alb.ingress.kubernetes.io/target-type" = "ip"
}
}
spec {
ingress_class_name = "aws-alb"
rule {
http {
path {
path = "/*"
backend {
service {
name = "superset"
port {
number = 8088
}
}
}
}
}
}
}

depends_on = [module.eks_blueprints_addons, module.eks_data_addons]
}
50 changes: 50 additions & 0 deletions analytics/terraform/superset-on-eks/cleanup.sh
@@ -0,0 +1,50 @@
#!/bin/bash
set -o errexit
set -o pipefail

targets=(
"module.eks_blueprints_addons"
"module.eks"
"module.vpc"
)
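
# Note: targets are destroyed in reverse order of creation — add-ons first,
# then the EKS cluster, then the VPC.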

#-------------------------------------------
# Helps delete namespaces stuck in the "Terminating" state.
# Rerun cleanup.sh to detect and delete any remaining stuck resources.
#-------------------------------------------
terminating_namespaces=$(kubectl get namespaces --field-selector status.phase=Terminating -o json | jq -r '.items[].metadata.name')

# If there are no terminating namespaces, there is nothing to patch
if [[ -z $terminating_namespaces ]]; then
echo "No terminating namespaces found"
fi
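
# Stripping the "kubernetes" finalizer via the namespace's finalize
# subresource lets the API server finish deleting a stuck namespace.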

for ns in $terminating_namespaces; do
echo "Terminating namespace: $ns"
kubectl get namespace $ns -o json | sed 's/"kubernetes"//' | kubectl replace --raw "/api/v1/namespaces/$ns/finalize" -f -
done

#-------------------------------------------
# Terraform destroy per module target
#-------------------------------------------
for target in "${targets[@]}"
do
destroy_output=$(terraform destroy -target="$target" -auto-approve | tee /dev/tty)
if [[ ${PIPESTATUS[0]} -eq 0 && $destroy_output == *"Destroy complete!"* ]]; then
echo "SUCCESS: Terraform destroy of $target completed successfully"
else
echo "FAILED: Terraform destroy of $target failed"
exit 1
fi
done

#-------------------------------------------
# Terraform destroy full
#-------------------------------------------
destroy_output=$(terraform destroy -auto-approve | tee /dev/tty)
if [[ ${PIPESTATUS[0]} -eq 0 && $destroy_output == *"Destroy complete!"* ]]; then
echo "SUCCESS: Terraform destroy of all targets completed successfully"
else
echo "FAILED: Terraform destroy of all targets failed"
exit 1
fi
95 changes: 95 additions & 0 deletions analytics/terraform/superset-on-eks/helm-values/superset-values.yaml
@@ -0,0 +1,95 @@
# Superset node configuration
supersetNode:
  replicaCount: 1
  autoscaling:
    enabled: true
    minReplicas: 1
    maxReplicas: 100
    targetCPUUtilizationPercentage: 80

  resources:
    limits:
      cpu: 200m
      memory: 256Mi
    requests:
      cpu: 200m
      memory: 256Mi

# Superset Celery worker configuration
supersetWorker:
  replicaCount: 1
  autoscaling:
    enabled: true
    minReplicas: 1
    maxReplicas: 100
    targetCPUUtilizationPercentage: 80
  command:
    - "/bin/sh"
    - "-c"
    - ". {{ .Values.configMountPath }}/superset_bootstrap.sh; celery --app=superset.tasks.celery_app:app worker"
  # -- If true, forces deployment to reload on each upgrade
  forceReload: false
  # -- Init container
  # @default -- a container waiting for postgres and redis
  initContainers:
    - name: wait-for-postgres-redis
      image: "{{ .Values.initImage.repository }}:{{ .Values.initImage.tag }}"
      imagePullPolicy: "{{ .Values.initImage.pullPolicy }}"
      envFrom:
        - secretRef:
            name: "{{ tpl .Values.envFromSecret . }}"
      command:
        - /bin/sh
        - -c
        - dockerize -wait "tcp://$DB_HOST:$DB_PORT" -wait "tcp://$REDIS_HOST:$REDIS_PORT" -timeout 120s

  resources:
    limits:
      cpu: 200m
      memory: 512Mi
    requests:
      cpu: 200m
      memory: 400Mi

persistence:
  enabled: true

postgresql:
  ## Set to false if bringing your own PostgreSQL.
  enabled: true
  loadExamples: true
  primary:
    persistence:
      ## Enable PostgreSQL persistence using Persistent Volume Claims.
      enabled: true
      storageClass: gp3

configOverrides:
  secret: |
    SECRET_KEY = '5WPuGEgPfGTrk9MCVLFkzNk0fO4hyfsykSrM03fHn1m8d3yQQd4yjyvf'

redis:
  master:
    ##
    ## Image configuration
    # image:
    ##
    ## docker registry secret names (list)
    # pullSecrets: nil
    ##
    persistence:
      ##
      ## Use a PVC to persist data.
      enabled: true
      ##
      ## Persistent class
      # storageClass: classname
      ##
      ## Access mode:
      accessModes:
        - ReadWriteOnce
    runAsUser: 1000
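
# NOTE: the SECRET_KEY above is a hardcoded placeholder checked into the repo.
# Generate your own (the Superset docs suggest `openssl rand -base64 42`) and
# inject it via your preferred secret management instead of committing it.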
34 changes: 34 additions & 0 deletions analytics/terraform/superset-on-eks/install.sh
@@ -0,0 +1,34 @@
#!/bin/bash

echo "Initializing ..."
terraform init || { echo "terraform init failed"; exit 1; }

# List of Terraform modules to apply in sequence
targets=(
"module.vpc"
"module.eks"
"module.eks_blueprints_addons"
)

# Apply modules in sequence
for target in "${targets[@]}"
do
echo "Applying module $target..."
apply_output=$(terraform apply -target="$target" -auto-approve 2>&1 | tee /dev/tty)
if [[ ${PIPESTATUS[0]} -eq 0 && $apply_output == *"Apply complete"* ]]; then
echo "SUCCESS: Terraform apply of $target completed successfully"
else
echo "FAILED: Terraform apply of $target failed"
exit 1
fi
done

# Final apply to catch any remaining resources
echo "Applying remaining resources..."
apply_output=$(terraform apply -auto-approve 2>&1 | tee /dev/tty)
if [[ ${PIPESTATUS[0]} -eq 0 && $apply_output == *"Apply complete"* ]]; then
echo "SUCCESS: Terraform apply of all modules completed successfully"
else
echo "FAILED: Terraform apply of all modules failed"
exit 1
fi
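
# Usage (a sketch; assumes AWS credentials for the target account are already
# configured in the shell):
#   cd analytics/terraform/superset-on-eks
#   ./install.sh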