Updating cloudwatch stuff to reflect new pod and deployment names #1691

Merged on Dec 12, 2024 (12 commits)

ben851 (Contributor) commented Dec 3, 2024

Summary | Résumé

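Updates the CloudWatch dashboards, alarms, metric filters, and saved queries to use the new pod and deployment names, in preparation for the helmfile migration.

Also removes the GitHub ARC resources, which are no longer used.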

Related Issues | Cartes liées

Before merging this PR

Read the code suggestions left by the cds-ai-codereviewer bot. Address the valid
suggestions and briefly note the reasons for not addressing the others. To help
classify the comments, please use these reactions on each comment made by the
AI review:

| Classification | Reaction | Emoticon |
|---|---|---|
| Useful | +1 | 👍 |
| Noisy | eyes | 👀 |
| Hallucination | confused | 😕 |
| Wrong but teachable | rocket | 🚀 |
| Wrong and incorrect | -1 | 👎 |

The classifications will be extracted and summarized into an analysis of how helpful
or not the AI code review really is.

Test instructions | Instructions pour tester la modification

After the helm migration, verify that the queries and dashboards still return data under the new pod and deployment names.

@@ -314,7 +314,7 @@ resource "aws_cloudwatch_dashboard" "notify_system" {
"x": 0,
"type": "log",
"properties": {
- "query": "SOURCE '/aws/containerinsights/${aws_eks_cluster.notification-canada-ca-eks-cluster.name}/application' | fields @timestamp, log, kubernetes.container_name as app, kubernetes.pod_name as pod_name, @logStream\n| filter kubernetes.container_name not like /^celery/\n| fields @message like /HTTP\\/\\d+\\.\\d+\\\\\" 50\\d/ as is_error\n| stats sum(is_error) as errors by bin(1m)\n",
+ "query": "SOURCE '/aws/containerinsights/${aws_eks_cluster.notification-canada-ca-eks-cluster.name}/application' | fields @timestamp, log, kubernetes.container_name as app, kubernetes.pod_name as pod_name, @logStream\n| filter kubernetes.container_name not like /^notify-celery/\n| fields @message like /HTTP\\/\\d+\\.\\d+\\\\\" 50\\d/ as is_error\n| stats sum(is_error) as errors by bin(1m)\n",
Consider using a more specific regex pattern to match notify-celery container names to avoid potential mismatches with other container names that might start with notify-celery.
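
For reference, the anchored pattern in the updated query behaves as a prefix match. The sketch below reproduces that behaviour with Python's `re` module. It assumes the Logs Insights `like /.../` operator treats `^` the same way for this simple pattern, and the container names are hypothetical examples, not a list taken from the cluster.

```python
import re

# Pattern taken from the updated query: names that start with "notify-celery".
CELERY_PREFIX = re.compile(r"^notify-celery")


def is_excluded(container_name: str) -> bool:
    """Return True when `not like /^notify-celery/` would drop this container."""
    return CELERY_PREFIX.search(container_name) is not None


# Hypothetical container names, for illustration only.
names = [
    "notify-celery-primary",
    "notify-celery-sms-send-primary",
    "notify-api",
    "notify-admin",
    "celery-primary",  # old-style name: no longer matched by the new pattern
]

# Containers that remain in the 50x error count after the filter.
kept = [name for name in names if not is_excluded(name)]
print(kept)
```

Any container whose name begins with notify-celery is dropped from the error count, so a stricter pattern would only matter if a non-Celery container shared that prefix.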

@@ -343,7 +343,7 @@
"type": "metric",
"properties": {
"metrics": [
- [ "ContainerInsights/Prometheus", "kube_deployment_status_replicas_available", "namespace", "notification-canada-ca", "ClusterName", "${aws_eks_cluster.notification-canada-ca-eks-cluster.name}", "deployment", "celery-primary", { "region": "${var.region}", "label": "celery-primary" } ],
+ [ "ContainerInsights/Prometheus", "kube_deployment_status_replicas_available", "namespace", "notification-canada-ca", "ClusterName", "${aws_eks_cluster.notification-canada-ca-eks-cluster.name}", "deployment", "notify-celery-primary", { "region": "${var.region}", "label": "notify-celery-primary" } ],
[ "...", "celery-scalable", { "region": "${var.region}", "label": "celery-scalable" } ]
The metric label 'celery-scalable' should be updated to 'notify-celery-scalable' to reflect the new deployment name, similar to the change made on line 346.

ben851 (Contributor, Author) replied:

So close, but I'm changing primary, not scalable.
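
The unchanged row in this hunk uses the shorthand from CloudWatch dashboard metric arrays, where a leading "..." stands for fields carried over from the previous row. Below is a minimal sketch of that expansion. It assumes "..." fills every position not given explicitly and that a trailing object holds rendering options. `expand_metric` and `split_options` are hypothetical helpers, not part of any AWS SDK.

```python
from typing import Any, List, Optional, Tuple


def split_options(row: List[Any]) -> Tuple[List[Any], Optional[dict]]:
    """Separate positional fields from a trailing rendering-options object."""
    if row and isinstance(row[-1], dict):
        return list(row[:-1]), row[-1]
    return list(row), None


def expand_metric(prev: List[Any], cur: List[Any]) -> List[Any]:
    """Expand "..." and "." placeholders in `cur` using the fully expanded `prev` row."""
    prev_fields, _ = split_options(prev)
    cur_fields, cur_opts = split_options(cur)

    # Assumption: a leading "..." copies as many leading fields from the
    # previous row as are needed to reach the previous row's length.
    if cur_fields and cur_fields[0] == "...":
        explicit = cur_fields[1:]
        fill = max(len(prev_fields) - len(explicit), 0)
        cur_fields = prev_fields[:fill] + explicit

    # A single "." copies the field at the same position in the previous row.
    expanded = [prev_fields[i] if field == "." else field
                for i, field in enumerate(cur_fields)]
    return expanded + ([cur_opts] if cur_opts is not None else [])


CLUSTER = "notification-canada-ca-staging-eks-cluster"
primary = [
    "ContainerInsights/Prometheus", "kube_deployment_status_replicas_available",
    "namespace", "notification-canada-ca", "ClusterName", CLUSTER,
    "deployment", "notify-celery-primary",
    {"region": "ca-central-1", "label": "notify-celery-primary"},
]
scalable = ["...", "celery-scalable",
            {"region": "ca-central-1", "label": "celery-scalable"}]

expanded_scalable = expand_metric(primary, scalable)
print(expanded_scalable[6:8])  # ['deployment', 'celery-scalable']
```

Under that reading, the second row resolves to the deployment celery-scalable, which this PR leaves unchanged.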

@@ -362,7 +362,7 @@
"type": "metric",
"properties": {
"metrics": [
- [ "ContainerInsights/Prometheus", "kube_deployment_status_replicas_available", "namespace", "notification-canada-ca", "ClusterName", "${aws_eks_cluster.notification-canada-ca-eks-cluster.name}", "deployment", "celery-sms-send-primary", { "region": "${var.region}", "label": "celery-sms-send-primary" } ],
+ [ "ContainerInsights/Prometheus", "kube_deployment_status_replicas_available", "namespace", "notification-canada-ca", "ClusterName", "${aws_eks_cluster.notification-canada-ca-eks-cluster.name}", "deployment", "notify-celery-sms-send-primary", { "region": "${var.region}", "label": "notify-celery-sms-send-primary" } ],
[ "...", "celery-sms-send-scalable", { "region": "${var.region}", "label": "celery-sms-send-scalable" } ]
The metric name 'celery-sms-send-scalable' should be updated to 'notify-celery-sms-send-scalable' to reflect the new deployment name, similar to the change made on line 365.

ben851 (Contributor, Author) replied:

Same reasoning as above.

staging: pinpoint_to_sqs_sms_callbacks

✅   Terraform Init: success
✅   Terraform Validate: success
✅   Terraform Format: success
✅   Terraform Plan: success
✅   Conftest: success

Plan: 0 to add, 1 to change, 0 to destroy
Show summary
CHANGE  NAME
update  aws_cloudwatch_dashboard.sms-send-rate[0]
Show plan
Resource actions are indicated with the following symbols:
  ~ update in-place

Terraform will perform the following actions:

  # aws_cloudwatch_dashboard.sms-send-rate[0] will be updated in-place
  ~ resource "aws_cloudwatch_dashboard" "sms-send-rate" {
      ~ dashboard_body = (sensitive value)
        id             = "Specialized-sms-send-rate"
        # (2 unchanged attributes hidden)
    }

Plan: 0 to add, 1 to change, 0 to destroy.

─────────────────────────────────────────────────────────────────────────────

Saved the plan to: plan.tfplan

To perform exactly these actions, run the following command to apply:
    terraform apply "plan.tfplan"
Show Conftest results
WARN - plan.json - main - Missing Common Tags: ["aws_cloudwatch_log_group.pinpoint_deliveries"]
WARN - plan.json - main - Missing Common Tags: ["aws_cloudwatch_log_group.pinpoint_deliveries_failures"]
WARN - plan.json - main - Missing Common Tags: ["aws_cloudwatch_log_group.pinpoint_to_sqs_sms_callbacks_log_group[0]"]
WARN - plan.json - main - Missing Common Tags: ["aws_cloudwatch_metric_alarm.lambda-image-pinpoint-delivery-receipts-errors-critical[0]"]
WARN - plan.json - main - Missing Common Tags: ["aws_cloudwatch_metric_alarm.lambda-image-pinpoint-delivery-receipts-errors-warning[0]"]
WARN - plan.json - main - Missing Common Tags: ["aws_cloudwatch_metric_alarm.logs-1-500-error-1-minute-warning-pinpoint_to_sqs_sms_callbacks-api[0]"]
WARN - plan.json - main - Missing Common Tags: ["aws_cloudwatch_metric_alarm.logs-10-500-error-5-minutes-critical-pinpoint_to_sqs_sms_callbacks-api[0]"]
WARN - plan.json - main - Missing Common Tags: ["aws_cloudwatch_metric_alarm.pinpoint-sms-blocked-as-spam-warning[0]"]
WARN - plan.json - main - Missing Common Tags: ["aws_cloudwatch_metric_alarm.pinpoint-sms-failures-bell-warning[0]"]
WARN - plan.json - main - Missing Common Tags: ["aws_cloudwatch_metric_alarm.pinpoint-sms-failures-bragg-warning[0]"]
WARN - plan.json - main - Missing Common Tags: ["aws_cloudwatch_metric_alarm.pinpoint-sms-failures-freedom-warning[0]"]
WARN - plan.json - main - Missing Common Tags: ["aws_cloudwatch_metric_alarm.pinpoint-sms-failures-iristel-warning[0]"]
WARN - plan.json - main - Missing Common Tags: ["aws_cloudwatch_metric_alarm.pinpoint-sms-failures-maritime-warning[0]"]
WARN - plan.json - main - Missing Common Tags: ["aws_cloudwatch_metric_alarm.pinpoint-sms-failures-mts-warning[0]"]
WARN - plan.json - main - Missing Common Tags: ["aws_cloudwatch_metric_alarm.pinpoint-sms-failures-rogers-warning[0]"]
WARN - plan.json - main - Missing Common Tags: ["aws_cloudwatch_metric_alarm.pinpoint-sms-failures-telus-warning[0]"]
WARN - plan.json - main - Missing...

staging: eks

✅   Terraform Init: success
✅   Terraform Validate: success
✅   Terraform Format: success
✅   Terraform Plan: success
✅   Conftest: success

⚠️   Warning: resources will be destroyed by this change!

Plan: 0 to add, 47 to change, 6 to destroy
Show summary
CHANGE  NAME
delete  aws_cloudwatch_log_metric_filter.github-arc-runner-alarm[0]
        aws_cloudwatch_metric_alarm.github-arc-runner-error-alarm[0]
        aws_cloudwatch_query_definition.gh-arc-errors[0]
        aws_iam_role.secrets_csi_github
        aws_iam_role_policy_attachment.parameters_csi_github
        aws_iam_role_policy_attachment.secrets_csi_github
update  aws_cloudwatch_dashboard.errors[0]
        aws_cloudwatch_dashboard.kubernetes[0]
        aws_cloudwatch_dashboard.notify_system[0]
        aws_cloudwatch_log_metric_filter.admin-evicted-pods[0]
        aws_cloudwatch_log_metric_filter.api-evicted-pods[0]
        aws_cloudwatch_log_metric_filter.celery-evicted-pods[0]
        aws_cloudwatch_log_metric_filter.document-download-evicted-pods[0]
        aws_cloudwatch_log_metric_filter.documentation-evicted-pods[0]
        aws_cloudwatch_metric_alarm.admin-pods-high-cpu-warning[0]
        aws_cloudwatch_metric_alarm.admin-pods-high-memory-warning[0]
        aws_cloudwatch_metric_alarm.admin-replicas-unavailable[0]
        aws_cloudwatch_metric_alarm.api-pods-high-cpu-warning[0]
        aws_cloudwatch_metric_alarm.api-pods-high-memory-warning[0]
        aws_cloudwatch_metric_alarm.api-replicas-unavailable[0]
        aws_cloudwatch_metric_alarm.celery-beat-replicas-unavailable[0]
        aws_cloudwatch_metric_alarm.celery-email-send-primary-replicas-unavailable[0]
        aws_cloudwatch_metric_alarm.celery-email-send-scalable-replicas-unavailable[0]
        aws_cloudwatch_metric_alarm.celery-primary-pods-high-cpu-warning[0]
        aws_cloudwatch_metric_alarm.celery-primary-pods-high-memory-warning[0]
        aws_cloudwatch_metric_alarm.celery-primary-replicas-unavailable[0]
        aws_cloudwatch_metric_alarm.celery-scalable-pods-high-cpu-warning[0]
        aws_cloudwatch_metric_alarm.celery-scalable-replicas-unavailable[0]
        aws_cloudwatch_metric_alarm.celery-sms-pods-high-cpu-warning[0]
        aws_cloudwatch_metric_alarm.celery-sms-pods-high-memory-warning[0]
        aws_cloudwatch_metric_alarm.celery-sms-replicas-unavailable[0]
        aws_cloudwatch_metric_alarm.celery-sms-send-primary-replicas-unavailable[0]
        aws_cloudwatch_metric_alarm.celery-sms-send-scalable-replicas-unavailable[0]
        aws_cloudwatch_metric_alarm.document-download-api-replicas-unavailable[0]
        aws_cloudwatch_metric_alarm.documentation-replicas-unavailable[0]
        aws_cloudwatch_metric_alarm.karpenter-replicas-unavailable[0]
        aws_cloudwatch_query_definition.admin-50X-errors[0]
        aws_cloudwatch_query_definition.api-50X-errors[0]
        aws_cloudwatch_query_definition.bounce-rate-critical[0]
        aws_cloudwatch_query_definition.bounce-rate-warning[0]
        aws_cloudwatch_query_definition.bounce-rate-warnings-and-criticals[0]
        aws_cloudwatch_query_definition.callback-errors-by-url[0]
        aws_cloudwatch_query_definition.callback-failures[0]
        aws_cloudwatch_query_definition.callback-max-retry-failures-by-service[0]
        aws_cloudwatch_query_definition.celery-errors[0]
        aws_cloudwatch_query_definition.celery-filter-by-job[0]
        aws_cloudwatch_query_definition.celery-filter-by-notification-id[0]
        aws_cloudwatch_query_definition.celery-queues[0]
        aws_cloudwatch_query_definition.celery-starts[0]
        aws_cloudwatch_query_definition.celery-worker-exited-normally[0]
        aws_cloudwatch_query_definition.celery-worker-exited-prematurely[0]
        aws_cloudwatch_query_definition.celery-worker-exits-cold-vs-warm[0]
        aws_cloudwatch_query_definition.retry-attemps-by-duration[0]

✂   Warning: plan has been truncated! See the full plan in the logs.

Show plan
Resource actions are indicated with the following symbols:
  ~ update in-place
  - destroy

Terraform will perform the following actions:

  # aws_cloudwatch_dashboard.errors[0] will be updated in-place
  ~ resource "aws_cloudwatch_dashboard" "errors" {
      ~ dashboard_body = (sensitive value)
        id             = "Errors"
        # (2 unchanged attributes hidden)
    }

  # aws_cloudwatch_dashboard.kubernetes[0] will be updated in-place
  ~ resource "aws_cloudwatch_dashboard" "kubernetes" {
      ~ dashboard_body = jsonencode(
          ~ {
              ~ widgets = [
                    {
                        height     = 15
                        properties = {
                            aggregateBy   = {
                                func = "MAX"
                                key  = "Name"
                            }
                            labels        = [
                                {
                                    key   = "Name"
                                    value = "notification-canada-ca"
                                },
                            ]
                            metrics       = [
                                {
                                    metricName   = "node_cpu_limit"
                                    resourceType = "AWS::EKS::Cluster"
                                    stat         = "Maximum"
                                },
                                {
                                    metricName   = "node_cpu_usage_total"
                                    resourceType = "AWS::EKS::Cluster"
                                    stat         = "Maximum"
                                },
                                {
                                    metricName   = "node_memory_limit"
                                    resourceType = "AWS::EKS::Cluster"
                                    stat         = "Maximum"
                                },
                                {
                                    metricName   = "node_memory_working_set"
                                    resourceType = "AWS::EKS::Cluster"
                                    stat         = "Maximum"
                                },
                                {
                                    metricName   = "cluster_failed_node_count"
                                    resourceType = "AWS::EKS::Cluster"
                                    stat         = "Sum"
                                },
                                {
                                    metricName   = "cluster_node_count"
                                    resourceType = "AWS::EKS::Cluster"
                                    stat         = "Sum"
                                },
                            ]
                            period        = 300
                            region        = "ca-central-1"
                            splitBy       = "Name"
                            widgetOptions = {
                                legend        = {
                                    position = "bottom"
                                }
                                rowsPerPage   = 50
                                stacked       = false
                                view          = "timeSeries"
                                widgetsPerRow = 2
                            }
                        }
                        type       = "explorer"
                        width      = 24
                        x          = 0
                        y          = 14
                    },
                  ~ {
                      ~ properties = {
                          ~ metrics = [
                              ~ [
                                    # (2 unchanged elements hidden)
                                    "PodName",
                                  ~ "admin" -> "notify-admin",
                                    "ClusterName",
                                    # (3 unchanged elements hidden)
                                ],
                              ~ [
                                    "...",
                                  ~ "api" -> "notify-api",
                                    ".",
                                    # (3 unchanged elements hidden)
                                ],
                              ~ [
                                    "...",
                                  ~ "celery" -> "notify-celery",
                                    ".",
                                    # (3 unchanged elements hidden)
                                ],
                              ~ [
                                    "...",
                                  ~ "document-download-api" -> "notify-document-download",
                                    ".",
                                    # (3 unchanged elements hidden)
                                ],
                            ]
                            # (7 unchanged attributes hidden)
                        }
                        # (5 unchanged attributes hidden)
                    },
                  ~ {
                      ~ properties = {
                          ~ metrics = [
                              ~ [
                                    # (4 unchanged elements hidden)
                                    "Service",
                                  ~ "api" -> "notify-api",
                                    "Namespace",
                                    # (2 unchanged elements hidden)
                                ],
                              ~ [
                                    "...",
                                  ~ "celery" -> "notify-celery",
                                    ".",
                                    # (2 unchanged elements hidden)
                                ],
                              ~ [
                                    "...",
                                  ~ "admin" -> "notify-admin",
                                    ".",
                                    # (2 unchanged elements hidden)
                                ],
                              ~ [
                                    # (4 unchanged elements hidden)
                                    ".",
                                  ~ "celery" -> "notify-celery",
                                    ".",
                                    # (2 unchanged elements hidden)
                                ],
                              ~ [
                                    "...",
                                  ~ "api" -> "notify-api",
                                    ".",
                                    # (2 unchanged elements hidden)
                                ],
                              ~ [
                                    "...",
                                  ~ "admin" -> "notify-admin",
                                    ".",
                                    # (1 unchanged element hidden)
                                ],
                            ]
                            # (7 unchanged attributes hidden)
                        }
                        # (5 unchanged attributes hidden)
                    },
                ]
            }
        )
        id             = "Kubernetes"
        # (2 unchanged attributes hidden)
    }

  # aws_cloudwatch_dashboard.notify_system[0] will be updated in-place
  ~ resource "aws_cloudwatch_dashboard" "notify_system" {
      ~ dashboard_body = (sensitive value)
        id             = "Notify-System-Overview"
        # (2 unchanged attributes hidden)
    }

  # aws_cloudwatch_log_metric_filter.admin-evicted-pods[0] will be updated in-place
  ~ resource "aws_cloudwatch_log_metric_filter" "admin-evicted-pods" {
        id             = "admin-evicted-pods"
        name           = "admin-evicted-pods"
      ~ pattern        = "{ ($.reason = \"Evicted\") && ($.kube_pod_status_reason = 1) && ($.pod = \"admin-*\") }" -> "{ ($.reason = \"Evicted\") && ($.kube_pod_status_reason = 1) && ($.pod = \"notify-admin-*\") }"
        # (1 unchanged attribute hidden)

        # (1 unchanged block hidden)
    }

  # aws_cloudwatch_log_metric_filter.api-evicted-pods[0] will be updated in-place
  ~ resource "aws_cloudwatch_log_metric_filter" "api-evicted-pods" {
        id             = "api-evicted-pods"
        name           = "api-evicted-pods"
      ~ pattern        = "{ ($.reason = \"Evicted\") && ($.kube_pod_status_reason = 1) && ($.pod = \"api-*\") }" -> "{ ($.reason = \"Evicted\") && ($.kube_pod_status_reason = 1) && ($.pod = \"notify-api-*\") }"
        # (1 unchanged attribute hidden)

        # (1 unchanged block hidden)
    }

  # aws_cloudwatch_log_metric_filter.celery-evicted-pods[0] will be updated in-place
  ~ resource "aws_cloudwatch_log_metric_filter" "celery-evicted-pods" {
        id             = "celery-evicted-pods"
        name           = "celery-evicted-pods"
      ~ pattern        = "{ ($.reason = \"Evicted\") && ($.kube_pod_status_reason = 1) && ($.pod = \"celery-*\") }" -> "{ ($.reason = \"Evicted\") && ($.kube_pod_status_reason = 1) && ($.pod = \"notify-celery-*\") }"
        # (1 unchanged attribute hidden)

        # (1 unchanged block hidden)
    }

  # aws_cloudwatch_log_metric_filter.document-download-evicted-pods[0] will be updated in-place
  ~ resource "aws_cloudwatch_log_metric_filter" "document-download-evicted-pods" {
        id             = "document-download-evicted-pods"
        name           = "document-download-evicted-pods"
      ~ pattern        = "{ ($.reason = \"Evicted\") && ($.kube_pod_status_reason = 1) && ($.pod = \"document-download-*\") }" -> "{ ($.reason = \"Evicted\") && ($.kube_pod_status_reason = 1) && ($.pod = \"notify-document-download-*\") }"
        # (1 unchanged attribute hidden)

        # (1 unchanged block hidden)
    }

  # aws_cloudwatch_log_metric_filter.documentation-evicted-pods[0] will be updated in-place
  ~ resource "aws_cloudwatch_log_metric_filter" "documentation-evicted-pods" {
        id             = "documentation-evicted-pods"
        name           = "documentation-evicted-pods"
      ~ pattern        = "{ ($.reason = \"Evicted\") && ($.kube_pod_status_reason = 1) && ($.pod = \"documentation-*\") }" -> "{ ($.reason = \"Evicted\") && ($.kube_pod_status_reason = 1) && ($.pod = \"notify-documentation-*\") }"
        # (1 unchanged attribute hidden)

        # (1 unchanged block hidden)
    }

  # aws_cloudwatch_log_metric_filter.github-arc-runner-alarm[0] will be destroyed
  # (because aws_cloudwatch_log_metric_filter.github-arc-runner-alarm is not in configuration)
  - resource "aws_cloudwatch_log_metric_filter" "github-arc-runner-alarm" {
      - id             = "GitHub ARC Runners Write Alarm" -> null
      - log_group_name = "/aws/containerinsights/notification-canada-ca-staging-eks-cluster/application" -> null
      - name           = "GitHub ARC Runners Write Alarm" -> null
      - pattern        = "{ $.kubernetes.pod_name = \"github-arc-ss-staging-*-runner-*\"  && $.log = \"*ERROR*\" }" -> null

      - metric_transformation {
          - dimensions    = {} -> null
          - name          = "aggregating-github-arc-runner-alarm" -> null
          - namespace     = "LogMetrics" -> null
          - unit          = "None" -> null
          - value         = "1" -> null
            # (1 unchanged attribute hidden)
        }
    }

  # aws_cloudwatch_metric_alarm.admin-pods-high-cpu-warning[0] will be updated in-place
  ~ resource "aws_cloudwatch_metric_alarm" "admin-pods-high-cpu-warning" {
      ~ dimensions                            = {
          ~ "Service"     = "admin" -> "notify-admin"
            # (2 unchanged elements hidden)
        }
        id                                    = "admin-pods-high-cpu-warning"
        tags                                  = {}
        # (21 unchanged attributes hidden)
    }

  # aws_cloudwatch_metric_alarm.admin-pods-high-memory-warning[0] will be updated in-place
  ~ resource "aws_cloudwatch_metric_alarm" "admin-pods-high-memory-warning" {
      ~ dimensions                            = {
          ~ "Service"     = "admin" -> "notify-admin"
            # (2 unchanged elements hidden)
        }
        id                                    = "admin-pods-high-memory-warning"
        tags                                  = {}
        # (21 unchanged attributes hidden)
    }

  # aws_cloudwatch_metric_alarm.admin-replicas-unavailable[0] will be updated in-place
  ~ resource "aws_cloudwatch_metric_alarm" "admin-replicas-unavailable" {
        id                                    = "admin-replicas-unavailable"
        tags                                  = {}
      ~ treat_missing_data                    = "notBreaching" -> "breaching"
        # (21 unchanged attributes hidden)

      - metric_query {
          - id          = "m1" -> null
          - period      = 0 -> null
          - return_data = true -> null
            # (3 unchanged attributes hidden)

          - metric {
              - dimensions  = {
                  - "ClusterName" = "notification-canada-ca-staging-eks-cluster"
                  - "deployment"  = "admin"
                  - "namespace"   = "notification-canada-ca"
                } -> null
              - metric_name = "kube_deployment_status_replicas_unavailable" -> null
              - namespace   = "ContainerInsights/Prometheus" -> null
              - period      = 300 -> null
              - stat        = "Average" -> null
                # (1 unchanged attribute hidden)
            }
        }
      + metric_query {
          + id          = "m1"
          + return_data = true
            # (3 unchanged attributes hidden)

          + metric {
              + dimensions  = {
                  + "ClusterName" = "notification-canada-ca-staging-eks-cluster"
                  + "deployment"  = "notify-admin"
                  + "namespace"   = "notification-canada-ca"
                }
              + metric_name = "kube_deployment_status_replicas_unavailable"
              + namespace   = "ContainerInsights/Prometheus"
              + period      = 300
              + stat        = "Average"
                # (1 unchanged attribute hidden)
            }
        }
    }

  # aws_cloudwatch_metric_alarm.api-pods-high-cpu-warning[0] will be updated in-place
  ~ resource "aws_cloudwatch_metric_alarm" "api-pods-high-cpu-warning" {
      ~ dimensions                            = {
          ~ "Service"     = "api" -> "notify-api"
            # (2 unchanged elements hidden)
        }
        id                                    = "api-pods-high-cpu-warning"
        tags                                  = {}
        # (21 unchanged attributes hidden)
    }

  # aws_cloudwatch_metric_alarm.api-pods-high-memory-warning[0] will be updated in-place
  ~ resource "aws_cloudwatch_metric_alarm" "api-pods-high-memory-warning" {
      ~ dimensions                            = {
          ~ "Service"     = "api" -> "notify-api"
            # (2 unchanged elements hidden)
        }
        id                                    = "api-pods-high-memory-warning"
        tags                                  = {}
        # (21 unchanged attributes hidden)
    }

  # aws_cloudwatch_metric_alarm.api-replicas-unavailable[0] will be updated in-place
  ~ resource "aws_cloudwatch_metric_alarm" "api-replicas-unavailable" {
        id                                    = "api-replicas-unavailable"
        tags                                  = {}
      ~ treat_missing_data                    = "notBreaching" -> "breaching"
        # (21 unchanged attributes hidden)

      - metric_query {
          - id          = "m1" -> null
          - period      = 0 -> null
          - return_data = true -> null
            # (3 unchanged attributes hidden)

          - metric {
              - dimensions  = {
                  - "ClusterName" = "notification-canada-ca-staging-eks-cluster"
                  - "deployment"  = "api"
                  - "namespace"   = "notification-canada-ca"
                } -> null
              - metric_name = "kube_deployment_status_replicas_unavailable" -> null
              - namespace   = "ContainerInsights/Prometheus" -> null
              - period      = 300 -> null
              - stat        = "Average" -> null
                # (1 unchanged attribute hidden)
            }
        }
      + metric_query {
          + id          = "m1"
          + return_data = true
            # (3 unchanged attributes hidden)

          + metric {
              + dimensions  = {
                  + "ClusterName" = "notification-canada-ca-staging-eks-cluster"
                  + "deployment"  = "notify-api"
                  + "namespace"   = "notification-canada-ca"
                }
              + metric_name = "kube_deployment_status_replicas_unavailable"
              + namespace   = "ContainerInsights/Prometheus"
              + period      = 300
              + stat        = "Average"
                # (1 unchanged attribute hidden)
            }
        }
    }

  # aws_cloudwatch_metric_alarm.celery-beat-replicas-unavailable[0] will be updated in-place
  ~ resource "aws_cloudwatch_metric_alarm" "celery-beat-replicas-unavailable" {
        id                                    = "celery-beat-replicas-unavailable"
        tags                                  = {}
      ~ treat_missing_data                    = "notBreaching" -> "breaching"
        # (21 unchanged attributes hidden)

      - metric_query {
          - id          = "m1" -> null
          - period      = 0 -> null
          - return_data = true -> null
            # (3 unchanged attributes hidden)

          - metric {
              - dimensions  = {
                  - "ClusterName" = "notification-canada-ca-staging-eks-cluster"
                  - "deployment"  = "celery-beat"
                  - "namespace"   = "notification-canada-ca"
                } -> null
              - metric_name = "kube_deployment_status_replicas_unavailable" -> null
              - namespace   = "ContainerInsights/Prometheus" -> null
              - period      = 300 -> null
              - stat        = "Average" -> null
                # (1 unchanged attribute hidden)
            }
        }
      + metric_query {
          + id          = "m1"
          + return_data = true
            # (3 unchanged attributes hidden)

          + metric {
              + dimensions  = {
                  + "ClusterName" = "notification-canada-ca-staging-eks-cluster"
                  + "deployment"  = "notify-celery-beat"
                  + "namespace"   = "notification-canada-ca"
                }
              + metric_name = "kube_deployment_status_replicas_unavailable"
              + namespace   = "ContainerInsights/Prometheus"
              + period      = 300
              + stat        = "Average"
                # (1 unchanged attribute hidden)
            }
        }
    }

  # aws_cloudwatch_metric_alarm.celery-email-send-primary-replicas-unavailable[0] will be updated in-place
  ~ resource "aws_cloudwatch_metric_alarm" "celery-email-send-primary-replicas-unavailable" {
        id                                    = "celery-email-send-primary-replicas-unavailable"
        tags                                  = {}
      ~ treat_missing_data                    = "notBreaching" -> "breaching"
        # (21 unchanged attributes hidden)

      - metric_query {
          - id          = "m1" -> null
          - period      = 0 -> null
          - return_data = true -> null
            # (3 unchanged attributes hidden)

          - metric {
              - dimensions  = {
                  - "ClusterName" = "notification-canada-ca-staging-eks-cluster"
                  - "deployment"  = "celery-email-send-primary"
                  - "namespace"   = "notification-canada-ca"
                } -> null
              - metric_name = "kube_deployment_status_replicas_unavailable" -> null
              - namespace   = "ContainerInsights/Prometheus" -> null
              - period      = 300 -> null
              - stat        = "Minimum" -> null
                # (1 unchanged attribute hidden)
            }
        }
      + metric_query {
          + id          = "m1"
          + return_data = true
            # (3 unchanged attributes hidden)

          + metric {
              + dimensions  = {
                  + "ClusterName" = "notification-canada-ca-staging-eks-cluster"
                  + "deployment"  = "notify-celery-email-send-primary"
                  + "namespace"   = "notification-canada-ca"
                }
              + metric_name = "kube_deployment_status_replicas_unavailable"
              + namespace   = "ContainerInsights/Prometheus"
              + period      = 300
              + stat        = "Minimum"
                # (1 unchanged attribute hidden)
            }
        }
    }

  # aws_cloudwatch_metric_alarm.celery-email-send-scalable-replicas-unavailable[0] will be updated in-place
  ~ resource "aws_cloudwatch_metric_alarm" "celery-email-send-scalable-replicas-unavailable" {
        id                                    = "celery-email-send-scalable-replicas-unavailable"
        tags                                  = {}
      ~ treat_missing_data                    = "notBreaching" -> "breaching"
        # (21 unchanged attributes hidden)

      - metric_query {
          - id          = "m1" -> null
          - period      = 0 -> null
          - return_data = true -> null
            # (3 unchanged attributes hidden)

          - metric {
              - dimensions  = {
                  - "ClusterName" = "notification-canada-ca-staging-eks-cluster"
                  - "deployment"  = "celery-email-send-scalable"
                  - "namespace"   = "notification-canada-ca"
                } -> null
              - metric_name = "kube_deployment_status_replicas_unavailable" -> null
              - namespace   = "ContainerInsights/Prometheus" -> null
              - period      = 300 -> null
              - stat        = "Minimum" -> null
                # (1 unchanged attribute hidden)
            }
        }
      + metric_query {
          + id          = "m1"
          + return_data = true
            # (3 unchanged attributes hidden)

          + metric {
              + dimensions  = {
                  + "ClusterName" = "notification-canada-ca-staging-eks-cluster"
                  + "deployment"  = "notify-celery-email-send-scalable"
                  + "namespace"   = "notification-canada-ca"
                }
              + metric_name = "kube_deployment_status_replicas_unavailable"
              + namespace   = "ContainerInsights/Prometheus"
              + period      = 300
              + stat        = "Minimum"
                # (1 unchanged attribute hidden)
            }
        }
    }

  # aws_cloudwatch_metric_alarm.celery-primary-pods-high-cpu-warning[0] will be updated in-place
  ~ resource "aws_cloudwatch_metric_alarm" "celery-primary-pods-high-cpu-warning" {
      ~ dimensions                            = {
          ~ "Service"     = "celery-primary" -> "notify-celery-primary"
            # (2 unchanged elements hidden)
        }
        id                                    = "celery-primary-pods-high-cpu-warning"
        tags                                  = {}
        # (21 unchanged attributes hidden)
    }

  # aws_cloudwatch_metric_alarm.celery-primary-pods-high-memory-warning[0] will be updated in-place
  ~ resource "aws_cloudwatch_metric_alarm" "celery-primary-pods-high-memory-warning" {
      ~ dimensions                            = {
          ~ "Service"     = "celery-primary" -> "notify-celery-primary"
            # (2 unchanged elements hidden)
        }
        id                                    = "celery-primary-pods-high-memory-warning"
        tags                                  = {}
        # (21 unchanged attributes hidden)
    }

  # aws_cloudwatch_metric_alarm.celery-primary-replicas-unavailable[0] will be updated in-place
  ~ resource "aws_cloudwatch_metric_alarm" "celery-primary-replicas-unavailable" {
        id                                    = "celery-primary-replicas-unavailable"
        tags                                  = {}
      ~ treat_missing_data                    = "notBreaching" -> "breaching"
        # (21 unchanged attributes hidden)

      - metric_query {
          - id          = "m1" -> null
          - period      = 0 -> null
          - return_data = true -> null
            # (3 unchanged attributes hidden)

          - metric {
              - dimensions  = {
                  - "ClusterName" = "notification-canada-ca-staging-eks-cluster"
                  - "deployment"  = "celery-primary"
                  - "namespace"   = "notification-canada-ca"
                } -> null
              - metric_name = "kube_deployment_status_replicas_unavailable" -> null
              - namespace   = "ContainerInsights/Prometheus" -> null
              - period      = 300 -> null
              - stat        = "Minimum" -> null
                # (1 unchanged attribute hidden)
            }
        }
      + metric_query {
          + id          = "m1"
          + return_data = true
            # (3 unchanged attributes hidden)

          + metric {
              + dimensions  = {
                  + "ClusterName" = "notification-canada-ca-staging-eks-cluster"
                  + "deployment"  = "notify-celery-primary"
                  + "namespace"   = "notification-canada-ca"
                }
              + metric_name = "kube_deployment_status_replicas_unavailable"
              + namespace   = "ContainerInsights/Prometheus"
              + period      = 300
              + stat        = "Minimum"
                # (1 unchanged attribute hidden)
            }
        }
    }

  # aws_cloudwatch_metric_alarm.celery-scalable-pods-high-cpu-warning[0] will be updated in-place
  ~ resource "aws_cloudwatch_metric_alarm" "celery-scalable-pods-high-cpu-warning" {
      ~ dimensions                            = {
          ~ "Service"     = "celery-scalable" -> "notify-celery-scalable"
            # (2 unchanged elements hidden)
        }
        id                                    = "celery-scalable-pods-high-cpu-warning"
        tags                                  = {}
        # (21 unchanged attributes hidden)
    }

  # aws_cloudwatch_metric_alarm.celery-scalable-replicas-unavailable[0] will be updated in-place
  ~ resource "aws_cloudwatch_metric_alarm" "celery-scalable-replicas-unavailable" {
        id                                    = "celery-scalable-replicas-unavailable"
        tags                                  = {}
      ~ treat_missing_data                    = "notBreaching" -> "breaching"
        # (21 unchanged attributes hidden)

      - metric_query {
          - id          = "m1" -> null
          - period      = 0 -> null
          - return_data = true -> null
            # (3 unchanged attributes hidden)

          - metric {
              - dimensions  = {
                  - "ClusterName" = "notification-canada-ca-staging-eks-cluster"
                  - "deployment"  = "celery-scalable"
                  - "namespace"   = "notification-canada-ca"
                } -> null
              - metric_name = "kube_deployment_status_replicas_unavailable" -> null
              - namespace   = "ContainerInsights/Prometheus" -> null
              - period      = 300 -> null
              - stat        = "Minimum" -> null
                # (1 unchanged attribute hidden)
            }
        }
      + metric_query {
          + id          = "m1"
          + return_data = true
            # (3 unchanged attributes hidden)

          + metric {
              + dimensions  = {
                  + "ClusterName" = "notification-canada-ca-staging-eks-cluster"
                  + "deployment"  = "notify-celery-scalable"
                  + "namespace"   = "notification-canada-ca"
                }
              + metric_name = "kube_deployment_status_replicas_unavailable"
              + namespace   = "ContainerInsights/Prometheus"
              + period      = 300
              + stat        = "Minimum"
                # (1 unchanged attribute hidden)
            }
        }
    }

  # aws_cloudwatch_metric_alarm.celery-sms-pods-high-cpu-warning[0] will be updated in-place
  ~ resource "aws_cloudwatch_metric_alarm" "celery-sms-pods-high-cpu-warning" {
      ~ dimensions                            = {
          ~ "Service"     = "celery-sms" -> "notify-celery-sms"
            # (2 unchanged elements hidden)
        }
        id                                    = "celery-sms-pods-high-cpu-warning"
        tags                                  = {}
        # (21 unchanged attributes hidden)
    }

  # aws_cloudwatch_metric_alarm.celery-sms-pods-high-memory-warning[0] will be updated in-place
  ~ resource "aws_cloudwatch_metric_alarm" "celery-sms-pods-high-memory-warning" {
      ~ dimensions                            = {
          ~ "Service"     = "celery-sms" -> "notify-celery-sms"
            # (2 unchanged elements hidden)
        }
        id                                    = "celery-sms-pods-high-memory-warning"
        tags                                  = {}
        # (21 unchanged attributes hidden)
    }

  # aws_cloudwatch_metric_alarm.celery-sms-replicas-unavailable[0] will be updated in-place
  ~ resource "aws_cloudwatch_metric_alarm" "celery-sms-replicas-unavailable" {
        id                                    = "celery-sms-replicas-unavailable"
        tags                                  = {}
      ~ treat_missing_data                    = "notBreaching" -> "breaching"
        #...
Conftest results
WARN - plan.json - main - Cloudwatch log metric pattern is invalid: ["aws_cloudwatch_log_metric_filter.celery-error[0]"]
WARN - plan.json - main - Cloudwatch log metric pattern is invalid: ["aws_cloudwatch_log_metric_filter.scanfiles-timeout[0]"]
WARN - plan.json - main - Missing Common Tags: ["aws_acm_certificate.client_vpn"]
WARN - plan.json - main - Missing Common Tags: ["aws_acm_certificate.notification-canada-ca"]
WARN - plan.json - main - Missing Common Tags: ["aws_acm_certificate.notification-canada-ca-alt[0]"]
WARN - plan.json - main - Missing Common Tags: ["aws_alb.notification-canada-ca"]
WARN - plan.json - main - Missing Common Tags: ["aws_alb_listener.internal_alb_tls"]
WARN - plan.json - main - Missing Common Tags: ["aws_alb_listener.notification-canada-ca"]
WARN - plan.json - main - Missing Common Tags: ["aws_alb_target_group.internal_nginx_http"]
WARN - plan.json - main - Missing Common Tags: ["aws_alb_target_group.notification-canada-ca-admin"]
WARN - plan.json - main - Missing Common Tags: ["aws_alb_target_group.notification-canada-ca-api"]
WARN - plan.json - main - Missing Common Tags: ["aws_alb_target_group.notification-canada-ca-document"]
WARN - plan.json - main - Missing Common Tags: ["aws_alb_target_group.notification-canada-ca-document-api"]
WARN - plan.json - main - Missing Common Tags: ["aws_alb_target_group.notification-canada-ca-documentation"]
WARN - plan.json - main - Missing Common Tags: ["aws_cloudwatch_log_group.blazer[0]"]
WARN - plan.json - main - Missing Common Tags: ["aws_cloudwatch_log_group.notification-canada-ca-eks-application-logs[0]"]
WARN - plan.json - main - Missing Common Tags: ["aws_cloudwatch_log_group.notification-canada-ca-eks-cluster-logs[0]"]
WARN - plan.json - main - Missing Common Tags: ["aws_cloudwatch_log_group.notification-canada-ca-eks-prometheus-logs[0]"]
WARN - plan.json - main - Missing Common Tags: ["aws_cloudwatch_metric_alarm.admin-evicted-pods[0]"]
WARN - plan.json - main - Missing Common Tags:...
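
For reference, a minimal sketch of what one of the renamed replicas-unavailable alarms in the plan could look like in Terraform. The `metric_query` values and `treat_missing_data` are copied from the plan output; `alarm_name` is inferred from the resource `id`. The plan hides `comparison_operator`, `evaluation_periods` and `threshold` among the unchanged attributes, so the values below are assumptions, not the repository's actual settings. The `count` implied by the `[0]` index is omitted.

```hcl
# Sketch only: attributes marked "assumption" are not visible in the plan.
resource "aws_cloudwatch_metric_alarm" "celery-primary-replicas-unavailable" {
  alarm_name          = "celery-primary-replicas-unavailable"
  comparison_operator = "GreaterThanOrEqualToThreshold" # assumption
  evaluation_periods  = 2                               # assumption
  threshold           = 1                               # assumption

  # Changed in this plan: missing data now counts as breaching.
  treat_missing_data = "breaching"

  metric_query {
    id          = "m1"
    return_data = true

    metric {
      metric_name = "kube_deployment_status_replicas_unavailable"
      namespace   = "ContainerInsights/Prometheus"
      period      = 300
      stat        = "Minimum"

      dimensions = {
        ClusterName = "notification-canada-ca-staging-eks-cluster"
        deployment  = "notify-celery-primary" # was "celery-primary"
        namespace   = "notification-canada-ca"
      }
    }
  }
}
```

With `treat_missing_data = "breaching"`, an alarm whose `deployment` dimension does not match any running deployment receives no data and goes into ALARM, so the renamed alarms would be expected to fire until the helm migration creates the `notify-`-prefixed deployments.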

@ben851 ben851 marked this pull request as ready for review December 12, 2024 16:20
@ben851 ben851 requested a review from jimleroyer as a code owner December 12, 2024 16:20

Updating alarms ⏰? Great! Please update the Google Sheet and add a 👍 to this message after 🙏

Contributor

@P0NDER0SA P0NDER0SA left a comment

Looks good. Let's do this

@ben851 ben851 merged commit b9c72ef into main Dec 12, 2024
30 checks passed
@ben851 ben851 deleted the helm-alarms branch December 12, 2024 16:21
@ben851 ben851 mentioned this pull request Dec 17, 2024
5 tasks