[DPE-5659] Update COS alert rules (#542)
* Update metrics_alert_rules.yaml

* Update tox.ini to skip linter checks on alert rules

* Revert tox.ini change
a-velasco authored Oct 18, 2024
1 parent c4d6f56 commit 30d63e8
Showing 1 changed file with 37 additions and 22 deletions.
59 changes: 37 additions & 22 deletions src/alert_rules/prometheus/metrics_alert_rules.yaml
@@ -9,8 +9,9 @@ groups:
 labels:
 severity: critical
 annotations:
-summary: MySQL Down (instance {{ $labels.instance }})
-description: "MySQL instance is down\n VALUE = {{ $value }}\n LABELS = {{ $labels }}"
+summary: MySQL instance {{ $labels.instance }} is down.
+description: |
+LABELS = {{ $labels }}.
 # 2.1.2
 # customized: 80% -> 90%
@@ -20,18 +21,10 @@ groups:
 labels:
 severity: warning
 annotations:
-summary: MySQL too many connections (> 90%) (instance {{ $labels.instance }})
-description: "More than 90% of MySQL connections are in use on {{ $labels.instance }}\n VALUE = {{ $value }}\n LABELS = {{ $labels }}"
-
-# 2.1.3
-- alert: MySQLHighPreparedStatementsUtilization(>80%)
-expr: max_over_time(mysql_global_status_prepared_stmt_count[1m]) / mysql_global_variables_max_prepared_stmt_count * 100 > 80
-for: 2m
-labels:
-severity: warning
-annotations:
-summary: MySQL high prepared statements utilization (> 80%) (instance {{ $labels.instance }})
-description: "High utilization of prepared statements (>80%) on {{ $labels.instance }}\n VALUE = {{ $value }}\n LABELS = {{ $labels }}"
+summary: MySQL instance {{ $labels.instance }} is using > 90% of `max_connections`.
+description: |
+Consider checking the client application responsible for generating those additional connections.
+LABELS = {{ $labels }}.
 # 2.1.4
 # customized: 60% -> 80%
@@ -41,8 +34,22 @@ groups:
 labels:
 severity: warning
 annotations:
-summary: MySQL high threads running (instance {{ $labels.instance }})
-description: "More than 80% of MySQL connections are in running state on {{ $labels.instance }}\n VALUE = {{ $value }}\n LABELS = {{ $labels }}"
+summary: MySQL instance {{ $labels.instance }} is actively using > 80% of `max_connections`.
+description: |
+Consider reviewing the value of the `max-connections` config parameter or allocate more resources to your database server.
+LABELS = {{ $labels }}.
+# 2.1.3
+- alert: MySQLHighPreparedStatementsUtilization(>80%)
+expr: max_over_time(mysql_global_status_prepared_stmt_count[1m]) / mysql_global_variables_max_prepared_stmt_count * 100 > 80
+for: 2m
+labels:
+severity: warning
+annotations:
+summary: MySQL instance {{ $labels.instance }} is using > 80% of `max_prepared_stmt_count`.
+description: |
+Too many prepared statements might consume a lot of memory.
+LABELS = {{ $labels }}.
 # 2.1.8
 # customized: warning -> info
@@ -52,8 +59,10 @@ groups:
 labels:
 severity: info
 annotations:
-summary: MySQL slow queries (instance {{ $labels.instance }})
-description: "MySQL server mysql has some new slow query.\n VALUE = {{ $value }}\n LABELS = {{ $labels }}"
+summary: MySQL instance {{ $labels.instance }} has a slow query.
+description: |
+Consider optimizing the query by reviewing its execution plan, then rewrite the query and add any relevant indexes.
+LABELS = {{ $labels }}.
 # 2.1.9
 - alert: MySQLInnoDBLogWaits
@@ -62,8 +71,11 @@ groups:
 labels:
 severity: warning
 annotations:
-summary: MySQL InnoDB log waits (instance {{ $labels.instance }})
-description: "MySQL innodb log writes stalling\n VALUE = {{ $value }}\n LABELS = {{ $labels }}"
+summary: MySQL instance {{ $labels.instance }} has long InnoDB log waits.
+description: |
+MySQL InnoDB log writes might be stalling.
+Check I/O activity on your nodes to find the responsible process or query. Consider using iotop and the performance_schema.
+LABELS = {{ $labels }}.
 # 2.1.10
 - alert: MySQLRestarted
@@ -72,5 +84,8 @@ groups:
 labels:
 severity: info
 annotations:
-summary: MySQL restarted (instance {{ $labels.instance }})
-description: "MySQL has just been restarted, less than one minute ago on {{ $labels.instance }}.\n VALUE = {{ $value }}\n LABELS = {{ $labels }}"
+summary: MySQL instance {{ $labels.instance }} restarted.
+description: |
+MySQL restarted less than one minute ago.
+If the restart was unplanned or frequent, check Loki logs (e.g. `error.log`).
+LABELS = {{ $labels }}.
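
For reference, the one rule that appears in full in this diff (2.1.3, which the commit moves below 2.1.4) can be read as a single YAML list item. The sketch below reassembles it from the added lines only. The indentation is an assumption based on the usual Prometheus rule-file layout, because the page rendering does not preserve it. The enclosing `groups:` and `rules:` keys and the group name are not visible in this diff and are left out.

```yaml
# Sketch only: nesting depth is assumed, not taken from the file itself.
# 2.1.3
- alert: MySQLHighPreparedStatementsUtilization(>80%)
  # Peak prepared-statement count over the last minute, as a percentage of the configured maximum.
  expr: max_over_time(mysql_global_status_prepared_stmt_count[1m]) / mysql_global_variables_max_prepared_stmt_count * 100 > 80
  for: 2m
  labels:
    severity: warning
  annotations:
    summary: MySQL instance {{ $labels.instance }} is using > 80% of `max_prepared_stmt_count`.
    # Literal block scalar: line breaks are kept, replacing the old "\n"-escaped string.
    description: |
      Too many prepared statements might consume a lot of memory.
      LABELS = {{ $labels }}.
```

The new `description` values in this commit drop `VALUE = {{ $value }}` and keep only the label dump, preceded by remediation guidance where the commit provides it.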
