Add alertmanager to report alerts based on metricly metrics
yadneshk committed Dec 4, 2024
1 parent e307bcf commit cd9777b
Showing 9 changed files with 112 additions and 0 deletions.
13 changes: 13 additions & 0 deletions README.md
@@ -23,6 +23,8 @@
  - Logs incoming and outgoing API requests with support for multiple log levels (INFO, DEBUG, ERROR).
- **Metrics Visualization**
  - Provides an inbuilt `Grafana` dashboard to visualize all metrics.
- **Alerting Mechanism**
  - Provides inbuilt alert rules and an `Alertmanager` setup that sends Gmail notifications.
---

## **Getting Started**
@@ -371,6 +373,17 @@ The Metricly exporter provides the following API endpoints:
}
```

### **Alertmanager Configuration**
Metricly ships with inbuilt alerts for high CPU, memory, and disk utilization. Each resource has a warning alert at 60% and a critical alert at 80%.

![Sample Alerts](doc/alerts.png)

When the condition for any alert is met, an email notification is sent to the receiver configured in `config/alertmanager/alertmanager.yml`.

![High CPU Alert](doc/high_cpu_alert.png)

To include more alerts, take a look at `config/prometheus/alerts/`. Similar alerts can be built and added to the same directory.
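
As a rough illustration of what an additional rule file in that directory could look like, here is a minimal sketch. The file name `custom_alerts.yml`, the group name, the alert name, and the 90% threshold are all hypothetical; only the `metricly_memory_*` metrics are taken from the inbuilt memory alerts.

```yaml
# config/prometheus/alerts/custom_alerts.yml (hypothetical file name)
groups:
  - name: custom_alerts            # hypothetical group name
    rules:
      - alert: MemoryUsageVeryHigh # hypothetical alert name
        # Same expression as the inbuilt memory alerts, with a higher threshold.
        expr: 100 * (metricly_memory_total_bytes - metricly_memory_available_bytes) / metricly_memory_total_bytes > 90
        # The condition must hold continuously for this long before the alert fires.
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Very high memory usage detected"
          description: "Memory usage has been above 90% for 5 minutes"
```

If `promtool` is available, `promtool check rules <file>` can validate a rule file like this before Prometheus loads it.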

### **Development**

#### **Testing**
15 changes: 15 additions & 0 deletions alertmanager.yml
@@ -0,0 +1,15 @@
global:
  resolve_timeout: 5m

route:
  receiver: 'default'

receivers:
  - name: 'default'
    email_configs:
      - to: '[email protected]'
        from: '[email protected]'
        smarthost: 'smtp.gmail.com:587'
        auth_username: '[email protected]'
        auth_identity: '[email protected]'
        auth_password: 'jqhotwxphmieuaqw'
15 changes: 15 additions & 0 deletions config/alertmanager/alertmanager.yml
@@ -0,0 +1,15 @@
global:
  resolve_timeout: 5m

route:
  receiver: 'default'

receivers:
  - name: 'default'
    email_configs:
      - to: '[email protected]' # receiver's address
        from: '[email protected]'
        smarthost: 'smtp.gmail.com:587'
        auth_username: '[email protected]' # sender's address
        auth_identity: '[email protected]' # sender's address
        auth_password: 'xxxxxxxxxxxxx' # gmail app password generated by sender
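
The route in this file sends every alert to the single `default` receiver. Since the alert rules in this commit attach a `severity` label (`warning` or `critical`), one possible extension is to group and repeat notifications by that label. The fragment below is only a sketch and not part of the commit: the timing values are illustrative, and the child route reuses the `default` receiver because no other receiver is defined.

```yaml
route:
  receiver: 'default'
  # Batch alerts that share these labels into one notification.
  group_by: ['alertname', 'severity']
  group_wait: 30s       # wait before sending the first notification for a new group
  group_interval: 5m    # wait before notifying about new alerts added to a group
  repeat_interval: 4h   # wait before re-sending a notification that is still firing
  routes:
    # Illustrative child route: re-notify more often for critical alerts.
    - matchers:
        - severity="critical"
      receiver: 'default'
      repeat_interval: 1h
```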
20 changes: 20 additions & 0 deletions config/prometheus/alerts/cpu_alerts.yml
@@ -0,0 +1,20 @@
groups:
  - name: cpu_alerts
    rules:
      - alert: CPUUsage > 60%
        expr: avg_over_time(metricly_cpu_total[5m]) > 60
        for: 1m
        labels:
          severity: warning
        annotations:
          summary: "High CPU usage detected"
          description: "CPU usage is above 60% for the last 5 minutes on host {{ $labels.host }}"

      - alert: CPUUsage > 80%
        expr: avg_over_time(metricly_cpu_total[15m]) > 80
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "High CPU usage detected"
          description: "CPU usage is above 80% for the last 15 minutes on host {{ $labels.host }}"
20 changes: 20 additions & 0 deletions config/prometheus/alerts/disk_alerts.yml
@@ -0,0 +1,20 @@
groups:
  - name: disk_alerts
    rules:
      - alert: Disk Usage > 60%
        expr: 100*metricly_disk_used_bytes/metricly_disk_total_bytes > 60
        for: 1m
        labels:
          severity: warning
        annotations:
          summary: "High Disk usage detected"
          description: "Disk usage is above 60%"

      - alert: Disk Usage > 80%
        expr: 100*metricly_disk_used_bytes/metricly_disk_total_bytes > 80
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "High Disk usage detected"
          description: "Disk usage is above 80%"
20 changes: 20 additions & 0 deletions config/prometheus/alerts/memory_alerts.yml
@@ -0,0 +1,20 @@
groups:
  - name: memory_alerts
    rules:
      - alert: Memory Usage > 60%
        expr: 100*(metricly_memory_total_bytes-metricly_memory_available_bytes)/metricly_memory_total_bytes > 60
        for: 1m
        labels:
          severity: warning
        annotations:
          summary: "High Memory usage detected"
          description: "Memory usage is above 60%"

      - alert: Memory Usage > 80%
        expr: 100*(metricly_memory_total_bytes-metricly_memory_available_bytes)/metricly_memory_total_bytes > 80
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "High Memory usage detected"
          description: "Memory usage is above 80%"
Binary file added doc/alerts.png
Binary file added doc/high_cpu_alert.png
9 changes: 9 additions & 0 deletions docker-compose.yml
@@ -54,3 +54,12 @@ services:
    depends_on:
      - prometheus

  alertmanager:
    container_name: metricly_alertmanager
    image: docker.io/prom/alertmanager:v0.27.0
    network_mode: host
    volumes:
      - ./config/alertmanager/alertmanager.yml:/etc/alertmanager/alertmanager.yml:ro,z
    restart: always
    depends_on:
      - prometheus
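
This commit adds the Alertmanager service and the rule files but does not touch the Prometheus configuration itself, so the wiring between the two is not visible in the diff. For alerts to reach Alertmanager, Prometheus generally needs to load the rule files and be told where Alertmanager listens. The fragment below is a sketch of what that could look like, not the repository's actual configuration: the rule file path inside the container is an assumption, and the target assumes Alertmanager's default port 9093, reachable on `localhost` because the service uses `network_mode: host`.

```yaml
# prometheus.yml (fragment, assumed layout)
rule_files:
  # Assumed mount point for config/prometheus/alerts/ inside the Prometheus container.
  - /etc/prometheus/alerts/*.yml

alerting:
  alertmanagers:
    - static_configs:
        - targets:
            # Alertmanager's default listen port; adjust if it is configured differently.
            - localhost:9093
```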
