Add backup vault and recovery services vault monitors (#58)
* Add backup vault and recovery services vault monitors

* adjust no data timeframe to max allowed value

* update READMEs
Aohzan authored Sep 19, 2024
1 parent 396098d commit 94d1ea0
Showing 18 changed files with 598 additions and 1 deletion.
2 changes: 2 additions & 0 deletions README.md
@@ -181,6 +181,7 @@ For example, this will regenerate every READMEs thanks to [terraform-docs](https
- [app-gateway](https://github.com/claranet/terraform-datadog-monitors/tree/master/cloud/azure/app-gateway/)
- [app-services](https://github.com/claranet/terraform-datadog-monitors/tree/master/cloud/azure/app-services/)
- [azure-search](https://github.com/claranet/terraform-datadog-monitors/tree/master/cloud/azure/azure-search/)
- [backup-vault](https://github.com/claranet/terraform-datadog-monitors/tree/master/cloud/azure/backup-vault/)
- [cosmosdb](https://github.com/claranet/terraform-datadog-monitors/tree/master/cloud/azure/cosmosdb/)
- [datalakestore](https://github.com/claranet/terraform-datadog-monitors/tree/master/cloud/azure/datalakestore/)
- [eventgrid](https://github.com/claranet/terraform-datadog-monitors/tree/master/cloud/azure/eventgrid/)
@@ -191,6 +192,7 @@ For example, this will regenerate every READMEs thanks to [terraform-docs](https
- [load-balancer](https://github.com/claranet/terraform-datadog-monitors/tree/master/cloud/azure/load-balancer/)
- [mysql](https://github.com/claranet/terraform-datadog-monitors/tree/master/cloud/azure/mysql/)
- [postgresql](https://github.com/claranet/terraform-datadog-monitors/tree/master/cloud/azure/postgresql/)
- [recovery-services-vault](https://github.com/claranet/terraform-datadog-monitors/tree/master/cloud/azure/recovery-services-vault/)
- [redis](https://github.com/claranet/terraform-datadog-monitors/tree/master/cloud/azure/redis/)
- [serverfarms](https://github.com/claranet/terraform-datadog-monitors/tree/master/cloud/azure/serverfarms/)
- [servicebus](https://github.com/claranet/terraform-datadog-monitors/tree/master/cloud/azure/servicebus/)
91 changes: 91 additions & 0 deletions cloud/azure/backup-vault/README.md
@@ -0,0 +1,91 @@
# CLOUD AZURE BACKUP-VAULT DataDog monitors

## How to use this module

```hcl
module "datadog-monitors-cloud-azure-backup-vault" {
  source  = "claranet/monitors/datadog//cloud/azure/backup-vault"
  version = "{revision}"

  environment = var.environment
  message     = module.datadog-message-alerting.alerting-message
}
```
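
Individual monitors can be tuned or disabled through the inputs listed further down. The following is a minimal sketch of an alternative call, assuming you want a shorter evaluation window for backup events and no restore monitor; the `last_4h` timeframe and the `team:backup` tag are illustrative values, not recommendations from the module authors:

```hcl
module "datadog-monitors-cloud-azure-backup-vault" {
  source  = "claranet/monitors/datadog//cloud/azure/backup-vault"
  version = "{revision}"

  environment = var.environment
  message     = module.datadog-message-alerting.alerting-message

  # Evaluate backup health over the last 4 hours instead of the default last_1d.
  backup_unhealthy_event_timeframe = "last_4h"

  # Illustrative extra tag, appended to the common and default tags of this monitor.
  backup_unhealthy_event_extra_tags = ["team:backup"]

  # The enable flags are declared as strings and compared against "true",
  # so any other value (here "false") skips creation of the monitor.
  restore_unhealthy_event_enabled = "false"
}
```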

## Purpose

Creates DataDog monitors with the following checks:

- Backup Vault {{name}} has a backup unhealthy event
- Backup Vault {{name}} has a restore unhealthy event

<!-- BEGIN_TF_DOCS -->
## Requirements

| Name | Version |
|------|---------|
| <a name="requirement_terraform"></a> [terraform](#requirement\_terraform) | >= 0.12.31 |
| <a name="requirement_datadog"></a> [datadog](#requirement\_datadog) | >= 3.1.2 |

## Providers

| Name | Version |
|------|---------|
| <a name="provider_datadog"></a> [datadog](#provider\_datadog) | >= 3.1.2 |

## Modules

| Name | Source | Version |
|------|--------|---------|
| <a name="module_filter-tags"></a> [filter-tags](#module\_filter-tags) | ../../../common/filter-tags | n/a |
| <a name="module_filter-tags-unhealthy-event"></a> [filter-tags-unhealthy-event](#module\_filter-tags-unhealthy-event) | ../../../common/filter-tags | n/a |

## Resources

| Name | Type |
|------|------|
| [datadog_monitor.backup_vault_backup_unhealthy_event](https://registry.terraform.io/providers/DataDog/datadog/latest/docs/resources/monitor) | resource |
| [datadog_monitor.backup_vault_restore_unhealthy_event](https://registry.terraform.io/providers/DataDog/datadog/latest/docs/resources/monitor) | resource |

## Inputs

| Name | Description | Type | Default | Required |
|------|-------------|------|---------|:--------:|
| <a name="input_backup_unhealthy_event_enabled"></a> [backup\_unhealthy\_event\_enabled](#input\_backup\_unhealthy\_event\_enabled) | Flag to enable Backup Vault Unhealthy Backup Event monitor | `string` | `"true"` | no |
| <a name="input_backup_unhealthy_event_extra_tags"></a> [backup\_unhealthy\_event\_extra\_tags](#input\_backup\_unhealthy\_event\_extra\_tags) | Extra tags for Backup Vault Unhealthy Backup Event monitor | `list(string)` | `[]` | no |
| <a name="input_backup_unhealthy_event_message"></a> [backup\_unhealthy\_event\_message](#input\_backup\_unhealthy\_event\_message) | Custom message for Backup Vault Unhealthy Backup Event monitor | `string` | `""` | no |
| <a name="input_backup_unhealthy_event_time_aggregator"></a> [backup\_unhealthy\_event\_time\_aggregator](#input\_backup\_unhealthy\_event\_time\_aggregator) | Monitor aggregator for Backup Vault Unhealthy Backup Event [available values: min, max or avg] | `string` | `"min"` | no |
| <a name="input_backup_unhealthy_event_timeframe"></a> [backup\_unhealthy\_event\_timeframe](#input\_backup\_unhealthy\_event\_timeframe) | Monitor timeframe for Backup Vault Unhealthy Backup Event [available values: `last_#m` (1, 5, 10, 15, or 30), `last_#h` (1, 2, or 4), or `last_1d`] | `string` | `"last_1d"` | no |
| <a name="input_environment"></a> [environment](#input\_environment) | Architecture Environment | `string` | n/a | yes |
| <a name="input_evaluation_delay"></a> [evaluation\_delay](#input\_evaluation\_delay) | Delay in seconds for the metric evaluation | `number` | `900` | no |
| <a name="input_filter_tags_custom"></a> [filter\_tags\_custom](#input\_filter\_tags\_custom) | Tags used for custom filtering when filter\_tags\_use\_defaults is false | `string` | `"*"` | no |
| <a name="input_filter_tags_custom_excluded"></a> [filter\_tags\_custom\_excluded](#input\_filter\_tags\_custom\_excluded) | Tags excluded for custom filtering when filter\_tags\_use\_defaults is false | `string` | `""` | no |
| <a name="input_filter_tags_use_defaults"></a> [filter\_tags\_use\_defaults](#input\_filter\_tags\_use\_defaults) | Use default filter tags convention | `string` | `"true"` | no |
| <a name="input_message"></a> [message](#input\_message) | Message sent when a monitor is triggered | `any` | n/a | yes |
| <a name="input_new_group_delay"></a> [new\_group\_delay](#input\_new\_group\_delay) | Delay in seconds before monitoring a new resource | `number` | `300` | no |
| <a name="input_no_data_timeframe"></a> [no\_data\_timeframe](#input\_no\_data\_timeframe) | Number of minutes before reporting no data | `string` | `1440` | no |
| <a name="input_notify_no_data"></a> [notify\_no\_data](#input\_notify\_no\_data) | Will raise no data alert if set to true | `bool` | `true` | no |
| <a name="input_prefix_slug"></a> [prefix\_slug](#input\_prefix\_slug) | Prefix string to prepend between brackets on every monitor name | `string` | `""` | no |
| <a name="input_restore_unhealthy_event_enabled"></a> [restore\_unhealthy\_event\_enabled](#input\_restore\_unhealthy\_event\_enabled) | Flag to enable Backup Vault Unhealthy Restore Event monitor | `string` | `"true"` | no |
| <a name="input_restore_unhealthy_event_extra_tags"></a> [restore\_unhealthy\_event\_extra\_tags](#input\_restore\_unhealthy\_event\_extra\_tags) | Extra tags for Backup Vault Unhealthy Restore Event monitor | `list(string)` | `[]` | no |
| <a name="input_restore_unhealthy_event_message"></a> [restore\_unhealthy\_event\_message](#input\_restore\_unhealthy\_event\_message) | Custom message for Backup Vault Unhealthy Restore Event monitor | `string` | `""` | no |
| <a name="input_restore_unhealthy_event_time_aggregator"></a> [restore\_unhealthy\_event\_time\_aggregator](#input\_restore\_unhealthy\_event\_time\_aggregator) | Monitor aggregator for Backup Vault Unhealthy Restore Event [available values: min, max or avg] | `string` | `"min"` | no |
| <a name="input_restore_unhealthy_event_timeframe"></a> [restore\_unhealthy\_event\_timeframe](#input\_restore\_unhealthy\_event\_timeframe) | Monitor timeframe for Backup Vault Unhealthy Restore Event [available values: `last_#m` (1, 5, 10, 15, or 30), `last_#h` (1, 2, or 4), or `last_1d`] | `string` | `"last_1d"` | no |
| <a name="input_tags"></a> [tags](#input\_tags) | Global variables | `list(string)` | <pre>[<br> "type:cloud",<br> "provider:azure",<br> "resource:backup_vault"<br>]</pre> | no |
| <a name="input_team"></a> [team](#input\_team) | n/a | `string` | `"claranet"` | no |
| <a name="input_timeout_h"></a> [timeout\_h](#input\_timeout\_h) | Default auto-resolving state (in hours) | `number` | `0` | no |

## Outputs

| Name | Description |
|------|-------------|
| <a name="output_backup_vault_backup_unhealthy_event_id"></a> [backup\_vault\_backup\_unhealthy\_event\_id](#output\_backup\_vault\_backup\_unhealthy\_event\_id) | id for monitor backup\_vault\_backup\_unhealthy\_event |
| <a name="output_backup_vault_restore_unhealthy_event_id"></a> [backup\_vault\_restore\_unhealthy\_event\_id](#output\_backup\_vault\_restore\_unhealthy\_event\_id) | id for monitor backup\_vault\_restore\_unhealthy\_event |
<!-- END_TF_DOCS -->
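
Because each monitor is created with `count`, both outputs are lists: they hold one ID when the monitor is enabled and are empty when it is disabled. A minimal sketch of reading a single ID from the calling configuration, assuming the module label from the usage example above (the output name `backup_vault_backup_monitor_id` is hypothetical):

```hcl
output "backup_vault_backup_monitor_id" {
  description = "ID of the backup unhealthy event monitor, or null when the monitor is disabled"

  # Indexing an empty list raises an error; try() catches it and returns null instead.
  value = try(
    module.datadog-monitors-cloud-azure-backup-vault.backup_vault_backup_unhealthy_event_id[0],
    null,
  )
}
```

`try()` is used rather than `one()` because the module only requires Terraform `>= 0.12.31`, and `one()` is not available in the 0.12 series.
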
## Related documentation

DataDog documentation: [https://docs.datadoghq.com/integrations/azure/](https://docs.datadoghq.com/integrations/azure/)
You must search `keyvault`, there is no integration for now.

Azure metrics documentation: [https://docs.microsoft.com/fr-fr/azure/monitoring-and-diagnostics/monitoring-supported-metrics#microsoftkeyvaultvaults](https://docs.microsoft.com/fr-fr/azure/monitoring-and-diagnostics/monitoring-supported-metrics#microsoftkeyvaultvaults)
1 change: 1 addition & 0 deletions cloud/azure/backup-vault/common-inputs.tf
1 change: 1 addition & 0 deletions cloud/azure/backup-vault/common-locals.tf
117 changes: 117 additions & 0 deletions cloud/azure/backup-vault/inputs.tf
@@ -0,0 +1,117 @@
# Global variables
variable "tags" {
type = list(string)
default = ["type:cloud", "provider:azure", "resource:backup_vault"]
}

# Datadog variables
variable "filter_tags_use_defaults" {
description = "Use default filter tags convention"
default = "true"
}

variable "filter_tags_custom" {
description = "Tags used for custom filtering when filter_tags_use_defaults is false"
default = "*"
}

variable "filter_tags_custom_excluded" {
description = "Tags excluded for custom filtering when filter_tags_use_defaults is false"
default = ""
}

variable "message" {
description = "Message sent when a monitor is triggered"
}

variable "evaluation_delay" {
description = "Delay in seconds for the metric evaluation"
default = 900
}

variable "new_group_delay" {
description = "Delay in seconds before monitoring a new resource"
default = 300
}

variable "timeout_h" {
description = "Default auto-resolving state (in hours)"
default = 0
}

variable "prefix_slug" {
description = "Prefix string to prepend between brackets on every monitor name"
default = ""
}

variable "notify_no_data" {
description = "Will raise no data alert if set to true"
default = true
}

variable "no_data_timeframe" {
description = "Number of minutes before reporting no data"
type = string
default = 1440
}

# Azure Backup Vault Unhealthy Backup Event monitor
variable "backup_unhealthy_event_enabled" {
description = "Flag to enable Backup Vault Unhealthy Backup Event monitor"
type = string
default = "true"
}

variable "backup_unhealthy_event_message" {
description = "Custom message for Backup Vault Unhealthy Backup Event monitor"
type = string
default = ""
}

variable "backup_unhealthy_event_time_aggregator" {
description = "Monitor aggregator for Backup Vault Unhealthy Backup Event [available values: min, max or avg]"
type = string
default = "min"
}

variable "backup_unhealthy_event_timeframe" {
description = "Monitor timeframe for Backup Vault Unhealthy Backup Event [available values: `last_#m` (1, 5, 10, 15, or 30), `last_#h` (1, 2, or 4), or `last_1d`]"
default = "last_1d"
}

variable "backup_unhealthy_event_extra_tags" {
description = "Extra tags for Backup Vault Unhealthy Backup Event monitor"
type = list(string)
default = []
}


# Azure Backup Vault Unhealthy Restore Event monitor
variable "restore_unhealthy_event_enabled" {
description = "Flag to enable Backup Vault Unhealthy Restore Event monitor"
type = string
default = "true"
}

variable "restore_unhealthy_event_message" {
description = "Custom message for Backup Vault Unhealthy Restore Event monitor"
type = string
default = ""
}

variable "restore_unhealthy_event_time_aggregator" {
description = "Monitor aggregator for Backup Vault Unhealthy Restore Event [available values: min, max or avg]"
type = string
default = "min"
}

variable "restore_unhealthy_event_timeframe" {
description = "Monitor timeframe for Backup Vault Unhealthy Restore Event [available values: `last_#m` (1, 5, 10, 15, or 30), `last_#h` (1, 2, or 4), or `last_1d`]"
default = "last_1d"
}

variable "restore_unhealthy_event_extra_tags" {
description = "Extra tags for Backup Vault Unhealthy Restore Event monitor"
type = list(string)
default = []
}
20 changes: 20 additions & 0 deletions cloud/azure/backup-vault/modules.tf
@@ -0,0 +1,20 @@
module "filter-tags" {
source = "../../../common/filter-tags"

environment = var.environment
resource = "azure_backup_vault"
filter_tags_use_defaults = var.filter_tags_use_defaults
filter_tags_custom = var.filter_tags_custom
filter_tags_custom_excluded = var.filter_tags_custom_excluded
}

module "filter-tags-unhealthy-event" {
source = "../../../common/filter-tags"

environment = var.environment
resource = "azure_backup_vault"
filter_tags_use_defaults = var.filter_tags_use_defaults
filter_tags_custom = var.filter_tags_custom
filter_tags_custom_excluded = var.filter_tags_custom_excluded
extra_tags = ["!health_status:healthy"]
}
49 changes: 49 additions & 0 deletions cloud/azure/backup-vault/monitors-backup-vault.tf
@@ -0,0 +1,49 @@
resource "datadog_monitor" "backup_vault_backup_unhealthy_event" {
count = var.backup_unhealthy_event_enabled == "true" ? 1 : 0
name = "${var.prefix_slug == "" ? "" : "[${var.prefix_slug}]"}[${var.environment}] Backup Vault {{name}} has a backup unhealthy event"
message = coalesce(var.backup_unhealthy_event_message, var.message)
type = "query alert"

query = <<EOQ
${var.backup_unhealthy_event_time_aggregator}(${var.backup_unhealthy_event_timeframe}): (
avg:azure.dataprotection_backup_vaults.backup_health_event${module.filter-tags-unhealthy-event.query_alert} by {name}
) > 0
EOQ

evaluation_delay = var.evaluation_delay
new_group_delay = var.new_group_delay
notify_no_data = var.notify_no_data
no_data_timeframe = var.no_data_timeframe
renotify_interval = 0
notify_audit = false
timeout_h = var.timeout_h
include_tags = true
require_full_window = false

tags = concat(local.common_tags, var.tags, var.backup_unhealthy_event_extra_tags)
}

resource "datadog_monitor" "backup_vault_restore_unhealthy_event" {
count = var.restore_unhealthy_event_enabled == "true" ? 1 : 0
name = "${var.prefix_slug == "" ? "" : "[${var.prefix_slug}]"}[${var.environment}] Backup Vault {{name}} has a restore unhealthy event"
message = coalesce(var.restore_unhealthy_event_message, var.message)
type = "query alert"

query = <<EOQ
${var.restore_unhealthy_event_time_aggregator}(${var.restore_unhealthy_event_timeframe}): (
avg:azure.dataprotection_backup_vaults.restore_health_event${module.filter-tags-unhealthy-event.query_alert} by {name}
) > 0
EOQ

evaluation_delay = var.evaluation_delay
new_group_delay = var.new_group_delay
notify_no_data = false
renotify_interval = 0
notify_audit = false
timeout_h = var.timeout_h
include_tags = true
require_full_window = false

tags = concat(local.common_tags, var.tags, var.restore_unhealthy_event_extra_tags)
}

10 changes: 10 additions & 0 deletions cloud/azure/backup-vault/outputs.tf
@@ -0,0 +1,10 @@
output "backup_vault_backup_unhealthy_event_id" {
description = "id for monitor backup_vault_backup_unhealthy_event"
value = datadog_monitor.backup_vault_backup_unhealthy_event.*.id
}

output "backup_vault_restore_unhealthy_event_id" {
description = "id for monitor backup_vault_restore_unhealthy_event"
value = datadog_monitor.backup_vault_restore_unhealthy_event.*.id
}

9 changes: 9 additions & 0 deletions cloud/azure/backup-vault/versions.tf
@@ -0,0 +1,9 @@
terraform {
required_providers {
datadog = {
source = "DataDog/datadog"
version = ">= 3.1.2"
}
}
required_version = ">= 0.12.31"
}