Add a "custom_namespace" block to prometheus.exporter.cloudwatch (#1658)
* Add a "custom_namespace" block.

* Remove alias to namespace conversion
ptodev authored Sep 11, 2024
1 parent a063540 commit 9cb05b9
Showing 5 changed files with 246 additions and 12 deletions.
2 changes: 2 additions & 0 deletions CHANGELOG.md
@@ -53,6 +53,8 @@ Main (unreleased)

- A new parameter `aws_sdk_version_v2` is added for the cloudwatch exporters configuration. It enables the use of aws sdk v2 which has shown to have significant performance benefits. (@kgeckhart, @andriikushch)

- `prometheus.exporter.cloudwatch` can now collect metrics from custom namespaces via the `custom_namespace` block. (@ptodev)

### Bugfixes

- Fix a bug where the scrape timeout for a Probe resource was not applied, overwriting the scrape interval instead. (@morremeyer, @stefanandres)
@@ -135,22 +135,27 @@ Omitted fields take their default values.

You can use the following blocks in `prometheus.exporter.cloudwatch` to configure collector-specific options:

| Hierarchy | Name | Description | Required |
|--------------------|------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------|----------|
| discovery | [discovery][] | Configures a discovery job. Multiple jobs can be configured. | no\* |
| discovery > role | [role][] | Configures the IAM roles the job should assume to scrape metrics. Defaults to the role configured in the environment {{< param "PRODUCT_NAME" >}} runs on. | no |
| discovery > metric | [metric][] | Configures the list of metrics the job should scrape. Multiple metrics can be defined inside one job. | yes |
| static | [static][] | Configures a static job. Multiple jobs can be configured. | no\* |
| static > role | [role][] | Configures the IAM roles the job should assume to scrape metrics. Defaults to the role configured in the environment {{< param "PRODUCT_NAME" >}} runs on. | no |
| static > metric | [metric][] | Configures the list of metrics the job should scrape. Multiple metrics can be defined inside one job. | yes |
| custom_namespace | [custom_namespace][] | Configures a custom namespace job. Multiple jobs can be configured. | no\* |
| custom_namespace > role | [role][] | Configures the IAM roles the job should assume to scrape metrics. Defaults to the role configured in the environment {{< param "PRODUCT_NAME" >}} runs on. | no |
| custom_namespace > metric | [metric][] | Configures the list of metrics the job should scrape. Multiple metrics can be defined inside one job. | yes |
| decoupled_scraping | [decoupled_scraping][] | Configures the decoupled scraping feature to retrieve metrics on a schedule and return the cached metrics. | no |

{{< admonition type="note" >}}
The `static`, `discovery`, and `custom_namespace` blocks are marked as not required,
but you must configure at least one static, discovery, or custom namespace job.
{{< /admonition >}}

[discovery]: #discovery-block
[static]: #static-block
[custom_namespace]: #custom_namespace-block
[metric]: #metric-block
[role]: #role-block
[decoupled_scraping]: #decoupled_scraping-block
@@ -257,6 +262,46 @@ require `Resource`, `Service`, `Class`, and `Type` dimensions to be specified. T
metrics,
all dimensions attached to a metric when saved in CloudWatch are required.

### custom_namespace block

The `custom_namespace` block allows the component to scrape CloudWatch metrics from custom namespaces using only the namespace name and a list of metrics under that namespace.
For example:

```alloy
prometheus.exporter.cloudwatch "discover_instances" {
sts_region = "eu-west-1"
custom_namespace "customEC2Metrics" {
namespace = "CustomEC2Metrics"
regions = ["us-east-1"]
metric {
name = "cpu_usage_idle"
statistics = ["Average"]
period = "5m"
}
metric {
name = "disk_free"
statistics = ["Average"]
period = "5m"
}
}
}
```

You can configure the `custom_namespace` block multiple times if you need to scrape metrics from different namespaces.
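
As a minimal sketch of a multi-namespace setup, the following configuration defines two `custom_namespace` jobs in one component and gives each job its own label. The component label, job labels, namespace names, metric names, and tag value are hypothetical placeholders; only the attribute and block names come from this page.

```alloy
prometheus.exporter.cloudwatch "custom_metrics" {
  sts_region = "eu-west-1"

  // First job: a hypothetical namespace scraped from a single region.
  custom_namespace "app" {
    namespace = "MyApp/Backend"
    regions   = ["us-east-1"]

    metric {
      name       = "request_latency"
      statistics = ["Average"]
      period     = "5m"
    }
  }

  // Second job: a different hypothetical namespace, restricted to
  // recently active metrics and carrying an extra custom tag.
  custom_namespace "batch" {
    namespace            = "MyApp/Batch"
    regions              = ["us-east-1", "eu-west-1"]
    recently_active_only = true
    custom_tags          = { team = "platform" }

    metric {
      name       = "jobs_failed"
      statistics = ["Sum"]
      period     = "5m"
    }
  }
}
```

With this sketch, metrics from the second job would carry a `custom_tag_team` label, following the `custom_tag_{key}` naming described for `custom_tags`.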

| Name | Type | Description | Default | Required |
| ----------------------------- | -------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------- | -------- |
| `regions` | `list(string)` | List of AWS regions. | | yes |
| `namespace` | `string` | CloudWatch metric namespace. | | yes |
| `recently_active_only` | `bool` | Only return metrics that have been active in the last 3 hours. | `false` | no |
| `custom_tags`                 | `map(string)`  | Custom tags to be added as a list of key / value pairs. When exported to Prometheus format, the label name uses the format `custom_tag_{key}`. | `{}`    | no       |
| `dimension_name_requirements` | `list(string)` | List of metric dimensions to query. Before querying metric values, the total list of metrics will be filtered to only those that contain exactly this list of dimensions. An empty or undefined list results in all dimension combinations being included. | `[]`    | no       |
| `nil_to_zero` | `bool` | When `true`, `NaN` metric values are converted to 0. Individual metrics can override this value in the [metric][] block. | `true` | no |
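
The interaction between the job-level and metric-level `nil_to_zero` settings can be sketched as follows. The configuration mirrors the one exercised by this commit's tests; the component label `nil_to_zero_example` is a hypothetical placeholder.

```alloy
prometheus.exporter.cloudwatch "nil_to_zero_example" {
  sts_region = "eu-west-1"

  custom_namespace "customEC2Metrics" {
    namespace = "CustomEC2Metrics"
    regions   = ["us-east-1"]

    // Job-level setting: applies to every metric that does not override it.
    nil_to_zero = false

    // Inherits nil_to_zero = false from the job.
    metric {
      name       = "cpu_usage_idle"
      statistics = ["Average"]
      period     = "5m"
    }

    // Metric-level setting: overrides the job-level value for this metric only.
    metric {
      name        = "disk_free"
      statistics  = ["Average"]
      period      = "5m"
      nil_to_zero = true
    }
  }
}
```

If neither level sets `nil_to_zero`, the default of `true` from the table above applies.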


### metric block

Represents an AWS Metrics to scrape. To see available metrics, AWS does not keep a documentation page with all available
56 changes: 53 additions & 3 deletions internal/component/prometheus/exporter/cloudwatch/config.go
@@ -41,6 +41,7 @@ type Arguments struct {
DiscoveryExportedTags TagsPerNamespace `alloy:"discovery_exported_tags,attr,optional"`
Discovery []DiscoveryJob `alloy:"discovery,block,optional"`
Static []StaticJob `alloy:"static,block,optional"`
CustomNamespace []CustomNamespaceJob `alloy:"custom_namespace,block,optional"`
DecoupledScrape DecoupledScrapeConfig `alloy:"decoupled_scraping,block,optional"`
UseAWSSDKVersion2 bool `alloy:"aws_sdk_version_v2,attr,optional"`
}
@@ -55,14 +56,16 @@ type DecoupledScrapeConfig struct {
type TagsPerNamespace = cloudwatch_exporter.TagsPerNamespace

// DiscoveryJob configures a discovery job for a given service.
// TODO: Add a recently_active_only attribute.
type DiscoveryJob struct {
Auth RegionAndRoles `alloy:",squash"`
CustomTags Tags `alloy:"custom_tags,attr,optional"`
SearchTags Tags `alloy:"search_tags,attr,optional"`
Type string `alloy:"type,attr"`
DimensionNameRequirements []string `alloy:"dimension_name_requirements,attr,optional"`
Metrics []Metric `alloy:"metric,block"`
//TODO: Remove NilToZero, because it is deprecated upstream.
NilToZero *bool `alloy:"nil_to_zero,attr,optional"`
}

// Tags represents a series of tags configured on an AWS resource. Each tag is a
@@ -77,7 +80,20 @@ type StaticJob struct {
Namespace string `alloy:"namespace,attr"`
Dimensions Dimensions `alloy:"dimensions,attr"`
Metrics []Metric `alloy:"metric,block"`
//TODO: Remove NilToZero, because it is deprecated upstream.
NilToZero *bool `alloy:"nil_to_zero,attr,optional"`
}

type CustomNamespaceJob struct {
Auth RegionAndRoles `alloy:",squash"`
Name string `alloy:",label"`
CustomTags Tags `alloy:"custom_tags,attr,optional"`
DimensionNameRequirements []string `alloy:"dimension_name_requirements,attr,optional"`
Namespace string `alloy:"namespace,attr"`
RecentlyActiveOnly bool `alloy:"recently_active_only,attr,optional"`
Metrics []Metric `alloy:"metric,block"`
//TODO: Remove NilToZero, because it is deprecated upstream.
NilToZero *bool `alloy:"nil_to_zero,attr,optional"`
}

// RegionAndRoles exposes for each supported job, the AWS regions and IAM roles
@@ -181,14 +197,19 @@ func convertToYACE(a Arguments) (yaceModel.JobsConfig, error) {
for _, stat := range a.Static {
staticJobs = append(staticJobs, toYACEStaticJob(stat))
}
var customNamespaceJobs []*yaceConf.CustomNamespace
for _, cn := range a.CustomNamespace {
customNamespaceJobs = append(customNamespaceJobs, toYACECustomNamespaceJob(cn))
}
conf := yaceConf.ScrapeConf{
APIVersion: "v1alpha1",
StsRegion: a.STSRegion,
Discovery: yaceConf.Discovery{
ExportedTagsOnMetrics: yaceConf.ExportedTagsOnMetrics(a.DiscoveryExportedTags),
Jobs: discoveryJobs,
},
Static: staticJobs,
CustomNamespace: customNamespaceJobs,
}

// Run the exporter's config validation. Among other things, it will check that the service for which a discovery
@@ -311,6 +332,35 @@ func toYACEDiscoveryJob(rj DiscoveryJob) *yaceConf.Job {
return job
}

func toYACECustomNamespaceJob(cn CustomNamespaceJob) *yaceConf.CustomNamespace {
nilToZero := cn.NilToZero
if nilToZero == nil {
nilToZero = &defaultNilToZero
}
return &yaceConf.CustomNamespace{
Name: cn.Name,
Namespace: cn.Namespace,
Regions: cn.Auth.Regions,
Roles: toYACERoles(cn.Auth.Roles),
CustomTags: cn.CustomTags.toYACE(),
DimensionNameRequirements: cn.DimensionNameRequirements,
// By setting RoundingPeriod to nil, the exporter will align the start and end times for retrieving CloudWatch
// metrics, with the smallest period in the retrieved batch.
RoundingPeriod: nil,
RecentlyActiveOnly: cn.RecentlyActiveOnly,
JobLevelMetricFields: yaceConf.JobLevelMetricFields{
// Set the job-wide scraping time settings to zero. These should be configured at the metric level to make the
// data being fetched more explicit.
Period: 0,
Length: 0,
Delay: 0,
NilToZero: nilToZero,
AddCloudwatchTimestamp: &addCloudwatchTimestamp,
},
Metrics: toYACEMetrics(cn.Metrics, nilToZero),
}
}

// getHash calculates the MD5 hash of the Alloy representation of the config.
func getHash(a Arguments) string {
bytes, err := syntax.Marshal(a)
136 changes: 136 additions & 0 deletions internal/component/prometheus/exporter/cloudwatch/config_test.go
@@ -108,6 +108,27 @@ discovery {
}
`

const customNamespaceJobConfig = `
sts_region = "eu-west-1"
custom_namespace "customEC2Metrics" {
namespace = "CustomEC2Metrics"
regions = ["us-east-1"]
metric {
name = "cpu_usage_idle"
statistics = ["Average"]
period = "5m"
}
metric {
name = "disk_free"
statistics = ["Average"]
period = "5m"
}
}
`

const staticJobNilToZeroConfig = `
sts_region = "us-east-2"
debug = true
@@ -173,6 +194,31 @@ discovery {
}
`

const customNamespacebNilToZeroJobConfig = `
sts_region = "eu-west-1"
custom_namespace "customEC2Metrics" {
namespace = "CustomEC2Metrics"
regions = ["us-east-1"]
// setting nil_to_zero on the job level
nil_to_zero = false
metric {
name = "cpu_usage_idle"
statistics = ["Average"]
period = "5m"
}
metric {
name = "disk_free"
statistics = ["Average"]
period = "5m"
// setting nil_to_zero on the metric level
nil_to_zero = true
}
}
`

func TestCloudwatchComponentConfig(t *testing.T) {
type testcase struct {
raw string
@@ -355,6 +401,51 @@ func TestCloudwatchComponentConfig(t *testing.T) {
},
},
},
"single custom namespace job config": {
raw: customNamespaceJobConfig,
expected: yaceModel.JobsConfig{
StsRegion: "eu-west-1",
CustomNamespaceJobs: []yaceModel.CustomNamespaceJob{
{
Name: "customEC2Metrics",
Regions: []string{"us-east-1"},
// assert an empty role is used as default. IMPORTANT since this
// is what YACE looks for delegating to the environment role
Roles: []yaceModel.Role{{}},
CustomTags: []yaceModel.Tag{},
Namespace: "CustomEC2Metrics",
Metrics: []*yaceModel.MetricConfig{
{
Name: "cpu_usage_idle",
Statistics: []string{"Average"},
Period: 300,
Length: 300,
Delay: 0,
NilToZero: defaultNilToZero,
AddCloudwatchTimestamp: addCloudwatchTimestamp,
},
{
Name: "disk_free",
Statistics: []string{"Average"},
Period: 300,
Length: 300,
Delay: 0,
NilToZero: defaultNilToZero,
AddCloudwatchTimestamp: addCloudwatchTimestamp,
},
},
RoundingPeriod: nil,
JobLevelMetricFields: yaceModel.JobLevelMetricFields{
Period: 0,
Length: 0,
Delay: 0,
AddCloudwatchTimestamp: &falsePtr,
NilToZero: &defaultNilToZero,
},
},
},
},
},
"static job nil to zero": {
raw: staticJobNilToZeroConfig,
expected: yaceModel.JobsConfig{
@@ -473,6 +564,51 @@ func TestCloudwatchComponentConfig(t *testing.T) {
},
},
},
"custom namespace job nil to zero config": {
raw: customNamespacebNilToZeroJobConfig,
expected: yaceModel.JobsConfig{
StsRegion: "eu-west-1",
CustomNamespaceJobs: []yaceModel.CustomNamespaceJob{
{
Name: "customEC2Metrics",
Regions: []string{"us-east-1"},
// assert an empty role is used as default. IMPORTANT since this
// is what YACE looks for delegating to the environment role
Roles: []yaceModel.Role{{}},
CustomTags: []yaceModel.Tag{},
Namespace: "CustomEC2Metrics",
Metrics: []*yaceModel.MetricConfig{
{
Name: "cpu_usage_idle",
Statistics: []string{"Average"},
Period: 300,
Length: 300,
Delay: 0,
NilToZero: falsePtr,
AddCloudwatchTimestamp: addCloudwatchTimestamp,
},
{
Name: "disk_free",
Statistics: []string{"Average"},
Period: 300,
Length: 300,
Delay: 0,
NilToZero: truePtr,
AddCloudwatchTimestamp: addCloudwatchTimestamp,
},
},
RoundingPeriod: nil,
JobLevelMetricFields: yaceModel.JobLevelMetricFields{
Period: 0,
Length: 0,
Delay: 0,
AddCloudwatchTimestamp: &falsePtr,
NilToZero: &falsePtr,
},
},
},
},
},
} {
t.Run(name, func(t *testing.T) {
args := Arguments{}
@@ -12,6 +12,7 @@ func (b *ConfigBuilder) appendCloudwatchExporter(config *cloudwatch_exporter.Con
}

func toCloudwatchExporter(config *cloudwatch_exporter.Config) *cloudwatch.Arguments {
// There's no need to fill out CustomNamespace, because static mode doesn't support it.
return &cloudwatch.Arguments{
STSRegion: config.STSRegion,
FIPSDisabled: config.FIPSDisabled,