Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Elastic Fabric Adapter (EFA) metric collection to awscontainerinsightreceiver. #186

Merged
merged 5 commits into from
Mar 28, 2024

Conversation

straussb
Copy link

@straussb straussb commented Mar 20, 2024

Add Elastic Fabric Adapter (EFA) metric collection to awscontainerinsightreceiver.

The new component scrapes hardware counters from /sys/class/infiniband on disk. The layout of that directory is:

/sys/class/infiniband/<device name>
└── ports
    └── 1
        └── hw_counters
            ├── rdma_read_bytes
            ├── rdma_write_bytes
            ├── rdma_write_recv_bytes
            ├── rx_bytes
            ├── rx_drops
            └── tx_bytes

These are cumulative counters and so they are converted to deltas before sending down the pipeline.
We sum up data from all ports.

The device data is joined with data from the kubelet podresources API which tells us which container a given device is
assigned to.

The metrics are reported at container, pod, and node levels.

This commit also refactors some metric decoration code from cadvisor to a common localnode decorator, intended to be
used by any awscontainerinsightreceiver component that gathers metrics from the local node (as oppoosed to e.g. the k8s
control plane scraper). This is because we want to share the logic of populating PodName, Kubernetes labels, etc.

I also renamed RawContainerInsightsMetric to CIMetricImpl for brevity.

See also accompanying cloudwatch-agent PR: aws/amazon-cloudwatch-agent#1098


func (s *Scraper) fillMetric(metric stores.CIMetric, metricType string, namespace string, podName string,
containerName string, deviceName string, timestamp time.Time, counters *efaCounters) {
s.assignRateValueToField(ci.MetricName(metricType, ci.EfaRdmaReadBytes), counters.rdmaReadBytes,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably makes sense to have a map, and you can loop over to assignRate value to each field

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I did what you're asking in the next commit, let me know if you were imagining something different.

d.logger.Warn("Can't get containerInstanceId")
} else {
tags[ci.ContainerInstanceIDKey] = d.ecsInfo.GetContainerInstanceID()
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why cant we move lines 110-120 to addECSResources func as well

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, done. I had moved this from cadvisor_linux and didn't make any modifications.

}
}

func addECSResources(tags map[string]string) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, moved the Instance* metric type logic to there.

movence
movence previously approved these changes Mar 22, 2024
@straussb
Copy link
Author

For latest revision, I squashed all my commits together and rebased on aws-cwa-dev.

This got rid of the history... if you want to see the previous state of my PR, you can view the tree at: https://github.com/straussb/opentelemetry-collector-contrib/tree/db5503cb45f8e78b169da51c95bc4489f55db3ff

And compare to current state at: https://github.com/straussb/opentelemetry-collector-contrib/compare/db5503cb45f8e78b169da51c95bc4489f55db3ff..efa

The files that had rebase conflicts were:

  • internal/aws/containerinsight/const.go
  • receiver/awscontainerinsightreceiver/internal/prometheusscraper/decoratorconsumer/decorator.go
  • receiver/awscontainerinsightreceiver/internal/stores/utils.go
  • receiver/awscontainerinsightreceiver/internal/stores/utils_test.go
  • receiver/awscontainerinsightreceiver/receiver.go

So review those a bit more carefully.

Since last review, I also added a bit more logic in efaSysfs.go and podresourcesclient.go to check permissions/ownership of /sys/class/infiniband and the kubelet podresources socket, for security.

TypeContainerGPU = "ContainerGPU"
TypePodGPU = "PodGPU"
TypeNodeGPU = "NodeGPU"
TypeClusterGPU = "ClusterGPU"
TypeNeuronContainer = "ContainerNeuron"
Copy link

@movence movence Mar 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't need to be updated in this commit, but it would be nice to update this to TypeContainerNeuron as well for consistency

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, done.


stat, ok := info.Sys().(*syscall.Stat_t)
if !ok {
return true, nil
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it ok to return this condition as true?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would essentially mean that we are not running on Linux. But you're right, we should skip in that case so we remember to implement this logic for Windows if we ever add support for that. Changed in next commit.

// This was already patched and submitted, see
// https://www.spinics.net/lists/linux-rdma/msg68596.html
// Remove this as soon as the fix lands in the enterprise distros.
if strings.Contains(stringValue, "N/A (no PMA)") {
Copy link

@movence movence Mar 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will this break if the fix gets promoted? how do we track the release of this change?

Copy link
Author

@straussb straussb Mar 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question. I'm going to follow up with a commit to improve partial error handling - still post node-level metrics if we couldn't get pod resources data, as we discussed in the meeting; and also include data for some devices/counters even if reading some other data fails. As part of that commit, I will remove this logic and just skip if parsing to int failed.

(The comment is wrong anyway, I copied from somewhere else, so thanks for pointing this out. There's no upstream fix that will change anything, since we've implemented our own counter-reading code here.)

if err != nil {
acir.settings.Logger.Debug("Unable to start neuron scraper", zap.Error(err))
}

acir.initEfaSysfsScraper(localnodeDecorator)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should catch a error here as well, and log if the EFA scraper is not able to start

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previously there was no reason that the scraper couldn't start. Added one case now.

ReadCounter(deviceName efaDeviceName, port string, counter string) (uint64, error)
}

type podResourcesStore interface {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why create a new one, we can reuse the one defined in the neuronScraper

Copy link
Author

@straussb straussb Mar 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's OK to duplicate in this case - I think the typical Go style would say that each class can independently mock out its dependencies as it chooses to. I thought about moving the interface to podresourcesstore.go but it doesn't make sense to take that flexibility away from the dependent classes.

…ightreceiver.

The new component scrapes hardware counters from /sys/class/infiniband on disk.  The layout of that directory is:

/sys/class/infiniband/<device name>
└── ports
    └── 1
        └── hw_counters
            ├── rdma_read_bytes
            ├── rdma_write_bytes
            ├── rdma_write_recv_bytes
            ├── rx_bytes
            ├── rx_drops
            └── tx_bytes

These are cumulative counters and so they are converted to deltas before sending down the pipeline.
We sum up data from all ports.

The device data is joined with data from the kubelet podresources API which tells us which container a given device is
assigned to.

The metrics are reported at container, pod, and node levels.

This commit also refactors some metric decoration code from cadvisor to a common localnode decorator, intended to be
used by any awscontainerinsightreceiver component that gathers metrics from the local node (as oppoosed to e.g. the k8s
control plane scraper).  This is because we want to share the logic of populating PodName, Kubernetes labels, etc.

I also renamed RawContainerInsightsMetric to CIMetricImpl for brevity.
movence
movence previously approved these changes Mar 27, 2024
@sam6134 sam6134 self-requested a review March 28, 2024 16:32
sam6134
sam6134 previously approved these changes Mar 28, 2024
@straussb straussb dismissed stale reviews from sam6134 and movence via f1384a8 March 28, 2024 18:16
@straussb straussb merged commit eed8454 into amazon-contributing:aws-cwa-dev Mar 28, 2024
51 of 67 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants