Proposal: DaemonSet-like mode for Grafana Agent Operator #1495
Comments
adding 👀 for @grafana/solutions-engineering
I am currently deploying a fleet of Grafana Agents on company Kubernetes clusters, using both standalone/manual deployments (e.g. the K8s integration provided by Grafana Cloud) and resources deployed by the agent operator. Deploying and configuring all the agents for all three observability pillars so that they stay reliable under load and at scale is anything but straightforward. There are many things one has to know, even with the operator. I know of some, e.g. sharding of metrics agents and load-balancing of traces agents. There are many more where I don't know what I don't know yet. I am wholly behind the idea presented here. It does not matter whether the implementation is DaemonSet-like or anything else. I don't think you have to restrict yourself to the notion of 'agent per node'. You can't know how big each node is and whether you can vertically scale one agent instance to handle all of the load. That said, the operator could integrate with cAdvisor and kube-state-metrics or similar and use data from them to scale agents accordingly.
That should be true but I don't feel it is. I have to understand what is happening under the hood to be able to scale. Ideally I wouldn't want to deal with |
This proposal would be superseded by #1565.
An application I'd like this for is being able to scrape the local Potentially solving this issue would let you deprecate/remove the kubelet service thing. Note that you should be able to specify for a given |
Grafana Agent Operator currently requires deploying multiple sets of agents:
The specific resources deployed by the operator are ideally an implementation detail for users, but it's still not ideal that we need to do this. One side effect of the current implementation is that the requests/limits you assign to the GrafanaAgent resource are applied to every deployment of the agent. This is redundant (not all pods need the same requests/limits) and duplicative (the total resource requests are your requests * the number of pods the operator determines it needs to deploy).
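To make the duplication concrete, here is a minimal sketch of the arithmetic. The `Requests` type, the per-pod values, and the pod count of 7 are all hypothetical, chosen only for illustration; the operator does not expose anything by these names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Requests:
    """Per-pod resource requests (hypothetical units and field names)."""
    cpu_millicores: int
    memory_mib: int


def total_requests(per_pod: Requests, pod_count: int) -> Requests:
    """Every pod gets the same requests, so the cluster-wide total
    is simply the per-pod requests multiplied by the pod count."""
    return Requests(
        cpu_millicores=per_pod.cpu_millicores * pod_count,
        memory_mib=per_pod.memory_mib * pod_count,
    )


# Illustrative numbers only: 500m CPU / 1 GiB per pod, 7 pods in total.
per_pod = Requests(cpu_millicores=500, memory_mib=1024)
total = total_requests(per_pod, pod_count=7)
print(total)
```

Because the pod count is decided by the operator rather than the user, the total reserved capacity is not something you can read directly off the GrafanaAgent resource.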
I propose that Grafana Agent Operator supports a "DaemonSet-like" mode, where it manages one pod per node handling all telemetry, including integrations. We should use a "DaemonSet-like" controller to allow PVCs to be created per pod, as real DaemonSets don't support this.
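As a rough illustration of what a "DaemonSet-like" controller would have to do, the sketch below computes a reconcile plan of one pod and one PVC per node. It is plain Python with no Kubernetes client; the naming scheme (`grafana-agent-<node>`, `grafana-agent-data-<node>`), the `Plan` type, and the choice to keep PVCs when a node goes away are assumptions for the example, not part of the proposal.

```python
from dataclasses import dataclass, field


@dataclass
class Plan:
    """Actions a hypothetical controller would take in one reconcile pass."""
    create_pods: list[str] = field(default_factory=list)
    create_pvcs: list[str] = field(default_factory=list)
    delete_pods: list[str] = field(default_factory=list)


def pod_name(node: str) -> str:
    # Hypothetical naming scheme: one agent pod per node.
    return f"grafana-agent-{node}"


def pvc_name(node: str) -> str:
    # Hypothetical naming scheme: one PVC per pod, keyed by node.
    return f"grafana-agent-data-{node}"


def reconcile(nodes: set[str], pods_by_node: dict[str, str], pvcs: set[str]) -> Plan:
    """Compare desired state (one pod + PVC per node) with observed state."""
    plan = Plan()
    for node in sorted(nodes):
        if pvc_name(node) not in pvcs:
            plan.create_pvcs.append(pvc_name(node))
        if node not in pods_by_node:
            plan.create_pods.append(pod_name(node))
    # Pods whose node no longer exists are removed; their PVCs are left
    # alone here (an assumption, retention policy is an open question).
    for node, pod in sorted(pods_by_node.items()):
        if node not in nodes:
            plan.delete_pods.append(pod)
    return plan


plan = reconcile(
    nodes={"node-a", "node-b"},
    pods_by_node={"node-a": "grafana-agent-node-a", "node-c": "grafana-agent-node-c"},
    pvcs={"grafana-agent-data-node-a"},
)
print(plan)
```

The per-node PVC step is the part a real DaemonSet cannot express, which is why the proposal calls for a custom controller rather than a DaemonSet.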
As-is, this proposal isn't ready for work, and has at least a few dependencies:
Despite it not being ready, I'm opening this as a proposal now because: