Beyla cache: prevent OOMing during start in big clusters #1354

Open
mariomac opened this issue Nov 13, 2024 · 1 comment
Labels
documentation, k8s-cache, roadmap

Comments

@mariomac
Contributor

mariomac commented Nov 13, 2024

In big clusters (800 nodes), when the cache pod starts it receives a huge amount of K8s metadata, which is enqueued in memory faster than the Beyla cache service is able to forward it to the Beyla client instances.

As a result, a beyla-k8s-cache pod accumulates gigabytes of memory on startup until it is OOM-killed; the connected Beyla clients might then move to other instances, and once the pod is restarted it is idle enough to process the information faster than it accumulates in main memory.

We need to find a way to decouple the informers' receive-transform-store thread from the client message submission.
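A minimal sketch of one way to do that decoupling, assuming a bounded channel between the informer pipeline and the client fan-out so that a slow client cannot make events pile up without limit (all names here are illustrative, not Beyla's actual code):

```go
package main

import (
	"fmt"
	"time"
)

// metadataEvent stands in for one transformed piece of K8s metadata.
type metadataEvent struct {
	kind, name string
}

func main() {
	// Bounded buffer: when it fills up, the informer goroutine blocks
	// (back-pressure) instead of letting events grow unbounded in memory.
	queue := make(chan metadataEvent, 1024)

	// Producer: the informer receive-transform-store loop.
	go func() {
		defer close(queue)
		for i := 0; i < 5000; i++ {
			queue <- metadataEvent{kind: "Pod", name: fmt.Sprintf("pod-%d", i)}
		}
	}()

	// Consumer: forwards events to the connected Beyla clients at their own pace.
	for ev := range queue {
		time.Sleep(100 * time.Microsecond) // simulate slow client submission
		_ = ev                             // forward to clients here
	}
	fmt.Println("all queued metadata forwarded")
}
```

The trade-off of blocking back-pressure is that informer processing slows down under load instead of the pod growing until it is OOM-killed.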

(Two screenshots attached.)

@mariomac
Contributor Author

This has been fixed by properly setting GOMEMLIMIT. We just need to document this.
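For reference, GOMEMLIMIT (Go 1.19+) sets a soft limit on the Go runtime's total memory, so the GC works harder before the container reaches its hard limit and gets OOM-killed. Below is a minimal sketch of the same mechanism applied programmatically, assuming an illustrative 1 GiB container limit; the actual fix here is the GOMEMLIMIT environment variable on the beyla-k8s-cache container, not this code.

```go
package main

import (
	"fmt"
	"runtime/debug"
)

func main() {
	// Assume the container memory limit is 1 GiB (illustrative value).
	const containerLimitBytes = 1 << 30

	// Keep the Go soft memory limit at ~90% of the container limit so the
	// GC becomes more aggressive before the kubelet OOM-kills the pod.
	prev := debug.SetMemoryLimit(containerLimitBytes * 9 / 10)
	fmt.Printf("GC soft memory limit set (previous: %d bytes)\n", prev)
}
```

In a Kubernetes deployment this would typically be done by adding a GOMEMLIMIT environment variable to the beyla-k8s-cache container, set at or somewhat below its memory limit (the Downward API's resourceFieldRef on limits.memory is one way to derive the value).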

@marctc added the documentation label on Nov 26, 2024