Per actor metrics: should be cleaned when the actor is dropped or moved. #9492
Comments
Quite annoying when investigating problems... 🥲 Hope this gets fixed.
And currently
Sounds good to me. Similar to the solution that I imagined before (described in the PR's description), i.e. using something to hold the lifetime of these actor-level metrics.
One thing that makes streaming metrics harder to clean is that the labels of the streaming metrics are not the same. 🤣 Let me try to register/collect them with as few modifications as possible.
@MrCroxx Any further updates?
I was working on the new file cache engine before. Let me get back to this PR in the next few days.
@MrCroxx any updates? 👀
Problem
Leaked actor memory not only consumes extra memory but also affects the metrics. As the screenshot shows, Actor 24 has already been dropped, but its metrics still exist.
Cause
This is caused by the design of `MetricVec` (actually a hashmap of `labels -> single metrics`) in the Prometheus client library. For example, take a metric that is actually backed by `MetricVec`. When it is called with `with_label_values`, a new key (label) is created in that hashmap, but the key is never removed.
Solution
Similar to `LockGuard`, one solution I can think of is to wrap the `MetricVec` variables, e.g. `agg_cached_keys`, within a handler object that removes its own key (label) from `MetricVec`'s hashmap when it is dropped.
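A minimal sketch of the guard idea, using only the standard library. Everything here is an assumption for illustration: `ToyCounterVec`, `LabelGuardedCounter`, and `with_guarded_label_values` are hypothetical names, and the shared map only models the hashmap inside `MetricVec`. With the real `prometheus` crate, the `Drop` implementation would presumably call the vector's `remove_label_values` instead.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::{Arc, Mutex};

type Children = Mutex<HashMap<Vec<String>, Arc<AtomicU64>>>;

/// Hypothetical stand-in for a metric vector (labels -> single counter).
#[derive(Clone, Default)]
pub struct ToyCounterVec {
    children: Arc<Children>,
}

impl ToyCounterVec {
    /// Like `with_label_values`, but the returned handle owns the label
    /// entry and removes it from the map when dropped.
    pub fn with_guarded_label_values(&self, labels: &[&str]) -> LabelGuardedCounter {
        let key: Vec<String> = labels.iter().map(|s| s.to_string()).collect();
        let counter = self
            .children
            .lock()
            .unwrap()
            .entry(key.clone())
            .or_default()
            .clone();
        LabelGuardedCounter { counter, key, owner: Arc::clone(&self.children) }
    }

    /// Number of label sets that would still be exported.
    pub fn label_count(&self) -> usize {
        self.children.lock().unwrap().len()
    }
}

/// Handle that ties the lifetime of one label set to its holder.
pub struct LabelGuardedCounter {
    counter: Arc<AtomicU64>,
    key: Vec<String>,
    owner: Arc<Children>,
}

impl LabelGuardedCounter {
    pub fn inc(&self) {
        self.counter.fetch_add(1, Ordering::Relaxed);
    }
}

impl Drop for LabelGuardedCounter {
    fn drop(&mut self) {
        // Skip cleanup on a poisoned lock instead of panicking inside `drop`.
        if let Ok(mut children) = self.owner.lock() {
            children.remove(&self.key);
        }
    }
}
```

An actor would hold a `LabelGuardedCounter` as a field, so its label set disappears when the actor is dropped. This sketch assumes a single holder per label set. If several holders could share the same labels, the first drop would remove the entry for all of them, so a real implementation would likely need reference counting per label set.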