too aggressive and early cache eviction #15305
The metrics mentioned in #14797 (comment) are somewhat expected under our current implementation of the LRU memory manager. The root cause is that the feedback loop between the LRU watermark and the actual memory usage is much slower than 1 second (the default run interval of the memory policy). The memory manager assumes that its feedback from the previous run is already reflected in the current memory usage, but it actually is not. To get rid of this assumption, we have discussed several ideas. For example, we may let each streaming executor report its LRU cache memory usage by epoch, so that the memory manager can determine the best LRU watermark accordingly.
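To make the stale-feedback problem more concrete, here is a minimal, purely illustrative sketch (the types `NaiveWatermarkPolicy` and `MemoryStats` below are made up for this example and are not RisingWave's actual API). Because the reported usage lags behind the evictions triggered by earlier ticks, the policy keeps advancing the watermark on every tick and ends up evicting far more than a single overshoot warrants:

```rust
/// Hypothetical snapshot of the LRU caches' memory usage, for illustration only.
struct MemoryStats {
    lru_cache_bytes: usize,
}

/// A caricature of a tick-based watermark policy that assumes its previous
/// adjustment is already visible in the reported usage.
struct NaiveWatermarkPolicy {
    threshold_bytes: usize,
    watermark_steps: u64, // how many times the watermark has been advanced
}

impl NaiveWatermarkPolicy {
    /// Runs once per interval (1s by default in the discussion above).
    /// Eviction is carried out lazily by the executors, so `stats` still
    /// reflects memory that earlier watermark bumps have not yet freed.
    fn tick(&mut self, stats: &MemoryStats) {
        if stats.lru_cache_bytes > self.threshold_bytes {
            // Stale-feedback assumption: treats the overshoot as new pressure
            // and advances the watermark again.
            self.watermark_steps += 1;
        }
    }
}

fn main() {
    let mut policy = NaiveWatermarkPolicy {
        threshold_bytes: 1 << 30, // 1 GiB, arbitrary for the example
        watermark_steps: 0,
    };

    // Simulate 5 ticks during which the reported usage stays flat because the
    // executors have not yet reacted to the earlier watermark bumps.
    let stale_stats = MemoryStats {
        lru_cache_bytes: 2 << 30, // 2 GiB
    };
    for _ in 0..5 {
        policy.tick(&stale_stats);
    }

    // The watermark advanced 5 times for what is effectively one overshoot.
    println!(
        "watermark advanced {} times on effectively one overshoot",
        policy.watermark_steps
    );
}
```

With the per-epoch reporting idea mentioned above, the policy would presumably judge each adjustment against usage that already reflects it (an epoch-aligned report from each executor) rather than against a stale snapshot, avoiding this kind of compounding over-eviction.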
TPC-H q4: #14811 (comment). Edit: but q20 does get affected by this, as shown at #14797 (comment)
TPC-H q17: #14799 (comment) also has a very similar observation
Reposting the observations and experiments done by @MrCroxx for better visibility. There are three observations:
- The evicted bytes by epoch (summed over all LRUs, 1s barrier): https://1drv.ms/x/s!AiJJmrmsw_N2mxGoFZbGAYSRGstJ?e=B76XvA

First posted here, TPC-H q20: #14797 (comment)
Will collect a few more examples as they are encountered.