Global memory statistician for small memory fragments #13780

Open
wcy-fdu opened this issue Dec 4, 2023 · 3 comments

Comments

@wcy-fdu
Contributor

wcy-fdu commented Dec 4, 2023

RisingWave holds a number of small memory buffers while it is running, for example:

  • each actor channel buffers up to 2048 rows (the upper limit may later become size-based, e.g. 1MB)
  • each stateful executor's mem table occupies memory until its size exceeds 4MB

Although each of these footprints is small, they vary dynamically with the workload. If the system has many materialized views/actors, the total of these small allocations grows accordingly, increasing the risk of OOM (we have witnessed such issues during longevity tests).
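To make the scaling concrete, here is a rough back-of-the-envelope sketch. The two limits come from the list above; the function name `estimate_small_memory_bytes`, the average row size, and the channel/executor counts are illustrative assumptions, not measured values or existing RisingWave code.

```rust
/// Rows buffered per actor channel (limit quoted above).
const CHANNEL_ROW_LIMIT: u64 = 2048;
/// Mem table size limit per stateful executor (4MB, quoted above).
const MEM_TABLE_LIMIT_BYTES: u64 = 4 * 1024 * 1024;

/// Hypothetical worst-case estimate: every channel is full and every
/// mem table sits at its limit. `avg_row_bytes` is an assumed input.
pub fn estimate_small_memory_bytes(
    channels: u64,
    avg_row_bytes: u64,
    stateful_executors: u64,
) -> u64 {
    let channel_bytes = channels * CHANNEL_ROW_LIMIT * avg_row_bytes;
    let mem_table_bytes = stateful_executors * MEM_TABLE_LIMIT_BYTES;
    channel_bytes + mem_table_bytes
}
```

For example, with an assumed 1000 channels, 100-byte rows, and 500 stateful executors, `estimate_small_memory_bytes(1000, 100, 500)` comes to roughly 2.3GB, which is far from negligible even though each individual buffer is small.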

The current strategy is global memory management: we use jemalloc to monitor the memory usage of the Compute Node and, once a certain threshold is reached, start evicting the LRU cache. However, we currently do not track these small allocations because we considered them small and quickly released. Since customers create a lot of MVs in a cluster, we should account for these small allocations and apply mitigation strategies once they add up.

Possible methods:

  • introduce some kind of memory tracking to monitor these fragmented allocations.
  • allocate memory more reasonably according to actual usage (e.g. reserve more memory, or give these buffers a fixed share of streaming memory)
  • introduce some kind of control strategy for when these allocations grow too large.
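As a minimal sketch of the first and third ideas, a shared atomic counter with RAII guards could track these buffers globally. Everything here (`SmallMemoryTracker`, `Reservation`, `over_limit`, the soft limit) is a hypothetical illustration, not an existing RisingWave API, and it assumes each buffer owner can report its own size.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

/// Hypothetical global counter for small buffers (channels, mem tables, ...).
pub struct SmallMemoryTracker {
    used_bytes: AtomicUsize,
    soft_limit_bytes: usize,
}

impl SmallMemoryTracker {
    pub fn new(soft_limit_bytes: usize) -> Arc<Self> {
        Arc::new(Self {
            used_bytes: AtomicUsize::new(0),
            soft_limit_bytes,
        })
    }

    /// Records `bytes` as in use; the returned guard subtracts them on drop.
    pub fn reserve(self: &Arc<Self>, bytes: usize) -> Reservation {
        self.used_bytes.fetch_add(bytes, Ordering::Relaxed);
        Reservation {
            tracker: Arc::clone(self),
            bytes,
        }
    }

    pub fn used_bytes(&self) -> usize {
        self.used_bytes.load(Ordering::Relaxed)
    }

    /// True once the aggregate exceeds the soft limit, i.e. the point
    /// where a control strategy (back-pressure, early flush) could kick in.
    pub fn over_limit(&self) -> bool {
        self.used_bytes() > self.soft_limit_bytes
    }
}

/// RAII guard tied to one buffer's lifetime.
pub struct Reservation {
    tracker: Arc<SmallMemoryTracker>,
    bytes: usize,
}

impl Drop for Reservation {
    fn drop(&mut self) {
        self.tracker.used_bytes.fetch_sub(self.bytes, Ordering::Relaxed);
    }
}
```

A buffer owner would call `tracker.reserve(size)` when it allocates and keep the guard alongside the buffer, so the count drops automatically when the buffer is released. Relaxed ordering is used because the counter is only a statistic here; what to do when `over_limit()` returns true is left open.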
@github-actions github-actions bot added this to the release-1.6 milestone Dec 4, 2023
@hzxa21
Collaborator

hzxa21 commented Dec 4, 2023

RisingWave holds a number of small memory buffers while it is running, for example:

  • each actor channel buffers up to 2048 rows (the upper limit may later become size-based, e.g. 1MB)
  • each stateful executor's mem table occupies memory until its size exceeds 4MB

Two more examples worth mentioning:

  • the source reader may buffer source messages. For example, each librdkafka consumer instance can buffer 64MB of messages by default
  • the sink may buffer stream chunks (without spilling) per epoch in order to do compaction if the sink has a different PK than its upstream

@fuyufjh
Member

fuyufjh commented Jan 9, 2024

Any specific tasks?

@fuyufjh fuyufjh added this to the release-1.7 milestone Jan 9, 2024
@wcy-fdu wcy-fdu modified the milestones: release-1.7, release-1.8 Mar 6, 2024
@wcy-fdu wcy-fdu removed this from the release-1.8 milestone Apr 8, 2024

This issue has been open for 60 days with no activity. Could you please update the status? Feel free to continue discussion or close as not planned.

3 participants