Global memory statistician for small memory fragments #13780

wcy-fdu · 2023-12-04T06:43:44Z

RisingWave holds many small memory allocations while it is running, for example:

The actor channel buffers up to 2048 rows (the upper limit may later become size-based, at 1MB).
Each stateful executor's mem table holds memory until its size exceeds 4MB.

Although each of these footprints is small, they vary dynamically with the workload. If the system has many materialized views/actors, the total of these small allocations grows accordingly, which increases the risk of OOM (we have seen such issues during longevity tests).

The current strategy is global memory management: we use jemalloc to monitor the memory usage of the Compute Node and, once a certain threshold is reached, we start evicting the LRU cache. However, we do not currently track these small allocations because we assume they are small and quickly released. Since customers create a lot of MVs in a cluster, we should account for these small allocations and apply mitigation strategies once they add up.
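To make the current strategy concrete, here is a minimal sketch of a threshold check that decides how much to evict from the LRU cache. The names `MemoryPolicy`, `threshold_percent`, and `Action` are hypothetical and are not RisingWave's actual API; the allocated byte count is assumed to come from the allocator's statistics (jemalloc in our case). Note that memory held by small buffers is counted in the allocated total but cannot be reclaimed by cache eviction, which is the gap this issue is about.

```rust
/// Outcome of one memory check (hypothetical type, for illustration only).
#[derive(Debug, PartialEq)]
enum Action {
    /// Usage is at or below the threshold; nothing to do.
    None,
    /// Usage is above the threshold; try to evict this many bytes from the LRU cache.
    Evict { bytes: u64 },
}

/// Hypothetical global policy: evict once allocated memory exceeds
/// `threshold_percent` of the node's total memory.
struct MemoryPolicy {
    total_bytes: u64,
    threshold_percent: u64,
}

impl MemoryPolicy {
    /// `allocated_bytes` is assumed to be the value reported by the allocator.
    fn decide(&self, allocated_bytes: u64) -> Action {
        let threshold = self.total_bytes * self.threshold_percent / 100;
        if allocated_bytes > threshold {
            Action::Evict {
                bytes: allocated_bytes - threshold,
            }
        } else {
            Action::None
        }
    }
}
```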

Possible methods:

Introduce some kind of memory tracking to monitor these fragmented allocations.
Allocate memory more reasonably based on actual usage (e.g. reserve more memory, or give these buffers a fixed-size share of streaming memory).
Introduce some kind of control strategy for when these allocations grow too large.
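As a rough illustration of the tracking idea, the sketch below keeps one global atomic counter per category of small buffer and uses an RAII guard so that the bytes are released when the buffer is dropped. Everything here (`SmallMemoryTracker`, `Category`, `Reservation`, `over_limit`) is a hypothetical name made up for this example, not an existing RisingWave type, and a real design would still need to decide what to do once the limit is exceeded.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

/// Categories of small buffers to account for (hypothetical).
#[derive(Clone, Copy)]
enum Category {
    ExchangeChannel = 0,
    MemTable = 1,
}

/// Global tracker: one atomic byte counter per category.
#[derive(Default)]
struct SmallMemoryTracker {
    bytes: [AtomicUsize; 2],
}

impl SmallMemoryTracker {
    /// Record `size` bytes; the returned guard gives them back when dropped.
    fn reserve(self: &Arc<Self>, category: Category, size: usize) -> Reservation {
        self.bytes[category as usize].fetch_add(size, Ordering::Relaxed);
        Reservation {
            tracker: Arc::clone(self),
            category,
            size,
        }
    }

    /// Sum of all tracked small allocations, in bytes.
    fn total(&self) -> usize {
        self.bytes.iter().map(|b| b.load(Ordering::Relaxed)).sum()
    }

    /// A control strategy could start throttling once this returns true.
    fn over_limit(&self, limit: usize) -> bool {
        self.total() > limit
    }
}

/// RAII guard tied to the lifetime of one tracked buffer.
struct Reservation {
    tracker: Arc<SmallMemoryTracker>,
    category: Category,
    size: usize,
}

impl Drop for Reservation {
    fn drop(&mut self) {
        self.tracker.bytes[self.category as usize].fetch_sub(self.size, Ordering::Relaxed);
    }
}
```

Relaxed ordering is used because the counters are only statistics; they do not synchronize access to the buffers themselves.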

hzxa21 · 2023-12-04T08:00:13Z

RisingWave holds many small memory allocations while it is running, for example:

The actor channel buffers up to 2048 rows (the upper limit may later become size-based, at 1MB).

Each stateful executor's mem table holds memory until its size exceeds 4MB.

Two more examples worth mentioning:

Source readers may buffer source messages. For example, each librdkafka consumer instance can buffer 64MB of messages by default.
Sinks may buffer stream chunks (without spilling) per epoch in order to do compaction if the sink has a different PK than upstream.
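To get a feel for how these add up, here is a back-of-the-envelope estimate using the per-instance sizes mentioned in this thread (1MB per actor channel if the size-based limit is adopted, 4MB per mem table, 64MB per librdkafka consumer). The `Workload` struct and the instance counts are made up for illustration, and sink buffers are left out because they have no fixed bound per epoch.

```rust
const MIB: u64 = 1024 * 1024;

/// Hypothetical description of a deployment; the counts are illustrative only.
struct Workload {
    actor_channels: u64,
    stateful_executors: u64,
    kafka_consumers: u64,
}

/// Rough upper bound on memory held by small buffers, in bytes.
fn worst_case_small_memory_bytes(w: &Workload) -> u64 {
    let channels = w.actor_channels * MIB; // 1MB per channel (possible future limit)
    let mem_tables = w.stateful_executors * 4 * MIB; // 4MB mem table threshold
    let consumers = w.kafka_consumers * 64 * MIB; // 64MB librdkafka default
    channels + mem_tables + consumers
}
```

For example, 1000 actor channels, 500 stateful executors, and 10 Kafka consumers give 1000 + 2000 + 640 = 3640 MiB in the worst case, none of which is reclaimed by evicting the LRU cache.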

fuyufjh · 2024-01-09T09:45:25Z

Any specific tasks?

github-actions · 2024-06-12T08:59:14Z

This issue has been open for 60 days with no activity. Could you please update the status? Feel free to continue discussion or close as not planned.

wcy-fdu added the type/feature label Dec 4, 2023

github-actions bot added this to the release-1.6 milestone Dec 4, 2023

wcy-fdu mentioned this issue Dec 4, 2023 in feat(streaming): memory-size-based back-pressure in exchange #13775 (Merged, 9 tasks)

fuyufjh removed this from the release-1.6 milestone Jan 9, 2024

fuyufjh added this to the release-1.7 milestone Jan 9, 2024

fuyufjh assigned wcy-fdu Jan 9, 2024

wcy-fdu modified the milestones: release-1.7, release-1.8 Mar 6, 2024

wcy-fdu removed this from the release-1.8 milestone Apr 8, 2024

github-actions bot added the no-issue-activity label Jun 12, 2024
