
enhancement: the regex consumes an excessive amount of memory. #16664

Closed
zwang28 opened this issue May 9, 2024 · 9 comments
Labels
component/func-expr Support a SQL function or operator type/enhancement Improvements to existing implementation.

Comments

@zwang28
Contributor

zwang28 commented May 9, 2024

Describe the bug

The cluster has 1000+ MVs, each containing a regex. The regex library consistently holds 70+ GB of memory.

manual.heap.collapsed.zip

Error message/log

No response

To Reproduce

No response

Expected behavior

No response

How did you deploy RisingWave?

No response

The version of RisingWave

v1.8

backtrack_limit increased from 1_000_000 to 1_000_000_000

Additional context

No response

@zwang28 zwang28 added the type/bug Something isn't working label May 9, 2024
@github-actions github-actions bot added this to the release-1.10 milestone May 9, 2024
@BugenZhao
Member

Each executor within the same operator creates its own expressions, which are usually identical yet each consume resources separately. Moreover, if the same expression node is present in different operators or streaming jobs, it could also be reused.

We may consider performing caching at different granularities:

  • cache the results of build_expr based on their protobuf input
  • cache the "context"s that are believed to be memory-consuming based on their own inputs, like RegexpContext in this case
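A minimal std-only sketch of the first option. `BuiltExpr`, `build_expr_uncached`, and `ExprCache` are illustrative names, not RisingWave's actual API; the cache key is the serialized protobuf of the expression node:

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

// Stand-in for a built expression; in practice this would hold the
// potentially memory-heavy compiled state, e.g. a RegexpContext.
struct BuiltExpr {
    desc: String,
}

// Hypothetical expensive build step, keyed by the serialized protobuf node.
fn build_expr_uncached(proto_bytes: &[u8]) -> BuiltExpr {
    BuiltExpr {
        desc: format!("expr built from {} proto bytes", proto_bytes.len()),
    }
}

// Cache built expressions by their protobuf input so identical expression
// nodes across executors share one instance instead of each rebuilding it.
struct ExprCache {
    inner: Mutex<HashMap<Vec<u8>, Arc<BuiltExpr>>>,
}

impl ExprCache {
    fn new() -> Self {
        ExprCache { inner: Mutex::new(HashMap::new()) }
    }

    fn build_expr(&self, proto_bytes: &[u8]) -> Arc<BuiltExpr> {
        let mut map = self.inner.lock().unwrap();
        map.entry(proto_bytes.to_vec())
            .or_insert_with(|| Arc::new(build_expr_uncached(proto_bytes)))
            .clone()
    }
}

fn main() {
    let cache = ExprCache::new();
    let a = cache.build_expr(b"regexp_match(col, 'a+b')");
    let b = cache.build_expr(b"regexp_match(col, 'a+b')");
    // Identical protobuf input yields the same shared instance.
    assert!(Arc::ptr_eq(&a, &b));
    println!("{}", a.desc);
}
```

A real implementation would also need an eviction policy (e.g. weak references or dropping entries when streaming jobs finish), otherwise the cache itself becomes a leak.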

@BugenZhao BugenZhao added type/enhancement Improvements to existing implementation. component/func-expr Support a SQL function or operator and removed type/bug Something isn't working labels May 10, 2024
@zwang28 zwang28 changed the title bug: the regex consumes an excessive amount of memory. enhancement: the regex consumes an excessive amount of memory. May 10, 2024
@zwang28
Contributor Author

zwang28 commented May 10, 2024

Please note that in this cluster I've raised backtrack_limit from 1_000_000 to 1_000_000_000 to work around Error::BacktrackLimitExceeded. So there's indeed a very large regex state which requires sizable runtime memory.

Because the memory pool never contracts in size, it may explain the large memory footprint.

update:
I've reverted the backtrack_limit to 1_000_000, but the memory usage remains high.

@zwang28
Contributor Author

zwang28 commented Jul 1, 2024

After limiting the regex's input length to 100_000 and reverting backtrack_limit to the default 1_000_000, the regex still takes 50 GB of memory:

2024-07-01-09-00-17.manual.heap.collapsed.zip

@fuyufjh
Member

fuyufjh commented Jul 1, 2024

Because the memory pool never contracts in size, it may explain the large memory footprint.

+1 for this conclusion. regex_automata::util::pool::inner::Pool<T,F>::get_slow takes ~47 GB memory in the memory flamegraph.

The comments here sound helpful, FYI: https://github.com/rust-lang/regex/blob/8856fe36ac7dc37989e6ffb26b5fc57189bae626/regex-automata/src/meta/regex.rs#L92

It seems the quickest solution is to follow this usage, that is, sharing the same Regex object across multiple threads/tasks:

/// * The same `Regex` is shared across multiple threads simultaneously,
/// usually via a [`util::lazy::Lazy`](crate::util::lazy::Lazy) or something
/// similar from the `once_cell` or `lazy_static` crates.

But since we are using fancy-regex, I'm not sure whether it's doable.
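A std-only sketch of that sharing pattern, using `std::sync::OnceLock` in place of `once_cell`/`lazy_static`, with a stand-in type for the compiled regex (real code would store a `fancy_regex::Regex` or `regex::Regex` here):

```rust
use std::sync::OnceLock;
use std::thread;

// Stand-in for an expensive-to-build compiled regex. Sharing one static
// instance means its lazily grown scratch pool is shared too, instead of
// one pool per cloned/rebuilt regex.
struct CompiledPattern {
    pattern: String,
}

fn shared_pattern() -> &'static CompiledPattern {
    static PATTERN: OnceLock<CompiledPattern> = OnceLock::new();
    PATTERN.get_or_init(|| CompiledPattern {
        pattern: String::from("a+(b|c)*"),
    })
}

fn main() {
    // All threads observe the exact same instance (same address).
    let handles: Vec<_> = (0..4)
        .map(|_| thread::spawn(|| shared_pattern() as *const CompiledPattern as usize))
        .collect();
    let addrs: Vec<usize> = handles.into_iter().map(|h| h.join().unwrap()).collect();
    assert!(addrs.windows(2).all(|w| w[0] == w[1]));
    println!("pattern shared across threads: {}", shared_pattern().pattern);
}
```

This matches the usage the regex-automata doc comment describes: one `Regex` behind a lazily initialized static, used concurrently from many threads.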

@BugenZhao
Member

Currently, a pool never contracts in size. Its size is proportional to the maximum number of simultaneous uses.

So it appears that it's the dynamic part used when matching that consumes too much memory, rather than the statically compiled part.

It seems the quickest solution to follow this usage, that is, sharing the same Regex object across multiple threads/tasks

Do you mean that we intentionally let all regexes (with the same pattern) reuse the dynamic part to reduce pool size, at the cost of potential synchronization? Sounds reasonable as long as we find it does not affect performance too much.
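One way to read that proposal, as an illustrative std-only sketch: a process-wide map from pattern to one shared compiled regex, so every expression with the same pattern points at a single instance (and hence a single scratch pool). The type names and the compile counter are made up for illustration; the synchronization cost here is the map lock on build, plus whatever contention the shared pool adds at match time:

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Mutex, OnceLock};

// Counts how many times we actually pay the compile cost.
static COMPILE_COUNT: AtomicUsize = AtomicUsize::new(0);

// Stand-in for a compiled regex whose "dynamic part" we want to share.
struct SharedRegex {
    pattern: String,
}

fn compile(pattern: &str) -> SharedRegex {
    COMPILE_COUNT.fetch_add(1, Ordering::SeqCst);
    SharedRegex { pattern: pattern.to_string() }
}

// Process-wide pattern -> regex cache; same pattern, same instance.
fn regex_for(pattern: &str) -> Arc<SharedRegex> {
    static CACHE: OnceLock<Mutex<HashMap<String, Arc<SharedRegex>>>> = OnceLock::new();
    let cache = CACHE.get_or_init(|| Mutex::new(HashMap::new()));
    let mut map = cache.lock().unwrap();
    map.entry(pattern.to_string())
        .or_insert_with(|| Arc::new(compile(pattern)))
        .clone()
}

fn main() {
    let r1 = regex_for("foo[0-9]+");
    let r2 = regex_for("foo[0-9]+");
    let r3 = regex_for("bar.*");
    assert!(Arc::ptr_eq(&r1, &r2)); // same pattern, shared instance
    assert!(!Arc::ptr_eq(&r1, &r3)); // different pattern, distinct instance
    assert_eq!(COMPILE_COUNT.load(Ordering::SeqCst), 2);
    println!("compiled {} distinct patterns", COMPILE_COUNT.load(Ordering::SeqCst));
}
```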

@zwang28
Contributor Author

zwang28 commented Jul 2, 2024

intentionally let all regexs (with the same pattern) reuse the dynamic part to reduce pool size

For this cluster, we may need all regexes of all patterns to reuse the dynamic part, because in this cluster a regex of the same pattern already occurs exactly once on each compute node, i.e. the parallelism of each regex equals the number of compute nodes.

@fuyufjh
Member

fuyufjh commented Jul 2, 2024

intentionally let all regexs (with the same pattern) reuse the dynamic part to reduce pool size

For this cluster, we may need all regexes of all patterns to reuse the dynamic part, because in this cluster a regex of the same pattern already occurs exactly once on each compute node, i.e. the parallelism of each regex equals the number of compute nodes.

Hmmm, in that way there will be much more lock contention. I tend to try "reusing all regexes (with the same pattern)" first. As mentioned before, we are using fancy-regex on top of regex_automata, and doing this requires refactoring the code to use regex_automata directly. Not sure how difficult it is.

@BugenZhao
Member

I've reverted the backtrack_limit to 1_000_000, but the memory usage remains high.

BTW, if there's backtracking in the regex, fancy-regex won't dispatch the implementation to regex_automata but uses its own implementation instead, which does not seem to involve a Cache. So I tend to believe it's a different issue.

@zwang28
Contributor Author

zwang28 commented Jul 2, 2024

The pool is sharded to alleviate contention; however, as a side effect, this increases memory usage, since shards are selected by thread_id % MAX_POOL_STACKS. The memory usage can be amplified by at most MAX_POOL_STACKS = 8 times.
See comments:
https://github.com/rust-lang/regex/blob/8856fe36ac7dc37989e6ffb26b5fc57189bae626/regex-automata/src/util/pool.rs#L287C1-L288C1

Specifically for this cluster, given that concurrent access to a regex's pool won't occur (one parallelism per regex on each compute node), setting the shard count MAX_POOL_STACKS to 1 should be adequate. I'll give it a try later when the environment is available.
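The amplification bound can be illustrated with a toy model (std-only; the `thread_id % MAX_POOL_STACKS` indexing mirrors the linked regex-automata comment, while the per-cache byte size is a made-up number):

```rust
use std::collections::HashSet;

const MAX_POOL_STACKS: usize = 8;

// Each shard that gets touched lazily allocates its own scratch cache, so
// total pool memory is (distinct shards touched) * (per-cache size),
// bounded by MAX_POOL_STACKS * per-cache size.
fn shards_touched(thread_ids: &[usize]) -> usize {
    thread_ids
        .iter()
        .map(|tid| tid % MAX_POOL_STACKS)
        .collect::<HashSet<_>>()
        .len()
}

fn main() {
    let per_cache_bytes = 64 * 1024; // illustrative size of one scratch cache

    // A single thread (parallelism 1, as in this cluster): one shard, 1x memory.
    assert_eq!(shards_touched(&[42]), 1);

    // Many threads: at most MAX_POOL_STACKS shards, i.e. at most 8x memory.
    let tids: Vec<usize> = (0..100).collect();
    let worst = shards_touched(&tids);
    assert_eq!(worst, MAX_POOL_STACKS);
    println!("worst-case pool memory: {} bytes", worst * per_cache_bytes);
}
```

With MAX_POOL_STACKS forced to 1, every thread maps to the same shard, so the amplification factor collapses to 1 at the cost of contention if the regex ever were used concurrently.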

@zwang28 zwang28 removed this from the release-2.0 milestone Aug 8, 2024
@zwang28 zwang28 closed this as not planned Won't fix, can't repro, duplicate, stale Oct 25, 2024
Projects
None yet
Development

No branches or pull requests

3 participants