
feat: support sequential exchange #14795

Merged: 2 commits into main from dylan/use_sequential_exchange_for_limit, Jan 26, 2024

Conversation

chenzl25 (Contributor)

I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.

What's changed and what's your intention?

Checklist

  • I have written necessary rustdoc comments
  • I have added necessary unit tests and integration tests
  • I have added test labels as necessary. See details.
  • I have added fuzzing tests or opened an issue to track them. (Optional, recommended for new SQL features; see Sqlsmith: Sql feature generation #7934.)
  • My PR contains breaking changes. (If it deprecates some features, please create a tracking issue to remove them in the future).
  • All checks passed in ./risedev check (or alias, ./risedev c)
  • My PR changes performance-critical code. (Please run macro/micro-benchmarks and show the results.)
  • My PR contains critical fixes that are necessary to be merged into the latest release. (Please check out the details)

Documentation

  • My PR needs documentation updates. (Please use the Release note section below to summarize the impact on users)

Release note

If this PR includes changes that directly affect users or other significant modifications relevant to the community, kindly draft a release note to provide a concise summary of these changes. Please prioritize highlighting the impact these changes will have on users.

@liurenjie1024 (Contributor)

It changes the memory usage from task_count * limit to limit, but it happens on the frontend node. Is the frontend node OOMing?
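
(A concrete illustration with assumed numbers: with 64 upstream tasks and LIMIT 1000, firing and merging all task outputs can buffer up to 64 × 1000 = 64,000 rows, whereas a sequential exchange buffers at most 1,000.)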

Comment on lines 199 to 203
    impl Drop for TaskOutput {
        fn drop(&mut self) {
            // Debug instrumentation: log when a task output is dropped.
            println!("drop TaskOutput {:?}", self.output_id);
        }
    }
chenzl25 (Contributor, Author)

Previously, for small limit queries, this struct could not be dropped in a timely manner.

chenzl25 (Contributor, Author)

Under a constant workload of limit queries, it could cause an OOM.

Contributor

Interesting. Maybe we can use this test to check why the task leak happens. Let me take a look later.

@chenzl25 (Contributor, Author)

> It changes the memory usage from task_count * limit to limit, but it happens on the frontend node. Is the frontend node OOMing?

The CN OOMed. I think it is related to the early termination of the limit execution.

@liurenjie1024 (Contributor)

> > It changes the memory usage from task_count * limit to limit, but it happens on the frontend node. Is the frontend node OOMing?
>
> The CN OOMed. I think it is related to the early termination of the limit execution.

Let's take a look at the sysbench results.

@fuyufjh (Member) commented Jan 25, 2024

Are we sure this is the cause? Please at least run a sysbench test to verify it before merging.

@liurenjie1024 (Contributor)

> Are we sure this is the cause? Please at least run a sysbench test to verify it before merging.

+1

@BugenZhao (Member) commented Jan 26, 2024

We'd better identify the root cause of the regression before applying any other optimizations, to avoid making things more complicated. 🥺

@chenzl25 (Contributor, Author) commented Jan 26, 2024

Previous CN memory:
QPS of limit select: 500

[screenshots: CN memory usage]

Heap dump of the OOM:
#14634 (comment)

@chenzl25 (Contributor, Author) commented Jan 26, 2024

After this PR:
QPS of limit select: 1500
CN Memory:
[screenshots: CN memory usage]

@chenzl25 (Contributor, Author)

It has achieved a 3x throughput improvement (500 → 1500 QPS) and much more stable memory consumption under this workload.

@chenzl25 (Contributor, Author)

The root cause: small limit queries would fire more tasks than needed in the exchange operators, and when the limit query ended, those fired tasks could not be dropped in a timely manner, because the drop relied on the channel being closed and happens asynchronously. Under a constant workload of limit queries, this ends up with too many tasks pending drop, which finally causes an OOM.
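
A minimal sketch of the idea in plain futures code (all names here are illustrative, not RisingWave's actual types): a concurrent exchange fires every upstream task up front and merges their outputs, while a sequential exchange creates the next task's stream only after the previous one is exhausted, so an early-terminating limit query leaves at most one in-flight task to drop.

```rust
use futures::stream::{self, Stream, StreamExt};
use std::pin::Pin;

/// Illustrative stand-in for one upstream task's output stream.
type TaskStream = Pin<Box<dyn Stream<Item = Vec<u32>> + Send>>;

/// Concurrent exchange: fire ALL upstream tasks up front and merge them.
/// If the consumer stops early (a small limit), every fired task still has
/// to be dropped, and the drop only takes effect asynchronously once its
/// channel closes -- the accumulation described above.
fn concurrent_exchange(
    task_count: usize,
    make_task: impl Fn(usize) -> TaskStream,
) -> TaskStream {
    let streams: Vec<_> = (0..task_count).map(make_task).collect();
    Box::pin(stream::select_all(streams))
}

/// Sequential exchange: fire one upstream task at a time; the next task is
/// only created once the previous one is exhausted, bounding both buffered
/// rows and the number of tasks left to clean up on an early stop.
fn sequential_exchange(
    task_count: usize,
    make_task: impl Fn(usize) -> TaskStream + Send + 'static,
) -> TaskStream {
    Box::pin(stream::iter(0..task_count).flat_map(make_task))
}
```

The trade-off is parallelism: reading sources sequentially only makes sense when the query needs few rows, as with a small LIMIT.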

chenzl25 requested a review from wenym1, January 26, 2024 07:25
@chenzl25 (Contributor, Author)

This issue reminds me of the backfill prefetch streaming read OOM issue. Both of them share the same behavior.

    while let Some(data_chunk) = stream.next().await {
        let data_chunk = data_chunk?;
        yield data_chunk;
    }
    let streams = self
Contributor

How about moving this part into the self.sequential block? It's not easy to understand the difference in the collection logic otherwise.
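
For illustration, the suggested shape could look like the following sketch. Everything except self.sequential is assumed here: the field and method names (sources, take_data_stream) are hypothetical, and the body is written with futures_async_stream's #[try_stream]/#[for_await], which RisingWave's batch executors use.

```rust
#[try_stream(boxed, ok = DataChunk, error = BatchError)]
async fn do_execute(self: Box<Self>) {
    if self.sequential {
        // Sequential: create and drain one upstream stream at a time.
        for source in self.sources {
            #[for_await]
            for data_chunk in source.take_data_stream() {
                yield data_chunk?;
            }
        }
    } else {
        // Concurrent: collect all upstream streams up front, then merge.
        let streams: Vec<_> = self
            .sources
            .into_iter()
            .map(|s| s.take_data_stream())
            .collect();
        #[for_await]
        for data_chunk in futures::stream::select_all(streams) {
            yield data_chunk?;
        }
    }
}
```

Keeping the two collection strategies in their own branches makes the difference explicit, at the cost of a little duplication.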

@liurenjie1024 (Contributor) left a comment

LGTM, thanks!


chenzl25 added this pull request to the merge queue Jan 26, 2024
Merged via the queue into main with commit 5576c2c Jan 26, 2024
34 of 35 checks passed
chenzl25 deleted the dylan/use_sequential_exchange_for_limit branch January 26, 2024 08:58
@BugenZhao (Member) left a comment

Can we document somewhere in the code what sequential means? It doesn't seem very intuitive.
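
For instance, a doc comment along these lines could be attached to the flag (the wording is a suggestion, not taken from the PR):

```rust
/// Read upstream exchange sources sequentially instead of concurrently.
///
/// When `true`, the exchange executor fires one upstream task at a time
/// and only creates the next task's stream after the previous one is
/// exhausted. This bounds buffered data to roughly `limit` rows rather
/// than `task_count * limit`, and a query that terminates early (e.g. a
/// small `LIMIT`) leaves at most one in-flight task to drop. The trade-off
/// is reduced parallelism, so it only suits queries that need few rows.
pub sequential: bool,
```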

Development

Successfully merging this pull request may close these issues.

nightly-20240117 compute node OOM during sysbench select-random-limits