
feat: support sequential exchange #14795

Merged: 2 commits into main from dylan/use_sequential_exchange_for_limit, Jan 26, 2024

Conversation

chenzl25 (Contributor)

I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.

What's changed and what's your intention?

Checklist

  • I have written necessary rustdoc comments
  • I have added necessary unit tests and integration tests
  • I have added test labels as necessary. See details.
  • I have added fuzzing tests or opened an issue to track them. (Optional, recommended for new SQL features; see Sqlsmith: Sql feature generation #7934.)
  • My PR contains breaking changes. (If it deprecates some features, please create a tracking issue to remove them in the future).
  • All checks passed in ./risedev check (or alias, ./risedev c)
  • My PR changes performance-critical code. (Please run macro/micro-benchmarks and show the results.)
  • My PR contains critical fixes that are necessary to be merged into the latest release. (Please check out the details)

Documentation

  • My PR needs documentation updates. (Please use the Release note section below to summarize the impact on users)

Release note

If this PR includes changes that directly affect users or other significant modifications relevant to the community, kindly draft a release note to provide a concise summary of these changes. Please prioritize highlighting the impact these changes will have on users.

@liurenjie1024 (Contributor)

It changes the memory usage from task_count * limit to limit, but it happens on the frontend node. Is the frontend node OOMing?
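
(A concrete illustration with assumed numbers: with 64 upstream tasks and LIMIT 1000, firing and merging all task outputs can buffer up to 64 × 1000 = 64,000 rows, whereas a sequential exchange buffers at most 1,000.)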

Comment on lines 199 to 203
    impl Drop for TaskOutput {
        fn drop(&mut self) {
            // Debug instrumentation: log when a task output is dropped.
            println!("drop TaskOutput {:?}", self.output_id);
        }
    }
chenzl25 (Contributor, Author)

Previously, for small limit queries, this struct could not be dropped in a timely manner.

chenzl25 (Contributor, Author)

Under a constant workload of limit queries, it could cause an OOM.

Contributor

Interesting. Maybe we can use this test to check why the task leak happens. Let me take a look later.

@chenzl25 (Contributor, Author)

> It changes the memory usage from task_count * limit to limit, but it happens on the frontend node. Is the frontend node OOMing?

The CN OOMed. I think it is related to the early termination of the limit execution.

@liurenjie1024 (Contributor)

> > It changes the memory usage from task_count * limit to limit, but it happens on the frontend node. Is the frontend node OOMing?
>
> The CN OOMed. I think it is related to the early termination of the limit execution.

Let's take a look at the sysbench results.

@fuyufjh (Member) commented Jan 25, 2024

Are we sure this is the cause? Please at least run a sysbench test to verify it before merging.

@liurenjie1024 (Contributor)

> Are we sure this is the cause? Please at least run a sysbench test to verify it before merging.

+1

@BugenZhao (Member) commented Jan 26, 2024

We'd better identify the root cause of the regression before applying any other optimizations, to avoid making things more complicated. 🥺

@chenzl25 (Contributor, Author) commented Jan 26, 2024

Previous CN memory:
QPS of limit select: 500

[screenshots: CN memory usage]

Heap dump of the OOM:
#14634 (comment)

@chenzl25 (Contributor, Author) commented Jan 26, 2024

After this PR:
QPS of limit select: 1500
CN Memory:
[screenshots: CN memory usage]

@chenzl25 (Contributor, Author)

It has achieved a 3x throughput improvement (500 → 1500 QPS) and much more stable memory consumption under this workload.

@chenzl25 (Contributor, Author)

The root cause: small limit queries would fire more tasks than needed in the exchange operators, and when the limit query ended, those fired tasks could not be dropped in a timely manner, because the drop relied on the channel being closed and happens asynchronously. Under a constant workload of limit queries, this ends up with too many tasks pending drop, which finally causes an OOM.
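
A minimal sketch of the idea in plain futures code (all names here are illustrative, not RisingWave's actual types): a concurrent exchange fires every upstream task up front and merges their outputs, while a sequential exchange creates the next task's stream only after the previous one is exhausted, so an early-terminating limit query leaves at most one in-flight task to drop.

```rust
use futures::stream::{self, Stream, StreamExt};
use std::pin::Pin;

/// Illustrative stand-in for one upstream task's output stream.
type TaskStream = Pin<Box<dyn Stream<Item = Vec<u32>> + Send>>;

/// Concurrent exchange: fire ALL upstream tasks up front and merge them.
/// If the consumer stops early (a small limit), every fired task still has
/// to be dropped, and the drop only takes effect asynchronously once its
/// channel closes -- the accumulation described above.
fn concurrent_exchange(
    task_count: usize,
    make_task: impl Fn(usize) -> TaskStream,
) -> TaskStream {
    let streams: Vec<_> = (0..task_count).map(make_task).collect();
    Box::pin(stream::select_all(streams))
}

/// Sequential exchange: fire one upstream task at a time; the next task is
/// only created once the previous one is exhausted, bounding both buffered
/// rows and the number of tasks left to clean up on an early stop.
fn sequential_exchange(
    task_count: usize,
    make_task: impl Fn(usize) -> TaskStream + Send + 'static,
) -> TaskStream {
    Box::pin(stream::iter(0..task_count).flat_map(make_task))
}
```

The trade-off is parallelism: reading sources sequentially only makes sense when the query needs few rows, as with a small LIMIT.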

chenzl25 requested a review from wenym1, January 26, 2024 07:25
@chenzl25 (Contributor, Author)

This issue reminds me of the backfill prefetch streaming read OOM issue. Both of them share the same behavior.

    while let Some(data_chunk) = stream.next().await {
        let data_chunk = data_chunk?;
        yield data_chunk;
    }
    let streams = self
Contributor

How about moving this part into the self.sequential block? It's not easy to understand the difference in the collection logic otherwise.
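
For illustration, the suggested shape could look like the following sketch. Everything except self.sequential is assumed here: the field and method names (sources, take_data_stream) are hypothetical, and the body is written with futures_async_stream's #[try_stream]/#[for_await], which RisingWave's batch executors use.

```rust
#[try_stream(boxed, ok = DataChunk, error = BatchError)]
async fn do_execute(self: Box<Self>) {
    if self.sequential {
        // Sequential: create and drain one upstream stream at a time.
        for source in self.sources {
            #[for_await]
            for data_chunk in source.take_data_stream() {
                yield data_chunk?;
            }
        }
    } else {
        // Concurrent: collect all upstream streams up front, then merge.
        let streams: Vec<_> = self
            .sources
            .into_iter()
            .map(|s| s.take_data_stream())
            .collect();
        #[for_await]
        for data_chunk in futures::stream::select_all(streams) {
            yield data_chunk?;
        }
    }
}
```

Keeping the two collection strategies in their own branches makes the difference explicit, at the cost of a little duplication.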

@liurenjie1024 (Contributor) left a comment

LGTM, thanks!


chenzl25 added this pull request to the merge queue Jan 26, 2024
Merged via the queue into main with commit 5576c2c Jan 26, 2024
34 of 35 checks passed
chenzl25 deleted the dylan/use_sequential_exchange_for_limit branch January 26, 2024 08:58
@BugenZhao (Member) left a comment

Can we document somewhere in the code what sequential means? It doesn't seem very intuitive.
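
For instance, a doc comment along these lines could be attached to the flag (the wording is a suggestion, not taken from the PR):

```rust
/// Read upstream exchange sources sequentially instead of concurrently.
///
/// When `true`, the exchange executor fires one upstream task at a time
/// and only creates the next task's stream after the previous one is
/// exhausted. This bounds buffered data to roughly `limit` rows rather
/// than `task_count * limit`, and a query that terminates early (e.g. a
/// small `LIMIT`) leaves at most one in-flight task to drop. The trade-off
/// is reduced parallelism, so it only suits queries that need few rows.
pub sequential: bool,
```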

Development

Successfully merging this pull request may close these issues.

nightly-20240117 compute node OOM during sysbench select-random-limits