feat: merge small chunks for sink executor #17824

Closed
chenzl25 opened this issue Jul 26, 2024 · 7 comments · Fixed by #17825 or #17968

Comments

@chenzl25
Contributor

Is your feature request related to a problem? Please describe.

It has been found that the chunks arriving at a Kinesis sink executor are too small, with an average cardinality of 1.5 rows. Since the Kinesis sink sends one chunk at a time to the external Kinesis system, high network latency between RisingWave and Kinesis results in low sink throughput.

I think we can support a chunk merge executor for sinks to avoid small chunks.
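As a rough illustration (not RisingWave's actual executor API), such a merge step could greedily pack small chunks into larger ones up to some row limit before handing them to the sink. `Chunk` below is a simplified stand-in for a stream chunk; real chunks carry columnar data and ops, and a real executor would also respect barriers:

```rust
/// Simplified stand-in for a stream chunk: just a row payload here.
struct Chunk {
    rows: Vec<u64>,
}

/// Greedily pack small chunks into larger ones of at most `max_rows` rows.
fn merge_small_chunks(inputs: Vec<Chunk>, max_rows: usize) -> Vec<Chunk> {
    let mut output = Vec::new();
    let mut current: Vec<u64> = Vec::new();
    for chunk in inputs {
        for row in chunk.rows {
            current.push(row);
            if current.len() >= max_rows {
                // Emit a full chunk and start accumulating a new one.
                output.push(Chunk { rows: std::mem::take(&mut current) });
            }
        }
    }
    if !current.is_empty() {
        output.push(Chunk { rows: current });
    }
    output
}
```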

Describe the solution you'd like

No response

Describe alternatives you've considered

No response

Additional context

No response

@fuyufjh
Member

fuyufjh commented Jul 29, 2024

IIUC, this problem exists for all internal operators: any chunk that passes an Exchange will be split into smaller chunks. Say the parallelism is 128; then every chunk will be divided across 128 downstream actors, ending up as chunks of roughly 2 rows.

I am thinking that, since the problem is caused by the HashDispatcher, shall we solve it in the Merge executor? For example, by waiting for all Ready chunks and merging them into a single bigger chunk.
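A minimal sketch of that idea, assuming a plain tokio channel and a simplified `Chunk` type rather than the actual Merge executor internals: block only on the first chunk, then drain whatever is already buffered and emit one merged chunk, so no extra waiting is introduced.

```rust
use tokio::sync::mpsc;

/// Simplified stand-in for a stream chunk: only a row count here.
struct Chunk {
    rows: usize,
}

/// Wait for one chunk, then greedily absorb chunks that are already queued
/// (up to `max_rows`). `try_recv` never blocks, so latency is unchanged.
async fn next_merged_chunk(rx: &mut mpsc::Receiver<Chunk>, max_rows: usize) -> Option<Chunk> {
    let mut merged = rx.recv().await?;
    while merged.rows < max_rows {
        match rx.try_recv() {
            Ok(next) => merged.rows += next.rows, // in reality: concatenate the chunk data
            Err(_) => break,                      // nothing more is ready right now
        }
    }
    Some(merged)
}
```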

@chenzl25
Contributor Author

#15713

@st1page
Contributor

st1page commented Jul 29, 2024

IIUC, this problem exists for all internal operators: any chunk that passes an Exchange will be split into smaller chunks. Say the parallelism is 128; then every chunk will be divided across 128 downstream actors, ending up as chunks of roughly 2 rows.

I am thinking that, since the problem is caused by the HashDispatcher, shall we solve it in the Merge executor? For example, by waiting for all Ready chunks and merging them into a single bigger chunk.

It will hurt latency, so we need to make it configurable.

@chenzl25
Contributor Author

I tested the PR and it works great. There is no back pressure in the sink anymore. Let's discuss whether to apply the merging more broadly (e.g., for all types of sinks, or even in the Merge executor).

@fuyufjh
Member

fuyufjh commented Jul 30, 2024

I feel that further discussion is less helpful here. I'd like to write code to implement the idea and run some benchmarks.

For example, by waiting for all Ready chunks and merging them into a single bigger chunk.

It will hurt latency, so we need to make it configurable.

I think my proposal won't hurt latency, because it doesn't introduce any additional waiting.

@chenzl25
Contributor Author

Let me first add a histogram metric to monitor the input chunk rows per actor, so we can see in which workloads small chunks appear. I don't think Nexmark is a good benchmark for this issue.
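For reference, a hedged sketch of such a histogram using the `prometheus` crate directly; the metric name, label, and buckets below are illustrative, not the ones RisingWave actually registers.

```rust
use prometheus::{exponential_buckets, HistogramOpts, HistogramVec, Registry};

fn register_actor_chunk_rows(registry: &Registry) -> prometheus::Result<HistogramVec> {
    let opts = HistogramOpts::new(
        "stream_actor_input_chunk_rows", // hypothetical metric name
        "Cardinality (row count) of chunks received by an actor",
    )
    // Buckets from 1 row up to 1024 rows, doubling each step.
    .buckets(exponential_buckets(1.0, 2.0, 11)?);
    let histogram = HistogramVec::new(opts, &["actor_id"])?;
    registry.register(Box::new(histogram.clone()))?;
    Ok(histogram)
}

// In an actor's input loop (illustrative; `chunk_rows` and `actor_id` are placeholders):
// histogram.with_label_values(&[&actor_id.to_string()]).observe(chunk_rows as f64);
```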


This issue has been open for 60 days with no activity.

If you think it is still relevant today, and needs to be done in the near future, you can comment to update the status, or just manually remove the no-issue-activity label.

You can also confidently close this issue as not planned to keep our backlog clean.
Don't worry if you think the issue is still valuable to continue in the future.
It's searchable and can be reopened when it's time. 😄
