-
Notifications
You must be signed in to change notification settings - Fork 599
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(stream): compact stream chunk by dropping useless rows in sink #11070
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
license-eye has totally checked 3827 files.
Valid | Invalid | Ignored | Fixed |
---|---|---|---|
1714 | 1 | 2112 | 0 |
Click to see the invalid file list
- src/stream/src/common/compact_chunk.rs
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Codecov Report
@@ Coverage Diff @@
## main #11070 +/- ##
==========================================
+ Coverage 69.76% 69.80% +0.03%
==========================================
Files 1404 1405 +1
Lines 235006 235243 +237
==========================================
+ Hits 163958 164204 +246
+ Misses 71048 71039 -9
Flags with carried forward coverage won't be shown. Click here to find out more.
... and 10 files with indirect coverage changes 📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more |
A quick question. Should we compact per chunk? Why not per barrier in SinkExecutor? |
implement some new logic to maintain the UpdateDelete and UpdateInsert in the input chunk. PTAL again @fuyufjh , thanks. Maybe we can open it in project executor? |
-1. It seems to be too heavy (and also obscure) for being used in |
From my point of view, this PR has been diverged a lot from its original motivation (#10853), and has become somehow over-designed. Also slightly -1 for #10949, because it's a problem imagined by ourselves. Even if we need to solve it, we need to carefully choose the implementation (must be very fast) and where to put it on (such as before UDF and after column-pruning Project) I tend to close this PR now and restart from the very beginning: solve #10853 at SinkExecutor. |
Fine, Let's merge this PR and I will solve it in sink executor later |
Co-authored-by: TennyZhuang <[email protected]>
…risingwave into cyx/compact-stream-chunk
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Loooks good enough to merge
Misunderstood. 😇 I mean close, not merge. |
Anyway, you can merge it if you insist, because it only changes the Sink. Although I think my proposal is a little bit better in terms of efficiency. It's still unacceptable to me to do the same thing in
|
But both issues are solved now 😇 The only remind issue here is the performance overhead now, |
Sorry, I missed some changes in the file. Just had a huddle with @st1page and here is the notes:
This doesn't apply on current implementation.
There is no reordering now. This PR finds out all pair of deletes and insert and set their invisibility to false. I have concern with this behavior, because I'm not 100% sure it's safe. For example, assume there is a
If we eliminate the rows, then after In order words, this is somehow another kind of reordering: logically, it delays the I know we don't have |
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: st1page <[email protected]> Co-authored-by: stonepage <[email protected]> Co-authored-by: Bugen Zhao <[email protected]> Co-authored-by: TennyZhuang <[email protected]>
Signed-off-by: Bugen Zhao <[email protected]>
I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.
What's changed and what's your intention?
This PR compacts a stream chunk by converting UPDATE into INSERT & DELETE, as well as dropping useless rows. Specifically, UPDATE K->K will be dropped. The chunk compaction is used in sink executor and project executor.
Related issues: #10949
Fix #10853
Checklist
./risedev check
(or alias,./risedev c
)Documentation
Click here for Documentation
Types of user-facing changes
Please keep the types that apply to your changes, and remove the others.
Release note