-
Notifications
You must be signed in to change notification settings - Fork 594
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
refactor(frontend): separate cdc table scan from stream table scan #13332
Conversation
The main conflict with #13276 is cdc backfill has its own function to construct its state catalog. So it should be easy to resolve, I can help with it. |
Any specific tests I should run for cdc backfill? |
Codecov Report
@@ Coverage Diff @@
## main #13332 +/- ##
==========================================
- Coverage 67.83% 67.80% -0.03%
==========================================
Files 1525 1526 +1
Lines 259575 259782 +207
==========================================
+ Hits 176072 176143 +71
- Misses 83503 83639 +136
Flags with carried forward coverage won't be shown. Click here to find out more.
... and 20 files with indirect coverage changes 📣 Codecov offers a browser extension for seamless coverage viewing on GitHub. Try it in Chrome or Firefox today! |
Hi all PTAL |
@fuyufjh @StrikeW @BugenZhao @yezizp2012 those who were involved in the related discussion |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please add a plan test to show what it looks like in explain statement?
Can we extract some common methods? Or the maintainability is still not that good IMO. 😕 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rest lgtm
Any specific tests I should run for cdc backfill?
There is a cdc.share_stream.slt
e2e test, you can run it locally with ./risedev slt
@kwannoel
Won't do too much refactor first. Because #13276 is still in progress and I don't want to introduce too much conflicts. I have made a note of it in the parent issue for this PR. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rest LGTM
.into(), | ||
"Remove the FORMAT and ENCODE specification".into(), | ||
) | ||
.into()), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changes to this file are bug fix for explain of cdc. Previously this logic is only located in handle_create_table
.
This logic can be further unified with that in handle_create_table
.
Further refactoring should:
- Provide an
options
struct to simplify argument passing. - unify the match functionality between both, to avoid this bug from happening again.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Refactoring will not be done here. It is out of scope for purposes of this PR. Only just fix so we can add a planner test.
done |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
|
||
let node_body = | ||
// don't need batch plan for cdc source | ||
PbNodeBody::StreamScan(StreamScanNode { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tend to use a new fragment / proto message for StreamCdcScanNode
. Of course this can be done in the future. cc. @StrikeW
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Additionally, we don't need to follow the bad design of StreamScanNode
, which has 2 inputs - an empty MergeNode
as placeholder and a strange BatchScanNode
but not really a BatchScan
As I mentioned in #13242 (comment): This part is shitty but changing it (StreamScanNode
) will break compatibility. But here we have the chance to change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tend to use a new fragment / proto message for
StreamCdcScanNode
. Of course this can be done in the future. cc. @StrikeW
Agree, this is my plan, stated in the PR description:
Just separate them on the stream plan layer first. Don't touch the LogicalScan, core, proto definitions first, so we can keep each PR small and separate them step by step.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a checklist to track the changes.
cef3a8a
to
6164b72
Compare
I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.
What's changed and what's your intention?
Part of #13146
Just separate them on the stream plan layer first. Don't touch the
LogicalScan
,core
,proto
definitions first, so we can keep each PR small and separate them step by step.This PR just copy the whole stream table scan, then:
Checklist
./risedev check
(or alias,./risedev c
)Documentation
Release note
If this PR includes changes that directly affect users or other significant modifications relevant to the community, kindly draft a release note to provide a concise summary of these changes. Please prioritize highlighting the impact these changes will have on users.