-
Notifications
You must be signed in to change notification settings - Fork 594
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
refactor(optimizer): divide logical optimizer into one for batch and one for streaming. #8192
Conversation
Codecov Report
@@ Coverage Diff @@
## main #8192 +/- ##
==========================================
- Coverage 71.69% 71.68% -0.02%
==========================================
Files 1131 1132 +1
Lines 182332 182320 -12
==========================================
- Hits 130723 130694 -29
- Misses 51609 51626 +17
Flags with carried forward coverage won't be shown. Click here to find out more.
📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PATL @st1page
Ok(plan) | ||
} | ||
|
||
pub fn gen_optimized_logical_plan_for_batch(mut plan: PlanRef) -> Result<PlanRef> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmm. So it comes to another problem - lots of duplication code between the for_batch
and for_stream
. I am not sure whether it's better than before.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Somewhat agree. I think it's easier to understand in a single block and branching on batch or stream. It makes it clearer which optimizations are shared, and which apply respectively only to batch or stream.
But this assumes that the general ordering of optimization passes are shared (so far, it's been true, but I can't tell if this PR also changes that)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Somewhat agree. I think it's easier to understand in a single block and branching on batch or stream. It makes it clearer which optimizations are shared, and which apply respectively only to batch or stream.
But this assumes that the general ordering of optimization passes are shared (so far, it's been true, but I can't tell if this PR also changes that)
In the beginning, it sounds good to only have a single block and branching on batch or stream, however, It contains more and more branches gradually. At least for now, I can give about 5 optimizations that needed to be applied to the batch and stream separately. In addition, I hope this separation can help developers to think of whether their optimization can only be applied to the batch or the streaming, or both.
src/frontend/planner_test/tests/testdata/predicate_pushdown.yaml
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, We used to think the optimization for streaming and batch have a common phase and it save our test and coding for a while. But currently, we find that the difference between the two optimizations could be flexible. And after this pr change, we can plug in the different optimization for the two execution more flexibly.
Hmmm. So it comes to another problem - lots of duplication code between the for_batch and for_stream. I am not sure whether it's better than before.
I think to reuse the duplication code, we should extract the common and meaningful functions such as fn predicate_push_down(mut plan) ->Result<PlanRef>
(in fact this piece is duplicated even in one specific execution's optimizer) or fn simple_push_down
. The emphasis of the "meaningful" here is because we do not think the batch and streaming shared a common optimization logical in nature.
In summary, I think the changes are just like that we use the Combination to reuse the code instead of Inheritance here.
I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.
What's changed and what's your intention?
PlanRoot
toLogicalOptimizer
.Checklist For Contributors
./risedev check
(or alias,./risedev c
)Checklist For Reviewers
Documentation
Click here for Documentation
Types of user-facing changes
Please keep the types that apply to your changes, and remove the others.
Release note