Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(optimizer): add columns_monotonicity field for PlanNode #17600

Merged
merged 6 commits into from
Jul 17, 2024

Conversation

stdrc
Copy link
Member

@stdrc stdrc commented Jul 8, 2024

I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.

What's changed and what's your intention?

Following #17554, this PR adds a new field columns_monotonicity to plan_base::StreamExtra, and derives the property in each stream node. As a result, we can use the property as a hint to optimize algorithm, or require input nodes to have a specific variant of property.

Fixes #13983.

Checklist

  • I have written necessary rustdoc comments
  • I have added necessary unit tests and integration tests
  • I have added test labels as necessary. See details.
  • I have added fuzzing tests or opened an issue to track them. (Optional, recommended for new SQL features Sqlsmith: Sql feature generation #7934).
  • My PR contains breaking changes. (If it deprecates some features, please create a tracking issue to remove them in the future).
  • All checks passed in ./risedev check (or alias, ./risedev c)
  • My PR changes performance-critical code. (Please run macro/micro-benchmarks and show the results.)
  • My PR contains critical fixes that are necessary to be merged into the latest release. (Please check out the details)

Documentation

  • My PR needs documentation updates. (Please use the Release note section below to summarize the impact on users)

Release note

If this PR includes changes that directly affect users or other significant modifications relevant to the community, kindly draft a release note to provide a concise summary of these changes. Please prioritize highlighting the impact these changes will have on users.

Copy link
Member Author

stdrc commented Jul 8, 2024

@stdrc stdrc changed the title add columns_monotonicity field for PlanNode feat(optimizer): [WIP] add columns_monotonicity field for PlanNode Jul 8, 2024
@stdrc stdrc marked this pull request as ready for review July 8, 2024 06:18
@stdrc stdrc marked this pull request as draft July 8, 2024 06:19
@stdrc stdrc changed the title feat(optimizer): [WIP] add columns_monotonicity field for PlanNode feat(optimizer): add columns_monotonicity field for PlanNode Jul 9, 2024
@stdrc stdrc force-pushed the rc/monotonicity-as-a-property branch from fa356fc to 2d4e7b2 Compare July 9, 2024 08:29
@stdrc stdrc force-pushed the rc/column-monotonicity branch from 9f264c4 to 6b9feea Compare July 9, 2024 08:32
@stdrc stdrc force-pushed the rc/monotonicity-as-a-property branch from 2d4e7b2 to d40c9c9 Compare July 9, 2024 08:32
@stdrc stdrc marked this pull request as ready for review July 9, 2024 08:32
Base automatically changed from rc/column-monotonicity to main July 10, 2024 05:25
@stdrc stdrc force-pushed the rc/monotonicity-as-a-property branch from d40c9c9 to 5f46d7a Compare July 10, 2024 06:41
Copy link
Contributor

@st1page st1page left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

generally LGTM

src/frontend/src/optimizer/plan_node/stream_group_topn.rs Outdated Show resolved Hide resolved
src/frontend/src/optimizer/plan_node/stream_project_set.rs Outdated Show resolved Hide resolved
src/frontend/src/optimizer/plan_node/stream_union.rs Outdated Show resolved Hide resolved
src/frontend/src/optimizer/plan_node/stream_fs_fetch.rs Outdated Show resolved Hide resolved
@stdrc stdrc force-pushed the rc/monotonicity-as-a-property branch 2 times, most recently from 986df96 to 48d85fd Compare July 16, 2024 08:58
@stdrc stdrc changed the base branch from main to rc/stream-node-new-clean-up July 16, 2024 08:58
@stdrc stdrc force-pushed the rc/stream-node-new-clean-up branch from 043bb8a to 3320e87 Compare July 16, 2024 09:47
@stdrc stdrc force-pushed the rc/monotonicity-as-a-property branch from 48d85fd to bbbdd0e Compare July 16, 2024 09:47
Base automatically changed from rc/stream-node-new-clean-up to main July 16, 2024 10:42
@stdrc stdrc force-pushed the rc/monotonicity-as-a-property branch from bbbdd0e to ac30c1f Compare July 16, 2024 13:07
Copy link
Member

@xxchan xxchan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any background about what makes you want to resurrect this work?

@@ -271,3 +292,53 @@ impl MonotonicityAnalyzer {
Inherent(Unknown)
}
}

#[derive(Debug, Default, Clone, PartialEq, Eq, Hash)]
pub struct MonotonicityMap(BTreeMap<usize, Monotonicity>);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add comments? The meaning of the BTreeMap is not clear to me. Is it sth like [Monotonicity; num_cols]?

Comment on lines 325 to 329
impl IndexMut<usize> for MonotonicityMap {
fn index_mut(&mut self, idx: usize) -> &mut Self::Output {
self.0.entry(idx).or_insert(Monotonicity::Unknown)
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you want to maintain the invariant of not containing Unknown, I think we should not implement IndexMut, and stick to insert, because you will be able to m[i]=Unknown.

false,
false,
watermark_columns,
MonotonicityMap::new(),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are TopN's order columns monotonic?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm afraid not. Currently I tend to define monotonicity as a property ignoring Ops. While you did remind me that TopN can produce watermarks.

stdrc added 4 commits July 17, 2024 11:33
Signed-off-by: Richard Chien <[email protected]>
Signed-off-by: Richard Chien <[email protected]>
@stdrc stdrc force-pushed the rc/monotonicity-as-a-property branch from ac30c1f to 15a38e7 Compare July 17, 2024 03:38
@stdrc
Copy link
Member Author

stdrc commented Jul 17, 2024

Any background about what makes you want to resurrect this work?

Three reasons:

  1. The effort towards a unified state cleaning approach for </<= and >/>= is blocked (refactor(dyn-filter): properly clean state and propagate watermark in DynamicFilter #17694), so, to get rid of the hack to determine condition_always_relax, we need feat(dyn-filter): derive condition_always_relax from column monotonicity #17704, which needs monotonicity property.
  2. Even we can avoid condition_always_relax after unifying dynamic filter state cleaning by refactor(dyn-filter): properly clean state and propagate watermark in DynamicFilter #17694, we still need monotonicity property to infer the out_append_only property.
  3. We may be able to utilize this property to avoid extra computation for reordering, e.g. to avoid StreamEowcSort before StreamEowcOverWindow if the input is already NonDecreasing.

@stdrc stdrc requested a review from xxchan July 17, 2024 07:14
@stdrc stdrc added this pull request to the merge queue Jul 17, 2024
Merged via the queue into main with commit 6e2c82f Jul 17, 2024
30 of 31 checks passed
@stdrc stdrc deleted the rc/monotonicity-as-a-property branch July 17, 2024 10:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

optimizer: implement monotonic_columns for stream plan nodes
4 participants