perf(over window): skip remaining affected rows when rank is not changed #18950

Merged
merged 2 commits into from
Oct 28, 2024

Conversation

stdrc
Member

@stdrc stdrc commented Oct 16, 2024

I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.

What's changed and what's your intention?

For example:

rank() over (
  partition by ...
  order by val
)

Rows:

 row_id | val | rank
--------|-----|------
 100    | 2   | 1
 101    | 6   | 2
 102    | 7   | 3
 103    | 9   | 4

When the val of row 101 changes, it can potentially affect rows 102 and 103. Previously, no matter what val was changed to, we re-computed all outputs from row 101 to the end of the partition (row 103). After this PR, if the output of row 102 is unchanged, we stop the re-computation there.
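
To make the idea concrete, here is a minimal, self-contained sketch of the early stop. It is not the actual executor code: the `Row` struct, `sample_partition`, and `recompute_ranks` are hypothetical names made up for illustration. It also assumes the partition is held in memory, already sorted by val, and that the updated row keeps its position in that order.

```rust
/// Hypothetical in-memory row of one partition, kept sorted by `val`.
#[derive(Debug, Clone, PartialEq)]
struct Row {
    row_id: u32,
    val: i64,
    rank: u64,
}

/// The four example rows from the table above.
fn sample_partition() -> Vec<Row> {
    vec![
        Row { row_id: 100, val: 2, rank: 1 },
        Row { row_id: 101, val: 6, rank: 2 },
        Row { row_id: 102, val: 7, rank: 3 },
        Row { row_id: 103, val: 9, rank: 4 },
    ]
}

/// Recompute `rank()` starting at index `changed` (the updated row).
/// Returns how many rows had their output recomputed.
///
/// With `early_stop`, the loop ends at the first row *after* the changed
/// one whose output equals its old output. Later rows depend only on
/// their unchanged predecessors, so their ranks cannot change either.
fn recompute_ranks(rows: &mut [Row], changed: usize, early_stop: bool) -> usize {
    let mut recomputed = 0;
    for i in changed..rows.len() {
        // rank(): ties share the rank of the previous row,
        // otherwise the rank is the 1-based position.
        let new_rank = if i > 0 && rows[i].val == rows[i - 1].val {
            rows[i - 1].rank
        } else {
            i as u64 + 1
        };
        recomputed += 1;
        let unchanged = new_rank == rows[i].rank;
        rows[i].rank = new_rank;
        // The changed row itself is never a stopping point: its own rank
        // may stay the same while the next row's rank still changes.
        if early_stop && i > changed && unchanged {
            break;
        }
    }
    recomputed
}
```

In this sketch, changing the val of row 101 from 6 to 5 recomputes two rows with the early stop (rows 101 and 102) instead of three. Changing it to 7 makes row 102 tie with row 101, so the rank of row 102 changes and the loop continues to row 103.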

Checklist

  • I have written necessary rustdoc comments
  • I have added necessary unit tests and integration tests
  • I have added test labels as necessary. See details.
  • I have added fuzzing tests or opened an issue to track them. (Optional, recommended for new SQL features Sqlsmith: Sql feature generation #7934).
  • My PR contains breaking changes. (If it deprecates some features, please create a tracking issue to remove them in the future).
  • All checks passed in ./risedev check (or alias, ./risedev c)
  • My PR changes performance-critical code. (Please run macro/micro-benchmarks and show the results.)
  • My PR contains critical fixes that are necessary to be merged into the latest release. (Please check out the details)

Documentation

  • My PR needs documentation updates. (Please use the Release note section below to summarize the impact on users)

Release note

If this PR includes changes that directly affect users or other significant modifications relevant to the community, kindly draft a release note to provide a concise summary of these changes. Please prioritize highlighting the impact these changes will have on users.

@stdrc stdrc changed the title skip remaining affected curr keys when rank has no change perf(over window): skip remaining affected rows when rank is not changed Oct 16, 2024
@stdrc stdrc marked this pull request as ready for review October 16, 2024 13:38
@st1page st1page self-requested a review October 16, 2024 15:07
@stdrc stdrc force-pushed the rc/over-window-state-compute-metric branch from 9fe13b2 to 746309b Compare October 17, 2024 08:53
@stdrc stdrc force-pushed the rc/over-window-perf-optimization-1 branch from 50e2520 to 4fa1dce Compare October 17, 2024 08:53
@stdrc stdrc force-pushed the rc/over-window-state-compute-metric branch from 746309b to d5e5adc Compare October 21, 2024 04:58
@stdrc stdrc force-pushed the rc/over-window-perf-optimization-1 branch from 4fa1dce to 9cbf647 Compare October 21, 2024 04:58
@stdrc stdrc requested review from Li0k and kwannoel October 21, 2024 05:23
@stdrc stdrc force-pushed the rc/over-window-state-compute-metric branch from d5e5adc to 102e482 Compare October 21, 2024 05:43
@stdrc stdrc force-pushed the rc/over-window-perf-optimization-1 branch from 9cbf647 to b2059c5 Compare October 21, 2024 05:43
Base automatically changed from rc/over-window-state-compute-metric to main October 21, 2024 06:47
@stdrc stdrc force-pushed the rc/over-window-perf-optimization-1 branch from c0fd7e7 to f6d2d5a Compare October 23, 2024 08:04
@stdrc stdrc changed the base branch from main to rc/over-window-opt-pass-through-unchanged October 23, 2024 08:04
@stdrc stdrc force-pushed the rc/over-window-opt-pass-through-unchanged branch from 0d87ab2 to 0fc881d Compare October 23, 2024 08:08
@stdrc stdrc force-pushed the rc/over-window-perf-optimization-1 branch 2 times, most recently from 01a5a65 to e2055c8 Compare October 23, 2024 08:13
@stdrc stdrc force-pushed the rc/over-window-perf-optimization-1 branch from e2055c8 to aa685e8 Compare October 23, 2024 08:26
@stdrc stdrc marked this pull request as draft October 23, 2024 08:28
@stdrc stdrc force-pushed the rc/over-window-perf-optimization-1 branch from aa685e8 to c993b5f Compare October 23, 2024 08:58
@stdrc stdrc force-pushed the rc/over-window-opt-pass-through-unchanged branch from 099e640 to 1eb8388 Compare October 23, 2024 12:41
@stdrc stdrc force-pushed the rc/over-window-perf-optimization-1 branch 2 times, most recently from 42184a0 to 49c3479 Compare October 23, 2024 13:11
@stdrc stdrc force-pushed the rc/over-window-opt-pass-through-unchanged branch from 48e3886 to ca48be1 Compare October 25, 2024 06:15
@stdrc stdrc force-pushed the rc/over-window-perf-optimization-1 branch from 49c3479 to cdc8c0c Compare October 25, 2024 06:15
@stdrc stdrc marked this pull request as ready for review October 28, 2024 08:56
@stdrc
Member Author

stdrc commented Oct 28, 2024

After a rough benchmark using a custom query on Nexmark datagen, I found that this PR doesn't help much, but it is still better than nothing.

Query (not much rationale, just a quick attempt to imitate a user's query):

CREATE VIEW v1 AS
SELECT
    foo,
    bar,
    SUM(point) AS points
FROM (
    SELECT
        ((bidder * 2654435761) # (bidder >> 16)) % 10000 as foo,
        auction % 3 as bar,
        price % 10 as point
    FROM bid
)
GROUP BY foo, bar;

CREATE SINK mv AS
SELECT
    foo,
    bar,
    points,
    ROW_NUMBER() OVER (
        PARTITION BY bar
        ORDER BY points DESC, foo
    ) AS r
FROM v1
WITH ( connector = 'blackhole', type = 'append-only', force_append_only = 'true');

Before:

[Screenshot: row number old 1]

After:

[Screenshot: row number skip same output]

I'm going to merge this first, since nothing more can be done in this PR and the performance improvement looks small.

@stdrc stdrc force-pushed the rc/over-window-opt-pass-through-unchanged branch from ca48be1 to 13778ee Compare October 28, 2024 09:45
@stdrc stdrc force-pushed the rc/over-window-perf-optimization-1 branch from 6b7bf3f to c16155a Compare October 28, 2024 09:45
Base automatically changed from rc/over-window-opt-pass-through-unchanged to main October 28, 2024 10:57
@graphite-app graphite-app bot requested review from a team October 28, 2024 11:02
@stdrc stdrc force-pushed the rc/over-window-perf-optimization-1 branch from c16155a to 44fd039 Compare October 28, 2024 17:06
@stdrc stdrc enabled auto-merge October 28, 2024 17:07
@stdrc stdrc added this pull request to the merge queue Oct 28, 2024
Merged via the queue into main with commit b0dad75 Oct 28, 2024
29 of 30 checks passed
@stdrc stdrc deleted the rc/over-window-perf-optimization-1 branch October 28, 2024 17:46
stdrc added a commit that referenced this pull request Nov 4, 2024
stdrc added a commit that referenced this pull request Nov 8, 2024
3 participants