WindowOperatorQueryFrameProcessor: Fix frame writer capacity issues + adhere to FrameProcessor's contract #17209

Akshat-Jain · 2024-10-01T05:30:37Z

Description

WindowOperatorQueryFrameProcessor had a bug where the frame writer could reach its capacity on larger queries, so the processor did not output all of the result rows.

For example, the following query returns an incomplete set of rows with maxNumTasks=2, compared to runs with more workers.

select trip_id, row_number() over(partition by trip_id) as c1 from "trips_xaa" where __time < TIMESTAMP '2013-09-20 11:15:25' group by trip_id
-- This gives 20479 rows with maxNumTasks=2
-- This gives 30341 rows with maxNumTasks=5
-- This gives 30341 rows with maxNumTasks=11

This PR fixes the above issue by tracking the last rowId flushed to the output channel and triggering another iteration of runIncrementally() whenever the frame writer still has rows pending flush to the output channel.

This respects FrameProcessor's contract, which requires that we write at most a single frame to each output channel in any given iteration of runIncrementally().
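As a rough illustration of the approach (not the actual processor code), the sketch below simulates a writer with a fixed row capacity and a processor that remembers the last flushed rowId between iterations. Every name other than runIncrementally() is hypothetical, and rows are plain integers rather than Druid frames.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a frame writer with a fixed row capacity. */
final class BoundedFrameWriter {
  private final int capacity;
  private final List<Integer> rows = new ArrayList<>();

  BoundedFrameWriter(int capacity) {
    this.capacity = capacity;
  }

  /** Returns false, without adding the row, once the frame is full. */
  boolean addRow(int row) {
    if (rows.size() >= capacity) {
      return false;
    }
    rows.add(row);
    return true;
  }

  List<Integer> toFrame() {
    return new ArrayList<>(rows);
  }
}

/** Hypothetical processor: writes at most one frame per runIncrementally() call. */
final class IncrementalFlushSketch {
  private final List<Integer> resultRows;
  private final int frameCapacity;
  private final List<List<Integer>> outputChannel = new ArrayList<>();
  private int lastFlushedRowId = -1; // state kept across iterations

  IncrementalFlushSketch(List<Integer> resultRows, int frameCapacity) {
    if (frameCapacity <= 0) {
      throw new IllegalArgumentException("frameCapacity must be positive");
    }
    this.resultRows = resultRows;
    this.frameCapacity = frameCapacity;
  }

  /** Returns true if rows are still pending and another iteration is needed. */
  boolean runIncrementally() {
    BoundedFrameWriter writer = new BoundedFrameWriter(frameCapacity);
    int rowId = lastFlushedRowId + 1; // resume after the last flushed row
    while (rowId < resultRows.size() && writer.addRow(resultRows.get(rowId))) {
      rowId++;
    }
    if (rowId > lastFlushedRowId + 1) {
      outputChannel.add(writer.toFrame()); // a single frame for this iteration
    }
    lastFlushedRowId = rowId - 1;
    return lastFlushedRowId < resultRows.size() - 1;
  }

  List<List<Integer>> frames() {
    return outputChannel;
  }

  public static void main(String[] args) {
    List<Integer> rows = new ArrayList<>();
    for (int i = 0; i < 10; i++) {
      rows.add(i);
    }
    IncrementalFlushSketch processor = new IncrementalFlushSketch(rows, 4);
    int iterations = 1;
    while (processor.runIncrementally()) {
      iterations++;
    }
    System.out.println(iterations + " iterations, " + processor.frames().size() + " frames");
  }
}
```

With 10 rows and a capacity of 4, the loop in main needs three iterations, and no row is dropped when the writer fills up.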

For manual testing, I've been verifying the behavior with queries like the following:

select trip_id, row_number() over(partition by trip_id) as c1 from "trips_xaa" where __time < TIMESTAMP '2016-10-20 11:15:25' group by trip_id
-- 49795247 rows inputted, 49795247 rows outputted by window stage (with maxNumTasks=2 and maxNumTasks=11)

select c1, count(c1) from (select trip_id, row_number() over(partition by trip_id) as c1 from "trips_xaa" where __time < TIMESTAMP '2016-10-20 11:15:25' group by trip_id) group by c1
-- [1, 49795247] with maxNumTasks=2 and maxNumTasks=11

Additionally, I have added a test, WindowOperatorQueryFrameProcessorTest#testFrameWriterReachingCapacity(), which previously wrote fewer rows than expected to the output channel and now writes the complete set of rows.

This PR has:

been self-reviewed.
added documentation for new or modified features or behaviors.
a release note entry in the PR description.
added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
added or updated version, license, or notice information in licenses.yaml
added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
added integration tests.
been tested in a test Druid cluster.

kgyrtkirk · 2024-10-01T09:36:40Z

this seems like a bugfix - please remove the optimization; that should be reviewed separately

Akshat-Jain · 2024-10-01T09:42:52Z

> this seems like a bugfix - please remove the optimization; that should be reviewed separately

@kgyrtkirk I think we should get the optimization merged for Druid 31 anyway (since it's significant), so I bundled it with this patch, which was already updating a lot of the frame writer logic.

If it helps, this part of the diff corresponds to the optimization:

...age-query/src/main/java/org/apache/druid/msq/querykit/WindowOperatorQueryFrameProcessor.java

kgyrtkirk · 2024-10-01T10:34:33Z

remove the performance enhancement and place it in a separate PR - also add a benchmark to help keep track of performance improvements

Akshat-Jain · 2024-10-01T12:25:37Z

> remove the performance enhancement and place it in a separate PR - also add a benchmark to help keep track of performance improvements

@kgyrtkirk I was hoping to keep all changes related to frame writer together, to ensure that everything works fine with both of them together. Instead of a bug fix PR, I'm looking at this PR as more of a "revamp frame writer logic in WindowOperatorQueryFrameProcessor" PR.

Splitting it into multiple PRs is certainly doable, but I'd still have to continue testing them coupled together locally, which is unnecessary and would cause unnecessary delays (we are hoping to get this in for Druid 31 release).

Regarding the benchmark - I agree that it would be great to have it. It's on my list of todos, but it's not a trivial task and would require some time. So until then, benchmark shouldn't block any performance improvements / logic changes.

Thoughts? cc: @cryptoe

kgyrtkirk · 2024-10-01T12:40:10Z

what's happening in this PR right now is pretty convoluted - that's why I'm asking to remove any performance enhancements ; because I don't think the fix for the bug is right

Akshat-Jain · 2024-10-01T13:01:04Z

@kgyrtkirk Fair enough, working on splitting up the PR.

Akshat-Jain · 2024-10-01T13:17:16Z

@kgyrtkirk I have removed the optimization diff from this PR and opened a separate PR for it: #17211. I'd appreciate your review on that PR as well. Thanks!

kgyrtkirk

looks good; just a minor comment :)

...age-query/src/main/java/org/apache/druid/msq/querykit/WindowOperatorQueryFrameProcessor.java

kgyrtkirk

thank you for the updates!
+1

kgyrtkirk · 2024-10-02T13:04:14Z

...age-query/src/main/java/org/apache/druid/msq/querykit/WindowOperatorQueryFrameProcessor.java

   * @throws IOException
   */
-  private void flushAllRowsAndCols(ArrayList<RowsAndColumns> resultRowAndCols) throws IOException
+  private void flushAllRowsAndCols() throws IOException
  {
    RowsAndColumns rac = new ConcatRowsAndColumns(resultRowAndCols);


It's just performance, but creating this rac is an O(n) operation regardless of where the rowId stands.
That's why it would have been better to pack all of these things into an inner workhorse class; when that is done, this should be taken into account.
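One possible reading of this suggestion, sketched under assumptions: a small holder builds the concatenated view once and remembers the flush position, so repeated flush attempts do not redo the O(n) concatenation. The class and method names are hypothetical, and plain integer lists stand in for RowsAndColumns.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical holder: concatenates the buffered batches once and tracks the flush position. */
final class PendingFlushState {
  private final List<Integer> concatenated = new ArrayList<>();
  private int nextRowId = 0; // first row that has not been flushed yet

  PendingFlushState(List<List<Integer>> batches) {
    // The O(n) concatenation happens here, once, not on every flush attempt.
    for (List<Integer> batch : batches) {
      concatenated.addAll(batch);
    }
  }

  /** Returns up to maxRows rows starting at the saved position and advances it. */
  List<Integer> nextChunk(int maxRows) {
    int end = Math.min(nextRowId + maxRows, concatenated.size());
    List<Integer> chunk = new ArrayList<>(concatenated.subList(nextRowId, end));
    nextRowId = end;
    return chunk;
  }

  boolean hasPending() {
    return nextRowId < concatenated.size();
  }
}
```

Each call to nextChunk() copies only the rows it returns, so the cost of a flush depends on the chunk size rather than on the total number of buffered rows.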

WindowOperatorQueryFrameProcessor: Fix frame writer capacity issues + adhere to FrameProcessor's contract

40e1bfa

github-actions bot added Area - Batch Ingestion Area - MSQ For multi stage queries - https://github.com/apache/druid/issues/12262 labels Oct 1, 2024

Fix checkstyle

26ee63f

cryptoe added this to the 31.0.0 milestone Oct 1, 2024

Add quidem test that hits the capacity of frame writer

70e7135

kgyrtkirk reviewed Oct 1, 2024

Akshat-Jain added 2 commits October 1, 2024 18:38

Remove the optimization diff

073b779

Remove newline

9dbf3ae

Akshat-Jain added 2 commits October 1, 2024 20:26

Address review comment - clean up the usages of flushRACsAndRunAgain

1c0aefa

Replace the single usage of flushRACsAndRunAgain() method with the code itself

547e51d

Akshat-Jain force-pushed the msq-wf-frame-capacity branch from 220450c to 547e51d Compare October 1, 2024 15:07

Akshat-Jain requested a review from kgyrtkirk October 1, 2024 15:09

kgyrtkirk reviewed Oct 2, 2024

Akshat-Jain added 3 commits October 2, 2024 13:32

Address review comment: Remove parameters from flushAllRowsAndCols() method

a884177

Remove rowId from signature of other methods also

3357cec

Address review comment

8c5be6a

kgyrtkirk approved these changes Oct 2, 2024

cryptoe approved these changes Oct 3, 2024

cryptoe merged commit 135ca8f into apache:master Oct 3, 2024
56 checks passed

Akshat-Jain mentioned this pull request Oct 3, 2024

[Backport] WindowOperatorQueryFrameProcessor fixes (#17209) (#17211) #17231

Merged

kfaraz mentioned this pull request Oct 11, 2024

[DRAFT] 31.0.0 Release Notes #17332

Closed
