rework cursor creation #16533

clintropolis · 2024-06-01T08:43:57Z

Description

This PR reworks Cursor and VectorCursor creation from CursorFactory (StorageAdapter) to be more efficient and push additional details about the query into the process with a newly introduced CursorBuildSpec. A new CursorHolder interface has been added that is constructed by another new interface, CursorHolderFactory with a method makeCursorHolder, which accepts the CursorBuildSpec and replaces the arguments of CursorFactory.makeCursors and CursorFactory.makeVectorCursors.

The primary goal here is to make it easy to push these details down to cursor creation for an upcoming feature under development, projections, which are effectively a type of materialized views that live within segments and will be created at ingestion time. More details on this coming later once I finish writing the proposal, but basically we need to know stuff like what grouping columns, filters, aggregations, etc, are involved in a query in order to be able to automatically select the appropriate projection available within the segment, or using the regular default segment rows if no matching projections are available.

In addition to the primary goal, this refactor also allows cursor creation to be more efficient. Since CursorHolder is created from the CursorBuildSpec, the canVectorize, asCursor, and asVectorCursor methods can all share and re-use the same resources.

Finally, since this PR is introducing new cursor building interfaces, it is also taking the opportunity to downplay the prominence of time ordering, particularly query granularity and its role in cursors. CursorFactory.makeCursors previously returned a Sequence<Cursor>, corresponding to query granularity buckets. It has been transitioned to no longer do this in CursorHolder.asCursor, instead returning a single Cursor, equivalent to using ALL granularity with the previous method but without being wrapped in a Sequence. Query granularity in engines is now handled using a new CursorGranularizer which is equivalent to how the vectorized engine works, simplifying query engines which do not care about query granularity, and making things easier for engines in the future to have segments which are not sorted by time.

StorageAdapter is a @PublicApi, and extends CursorFactory, so a default implementation of CursorHolderFactory that is backed by the old canVectorize, makeCursors, and makeVectorCursor methods is provided. That said, all of the built-in query engines and tests have been migrated to use makeCursorHolder, all CursorFactory implementations have been migrated to CursorHolderFactory, and the old methods of CursorFactory have been marked as @Deprecated, and implementations of CursorFactory have been completely removed.

summary of changes

Added CursorBuildSpec which captures all of the 'interesting' stuff that goes into producing a cursor as a replacement for the method arguments of CursorFactory.canVectorize, CursorFactory.makeCursor, and CursorFactory.makeVectorCursor
added new interface CursorHolder and new interface CursorHolderFactory as a replacement for CursorFactory, with method makeCursorHolder, which takes a CursorBuildSpec as an argument and replaces CursorFactory.canVectorize, CursorFactory.makeCursor, and CursorFactory.makeVectorCursor
CursorFactory.makeCursors previously returned a Sequence<Cursor> corresponding to the query granularity buckets, with a separate Cursor per bucket. CursorHolder.asCursor instead returns a single Cursor (equivalent to 'ALL' granularity), and a new CursorGranularizer has been added for query engines to iterate over the cursor and divide into granularity buckets. This makes the non-vectorized engine behave the same way as the vectorized query engine (with its VectorCursorGranularizer), and simplifies a lot of stuff that has to read segments particularly if it does not care about bucketing the results into granularities.
Deprecated CursorFactory, CursorFactory.canVectorize, CursorFactory.makeCursors, and CursorFactory.makeVectorCursor
updated all StorageAdapter implementations to implement makeCursorHolder, transitioned direct CursorFactory implementations to instead implement CursorMakerFactory. StorageAdapter being a CursorMakerFactory is intended to be a transitional thing, ideally will not be released in favor of moving CursorMakerFactory to be fetched directly from Segment, however this PR was already large enough so this will be done in a follow-up.
updated all query engines to use makeCursorHolder, granularity based engines to use CursorGranularizer.

Release note

(for developers)

Use notes from #16985 instead

There is a change to the StorageAdapter interface, which is a 'public' api for extension writers to add additional types of segments to participate in Druids query engines. StorageAdapter extends the CursorFactory interface, whose methods canVectorize, makeCursors, and makeVectorCursor have been deprecated. This interface is replaced with a new CursorHolderFactory, which StorageAdapter implements, with method makeCursorHolder, which accepts a CursorBuildSpec and returns a new interface CursorHolder which defines no argument versions of canVectorize, asCursor, and asVectorCursor.

A default implementation of makeCursorHolder is provided that uses the existing deprecated methods, so no immediate action is needed for StorageAdapter implementors on upgrade, but implementors should plan in the future to migrate to implementing CursorHolderFactory directly. Custom query engines can no longer use these deprecated methods, as they have been removed from all StorageAdapter implementations. Use CursorHolder instead. CursorHolder.asCursor is equivalent to calling CursorFactory.getCursors with ALL granularity. Any custom engines which need to bucket the cursor by query granularity should instead use CursorGranularizer, which can be used to produce a set of intervals and advance the cursor within those intervals.

(for users)
Due to some internal refactoring, TopN queries no longer can use the "pooled" algorithms internally for query granularities other than 'ALL', unless the columns value cardinality is smaller than the internally computed "numValuesPerPass", which is based on processing buffer size and projected aggregator size. This is not completely obvious from a users perspective, so the only indicator might be a performance difference for TopN queries with other query granularities. SQL will only ever plan to a TopN query with 'ALL' granularity so it is not affected.

Key changed/added classes in this PR

CursorHolderFactory
CursorHolder
CursorBuildSpec
CursorGranularizer
Cursor
Offset
CursorFactory (deprecated)
GroupingEngine (to use CursorHolder and CursorGranularizer)
TimeseriesQueryEngine (to use CursorHolder and CursorGranularizer)
TopNQueryEngine (to use CursorHolder and CursorGranularizer)
All other built-in query engines to use CursorHolder
All tests that use CursorFactory methods now use CursorHolder

This PR has:

been self-reviewed.
a release note entry in the PR description.
added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
been tested in a test Druid cluster.

changes: * Added `CursorBuildSpec` which captures all of the 'interesting' stuff that goes into producing a cursor as a replacement for the method arguments of `CursorFactory.canVectorize`, `CursorFactory.makeCursors`, and `CursorFactory.makeVectorCursors` * added new interfaces `CursorMaker` and new method `asCursorMaker` to `CursorFactory`, which takes a `CursorBuildSpec` as an argument and replaces `CursorFactory.canVectorize`, `CursorFactory.makeCursors`, and `CursorFactory.makeVectorCursors` * Deprecated `CursorFactory.canVectorize`, `CursorFactory.makeCursors`, and `CursorFactory.makeVectorCursors` * updated all `CursorFactory` implementations to implement `asCursorMaker` * updated all query engines to use `asCursorMaker`

processing/src/main/java/org/apache/druid/segment/CursorFactory.java

processing/src/main/java/org/apache/druid/segment/QueryableIndexCursorMaker.java

processing/src/main/java/org/apache/druid/segment/FilteredStorageAdapter.java

asdf2014

This method needs to be removed as it is never used to pass the intellij-inspections check in Github Action

processing/src/main/java/org/apache/druid/segment/CursorBuildSpec.java

* CursorMaker no longer handles query granularity directly * Sequence<Cursor> of makeCursors is now just a Cursor from makeCursor * added CursorGranularizer to bucket queries by granularity instead, updated all engines that support query granularity to use this instead (topN still needs some more work) * remove Cursor.getTime * remove implementations of CursorFactory methods that are not asCursorMaker, replacing with defaults that throw exceptions

processing/src/main/java/org/apache/druid/query/rowsandcols/LazilyDecoratedRowsAndColumns.java

-        frameWriter.addSelection();
-        in.advance();
+      for (; !cursor.isDoneOrInterrupted() && remainingRowsToFetch > 0; remainingRowsToFetch--) {
+        writer.addSelection();


processing/src/main/java/org/apache/druid/query/rowsandcols/StorageAdapterRowsAndColumns.java

processing/src/test/java/org/apache/druid/segment/virtual/ExpressionVectorSelectorsTest.java

...g/src/test/java/org/apache/druid/segment/incremental/IncrementalIndexStorageAdapterTest.java

...ti-stage-query/src/main/java/org/apache/druid/msq/querykit/scan/ScanQueryFrameProcessor.java

processing/src/test/java/org/apache/druid/segment/virtual/ExpressionVectorSelectorsTest.java

processing/src/main/java/org/apache/druid/query/rowsandcols/StorageAdapterRowsAndColumns.java

-        in.advance();
+      try(final FrameWriter writer = frameWriterFactory.newFrameWriter(columnSelectorFactory)) {
+        while (!cursor.isDoneOrInterrupted()) {
+          writer.addSelection();


The change in apache#16533 added CursorHolders to the processor-level Closer. This is problematic when running on an input channel: it means we started keeping around all CursorHolders for all frames we process and closing them when the channel is complete, rather than closing them as we go along. This patch restores the prior behavior, where resources are closed as we go.

* ScanQueryFrameProcessor: Close CursorHolders as we go along. The change in #16533 added CursorHolders to the processor-level Closer. This is problematic when running on an input channel: it means we started keeping around all CursorHolders for all frames we process and closing them when the channel is complete, rather than closing them as we go along. This patch restores the prior behavior, where resources are closed as we go. * Fix other call sites. * Fix reference. * Improvements.

…17175) Fixes a mistake introduced in #16533 which can result in CursorGranularizer incorrectly trying to get values from a selector after calling cursor.advance because of a missing check for cursor.isDone

…pache#17175) Fixes a mistake introduced in apache#16533 which can result in CursorGranularizer incorrectly trying to get values from a selector after calling cursor.advance because of a missing check for cursor.isDone

…17175) (#17176) Fixes a mistake introduced in #16533 which can result in CursorGranularizer incorrectly trying to get values from a selector after calling cursor.advance because of a missing check for cursor.isDone

…17152) * ScanQueryFrameProcessor: Close CursorHolders as we go along. The change in apache#16533 added CursorHolders to the processor-level Closer. This is problematic when running on an input channel: it means we started keeping around all CursorHolders for all frames we process and closing them when the channel is complete, rather than closing them as we go along. This patch restores the prior behavior, where resources are closed as we go. * Fix other call sites. * Fix reference. * Improvements.

This patch re-uses timeBoundaryInspector for each cursor holder, which enables caching of minDataTimestamp and maxDataTimestamp. Fixes a performance regression introduced in apache#16533, where these fields stopped being cached across cursors. Prior to that patch, they were cached in the QueryableIndexStorageAdapter.

This patch re-uses timeBoundaryInspector for each cursor holder, which enables caching of minDataTimestamp and maxDataTimestamp. Fixes a performance regression introduced in #16533, where these fields stopped being cached across cursors. Prior to that patch, they were cached in the QueryableIndexStorageAdapter.

This patch re-uses timeBoundaryInspector for each cursor holder, which enables caching of minDataTimestamp and maxDataTimestamp. Fixes a performance regression introduced in apache#16533, where these fields stopped being cached across cursors. Prior to that patch, they were cached in the QueryableIndexStorageAdapter.

This patch re-uses timeBoundaryInspector for each cursor holder, which enables caching of minDataTimestamp and maxDataTimestamp. Fixes a performance regression introduced in #16533, where these fields stopped being cached across cursors. Prior to that patch, they were cached in the QueryableIndexStorageAdapter. Co-authored-by: Gian Merlino <[email protected]>

changes: * fix issue where topN with query granularity other than ALL would use the heap algorithm when it was actual able to use the pooled algorithm, and incorrectly used the pool algorithm in cases where it must use the heap algorithm, a regression from apache#16533 * fix issue where topN with query granularity other than ALL could incorrectly process values in the wrong time bucket, another regression from apache#16533

* topn with granularity regression fixes changes: * fix issue where topN with query granularity other than ALL would use the heap algorithm when it was actual able to use the pooled algorithm, and incorrectly used the pool algorithm in cases where it must use the heap algorithm, a regression from #16533 * fix issue where topN with query granularity other than ALL could incorrectly process values in the wrong time bucket, another regression from #16533 * move defensive check outside of loop * more test * extra layer of safety * move check outside of loop * fix spelling * add query context parameter to allow using pooled algorithm for topN when multi-passes is required even wihen query granularity is not all * add comment, revert IT context changes and add new context flag

* topn with granularity regression fixes changes: * fix issue where topN with query granularity other than ALL would use the heap algorithm when it was actual able to use the pooled algorithm, and incorrectly used the pool algorithm in cases where it must use the heap algorithm, a regression from apache#16533 * fix issue where topN with query granularity other than ALL could incorrectly process values in the wrong time bucket, another regression from apache#16533 * move defensive check outside of loop * more test * extra layer of safety * move check outside of loop * fix spelling * add query context parameter to allow using pooled algorithm for topN when multi-passes is required even wihen query granularity is not all * add comment, revert IT context changes and add new context flag

* topn with granularity regression fixes (#17565) * topn with granularity regression fixes changes: * fix issue where topN with query granularity other than ALL would use the heap algorithm when it was actual able to use the pooled algorithm, and incorrectly used the pool algorithm in cases where it must use the heap algorithm, a regression from #16533 * fix issue where topN with query granularity other than ALL could incorrectly process values in the wrong time bucket, another regression from #16533 * move defensive check outside of loop * more test * extra layer of safety * move check outside of loop * fix spelling * add query context parameter to allow using pooled algorithm for topN when multi-passes is required even wihen query granularity is not all * add comment, revert IT context changes and add new context flag * remove unused

clintropolis added Area - Querying Area - Segment Format and Ser/De Refactoring labels Jun 1, 2024

github-actions bot added Area - Batch Ingestion Area - Ingestion Area - MSQ For multi stage queries - https://github.com/apache/druid/issues/12262 labels Jun 1, 2024

github-advanced-security bot found potential problems Jun 1, 2024

View reviewed changes

clintropolis added 2 commits June 3, 2024 15:10

add dump segment test, fixes

ff3999c

final

dc4a6b4

asdf2014 reviewed Jun 4, 2024

View reviewed changes

processing/src/main/java/org/apache/druid/segment/CursorBuildSpec.java Outdated Show resolved Hide resolved

clintropolis added the WIP label Jun 4, 2024

clintropolis added 3 commits June 4, 2024 20:50

adjustments

dca2650

Merge remote-tracking branch 'upstream/master' into cursor-rework

7263807

oops, missing unstaged from merge

5f60f65

clintropolis added Release Notes Design Review and removed WIP labels Jun 5, 2024

clintropolis added 3 commits June 11, 2024 17:52

adjust

3858f36

adjust

6718f2b

missed a spot

0b5d399

clintropolis added the WIP label Jun 13, 2024

clintropolis added 5 commits July 23, 2024 17:12

Merge remote-tracking branch 'upstream/master' into cursor-rework

f693e3e

leaky leaky

3824754

Merge remote-tracking branch 'upstream/master' into cursor-rework

d22e1cf

split out CursorMakerFactory interface, adjustments

6c32775

github-advanced-security bot found potential problems Jul 26, 2024

View reviewed changes

fixup inspections

04e5e47

github-advanced-security bot found potential problems Jul 27, 2024

View reviewed changes

clintropolis deleted the cursor-rework branch August 16, 2024 19:46

clintropolis mentioned this pull request Aug 17, 2024

remove isDescending from Query interface, move to TimeseriesQuery #16917

Merged

clintropolis mentioned this pull request Sep 2, 2024

transition away from StorageAdapter #16985

Merged

6 tasks

gianm mentioned this pull request Sep 25, 2024

ScanQueryFrameProcessor: Close CursorHolders as we go along. #17152

Merged

This was referenced Sep 26, 2024

fix issue with ScanQueryFrameProcessor cursor build not adjusting intervals #17168

Merged

fix a mistake in CursorGranularizer to check doneness after advance #17175

Merged

clintropolis mentioned this pull request Sep 28, 2024

read metadata in SimpleQueryableIndex if available to compute segment ordering #17181

Merged

kfaraz added this to the 31.0.0 milestone Oct 4, 2024

kfaraz mentioned this pull request Oct 4, 2024

[Backport] ScanQueryFrameProcessor: Close CursorHolders as we go along. (#17152) (#17168) #17245

Merged

kfaraz mentioned this pull request Oct 11, 2024

[DRAFT] 31.0.0 Release Notes #17332

Closed

clintropolis mentioned this pull request Oct 11, 2024

remove stale comment in QueryableIndexCursorHolder #17333

Merged

gianm mentioned this pull request Oct 22, 2024

QueryableIndexSegment: Re-use time boundary inspector. #17397

Merged

clintropolis mentioned this pull request Nov 14, 2024

Projections #17481

Open

clintropolis mentioned this pull request Dec 13, 2024

topn with granularity regression fixes #17565

Merged

3 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

rework cursor creation #16533

rework cursor creation #16533

clintropolis commented Jun 1, 2024 •

edited

Loading

asdf2014 left a comment

rework cursor creation #16533

rework cursor creation #16533

Conversation

clintropolis commented Jun 1, 2024 • edited Loading

Description

summary of changes

Release note

Key changed/added classes in this PR

asdf2014 left a comment

Choose a reason for hiding this comment

clintropolis commented Jun 1, 2024 •

edited

Loading