diff --git a/docs/multi-stage-query/reference.md b/docs/multi-stage-query/reference.md index 6236b6545258..010bbff2a270 100644 --- a/docs/multi-stage-query/reference.md +++ b/docs/multi-stage-query/reference.md @@ -246,6 +246,7 @@ The following table lists the context parameters for the MSQ task engine: | `durableShuffleStorage` | SELECT, INSERT, REPLACE

Whether to use durable storage for the shuffle mesh. To use this feature, configure durable storage at the server level using `druid.msq.intermediate.storage.enable=true`. If durable storage is not configured at the server level, any query that sets the context variable `durableShuffleStorage=true` fails with a configuration error.

| `false` | | `faultTolerance` | SELECT, INSERT, REPLACE

Whether to turn on fault tolerance mode. Failed workers are retried based on [Limits](#limits). Cannot be used when `durableShuffleStorage` is explicitly set to `false`. | `false` | | `selectDestination` | SELECT

Controls where the final result of the select query is written.
Use `taskReport` (the default) to write select results to the task report. This is not scalable, since the task report size grows very large for large result sets.
Use `durableStorage` to write results to a durable storage location. For large result sets, it's recommended to use `durableStorage`. To configure durable storage, see the [durable storage](#durable-storage) section. | `taskReport` | +| `waitTillSegmentsLoad` | INSERT, REPLACE

If set, the ingest query waits for the generated segments to be loaded before exiting; otherwise, the ingest query exits without waiting. When this flag is set, the task and live reports include information about the status of segment loading. This ensures that any queries issued after the ingestion exits include results from the ingestion. The drawback is that the controller task stalls until the segments are loaded. | `false` | ## Joins diff --git a/docs/querying/sql.md b/docs/querying/sql.md index 378bf302872b..13259bdf4044 100644 --- a/docs/querying/sql.md +++ b/docs/querying/sql.md @@ -57,7 +57,7 @@ Druid SQL supports SELECT queries with the following structure: [ WITH tableName [ ( column1, column2, ... ) ] AS ( query ) ] SELECT [ ALL | DISTINCT ] { * | exprs } FROM { <table> | (<subquery>) | <o1> [ INNER | LEFT ] JOIN <o2> ON condition } -[, UNNEST(source_expression) as table_alias_name(column_alias_name) ] +[ CROSS JOIN UNNEST(source_expression) as table_alias_name(column_alias_name) ] [ WHERE expr ] [ GROUP BY [ exprs | GROUPING SETS ( (exprs), ... ) | ROLLUP (exprs) | CUBE (exprs) ] ] [ HAVING expr ] @@ -97,7 +97,7 @@ The UNNEST clause unnests array values. It's the SQL equivalent to the [unnest d The following is the general syntax for UNNEST, specifically a query that returns the column that gets unnested: ```sql -SELECT column_alias_name FROM datasource, UNNEST(source_expression1) AS table_alias_name1(column_alias_name1), UNNEST(source_expression2) AS table_alias_name2(column_alias_name2), ... +SELECT column_alias_name FROM datasource CROSS JOIN UNNEST(source_expression1) AS table_alias_name1(column_alias_name1) CROSS JOIN UNNEST(source_expression2) AS table_alias_name2(column_alias_name2) ... ``` * The `datasource` for UNNEST can be any Druid datasource, such as the following: @@ -112,7 +112,7 @@ Keep the following things in mind when writing your query: - You must include the context parameter `"enableUnnest": true`. - You can unnest multiple source expressions in a single query. -- Notice the comma between the datasource and the UNNEST function. This is needed in most cases of the UNNEST function. Specifically, it is not needed when you're unnesting an inline array since the array itself is the datasource. +- Notice the CROSS JOIN between the datasource and the UNNEST function. This is needed in most cases of the UNNEST function. Specifically, it is not needed when you're unnesting an inline array since the array itself is the datasource. - If you view the native explanation of a SQL UNNEST, you'll notice that Druid uses `j0.unnest` as a virtual column to perform the unnest. An underscore is added for each unnest, so you may notice virtual columns named `_j0.unnest` or `__j0.unnest`. - UNNEST preserves the ordering of the source array that is being unnested. 
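Returning to the `waitTillSegmentsLoad` parameter documented above: in the engine it is surfaced through a small query-context helper, which the controller consults before deciding whether to block on segment loading. The snippet below is a condensed sketch of that pattern using the constant and method names from the `MultiStageQueryContext` changes later in this diff; the wrapper class name is hypothetical.

```java
import org.apache.druid.query.QueryContext;

// Condensed sketch of the context-flag pattern introduced by this patch: a boolean
// query-context key with a default, read once by the controller. The class name is
// hypothetical; the constant and method mirror the MultiStageQueryContext changes.
public class SegmentLoadWaitContextSketch
{
  public static final String CTX_SEGMENT_LOAD_WAIT = "waitTillSegmentsLoad";
  public static final boolean DEFAULT_SEGMENT_LOAD_WAIT = false;

  public static boolean shouldWaitForSegmentLoad(final QueryContext queryContext)
  {
    // Defaults to false, so ingest queries that do not set the flag keep exiting
    // as soon as segments are published instead of waiting for them to load.
    return queryContext.getBoolean(CTX_SEGMENT_LOAD_WAIT, DEFAULT_SEGMENT_LOAD_WAIT);
  }
}
```

Because the default is `false`, the wait is strictly opt-in: in the `ControllerImpl` changes below, the `SegmentLoadStatusFetcher` is only constructed and awaited when this flag is set.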
diff --git a/docs/tutorials/tutorial-unnest-arrays.md b/docs/tutorials/tutorial-unnest-arrays.md index 1f8c530f8d01..49fdfe98af25 100644 --- a/docs/tutorials/tutorial-unnest-arrays.md +++ b/docs/tutorials/tutorial-unnest-arrays.md @@ -163,7 +163,7 @@ In the results, notice that the column named `dim3` has nested values like `["a" The following is the general syntax for UNNEST: ```sql -SELECT column_alias_name FROM datasource, UNNEST(source_expression) AS table_alias_name(column_alias_name) +SELECT column_alias_name FROM datasource CROSS JOIN UNNEST(source_expression) AS table_alias_name(column_alias_name) ``` In addition, you must supply the following context parameter: @@ -179,7 +179,7 @@ For more information about the syntax, see [UNNEST](../querying/sql.md#unnest). The following query returns a column called `d3` from the table `nested_data`. `d3` contains the unnested values from the source column `dim3`: ```sql -SELECT d3 FROM "nested_data", UNNEST(MV_TO_ARRAY(dim3)) AS example_table(d3) +SELECT d3 FROM "nested_data" CROSS JOIN UNNEST(MV_TO_ARRAY(dim3)) AS example_table(d3) ``` Notice the MV_TO_ARRAY helper function, which converts the multi-value records in `dim3` to arrays. It is required since `dim3` is a multi-value string dimension. @@ -191,7 +191,7 @@ If the column you are unnesting is not a string dimension, then you do not need You can unnest into a virtual column (multiple columns treated as one). The following query returns the two source columns and a third virtual column containing the unnested data: ```sql -SELECT dim4,dim5,d45 FROM nested_data, UNNEST(ARRAY[dim4,dim5]) AS example_table(d45) +SELECT dim4,dim5,d45 FROM nested_data CROSS JOIN UNNEST(ARRAY[dim4,dim5]) AS example_table(d45) ``` The virtual column `d45` is the product of the two source columns. Notice how the total number of rows has grown. The table `nested_data` had only seven rows originally. @@ -199,7 +199,7 @@ The virtual column `d45` is the product of the two source columns. Notice how th Another way to unnest a virtual column is to concatenate them with ARRAY_CONCAT: ```sql -SELECT dim4,dim5,d45 FROM nested_data, UNNEST(ARRAY_CONCAT(dim4,dim5)) AS example_table(d45) +SELECT dim4,dim5,d45 FROM nested_data CROSS JOIN UNNEST(ARRAY_CONCAT(dim4,dim5)) AS example_table(d45) ``` Decide which method to use based on what your goals are. @@ -221,7 +221,7 @@ The example query returns the following from the `nested_data` datasource: - an unnested virtual column composed of `dim4` and `dim5` aliased to `d45` ```sql -SELECT dim3,dim4,dim5,d3,d45 FROM "nested_data", UNNEST(MV_TO_ARRAY("dim3")) AS foo1(d3), UNNEST(ARRAY[dim4,dim5]) AS foo2(d45) +SELECT dim3,dim4,dim5,d3,d45 FROM "nested_data" CROSS JOIN UNNEST(MV_TO_ARRAY("dim3")) AS foo1(d3) CROSS JOIN UNNEST(ARRAY[dim4,dim5]) AS foo2(d45) ``` @@ -230,7 +230,7 @@ SELECT dim3,dim4,dim5,d3,d45 FROM "nested_data", UNNEST(MV_TO_ARRAY("dim3")) AS The following query uses only three columns from the `nested_data` table as the datasource. From that subset, it unnests the column `dim3` into `d3` and returns `d3`. ```sql -SELECT d3 FROM (SELECT dim1, dim2, dim3 FROM "nested_data"), UNNEST(MV_TO_ARRAY(dim3)) AS example_table(d3) +SELECT d3 FROM (SELECT dim1, dim2, dim3 FROM "nested_data") CROSS JOIN UNNEST(MV_TO_ARRAY(dim3)) AS example_table(d3) ``` ### Unnest with a filter @@ -242,7 +242,7 @@ You can specify which rows to unnest by including a filter in your query. 
The fo * Returns the records for the unnested `d3` that have a `dim2` record that matches the filter ```sql -SELECT d3 FROM (SELECT * FROM nested_data WHERE dim2 IN ('abc')), UNNEST(MV_TO_ARRAY(dim3)) AS example_table(d3) +SELECT d3 FROM (SELECT * FROM nested_data WHERE dim2 IN ('abc')) CROSS JOIN UNNEST(MV_TO_ARRAY(dim3)) AS example_table(d3) ``` You can also filter the results of an UNNEST clause. The following example unnests the inline array `[1,2,3]` but only returns the rows that match the filter: @@ -257,7 +257,7 @@ This means that you can run a query like the following where Druid only return r - The value of `m1` is less than 2. ```sql -SELECT * FROM nested_data, UNNEST(MV_TO_ARRAY("dim3")) AS foo(d3) WHERE d3 IN ('b', 'd') and m1 < 2 +SELECT * FROM nested_data CROSS JOIN UNNEST(MV_TO_ARRAY("dim3")) AS foo(d3) WHERE d3 IN ('b', 'd') and m1 < 2 ``` The query only returns a single row since only one row meets the conditions. You can see the results change if you modify the filter. @@ -267,7 +267,7 @@ The query only returns a single row since only one row meets the conditions. You The following query unnests `dim3` and then performs a GROUP BY on the output `d3`. ```sql -SELECT d3 FROM nested_data, UNNEST(MV_TO_ARRAY(dim3)) AS example_table(d3) GROUP BY d3 +SELECT d3 FROM nested_data CROSS JOIN UNNEST(MV_TO_ARRAY(dim3)) AS example_table(d3) GROUP BY d3 ``` You can further transform your results by including clauses like `ORDER BY d3 DESC` or LIMIT. diff --git a/extensions-core/multi-stage-query/src/main/java/org/apache/druid/msq/exec/ControllerImpl.java b/extensions-core/multi-stage-query/src/main/java/org/apache/druid/msq/exec/ControllerImpl.java index c423b959eccf..c7b10f245c1d 100644 --- a/extensions-core/multi-stage-query/src/main/java/org/apache/druid/msq/exec/ControllerImpl.java +++ b/extensions-core/multi-stage-query/src/main/java/org/apache/druid/msq/exec/ControllerImpl.java @@ -463,14 +463,18 @@ public TaskStatus runTask(final Closer closer) } } + boolean shouldWaitForSegmentLoad = MultiStageQueryContext.shouldWaitForSegmentLoad(task.getQuerySpec().getQuery().context()); try { releaseTaskLocks(); cleanUpDurableStorageIfNeeded(); if (queryKernel != null && queryKernel.isSuccess()) { - if (segmentLoadWaiter != null) { - // If successful and there are segments created, segmentLoadWaiter should wait for them to become available. + if (shouldWaitForSegmentLoad && segmentLoadWaiter != null) { + // If successful, there are segments created and segment load is enabled, segmentLoadWaiter should wait + // for them to become available. + log.info("Controller will now wait for segments to be loaded. 
The query has already finished executing," + + " and results will be included once the segments are loaded, even if this query is cancelled now."); segmentLoadWaiter.waitForSegmentsToLoad(); } } @@ -1363,31 +1367,35 @@ private void publishAllSegments(final Set segments) throws IOExcept } } else { Set versionsToAwait = segmentsWithTombstones.stream().map(DataSegment::getVersion).collect(Collectors.toSet()); + if (MultiStageQueryContext.shouldWaitForSegmentLoad(task.getQuerySpec().getQuery().context())) { + segmentLoadWaiter = new SegmentLoadStatusFetcher( + context.injector().getInstance(BrokerClient.class), + context.jsonMapper(), + task.getId(), + task.getDataSource(), + versionsToAwait, + segmentsWithTombstones.size(), + true + ); + } + performSegmentPublish( + context.taskActionClient(), + SegmentTransactionalInsertAction.overwriteAction(null, segmentsWithTombstones) + ); + } + } else if (!segments.isEmpty()) { + Set versionsToAwait = segments.stream().map(DataSegment::getVersion).collect(Collectors.toSet()); + if (MultiStageQueryContext.shouldWaitForSegmentLoad(task.getQuerySpec().getQuery().context())) { segmentLoadWaiter = new SegmentLoadStatusFetcher( context.injector().getInstance(BrokerClient.class), context.jsonMapper(), task.getId(), task.getDataSource(), versionsToAwait, - segmentsWithTombstones.size(), + segments.size(), true ); - performSegmentPublish( - context.taskActionClient(), - SegmentTransactionalInsertAction.overwriteAction(null, segmentsWithTombstones) - ); } - } else if (!segments.isEmpty()) { - Set versionsToAwait = segments.stream().map(DataSegment::getVersion).collect(Collectors.toSet()); - segmentLoadWaiter = new SegmentLoadStatusFetcher( - context.injector().getInstance(BrokerClient.class), - context.jsonMapper(), - task.getId(), - task.getDataSource(), - versionsToAwait, - segments.size(), - true - ); // Append mode. performSegmentPublish( context.taskActionClient(), diff --git a/extensions-core/multi-stage-query/src/main/java/org/apache/druid/msq/exec/SegmentLoadStatusFetcher.java b/extensions-core/multi-stage-query/src/main/java/org/apache/druid/msq/exec/SegmentLoadStatusFetcher.java index 478c632a7491..17f46bad23a2 100644 --- a/extensions-core/multi-stage-query/src/main/java/org/apache/druid/msq/exec/SegmentLoadStatusFetcher.java +++ b/extensions-core/multi-stage-query/src/main/java/org/apache/druid/msq/exec/SegmentLoadStatusFetcher.java @@ -41,13 +41,10 @@ import javax.annotation.Nullable; import javax.ws.rs.core.MediaType; -import java.util.HashMap; -import java.util.Iterator; -import java.util.Map; import java.util.Set; -import java.util.TreeSet; import java.util.concurrent.TimeUnit; import java.util.concurrent.atomic.AtomicReference; +import java.util.stream.Collectors; /** * Class that periodically checks with the broker if all the segments generated are loaded by querying the sys table @@ -84,14 +81,14 @@ public class SegmentLoadStatusFetcher implements AutoCloseable + "COUNT(*) FILTER (WHERE is_available = 0 AND is_published = 1 AND replication_factor != 0) AS pendingSegments,\n" + "COUNT(*) FILTER (WHERE replication_factor = -1) AS unknownSegments\n" + "FROM sys.segments\n" - + "WHERE datasource = '%s' AND is_overshadowed = 0 AND version = '%s'"; + + "WHERE datasource = '%s' AND is_overshadowed = 0 AND version in (%s)"; private final BrokerClient brokerClient; private final ObjectMapper objectMapper; // Map of version vs latest load status. 
- private final Map versionToLoadStatusMap; + private final AtomicReference versionLoadStatusReference; private final String datasource; - private final Set versionsToAwait; + private final String versionsInClauseString; private final int totalSegmentsGenerated; private final boolean doWait; // since live reports fetch the value in another thread, we need to use AtomicReference @@ -112,8 +109,11 @@ public SegmentLoadStatusFetcher( this.brokerClient = brokerClient; this.objectMapper = objectMapper; this.datasource = datasource; - this.versionsToAwait = new TreeSet<>(versionsToAwait); - this.versionToLoadStatusMap = new HashMap<>(); + this.versionsInClauseString = String.join( + ",", + versionsToAwait.stream().map(s -> StringUtils.format("'%s'", s)).collect(Collectors.toSet()) + ); + this.versionLoadStatusReference = new AtomicReference<>(new VersionLoadStatus(0, 0, 0, 0, totalSegmentsGenerated)); this.totalSegmentsGenerated = totalSegmentsGenerated; this.status = new AtomicReference<>(new SegmentLoadWaiterStatus( State.INIT, @@ -145,8 +145,9 @@ public void waitForSegmentsToLoad() final AtomicReference hasAnySegmentBeenLoaded = new AtomicReference<>(false); try { FutureUtils.getUnchecked(executorService.submit(() -> { + long lastLogMillis = -TimeUnit.MINUTES.toMillis(1); try { - while (!versionsToAwait.isEmpty()) { + while (!(hasAnySegmentBeenLoaded.get() && versionLoadStatusReference.get().isLoadingComplete())) { // Check the timeout and exit if exceeded. long runningMillis = new Interval(startTime, DateTimes.nowUtc()).toDurationMillis(); if (runningMillis > TIMEOUT_DURATION_MILLIS) { @@ -159,29 +160,21 @@ public void waitForSegmentsToLoad() return; } - Iterator iterator = versionsToAwait.iterator(); - log.info( - "Fetching segment load status for datasource[%s] from broker for segment versions[%s]", - datasource, - versionsToAwait - ); - - // Query the broker for all pending versions - while (iterator.hasNext()) { - String version = iterator.next(); - - // Fetch the load status for this version from the broker - VersionLoadStatus loadStatus = fetchLoadStatusForVersion(version); - versionToLoadStatusMap.put(version, loadStatus); - hasAnySegmentBeenLoaded.set(hasAnySegmentBeenLoaded.get() || loadStatus.getUsedSegments() > 0); - - // If loading is done for this stage, remove it from future loops. - if (hasAnySegmentBeenLoaded.get() && loadStatus.isLoadingComplete()) { - iterator.remove(); - } + if (runningMillis - lastLogMillis >= TimeUnit.MINUTES.toMillis(1)) { + lastLogMillis = runningMillis; + log.info( + "Fetching segment load status for datasource[%s] from broker for segment versions[%s]", + datasource, + versionsInClauseString + ); } - if (!versionsToAwait.isEmpty()) { + // Fetch the load status from the broker + VersionLoadStatus loadStatus = fetchLoadStatusFromBroker(); + versionLoadStatusReference.set(loadStatus); + hasAnySegmentBeenLoaded.set(hasAnySegmentBeenLoaded.get() || loadStatus.getUsedSegments() > 0); + + if (!(hasAnySegmentBeenLoaded.get() && versionLoadStatusReference.get().isLoadingComplete())) { // Update the status. updateStatus(State.WAITING, startTime); // Sleep for a bit before checking again. 
@@ -216,50 +209,45 @@ private void waitIfNeeded(long waitTimeMillis) throws Exception } /** - * Updates the {@link #status} with the latest details based on {@link #versionToLoadStatusMap} + * Updates the {@link #status} with the latest details based on {@link #versionLoadStatusReference} */ private void updateStatus(State state, DateTime startTime) { - int pendingSegmentCount = 0, usedSegmentsCount = 0, precachedSegmentCount = 0, onDemandSegmentCount = 0, unknownSegmentCount = 0; - for (Map.Entry entry : versionToLoadStatusMap.entrySet()) { - usedSegmentsCount += entry.getValue().getUsedSegments(); - precachedSegmentCount += entry.getValue().getPrecachedSegments(); - onDemandSegmentCount += entry.getValue().getOnDemandSegments(); - unknownSegmentCount += entry.getValue().getUnknownSegments(); - pendingSegmentCount += entry.getValue().getPendingSegments(); - } - long runningMillis = new Interval(startTime, DateTimes.nowUtc()).toDurationMillis(); + VersionLoadStatus versionLoadStatus = versionLoadStatusReference.get(); status.set( new SegmentLoadWaiterStatus( state, startTime, runningMillis, totalSegmentsGenerated, - usedSegmentsCount, - precachedSegmentCount, - onDemandSegmentCount, - pendingSegmentCount, - unknownSegmentCount + versionLoadStatus.getUsedSegments(), + versionLoadStatus.getPrecachedSegments(), + versionLoadStatus.getOnDemandSegments(), + versionLoadStatus.getPendingSegments(), + versionLoadStatus.getUnknownSegments() ) ); } /** - * Uses {@link #brokerClient} to fetch latest load status for a given version. Converts the response into a + * Uses {@link #brokerClient} to fetch latest load status for a given set of versions. Converts the response into a * {@link VersionLoadStatus} and returns it. */ - private VersionLoadStatus fetchLoadStatusForVersion(String version) throws Exception + private VersionLoadStatus fetchLoadStatusFromBroker() throws Exception { Request request = brokerClient.makeRequest(HttpMethod.POST, "/druid/v2/sql/"); - SqlQuery sqlQuery = new SqlQuery(StringUtils.format(LOAD_QUERY, datasource, version), + SqlQuery sqlQuery = new SqlQuery(StringUtils.format(LOAD_QUERY, datasource, versionsInClauseString), ResultFormat.OBJECTLINES, false, false, false, null, null ); request.setContent(MediaType.APPLICATION_JSON, objectMapper.writeValueAsBytes(sqlQuery)); String response = brokerClient.sendQuery(request); - if (response.trim().isEmpty()) { + if (response == null) { + // Unable to query broker + return new VersionLoadStatus(0, 0, 0, 0, totalSegmentsGenerated); + } else if (response.trim().isEmpty()) { // If no segments are returned for a version, all segments have been dropped by a drop rule. 
return new VersionLoadStatus(0, 0, 0, 0, 0); } else { diff --git a/extensions-core/multi-stage-query/src/main/java/org/apache/druid/msq/util/MultiStageQueryContext.java b/extensions-core/multi-stage-query/src/main/java/org/apache/druid/msq/util/MultiStageQueryContext.java index 265f5eae0fe1..98dcd471d0fe 100644 --- a/extensions-core/multi-stage-query/src/main/java/org/apache/druid/msq/util/MultiStageQueryContext.java +++ b/extensions-core/multi-stage-query/src/main/java/org/apache/druid/msq/util/MultiStageQueryContext.java @@ -97,6 +97,8 @@ public class MultiStageQueryContext public static final String CTX_FAULT_TOLERANCE = "faultTolerance"; public static final boolean DEFAULT_FAULT_TOLERANCE = false; + public static final String CTX_SEGMENT_LOAD_WAIT = "waitTillSegmentsLoad"; + public static final boolean DEFAULT_SEGMENT_LOAD_WAIT = false; public static final String CTX_MAX_INPUT_BYTES_PER_WORKER = "maxInputBytesPerWorker"; public static final String CTX_CLUSTER_STATISTICS_MERGE_MODE = "clusterStatisticsMergeMode"; @@ -148,6 +150,14 @@ public static boolean isFaultToleranceEnabled(final QueryContext queryContext) ); } + public static boolean shouldWaitForSegmentLoad(final QueryContext queryContext) + { + return queryContext.getBoolean( + CTX_SEGMENT_LOAD_WAIT, + DEFAULT_SEGMENT_LOAD_WAIT + ); + } + public static boolean isReindex(final QueryContext queryContext) { return queryContext.getBoolean( diff --git a/processing/src/main/java/org/apache/druid/java/util/common/DefineClassUtils.java b/processing/src/main/java/org/apache/druid/java/util/common/DefineClassUtils.java index 604ef987394e..d79c517ae3d1 100644 --- a/processing/src/main/java/org/apache/druid/java/util/common/DefineClassUtils.java +++ b/processing/src/main/java/org/apache/druid/java/util/common/DefineClassUtils.java @@ -95,7 +95,7 @@ private static MethodHandle defineClassJava9(MethodHandles.Lookup lookup) throws } /** - * "Compile" a MethodHandle that is equilavent to: + * "Compile" a MethodHandle that is equivalent to: * * Class defineClass(Class targetClass, byte[] byteCode, String className) { * return Unsafe.defineClass( @@ -147,7 +147,7 @@ private static MethodHandle defineClassJava8(MethodHandles.Lookup lookup) throws // defineClass(className, byteCode, 0, length, targetClass) defineClass = MethodHandles.insertArguments(defineClass, 2, (int) 0); - // JDK8 does not implement MethodHandles.arrayLength so we have to roll our own + // JDK8 does not implement MethodHandles.arrayLength, so we have to roll our own MethodHandle arrayLength = lookup.findStatic( lookup.lookupClass(), "getArrayLength", @@ -171,6 +171,16 @@ private static MethodHandle defineClassJava8(MethodHandles.Lookup lookup) throws return defineClass; } + /** + * This method is referenced in Java 8 using method handle, therefore it is not actually unused, and shouldn't be + * removed (till Java 8 is supported) + */ + @SuppressWarnings("unused") // method is referenced and used in defineClassJava8 + static int getArrayLength(byte[] bytes) + { + return bytes.length; + } + public static Class defineClass( Class targetClass, byte[] byteCode, diff --git a/processing/src/main/java/org/apache/druid/query/IterableRowsCursorHelper.java b/processing/src/main/java/org/apache/druid/query/IterableRowsCursorHelper.java index c5bb271213e8..b4d06edc77cf 100644 --- a/processing/src/main/java/org/apache/druid/query/IterableRowsCursorHelper.java +++ b/processing/src/main/java/org/apache/druid/query/IterableRowsCursorHelper.java @@ -20,18 +20,18 @@ package 
org.apache.druid.query; import org.apache.druid.java.util.common.Intervals; +import org.apache.druid.java.util.common.Pair; import org.apache.druid.java.util.common.granularity.Granularities; import org.apache.druid.java.util.common.guava.Sequence; import org.apache.druid.java.util.common.guava.Sequences; -import org.apache.druid.java.util.common.guava.Yielder; -import org.apache.druid.java.util.common.guava.Yielders; +import org.apache.druid.segment.Cursor; import org.apache.druid.segment.RowAdapter; import org.apache.druid.segment.RowBasedCursor; import org.apache.druid.segment.RowWalker; import org.apache.druid.segment.VirtualColumns; import org.apache.druid.segment.column.RowSignature; -import java.util.Iterator; +import java.io.Closeable; /** * Helper methods to create cursor from iterable of rows @@ -43,7 +43,18 @@ public class IterableRowsCursorHelper * Creates a cursor that iterates over all the rows generated by the iterable. Presence of __time column is not a * necessity */ - public static RowBasedCursor getCursorFromIterable(Iterable rows, RowSignature rowSignature) + public static Pair getCursorFromIterable(Iterable rows, RowSignature rowSignature) + { + return getCursorFromSequence(Sequences.simple(rows), rowSignature); + } + + /** + * Creates a cursor that iterates over all the rows generated by the sequence. Presence of __time column is not a + * necessity. + *

+ * Returns a pair of cursor that iterates over the rows and closeable that cleans up the created rowWalker + */ + public static Pair getCursorFromSequence(Sequence rows, RowSignature rowSignature) { RowAdapter rowAdapter = columnName -> { if (rowSignature == null) { @@ -55,8 +66,10 @@ public static RowBasedCursor getCursorFromIterable(Iterable } return row -> row[columnIndex]; }; - RowWalker rowWalker = new RowWalker<>(Sequences.simple(rows), rowAdapter); - return new RowBasedCursor<>( + + RowWalker rowWalker = new RowWalker<>(rows, rowAdapter); + + Cursor baseCursor = new RowBasedCursor<>( rowWalker, rowAdapter, null, @@ -66,41 +79,7 @@ public static RowBasedCursor getCursorFromIterable(Iterable false, rowSignature != null ? rowSignature : RowSignature.empty() ); - } - /** - * Creates a cursor that iterates over all the rows generated by the sequence. Presence of __time column is not a - * necessity - */ - public static RowBasedCursor getCursorFromSequence(Sequence rows, RowSignature rowSignature) - { - return getCursorFromIterable( - new Iterable() - { - Yielder yielder = Yielders.each(rows); - - @Override - public Iterator iterator() - { - return new Iterator() - { - @Override - public boolean hasNext() - { - return !yielder.isDone(); - } - - @Override - public Object[] next() - { - Object[] retVal = yielder.get(); - yielder = yielder.next(null); - return retVal; - } - }; - } - }, - rowSignature - ); + return Pair.of(baseCursor, rowWalker); } } diff --git a/processing/src/main/java/org/apache/druid/query/groupby/GroupByQueryQueryToolChest.java b/processing/src/main/java/org/apache/druid/query/groupby/GroupByQueryQueryToolChest.java index 73205d2b75fa..9c746dd41429 100644 --- a/processing/src/main/java/org/apache/druid/query/groupby/GroupByQueryQueryToolChest.java +++ b/processing/src/main/java/org/apache/druid/query/groupby/GroupByQueryQueryToolChest.java @@ -43,6 +43,7 @@ import org.apache.druid.frame.write.FrameWriterUtils; import org.apache.druid.frame.write.FrameWriters; import org.apache.druid.java.util.common.ISE; +import org.apache.druid.java.util.common.Pair; import org.apache.druid.java.util.common.granularity.Granularity; import org.apache.druid.java.util.common.guava.MappedSequence; import org.apache.druid.java.util.common.guava.Sequence; @@ -72,6 +73,7 @@ import org.apache.druid.segment.column.RowSignature; import org.joda.time.DateTime; +import java.io.Closeable; import java.io.IOException; import java.util.ArrayList; import java.util.BitSet; @@ -726,12 +728,14 @@ public Optional> resultsAsFrames( ); - Cursor cursor = IterableRowsCursorHelper.getCursorFromSequence( + Pair cursorAndCloseable = IterableRowsCursorHelper.getCursorFromSequence( resultsAsArrays(query, resultSequence), rowSignature ); + Cursor cursor = cursorAndCloseable.lhs; + Closeable closeble = cursorAndCloseable.rhs; - Sequence frames = FrameCursorUtils.cursorToFrames(cursor, frameWriterFactory); + Sequence frames = FrameCursorUtils.cursorToFrames(cursor, frameWriterFactory).withBaggage(closeble); return Optional.of(frames.map(frame -> new FrameSignaturePair(frame, modifiedRowSignature))); } diff --git a/processing/src/main/java/org/apache/druid/query/scan/ScanQueryQueryToolChest.java b/processing/src/main/java/org/apache/druid/query/scan/ScanQueryQueryToolChest.java index b7253d70fe93..4d0885da00d8 100644 --- a/processing/src/main/java/org/apache/druid/query/scan/ScanQueryQueryToolChest.java +++ b/processing/src/main/java/org/apache/druid/query/scan/ScanQueryQueryToolChest.java @@ -36,12 +36,12 @@ import 
org.apache.druid.frame.write.FrameWriterUtils; import org.apache.druid.frame.write.FrameWriters; import org.apache.druid.java.util.common.ISE; +import org.apache.druid.java.util.common.Pair; import org.apache.druid.java.util.common.UOE; import org.apache.druid.java.util.common.guava.BaseSequence; import org.apache.druid.java.util.common.guava.Sequence; import org.apache.druid.java.util.common.guava.Sequences; -import org.apache.druid.java.util.common.guava.Yielder; -import org.apache.druid.java.util.common.guava.Yielders; +import org.apache.druid.java.util.common.io.Closer; import org.apache.druid.query.FrameSignaturePair; import org.apache.druid.query.GenericQueryMetricsFactory; import org.apache.druid.query.IterableRowsCursorHelper; @@ -57,6 +57,7 @@ import org.apache.druid.segment.column.RowSignature; import org.apache.druid.utils.CloseableUtils; +import java.io.Closeable; import java.util.ArrayList; import java.util.Iterator; import java.util.List; @@ -220,24 +221,7 @@ public Optional> resultsAsFrames( ) { final RowSignature defaultRowSignature = resultArraySignature(query); - Iterator resultSequenceIterator = new Iterator() - { - Yielder yielder = Yielders.each(resultSequence); - - @Override - public boolean hasNext() - { - return !yielder.isDone(); - } - - @Override - public ScanResultValue next() - { - ScanResultValue scanResultValue = yielder.get(); - yielder = yielder.next(null); - return scanResultValue; - } - }; + ScanResultValueIterator resultSequenceIterator = new ScanResultValueIterator(resultSequence); Iterable> retVal = () -> new Iterator>() { @@ -280,7 +264,7 @@ public Sequence next() ); } }; - return Optional.of(Sequences.concat(retVal)); + return Optional.of(Sequences.concat(retVal).withBaggage(resultSequenceIterator)); } private Sequence convertScanResultValuesToFrame( @@ -294,16 +278,22 @@ private Sequence convertScanResultValuesToFrame( Preconditions.checkNotNull(rowSignature, "'rowSignature' must be provided"); List cursors = new ArrayList<>(); + Closer closer = Closer.create(); for (ScanResultValue scanResultValue : batch) { final List rows = (List) scanResultValue.getEvents(); final Function mapper = getResultFormatMapper(query.getResultFormat(), rowSignature.getColumnNames()); final Iterable formattedRows = Lists.newArrayList(Iterables.transform(rows, (Function) mapper)); - cursors.add(IterableRowsCursorHelper.getCursorFromIterable( + Pair cursorAndCloseable = IterableRowsCursorHelper.getCursorFromIterable( formattedRows, rowSignature - )); + ); + Cursor cursor = cursorAndCloseable.lhs; + Closeable closeable = cursorAndCloseable.rhs; + cursors.add(cursor); + // Cursors created from iterators don't have any resources, therefore this is mostly a defensive check + closer.register(closeable); } RowSignature modifiedRowSignature = useNestedForUnknownTypes @@ -323,7 +313,7 @@ private Sequence convertScanResultValuesToFrame( frameWriterFactory ); - return frames.map(frame -> new FrameSignaturePair(frame, modifiedRowSignature)); + return frames.map(frame -> new FrameSignaturePair(frame, modifiedRowSignature)).withBaggage(closer); } @Override diff --git a/processing/src/main/java/org/apache/druid/query/scan/ScanResultValueIterator.java b/processing/src/main/java/org/apache/druid/query/scan/ScanResultValueIterator.java new file mode 100644 index 000000000000..646c69eaf185 --- /dev/null +++ b/processing/src/main/java/org/apache/druid/query/scan/ScanResultValueIterator.java @@ -0,0 +1,62 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more 
contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +package org.apache.druid.query.scan; + +import org.apache.druid.java.util.common.guava.Sequence; +import org.apache.druid.java.util.common.guava.Yielder; +import org.apache.druid.java.util.common.guava.Yielders; +import org.apache.druid.java.util.common.parsers.CloseableIterator; + +import java.io.IOException; + +/** + * Iterates over the scan result sequence and provides an interface to clean up the resources (if any) to close the + * underlying sequence. Similar to {@link Yielder}, once close is called on the iterator, the calls to the rest of the + * iterator's methods are undefined. + */ +public class ScanResultValueIterator implements CloseableIterator +{ + Yielder yielder; + + public ScanResultValueIterator(final Sequence resultSequence) + { + yielder = Yielders.each(resultSequence); + } + + @Override + public void close() throws IOException + { + yielder.close(); + } + + @Override + public boolean hasNext() + { + return !yielder.isDone(); + } + + @Override + public Object next() + { + ScanResultValue scanResultValue = yielder.get(); + yielder = yielder.next(null); + return scanResultValue; + } +} diff --git a/processing/src/main/java/org/apache/druid/query/timeseries/TimeseriesQueryQueryToolChest.java b/processing/src/main/java/org/apache/druid/query/timeseries/TimeseriesQueryQueryToolChest.java index c16fe29c14de..cd8e553bf512 100644 --- a/processing/src/main/java/org/apache/druid/query/timeseries/TimeseriesQueryQueryToolChest.java +++ b/processing/src/main/java/org/apache/druid/query/timeseries/TimeseriesQueryQueryToolChest.java @@ -38,6 +38,7 @@ import org.apache.druid.frame.write.FrameWriterUtils; import org.apache.druid.frame.write.FrameWriters; import org.apache.druid.java.util.common.DateTimes; +import org.apache.druid.java.util.common.Pair; import org.apache.druid.java.util.common.granularity.Granularities; import org.apache.druid.java.util.common.granularity.Granularity; import org.apache.druid.java.util.common.guava.Sequence; @@ -65,6 +66,7 @@ import org.apache.druid.segment.column.RowSignature; import org.joda.time.DateTime; +import java.io.Closeable; import java.util.ArrayList; import java.util.Collections; import java.util.Comparator; @@ -474,10 +476,12 @@ public Optional> resultsAsFrames( ) { final RowSignature rowSignature = resultArraySignature(query); - final Cursor cursor = IterableRowsCursorHelper.getCursorFromSequence( + final Pair cursorAndCloseable = IterableRowsCursorHelper.getCursorFromSequence( resultsAsArrays(query, resultSequence), rowSignature ); + final Cursor cursor = cursorAndCloseable.lhs; + final Closeable closeable = cursorAndCloseable.rhs; RowSignature modifiedRowSignature = useNestedForUnknownTypes ? 
FrameWriterUtils.replaceUnknownTypesWithNestedColumns(rowSignature) @@ -489,7 +493,7 @@ public Optional> resultsAsFrames( new ArrayList<>() ); - Sequence frames = FrameCursorUtils.cursorToFrames(cursor, frameWriterFactory); + Sequence frames = FrameCursorUtils.cursorToFrames(cursor, frameWriterFactory).withBaggage(closeable); // All frames are generated with the same signature therefore we can attach the row signature return Optional.of(frames.map(frame -> new FrameSignaturePair(frame, modifiedRowSignature))); diff --git a/processing/src/main/java/org/apache/druid/query/topn/TopNQueryQueryToolChest.java b/processing/src/main/java/org/apache/druid/query/topn/TopNQueryQueryToolChest.java index 80ffb3e62974..87b50e0e4677 100644 --- a/processing/src/main/java/org/apache/druid/query/topn/TopNQueryQueryToolChest.java +++ b/processing/src/main/java/org/apache/druid/query/topn/TopNQueryQueryToolChest.java @@ -35,6 +35,7 @@ import org.apache.druid.frame.write.FrameWriterUtils; import org.apache.druid.frame.write.FrameWriters; import org.apache.druid.java.util.common.ISE; +import org.apache.druid.java.util.common.Pair; import org.apache.druid.java.util.common.granularity.Granularity; import org.apache.druid.java.util.common.guava.Sequence; import org.apache.druid.java.util.common.guava.Sequences; @@ -64,6 +65,7 @@ import org.apache.druid.segment.column.RowSignature; import org.joda.time.DateTime; +import java.io.Closeable; import java.util.ArrayList; import java.util.Collections; import java.util.Iterator; @@ -558,10 +560,12 @@ public Optional> resultsAsFrames( ) { final RowSignature rowSignature = resultArraySignature(query); - final Cursor cursor = IterableRowsCursorHelper.getCursorFromSequence( + final Pair cursorAndCloseable = IterableRowsCursorHelper.getCursorFromSequence( resultsAsArrays(query, resultSequence), rowSignature ); + Cursor cursor = cursorAndCloseable.lhs; + Closeable closeable = cursorAndCloseable.rhs; RowSignature modifiedRowSignature = useNestedForUnknownTypes ? FrameWriterUtils.replaceUnknownTypesWithNestedColumns(rowSignature) @@ -573,7 +577,7 @@ public Optional> resultsAsFrames( new ArrayList<>() ); - Sequence frames = FrameCursorUtils.cursorToFrames(cursor, frameWriterFactory); + Sequence frames = FrameCursorUtils.cursorToFrames(cursor, frameWriterFactory).withBaggage(closeable); return Optional.of(frames.map(frame -> new FrameSignaturePair(frame, modifiedRowSignature))); } diff --git a/processing/src/main/java/org/apache/druid/segment/RowWalker.java b/processing/src/main/java/org/apache/druid/segment/RowWalker.java index d6241f197e85..f55245b3bca3 100644 --- a/processing/src/main/java/org/apache/druid/segment/RowWalker.java +++ b/processing/src/main/java/org/apache/druid/segment/RowWalker.java @@ -26,14 +26,19 @@ import org.joda.time.DateTime; import javax.annotation.Nullable; +import java.io.Closeable; import java.io.IOException; import java.util.function.ToLongFunction; /** * Used by {@link RowBasedStorageAdapter} and {@link RowBasedCursor} to walk through rows. It allows multiple * {@link RowBasedCursor} to share the same underlying Iterable. + * + * The class creates a yielder from the sequence to iterate over the rows. However, it doesn't call the sequence's close + * after iterating over it. {@link #close()} should be called by the instantiators of the class to clear the resources + * held by the {@link #rowSequence} and the corresponding yielder created to iterate over it. 
*/ -public class RowWalker +public class RowWalker implements Closeable { private final Sequence rowSequence; private final ToLongFunction timestampFunction; @@ -86,6 +91,7 @@ public void skipToDateTime(final DateTime timestamp, final boolean descending) } } + @Override public void close() { if (rowYielder != null) { diff --git a/processing/src/main/java/org/apache/druid/segment/nested/DictionaryIdLookup.java b/processing/src/main/java/org/apache/druid/segment/nested/DictionaryIdLookup.java index a4fd0907066b..4e46a1a529a0 100644 --- a/processing/src/main/java/org/apache/druid/segment/nested/DictionaryIdLookup.java +++ b/processing/src/main/java/org/apache/druid/segment/nested/DictionaryIdLookup.java @@ -24,7 +24,6 @@ import org.apache.druid.error.DruidException; import org.apache.druid.java.util.common.ByteBufferUtils; import org.apache.druid.java.util.common.FileUtils; -import org.apache.druid.java.util.common.ISE; import org.apache.druid.java.util.common.StringUtils; import org.apache.druid.java.util.common.io.smoosh.FileSmoosher; import org.apache.druid.java.util.common.io.smoosh.SmooshedFileMapper; @@ -99,7 +98,7 @@ public int lookupString(@Nullable String value) // for strings because of this. if other type dictionary writers could potentially use multiple internal files // in the future, we should transition them to using this approach as well (or build a combination smoosher and // mapper so that we can have a mutable smoosh) - File stringSmoosh = FileUtils.createTempDir(name + "__stringTempSmoosh"); + File stringSmoosh = FileUtils.createTempDir(StringUtils.urlEncode(name) + "__stringTempSmoosh"); final String fileName = NestedCommonFormatColumnSerializer.getInternalFileName( name, NestedCommonFormatColumnSerializer.STRING_DICTIONARY_FILE_NAME @@ -127,7 +126,7 @@ public int lookupString(@Nullable String value) final byte[] bytes = StringUtils.toUtf8Nullable(value); final int index = stringDictionary.indexOf(bytes == null ? 
null : ByteBuffer.wrap(bytes)); if (index < 0) { - throw DruidException.defensive("Value not found in string dictionary"); + throw DruidException.defensive("Value not found in column[%s] string dictionary", name); } return index; } @@ -135,7 +134,7 @@ public int lookupString(@Nullable String value) public int lookupLong(@Nullable Long value) { if (longDictionary == null) { - Path longFile = makeTempFile(name + NestedCommonFormatColumnSerializer.LONG_DICTIONARY_FILE_NAME); + final Path longFile = makeTempFile(name + NestedCommonFormatColumnSerializer.LONG_DICTIONARY_FILE_NAME); longBuffer = mapWriter(longFile, longDictionaryWriter); longDictionary = FixedIndexed.read(longBuffer, TypeStrategies.LONG, ByteOrder.nativeOrder(), Long.BYTES).get(); // reset position @@ -143,7 +142,7 @@ public int lookupLong(@Nullable Long value) } final int index = longDictionary.indexOf(value); if (index < 0) { - throw DruidException.defensive("Value not found in long dictionary"); + throw DruidException.defensive("Value not found in column[%s] long dictionary", name); } return index + longOffset(); } @@ -151,15 +150,20 @@ public int lookupLong(@Nullable Long value) public int lookupDouble(@Nullable Double value) { if (doubleDictionary == null) { - Path doubleFile = makeTempFile(name + NestedCommonFormatColumnSerializer.DOUBLE_DICTIONARY_FILE_NAME); + final Path doubleFile = makeTempFile(name + NestedCommonFormatColumnSerializer.DOUBLE_DICTIONARY_FILE_NAME); doubleBuffer = mapWriter(doubleFile, doubleDictionaryWriter); - doubleDictionary = FixedIndexed.read(doubleBuffer, TypeStrategies.DOUBLE, ByteOrder.nativeOrder(), Double.BYTES).get(); + doubleDictionary = FixedIndexed.read( + doubleBuffer, + TypeStrategies.DOUBLE, + ByteOrder.nativeOrder(), + Double.BYTES + ).get(); // reset position doubleBuffer.position(0); } final int index = doubleDictionary.indexOf(value); if (index < 0) { - throw DruidException.defensive("Value not found in double dictionary"); + throw DruidException.defensive("Value not found in column[%s] double dictionary", name); } return index + doubleOffset(); } @@ -167,7 +171,7 @@ public int lookupDouble(@Nullable Double value) public int lookupArray(@Nullable int[] value) { if (arrayDictionary == null) { - Path arrayFile = makeTempFile(name + NestedCommonFormatColumnSerializer.ARRAY_DICTIONARY_FILE_NAME); + final Path arrayFile = makeTempFile(name + NestedCommonFormatColumnSerializer.ARRAY_DICTIONARY_FILE_NAME); arrayBuffer = mapWriter(arrayFile, arrayDictionaryWriter); arrayDictionary = FrontCodedIntArrayIndexed.read(arrayBuffer, ByteOrder.nativeOrder()).get(); // reset position @@ -175,7 +179,7 @@ public int lookupArray(@Nullable int[] value) } final int index = arrayDictionary.indexOf(value); if (index < 0) { - throw DruidException.defensive("Value not found in array dictionary"); + throw DruidException.defensive("Value not found in column[%s] array dictionary", name); } return index + arrayOffset(); } @@ -239,7 +243,7 @@ private int arrayOffset() private Path makeTempFile(String name) { try { - return Files.createTempFile(name, ".tmp"); + return Files.createTempFile(StringUtils.urlEncode(name), null); } catch (IOException e) { throw new RuntimeException(e); @@ -315,7 +319,11 @@ public long write(ByteBuffer[] srcs) throws IOException public int addToOffset(long numBytesWritten) { if (numBytesWritten > bytesLeft()) { - throw new ISE("Wrote more bytes[%,d] than available[%,d]. 
Don't do that.", numBytesWritten, bytesLeft()); + throw DruidException.defensive( + "Wrote more bytes[%,d] than available[%,d]. Don't do that.", + numBytesWritten, + bytesLeft() + ); } currOffset += numBytesWritten; diff --git a/processing/src/main/java/org/apache/druid/segment/nested/FieldTypeInfo.java b/processing/src/main/java/org/apache/druid/segment/nested/FieldTypeInfo.java index c8b3ab31302e..15691cfc9c4c 100644 --- a/processing/src/main/java/org/apache/druid/segment/nested/FieldTypeInfo.java +++ b/processing/src/main/java/org/apache/druid/segment/nested/FieldTypeInfo.java @@ -151,35 +151,7 @@ public MutableTypeSet(byte types, boolean hasEmptyArray) public MutableTypeSet add(ColumnType type) { - switch (type.getType()) { - case STRING: - types |= STRING_MASK; - break; - case LONG: - types |= LONG_MASK; - break; - case DOUBLE: - types |= DOUBLE_MASK; - break; - case ARRAY: - Preconditions.checkNotNull(type.getElementType(), "ElementType must not be null"); - switch (type.getElementType().getType()) { - case STRING: - types |= STRING_ARRAY_MASK; - break; - case LONG: - types |= LONG_ARRAY_MASK; - break; - case DOUBLE: - types |= DOUBLE_ARRAY_MASK; - break; - default: - throw new ISE("Unsupported nested array type: [%s]", type.asTypeString()); - } - break; - default: - throw new ISE("Unsupported nested type: [%s]", type.asTypeString()); - } + types = FieldTypeInfo.add(types, type); return this; } @@ -207,7 +179,11 @@ public MutableTypeSet merge(byte other, boolean hasEmptyArray) @Nullable public ColumnType getSingleType() { - return FieldTypeInfo.getSingleType(types); + final ColumnType columnType = FieldTypeInfo.getSingleType(types); + if (hasEmptyArray && columnType != null && !columnType.isArray()) { + return null; + } + return columnType; } public boolean isEmpty() @@ -218,6 +194,10 @@ public boolean isEmpty() public byte getByteValue() { + final ColumnType singleType = FieldTypeInfo.getSingleType(types); + if (hasEmptyArray && singleType != null && !singleType.isArray()) { + return FieldTypeInfo.add(types, ColumnType.ofArray(singleType)); + } return types; } @@ -293,6 +273,40 @@ private static ColumnType getSingleType(byte types) } } + public static byte add(byte types, ColumnType type) + { + switch (type.getType()) { + case STRING: + types |= STRING_MASK; + break; + case LONG: + types |= LONG_MASK; + break; + case DOUBLE: + types |= DOUBLE_MASK; + break; + case ARRAY: + Preconditions.checkNotNull(type.getElementType(), "ElementType must not be null"); + switch (type.getElementType().getType()) { + case STRING: + types |= STRING_ARRAY_MASK; + break; + case LONG: + types |= LONG_ARRAY_MASK; + break; + case DOUBLE: + types |= DOUBLE_ARRAY_MASK; + break; + default: + throw new ISE("Unsupported nested array type: [%s]", type.asTypeString()); + } + break; + default: + throw new ISE("Unsupported nested type: [%s]", type.asTypeString()); + } + return types; + } + public static Set convertToSet(byte types) { final Set theTypes = Sets.newHashSetWithExpectedSize(4); diff --git a/processing/src/test/java/org/apache/druid/query/FrameBasedInlineDataSourceSerializerTest.java b/processing/src/test/java/org/apache/druid/query/FrameBasedInlineDataSourceSerializerTest.java index fbbc089255b5..e01c9459fa12 100644 --- a/processing/src/test/java/org/apache/druid/query/FrameBasedInlineDataSourceSerializerTest.java +++ b/processing/src/test/java/org/apache/druid/query/FrameBasedInlineDataSourceSerializerTest.java @@ -32,6 +32,7 @@ import org.apache.druid.frame.write.FrameWriters; import 
org.apache.druid.jackson.DefaultObjectMapper; import org.apache.druid.java.util.common.Intervals; +import org.apache.druid.java.util.common.Pair; import org.apache.druid.java.util.common.guava.Sequence; import org.apache.druid.segment.Cursor; import org.apache.druid.segment.column.ColumnType; @@ -40,6 +41,7 @@ import org.junit.Assert; import org.junit.Test; +import java.io.Closeable; import java.util.ArrayList; public class FrameBasedInlineDataSourceSerializerTest @@ -124,10 +126,11 @@ private FrameBasedInlineDataSource convertToFrameBasedInlineDataSource( RowSignature rowSignature ) { - Cursor cursor = IterableRowsCursorHelper.getCursorFromIterable( + Pair cursorAndCloseable = IterableRowsCursorHelper.getCursorFromIterable( inlineDataSource.getRows(), rowSignature ); + Cursor cursor = cursorAndCloseable.lhs; RowSignature modifiedRowSignature = FrameWriterUtils.replaceUnknownTypesWithNestedColumns(rowSignature); Sequence frames = FrameCursorUtils.cursorToFrames( cursor, @@ -139,7 +142,7 @@ private FrameBasedInlineDataSource convertToFrameBasedInlineDataSource( ) ); return new FrameBasedInlineDataSource( - frames.map(frame -> new FrameSignaturePair(frame, rowSignature)).toList(), + frames.map(frame -> new FrameSignaturePair(frame, rowSignature)).withBaggage(cursorAndCloseable.rhs).toList(), modifiedRowSignature ); } diff --git a/processing/src/test/java/org/apache/druid/query/IterableRowsCursorHelperTest.java b/processing/src/test/java/org/apache/druid/query/IterableRowsCursorHelperTest.java index 1acaceabbd60..45f14b80976c 100644 --- a/processing/src/test/java/org/apache/druid/query/IterableRowsCursorHelperTest.java +++ b/processing/src/test/java/org/apache/druid/query/IterableRowsCursorHelperTest.java @@ -48,7 +48,7 @@ public class IterableRowsCursorHelperTest @Test public void getCursorFromIterable() { - Cursor cursor = IterableRowsCursorHelper.getCursorFromIterable(rows, rowSignature); + Cursor cursor = IterableRowsCursorHelper.getCursorFromIterable(rows, rowSignature).lhs; testCursorMatchesRowSequence(cursor, rowSignature, rows); } @@ -56,7 +56,7 @@ public void getCursorFromIterable() public void getCursorFromSequence() { - Cursor cursor = IterableRowsCursorHelper.getCursorFromSequence(Sequences.simple(rows), rowSignature); + Cursor cursor = IterableRowsCursorHelper.getCursorFromSequence(Sequences.simple(rows), rowSignature).lhs; testCursorMatchesRowSequence(cursor, rowSignature, rows); } diff --git a/processing/src/test/java/org/apache/druid/query/scan/NestedDataScanQueryTest.java b/processing/src/test/java/org/apache/druid/query/scan/NestedDataScanQueryTest.java index 0d91e7d5e001..fcf6720311f0 100644 --- a/processing/src/test/java/org/apache/druid/query/scan/NestedDataScanQueryTest.java +++ b/processing/src/test/java/org/apache/druid/query/scan/NestedDataScanQueryTest.java @@ -787,12 +787,12 @@ public void testIngestAndScanSegmentsRealtimeSchemaDiscoveryTypeGauntlet() throw Assert.assertEquals(resultsRealtime.size(), resultsSegments.size()); if (NullHandling.replaceWithDefault()) { Assert.assertEquals( - "[[1672531200000, null, 0, 0.0, 1, 51, -0.13, 1, [], [51, -35], {a=700, b={x=g, y=1.1, z=[9, null, 9, 9]}}, {x=400, y=[{l=[null], m=100, n=5}, {l=[a, b, c], m=a, n=1}], z={}}, null, [a, b], null, [2, 3], null, [null], null, [1, 0, 1], null, [{x=1}, {x=2}], null, hello, 1234, 1.234, {x=1, y=hello, z={a=1.1, b=1234, c=[a, b, c]}}, [a, b, c], [1, 2, 3], [1.1, 2.2, 3.3], [], {}, [null, null], [{}, {}, {}], [{a=b, x=1, y=1.3}], 1], [1672531200000, null, 2, 0.0, 0, b, 1.1, b, 2, b, 
{a=200, b={x=b, y=1.1, z=[2, 4, 6]}}, {x=10, y=[{l=[b, b, c], m=b, n=2}, [1, 2, 3]], z={a=[5.5], b=false}}, [a, b, c], [null, b], [2, 3], null, [3.3, 4.4, 5.5], [999.0, null, 5.5], [null, null, 2.2], [1, 1], [null, [null], []], [{x=3}, {x=4}], null, hello, 1234, 1.234, {x=1, y=hello, z={a=1.1, b=1234, c=[a, b, c]}}, [a, b, c], [1, 2, 3], [1.1, 2.2, 3.3], [], {}, [null, null], [{}, {}, {}], [{a=b, x=1, y=1.3}], 1], [1672531200000, a, 1, 1.0, 1, 1, 1, 1, 1, 1, {a=100, b={x=a, y=1.1, z=[1, 2, 3, 4]}}, {x=1234, y=[{l=[a, b, c], m=a, n=1}, {l=[a, b, c], m=a, n=1}], z={a=[1.1, 2.2, 3.3], b=true}}, [a, b], [a, b], [1, 2, 3], [1, null, 3], [1.1, 2.2, 3.3], [1.1, 2.2, null], [a, 1, 2.2], [1, 0, 1], [[1, 2, null], [3, 4]], [{x=1}, {x=2}], null, hello, 1234, 1.234, {x=1, y=hello, z={a=1.1, b=1234, c=[a, b, c]}}, [a, b, c], [1, 2, 3], [1.1, 2.2, 3.3], [], {}, [null, null], [{}, {}, {}], [{a=b, x=1, y=1.3}], 1], [1672531200000, b, 4, 3.3, 1, 1, 0.0, {}, 4, 1, {a=400, b={x=d, y=1.1, z=[3, 4]}}, {x=1234, z={a=[1.1, 2.2, 3.3], b=true}}, [d, e], [b, b], [1, 4], [1], [2.2, 3.3, 4.0], null, [a, b, c], [null, 0, 1], [[1, 2], [3, 4], [5, 6, 7]], [{x=null}, {x=2}], null, hello, 1234, 1.234, {x=1, y=hello, z={a=1.1, b=1234, c=[a, b, c]}}, [a, b, c], [1, 2, 3], [1.1, 2.2, 3.3], [], {}, [null, null], [{}, {}, {}], [{a=b, x=1, y=1.3}], 1], [1672531200000, c, 0, 4.4, 1, hello, -1000, {}, [], hello, {a=500, b={x=e, z=[1, 2, 3, 4]}}, {x=11, y=[], z={a=[null], b=false}}, null, null, [1, 2, 3], [], [1.1, 2.2, 3.3], null, null, [0], null, [{x=1000}, {y=2000}], null, hello, 1234, 1.234, {x=1, y=hello, z={a=1.1, b=1234, c=[a, b, c]}}, [a, b, c], [1, 2, 3], [1.1, 2.2, 3.3], [], {}, [null, null], [{}, {}, {}], [{a=b, x=1, y=1.3}], 1], [1672531200000, d, 5, 5.9, 0, null, 3.33, a, 6, null, {a=600, b={x=f, y=1.1, z=[6, 7, 8, 9]}}, null, [a, b], null, null, [null, 2, 9], null, [999.0, 5.5, null], [a, 1, 2.2], [], [[1], [1, 2, null]], [{a=1}, {b=2}], null, hello, 1234, 1.234, {x=1, y=hello, z={a=1.1, b=1234, c=[a, b, c]}}, [a, b, c], [1, 2, 3], [1.1, 2.2, 3.3], [], {}, [null, null], [{}, {}, {}], [{a=b, x=1, y=1.3}], 1], [1672531200000, null, 3, 2.0, 0, 3.0, 1.0, 3.3, 3, 3.0, {a=300}, {x=4.4, y=[{l=[], m=100, n=3}, {l=[a]}, {l=[b], n=[]}], z={a=[], b=true}}, [b, c], [d, null, b], [1, 2, 3, 4], [1, 2, 3], [1.1, 3.3], [null, 2.2, null], [1, null, 1], [1, null, 1], [[1], null, [1, 2, 3]], [null, {x=2}], null, hello, 1234, 1.234, {x=1, y=hello, z={a=1.1, b=1234, c=[a, b, c]}}, [a, b, c], [1, 2, 3], [1.1, 2.2, 3.3], [], {}, [null, null], [{}, {}, {}], [{a=b, x=1, y=1.3}], 1]]", + "[[1672531200000, null, 0, 0.0, 1, 51, -0.13, 1, [], [51, -35], {a=700, b={x=g, y=1.1, z=[9, null, 9, 9]}, v=[]}, {x=400, y=[{l=[null], m=100, n=5}, {l=[a, b, c], m=a, n=1}], z={}}, null, [a, b], null, [2, 3], null, [null], null, [1, 0, 1], null, [{x=1}, {x=2}], null, hello, 1234, 1.234, {x=1, y=hello, z={a=1.1, b=1234, c=[a, b, c]}}, [a, b, c], [1, 2, 3], [1.1, 2.2, 3.3], [], {}, [null, null], [{}, {}, {}], [{a=b, x=1, y=1.3}], 1], [1672531200000, null, 2, 0.0, 0, b, 1.1, b, 2, b, {a=200, b={x=b, y=1.1, z=[2, 4, 6]}, v=[]}, {x=10, y=[{l=[b, b, c], m=b, n=2}, [1, 2, 3]], z={a=[5.5], b=false}}, [a, b, c], [null, b], [2, 3], null, [3.3, 4.4, 5.5], [999.0, null, 5.5], [null, null, 2.2], [1, 1], [null, [null], []], [{x=3}, {x=4}], null, hello, 1234, 1.234, {x=1, y=hello, z={a=1.1, b=1234, c=[a, b, c]}}, [a, b, c], [1, 2, 3], [1.1, 2.2, 3.3], [], {}, [null, null], [{}, {}, {}], [{a=b, x=1, y=1.3}], 1], [1672531200000, a, 1, 1.0, 1, 1, 1, 1, 1, 1, {a=100, b={x=a, 
y=1.1, z=[1, 2, 3, 4]}, v=[]}, {x=1234, y=[{l=[a, b, c], m=a, n=1}, {l=[a, b, c], m=a, n=1}], z={a=[1.1, 2.2, 3.3], b=true}}, [a, b], [a, b], [1, 2, 3], [1, null, 3], [1.1, 2.2, 3.3], [1.1, 2.2, null], [a, 1, 2.2], [1, 0, 1], [[1, 2, null], [3, 4]], [{x=1}, {x=2}], null, hello, 1234, 1.234, {x=1, y=hello, z={a=1.1, b=1234, c=[a, b, c]}}, [a, b, c], [1, 2, 3], [1.1, 2.2, 3.3], [], {}, [null, null], [{}, {}, {}], [{a=b, x=1, y=1.3}], 1], [1672531200000, b, 4, 3.3, 1, 1, 0.0, {}, 4, 1, {a=400, b={x=d, y=1.1, z=[3, 4]}, v=[]}, {x=1234, z={a=[1.1, 2.2, 3.3], b=true}}, [d, e], [b, b], [1, 4], [1], [2.2, 3.3, 4.0], null, [a, b, c], [null, 0, 1], [[1, 2], [3, 4], [5, 6, 7]], [{x=null}, {x=2}], null, hello, 1234, 1.234, {x=1, y=hello, z={a=1.1, b=1234, c=[a, b, c]}}, [a, b, c], [1, 2, 3], [1.1, 2.2, 3.3], [], {}, [null, null], [{}, {}, {}], [{a=b, x=1, y=1.3}], 1], [1672531200000, c, 0, 4.4, 1, hello, -1000, {}, [], hello, {a=500, b={x=e, z=[1, 2, 3, 4]}, v=a}, {x=11, y=[], z={a=[null], b=false}}, null, null, [1, 2, 3], [], [1.1, 2.2, 3.3], null, null, [0], null, [{x=1000}, {y=2000}], null, hello, 1234, 1.234, {x=1, y=hello, z={a=1.1, b=1234, c=[a, b, c]}}, [a, b, c], [1, 2, 3], [1.1, 2.2, 3.3], [], {}, [null, null], [{}, {}, {}], [{a=b, x=1, y=1.3}], 1], [1672531200000, d, 5, 5.9, 0, null, 3.33, a, 6, null, {a=600, b={x=f, y=1.1, z=[6, 7, 8, 9]}, v=b}, null, [a, b], null, null, [null, 2, 9], null, [999.0, 5.5, null], [a, 1, 2.2], [], [[1], [1, 2, null]], [{a=1}, {b=2}], null, hello, 1234, 1.234, {x=1, y=hello, z={a=1.1, b=1234, c=[a, b, c]}}, [a, b, c], [1, 2, 3], [1.1, 2.2, 3.3], [], {}, [null, null], [{}, {}, {}], [{a=b, x=1, y=1.3}], 1], [1672531200000, null, 3, 2.0, 0, 3.0, 1.0, 3.3, 3, 3.0, {a=300}, {x=4.4, y=[{l=[], m=100, n=3}, {l=[a]}, {l=[b], n=[]}], z={a=[], b=true}}, [b, c], [d, null, b], [1, 2, 3, 4], [1, 2, 3], [1.1, 3.3], [null, 2.2, null], [1, null, 1], [1, null, 1], [[1], null, [1, 2, 3]], [null, {x=2}], null, hello, 1234, 1.234, {x=1, y=hello, z={a=1.1, b=1234, c=[a, b, c]}}, [a, b, c], [1, 2, 3], [1.1, 2.2, 3.3], [], {}, [null, null], [{}, {}, {}], [{a=b, x=1, y=1.3}], 1]]", resultsSegments.get(0).getEvents().toString() ); } else { Assert.assertEquals( - "[[1672531200000, null, null, null, 1, 51, -0.13, 1, [], [51, -35], {a=700, b={x=g, y=1.1, z=[9, null, 9, 9]}}, {x=400, y=[{l=[null], m=100, n=5}, {l=[a, b, c], m=a, n=1}], z={}}, null, [a, b], null, [2, 3], null, [null], null, [1, 0, 1], null, [{x=1}, {x=2}], null, hello, 1234, 1.234, {x=1, y=hello, z={a=1.1, b=1234, c=[a, b, c]}}, [a, b, c], [1, 2, 3], [1.1, 2.2, 3.3], [], {}, [null, null], [{}, {}, {}], [{a=b, x=1, y=1.3}], 1], [1672531200000, , 2, null, 0, b, 1.1, b, 2, b, {a=200, b={x=b, y=1.1, z=[2, 4, 6]}}, {x=10, y=[{l=[b, b, c], m=b, n=2}, [1, 2, 3]], z={a=[5.5], b=false}}, [a, b, c], [null, b], [2, 3], null, [3.3, 4.4, 5.5], [999.0, null, 5.5], [null, null, 2.2], [1, 1], [null, [null], []], [{x=3}, {x=4}], null, hello, 1234, 1.234, {x=1, y=hello, z={a=1.1, b=1234, c=[a, b, c]}}, [a, b, c], [1, 2, 3], [1.1, 2.2, 3.3], [], {}, [null, null], [{}, {}, {}], [{a=b, x=1, y=1.3}], 1], [1672531200000, a, 1, 1.0, 1, 1, 1, 1, 1, 1, {a=100, b={x=a, y=1.1, z=[1, 2, 3, 4]}}, {x=1234, y=[{l=[a, b, c], m=a, n=1}, {l=[a, b, c], m=a, n=1}], z={a=[1.1, 2.2, 3.3], b=true}}, [a, b], [a, b], [1, 2, 3], [1, null, 3], [1.1, 2.2, 3.3], [1.1, 2.2, null], [a, 1, 2.2], [1, 0, 1], [[1, 2, null], [3, 4]], [{x=1}, {x=2}], null, hello, 1234, 1.234, {x=1, y=hello, z={a=1.1, b=1234, c=[a, b, c]}}, [a, b, c], [1, 2, 3], [1.1, 2.2, 3.3], [], {}, [null, 
null], [{}, {}, {}], [{a=b, x=1, y=1.3}], 1], [1672531200000, b, 4, 3.3, 1, 1, null, {}, 4, 1, {a=400, b={x=d, y=1.1, z=[3, 4]}}, {x=1234, z={a=[1.1, 2.2, 3.3], b=true}}, [d, e], [b, b], [1, 4], [1], [2.2, 3.3, 4.0], null, [a, b, c], [null, 0, 1], [[1, 2], [3, 4], [5, 6, 7]], [{x=null}, {x=2}], null, hello, 1234, 1.234, {x=1, y=hello, z={a=1.1, b=1234, c=[a, b, c]}}, [a, b, c], [1, 2, 3], [1.1, 2.2, 3.3], [], {}, [null, null], [{}, {}, {}], [{a=b, x=1, y=1.3}], 1], [1672531200000, c, null, 4.4, 1, hello, -1000, {}, [], hello, {a=500, b={x=e, z=[1, 2, 3, 4]}}, {x=11, y=[], z={a=[null], b=false}}, null, null, [1, 2, 3], [], [1.1, 2.2, 3.3], null, null, [0], null, [{x=1000}, {y=2000}], null, hello, 1234, 1.234, {x=1, y=hello, z={a=1.1, b=1234, c=[a, b, c]}}, [a, b, c], [1, 2, 3], [1.1, 2.2, 3.3], [], {}, [null, null], [{}, {}, {}], [{a=b, x=1, y=1.3}], 1], [1672531200000, d, 5, 5.9, 0, null, 3.33, a, 6, null, {a=600, b={x=f, y=1.1, z=[6, 7, 8, 9]}}, null, [a, b], null, null, [null, 2, 9], null, [999.0, 5.5, null], [a, 1, 2.2], [], [[1], [1, 2, null]], [{a=1}, {b=2}], null, hello, 1234, 1.234, {x=1, y=hello, z={a=1.1, b=1234, c=[a, b, c]}}, [a, b, c], [1, 2, 3], [1.1, 2.2, 3.3], [], {}, [null, null], [{}, {}, {}], [{a=b, x=1, y=1.3}], 1], [1672531200000, null, 3, 2.0, null, 3.0, 1.0, 3.3, 3, 3.0, {a=300}, {x=4.4, y=[{l=[], m=100, n=3}, {l=[a]}, {l=[b], n=[]}], z={a=[], b=true}}, [b, c], [d, null, b], [1, 2, 3, 4], [1, 2, 3], [1.1, 3.3], [null, 2.2, null], [1, null, 1], [1, null, 1], [[1], null, [1, 2, 3]], [null, {x=2}], null, hello, 1234, 1.234, {x=1, y=hello, z={a=1.1, b=1234, c=[a, b, c]}}, [a, b, c], [1, 2, 3], [1.1, 2.2, 3.3], [], {}, [null, null], [{}, {}, {}], [{a=b, x=1, y=1.3}], 1]]", + "[[1672531200000, null, null, null, 1, 51, -0.13, 1, [], [51, -35], {a=700, b={x=g, y=1.1, z=[9, null, 9, 9]}, v=[]}, {x=400, y=[{l=[null], m=100, n=5}, {l=[a, b, c], m=a, n=1}], z={}}, null, [a, b], null, [2, 3], null, [null], null, [1, 0, 1], null, [{x=1}, {x=2}], null, hello, 1234, 1.234, {x=1, y=hello, z={a=1.1, b=1234, c=[a, b, c]}}, [a, b, c], [1, 2, 3], [1.1, 2.2, 3.3], [], {}, [null, null], [{}, {}, {}], [{a=b, x=1, y=1.3}], 1], [1672531200000, , 2, null, 0, b, 1.1, b, 2, b, {a=200, b={x=b, y=1.1, z=[2, 4, 6]}, v=[]}, {x=10, y=[{l=[b, b, c], m=b, n=2}, [1, 2, 3]], z={a=[5.5], b=false}}, [a, b, c], [null, b], [2, 3], null, [3.3, 4.4, 5.5], [999.0, null, 5.5], [null, null, 2.2], [1, 1], [null, [null], []], [{x=3}, {x=4}], null, hello, 1234, 1.234, {x=1, y=hello, z={a=1.1, b=1234, c=[a, b, c]}}, [a, b, c], [1, 2, 3], [1.1, 2.2, 3.3], [], {}, [null, null], [{}, {}, {}], [{a=b, x=1, y=1.3}], 1], [1672531200000, a, 1, 1.0, 1, 1, 1, 1, 1, 1, {a=100, b={x=a, y=1.1, z=[1, 2, 3, 4]}, v=[]}, {x=1234, y=[{l=[a, b, c], m=a, n=1}, {l=[a, b, c], m=a, n=1}], z={a=[1.1, 2.2, 3.3], b=true}}, [a, b], [a, b], [1, 2, 3], [1, null, 3], [1.1, 2.2, 3.3], [1.1, 2.2, null], [a, 1, 2.2], [1, 0, 1], [[1, 2, null], [3, 4]], [{x=1}, {x=2}], null, hello, 1234, 1.234, {x=1, y=hello, z={a=1.1, b=1234, c=[a, b, c]}}, [a, b, c], [1, 2, 3], [1.1, 2.2, 3.3], [], {}, [null, null], [{}, {}, {}], [{a=b, x=1, y=1.3}], 1], [1672531200000, b, 4, 3.3, 1, 1, null, {}, 4, 1, {a=400, b={x=d, y=1.1, z=[3, 4]}, v=[]}, {x=1234, z={a=[1.1, 2.2, 3.3], b=true}}, [d, e], [b, b], [1, 4], [1], [2.2, 3.3, 4.0], null, [a, b, c], [null, 0, 1], [[1, 2], [3, 4], [5, 6, 7]], [{x=null}, {x=2}], null, hello, 1234, 1.234, {x=1, y=hello, z={a=1.1, b=1234, c=[a, b, c]}}, [a, b, c], [1, 2, 3], [1.1, 2.2, 3.3], [], {}, [null, null], [{}, {}, {}], [{a=b, x=1, 
y=1.3}], 1], [1672531200000, c, null, 4.4, 1, hello, -1000, {}, [], hello, {a=500, b={x=e, z=[1, 2, 3, 4]}, v=a}, {x=11, y=[], z={a=[null], b=false}}, null, null, [1, 2, 3], [], [1.1, 2.2, 3.3], null, null, [0], null, [{x=1000}, {y=2000}], null, hello, 1234, 1.234, {x=1, y=hello, z={a=1.1, b=1234, c=[a, b, c]}}, [a, b, c], [1, 2, 3], [1.1, 2.2, 3.3], [], {}, [null, null], [{}, {}, {}], [{a=b, x=1, y=1.3}], 1], [1672531200000, d, 5, 5.9, 0, null, 3.33, a, 6, null, {a=600, b={x=f, y=1.1, z=[6, 7, 8, 9]}, v=b}, null, [a, b], null, null, [null, 2, 9], null, [999.0, 5.5, null], [a, 1, 2.2], [], [[1], [1, 2, null]], [{a=1}, {b=2}], null, hello, 1234, 1.234, {x=1, y=hello, z={a=1.1, b=1234, c=[a, b, c]}}, [a, b, c], [1, 2, 3], [1.1, 2.2, 3.3], [], {}, [null, null], [{}, {}, {}], [{a=b, x=1, y=1.3}], 1], [1672531200000, null, 3, 2.0, null, 3.0, 1.0, 3.3, 3, 3.0, {a=300}, {x=4.4, y=[{l=[], m=100, n=3}, {l=[a]}, {l=[b], n=[]}], z={a=[], b=true}}, [b, c], [d, null, b], [1, 2, 3, 4], [1, 2, 3], [1.1, 3.3], [null, 2.2, null], [1, null, 1], [1, null, 1], [[1], null, [1, 2, 3]], [null, {x=2}], null, hello, 1234, 1.234, {x=1, y=hello, z={a=1.1, b=1234, c=[a, b, c]}}, [a, b, c], [1, 2, 3], [1.1, 2.2, 3.3], [], {}, [null, null], [{}, {}, {}], [{a=b, x=1, y=1.3}], 1]]", resultsSegments.get(0).getEvents().toString() ); } diff --git a/processing/src/test/java/org/apache/druid/segment/join/table/FrameBasedIndexedTableTest.java b/processing/src/test/java/org/apache/druid/segment/join/table/FrameBasedIndexedTableTest.java index ed59f5f80652..6a244e9bae97 100644 --- a/processing/src/test/java/org/apache/druid/segment/join/table/FrameBasedIndexedTableTest.java +++ b/processing/src/test/java/org/apache/druid/segment/join/table/FrameBasedIndexedTableTest.java @@ -33,6 +33,7 @@ import org.apache.druid.frame.write.FrameWriterFactory; import org.apache.druid.frame.write.FrameWriters; import org.apache.druid.java.util.common.IAE; +import org.apache.druid.java.util.common.Pair; import org.apache.druid.java.util.common.StringUtils; import org.apache.druid.java.util.common.io.Closer; import org.apache.druid.query.FrameBasedInlineDataSource; @@ -42,10 +43,12 @@ import org.apache.druid.segment.column.ColumnType; import org.apache.druid.segment.column.RowSignature; import org.apache.druid.testing.InitializedNullHandlingTest; +import org.junit.After; import org.junit.Assert; import org.junit.Before; import org.junit.Test; +import java.io.Closeable; import java.io.IOException; import java.util.ArrayList; import java.util.Arrays; @@ -207,11 +210,13 @@ public class FrameBasedIndexedTableTest extends InitializedNullHandlingTest private FrameBasedInlineDataSource dataSource; private FrameBasedIndexedTable frameBasedIndexedTable; + private Pair cursorCloseablePair; @Before public void setup() { - Cursor cursor = IterableRowsCursorHelper.getCursorFromIterable(DATASOURCE_ROWS, ROW_SIGNATURE); + cursorCloseablePair = IterableRowsCursorHelper.getCursorFromIterable(DATASOURCE_ROWS, ROW_SIGNATURE); + Cursor cursor = cursorCloseablePair.lhs; FrameWriterFactory frameWriterFactory = FrameWriters.makeFrameWriterFactory( FrameType.COLUMNAR, new SingleMemoryAllocatorFactory(HeapMemoryAllocator.unlimited()), @@ -226,7 +231,12 @@ public void setup() ); frameBasedIndexedTable = new FrameBasedIndexedTable(dataSource, KEY_COLUMNS, "test"); + } + @After + public void tearDown() throws IOException + { + cursorCloseablePair.rhs.close(); } @Test diff --git a/processing/src/test/java/org/apache/druid/segment/nested/NestedDataColumnSupplierTest.java 
b/processing/src/test/java/org/apache/druid/segment/nested/NestedDataColumnSupplierTest.java index 80daa3549dcf..653b39ff9bf7 100644 --- a/processing/src/test/java/org/apache/druid/segment/nested/NestedDataColumnSupplierTest.java +++ b/processing/src/test/java/org/apache/druid/segment/nested/NestedDataColumnSupplierTest.java @@ -171,7 +171,7 @@ public static void staticSetup() @Before public void setup() throws IOException { - final String fileNameBase = "test"; + final String fileNameBase = "test/column"; final String arrayFileNameBase = "array"; fileMapper = smooshify(fileNameBase, tempFolder.newFolder(), data); baseBuffer = fileMapper.mapFile(fileNameBase); diff --git a/processing/src/test/java/org/apache/druid/segment/nested/NestedFieldTypeInfoTest.java b/processing/src/test/java/org/apache/druid/segment/nested/NestedFieldTypeInfoTest.java index 0e8d95cc57c9..33df1887ea54 100644 --- a/processing/src/test/java/org/apache/druid/segment/nested/NestedFieldTypeInfoTest.java +++ b/processing/src/test/java/org/apache/druid/segment/nested/NestedFieldTypeInfoTest.java @@ -56,6 +56,23 @@ public void testSingleType() throws IOException } } + @Test + public void testSingleTypeWithEmptyArray() throws IOException + { + List<ColumnType> supportedTypes = ImmutableList.of( + ColumnType.STRING, + ColumnType.LONG, + ColumnType.DOUBLE, + ColumnType.STRING_ARRAY, + ColumnType.LONG_ARRAY, + ColumnType.DOUBLE_ARRAY + ); + + for (ColumnType type : supportedTypes) { + testSingleTypeWithEmptyArray(type); + } + } + @Test public void testMultiType() throws IOException { @@ -137,6 +154,51 @@ private void testMultiType(Set<ColumnType> columnTypes) throws IOException Assert.assertEquals(1, BUFFER.position()); } + private void testSingleTypeWithEmptyArray(ColumnType columnType) throws IOException + { + FieldTypeInfo.MutableTypeSet typeSet = new FieldTypeInfo.MutableTypeSet(); + typeSet.add(columnType); + typeSet.addUntypedArray(); + + if (columnType.isArray()) { + // arrays with empty arrays are still single type + Assert.assertEquals(columnType, typeSet.getSingleType()); + Assert.assertEquals(ImmutableSet.of(columnType), FieldTypeInfo.convertToSet(typeSet.getByteValue())); + + writeTypeSet(typeSet); + FieldTypeInfo info = new FieldTypeInfo(BUFFER); + Assert.assertEquals(0, BUFFER.position()); + + FieldTypeInfo.TypeSet roundTrip = info.getTypes(0); + Assert.assertEquals(columnType, roundTrip.getSingleType()); + + FieldTypeInfo info2 = FieldTypeInfo.read(BUFFER, 1); + Assert.assertEquals(info.getTypes(0), info2.getTypes(0)); + Assert.assertEquals(1, BUFFER.position()); + } else { + // scalar types become multi-type + Set<ColumnType> columnTypes = ImmutableSet.of(columnType, ColumnType.ofArray(columnType)); + FieldTypeInfo.MutableTypeSet merge = new FieldTypeInfo.MutableTypeSet(); + merge.merge(new FieldTypeInfo.MutableTypeSet().add(columnType).getByteValue(), true); + + Assert.assertEquals(merge.getByteValue(), typeSet.getByteValue()); + Assert.assertNull(typeSet.getSingleType()); + Assert.assertEquals(columnTypes, FieldTypeInfo.convertToSet(typeSet.getByteValue())); + + writeTypeSet(typeSet); + FieldTypeInfo info = new FieldTypeInfo(BUFFER); + Assert.assertEquals(0, BUFFER.position()); + + FieldTypeInfo.TypeSet roundTrip = info.getTypes(0); + Assert.assertNull(roundTrip.getSingleType()); + Assert.assertEquals(columnTypes, FieldTypeInfo.convertToSet(roundTrip.getByteValue())); + + FieldTypeInfo info2 = FieldTypeInfo.read(BUFFER, 1); + Assert.assertEquals(info.getTypes(0), info2.getTypes(0)); + Assert.assertEquals(1, BUFFER.position()); + } + } + private
static void writeTypeSet(FieldTypeInfo.MutableTypeSet typeSet) throws IOException { BUFFER.position(0); diff --git a/processing/src/test/resources/nested-all-types-test-data.json b/processing/src/test/resources/nested-all-types-test-data.json index 34d92b52ae82..b70c87646019 100644 --- a/processing/src/test/resources/nested-all-types-test-data.json +++ b/processing/src/test/resources/nested-all-types-test-data.json @@ -1,7 +1,7 @@ -{"timestamp": "2023-01-01T00:00:00", "str":"a", "long":1, "double":1.0, "bool": true, "variant": 1, "variantNumeric": 1, "variantEmptyObj":1, "variantEmtpyArray":1, "variantWithArrays": 1, "obj":{"a": 100, "b": {"x": "a", "y": 1.1, "z": [1, 2, 3, 4]}}, "complexObj":{"x": 1234, "y": [{"l": ["a", "b", "c"], "m": "a", "n": 1},{"l": ["a", "b", "c"], "m": "a", "n": 1}], "z": {"a": [1.1, 2.2, 3.3], "b": true}}, "arrayString": ["a", "b"], "arrayStringNulls": ["a", "b"], "arrayLong":[1, 2, 3], "arrayLongNulls":[1, null,3], "arrayDouble":[1.1, 2.2, 3.3], "arrayDoubleNulls":[1.1, 2.2, null], "arrayVariant":["a", 1, 2.2], "arrayBool":[true, false, true], "arrayNestedLong":[[1, 2, null], [3, 4]], "arrayObject":[{"x": 1},{"x":2}], "null": null, "cstr": "hello", "clong": 1234, "cdouble": 1.234, "cObj":{"x": 1, "y": "hello", "z": {"a": 1.1, "b": 1234, "c": ["a", "b", "c"]}}, "cstringArray": ["a", "b", "c"], "cLongArray": [1, 2, 3], "cDoubleArray": [1.1, 2.2, 3.3], "cEmptyArray":[], "cEmptyObj":{}, "cNullArray": [null, null], "cEmptyObjectArray": [{},{},{}], "cObjectArray": [{"a":"b", "x":1, "y":1.3}]} -{"timestamp": "2023-01-01T00:00:00", "str":"", "long":2, "bool": false, "variant": "b", "variantNumeric": 1.1, "variantEmptyObj":"b", "variantEmtpyArray":2, "variantWithArrays": "b", "obj":{"a": 200, "b": {"x": "b", "y": 1.1, "z": [2, 4, 6]}}, "complexObj":{"x": 10, "y": [{"l": ["b", "b", "c"], "m": "b", "n": 2}, [1, 2, 3]], "z": {"a": [5.5], "b": false}}, "arrayString": ["a", "b", "c"], "arrayStringNulls": [null, "b"], "arrayLong":[2, 3], "arrayDouble":[3.3, 4.4, 5.5], "arrayDoubleNulls":[999, null, 5.5], "arrayVariant":[null, null, 2.2], "arrayBool":[true, true], "arrayNestedLong":[null, [null], []], "arrayObject":[{"x": 3},{"x":4}], "null": null, "cstr": "hello", "clong": 1234, "cdouble": 1.234, "cObj":{"x": 1, "y": "hello", "z": {"a": 1.1, "b": 1234, "c": ["a", "b", "c"]}}, "cstringArray": ["a", "b", "c"], "cLongArray": [1, 2, 3], "cDoubleArray": [1.1, 2.2, 3.3], "cEmptyArray":[], "cEmptyObj":{}, "cNullArray": [null, null], "cEmptyObjectArray": [{},{},{}], "cObjectArray": [{"a":"b", "x":1, "y":1.3}]} -{"timestamp": "2023-01-01T00:00:00", "str":"null", "long":3, "double":2.0, "variant": 3.0, "variantNumeric": 1.0, "variantEmptyObj":3.3, "variantEmtpyArray":3, "variantWithArrays": 3.0, "obj":{"a": 300}, "complexObj":{"x": 4.4, "y": [{"l": [], "m": 100, "n": 3},{"l": ["a"]}, {"l": ["b"], "n": []}], "z": {"a": [], "b": true}}, "arrayString": ["b", "c"], "arrayStringNulls": ["d", null, "b"], "arrayLong":[1, 2, 3, 4], "arrayLongNulls":[1, 2, 3], "arrayDouble":[1.1, 3.3], "arrayDoubleNulls":[null, 2.2, null], "arrayVariant":[1, null, 1], "arrayBool":[true, null, true], "arrayNestedLong":[[1], null, [1, 2, 3]], "arrayObject":[null,{"x":2}], "cstr": "hello", "clong": 1234, "cdouble": 1.234, "cObj":{"x": 1, "y": "hello", "z": {"a": 1.1, "b": 1234, "c": ["a", "b", "c"]}}, "cstringArray": ["a", "b", "c"], "cLongArray": [1, 2, 3], "cDoubleArray": [1.1, 2.2, 3.3], "cEmptyArray":[], "cEmptyObj":{}, "cNullArray": [null, null], "cEmptyObjectArray": [{},{},{}], "cObjectArray": [{"a":"b", 
"x":1, "y":1.3}]} -{"timestamp": "2023-01-01T00:00:00", "str":"b", "long":4, "double":3.3, "bool": true, "variant": "1", "variantEmptyObj":{}, "variantEmtpyArray":4, "variantWithArrays": "1", "obj":{"a": 400, "b": {"x": "d", "y": 1.1, "z": [3, 4]}}, "complexObj":{"x": 1234, "z": {"a": [1.1, 2.2, 3.3], "b": true}}, "arrayString": ["d", "e"], "arrayStringNulls": ["b", "b"], "arrayLong":[1, 4], "arrayLongNulls":[1], "arrayDouble":[2.2, 3.3, 4.0], "arrayVariant":["a", "b", "c"], "arrayBool":[null, false, true], "arrayNestedLong":[[1, 2], [3, 4], [5, 6, 7]], "arrayObject":[{"x": null},{"x":2}], "null": null, "cstr": "hello", "clong": 1234, "cdouble": 1.234, "cObj":{"x": 1, "y": "hello", "z": {"a": 1.1, "b": 1234, "c": ["a", "b", "c"]}}, "cstringArray": ["a", "b", "c"], "cLongArray": [1, 2, 3], "cDoubleArray": [1.1, 2.2, 3.3], "cEmptyArray":[], "cEmptyObj":{}, "cNullArray": [null, null], "cEmptyObjectArray": [{},{},{}], "cObjectArray": [{"a":"b", "x":1, "y":1.3}]} -{"timestamp": "2023-01-01T00:00:00", "str":"c", "long": null, "double":4.4, "bool": true, "variant": "hello", "variantNumeric": -1000, "variantEmptyObj":{}, "variantEmtpyArray":[], "variantWithArrays": "hello", "obj":{"a": 500, "b": {"x": "e", "z": [1, 2, 3, 4]}}, "complexObj":{"x": 11, "y": [], "z": {"a": [null], "b": false}}, "arrayString": null, "arrayLong":[1, 2, 3], "arrayLongNulls":[], "arrayDouble":[1.1, 2.2, 3.3], "arrayDoubleNulls":null, "arrayBool":[false], "arrayObject":[{"x": 1000},{"y":2000}], "null": null, "cstr": "hello", "clong": 1234, "cdouble": 1.234, "cObj":{"x": 1, "y": "hello", "z": {"a": 1.1, "b": 1234, "c": ["a", "b", "c"]}}, "cstringArray": ["a", "b", "c"], "cLongArray": [1, 2, 3], "cDoubleArray": [1.1, 2.2, 3.3], "cEmptyArray":[], "cEmptyObj":{}, "cNullArray": [null, null], "cEmptyObjectArray": [{},{},{}], "cObjectArray": [{"a":"b", "x":1, "y":1.3}]} -{"timestamp": "2023-01-01T00:00:00", "str":"d", "long":5, "double":5.9, "bool": false, "variantNumeric": 3.33, "variantEmptyObj":"a", "variantEmtpyArray":6, "obj":{"a": 600, "b": {"x": "f", "y": 1.1, "z": [6, 7, 8, 9]}}, "arrayString": ["a", "b"], "arrayStringNulls": null, "arrayLongNulls":[null, 2, 9], "arrayDouble":null, "arrayDoubleNulls":[999, 5.5, null], "arrayVariant":["a", 1, 2.2], "arrayBool":[], "arrayNestedLong":[[1], [1, 2, null]], "arrayObject":[{"a": 1},{"b":2}], "cstr": "hello", "clong": 1234, "cdouble": 1.234, "cObj":{"x": 1, "y": "hello", "z": {"a": 1.1, "b": 1234, "c": ["a", "b", "c"]}}, "cstringArray": ["a", "b", "c"], "cLongArray": [1, 2, 3], "cDoubleArray": [1.1, 2.2, 3.3], "cEmptyArray":[], "cEmptyObj":{}, "cNullArray": [null, null], "cEmptyObjectArray": [{},{},{}], "cObjectArray": [{"a":"b", "x":1, "y":1.3}]} -{"timestamp": "2023-01-01T00:00:00", "str":null, "double":null, "bool": true, "variant": 51, "variantNumeric": -0.13, "variantEmptyObj":1, "variantEmtpyArray":[], "variantWithArrays": [51, -35], "obj":{"a": 700, "b": {"x": "g", "y": 1.1, "z": [9, null, 9, 9]}}, "complexObj":{"x": 400, "y": [{"l": [null], "m": 100, "n": 5},{"l": ["a", "b", "c"], "m": "a", "n": 1}], "z": {}}, "arrayStringNulls": ["a", "b"], "arrayLong":null, "arrayLongNulls":[2, 3], "arrayDoubleNulls":[null], "arrayVariant":null, "arrayBool":[true, false, true], "arrayNestedLong":null, "arrayObject":[{"x": 1},{"x":2}], "null": null, "cstr": "hello", "clong": 1234, "cdouble": 1.234, "cObj":{"x": 1, "y": "hello", "z": {"a": 1.1, "b": 1234, "c": ["a", "b", "c"]}}, "cstringArray": ["a", "b", "c"], "cLongArray": [1, 2, 3], "cDoubleArray": [1.1, 2.2, 3.3], "cEmptyArray":[], 
"cEmptyObj":{}, "cNullArray": [null, null], "cEmptyObjectArray": [{},{},{}], "cObjectArray": [{"a":"b", "x":1, "y":1.3}]} +{"timestamp": "2023-01-01T00:00:00", "str":"a", "long":1, "double":1.0, "bool": true, "variant": 1, "variantNumeric": 1, "variantEmptyObj":1, "variantEmtpyArray":1, "variantWithArrays": 1, "obj":{"a": 100, "b": {"x": "a", "y": 1.1, "z": [1, 2, 3, 4]}, "v": []}, "complexObj":{"x": 1234, "y": [{"l": ["a", "b", "c"], "m": "a", "n": 1},{"l": ["a", "b", "c"], "m": "a", "n": 1}], "z": {"a": [1.1, 2.2, 3.3], "b": true}}, "arrayString": ["a", "b"], "arrayStringNulls": ["a", "b"], "arrayLong":[1, 2, 3], "arrayLongNulls":[1, null,3], "arrayDouble":[1.1, 2.2, 3.3], "arrayDoubleNulls":[1.1, 2.2, null], "arrayVariant":["a", 1, 2.2], "arrayBool":[true, false, true], "arrayNestedLong":[[1, 2, null], [3, 4]], "arrayObject":[{"x": 1},{"x":2}], "null": null, "cstr": "hello", "clong": 1234, "cdouble": 1.234, "cObj":{"x": 1, "y": "hello", "z": {"a": 1.1, "b": 1234, "c": ["a", "b", "c"]}}, "cstringArray": ["a", "b", "c"], "cLongArray": [1, 2, 3], "cDoubleArray": [1.1, 2.2, 3.3], "cEmptyArray":[], "cEmptyObj":{}, "cNullArray": [null, null], "cEmptyObjectArray": [{},{},{}], "cObjectArray": [{"a":"b", "x":1, "y":1.3}]} +{"timestamp": "2023-01-01T00:00:00", "str":"", "long":2, "bool": false, "variant": "b", "variantNumeric": 1.1, "variantEmptyObj":"b", "variantEmtpyArray":2, "variantWithArrays": "b", "obj":{"a": 200, "b": {"x": "b", "y": 1.1, "z": [2, 4, 6]}, "v": []}, "complexObj":{"x": 10, "y": [{"l": ["b", "b", "c"], "m": "b", "n": 2}, [1, 2, 3]], "z": {"a": [5.5], "b": false}}, "arrayString": ["a", "b", "c"], "arrayStringNulls": [null, "b"], "arrayLong":[2, 3], "arrayDouble":[3.3, 4.4, 5.5], "arrayDoubleNulls":[999, null, 5.5], "arrayVariant":[null, null, 2.2], "arrayBool":[true, true], "arrayNestedLong":[null, [null], []], "arrayObject":[{"x": 3},{"x":4}], "null": null, "cstr": "hello", "clong": 1234, "cdouble": 1.234, "cObj":{"x": 1, "y": "hello", "z": {"a": 1.1, "b": 1234, "c": ["a", "b", "c"]}}, "cstringArray": ["a", "b", "c"], "cLongArray": [1, 2, 3], "cDoubleArray": [1.1, 2.2, 3.3], "cEmptyArray":[], "cEmptyObj":{}, "cNullArray": [null, null], "cEmptyObjectArray": [{},{},{}], "cObjectArray": [{"a":"b", "x":1, "y":1.3}]} +{"timestamp": "2023-01-01T00:00:00", "str":"null", "long":3, "double":2.0, "variant": 3.0, "variantNumeric": 1.0, "variantEmptyObj":3.3, "variantEmtpyArray":3, "variantWithArrays": 3.0, "obj":{"a": 300}, "complexObj":{"x": 4.4, "y": [{"l": [], "m": 100, "n": 3},{"l": ["a"]}, {"l": ["b"], "n": []}], "z": {"a": [], "b": true}}, "arrayString": ["b", "c"], "arrayStringNulls": ["d", null, "b"], "arrayLong":[1, 2, 3, 4], "arrayLongNulls":[1, 2, 3], "arrayDouble":[1.1, 3.3], "arrayDoubleNulls":[null, 2.2, null], "arrayVariant":[1, null, 1], "arrayBool":[true, null, true], "arrayNestedLong":[[1], null, [1, 2, 3]], "arrayObject":[null,{"x":2}], "cstr": "hello", "clong": 1234, "cdouble": 1.234, "cObj":{"x": 1, "y": "hello", "z": {"a": 1.1, "b": 1234, "c": ["a", "b", "c"]}}, "cstringArray": ["a", "b", "c"], "cLongArray": [1, 2, 3], "cDoubleArray": [1.1, 2.2, 3.3], "cEmptyArray":[], "cEmptyObj":{}, "cNullArray": [null, null], "cEmptyObjectArray": [{},{},{}], "cObjectArray": [{"a":"b", "x":1, "y":1.3}]} +{"timestamp": "2023-01-01T00:00:00", "str":"b", "long":4, "double":3.3, "bool": true, "variant": "1", "variantEmptyObj":{}, "variantEmtpyArray":4, "variantWithArrays": "1", "obj":{"a": 400, "b": {"x": "d", "y": 1.1, "z": [3, 4]}, "v": []}, "complexObj":{"x": 1234, "z": {"a": 
[1.1, 2.2, 3.3], "b": true}}, "arrayString": ["d", "e"], "arrayStringNulls": ["b", "b"], "arrayLong":[1, 4], "arrayLongNulls":[1], "arrayDouble":[2.2, 3.3, 4.0], "arrayVariant":["a", "b", "c"], "arrayBool":[null, false, true], "arrayNestedLong":[[1, 2], [3, 4], [5, 6, 7]], "arrayObject":[{"x": null},{"x":2}], "null": null, "cstr": "hello", "clong": 1234, "cdouble": 1.234, "cObj":{"x": 1, "y": "hello", "z": {"a": 1.1, "b": 1234, "c": ["a", "b", "c"]}}, "cstringArray": ["a", "b", "c"], "cLongArray": [1, 2, 3], "cDoubleArray": [1.1, 2.2, 3.3], "cEmptyArray":[], "cEmptyObj":{}, "cNullArray": [null, null], "cEmptyObjectArray": [{},{},{}], "cObjectArray": [{"a":"b", "x":1, "y":1.3}]} +{"timestamp": "2023-01-01T00:00:00", "str":"c", "long": null, "double":4.4, "bool": true, "variant": "hello", "variantNumeric": -1000, "variantEmptyObj":{}, "variantEmtpyArray":[], "variantWithArrays": "hello", "obj":{"a": 500, "b": {"x": "e", "z": [1, 2, 3, 4]}, "v": "a"}, "complexObj":{"x": 11, "y": [], "z": {"a": [null], "b": false}}, "arrayString": null, "arrayLong":[1, 2, 3], "arrayLongNulls":[], "arrayDouble":[1.1, 2.2, 3.3], "arrayDoubleNulls":null, "arrayBool":[false], "arrayObject":[{"x": 1000},{"y":2000}], "null": null, "cstr": "hello", "clong": 1234, "cdouble": 1.234, "cObj":{"x": 1, "y": "hello", "z": {"a": 1.1, "b": 1234, "c": ["a", "b", "c"]}}, "cstringArray": ["a", "b", "c"], "cLongArray": [1, 2, 3], "cDoubleArray": [1.1, 2.2, 3.3], "cEmptyArray":[], "cEmptyObj":{}, "cNullArray": [null, null], "cEmptyObjectArray": [{},{},{}], "cObjectArray": [{"a":"b", "x":1, "y":1.3}]} +{"timestamp": "2023-01-01T00:00:00", "str":"d", "long":5, "double":5.9, "bool": false, "variantNumeric": 3.33, "variantEmptyObj":"a", "variantEmtpyArray":6, "obj":{"a": 600, "b": {"x": "f", "y": 1.1, "z": [6, 7, 8, 9]}, "v": "b"}, "arrayString": ["a", "b"], "arrayStringNulls": null, "arrayLongNulls":[null, 2, 9], "arrayDouble":null, "arrayDoubleNulls":[999, 5.5, null], "arrayVariant":["a", 1, 2.2], "arrayBool":[], "arrayNestedLong":[[1], [1, 2, null]], "arrayObject":[{"a": 1},{"b":2}], "cstr": "hello", "clong": 1234, "cdouble": 1.234, "cObj":{"x": 1, "y": "hello", "z": {"a": 1.1, "b": 1234, "c": ["a", "b", "c"]}}, "cstringArray": ["a", "b", "c"], "cLongArray": [1, 2, 3], "cDoubleArray": [1.1, 2.2, 3.3], "cEmptyArray":[], "cEmptyObj":{}, "cNullArray": [null, null], "cEmptyObjectArray": [{},{},{}], "cObjectArray": [{"a":"b", "x":1, "y":1.3}]} +{"timestamp": "2023-01-01T00:00:00", "str":null, "double":null, "bool": true, "variant": 51, "variantNumeric": -0.13, "variantEmptyObj":1, "variantEmtpyArray":[], "variantWithArrays": [51, -35], "obj":{"a": 700, "b": {"x": "g", "y": 1.1, "z": [9, null, 9, 9]}, "v": []}, "complexObj":{"x": 400, "y": [{"l": [null], "m": 100, "n": 5},{"l": ["a", "b", "c"], "m": "a", "n": 1}], "z": {}}, "arrayStringNulls": ["a", "b"], "arrayLong":null, "arrayLongNulls":[2, 3], "arrayDoubleNulls":[null], "arrayVariant":null, "arrayBool":[true, false, true], "arrayNestedLong":null, "arrayObject":[{"x": 1},{"x":2}], "null": null, "cstr": "hello", "clong": 1234, "cdouble": 1.234, "cObj":{"x": 1, "y": "hello", "z": {"a": 1.1, "b": 1234, "c": ["a", "b", "c"]}}, "cstringArray": ["a", "b", "c"], "cLongArray": [1, 2, 3], "cDoubleArray": [1.1, 2.2, 3.3], "cEmptyArray":[], "cEmptyObj":{}, "cNullArray": [null, null], "cEmptyObjectArray": [{},{},{}], "cObjectArray": [{"a":"b", "x":1, "y":1.3}]} diff --git a/sql/src/test/java/org/apache/druid/sql/calcite/CalciteNestedDataQueryTest.java 
b/sql/src/test/java/org/apache/druid/sql/calcite/CalciteNestedDataQueryTest.java index d440a6bb7182..5098343b5389 100644 --- a/sql/src/test/java/org/apache/druid/sql/calcite/CalciteNestedDataQueryTest.java +++ b/sql/src/test/java/org/apache/druid/sql/calcite/CalciteNestedDataQueryTest.java @@ -5667,7 +5667,7 @@ public void testScanAllTypesAuto() "1", "[]", "[51,-35]", - "{\"a\":700,\"b\":{\"x\":\"g\",\"y\":1.1,\"z\":[9,null,9,9]}}", + "{\"a\":700,\"b\":{\"x\":\"g\",\"y\":1.1,\"z\":[9,null,9,9]},\"v\":[]}", "{\"x\":400,\"y\":[{\"l\":[null],\"m\":100,\"n\":5},{\"l\":[\"a\",\"b\",\"c\"],\"m\":\"a\",\"n\":1}],\"z\":{}}", null, "[\"a\",\"b\"]", @@ -5705,7 +5705,7 @@ public void testScanAllTypesAuto() "\"b\"", "2", "b", - "{\"a\":200,\"b\":{\"x\":\"b\",\"y\":1.1,\"z\":[2,4,6]}}", + "{\"a\":200,\"b\":{\"x\":\"b\",\"y\":1.1,\"z\":[2,4,6]},\"v\":[]}", "{\"x\":10,\"y\":[{\"l\":[\"b\",\"b\",\"c\"],\"m\":\"b\",\"n\":2},[1,2,3]],\"z\":{\"a\":[5.5],\"b\":false}}", "[\"a\",\"b\",\"c\"]", "[null,\"b\"]", @@ -5743,7 +5743,7 @@ public void testScanAllTypesAuto() "1", "1", "1", - "{\"a\":100,\"b\":{\"x\":\"a\",\"y\":1.1,\"z\":[1,2,3,4]}}", + "{\"a\":100,\"b\":{\"x\":\"a\",\"y\":1.1,\"z\":[1,2,3,4]},\"v\":[]}", "{\"x\":1234,\"y\":[{\"l\":[\"a\",\"b\",\"c\"],\"m\":\"a\",\"n\":1},{\"l\":[\"a\",\"b\",\"c\"],\"m\":\"a\",\"n\":1}],\"z\":{\"a\":[1.1,2.2,3.3],\"b\":true}}", "[\"a\",\"b\"]", "[\"a\",\"b\"]", @@ -5781,7 +5781,7 @@ public void testScanAllTypesAuto() "{}", "4", "1", - "{\"a\":400,\"b\":{\"x\":\"d\",\"y\":1.1,\"z\":[3,4]}}", + "{\"a\":400,\"b\":{\"x\":\"d\",\"y\":1.1,\"z\":[3,4]},\"v\":[]}", "{\"x\":1234,\"z\":{\"a\":[1.1,2.2,3.3],\"b\":true}}", "[\"d\",\"e\"]", "[\"b\",\"b\"]", @@ -5819,7 +5819,7 @@ public void testScanAllTypesAuto() "{}", "[]", "hello", - "{\"a\":500,\"b\":{\"x\":\"e\",\"z\":[1,2,3,4]}}", + "{\"a\":500,\"b\":{\"x\":\"e\",\"z\":[1,2,3,4]},\"v\":\"a\"}", "{\"x\":11,\"y\":[],\"z\":{\"a\":[null],\"b\":false}}", null, null, @@ -5857,7 +5857,7 @@ public void testScanAllTypesAuto() "\"a\"", "6", null, - "{\"a\":600,\"b\":{\"x\":\"f\",\"y\":1.1,\"z\":[6,7,8,9]}}", + "{\"a\":600,\"b\":{\"x\":\"f\",\"y\":1.1,\"z\":[6,7,8,9]},\"v\":\"b\"}", null, "[\"a\",\"b\"]", null, @@ -5935,7 +5935,7 @@ public void testScanAllTypesAuto() "1", "[]", "[51,-35]", - "{\"a\":700,\"b\":{\"x\":\"g\",\"y\":1.1,\"z\":[9,null,9,9]}}", + "{\"a\":700,\"b\":{\"x\":\"g\",\"y\":1.1,\"z\":[9,null,9,9]},\"v\":[]}", "{\"x\":400,\"y\":[{\"l\":[null],\"m\":100,\"n\":5},{\"l\":[\"a\",\"b\",\"c\"],\"m\":\"a\",\"n\":1}],\"z\":{}}", null, "[\"a\",\"b\"]", @@ -5973,7 +5973,7 @@ public void testScanAllTypesAuto() "\"b\"", "2", "b", - "{\"a\":200,\"b\":{\"x\":\"b\",\"y\":1.1,\"z\":[2,4,6]}}", + "{\"a\":200,\"b\":{\"x\":\"b\",\"y\":1.1,\"z\":[2,4,6]},\"v\":[]}", "{\"x\":10,\"y\":[{\"l\":[\"b\",\"b\",\"c\"],\"m\":\"b\",\"n\":2},[1,2,3]],\"z\":{\"a\":[5.5],\"b\":false}}", "[\"a\",\"b\",\"c\"]", "[null,\"b\"]", @@ -6011,7 +6011,7 @@ public void testScanAllTypesAuto() "1", "1", "1", - "{\"a\":100,\"b\":{\"x\":\"a\",\"y\":1.1,\"z\":[1,2,3,4]}}", + "{\"a\":100,\"b\":{\"x\":\"a\",\"y\":1.1,\"z\":[1,2,3,4]},\"v\":[]}", "{\"x\":1234,\"y\":[{\"l\":[\"a\",\"b\",\"c\"],\"m\":\"a\",\"n\":1},{\"l\":[\"a\",\"b\",\"c\"],\"m\":\"a\",\"n\":1}],\"z\":{\"a\":[1.1,2.2,3.3],\"b\":true}}", "[\"a\",\"b\"]", "[\"a\",\"b\"]", @@ -6049,7 +6049,7 @@ public void testScanAllTypesAuto() "{}", "4", "1", - "{\"a\":400,\"b\":{\"x\":\"d\",\"y\":1.1,\"z\":[3,4]}}", + "{\"a\":400,\"b\":{\"x\":\"d\",\"y\":1.1,\"z\":[3,4]},\"v\":[]}", 
"{\"x\":1234,\"z\":{\"a\":[1.1,2.2,3.3],\"b\":true}}", "[\"d\",\"e\"]", "[\"b\",\"b\"]", @@ -6087,7 +6087,7 @@ public void testScanAllTypesAuto() "{}", "[]", "hello", - "{\"a\":500,\"b\":{\"x\":\"e\",\"z\":[1,2,3,4]}}", + "{\"a\":500,\"b\":{\"x\":\"e\",\"z\":[1,2,3,4]},\"v\":\"a\"}", "{\"x\":11,\"y\":[],\"z\":{\"a\":[null],\"b\":false}}", null, null, @@ -6125,7 +6125,7 @@ public void testScanAllTypesAuto() "\"a\"", "6", null, - "{\"a\":600,\"b\":{\"x\":\"f\",\"y\":1.1,\"z\":[6,7,8,9]}}", + "{\"a\":600,\"b\":{\"x\":\"f\",\"y\":1.1,\"z\":[6,7,8,9]},\"v\":\"b\"}", null, "[\"a\",\"b\"]", null, diff --git a/sql/src/test/java/org/apache/druid/sql/calcite/CalciteSubqueryTest.java b/sql/src/test/java/org/apache/druid/sql/calcite/CalciteSubqueryTest.java index d39c9bf1388e..2ddc674eadda 100644 --- a/sql/src/test/java/org/apache/druid/sql/calcite/CalciteSubqueryTest.java +++ b/sql/src/test/java/org/apache/druid/sql/calcite/CalciteSubqueryTest.java @@ -57,7 +57,6 @@ import org.apache.druid.sql.calcite.util.CalciteTests; import org.joda.time.DateTimeZone; import org.joda.time.Period; -import org.junit.Ignore; import org.junit.Test; import org.junit.runner.RunWith; import org.junit.runners.Parameterized; @@ -227,7 +226,6 @@ public void testExactCountDistinctOfSemiJoinResult() ); } - @Ignore("Merge buffers exceed the prescribed limit when the results are materialized as frames") @Test public void testTwoExactCountDistincts() { diff --git a/sql/src/test/java/org/apache/druid/sql/calcite/CalciteSysQueryTest.java b/sql/src/test/java/org/apache/druid/sql/calcite/CalciteSysQueryTest.java new file mode 100644 index 000000000000..5a66383194cf --- /dev/null +++ b/sql/src/test/java/org/apache/druid/sql/calcite/CalciteSysQueryTest.java @@ -0,0 +1,63 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. 
+ */ + +package org.apache.druid.sql.calcite; + +import com.google.common.collect.ImmutableList; +import org.apache.druid.sql.calcite.DecoupledIgnore.DecoupledIgnoreProcessor; +import org.apache.druid.sql.calcite.DecoupledIgnore.Modes; +import org.junit.Rule; +import org.junit.Test; + +public class CalciteSysQueryTest extends BaseCalciteQueryTest +{ + @Rule(order = 0) + public DecoupledIgnoreProcessor decoupledIgnoreProcessor = new DecoupledIgnoreProcessor(); + + @Test + public void testTasksSum() + { + notMsqCompatible(); + + testBuilder() + .sql("select datasource, sum(duration) from sys.tasks group by datasource") + .expectedResults(ImmutableList.of( + new Object[]{"foo", 11L}, + new Object[]{"foo2", 22L})) + .expectedLogicalPlan("LogicalAggregate(group=[{0}], EXPR$1=[SUM($1)])\n" + + " LogicalProject(exprs=[[$3, $8]])\n" + + " LogicalTableScan(table=[[sys, tasks]])\n") + .run(); + } + + @DecoupledIgnore(mode = Modes.EXPRESSION_NOT_GROUPED) + @Test + public void testTasksSumOver() + { + notMsqCompatible(); + + testBuilder() + .sql("select datasource, sum(duration) over () from sys.tasks group by datasource") + .expectedResults(ImmutableList.of( + new Object[]{"foo", 11L}, + new Object[]{"foo2", 22L})) + // please add expectedLogicalPlan if this test starts passing! + .run(); + } +} diff --git a/sql/src/test/java/org/apache/druid/sql/calcite/DecoupledIgnore.java b/sql/src/test/java/org/apache/druid/sql/calcite/DecoupledIgnore.java index 0c30432d3d67..029b41f54995 100644 --- a/sql/src/test/java/org/apache/druid/sql/calcite/DecoupledIgnore.java +++ b/sql/src/test/java/org/apache/druid/sql/calcite/DecoupledIgnore.java @@ -52,7 +52,8 @@ enum Modes PLAN_MISMATCH(AssertionError.class, "AssertionError: query #"), NOT_ENOUGH_RULES(DruidException.class, "not enough rules"), CANNOT_CONVERT(DruidException.class, "Cannot convert query parts"), - ERROR_HANDLING(AssertionError.class, "(is was |is was |with message a string containing)"); + ERROR_HANDLING(AssertionError.class, "(is was |is was |with message a string containing)"), + EXPRESSION_NOT_GROUPED(DruidException.class, "Expression '[a-z]+' is not being grouped"); public Class throwableClass; public String regex; diff --git a/sql/src/test/java/org/apache/druid/sql/calcite/util/CalciteTests.java b/sql/src/test/java/org/apache/druid/sql/calcite/util/CalciteTests.java index 17f2ed106377..15568092ad80 100644 --- a/sql/src/test/java/org/apache/druid/sql/calcite/util/CalciteTests.java +++ b/sql/src/test/java/org/apache/druid/sql/calcite/util/CalciteTests.java @@ -39,7 +39,14 @@ import org.apache.druid.discovery.DruidNodeDiscoveryProvider; import org.apache.druid.discovery.NodeRole; import org.apache.druid.guice.annotations.Json; +import org.apache.druid.indexer.RunnerTaskState; +import org.apache.druid.indexer.TaskLocation; +import org.apache.druid.indexer.TaskState; +import org.apache.druid.indexer.TaskStatusPlus; +import org.apache.druid.java.util.common.CloseableIterators; +import org.apache.druid.java.util.common.DateTimes; import org.apache.druid.java.util.common.Pair; +import org.apache.druid.java.util.common.parsers.CloseableIterator; import org.apache.druid.java.util.http.client.HttpClient; import org.apache.druid.java.util.http.client.Request; import org.apache.druid.java.util.http.client.response.HttpResponseHandler; @@ -82,9 +89,11 @@ import java.io.File; import java.net.URI; import java.net.URISyntaxException; +import java.util.ArrayList; import java.util.Collection; import java.util.HashMap; import java.util.HashSet; +import 
java.util.List; import java.util.Map; import java.util.Set; import java.util.concurrent.Executor; @@ -368,6 +377,38 @@ public ListenableFuture<URI> findCurrentLeader() throw new RuntimeException(e); } } + + @Override + public ListenableFuture<CloseableIterator<TaskStatusPlus>> taskStatuses( + @Nullable String state, + @Nullable String dataSource, + @Nullable Integer maxCompletedTasks + ) + { + List<TaskStatusPlus> tasks = new ArrayList<>(); + tasks.add(createTaskStatus("id1", DATASOURCE1, 10L)); + tasks.add(createTaskStatus("id1", DATASOURCE1, 1L)); + tasks.add(createTaskStatus("id2", DATASOURCE2, 20L)); + tasks.add(createTaskStatus("id2", DATASOURCE2, 2L)); + return Futures.immediateFuture(CloseableIterators.withEmptyBaggage(tasks.iterator())); + } + + private TaskStatusPlus createTaskStatus(String id, String datasource, Long duration) + { + return new TaskStatusPlus( + id, + "testGroupId", + "testType", + DateTimes.nowUtc(), + DateTimes.nowUtc(), + TaskState.RUNNING, + RunnerTaskState.RUNNING, + duration, + TaskLocation.create("testHost", 1010, -1), + datasource, + null + ); + } }; return new SystemSchema(