forked from NVIDIA/spark-rapids
Merge branch-24.06 into main #110
Closed
Keep the deps (JNI + private) at 24.04-SNAPSHOT until they're available next week. Added a TODO (NVIDIA#10256) to remind us to bump the deps version to 24.06.0-SNAPSHOT. Signed-off-by: Tim Liu <[email protected]>
[auto-merge] branch-24.04 to branch-24.06 [skip ci] [bot]
[auto-merge] branch-24.04 to branch-24.06 [skip ci] [bot]
[auto-merge] branch-24.04 to branch-24.06 [skip ci] [bot]
[auto-merge] branch-24.04 to branch-24.06 [skip ci] [bot]
[auto-merge] branch-24.04 to branch-24.06 [skip ci] [bot]
[auto-merge] branch-24.04 to branch-24.06 [skip ci] [bot]
[auto-merge] branch-24.04 to branch-24.06 [skip ci] [bot]
[auto-merge] branch-24.04 to branch-24.06 [skip ci] [bot]
Fixes NVIDIA#10487 Signed-off-by: Gera Shegalov <[email protected]>
Signed-off-by: Tim Liu <[email protected]>
[auto-merge] branch-24.04 to branch-24.06 [skip ci] [bot]
Signed-off-by: Niranjan Artal <[email protected]>
Signed-off-by: Jason Lowe <[email protected]>
Signed-off-by: liurenjie1024 <[email protected]>
Fix merge conflict with branch-24.04 [skip ci]
[auto-merge] branch-24.04 to branch-24.06 [skip ci] [bot]
[auto-merge] branch-24.04 to branch-24.06 [skip ci] [bot]
[auto-merge] branch-24.04 to branch-24.06 [skip ci] [bot]
…-10704 Fix auto merge conflict 10704 [skip ci]
Signed-off-by: Chong Gao <[email protected]> Signed-off-by: Robert (Bobby) Evans <[email protected]> Co-authored-by: Chong Gao <[email protected]>
Signed-off-by: Hongbin Ma (Mahone) <[email protected]>
* Removing some authorizations for departed users Signed-off-by: Mike Wilson <[email protected]> Co-authored-by: Sameer Raheja <[email protected]>
* Upgrade to jucx 1.16.0 Signed-off-by: Alessandro Bellina <[email protected]> --------- Signed-off-by: Alessandro Bellina <[email protected]>
…10718)
* Refactor Parquet reader
* Update config
* Add back the deprecated config
* Fix config
* Change message for the deprecated config
* Rename variable
* Change the logic of reading conf
* Add example and mark conf as `internal()`
* Reformat code
* Update docs
* Change configs
* Update docs
* Change variables into functions
* Change functions back into `lazy val`
Signed-off-by: Nghia Truong <[email protected]>
…debugging UT in IDEA (NVIDIA#10733)
* wip for test
* update comment
Signed-off-by: Haoyang Li <[email protected]>
NVIDIA#10716) Signed-off-by: Robert (Bobby) Evans <[email protected]>
* generate shims
* Generate Scala 2.13 poms
* undo bad change to the supportedExprs.csv
* Fixed copyrights and removed snapshot
* Update copyrights on SparkShimsSuite
Signed-off-by: Raza Jafri <[email protected]>
* set fixed seed for some random failed tests
* add import
Signed-off-by: Haoyang Li <[email protected]>
* Refactor Parquet reader
* WIP for ORC chunked reader
* Update config
* Add back the deprecated config
* Fix config
* Change message for the deprecated config
* Add OrcChunkedReader
* Cleanup
* Fix `MultiFileCloudOrcPartitionReader`
* Fix `readBufferToTablesAndClose`
* Fix comment
* Add chunked reader to tests
* Fix table schema
Signed-off-by: Nghia Truong <[email protected]>
* Fix NPE in GpuParseUrl for null keys. Fixes NVIDIA#10810.
This commit fixes an NPE that occurred when `ParseUrl` was called to extract a specific key from the `QUERY` portion of the URL and the specified key was `null`. The NPE would manifest as follows:
```
24/05/13 14:28:35.379 Executor task launch worker for task 1.0 in stage 746.0 (TID 1493) ERROR Executor: Exception in task 1.0 in stage 746.0 (TID 1493)
java.lang.NullPointerException: null
  at org.apache.spark.sql.rapids.GpuParseUrl.doColumnar(GpuParseUrl.scala:86) ~[rapids-4-spark-aggregator_2.12-24.06.0-SNAPSHOT-spark330.jar:?]
  at org.apache.spark.sql.rapids.GpuParseUrl.$anonfun$columnarEval$5(GpuParseUrl.scala:123) ~[rapids-4-spark-aggregator_2.12-24.06.0-SNAPSHOT-spark330.jar:?]
  at com.nvidia.spark.rapids.Arm$.withResourceIfAllowed(Arm.scala:84) ~[rapids-4-spark-aggregator_2.12-24.06.0-SNAPSHOT-spark330.jar:?]
  at org.apache.spark.sql.rapids.GpuParseUrl.$anonfun$columnarEval$4(GpuParseUrl.scala:120) ~[rapids-4-spark-aggregator_2.12-24.06.0-SNAPSHOT-spark330.jar:?]
  at com.nvidia.spark.rapids.Arm$.withResourceIfAllowed(Arm.scala:84) ~[rapids-4-spark-aggregator_2.12-24.06.0-SNAPSHOT-spark330.jar:?]
  at org.apache.spark.sql.rapids.GpuParseUrl.$anonfun$columnarEval$3(GpuParseUrl.scala:119) ~[rapids-4-spark-aggregator_2.12-24.06.0-SNAPSHOT-spark330.jar:?]
  at com.nvidia.spark.rapids.Arm$.withResourceIfAllowed(Arm.scala:84) ~[rapids-4-spark-aggregator_2.12-24.06.0-SNAPSHOT-spark330.jar:?]
  ...
```
* Reword validity check.
Signed-off-by: MithunR <[email protected]>
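The shape of the fix can be sketched as follows. This is a hedged, simplified illustration, not the actual GpuParseUrl code; the function name and parsing logic are hypothetical, but it shows the key idea: a null key must short-circuit to a null (None) result before any dereference happens.

```scala
// Hypothetical simplified sketch, NOT the actual GpuParseUrl code:
// before looking up a QUERY key, guard against a null url or key so the
// expression yields None instead of throwing a NullPointerException.
def extractQueryParam(url: String, key: String): Option[String] = {
  if (url == null || key == null) {
    None // a null key must produce a null result, not an NPE
  } else {
    // take everything after the first '?', then scan key=value pairs
    url.split('?').drop(1).headOption.flatMap { query =>
      query.split('&').collectFirst {
        case kv if kv.startsWith(key + "=") => kv.substring(key.length + 1)
      }
    }
  }
}
```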
…s] (NVIDIA#10829)
* Scala 2.13: Inheritance shadowing
* Signing off
Signed-off-by: Raza Jafri <[email protected]>
* Added DateTimeUtilsShims
* Signing off
* added the missing DateTimeUtilsShims for 343
Signed-off-by: Raza Jafri <[email protected]>
Add the ZSTD codec for GPU shuffle compression. Signed-off-by: Firestarman <[email protected]>
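For reference, opting into the new codec might look like the snippet below. The config key `spark.rapids.shuffle.compression.codec` is the spark-rapids shuffle-compression setting as documented for this release line; treat the exact key and accepted values as something to verify against the plugin docs for your version.

```scala
// Hedged config sketch: select ZSTD for RAPIDS shuffle compression.
// Assumes a live SparkSession `spark` with the RAPIDS shuffle manager
// enabled; verify the key/value against your plugin version's docs.
spark.conf.set("spark.rapids.shuffle.compression.codec", "zstd")
```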
…ries (NVIDIA#10836) Signed-off-by: Jason Lowe <[email protected]>
* Add NVTX ranges to identify Spark stages and tasks
* scalastyle
Signed-off-by: Jason Lowe <[email protected]>
…-10845 Fix auto merge conflict 10845 [skip ci]
Signed-off-by: Robert (Bobby) Evans <[email protected]>
…IDIA#10822) Signed-off-by: Haoyang Li <[email protected]> Co-authored-by: Gera Shegalov <[email protected]>
…VIDIA#10860) Fixes NVIDIA#10606. This commit accounts for the change in the signature of `PartitionedFileUtil.getPartitionedFile()`, in Apache Spark 4.0. (See [SPARK-46473](apache/spark#44437).) Signed-off-by: MithunR <[email protected]>
) Signed-off-by: Tim Liu <[email protected]>
…DIA#10839) Demo PR contributing to NVIDIA#10838
It showcases a coding convention to follow, using SortOrder and FilterExec replacements as an example:
```scala
scala> spark.range(100).where($"id" <= 10).collect()
java.lang.RuntimeException: convertToGpu failed
  at scala.sys.package$.error(package.scala:30)
  at com.nvidia.spark.rapids.GpuFilterExecMeta.convertToGpu(basicPhysicalOperators.scala:790)
  at com.nvidia.spark.rapids.GpuFilterExecMeta.convertToGpu(basicPhysicalOperators.scala:783)
  at com.nvidia.spark.rapids.SparkPlanMeta.convertIfNeeded(RapidsMeta.scala:838)
  at com.nvidia.spark.rapids.GpuOverrides$.com$nvidia$spark$rapids$GpuOverrides$$doConvertPlan(GpuOverrides.scala:4383)
  at com.nvidia.spark.rapids.GpuOverrides.applyOverrides(GpuOverrides.scala:4728)
  at com.nvidia.spark.rapids.GpuOverrides.$anonfun$applyWithContext$3(GpuOverrides.scala:4588)
  at com.nvidia.spark.rapids.GpuOverrides$.logDuration(GpuOverrides.scala:455)
  at com.nvidia.spark.rapids.GpuOverrides.$anonfun$applyWithContext$1(GpuOverrides.scala:4585)
  at com.nvidia.spark.rapids.GpuOverrideUtil$.$anonfun$tryOverride$1(GpuOverrides.scala:4551)
  at com.nvidia.spark.rapids.GpuOverrides.applyWithContext(GpuOverrides.scala:4605)
  at com.nvidia.spark.rapids.GpuOverrides.apply(GpuOverrides.scala:4578)
  at com.nvidia.spark.rapids.GpuOverrides.apply(GpuOverrides.scala:4574)
  at org.apache.spark.sql.execution.ApplyColumnarRulesAndInsertTransitions.$anonfun$apply$1(Columnar.scala:532)
```
Signed-off-by: Gera Shegalov <[email protected]>
NVIDIA#10872)
* add plugin link to ignore pattern
* add plugin link to ignore pattern
Signed-off-by: liyuan <[email protected]>
* refine UT framework to promote GPU evaluation
* enable some exprs for json
* exclude flaky tests
* fix review comments
* use vectorized parameter where possible
* add todo for utc issue
Signed-off-by: Hongbin Ma (Mahone) <[email protected]>
…A#10858)
* Add support for multiple filtering keys for subquery broadcast
* Signing off
* Fixed test compilation
Signed-off-by: Raza Jafri <[email protected]>
Signed-off-by: Haoyang Li <[email protected]>
* Disabling the cuDF default pinned pool for 24.06
* Add a warning in case we can't configure the cuDF default pool
Signed-off-by: Alessandro Bellina <[email protected]>
Signed-off-by: Haoyang Li <[email protected]>
…A#10903) Signed-off-by: Haoyang Li <[email protected]>
* Add support for self-contained profiling
* Use Scala regex, add executor-side logging on profile startup/shutdown
* Use reflection to handle potentially missing Hadoop CallerContext
* scala 2.13 fix
Signed-off-by: Jason Lowe <[email protected]>
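The reflection technique mentioned for the potentially missing Hadoop CallerContext can be sketched like this. The helper name is hypothetical and this is not the plugin's actual code; it just shows the pattern of probing for a class by name so code still loads when the class is absent from the classpath.

```scala
import scala.util.Try

// Hedged sketch: probe for a class by name instead of referencing it
// directly, so this code links and runs even when the class is missing.
// The helper name `classAvailable` is hypothetical.
def classAvailable(className: String): Boolean =
  Try(Class.forName(className)).isSuccess

// Only touch Hadoop's CallerContext when it is actually present;
// on classpaths without Hadoop this is simply false.
val hasCallerContext = classAvailable("org.apache.hadoop.ipc.CallerContext")
```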
…t " (NVIDIA#10934)
* Revert "Add Support for Multiple Filtering Keys for Subquery Broadcast (NVIDIA#10858)"
This reverts commit 3001852.
* Signing off
Signed-off-by: Raza Jafri <[email protected]>
NVIDIA#10947) Prevent '^[0-9]{n}' from being processed by `spark_rapids_jni::literal_range_pattern`, which currently only supports "contains", not "starts with". Fixes NVIDIA#10928. Also adds missing tailrec annotations to recursive parser methods. Signed-off-by: Gera Shegalov <[email protected]>
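The distinction this fix relies on can be illustrated with a hedged sketch; the function name and metacharacter list below are illustrative, not the real parser's logic. The idea is that a pattern may only take a literal "contains" fast path when it has no regex metacharacters at all; in particular, a leading '^' means "starts with", which a contains-only fast path cannot honor.

```scala
// Hypothetical guard (not the actual spark-rapids parser): a pattern
// qualifies for the literal-"contains" fast path only if every
// character is a plain literal, i.e. no regex metacharacters appear.
def canUseLiteralContains(pattern: String): Boolean = {
  // common regex metacharacters, including the '^' anchor
  val meta = "^$.|?*+()[]{}\\"
  pattern.forall(c => !meta.contains(c))
}
```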
Signed-off-by: jenkins <jenkins@localhost>
Change version to 24.06.0
Note: merge this PR using "Create a merge commit".