Change dependency: jni 24.04.0 private 24.04.0 #34

Closed
wants to merge 98 commits into from


NvTimLiu
Owner

Wait for the pre-merge CI job to succeed

nvauto and others added 30 commits January 24, 2024 17:20
[auto-merge] branch-24.02 to branch-24.04 [skip ci] [bot]
[auto-merge] branch-24.02 to branch-24.04 [skip ci] [bot]
[auto-merge] branch-24.02 to branch-24.04 [skip ci] [bot]
[auto-merge] branch-24.02 to branch-24.04 [skip ci] [bot]
[auto-merge] branch-24.02 to branch-24.04 [skip ci] [bot]
[auto-merge] branch-24.02 to branch-24.04 [skip ci] [bot]
[auto-merge] branch-24.02 to branch-24.04 [skip ci] [bot]
[auto-merge] branch-24.02 to branch-24.04 [skip ci] [bot]
[auto-merge] branch-24.02 to branch-24.04 [skip ci] [bot]
[auto-merge] branch-24.02 to branch-24.04 [skip ci] [bot]
[auto-merge] branch-24.02 to branch-24.04 [skip ci] [bot]
[auto-merge] branch-24.02 to branch-24.04 [skip ci] [bot]
[auto-merge] branch-24.02 to branch-24.04 [skip ci] [bot]
[auto-merge] branch-24.02 to branch-24.04 [skip ci] [bot]
[auto-merge] branch-24.02 to branch-24.04 [skip ci] [bot]
[auto-merge] branch-24.02 to branch-24.04 [skip ci] [bot]
[auto-merge] branch-24.02 to branch-24.04 [skip ci] [bot]
[auto-merge] branch-24.02 to branch-24.04 [skip ci] [bot]
[auto-merge] branch-24.02 to branch-24.04 [skip ci] [bot]
[auto-merge] branch-24.02 to branch-24.04 [skip ci] [bot]
[auto-merge] branch-24.02 to branch-24.04 [skip ci] [bot]
[auto-merge] branch-24.02 to branch-24.04 [skip ci] [bot]
To fix: NVIDIA#10256

Bump up dependency version to 24.04.0-SNAPSHOT

Signed-off-by: Tim Liu <[email protected]>
[auto-merge] branch-24.02 to branch-24.04 [skip ci] [bot]
[auto-merge] branch-24.02 to branch-24.04 [skip ci] [bot]
[auto-merge] branch-24.02 to branch-24.04 [skip ci] [bot]
[auto-merge] branch-24.02 to branch-24.04 [skip ci] [bot]
razajafri and others added 28 commits February 26, 2024 09:00
* Remove support for Databricks 10.4

* Drop Databricks 10.4 runtime from CI

Signed-off-by: Tim Liu <[email protected]>

* Update copyright 2024

Signed-off-by: Tim Liu <[email protected]>

* updated 2.13 pom.xml

* Signing off

Signed-off-by: Raza Jafri <[email protected]>

* Removed all non-scala references to 321db

* updated copyrights

---------

Signed-off-by: Tim Liu <[email protected]>
Signed-off-by: Raza Jafri <[email protected]>
Co-authored-by: Tim Liu <[email protected]>
…0497)

Check both column extrema to detect multiplication overflow

Fixes NVIDIA#10431
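
A minimal sketch of why both extrema matter, assuming 64-bit signed integer columns whose per-column minimum and maximum are already known. The function name `may_overflow` and the constants are hypothetical and are not taken from the plugin. For values in `[lhs_min, lhs_max]` and `[rhs_min, rhs_max]`, the smallest and largest products are always among the four corner products, so looking only at the maxima can miss an overflow caused by a large negative value.

```python
INT64_MIN, INT64_MAX = -2**63, 2**63 - 1

def may_overflow(lhs_min, lhs_max, rhs_min, rhs_max):
    """Return True if some product lhs * rhs could leave the int64 range.

    Hypothetical helper: Python integers are arbitrary precision, so the
    four corner products can be computed exactly and compared to the bounds.
    """
    corners = [l * r for l in (lhs_min, lhs_max) for r in (rhs_min, rhs_max)]
    return min(corners) < INT64_MIN or max(corners) > INT64_MAX

# The maxima alone look harmless (1 * 4), but the minimum of the left
# column multiplied by the maximum of the right column underflows int64.
print(may_overflow(-2**62, 1, 1, 4))
print(may_overflow(-10, 10, -10, 10))
```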

Signed-off-by: Gera Shegalov <[email protected]>
…#10466)

* remove leading space for json path in GetJsonObject

Signed-off-by: Haoyang Li <[email protected]>

* Update comments

Signed-off-by: Haoyang Li <[email protected]>

* Use JsonPathParser to normalize path

Signed-off-by: Haoyang Li <[email protected]>

* Update compatibility doc

Signed-off-by: Haoyang Li <[email protected]>

* clean up

Signed-off-by: Haoyang Li <[email protected]>

* Fallback json paths containing  in GetJsonObject

Signed-off-by: Haoyang Li <[email protected]>

* cache normalizeJsonPath and prevent memory leak

Signed-off-by: Haoyang Li <[email protected]>

* clean up

Signed-off-by: Haoyang Li <[email protected]>

* ready to merge

Signed-off-by: Haoyang Li <[email protected]>

* Use parser to check whether to fallback

Signed-off-by: Haoyang Li <[email protected]>

* Add a special case

Signed-off-by: Haoyang Li <[email protected]>

---------

Signed-off-by: Haoyang Li <[email protected]>
[auto-merge] branch-24.02 to branch-24.04 [skip ci] [bot]
* Move 351 shims into noSnapshot buildvers

Move 351 shims into noSnapshot buildvers, as Spark has released it.

Follow up of NVIDIA#10465 (comment)

Signed-off-by: Tim Liu <[email protected]>

* 351 shim for scala 2.13

Signed-off-by: Tim Liu <[email protected]>

---------

Signed-off-by: Tim Liu <[email protected]>
…0500)

Fixes NVIDIA#8208.

This commit adds support for `WindowGroupLimitExec` to run on GPU.  This optimization was added in Apache Spark 3.5, to reduce the number of rows that participate in shuffles, for queries that contain filters on the result of ranking functions. For example:

```sql
SELECT foo, bar FROM (
  SELECT foo, bar, 
         RANK() OVER (PARTITION BY foo ORDER BY bar) AS rnk
  FROM mytable )
WHERE rnk < 10
```

Such a query requires a shuffle so that all rows of a window group are available in the same task.
In Spark 3.5, an optimization was added in [SPARK-37099](https://issues.apache.org/jira/browse/SPARK-37099) to take advantage of the `rnk < 10` predicate to reduce shuffle load.
Specifically, since only 9 (i.e. 10-1) ranks can satisfy the predicate, only rows with those ranks need to be shuffled into the task, per input batch.  By pre-filtering rows that can't possibly satisfy the condition, the number of shuffled records can be reduced.

The GPU implementation (i.e. `GpuWindowGroupLimitExec`) differs slightly from the CPU implementation, because it needs to execute on the entire input column batch.  As a result, `GpuWindowGroupLimitExec` runs the rank scan on each input batch, and then filters out ranks that exceed the limit specified in the predicate (`rnk < 10`). After the shuffle, the `RANK()` is calculated again by `GpuRunningWindowExec`, to produce the final result.

The current implementation addresses the `RANK()` and `DENSE_RANK()` window functions.  Other ranking functions (like `ROW_NUMBER()`) can be added at a later date.
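
The two-phase idea can be sketched in plain Python, assuming rows are `(foo, bar)` tuples and each input batch is a list. The helpers `rank_rows`, `group_limit`, and `query` are hypothetical names for illustration only and do not mirror the plugin's Scala code. A row's rank within one batch can never exceed its rank over all batches, so dropping rows whose in-batch rank already fails the predicate does not change the final result.

```python
from itertools import groupby
from operator import itemgetter

def rank_rows(rows):
    """Return (foo, bar, rank) with RANK() OVER (PARTITION BY foo ORDER BY bar)."""
    out = []
    ordered = sorted(rows, key=itemgetter(0, 1))
    for _, group in groupby(ordered, key=itemgetter(0)):
        rank, prev = 0, object()  # sentinel that never equals a real value
        for pos, (foo, bar) in enumerate(group, start=1):
            if bar != prev:       # ties keep the rank of the first equal row
                rank, prev = pos, bar
            out.append((foo, bar, rank))
    return out

def group_limit(batch, limit):
    """Pre-filter one batch: keep only rows whose in-batch rank is below limit."""
    return [(foo, bar) for foo, bar, rnk in rank_rows(batch) if rnk < limit]

def query(batches, limit, prefilter):
    """Gather rows from all batches (the 'shuffle'), then rank again and filter."""
    shuffled = []
    for batch in batches:
        shuffled.extend(group_limit(batch, limit) if prefilter else batch)
    return [(foo, bar) for foo, bar, rnk in rank_rows(shuffled) if rnk < limit]

batches = [
    [("a", 1), ("a", 2), ("a", 3), ("a", 4), ("b", 9)],
    [("a", 0), ("a", 2), ("b", 7), ("b", 8), ("b", 9)],
]
print(query(batches, 3, prefilter=True) == query(batches, 3, prefilter=False))
print(sum(len(group_limit(b, 3)) for b in batches), "of", sum(map(len, batches)))
```

In this toy input, the pre-filter passes 7 of the 10 rows to the shuffle step while the final answer is unchanged.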

Signed-off-by: MithunR <[email protected]>
This PR adds a new metric for the preprojection in GpuExpand.

Signed-off-by: Firestarman <[email protected]>
* Update rapids jni and private dependency version to 24.02.1 (NVIDIA#10511)

Signed-off-by: Tim Liu <[email protected]>

* Add missed shims for scala2.13 (NVIDIA#10465)

* Add missed shims for scala2.13

Signed-off-by: Tim Liu <[email protected]>

* Add 351 snapshot shim for the scala2.13 version of plugin jar

Signed-off-by: Tim Liu <[email protected]>

* Remove 351 snapshot shim as spark 3.5.1 has been released

Signed-off-by: Tim Liu <[email protected]>

* Remove scala2.13 351 snapshot shim

Signed-off-by: Tim Liu <[email protected]>

* Remove 351 shim's json string

Ran `mvn generate-sources -Dshimplify=true -Dshimplify.move=true -Dshimplify.remove.shim=351`

to remove the 351 shim's json string and fix some unnecessary empty lines that were introduced

Signed-off-by: Tim Liu <[email protected]>

* Update Copyright 2024

Copyright headers were updated automatically with the script below:
```
export SPARK_RAPIDS_AUTO_COPYRIGHTER=ON

./scripts/auto-copyrighter.sh $(git diff --name-only origin/branch-24.04..HEAD)
```

Signed-off-by: Tim Liu <[email protected]>

* Revert "Update Copyright 2024"

This reverts commit 8482847.

* Revert "Remove 351 shim's json string"

This reverts commit 78d1f00.

* skip 351 from strict checking

* Align scala2.13/pom.xml with the scala2.12 one

Run the script `bash build/make-scala-version-build-files.sh 2.13`

Signed-off-by: Tim Liu <[email protected]>

* pretend 351 is a snapshot in 24.02

Signed-off-by: Gera Shegalov <[email protected]>

* pretend 351 is a SNAPSHOT version

* Revert change of build/shimplify.py

Signed-off-by: Tim Liu <[email protected]>

---------

Signed-off-by: Tim Liu <[email protected]>
Signed-off-by: Gera Shegalov <[email protected]>
Co-authored-by: Raza Jafri <[email protected]>
Co-authored-by: Gera Shegalov <[email protected]>

* Update changelog for v24.02.0 release (NVIDIA#10525)

Signed-off-by: Tim Liu <[email protected]>

---------

Signed-off-by: Tim Liu <[email protected]>
Signed-off-by: Gera Shegalov <[email protected]>
Co-authored-by: Raza Jafri <[email protected]>
Co-authored-by: Gera Shegalov <[email protected]>
Fix merge conflict from branch-24.02
Update to latest branch-24.02 [skip ci]
[auto-merge] branch-24.02 to branch-24.04 [skip ci] [bot]
[auto-merge] branch-24.02 to branch-24.04 [skip ci] [bot]
* Distinct inner join

Signed-off-by: Jason Lowe <[email protected]>

* Distinct left join

Signed-off-by: Jason Lowe <[email protected]>

* Update to new API

* Fix test

---------

Signed-off-by: Jason Lowe <[email protected]>
…ment (NVIDIA#10564)

* WIP

Signed-off-by: Gera Shegalov <[email protected]>

* WIP

Signed-off-by: Gera Shegalov <[email protected]>

* Enable specifying the pytest using file_or_dir args

```bash
TEST_PARALLEL=0 \
SPARK_HOME=~/dist/spark-3.1.1-bin-hadoop3.2 \
TEST_FILE_OR_DIR=~/gits/NVIDIA/spark-rapids/integration_tests/src/main/python/arithmetic_ops_test.py::test_addition  \
./integration_tests/run_pyspark_from_build.sh --collect-only

<Module src/main/python/arithmetic_ops_test.py>
  <Function test_addition[Byte]>
  <Function test_addition[Short]>
  <Function test_addition[Integer]>
  <Function test_addition[Long]>
  <Function test_addition[Float]>
  <Function test_addition[Double]>
  <Function test_addition[Decimal(7,3)]>
  <Function test_addition[Decimal(12,2)]>
  <Function test_addition[Decimal(18,0)]>
  <Function test_addition[Decimal(20,2)]>
  <Function test_addition[Decimal(30,2)]>
  <Function test_addition[Decimal(36,5)]>
  <Function test_addition[Decimal(38,10)]>
  <Function test_addition[Decimal(38,0)]>
  <Function test_addition[Decimal(7,7)]>
  <Function test_addition[Decimal(7,-3)]>
  <Function test_addition[Decimal(36,-5)]>
  <Function test_addition[Decimal(38,-10)]>
```

Signed-off-by: Gera Shegalov <[email protected]>
Co-authored-by: Raza Jafri <[email protected]>

* Changing to TESTS=module::method

Signed-off-by: Gera Shegalov <[email protected]>

---------

Signed-off-by: Gera Shegalov <[email protected]>
Co-authored-by: Raza Jafri <[email protected]>
Wait for the pre-merge CI job to succeed

Signed-off-by: Tim Liu <[email protected]>
@NvTimLiu
Owner Author

build

@NvTimLiu NvTimLiu closed this Mar 11, 2024