
Enable implicit usage of JDK profiles based on Maven's host JDK #4

Closed
wants to merge 18 commits

Conversation

gerashegalov
Owner

The current architecture triggers a particular Spark shim profile via a buildver property, and such profiles are present in child poms as well. This makes implicit (host-JDK-based) activation of JDK profiles and the default release311 base Spark version profile mutually exclusive. Developers routinely forget to activate the right JDK profile explicitly via -PjdkXY and then hit counter-intuitive failures far removed from the root cause.

This PR works around the issue by introducing an intermediate pom, jdk-profiles.

This PR is stacked on top of NVIDIA#9508
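A minimal sketch of the idea (coordinates and property values below are illustrative placeholders, not the actual poms in this PR): the intermediate parent carries only JDK-activated profiles, so they no longer compete with the buildver-activated shim profiles declared in child poms.

```xml
<!-- Hypothetical jdk-profiles/pom.xml sketch: JDK-based activation lives in
     its own intermediate parent, separate from buildver-activated profiles. -->
<project>
  <modelVersion>4.0.0</modelVersion>
  <groupId>com.example</groupId>          <!-- placeholder coordinates -->
  <artifactId>jdk-profiles</artifactId>
  <version>1.0.0</version>
  <packaging>pom</packaging>

  <profiles>
    <profile>
      <id>jdk17</id>
      <activation>
        <jdk>[17,)</jdk>   <!-- activates implicitly from Maven's host JDK -->
      </activation>
      <properties>
        <maven.compiler.release>17</maven.compiler.release>
      </properties>
    </profile>
  </profiles>
</project>
```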

Signed-off-by: @gerashegalov

@gerashegalov gerashegalov self-assigned this Oct 28, 2023
@gerashegalov gerashegalov changed the base branch from shimDepsSwitcherParent to sparkShellSmokeTest October 28, 2023 06:04
@gerashegalov gerashegalov changed the base branch from sparkShellSmokeTest to shimDepsSwitcherParent October 28, 2023 06:04
@gerashegalov gerashegalov marked this pull request as draft October 29, 2023 04:12
gerashegalov and others added 12 commits October 29, 2023 13:10
Signed-off-by: Gera Shegalov <[email protected]>
fixes NVIDIA#9480

This PR adds support for launching Map Pandas UDFs on empty partitions, to align with Spark's behavior.

So far I have not seen other types of Pandas UDFs being called for empty partitions.

The test is copied from the example in the linked issue.

---------

Signed-off-by: Firestarman <[email protected]>
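A pure-Python sketch of the behavior this commit aligns with (no PySpark; `map_partition` and `my_udf` are stand-ins I made up, not the plugin's API): a map-style Pandas UDF should still be invoked when a partition is empty, receiving an iterator that yields no batches.

```python
# Simplified stand-in for Spark's mapInPandas contract (assumed here):
# the UDF always runs, even on an empty partition, and may emit rows or
# an empty result either way.
def map_partition(rows, udf):
    # Always call the UDF, even when `rows` is empty.
    return list(udf(iter(rows)))

def my_udf(batches):
    saw_input = False
    for batch in batches:
        saw_input = True
        yield batch
    if not saw_input:
        # Still called on an empty partition; here it emits nothing.
        return

print(map_partition([], my_udf))        # → []
print(map_partition([[1, 2]], my_udf))  # → [[1, 2]]
```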
* Enforce Apache Spark 3.3.0+ for Scala 2.13

Fixes NVIDIA#9563

Signed-off-by: Gera Shegalov <[email protected]>

* Fix comment

Signed-off-by: Gera Shegalov <[email protected]>

---------

Signed-off-by: Gera Shegalov <[email protected]>
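One way to enforce such a version floor in stock Maven (a sketch of the general technique, not necessarily the mechanism this commit uses; property names and the regex are illustrative) is a `requireProperty` rule in the maven-enforcer-plugin:

```xml
<!-- Hypothetical sketch: fail fast when a Scala 2.13 build targets a Spark
     version below 3.3.0. Property and execution ids are illustrative. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-enforcer-plugin</artifactId>
  <executions>
    <execution>
      <id>enforce-spark-330-for-scala213</id>
      <goals><goal>enforce</goal></goals>
      <configuration>
        <rules>
          <requireProperty>
            <property>spark.version</property>
            <regex>3\.([3-9]|\d{2,})\..*</regex>
            <regexMessage>Scala 2.13 builds require Apache Spark 3.3.0+</regexMessage>
          </requireProperty>
        </rules>
      </configuration>
    </execution>
  </executions>
</plugin>
```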
Correct two instances of incomplete `#endif` comments that are not properly [replaced](https://github.com/NVIDIA/spark-rapids/blob/2cc202ad8e226823e6e5f448878732601d7c6698/build/make-scala-version-build-files.sh#L76) when generating the scala2.13 pom. This does not break the build, because the broken comment merely becomes an ignored text fragment.

Signed-off-by: Gera Shegalov <[email protected]>
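To illustrate the failure mode (a simplified stand-in; the real substitution lives in make-scala-version-build-files.sh and its marker patterns may differ): a marker comment missing its version suffix has no token for the substitution to rewrite, so it survives as inert text in the generated pom.

```shell
# Simplified stand-in for the pom version substitution: rewrite fully formed
# scala-2.12 markers to scala-2.13. The incomplete "#endif" comment on the
# last line has no version token, so it passes through unchanged (and Maven
# treats it as an ignored comment fragment rather than a build error).
printf '%s\n' \
  '<!-- #if scala-2.12 -->' \
  '<!-- #endif scala-2.12 -->' \
  '<!-- #endif -->' \
  | sed 's/scala-2\.12/scala-2.13/g'
```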
Fixes NVIDIA#9569.

NVIDIA#9489 added `NTH_VALUE()` tests with the `IGNORE NULLS` option, but mistakenly
enabled `IGNORE NULLS` for Spark versions prior to `3.2.1`.

This commit restricts tests for `IGNORE NULLS` to only Spark versions
exceeding `3.1.x`, where the feature is available.

Signed-off-by: MithunR <[email protected]>
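The gating logic can be sketched as a simple version comparison (a hypothetical helper; the real test suite's version predicates may be shaped differently — the 3.2.1 cutoff is taken from the commit message above):

```python
# Hypothetical sketch: only exercise IGNORE NULLS for NTH_VALUE on Spark
# versions where the feature exists (3.2.1 and later, per the fix above).
def supports_ignore_nulls(spark_version: str) -> bool:
    parts = tuple(int(p) for p in spark_version.split("."))
    return parts >= (3, 2, 1)

print(supports_ignore_nulls("3.2.1"))  # → True
print(supports_ignore_nulls("3.1.3"))  # → False
```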
…issue (NVIDIA#9576)

* Convert the Stack classes into wrapper classes around the original Scala 2.12/2.13 Stack classes to handle build issues

Signed-off-by: Navin Kumar <[email protected]>

* Make these classes subclass scala.Proxy

Signed-off-by: Navin Kumar <[email protected]>

* Clean this up

Signed-off-by: Navin Kumar <[email protected]>

* Fix bug in premerge build

Signed-off-by: Navin Kumar <[email protected]>

---------

Signed-off-by: Navin Kumar <[email protected]>
* Delta Lake 2.3.0 support

Signed-off-by: Jason Lowe <[email protected]>

* Make merge UDFs consistent with Delta Lake 2.4

---------

Signed-off-by: Jason Lowe <[email protected]>
…(NVIDIA#9508)

Factor out dependency switching profiles for different Spark builds into a single intermediate parent pom

Fixes NVIDIA#9552 

Signed-off-by: Gera Shegalov <[email protected]>
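A rough sketch of what a child module looks like after this refactoring (coordinates and module names besides the idea of a shim-deps switcher parent are illustrative placeholders): the shared dependency-switching profiles live once in the intermediate parent, and children only inherit.

```xml
<!-- Hypothetical child pom sketch: the Spark dependency-switching profiles
     are inherited from one intermediate parent instead of being duplicated
     per module. Coordinates are placeholders. -->
<project>
  <modelVersion>4.0.0</modelVersion>
  <parent>
    <groupId>com.example</groupId>
    <artifactId>shim-deps-parent</artifactId>
    <version>1.0.0</version>
  </parent>
  <artifactId>some-child-module</artifactId>
  <!-- No per-module copies of the buildver-switched dependency profiles. -->
</project>
```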
Signed-off-by: Gera Shegalov <[email protected]>