Test ONLY: if modified files #15

Closed
wants to merge 61 commits into from
Conversation

YanxuanLiu
Owner

No description provided.

nvauto and others added 30 commits November 25, 2024 06:15
Keep the rapids JNI and private dependency version at 24.12.0-SNAPSHOT until the nightly CI for the branch-25.02 branch is complete. Track the dependency update process at: NVIDIA#11755

Signed-off-by: nvauto <[email protected]>
[auto-merge] branch-24.12 to branch-25.02 [skip ci] [bot]
[auto-merge] branch-24.12 to branch-25.02 [skip ci] [bot]
[auto-merge] branch-24.12 to branch-25.02 [skip ci] [bot]
[auto-merge] branch-24.12 to branch-25.02 [skip ci] [bot]
* remove excluded release shim and TODO

Signed-off-by: YanxuanLiu <[email protected]>

* remove shim from 2.13 properties

Signed-off-by: YanxuanLiu <[email protected]>

* Fix error: 'NoneType' object has no attribute 'split' for excluded_shims

Signed-off-by: timl <[email protected]>

---------

Signed-off-by: YanxuanLiu <[email protected]>
Signed-off-by: timl <[email protected]>
Co-authored-by: timl <[email protected]>
[auto-merge] branch-24.12 to branch-25.02 [skip ci] [bot]
[auto-merge] branch-24.12 to branch-25.02 [skip ci] [bot]
[auto-merge] branch-24.12 to branch-25.02 [skip ci] [bot]
[auto-merge] branch-24.12 to branch-25.02 [skip ci] [bot]
[auto-merge] branch-24.12 to branch-25.02 [skip ci] [bot]
[auto-merge] branch-24.12 to branch-25.02 [skip ci] [bot]
[auto-merge] branch-24.12 to branch-25.02 [skip ci] [bot]
NVIDIA#11788)

The CI_PART1 job uploads the built Spark Rapids tar file to Databricks DBFS storage.

The CI_PART2 job retrieves the built tar file from DBFS storage and runs integration tests against it.

Signed-off-by: timl <[email protected]>
[auto-merge] branch-24.12 to branch-25.02 [skip ci] [bot]
…ip ci] (NVIDIA#11791)

* replace date with jni&private timestamp for cache key

Signed-off-by: YanxuanLiu <[email protected]>

* use the date if querying the timestamp fails

Signed-off-by: YanxuanLiu <[email protected]>

* add bash script to get timestamp

Signed-off-by: YanxuanLiu <[email protected]>

* replace timestamp with sha1

Signed-off-by: YanxuanLiu <[email protected]>

---------

Signed-off-by: YanxuanLiu <[email protected]>
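The commits above replace the date-based cache key with one derived from the jni/private dependency sha1, keeping the date only as a fallback when the query fails. A minimal sketch of that fallback logic — the `spark-rapids-deps-` key prefix is an assumption for illustration, and `query_sha1` stands in for whatever the real script runs (e.g. a `git ls-remote` wrapper):

```python
from datetime import date

def cache_key(query_sha1) -> str:
    """Build a CI cache key from the jni/private dependency sha1, falling
    back to today's date when the query fails (the fallback behavior the
    commits describe). `query_sha1` is any callable returning the sha1,
    e.g. a wrapper around `git ls-remote`."""
    try:
        suffix = query_sha1()
    except Exception:
        # Querying the sha1 failed; fall back to the (coarser) date key.
        suffix = date.today().isoformat()
    return f"spark-rapids-deps-{suffix}"
```

A sha1-based key only changes when the dependencies actually change, so cache hits survive across days, while the date fallback at worst restores the old daily-key behavior.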
[auto-merge] branch-24.12 to branch-25.02 [skip ci] [bot]
[auto-merge] branch-24.12 to branch-25.02 [skip ci] [bot]
Signed-off-by: Nghia Truong <[email protected]>
Signed-off-by: Nghia Truong <[email protected]>
* Add the 'test_type' parameter for Databricks script

For fixing: NVIDIA#11818

'nightly' is for the nightly CI; 'pre-commit' is for the pre-merge CI.

The pre-merge CI does not need to copy the built Rapids plugin tar from the Databricks cluster back to the local host;

only the nightly build needs to copy spark-rapids-built.tgz back.

Signed-off-by: timl <[email protected]>

* Update copyright

Signed-off-by: timl <[email protected]>

---------

Signed-off-by: timl <[email protected]>
…ace (NVIDIA#11813)

* Support some escape characters in search list when rewriting regexp_replace to string replace

Signed-off-by: Haoyang Li <[email protected]>

* add a case

Signed-off-by: Haoyang Li <[email protected]>

* address comment

Signed-off-by: Haoyang Li <[email protected]>

* update datagen

Signed-off-by: Haoyang Li <[email protected]>

---------

Signed-off-by: Haoyang Li <[email protected]>
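The rewrite above is only safe when the `regexp_replace` search pattern is effectively a literal string, possibly containing escaped metacharacters. As a hedged illustration of the kind of check involved — this is not the plugin's Scala implementation, and the accepted escape set here is an assumption:

```python
def literal_search_string(pattern: str):
    """If `pattern` is a plain literal (allowing simple escapes such as
    \\( or \\.), return the unescaped string it matches; otherwise return
    None, meaning the pattern is a real regex and cannot be rewritten as a
    plain string replace."""
    specials = set(".^$*+?()[]{}|")
    out = []
    i = 0
    while i < len(pattern):
        c = pattern[i]
        if c == "\\":
            if i + 1 >= len(pattern):
                return None  # trailing backslash: malformed
            nxt = pattern[i + 1]
            if nxt in specials or nxt == "\\":
                out.append(nxt)  # escaped metacharacter -> literal char
                i += 2
                continue
            return None  # escapes like \d are real regex constructs
        if c in specials:
            return None  # unescaped metacharacter: not a literal
        out.append(c)
        i += 1
    return "".join(out)
```

When the check succeeds, `regexp_replace(col, pattern, repl)` can be executed as a cheaper plain string replace on the unescaped literal.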
Signed-off-by: Nghia Truong <[email protected]>
* Fix TrafficController numTasks check

Signed-off-by: Jihoon Son <[email protected]>

* rename weights properly

* simplify the loop condition

* Rename the condition variable for readability

Co-authored-by: Gera Shegalov <[email protected]>

* missing renames

* add test for when all tasks are big

---------

Signed-off-by: Jihoon Son <[email protected]>
Co-authored-by: Gera Shegalov <[email protected]>
* Add support for kudo write metrics

* Refactor

Signed-off-by: liurenjie1024 <[email protected]>

* Address comments

* Resolve comments

* Fix compiler

* Fix build break

* Fix build break

* Fix build break

* Fix build break

---------

Signed-off-by: liurenjie1024 <[email protected]>
…IA#11826)

* Balance the pre-merge CI job's time for the ci_1 and ci_2 tests

To fix: NVIDIA#11825

The pre-merge CI job is divided into CI_1 (mvn_verify) and CI_2.

We run these two parts in parallel to speed up the pre-merge CI.

Currently, CI_1 takes about 2 hours, while CI_2 takes approximately 4 hours.

Mark some tests as CI_1 to balance the time between CI_1 and CI_2.

After remarking the tests, both the CI_1 and CI_2 jobs should finish in about 3 hours.

Signed-off-by: timl <[email protected]>

* Separate the pre-merge CI job into two parts

To balance the duration, separate the pre-merge CI job into two parts:
    premergeUT1 (2 shims' UTs + 1/3 of the integration tests)
    premergeUT2 (1 shim's UTs + 2/3 of the integration tests)

Signed-off-by: timl <[email protected]>

---------

Signed-off-by: timl <[email protected]>
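One common way to get the deterministic 1/3 vs 2/3 split described above is to bucket tests by a stable hash of their name; this is an illustrative sketch of the general technique, not the project's actual mechanism (which marks tests explicitly):

```python
import hashlib

def premerge_bucket(test_name: str) -> str:
    """Assign a test to premergeUT1 (~1/3 of tests) or premergeUT2 (~2/3)
    using a stable hash, so every CI run produces the same partition."""
    h = int(hashlib.sha1(test_name.encode()).hexdigest(), 16)
    return "premergeUT1" if h % 3 == 0 else "premergeUT2"
```

A hash-based split stays stable as tests are added or removed, though unlike explicit marking it balances only test *count*, not runtime.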
revans2 and others added 29 commits December 6, 2024 14:48
Signed-off-by: Nghia Truong <[email protected]>
Signed-off-by: Nghia Truong <[email protected]>
Signed-off-by: Nghia Truong <[email protected]>
[auto-merge] branch-24.12 to branch-25.02 [skip ci] [bot]
* Some minor improvements identified during benchmark

Signed-off-by: liurenjie1024 <[email protected]>

* Fix late initialization

---------

Signed-off-by: liurenjie1024 <[email protected]>
Signed-off-by: Nghia Truong <[email protected]>
Signed-off-by: Nghia Truong <[email protected]>
[auto-merge] branch-24.12 to branch-25.02 [skip ci] [bot]
* Optimize Databricks Jenkins scripts

Remove duplicate try/catch/container script blocks

Move default Databricks parameters into the common Groovy library

Signed-off-by: timl <[email protected]>

* Fix merge conflict

Fix merge conflict with https://github.com/NVIDIA/spark-rapids/pull/11819/files#diff-6c8e5cceR72

Signed-off-by: Tim Liu <[email protected]>

---------

Signed-off-by: timl <[email protected]>
Signed-off-by: Tim Liu <[email protected]>
* correct arg of get_buildvers.py

Signed-off-by: YanxuanLiu <[email protected]>

* output fail info

Signed-off-by: YanxuanLiu <[email protected]>

* fail the script when errors occur

Signed-off-by: YanxuanLiu <[email protected]>

* test error

Signed-off-by: YanxuanLiu <[email protected]>

* test error

Signed-off-by: YanxuanLiu <[email protected]>

* split the command to avoid masking errors

Signed-off-by: YanxuanLiu <[email protected]>

---------

Signed-off-by: YanxuanLiu <[email protected]>
[auto-merge] branch-24.12 to branch-25.02 [skip ci] [bot]
[auto-merge] branch-24.12 to branch-25.02 [skip ci] [bot]
Signed-off-by: Nghia Truong <[email protected]>
Signed-off-by: Nghia Truong <[email protected]>
This PR addresses some issues found during my local split-retry triage, to improve stability.

It includes:

- replacing map with safeMap for the conversion between Table and ColumnarBatch.
- reducing GPU peak memory by closing unnecessary batches as soon as possible in the Generate exec.
- adding retry support for the Table splitting operation in GPU write.
- eliminating a potential memory leak in the BroadcastNestedLoop join.

The existing tests should already cover these changes.

Signed-off-by: Firestarman <[email protected]>
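The "retry support for Table splitting" item follows the plugin's general split-retry pattern: attempt an operation, and on a memory failure split the input and retry each half. A language-agnostic sketch of that idea — the exception type, split strategy, and depth limit here are illustrative, not the plugin's Scala API:

```python
def with_retry_split(batch, attempt, split, max_depth=3):
    """Run `attempt(batch)`; on MemoryError, split the batch in two and
    retry each half recursively, up to `max_depth` levels of splitting.
    Returns the list of per-piece results in order."""
    try:
        return [attempt(batch)]
    except MemoryError:
        if max_depth == 0:
            raise  # cannot split further; propagate the failure
        left, right = split(batch)
        results = []
        for part in (left, right):
            results.extend(with_retry_split(part, attempt, split, max_depth - 1))
        return results
```

The key property is that a transient allocation failure degrades into smaller work units instead of failing the task outright, at the cost of producing more, smaller output pieces.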
… by CI_PART1 [databricks] (NVIDIA#11840)

* Support running Databricks CI_PART2 integration tests with JARs built by CI_PART1

To fix: NVIDIA#11838

The CI_PART1 job uploads the built Spark Rapids tar file to Databricks DBFS storage.

The CI_PART2 job retrieves the built tar file from DBFS storage and runs integration tests against it.

The CI_PART2 job then doesn't need to duplicate the build of the Spark Rapids jars, saving about 1 hour of Databricks time.

Signed-off-by: timl <[email protected]>

* Check rapids plugin built tar in Databricks Jenkinsfile

Signed-off-by: Tim Liu <[email protected]>

* Check that the comma-separated files exist in the Databricks DBFS path within the timeout (in minutes)

Signed-off-by: Tim Liu <[email protected]>

* CI_PART2 builds the plugin jars itself after the timeout

Signed-off-by: Tim Liu <[email protected]>

* Let CI2 do the eventual cleanup

Signed-off-by: Tim Liu <[email protected]>

---------

Signed-off-by: timl <[email protected]>
Signed-off-by: Tim Liu <[email protected]>
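The "check that the comma-separated files exist within the timeout" step amounts to a polling loop. A hedged sketch with the existence check injected — in the real Jenkins job the check would wrap a Databricks CLI call (an assumption; the exact invocation is not shown in this PR), and the clock/sleep hooks exist only to make the sketch testable:

```python
import time

def wait_for_files(paths_csv, exists, timeout_minutes=30, poll_seconds=60,
                   clock=time.monotonic, sleep=time.sleep):
    """Poll until every comma-separated path satisfies `exists`, or raise
    TimeoutError after `timeout_minutes`. `exists(path)` should return True
    once the file is visible in DBFS."""
    paths = [p.strip() for p in paths_csv.split(",") if p.strip()]
    deadline = clock() + timeout_minutes * 60
    while True:
        missing = [p for p in paths if not exists(p)]
        if not missing:
            return
        if clock() >= deadline:
            raise TimeoutError(f"still missing after timeout: {missing}")
        sleep(poll_seconds)
```

On timeout the commits above have CI_PART2 fall back to building the plugin jars itself, so the TimeoutError here would be caught rather than failing the job.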
[auto-merge] branch-24.12 to branch-25.02 [skip ci] [bot]
Signed-off-by: Nghia Truong <[email protected]>
This commit documents the serialization format checks for writing
Hive text, and why they differ from the read side.

`spark-rapids` supports only '^A'-separated Hive text files for read
and write. This format tends to be denoted in a Hive table's Storage
Properties with `serialization.format=1`.

If a Hive table is written with a different/custom delimiter, it is
denoted with a different value of `serialization.format`.  For instance,
a CSV table might be denoted by `serialization.format='',
field.delim=','`.

It was noticed in NVIDIA#11803
that:
1. On the [read
   side](https://github.com/NVIDIA/spark-rapids/blob/aa2da410511d8a737e207257769ec662a79174fe/sql-plugin/src/main/scala/org/apache/spark/sql/hive/rapids/HiveProviderImpl.scala#L155-L161), `spark-rapids` treats an empty `serialization.format` as `''`.
2. On the [write
   side](https://github.com/NVIDIA/spark-rapids/blob/aa2da410511d8a737e207257769ec662a79174fe/sql-plugin/src/main/scala/org/apache/spark/sql/hive/rapids/GpuHiveFileFormat.scala#L130-L136),
an empty `serialization.format` is seen as `1`.

The reason for the read side value is to be conservative.  Since the
table is pre-existing, its value should have been set already.

The reason for the write side is that there are legitimate cases where a
table might not have its `serialization.format` set.  (CTAS, for one.)

This commit documents all the scenarios that need to be considered on
the write side.

Signed-off-by: MithunR <[email protected]>
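The read/write asymmetry described above boils down to two defaulting rules for `serialization.format`. This tiny helper restates them for clarity; the function and its return convention are illustrative only, not the plugin's Scala code:

```python
def effective_serialization_format(storage_props, side):
    """Return the serialization.format value spark-rapids assumes.

    Read side: an absent/empty value is taken literally as '' (and thus
    rejected as unsupported), since a pre-existing table should already
    have the value set.
    Write side: an absent/empty value defaults to '1' ('^A'-separated),
    because legitimate cases such as CTAS may not have set it yet.
    """
    value = storage_props.get("serialization.format", "")
    if value:
        return value
    return "" if side == "read" else "1"
```

So a table with no storage properties at all is treated as unsupported on read but as standard '^A'-separated Hive text on write, which is exactly the discrepancy NVIDIA#11803 observed.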
Signed-off-by: Nghia Truong <[email protected]>
Signed-off-by: Nghia Truong <[email protected]>
Signed-off-by: Nghia Truong <[email protected]>
@YanxuanLiu YanxuanLiu closed this Jan 3, 2025