
Fixed some of the failing parquet_tests [databricks] #11429

Merged: 4 commits into NVIDIA:branch-24.10 on Sep 10, 2024

Conversation

razajafri (Collaborator):

This PR contributes towards fixing #11024

@@ -35,15 +35,19 @@ def read_parquet_df(data_path):
def read_parquet_sql(data_path):
return lambda spark : spark.sql('select * from parquet.`{}`'.format(data_path))

datetimeRebaseModeInWriteKey = 'spark.sql.legacy.parquet.datetimeRebaseModeInWrite' if is_before_spark_400() else 'spark.sql.parquet.datetimeRebaseModeInWrite'
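The version-gated key selection in the diff above can be sketched as follows (a minimal sketch; `pick_rebase_key` and the version-tuple argument are hypothetical stand-ins for the plugin's `is_before_spark_400()` helper):

```python
def pick_rebase_key(spark_version):
    """Return the datetime rebase write config key for a Spark version tuple.

    Per the diff above, the tests use the 'legacy'-prefixed key before
    Spark 4.0.0 and the non-legacy key from 4.0.0 on.
    """
    if spark_version < (4, 0, 0):
        return 'spark.sql.legacy.parquet.datetimeRebaseModeInWrite'
    return 'spark.sql.parquet.datetimeRebaseModeInWrite'
```

For example, `pick_rebase_key((3, 5, 0))` yields the legacy key, while `pick_rebase_key((4, 0, 0))` yields the non-legacy one.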
Collaborator:

All of the non-legacy versions of these configs appear to have been added in 3.0.0. Is there a reason we are not just switching over to using them instead?

@@ -47,7 +47,7 @@ case class GpuBatchScanExec(
@transient override lazy val batch: Batch = if (scan == null) null else scan.toBatch
// TODO: unify the equal/hashCode implementation for all data source v2 query plans.
override def equals(other: Any): Boolean = other match {
-    case other: GpuBatchScanExec =>
+    case other: BatchScanExec =>
Collaborator:

Why is this being changed? This is a GpuBatchScanExec. We don't want to be equal to non-GPU versions do we?

Collaborator (author):

Right. While debugging I wasn't sure what was causing the failure, and after looking at the 330 shim I changed this and didn't change it back before submitting this PR.

I am adding that change to this PR as well.
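The equality concern discussed above can be illustrated with a hedged Python sketch (`BatchScan` and `GpuBatchScan` are simplified stand-ins for the Spark classes, not the real implementations):

```python
class BatchScan:
    def __init__(self, scan):
        self.scan = scan

    def __eq__(self, other):
        # Matching on the base type lets a GPU scan compare equal to a CPU scan.
        return isinstance(other, BatchScan) and self.scan == other.scan


class GpuBatchScan(BatchScan):
    def __eq__(self, other):
        # Reverting the change restores a strict match on the GPU type only,
        # so a GPU scan never compares equal to a plain CPU scan.
        return isinstance(other, GpuBatchScan) and self.scan == other.scan
```

With the strict override, two GPU scans over the same underlying scan are equal, but a GPU scan and a CPU scan are not.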

@razajafri razajafri requested a review from revans2 September 6, 2024 16:06
@razajafri (Collaborator, author):

build

@razajafri (Collaborator, author):

The failure in CI seems unrelated:

 - extract mortgage data *** FAILED ***

[2024-09-06T17:53:06.482Z]   0 did not equal 10000 (MortgageSparkSuite.scala:65)

The test only reads a CSV, sorts it, and does a row count; it passes locally.

@razajafri (Collaborator, author):

build

@jlowe (Member) commented Sep 10, 2024:

> The failure in CI seems unrelated

This is tracked in #11436, test was temporarily disabled in #11451.

@razajafri (Collaborator, author):

We currently do not have sufficient g5.4xlarge capacity in the Availability Zone you requested (us-west-2c). Our system will be working on provisioning additional capacity. You can currently get g5.4xlarge capacity by not specifying an Availability Zone in your request or choosing us-west-2a, us-west-2b.

@razajafri (Collaborator, author):

build

@razajafri razajafri merged commit 502f5a3 into NVIDIA:branch-24.10 Sep 10, 2024
45 checks passed
@razajafri razajafri deleted the SP-11024-fix-parquet-tests branch September 10, 2024 22:50
@sameerz sameerz added Spark 4.0+ Spark 4.0+ issues bug Something isn't working labels Sep 10, 2024