Fix Delta log cache size settings during integration tests [databricks] #10541

Merged (1 commit) on Mar 4, 2024
6 changes: 5 additions & 1 deletion integration_tests/run_pyspark_from_build.sh
@@ -227,7 +227,9 @@ else
TZ=${TZ:-UTC}

# Set the Delta log cache size to prevent the driver from caching every Delta log indefinitely
-export PYSP_TEST_spark_driver_extraJavaOptions="-ea -Duser.timezone=$TZ -Ddelta.log.cacheSize=10 $COVERAGE_SUBMIT_FLAGS"
+export PYSP_TEST_spark_databricks_delta_delta_log_cacheSize=${PYSP_TEST_spark_databricks_delta_delta_log_cacheSize:-10}
+deltaCacheSize=$PYSP_TEST_spark_databricks_delta_delta_log_cacheSize
+export PYSP_TEST_spark_driver_extraJavaOptions="-ea -Duser.timezone=$TZ -Ddelta.log.cacheSize=$deltaCacheSize $COVERAGE_SUBMIT_FLAGS"
export PYSP_TEST_spark_executor_extraJavaOptions="-ea -Duser.timezone=$TZ"
export PYSP_TEST_spark_ui_showConsoleProgress='false'
export PYSP_TEST_spark_sql_session_timeZone=$TZ
@@ -380,6 +382,7 @@ EOF

# avoid double processing of variables passed to spark in
# spark_conf_init
+unset PYSP_TEST_spark_databricks_delta_delta_log_cacheSize
unset PYSP_TEST_spark_driver_extraClassPath
unset PYSP_TEST_spark_driver_extraJavaOptions
unset PYSP_TEST_spark_jars
@@ -391,6 +394,7 @@ EOF
--driver-java-options "$driverJavaOpts" \
$SPARK_SUBMIT_FLAGS \
--conf 'spark.rapids.memory.gpu.allocSize='"$gpuAllocSize" \
+--conf 'spark.databricks.delta.delta.log.cacheSize='"$deltaCacheSize" \
"${RUN_TESTS_COMMAND[@]}" "${TEST_COMMON_OPTS[@]}"
fi
fi
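
The change above relies on the shell's `${VAR:-default}` expansion, so that a value exported by the caller wins and 10 is used otherwise. A minimal standalone sketch of that behavior follows; the `resolve_cache_size` helper and the value 25 are hypothetical and are not part of `run_pyspark_from_build.sh`.

```shell
#!/bin/sh
# Hypothetical helper for illustration only; the real script inlines this expansion.
resolve_cache_size() {
    # Use the caller-supplied value when it is set and non-empty, else fall back to 10.
    echo "${PYSP_TEST_spark_databricks_delta_delta_log_cacheSize:-10}"
}

# Case 1: the variable is not set, so the default applies.
unset PYSP_TEST_spark_databricks_delta_delta_log_cacheSize
default_size=$(resolve_cache_size)

# Case 2: the caller provides an override (25 is an arbitrary example value).
PYSP_TEST_spark_databricks_delta_delta_log_cacheSize=25
override_size=$(resolve_cache_size)

# The resolved value is then reused in both places the diff touches:
# the driver JVM option and the Spark conf passed to spark-submit.
java_opt="-Ddelta.log.cacheSize=$override_size"
conf_flag="spark.databricks.delta.delta.log.cacheSize=$override_size"

echo "default=$default_size override=$override_size"
echo "$java_opt"
echo "$conf_flag"
```

Under this pattern, someone running the integration tests could presumably export `PYSP_TEST_spark_databricks_delta_delta_log_cacheSize` before invoking the script to pick a different cache size without editing it.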