-
Notifications
You must be signed in to change notification settings - Fork 39
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[FEA] Enable AQE related recommendations in Profiler Auto-tuner #688
Conversation
6da5607
to
52e5039
Compare
Signed-off-by: Partho Sarthi <[email protected]>
3c945bb
to
6ec1aa6
Compare
core/src/main/scala/com/nvidia/spark/rapids/tool/profiling/AutoTuner.scala
Show resolved
Hide resolved
core/src/main/scala/com/nvidia/spark/rapids/tool/profiling/AutoTuner.scala
Outdated
Show resolved
Hide resolved
core/src/main/scala/com/nvidia/spark/rapids/tool/profiling/AutoTuner.scala
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @parthosa and @cindyyuanjiang !
I made some few comments
Can you please check also how the AQE recommendations look like in the user-tools?
One of the things to keep an eye on is that the tables of the profiler stdout are readable and match the expected from the profiler core .
core/src/main/scala/com/nvidia/spark/rapids/tool/profiling/AutoTuner.scala
Outdated
Show resolved
Hide resolved
core/src/main/scala/com/nvidia/spark/rapids/tool/profiling/AutoTuner.scala
Show resolved
Hide resolved
core/src/main/scala/com/nvidia/spark/rapids/tool/profiling/AutoTuner.scala
Show resolved
Hide resolved
core/src/main/scala/org/apache/spark/sql/rapids/tool/ToolUtils.scala
Outdated
Show resolved
Hide resolved
core/src/test/scala/com/nvidia/spark/rapids/tool/profiling/AutoTunerSuite.scala
Outdated
Show resolved
Hide resolved
core/src/test/scala/com/nvidia/spark/rapids/tool/profiling/AutoTunerSuite.scala
Outdated
Show resolved
Hide resolved
Signed-off-by: Partho Sarthi <[email protected]>
Signed-off-by: cindyyuanjiang <[email protected]>
Signed-off-by: cindyyuanjiang <[email protected]>
core/src/main/scala/com/nvidia/spark/rapids/tool/profiling/AutoTuner.scala
Show resolved
Hide resolved
core/src/main/scala/com/nvidia/spark/rapids/tool/profiling/AutoTuner.scala
Outdated
Show resolved
Hide resolved
Signed-off-by: cindyyuanjiang <[email protected]>
Filed follow up issue, tracked here: #719 |
core/src/main/scala/com/nvidia/spark/rapids/tool/profiling/AutoTuner.scala
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Switching my review to approved. I saw the benchmark results and I am okay with the change now. I am a little concerned with what happens if the the number of shuffle partitions is very small, or the AQE target shuffle size is very large, or if the maxPartitionBytes is very large. But as long as we have benchmarks that we are running we can improve the benchmarks over time as we see more corner cases from customers show up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the PR has conflicts. Needs upmerge.
core/src/main/scala/com/nvidia/spark/rapids/tool/profiling/AutoTuner.scala
Outdated
Show resolved
Hide resolved
core/src/main/scala/com/nvidia/spark/rapids/tool/profiling/AutoTuner.scala
Outdated
Show resolved
Hide resolved
Thank you for the feedback @revans2! |
Signed-off-by: cindyyuanjiang <[email protected]>
Signed-off-by: cindyyuanjiang <[email protected]>
Fixes #576
This PR added the following list of settings AQE optimization for auto-tuner:
- For A100, set to 64MB
- For T4, set to 32MB
Otherwise, set to 128MB.
- For A100, set to 400
- For T4, set to 800
- Set to 'false' to prioritize 'advisoryPartitionSizeInBytes' over 'minPartitionSize' for better performance