Fine-grained spill metrics #9509
Signed-off-by: Gera Shegalov <[email protected]>
build |
Could you also look at updating the tuning docs for these changes? I think that doc moved to an internal NVIDIA repo, so @mattahrens might need to help you get it done right.
Also @jlowe with me initially trying to call out project rapids4spark, so I want to check with him that he is okay with the names of the metrics here.
sql-plugin/src/main/scala/com/nvidia/spark/rapids/RapidsHostMemoryStore.scala
sql-plugin/src/main/scala/com/nvidia/spark/rapids/RapidsHostMemoryStore.scala
sql-plugin/src/main/scala/com/nvidia/spark/rapids/HostAlloc.scala
sql-plugin/src/main/scala/com/nvidia/spark/rapids/RapidsDiskStore.scala
sql-plugin/src/main/scala/org/apache/spark/sql/rapids/GpuTaskMetrics.scala
Updates look good, happy to see fewer metrics being added. Still wondering about host alloc time. Do we have a GPU alloc time metric or plans to add one?
I think there may have been some miscommunication here, or at least I didn't do a good job of explaining this. In #8880 I asked for a "metric for the amount of time a task was blocked on host memory allocation". This was intended to be analogous to |
build |
sql-plugin/src/main/scala/org/apache/spark/sql/rapids/GpuTaskMetrics.scala
sql-plugin/src/main/scala/org/apache/spark/sql/rapids/GpuTaskMetrics.scala
build |
This PR fixes #8880. It removes task metrics
and adds the following metrics instead:
Signed-off-by: Gera Shegalov [email protected]