-
Notifications
You must be signed in to change notification settings - Fork 912
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Generate GPU vs CPU usage metrics per pytest file in pandas testsuite for cudf.pandas
#16739
Generate GPU vs CPU usage metrics per pytest file in pandas testsuite for cudf.pandas
#16739
Conversation
… into gpu_cpu_metrics
… into gpu_cpu_metrics
cudf.pandas
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: I'm aware of cases of where both GPU and CPU usage shows 0%, which is due to various reasons that I'm working on addressing in a follow-up PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I noticed that some of the tests have 0% reported for both CPU and GPU usage. I guess we aren't catching some of the dummy function calls (eg. pr_df['_slow_function_call']=0
?
Edit: My apologies, you're aware of this already.
pr_df['CPU Usage'] = pr_df['CPU Usage'].astype(str) + '%' | ||
pr_df['GPU Usage'] = pr_df['GPU Usage'].astype(str) + '%' | ||
|
||
pr_df['CPU Usage'] = pr_df['CPU Usage'].replace('nan%', '0%') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggestion: IMO would be better to fillna(0)
before the astype(str)
so we don't have to do this replace
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
@@ -68,8 +68,20 @@ def emoji_failed(x): | |||
pr_df = pd.DataFrame.from_dict(pr_results, orient="index").sort_index() | |||
main_df = pd.DataFrame.from_dict(main_results, orient="index").sort_index() | |||
diff_df = pr_df - main_df | |||
pr_df['CPU Usage'] = ((pr_df['_slow_function_call']/(pr_df['_slow_function_call'] + pr_df['_fast_function_call']))*100.0).round(1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Could you put the denominator calculation on it's own line?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
@Matt711 I pushed a fix to this PR that will improve the missing reports, there would a few instances of such cases now. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! Everything LGTM. Before I approve, I'd like to see the final table.
Final table is here: https://github.com/rapidsai/cudf/actions/runs/10941284785/attempts/1#summary-30389636693 |
/merge |
Description
This PR introduces GPU and CPU usage reporting to cudf.pandas pytest suite and the generated metrics will be available for viewing in the existing pandas pytest summary page:
https://github.com/rapidsai/cudf/actions/runs/10886370333/attempts/1#summary-30220192117
Note: I'm aware of cases of where both GPU and CPU usage show 0%, which is due to various reasons that I'm working on addressing in a follow-up PR.
Checklist