Collect CUDA/CPU profiling info into result sheets. #5921
Conversation
This PR:
0. Adds CUDA/CPU collection capabilities to the script.
1. Modifies result_analyzer.py to analyze the newly collected results.
2. Moves CUDA synchronize/XLA device synchronize into the profiler.
3. Fixes list typing for Python 3.8+.

Tested with:
python3 xla/benchmarks/experiment_runner.py --dynamo=openxla --xla=PJRT --test=train --filter=basic_gnn_gcn$ --suite-name=torchbench --accelerator=cuda --progress-bar --output-dirname=/tmp/output --repeat=2 --print-subprocess --no-resume --profile-cuda-cpu-collect --profile-cuda
python3 xla/benchmarks/result_analyzer.py --output-dir=/tmp/output
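For context, a minimal sketch of the kind of CUDA/CPU collection this adds, assuming the standard torch.profiler API; the helper name profile_step and its arguments are illustrative, not the PR's actual code.

```python
# Minimal sketch, assuming torch.profiler; names here are illustrative only.
import torch
from torch.profiler import profile, ProfilerActivity

def profile_step(step_fn, device):
    """Run one benchmark iteration under the profiler, syncing inside the profiled region."""
    activities = [ProfilerActivity.CPU]
    if device.type == "cuda":
        activities.append(ProfilerActivity.CUDA)
    with profile(activities=activities) as prof:
        step_fn()
        if device.type == "cuda":
            # Synchronize inside the profiler so queued CUDA work is attributed to this step
            # (an XLA run would similarly wait for device ops here).
            torch.cuda.synchronize(device)
    return prof
```

The returned profile can then be reduced to aggregate CPU/CUDA times (see the total_average() discussion below) and written next to the other per-run metrics that result_analyzer.py reads.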
Nice, ty!
LGTM, one comment.
Force-pushed from 53b80a7 to dde4ea4.
Force-pushed from dde4ea4 to ac84ed1.
    )
    return

kernel_dump = prof.profiler.total_average()
Curious, where is this total_average() defined?
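For reference, a minimal sketch assuming this is the torch.autograd.profiler API: prof.profiler would be the underlying torch.autograd.profiler.profile object, and total_average() folds every recorded FunctionEvent into a single FunctionEventAvg (times reported in microseconds). The metric name below is illustrative, not the PR's actual field name.

```python
# Minimal sketch, assuming the torch.autograd.profiler API; metric names are illustrative.
import torch
from torch.profiler import profile, ProfilerActivity

with profile(activities=[ProfilerActivity.CPU]) as prof:
    torch.ones(512, 512) @ torch.ones(512, 512)

# total_average() aggregates all recorded events into one FunctionEventAvg.
totals = prof.profiler.total_average()
total_cpu_time_s = totals.self_cpu_time_total / 1e6  # profiler reports microseconds
print(total_cpu_time_s)
```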
* Collect CUDA/CPU profiling info into result sheets (details in the PR description above)
* Lint, and add _s suffix to metrics
Co-authored-by: root <[email protected]>
Thanks. cc @zpcore to take advantage of this feature in future benchmarking automation work.