
Adding the benchmarking with TorchBench #5788

Merged 36 commits into pytorch:benchmark on Nov 13, 2023

Conversation

Liyang90 (Collaborator)

Merging the benchmark fork

@JackCaoG (Collaborator)

Thanks @Liyang90 @zpcore, can you verify that this script works on today's nightly before we merge this one?

@zpcore (Collaborator) commented Nov 10, 2023

Thanks @Liyang90 @zpcore, can you verify that this script works on today's nightly before we merge this one?

I don't think so. There are several things we want to update, e.g.:

  1. zpcore@835b5ea: Dynamo backend naming

  2. zpcore@b146f0a: Update to the newest TorchBench call API.

  3. zpcore@b853e85: Enable the Dynamo optimizer.

@RissyRan may have other updates she wants to add.

Should we check in this version first and make the updates next? We can add the CI once we are done updating for the latest build.
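
For item 1 above, the rename mostly comes down to which Dynamo backend string the runner passes to torch.compile. A minimal sketch, assuming the new name is the "openxla" backend that torch_xla registers; the exact strings used in experiment_runner.py are an assumption here:

import torch
import torch_xla.core.xla_model as xm

def compile_with_openxla(model):
    # Move the model to the XLA device, then compile it with the "openxla"
    # Dynamo backend registered by torch_xla (replacing the older backend
    # names the benchmark scripts used).
    device = xm.xla_device()
    model = model.to(device)
    return torch.compile(model, backend="openxla")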

@Liyang90 (Collaborator, Author) commented Nov 10, 2023

Thanks @Liyang90 @zpcore, can you verify that this script works on today's nightly before we merge this one?

I don't think so. There are several things we want to update, e.g.:

  1. zpcore@835b5ea: Dynamo backend naming
  2. zpcore@b146f0a: Update to the newest TorchBench call API.
  3. zpcore@b853e85: Enable the Dynamo optimizer.

Should we check in this version first and make the updates next? We can add the CI once we are done updating for the latest build.

Right, I think this PR requires a follow-up PR before its functionality can be tested.

This is okay since it is not targeting the main branch.

@JackCaoG (Collaborator)

Yeah, I am OK with merging it as is while we work on follow-up PRs.

@zpcore merged commit 1b905cc into pytorch:benchmark on Nov 13, 2023
1 check passed
@frgossen (Collaborator)

I ran into a few issues with the runner and think it needs a few minor changes.
https://github.com/pytorch/xla/pull/5806/files

frgossen pushed a commit to frgossen/pytorch-xla that referenced this pull request Nov 22, 2023
* Initial commit with dummy model benchmark

* add XRT support

* Add torchbench benchmark models

* add randomize_input

* add model set up for torchbench model

* update ExperimentLoader

* Add saving results

* minor args update

* update style

* add experiment name

* add grad context for eval and train

* minor user config update

* fix train() return item

* minor refactor

* add dynamo options

* add column in result for dynamo setting

* using  to capture output and error

* Fix some failure cases for dynamo

* reduce eval result size by returning eval loss

* minor refactor

* revert eval result change

* minor fix

* Change output format to jsonl

* Add accelerator model name

* add skipping finished experiments

* main process needs to remove PJRT_DEVICE env var that is automatically added

* Add a simple result analyzer

* Result analyzer save to database csv with historical data

* Handle detectron2 models

* minor update

* add deny list
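
As a side note on the bullet above about removing the automatically added PJRT_DEVICE variable: a minimal sketch of how a parent process can drop it before spawning each experiment (the wrapper function and command handling are hypothetical, not the actual experiment_runner.py code):

import os
import subprocess

def run_experiment_in_subprocess(cmd):
    # Copy the environment and drop the PJRT_DEVICE value that torch_xla
    # adds automatically, so each child process can select its own device
    # rather than inheriting the parent's.
    env = os.environ.copy()
    env.pop("PJRT_DEVICE", None)
    return subprocess.run(cmd, env=env, capture_output=True, text=True)
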
frgossen added a commit that referenced this pull request Nov 22, 2023
* Adding the benchmarking with TorchBench (#5788)

* Initial commit with dummy model benchmark

* add XRT support

* Add torchbench benchmark models

* add randomize_input

* add model set up for torchbench model

* update ExperimentLoader

* Add saving results

* minor args update

* update style

* add experiment name

* add grad context for eval and train

* minor user config update

* fix train() return item

* minor refactor

* add dynamo options

* add column in result for dynamo setting

* using  to capture output and error

* Fix some failure cases for dynamo

* reduce eval result size by returning eval loss

* minor refactor

* revert eval result change

* minor fix

* Change output format to jsonl

* Add accelerator model name

* add skipping finished experiments

* main process needs to remove PJRT_DEVICE env var that is automatically added

* Add a simple result analyzer

* Result analyzer save to database csv with historical data

* Handle detectron2 models

* minor update

* add deny list

* Create run_benchmark

* Rename run_benchmark to run_benchmark.sh

* Fix device names and dynamo backend names in benchmark runner (#5806)

* update optimizer for openxla

* Add benchmark selection by tier 1-3 (#5808)

* Apply Pytorch/XLA formatting style (#5816)

* Add top tier benchmark runner (#5809)

* Add profiling capabilities to experiment_runner.py script (#5812)

* update run model config call interface, optimizer and result analyze script

* update dependency error

* Add profiling capabilities

---------

Co-authored-by: zpcore <[email protected]>

* benchmarks: add script to aggregate results from result_analyzer (#5829)

* benchmarks: extract tiers into their own file

So that they can be reused in other files. The second user is coming
next.

* benchmarks: add aggregate.py

This script processes output CSV files from results_analyzer to
generate CSV/plots. Example:

$ for fmt in csv png; do \
    for acc in v100 a6000; do \
      for report in latest histogram speedup; do \
        for test in training inference; do \
          FILENAME=/tmp/png/$acc-$test-$report.$fmt; \
          python3 aggregate.py \
            --accelerator=$acc \
            --test=$test \
            -i /tmp/csv-depot \
            --report=$report \
            --title="All benchmarks" \
            --format=$fmt > $FILENAME || break; \
          chmod 644 $FILENAME; \
        done; \
      done; \
    done; \
  done

This generates plots and CSV files to summarize the latest
performance vs. Inductor, as well as a histogram and a geomean
speedup over time for all the input CSV data in /tmp/csv-depot.
Results are broken down per accelerator and either inference or
training.

To generate results per tier, we just have to pass --filter-by-tier
to the above and update the title to --title="Tier 1".

* Fix syntax in experiment_runner.py (#5827)

* Add flag to forward XLA flags and allow for experiment expansion (#5828)

* Add hide-errors flag to result analyzer (#5836)

* Add readme and linting

* Fix ClusterResolver

---------

Co-authored-by: Liyang90 <[email protected]>
Co-authored-by: Manfei <[email protected]>
Co-authored-by: zpcore <[email protected]>
Co-authored-by: Grzegorz Olechwierowicz <[email protected]>
Co-authored-by: Emilio Cota <[email protected]>
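
Regarding the per-tier note in the commit message above: driving aggregate.py for a single tier could look like the sketch below. The flags come from the shell loop and the --filter-by-tier flag mentioned there; the output path and the wrapper itself are hypothetical:

import subprocess

# Regenerate only the Tier 1 speedup plot for one accelerator, reusing the
# aggregate.py flags shown in the commit message above.
cmd = [
    "python3", "aggregate.py",
    "--accelerator=v100",
    "--test=inference",
    "-i", "/tmp/csv-depot",
    "--report=speedup",
    "--filter-by-tier",
    "--title=Tier 1",
    "--format=png",
]
with open("/tmp/png/v100-inference-speedup-tier1.png", "wb") as out:
    subprocess.run(cmd, stdout=out, check=True)
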
lsy323 pushed a commit to lsy323/xla that referenced this pull request Nov 28, 2023
chunnienc pushed a commit to chunnienc/xla that referenced this pull request Dec 14, 2023
golechwierowicz added a commit that referenced this pull request Jan 12, 2024
bhavya01 pushed a commit that referenced this pull request Apr 22, 2024