
Add hide-errors flag to result analyzer #5836

Merged
merged 1 commit into pytorch:benchmark from the hide-errors branch
Nov 21, 2023

Conversation

frgossen
Collaborator

No description provided.
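Since this pull request carries no description, here is a brief usage sketch. Only the --hide-errors flag name is taken from the PR title; the script path below is an assumption for illustration, and the flag's exact behavior is not documented in this PR.

$ # Hypothetical invocation of the result analyzer with the new flag
$ # (benchmarks/result_analyzer.py is an assumed entry point).
$ python3 benchmarks/result_analyzer.py --hide-errors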


@vanbasten23 vanbasten23 left a comment

LGTM

@vanbasten23 vanbasten23 merged commit a64f4ca into pytorch:benchmark Nov 21, 2023
1 check passed
zpcore pushed a commit that referenced this pull request Nov 21, 2023
@frgossen frgossen deleted the hide-errors branch November 21, 2023 23:13
frgossen added a commit to frgossen/pytorch-xla that referenced this pull request Nov 22, 2023
frgossen added a commit that referenced this pull request Nov 22, 2023
* Adding the benchmarking with TorchBench (#5788)

* Initial commit with dummy model benchmark

* add XRT support

* Add torchbench benchmark models

* add randomize_input

* add model set up for torchbench model

* update ExperimentLoader

* Add saving results

* minor args update

* update style

* add experiment name

* add grad context for eval and train

* minor user config update

* fix train() return item

* minor refactor

* add dynamo options

* add column in result for dynamo setting

* using  to capture output and error

* Fix some failure cases for dynamo

* reduce eval result size by returning eval loss

* minor refactor

* revert eval result change

* minor fix

* Change output format to jsonl

* Add accelerator model name

* add skipping finished experiments

* The main process needs to remove the PJRT_DEVICE env var that is automatically added (a shell sketch follows at the end of this commit message)

* Add a simple result analyzer

* Result analyzer saves to a database CSV with historical data

* Handle detectron2 models

* minor update

* add deny list

* Create run_benchmark

* Rename run_benchmark to run_benchmark.sh

* Fix device names and dynamo backend names in benchmark runner (#5806)

* update optimizer for openxla

* Add benchmark selection by tier 1-3 (#5808)

* Apply Pytorch/XLA formatting style (#5816)

* Add top tier benchmark runner (#5809)

* Add profiling capabilities to experiment_runner.py script (#5812)

* update run model config call interface, optimizer and result analyze script

* update dependency error

* Add profiling capabilities

---------

Co-authored-by: zpcore <[email protected]>

* benchmarks: add script to aggregate results from result_analyzer (#5829)

* benchmarks: extract tiers into their own file

So that they can be reused in other files; the second consumer comes in the
next commit.

* benchmarks: add aggregate.py

This script processes output CSV files from results_analyzer to
generate CSV/plots. Example:

$ for fmt in csv png; do \
    for acc in v100 a6000; do \
      for report in latest histogram speedup; do \
        for test in training inference; do \
          FILENAME=/tmp/png/$acc-$test-$report.$fmt; \
          python3 aggregate.py \
            --accelerator=$acc \
            --test=$test \
            -i /tmp/csv-depot \
            --report=$report \
            --title="All benchmarks" \
            --format=$fmt > $FILENAME || break; \
          chmod 644 $FILENAME; \
        done; \
      done; \
    done; \
  done

This generates plots and CSV files to summarize the latest
performance vs. Inductor, as well as a histogram and a geomean
speedup over time for all the input CSV data in /tmp/csv-depot.
Results are broken down per accelerator and either inference or
training.

To generate results per tier, we just have to pass --filter-by-tier
to the above and update the title to --title="Tier 1".
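A concrete per-tier invocation, sketched from the description above (the flags come from the text and the earlier example; the output filename is arbitrary):

$ python3 aggregate.py \
    --accelerator=v100 \
    --test=inference \
    -i /tmp/csv-depot \
    --report=speedup \
    --filter-by-tier \
    --title="Tier 1" \
    --format=png > /tmp/png/v100-inference-speedup-tier1.png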

* Fix syntax in experiment_runner.py (#5827)

* Add flag to forward XLA flags and allow for experiment expansion (#5828)

* Add hide-errors flag to result analyzer (#5836)

* Add readme and linting

* Fix ClusterResolver

---------

Co-authored-by: Liyang90 <[email protected]>
Co-authored-by: Manfei <[email protected]>
Co-authored-by: zpcore <[email protected]>
Co-authored-by: Grzegorz Olechwierowicz <[email protected]>
Co-authored-by: Emilio Cota <[email protected]>
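A minimal shell sketch of the PJRT_DEVICE point noted in the commit bullets above, assuming the runner is invoked as benchmarks/experiment_runner.py (the path and the env -u approach are assumptions for illustration; the actual runner clears the variable programmatically):

$ # Launch the experiment runner without inheriting PJRT_DEVICE from the
$ # parent environment, so it is not silently passed on to benchmark subprocesses.
$ env -u PJRT_DEVICE python3 benchmarks/experiment_runner.py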
lsy323 pushed a commit to lsy323/xla that referenced this pull request Nov 28, 2023
chunnienc pushed a commit to chunnienc/xla that referenced this pull request Dec 14, 2023
golechwierowicz added a commit that referenced this pull request Jan 12, 2024
bhavya01 pushed a commit that referenced this pull request Apr 22, 2024