Add hide-errors flag to result analyzer #5836 (Merged)

No description provided.
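Since the pull request carries no description, here is a minimal, hypothetical sketch of what a --hide-errors option on an argparse-based result analyzer could look like. Only the flag name comes from the PR title; the module layout, field names, and filtering behavior below are assumptions for illustration, not the code that was merged.

    # Hypothetical sketch only: how a --hide-errors flag could be wired into an
    # argparse-based analyzer. Field names and behavior are assumptions.
    import argparse
    import json


    def parse_args():
        parser = argparse.ArgumentParser(description="Analyze benchmark results.")
        parser.add_argument("-i", "--input", default="results.jsonl",
                            help="JSONL output produced by the experiment runner.")
        parser.add_argument("--hide-errors", action="store_true",
                            help="Omit error text from the summarized results.")
        return parser.parse_args()


    def summarize(records, hide_errors):
        """Copy each record, optionally dropping its error field."""
        summary = []
        for rec in records:
            row = dict(rec)
            if hide_errors:
                row.pop("error", None)  # assumed field name for captured errors
            summary.append(row)
        return summary


    if __name__ == "__main__":
        args = parse_args()
        with open(args.input) as f:
            records = [json.loads(line) for line in f if line.strip()]
        for row in summarize(records, args.hide_errors):
            print(row)

The JSONL input format is borrowed from the "Change output format to jsonl" step in the squashed commit message further down; everything else is a guess at a plausible shape for such a tool.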
Conversation
frgossen force-pushed the hide-errors branch from 8201324 to f65b4ce on November 21, 2023 at 18:36.
vanbasten23 approved these changes on Nov 21, 2023:
LGTM
zpcore pushed a commit that referenced this pull request on Nov 21, 2023.
frgossen added a commit to frgossen/pytorch-xla that referenced this pull request on Nov 22, 2023.
frgossen added a commit that referenced this pull request on Nov 22, 2023:
* Adding the benchmarking with TorchBench (#5788)
  * Initial commit with dummy model benchmark
  * add XRT support
  * Add torchbench benchmark models
  * add randomize_input
  * add model set up for torchbench model
  * update ExperimentLoader
  * Add saving results
  * minor args update
  * update style
  * add experiment name
  * add grad context for eval and train
  * minor user config update
  * fix train() return item
  * minor refactor
  * add dynamo options
  * add column in result for dynamo setting
  * using to capture output and error
  * Fix some failure cases for dynamo
  * reduce eval result size by returning eval loss
  * minor refactor
  * revert eval result change
  * minor fix
  * Change output format to jsonl
  * Add accelerator model name
  * add skipping finished experiments
  * main process needs to remove PJRT_DEVICE env var that is automatically added
  * Add a simple result analyzer
  * Result analyzer saves to database csv with historical data
  * Handle detectron2 models
  * minor update
  * add deny list
  * Create run_benchmark
  * Rename run_benchmark to run_benchmark.sh
* Fix device names and dynamo backend names in benchmark runner (#5806)
* update optimizer for openxla
* Add benchmark selection by tier 1-3 (#5808)
* Apply Pytorch/XLA formatting style (#5816)
* Add top tier benchmark runner (#5809)
* Add profiling capabilities to experiment_runner.py script (#5812)
  * update run model config call interface, optimizer and result analyze script
  * update dependency error
  * Add profiling capabilities
  ---------
  Co-authored-by: zpcore <[email protected]>
* benchmarks: add script to aggregate results from result_analyzer (#5829)
  * benchmarks: extract tiers into their own file, so that they can be reused in other files. The second user is coming next.
  * benchmarks: add aggregate.py. This script processes output CSV files from results_analyzer to generate CSV/plots. Example:

        $ for fmt in csv png; do \
            for acc in v100 a6000; do \
              for report in latest histogram speedup; do \
                for test in training inference; do \
                  FILENAME=/tmp/png/$acc-$test-$report.$fmt; \
                  python3 aggregate.py \
                    --accelerator=$acc \
                    --test=$test \
                    -i /tmp/csv-depot \
                    --report=$report \
                    --title="All benchmarks" \
                    --format=$fmt > $FILENAME || break; \
                  chmod 644 $FILENAME; \
                done; \
              done; \
            done; \
          done

    This generates plots and CSV files to summarize the latest performance vs. Inductor, as well as a histogram and a geomean speedup over time for all the input CSV data in /tmp/csv-depot. Results are broken down per accelerator and either inference or training. To generate results per tier, we just have to pass --filter-by-tier to the above and update the title to --title="Tier 1".
* Fix syntax in experiment_runner.py (#5827)
* Add flag to forward XLA flags and allow for experiment expansion (#5828)
* Add hide-errors flag to result analyzer (#5836)
* Add readme and linting
* Fix ClusterResolver
---------
Co-authored-by: Liyang90 <[email protected]>
Co-authored-by: Manfei <[email protected]>
Co-authored-by: zpcore <[email protected]>
Co-authored-by: Grzegorz Olechwierowicz <[email protected]>
Co-authored-by: Emilio Cota <[email protected]>
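The "geomean speedup over time" report described above reduces to grouping per-benchmark speedup ratios by run date and taking their geometric mean. The sketch below shows that computation in isolation; the CSV layout and column names (date, inductor_time_s, xla_time_s) are assumptions made for the example, not the schema actually emitted by results_analyzer or consumed by aggregate.py.

    # Illustrative sketch of a geomean-speedup-over-time computation. The input
    # CSV columns are assumptions for this example, not the real schema.
    import csv
    import math
    from collections import defaultdict


    def geomean(values):
        # Geometric mean via logs, numerically safer than multiplying ratios.
        return math.exp(sum(math.log(v) for v in values) / len(values))


    def speedups_by_date(path):
        # Group XLA-vs-Inductor speedup ratios by run date.
        per_date = defaultdict(list)
        with open(path, newline="") as f:
            for row in csv.DictReader(f):
                speedup = float(row["inductor_time_s"]) / float(row["xla_time_s"])
                per_date[row["date"]].append(speedup)
        return {date: geomean(vals) for date, vals in sorted(per_date.items())}


    if __name__ == "__main__":
        for date, value in speedups_by_date("benchmark_results.csv").items():
            print(f"{date}: geomean speedup {value:.3f}x")

A geometric mean is the natural aggregate for ratios: one benchmark running twice as fast and another running half as fast cancel out rather than skewing the average.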
lsy323 pushed a commit to lsy323/xla that referenced this pull request on Nov 28, 2023.
chunnienc pushed a commit to chunnienc/xla that referenced this pull request on Dec 14, 2023.
golechwierowicz added a commit that referenced this pull request on Jan 12, 2024.
bhavya01 pushed a commit that referenced this pull request on Apr 22, 2024.