Support torch_xla2 benchmarking using torchbench #7013

zpcore · 2024-05-01T21:06:51Z

Summary

Integrate torch_xla2 testing into the torchbench benchmarking scripts.

Details

This PR ports the torch_xla2 run script, based on the file here, into the existing benchmark script.

How to run

To run the torch_xla2 benchmark, first install torch_xla2 following the instructions in experimental/torch_xla2/README.md.

torch_xla2 currently doesn't work with dynamo. To switch to torch_xla2, append the --torch-xla2 flag, e.g.:

export JAX_PLATFORMS=TPU;
python experiment_runner.py \
--suite-name=torchbench \
--accelerator=tpu \
--progress-bar  \
--xla=PJRT  \
--test=eval \
--filter=dcgan \
--torch-xla2
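
Since torch_xla2 doesn't work with dynamo, a runner could reject that combination up front. The following is only a minimal sketch of such a check, not the code in this PR: `build_parser` and `parse_flags` are hypothetical names, and treating --torch-xla2 as a boolean switch and --dynamo as an optional backend name are assumptions made for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser covering only the flags discussed above."""
    parser = argparse.ArgumentParser(description="sketch of runner flags")
    parser.add_argument("--suite-name", default="torchbench")
    parser.add_argument("--accelerator", default="tpu")
    parser.add_argument("--test", choices=["eval", "train"], default="eval")
    # Assumption: a dynamo backend name, or None when dynamo is off.
    parser.add_argument("--dynamo", default=None)
    # Assumption: a plain boolean switch, as used in the command above.
    parser.add_argument("--torch-xla2", action="store_true")
    return parser


def parse_flags(argv: list[str]) -> argparse.Namespace:
    parser = build_parser()
    args = parser.parse_args(argv)
    # torch_xla2 does not work with dynamo, so fail early on that combination.
    if args.torch_xla2 and args.dynamo is not None:
        parser.error("--torch-xla2 cannot be combined with --dynamo")
    return args


if __name__ == "__main__":
    args = parse_flags(["--test=eval", "--torch-xla2"])
    print(args.torch_xla2, args.dynamo)
```

Failing at argument-parsing time keeps an unsupported configuration from reaching the model-loading step.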

In practice, we need to make sure that JAX and torch_xla use the same PJRT version. I did the openxla pin update (backport) for torch_xla in order to use PJRT version 0.47.
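
A quick first diagnostic for a mismatch is to list the installed package versions side by side. This is a rough sketch only: the package versions do not directly report the PJRT version, the distribution names in the list are assumptions about how the wheels are published, and `installed_versions` is a hypothetical helper.

```python
from importlib import metadata


def installed_versions(packages):
    """Return {package: version string, or None if not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Assumed distribution names; adjust to match the local environment.
    names = ["jax", "jaxlib", "torch", "torch_xla"]
    for name, version in installed_versions(names).items():
        print(f"{name}: {version or 'not installed'}")
```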

Sample result

I tried a simple model, dcgan, on TPU v5p-8, and the torch_xla2 performance is impressive:

benchmark	platform	torch_xla version	backend	median_total_time (s)	compile_time (s)
dcgan (eval)	v5-8	torch_xla	LTC	0.0022	1.5729
dcgan (eval)	v5-8	torch_xla	openxla	0.0005	1.8661
dcgan (eval)	v5-8	torch_xla2	jax.jit	0.0003911869862349704	1.464099046002957
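
To put the table in relative terms, the ratios of median total time can be computed directly from the reported values. This is a small sketch using only the numbers above; `speedup_over` is a hypothetical helper name.

```python
# Median total times in seconds, copied from the table above.
median_total_time = {
    ("torch_xla", "LTC"): 0.0022,
    ("torch_xla", "openxla"): 0.0005,
    ("torch_xla2", "jax.jit"): 0.0003911869862349704,
}


def speedup_over(baseline, candidate, times=median_total_time):
    """How many times faster `candidate` is than `baseline`."""
    return times[baseline] / times[candidate]


if __name__ == "__main__":
    target = ("torch_xla2", "jax.jit")
    for config in median_total_time:
        if config != target:
            print(f"{target} vs {config}: {speedup_over(config, target):.2f}x")
```

On these numbers torch_xla2 is roughly 5.6x faster than LTC and about 1.3x faster than the openxla backend. This is one small model in eval mode, so the ratios are indicative only.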

benchmarks/benchmark_experiment.py

benchmarks/experiment_runner.py

benchmarks/util.py

benchmarks/benchmark_model.py

zpcore requested review from qihqi, ysiraichi, vanbasten23 and will-cromar May 2, 2024 00:49

zpcore marked this pull request as ready for review May 2, 2024 07:46

ysiraichi reviewed May 2, 2024

vanbasten23 reviewed May 2, 2024

benchmarks/experiment_runner.py (outdated, resolved)

vanbasten23 reviewed May 2, 2024

benchmarks/benchmark_experiment.py (outdated, resolved)

ysiraichi reviewed May 6, 2024

benchmarks/util.py (outdated, resolved)

qihqi approved these changes May 10, 2024

benchmarks/benchmark_experiment.py (outdated, resolved)

benchmarks/benchmark_model.py (outdated, resolved)

zpcore added 11 commits May 14, 2024 20:27

dfad1d0 add torch_xla2 benchmark to torchbench test
8e63770 minor update
d58dc0d minor update
c76db09 minor update
d166215 minor update
c5d1d75 add JAX_PLATFORMS env
6698f37 nit update
6315459 fix bugs
4d59e43 nit update
dcab385 update based on feedback
be40604 update torch_xla2 argument

zpcore force-pushed the piz/xla2_bm branch from 92165f9 to be40604 Compare May 14, 2024 22:46

zpcore added 8 commits May 15, 2024 07:21

a6ba2c1 add extract_jax for torch_xla2
0305dcf fix test
d55c86c fix test
1609dd5 fix test
de91758 fix test
a804a5a fix test
07c92af fix test
d66baa4 fix test
5cfb47e fix test

zpcore merged commit 9e18935 into master May 16, 2024
19 of 20 checks passed

zpcore deleted the piz/xla2_bm branch May 16, 2024 00:08

ysiraichi mentioned this pull request May 18, 2024

[benchmarks] Add default value to move_to_device. #7080

Merged
