diff --git a/FIX_LOWERING_FOR_CORE_ATEN_OPS.md b/FIX_LOWERING_FOR_CORE_ATEN_OPS.md
index 37c1fda22110..72f35ade2161 100644
--- a/FIX_LOWERING_FOR_CORE_ATEN_OPS.md
+++ b/FIX_LOWERING_FOR_CORE_ATEN_OPS.md
@@ -27,7 +27,7 @@ but produced a different result than torch eager mode.
 
 If the test uses 16-bit floats (float16, bfloat16), it is very likely that
 the tolerances we give to `torch.allclose` for the comparison were too
-strict. You can relax them a bit. Take a look at docs/fixing_core_aten_ops_log.md for one such example.
+strict. You can relax them a bit. Take a look at [this issue](https://github.com/pytorch/xla/issues/5934) for one such example.
 
 If the result torch_xla produces is totally different from what torch produces,
 that means there is a bug in the lowering code, and it probably needs more work.
 Feel free to tag more people (such as qihqi) to take a look.
diff --git a/docs/fixing_core_aten_ops_log.md b/docs/fixing_core_aten_ops_log.md
deleted file mode 100644
index 80ec9ddf609e..000000000000
--- a/docs/fixing_core_aten_ops_log.md
+++ /dev/null
@@ -1,79 +0,0 @@
-
-# Issue being worked: https://github.com/pytorch/xla/issues/5902
-
-qihqi
-
-## 1. Uncomment and rerun the test
-
-```
-LD_LIBRARY_PATH=/mnt/hanq/miniconda3/envs/torch310/lib/:/usr/lib/x86_64-linux-gnu/ PJRT_DEVICE=CPU XLA_STABLEHLO_COMPILE=1 XLA_HLO_DEBUG=1 XLA_IR_DEBUG=1 pytest test/test_core_aten_ops.py -k test_aten_tan_1
-```
-
-output:
-```
-=========================== short test summary info ============================
-[torch_xla_diff:0.001] SUBFAIL test/test_core_aten_ops.py::AtenOpTest::test_aten_tan_1 - AssertionError: False is not true
-[stablehlo_diff: 0.001] SUBFAIL test/test_core_aten_ops.py::AtenOpTest::test_aten_tan_1 - AssertionError: False is not true
-================= 2 failed, 1 passed, 514 deselected in 5.51s ==================
-I0000 00:00:1700690393.569658 2513762 tfrt_cpu_pjrt_client.cc:352] TfrtCpuClient destroyed.
-(torch310) hanq@hanq-compile-2:/mnt/hanq/git/qihqi/pytorch/xla$
-```
-
-This means that the accuracy is not good.
-
-## 2. Set a breakpoint at the line that compares the results
-
-```
-(torch310) hanq@hanq-compile-2:/mnt/hanq/git/qihqi/pytorch/xla$ git diff
-diff --git a/test/test_core_aten_ops.py b/test/test_core_aten_ops.py
-index 46a18494d..ff055ee38 100644
---- a/test/test_core_aten_ops.py
-+++ b/test/test_core_aten_ops.py
-@@ -36,6 +36,7 @@ def run_export_and_compare(testcase, func, args, kwargs, atol=1e-3):
-       lambda x: x.to(device=device), kwargs)
-   res_xla = func(*args2, **kwargs2)
-   with testcase.subTest('torch_xla_diff:' + str(atol)):
-+    import pdb; pdb.set_trace()
-     diff_output(testcase, res, res_xla, atol)
-```
-
-Rerun and print out the difference:
-```
-(Pdb) p res - res_xla.cpu()
-tensor([[ 0.0000e+00,  0.0000e+00, -4.8828e-04,  0.0000e+00,  0.0000e+00,
-          0.0000e+00,  0.0000e+00,  6.1035e-05,  0.0000e+00,  0.0000e+00],
-        [-4.8828e-04,  0.0000e+00,  0.0000e+00,  9.7656e-04,  0.0000e+00,
-          1.2207e-04,  0.0000e+00,  0.0000e+00,  0.0000e+00,  0.0000e+00],
-        [ 0.0000e+00, -1.5259e-05,  0.0000e+00,  0.0000e+00,  0.0000e+00,
-          4.8828e-04,  0.0000e+00,  0.0000e+00,  0.0000e+00,  1.2207e-04],
-        [ 0.0000e+00,  2.4414e-04,  0.0000e+00, -1.9531e-03,  0.0000e+00,
-          0.0000e+00,  0.0000e+00,  0.0000e+00, -3.0518e-05,  0.0000e+00],
-        [ 0.0000e+00, -4.8828e-04, -2.4414e-04,  0.0000e+00,  0.0000e+00,
-          0.0000e+00,  0.0000e+00, -6.1035e-05,  0.0000e+00,  0.0000e+00],
-        [ 0.0000e+00,  0.0000e+00,  0.0000e+00,  0.0000e+00, -1.9531e-03,
-          0.0000e+00,  0.0000e+00,  1.9531e-03,  0.0000e+00,  0.0000e+00],
-        [ 0.0000e+00,  0.0000e+00, -1.9531e-03,  0.0000e+00,  0.0000e+00,
-          2.4414e-04,  9.7656e-04,  1.2207e-04,  0.0000e+00,  0.0000e+00],
-        [ 4.8828e-04,  0.0000e+00,  0.0000e+00, -7.8125e-03,  1.2207e-04,
-         -9.7656e-04,  0.0000e+00,  0.0000e+00,  0.0000e+00,  0.0000e+00],
-        [ 0.0000e+00,  1.5625e-02,  0.0000e+00,  0.0000e+00, -4.8828e-04,
-         -1.2207e-04,  0.0000e+00,  0.0000e+00, -4.8828e-04, -3.9062e-03],
-        [ 0.0000e+00, -1.2207e-04,  0.0000e+00,  0.0000e+00,  0.0000e+00,
-          0.0000e+00,  0.0000e+00,  0.0000e+00,  0.0000e+00,  0.0000e+00]],
-       dtype=torch.float16)
-```
-The result looks good enough; this means we are probably being too strict in
-the test, and setting a larger tolerance will probably work.
-```
-(Pdb) p torch.max(torch.abs(res - res_xla.cpu()))
-tensor(0.0156, dtype=torch.float16)
-```
-Printing out the maximum difference shows that an `atol` of roughly 0.01 with a
-slightly larger `rtol` will probably work:
-```
-(Pdb) torch.allclose(res, res_xla.cpu(), atol=0.01, rtol=0.001)
-True
-```
-Now it's time to send a PR:
-https://github.com/pytorch/xla/pull/5915
\ No newline at end of file
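
---

Editor's note: the tolerance-picking step in the deleted log boils down to two checks: inspect the worst-case absolute difference, then verify `torch.allclose` passes with the relaxed tolerances. Below is a minimal, self-contained sketch of that check. The tensors are synthetic stand-ins for `res` and `res_xla.cpu()` from the pdb session (the real values come from the test harness in `test/test_core_aten_ops.py`, not from this snippet):

```python
import torch

# Synthetic stand-ins for the values inspected in the pdb session above:
# `res` plays the eager result, `res_xla_cpu` plays res_xla.cpu().
x = torch.randn(10, 10)
res = torch.tan(x).to(torch.float16)
res_xla_cpu = (torch.tan(x) + 5e-3).to(torch.float16)  # inject a small fp16-scale error

# Step 1: inspect the worst-case absolute difference, as done with
# `p torch.max(torch.abs(res - res_xla.cpu()))` in the log.
print(torch.max(torch.abs(res - res_xla_cpu)))

# Step 2: if the difference is on the order of float16 rounding error,
# relax the tolerances rather than treating the failure as a lowering bug.
print(torch.allclose(res, res_xla_cpu, atol=0.01, rtol=0.001))  # expected: True here
```

If `torch.allclose` still fails with tolerances this loose, the outputs are genuinely different and the lowering itself should be suspected, as the first hunk of this patch notes.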