Open XLA pin update #5675
Conversation
openxla_patches/f16_abi_clang.diff
Outdated
@@ -1,19 +0,0 @@
upstream CI will fail without this
Do you know why we were able to remove this patch? Is it because we updated the compiler in the CI?
I think we need to kick off an upstream CI build targeting this branch and see whether CI will pass.
Yeah, turns out I do still need those patches... otherwise the training job hangs.
@@ -1,14 +0,0 @@
diff --git a/xla/service/gpu/gpu_executable.cc b/xla/service/gpu/gpu_executable.cc
Same question as above
WORKSPACE
Outdated
],
-strip_prefix = "xla-97a5f819faf9ff793b7ba68ff1f31f74f9459c18",
+strip_prefix = "xla-7a19856d74569fd1f765cd03bdee84e3b1fdc579",
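One reason pin updates like the hunk above are error-prone is that the archive URL and `strip_prefix` must carry the same commit hash. A minimal sketch of deriving both from a single hash (the helper name `xla_pin` and the GitHub archive URL pattern are assumptions for illustration, not PyTorch/XLA's actual WORKSPACE tooling):

```python
def xla_pin(commit: str) -> dict:
    """Return http_archive-style attributes for a pinned openxla/xla commit.

    GitHub source tarballs unpack into a top-level directory named
    `<repo>-<commit>`, which is why strip_prefix must match the hash
    used in the URL exactly.
    """
    return {
        "urls": ["https://github.com/openxla/xla/archive/%s.tar.gz" % commit],
        "strip_prefix": "xla-%s" % commit,
    }

pin = xla_pin("7a19856d74569fd1f765cd03bdee84e3b1fdc579")
print(pin["strip_prefix"])  # xla-7a19856d74569fd1f765cd03bdee84e3b1fdc579
```

Updating the pin then means changing one hash, rather than editing `urls` and `strip_prefix` separately and risking a mismatch.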
Can you also update the libtpu dependency in setup.py to the same date as this commit?
done
tested on v4-8: with command
result: Old: ===
openxla_patches/gpu_build_file.diff
Outdated
"@tsl//tsl/platform:casts",
"@tsl//tsl/platform:errors",
- ] + if_cuda([
+ ] + if_cuda_or_rocm([
Thanks!
This patch looks like it is for openxla/xla@9938bdb, so I'm curious why it skips the corresponding change to load("//xla/stream_executor:build_defs.bzl", "if_cuda_or_rocm", "if_gpu_is_configured")?
Also, GPU CI failed with the same issue (RuntimeError: torch_xla/csrc/device.cpp:72 : Invalid device specification: CUDA:0), are they related too?
No particular reason.
I started importing on Oct 3 and this change is Oct 4.
],
-strip_prefix = "xla-97a5f819faf9ff793b7ba68ff1f31f74f9459c18",
+strip_prefix = "xla-51b59cfb1999c6f1b3ec59851675044b2c502aae",
Thanks for moving the head to this commit!
setup.py
Outdated
@@ -72,7 +72,7 @@
base_dir = os.path.dirname(os.path.abspath(__file__))
-_libtpu_version = '0.1.dev20230825'
+_libtpu_version = '0.1.dev20231009'
I suspect this should be 0.1.dev20231010 in order to include the open xla commit you specified.
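The reviewer's reasoning, that a nightly build stamped YYYYMMDD only snapshots commits up to the previous day, so a commit landing on 2023-10-09 first appears in the dev20231010 build, can be sketched as a small check. The cutoff convention here is an assumption inferred from this thread, not documented libtpu behavior:

```python
from datetime import date, timedelta

def nightly_includes_commit(libtpu_version: str, commit_date: date) -> bool:
    """Assume a 0.1.devYYYYMMDD nightly snapshots the tree as of that
    morning, so it contains commits through the previous day only."""
    stamp = libtpu_version.rsplit("dev", 1)[1]  # e.g. "20231010"
    build_date = date(int(stamp[:4]), int(stamp[4:6]), int(stamp[6:8]))
    return commit_date <= build_date - timedelta(days=1)

# Assumption: the pinned XLA commit landed on 2023-10-09.
print(nightly_includes_commit("0.1.dev20231009", date(2023, 10, 9)))  # False
print(nightly_includes_commit("0.1.dev20231010", date(2023, 10, 9)))  # True
```

Under that assumption, dev20231009 would miss the pinned commit, matching the reviewer's suggestion to bump to dev20231010.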
done.
LGTM. Let me enable TPU CI and wait until it finishes.
Open XLA pin update - updated to 20231010