-
Notifications
You must be signed in to change notification settings - Fork 487
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor Dynamo (custom op) integration code #5805
Conversation
@JackCaoG I see this test failing, but other dynamo tests are still passing
not sure what's happening, this fails consistently with/without the dynamo custom op. cc @wonjoolee95 |
Let me take a look later today |
105e4fe
to
b4e7a5a
Compare
The test requires multiple devices & non-GPU. So it won't run on the CI -- needs to test locally. |
d3a391f
to
46a7eb8
Compare
OK.. I can tell you what's support to happen. xla/torch_xla/core/dynamo_bridge.py Lines 364 to 378 in 6e66130
After THRESHOLD is reached, the dynamo won't check about input sharding. If we changed the input sharding, we will try to execute a compiled program with input that has different sharding, so it will crash. The check in the test was to make sure crash actually happened. I think you can just step through the test, you should see some C++ exception long during |
1dfd000
to
4e506b7
Compare
4e506b7
to
a6dac44
Compare
Synced offline with @JackCaoG he would help follow up. TLDR, it's recompiling, and we need to check if the threashold is being enforced.
|
@JackCaoG @wonjoolee95 we can review & land this refacotring PR, I won't address the test issue here (also, the test is not being run in the CPU/GPU CI). |
c998142
to
0664a30
Compare
Ok, found another test regression --
|
7cbefd4
to
acdd21b
Compare
acdd21b
to
1009476
Compare
This works now. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
* Refactor and clean SPMD+Dynamo integration code
* Refactor and clean SPMD+Dynamo integration code
* Refactor and clean SPMD+Dynamo integration code
* Refactor and clean SPMD+Dynamo integration code
Refactor after #5712 cc @wonjoolee95 @JackCaoG