-
Notifications
You must be signed in to change notification settings - Fork 487
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Move where clear pending IR is called to avoid crash #5552
Conversation
Thanks! Seems like the dynamo unit tests fail due to metric comparisons, probably due to moving the |
I am kind of glad that CI failed, so I can figure out what IR produced by |
ok I am able to repo locally, will try to take a look today or tmr |
hmm, with my most recent fix, hf_bert crashed again.. looking into it |
Ah OK, the
I think it is OK, I will bump up the counter |
@wonjoolee95 can I get a review for this one? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks!
* Move where clear pending IR is called to avoid crash * fix CI * fix CI and add some debugging messages
* Handle dynamo function without input (#5565) (#5577) * Make cpu tensor on XLA dynamo backend a warning instead of error (#5549) (#5576) * [author: jluntamazon] Adding more explicit HLO lowering control by exposing LoweringContext… (#5431) (#5580) * Adding more explicit HLO lowering control by exposing LoweringContext (and utilities) to python for Neuron * fixing linter issues * fixing spacing * apply comments and fix compilation errors * add test for new apis * fix linter * update test * update test * modify test * reverse back to GetIrValue() * update test inputs with random numbers * skip unittest because it only fails in CI --------- Co-authored-by: aws-kingrj <[email protected]> Co-authored-by: Ubuntu <[email protected]> Co-authored-by: seanlatias <[email protected]> * fixing num_local_processes typo (#5573) (#5579) Co-authored-by: aws-kingrj <[email protected]> * Move where clear pending IR is called to avoid crash (#5552) (#5582) * Move where clear pending IR is called to avoid crash * fix CI * fix CI and add some debugging messages * Fix release branch and tag patterns for GitHub Actions (#5587) (#5590) * Improve bernoulli rng-bit-generation memory footprint (#5581) (#5589) * Allow downcasting RngUniform genenration for Bernoulli Co-authored-by: Yeounoh Chung <[email protected]> * Enable xla:gpu autocast for bfloat16 if not restricted (#5570) (#5591) * Enable autocast for XLA:GPU * linter fix * XLA autocast test for GPU and TPU * linter fix * Ensure that xla autocast is properly enabled for GPU and does not crash when torch cuda is not available. * linter fix * Add tests * Support bf16 * linter fix * exclude unsupported test cases * increase GPU test timeout to 300 Co-authored-by: Yeounoh Chung <[email protected]> * Cherry-pick: Don't trigger CI build on release tag push (#5595) Copy of #5594 on release branch * formatting --------- Co-authored-by: JackCaoG <[email protected]> Co-authored-by: Wonjoo Lee <[email protected]> Co-authored-by: aws-kingrj <[email protected]> Co-authored-by: Ubuntu <[email protected]> Co-authored-by: seanlatias <[email protected]> Co-authored-by: Manfei <[email protected]> Co-authored-by: Yeounoh Chung <[email protected]>
Without this patch
will crash with
We expect
FallBackNodeCollector
to introduce in place operation toxla_args
so we want to replaced it with the cloned arg and clearing the pending IR. What I found is thatCapabilityBasedPartitioner
(or something close to that region of the code) will also introduce IRs. On top of that after thePartitioner
, thexla_args
passed to theextract_internal
is not the samexla_arg
passed toextract_compiled_graph
. If we calledclearPendingIr
afterPartitioner
we might remove the pending IR of the copiedxla_arg
and we have no way to restore those values then dynamo will crash.I chose to move the
clearpendingIr
to a earlier region of the code and usemark_step
to turn them into device data. I have some concern of the correctness of this approach but I need more time to debug whyPartitioner
will introduce IRs to begin with.