Draft: Lowering Aten op to composite op instead of small ops #8502
This PR addresses the second question in this issue: supporting composite ops in training.
Motivation
Composite ops are beneficial for performance optimization, and we aim to apply them to training as well. According to the response in the issue, the community currently has no plan to extend composite op support to training, so I created this draft PR to demonstrate our intention.
Detail
This PR changes the Aten op lowering process for ops that have no 1:1 mapping to an XLA op: instead of lowering them into many small XLA ops, it emits a composite call. Later, during optimization, the composite call can easily be replaced with a custom kernel or decomposed into the small ops.
This is still a draft PR and only `Gelu` is implemented as an example. If it gets accepted, here are some further suggestions:

- Add a setting (e.g. an environment variable such as `XLA_COMPOSITE_OP`) to enable this feature. Also, add an op list setting to define which ops can be composed.

Example
With this PR, the generated StableHLO is:
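The exact StableHLO produced by this PR is not reproduced here; the sketch below only illustrates the intended shape of the output. It assumes the composite is named `aten.gelu`, uses the tanh approximation, and operates on an 8x128 f32 tensor; all names, shapes, and attributes are assumptions rather than the PR's actual output.

```mlir
module {
  func.func @main(%arg0: tensor<8x128xf32>) -> tensor<8x128xf32> {
    // gelu stays a single composite call instead of being expanded into
    // many small ops at lowering time.
    %0 = "stablehlo.composite"(%arg0) {
      name = "aten.gelu",
      composite_attributes = { approximate = "tanh" },
      decomposition = @aten.gelu.decomposition,
      version = 1 : i32
    } : (tensor<8x128xf32>) -> tensor<8x128xf32>
    return %0 : tensor<8x128xf32>
  }

  // Fallback decomposition (tanh-approximate gelu). A later optimization
  // pass can either inline this body or swap the composite for a custom
  // kernel, which is the flexibility described above.
  func.func private @aten.gelu.decomposition(%arg0: tensor<8x128xf32>) -> tensor<8x128xf32> {
    %half = stablehlo.constant dense<5.000000e-01> : tensor<8x128xf32>
    %one = stablehlo.constant dense<1.000000e+00> : tensor<8x128xf32>
    %coeff = stablehlo.constant dense<4.471500e-02> : tensor<8x128xf32>
    %sqrt_2_over_pi = stablehlo.constant dense<0.797884583> : tensor<8x128xf32>
    // inner = sqrt(2/pi) * (x + 0.044715 * x^3)
    %x2 = stablehlo.multiply %arg0, %arg0 : tensor<8x128xf32>
    %x3 = stablehlo.multiply %x2, %arg0 : tensor<8x128xf32>
    %scaled_x3 = stablehlo.multiply %coeff, %x3 : tensor<8x128xf32>
    %sum = stablehlo.add %arg0, %scaled_x3 : tensor<8x128xf32>
    %inner = stablehlo.multiply %sqrt_2_over_pi, %sum : tensor<8x128xf32>
    // gelu(x) = 0.5 * x * (1 + tanh(inner))
    %tanh = stablehlo.tanh %inner : tensor<8x128xf32>
    %one_plus = stablehlo.add %one, %tanh : tensor<8x128xf32>
    %half_x = stablehlo.multiply %half, %arg0 : tensor<8x128xf32>
    %result = stablehlo.multiply %half_x, %one_plus : tensor<8x128xf32>
    return %result : tensor<8x128xf32>
  }
}
```

The key point of the sketch is that the composite keeps the op boundary visible while still carrying a `decomposition` function, so an optimization pass can choose between inlining the small ops and substituting a custom kernel.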