Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support S32/U32 indices for BWD embedding and index ops & Neuron implicit downcast #8450

Closed
wants to merge 17 commits into from

Conversation

rpsilva-aws
Copy link
Contributor

@rpsilva-aws rpsilva-aws commented Dec 4, 2024

In this PR, we extend tensor operations to allow S32 indices. This follows suits with other operations, in order to add flexibility and potentially performance benefits for accelerator backends. Reference for embedding dense bwd: https://github.com/pytorch/pytorch/blob/main/aten/src/ATen/native/Embedding.cpp#L117

I re-structure a good portion of the tests to validate both types.

In addition, we also re-introduce the implicit downcasting for Neuron S64/U64 types, since the Neuron compiler does not support 64 bits.

@rpsilva-aws rpsilva-aws marked this pull request as ready for review December 4, 2024 18:41
@miladm miladm requested review from bhavya01 and tengyifei December 5, 2024 00:04
@miladm
Copy link
Collaborator

miladm commented Dec 5, 2024

is this PR ready for review @rpsilva-aws?

@rpsilva-aws
Copy link
Contributor Author

@miladm Yes, but there seem to be some issues with the test infra - I need to validate with the TPU docker. I can ping once done/fixed.

@tengyifei
Copy link
Collaborator

Can just ping me again when the "Build and test" has passed

@rpsilva-aws
Copy link
Contributor Author

Unfortunately, part of this PR has a dependency on pytorch/pytorch#142160.

Instead, creating 2 PRs:

@rpsilva-aws rpsilva-aws closed this Dec 6, 2024
@rpsilva-aws rpsilva-aws deleted the rpsilva_downcast branch December 9, 2024 19:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants