Support list of ShardingSpec in MpDeviceLoader #5789
base: master
Conversation
Force-pushed from 8cd4e2e to ab617f8
for sharding in input_sharding:
  if sharding.can_apply(tensor):
    shardings[i] = sharding.xla_spec(tensor)
    break
Maybe we should also add a comment noting that the first matching spec is the one applied?
Done, thanks Yeounoh!
Is first-match a common practice? Or would it be better to have either:
- one sharding spec applied to everything; or
- one sharding spec for each input, matched in order.
That way the behavior is well defined. Otherwise, this "compatibility check" is a complete black box for the user.
It's definitely not ideal... It would probably be better to have the user provide sharding specs in a structure matching that of the inputs and avoid the black-box compatibility check, e.g.:
# Inputs of the form:
{"a": torch.randn(16), "b": torch.randn(16, 16), "c": torch.randn(16, 16, 16)}
# Then input_sharding would be:
{"a": ShardingSpec(mesh, (0,)), "b": ShardingSpec(mesh, (0, None)), "c": ShardingSpec(mesh, (0, None, None))}
This is a bit of a refactor of the TensorToXlaArena code though, so I took the easy way out. I'll go ahead and take a stab at the cleaner approach now.
No worries. Take your time.
LGTM
In #5768, there is a use case that requires sharding input tensors of different ranks. Currently, MpDeviceLoader's input_sharding accepts only a single spec, which is applied to all compatible tensors. With this change, multiple specs can be provided at construction, and each input tensor has the first compatible spec in the list applied to it.
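A rough usage sketch of the list form described above. Module paths reflect the SPMD-era torch_xla APIs and may differ across releases (e.g. torch_xla.experimental.xla_sharding in older ones); the dummy dataset and variable names are made up for illustration, and the list-valued input_sharding is the behavior added by this PR.

# Minimal sketch, assuming current torch_xla SPMD module paths.
import numpy as np
import torch
import torch_xla.core.xla_model as xm
import torch_xla.distributed.parallel_loader as pl
import torch_xla.distributed.spmd as xs

num_devices = xm.xrt_world_size()
mesh = xs.Mesh(np.arange(num_devices), (num_devices,), ('data',))

# Dummy loader yielding one rank-2 and one rank-3 tensor per batch.
dataset = torch.utils.data.TensorDataset(
    torch.randn(64, 16), torch.randn(64, 16, 16))
train_loader = torch.utils.data.DataLoader(dataset, batch_size=8)

# Each input tensor gets the first spec in the list that is compatible
# with it (here, batch-dim sharding for rank-2 and rank-3 inputs).
input_sharding = [
    xs.ShardingSpec(mesh, (0, None)),
    xs.ShardingSpec(mesh, (0, None, None)),
]

device_loader = pl.MpDeviceLoader(
    train_loader, xm.xla_device(), input_sharding=input_sharding)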