feat: add graph splitter for all-gather/all-reduce operations #1545

Merged

polvalente
Contributor

For compilation passes we'll have:

- ShardingPropagation -> Part of what ShardingCompiler currently does. This is the purpose of `__compile__`: it propagates the sharding throughout the function.
- GraphSplitter -> This PR (see below).
- ShardingSeparation -> Also currently part of ShardingCompiler; it is mostly what `__jit__` executes today.

This PR adds the GraphSplitter pass, which we can use to split a computation into stages based on
certain conditions (for example, at operations that require a given dimension to be unsharded).

It also builds out a list of computation specs that we can use to build an execution graph afterwards.
Each of these stages might have further fan-out depending on the subsequent ShardingSeparation pass.
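
To make the splitting idea concrete, here is a minimal sketch in Python. It is not the code from this PR and it does not use any Nx API. It assumes the graph is a topologically ordered list of nodes and that a hypothetical set of op names, `SPLIT_OPS`, marks where a split is needed. `Node`, `Stage` and `split_graph` are illustrative names. Each resulting stage records which values it needs from earlier stages and which values later stages need from it, which is roughly the information a computation spec would need in order to build an execution graph.

```python
from dataclasses import dataclass, field

# Assumption: ops that need an unsharded dimension are identified by name.
# The real pass presumably decides this from sharding information instead.
SPLIT_OPS = {"all_gather", "all_reduce"}


@dataclass
class Node:
    id: str
    op: str
    args: tuple = ()  # ids of the nodes this node reads from


@dataclass
class Stage:
    nodes: list = field(default_factory=list)    # node ids computed in this stage
    inputs: list = field(default_factory=list)   # ids produced by earlier stages
    outputs: list = field(default_factory=list)  # ids needed after this stage


def split_graph(nodes, result_id):
    """Split a topologically ordered node list into stages at SPLIT_OPS."""
    # 1. Cut the node list so each split op starts a new group.
    groups, current = [], []
    for node in nodes:
        if node.op in SPLIT_OPS and current:
            groups.append(current)
            current = []
        current.append(node)
    if current:
        groups.append(current)

    # 2. Work out what crosses each stage boundary.
    stages = []
    for i, group in enumerate(groups):
        local = {n.id for n in group}
        used_later = {a for g in groups[i + 1:] for n in g for a in n.args}
        inputs = []
        for n in group:
            for a in n.args:
                if a not in local and a not in inputs:
                    inputs.append(a)
        outputs = [n.id for n in group if n.id in used_later or n.id == result_id]
        stages.append(Stage([n.id for n in group], inputs, outputs))
    return stages


# Usage: y = x + x; g = all_gather(y); z = g * x
graph = [
    Node("x", "parameter"),
    Node("y", "add", ("x", "x")),
    Node("g", "all_gather", ("y",)),
    Node("z", "multiply", ("g", "x")),
]
stages = split_graph(graph, result_id="z")
for index, stage in enumerate(stages):
    print(index, stage.nodes, stage.inputs, stage.outputs)
```

In this sketch the split happens just before the gathering op, so the earlier stage can finish on sharded data and hand its outputs to the next stage. Whether the actual pass cuts before or after such an op is not stated here, so treat that choice as an assumption.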

@polvalente polvalente self-assigned this Oct 16, 2024
@polvalente polvalente changed the base branch from main to pv-feat/experimental-sharding-backend October 16, 2024 08:54
@polvalente polvalente marked this pull request as ready for review October 17, 2024 08:22
@polvalente polvalente changed the title from "[WIP] feat: add graph splitter for all-gather/all-reduce operations" to "feat: add graph splitter for all-gather/all-reduce operations" Oct 17, 2024
@polvalente polvalente merged commit 6eb6fba into pv-feat/experimental-sharding-backend Oct 17, 2024
8 checks passed
@polvalente polvalente deleted the pv-feat/add-graph-splitter branch October 17, 2024 08:25