Add the support of using WholeGraph distributed embedding to store/update sparse_emb #677

chang-l · 2023-12-06T07:40:04Z

Update on 12/11/2023:

Most of the added code is covered by unit tests, except the sparse optimizer step function in graphstorm. To compensate for this and gain more confidence in the integrated feature, I also added an independent unit test, test_wg_sparse_opt.py, which has no dependency on graphstorm. It shows that the integrated WholeGraph sparse optimizer generates embeddings that are close enough to those from the torch sparse Adam optimizer.
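To illustrate the kind of check described above, here is a minimal, dependency-free sketch. It is not the code in test_wg_sparse_opt.py and it uses neither WholeGraph nor PyTorch. It only shows the pattern: run the same sparse updates through two independently written sparse Adam implementations and assert that the resulting embeddings agree within a tolerance. All names (`LazySparseAdam`, `reference_sparse_adam`, `embeddings_close`) are hypothetical, and the single global step counter is an assumption; real optimizers may differ in such details.

```python
import copy
import math


class LazySparseAdam:
    """Toy sparse Adam: only rows that receive a gradient are updated."""

    def __init__(self, table, lr=0.01, beta1=0.9, beta2=0.999, eps=1e-8):
        self.table = table
        self.lr, self.beta1, self.beta2, self.eps = lr, beta1, beta2, eps
        self.step_count = 0      # global step counter (an assumption)
        self.m, self.v = {}, {}  # per-row first and second moments

    def step(self, row_ids, grads):
        # Simplification: row_ids are assumed to be unique within one step.
        self.step_count += 1
        c1 = 1.0 - self.beta1 ** self.step_count
        c2 = 1.0 - self.beta2 ** self.step_count
        for rid, grad in zip(row_ids, grads):
            m = self.m.setdefault(rid, [0.0] * len(grad))
            v = self.v.setdefault(rid, [0.0] * len(grad))
            row = self.table[rid]
            for k, g in enumerate(grad):
                m[k] = self.beta1 * m[k] + (1.0 - self.beta1) * g
                v[k] = self.beta2 * v[k] + (1.0 - self.beta2) * g * g
                row[k] -= self.lr * (m[k] / c1) / (math.sqrt(v[k] / c2) + self.eps)


def reference_sparse_adam(table, updates, lr=0.01, beta1=0.9, beta2=0.999, eps=1e-8):
    """Independent reference using an algebraically equivalent formulation."""
    m = [[0.0] * len(r) for r in table]
    v = [[0.0] * len(r) for r in table]
    for t, (row_ids, grads) in enumerate(updates, start=1):
        step_size = lr / (1.0 - beta1 ** t)
        c2_sqrt = math.sqrt(1.0 - beta2 ** t)
        for rid, grad in zip(row_ids, grads):
            for k, g in enumerate(grad):
                m[rid][k] += (1.0 - beta1) * (g - m[rid][k])
                v[rid][k] += (1.0 - beta2) * (g * g - v[rid][k])
                denom = math.sqrt(v[rid][k]) / c2_sqrt + eps
                table[rid][k] -= step_size * m[rid][k] / denom
    return table


def embeddings_close(a, b, rel_tol=1e-6, abs_tol=1e-9):
    """Element-wise tolerance check, since the two paths round differently."""
    return all(
        math.isclose(x, y, rel_tol=rel_tol, abs_tol=abs_tol)
        for row_a, row_b in zip(a, b)
        for x, y in zip(row_a, row_b)
    )


# A 4 x 3 embedding table; row 3 never receives a gradient.
initial = [[0.1 * (i + 1) * (k + 1) for k in range(3)] for i in range(4)]
updates = [
    ([0, 2], [[0.5, -0.2, 0.1], [0.3, 0.3, -0.4]]),
    ([2], [[-0.1, 0.2, 0.6]]),
    ([0, 1], [[0.05, 0.4, -0.3], [0.2, -0.5, 0.1]]),
]

table_a = copy.deepcopy(initial)
opt = LazySparseAdam(table_a)
for row_ids, grads in updates:
    opt.step(row_ids, grads)

table_b = reference_sparse_adam(copy.deepcopy(initial), updates)
```

The comparison uses a tolerance rather than exact equality because two correct implementations can order their floating-point operations differently, which is why "close enough" is the right criterion when comparing optimizers across backends.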

The current code should now be compatible, i.e., all unit tests should pass (except for the failures due to #685). This PR is ready for review.

Updated todos:

- ~~Test/debug sparse_optimizer.step() to show the updates are consistent between WholeGraph, DistDGL, and PyTorch~~
- Validate/benchmark an end-to-end test with learnable embeddings using WholeGraph
- Misc. questions/TODOs in the comments

Just as the WholeGraph distributed feature store is already used for feature gathering, this PR adds support for managing sparse_emb with WholeGraph, which unblocks learnable embeddings with WholeGraph.

For now, it is still a draft PR. The working test only covers save/load sparse embeddings using WholeGraph.

Todos:

- Test/debug sparse_optimizer.step() to show the updates are consistent between WholeGraph, DistDGL, and PyTorch
- Add an end-to-end test with learnable embeddings using WholeGraph (any missing pieces?)
- Misc. questions/TODOs in the comments

@zheng-da @classicsong @isratnisa @nvcastet @TristonC


chang-l · 2024-01-13T00:42:03Z

@classicsong Here is an example of the YAML file change for this PR that adds WholeGraph support for learnable embeddings: 44e4643

Note that I only updated gsgnn_lp.py, which covers link prediction tasks.


chang-l added 6 commits December 5, 2023 23:28

Add wholegraph distributed embedding support for sparse_emb

cc907b9

Remove the test that is not ready

ebe46e6

Add scatter op to load embeddings

ab492a6

Complete the tests

2c0b535

Minor update: reorder code

32a1f66

Add more tests

a407a01

chang-l marked this pull request as ready for review December 12, 2023 07:11

classicsong added the "ready" label (able to trigger the CI) Dec 12, 2023

chang-l added 3 commits December 12, 2023 14:35

Formatting for lint and better case control for tests

46ad27d

Add env to turn on/off wg sparse emb

48ed6e4

Fix a bug in wholegraph sparse_emb forward call

3c1e131

classicsong reviewed Dec 16, 2023


chang-l added 12 commits January 2, 2024 15:43

Refactor code to separate WholeGraph-related functions

003dfe0

Merge branch 'wholegraph_reorg' into add_wg_sparse_emb_rebased

3ef307e

Refactor and simplify WholeGraph integration of Sparse Opt

26eb51c

Refactor and simplify WholeGraph integration of Sparse Opt. Bug fixes and improve tests.

Fix lint

8bc04ce

Fix lint

04a47ea

Address comment

8164218

Fix lint

ef2b789

Add Copyright

0c36e8c

Merge branch 'wholegraph_reorg' into add_wg_sparse_emb_rebased

58b3e3a

And slightly improve the code quality

Fix all tests

afa3d05

Merge branch 'main' into add_wg_sparse_emb_rebased

f8ab9df

Minor update

a904722

Update to be compatible when wholegraph is not available

532ffcf

classicsong reviewed Jan 18, 2024


chang-l added 3 commits January 19, 2024 15:40

Partly address comments

6ebb492

Intermediate commit of refactoring WholeGraph Tensor class

4728512

Refactor to materialize sparse_emb later

f5b93ae

To accommodate pytorch/DistDGL logic

Update WG sparse opt unit test to compare against distDGL

81b2b88

classicsong reviewed Jan 25, 2024


chang-l added 2 commits January 25, 2024 16:05

Address comments

bef8406

Minor update

a93b5ae

classicsong reviewed Feb 1, 2024


chang-l added 5 commits February 2, 2024 14:18

Address comments

5a8742f

Add cmd argument

9f59760

Add e2e tests

b990920

Add a check to see whether wholegraph is installed

cc4fd30

Roll back to remove e2e tests

7fd694a

classicsong approved these changes Feb 8, 2024


classicsong and others added 2 commits February 7, 2024 22:25

Merge branch 'main' into add_wg_sparse_emb_rebased

8bad6bd

Resolve conflicts in main branch for unit tests

158e8c9

classicsong merged commit 5e05469 into awslabs:main Feb 8, 2024
6 checks passed

chang-l mentioned this pull request Feb 16, 2024

[WholeGraph] Add support of using WholeGraph to store/load cache_lm_emb #737

Merged
