[DO NOT MERGE] Reduce parallelism in devcontainer #3925

Closed
wants to merge 8 commits into from

@AyodeAwe AyodeAwe commented Oct 10, 2023

alexbarghi-nv and others added 7 commits September 30, 2023 23:35
Created based on code from @dongxuy04 

Adds support for `WholeGraph` `WholeMemory` in the cuGraph `FeatureStore` class.  This enables both DGL and PyG to take advantage of distributed feature store functionality.

Adds `pylibwholegraph` as a testing dependency so the feature store can be tested.  Adds appropriate SG and MG tests.

Authors:
  - Alex Barghi (https://github.com/alexbarghi-nv)

Approvers:
  - Ray Douglass (https://github.com/raydouglass)
  - Brad Rees (https://github.com/BradReesWork)
  - Vibhu Jawa (https://github.com/VibhuJawa)

URL: rapidsai#3874
…MFG creation (rapidsai#3887)

Allow cugraph-dgl dataloader to consume sampled outputs from BulkSampler in CSC format.

Authors:
  - Tingyu Wang (https://github.com/tingyu66)
  - Seunghwa Kang (https://github.com/seunghwak)
  - Alex Barghi (https://github.com/alexbarghi-nv)

Approvers:
  - Seunghwa Kang (https://github.com/seunghwak)
  - Alex Barghi (https://github.com/alexbarghi-nv)
  - Vibhu Jawa (https://github.com/VibhuJawa)

URL: rapidsai#3887
This handles isolated nodes in `louvain_communities` similar to what is done in rapidsai#3886. This is expected to be a temporary fix until pylibcugraph can handle isolated nodes.

As a bonus, I added `isolates` algorithm 🎉
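For readers unfamiliar with the problem, here is a minimal pure-Python sketch of one plausible way to account for isolated nodes: run community detection on the connected part of the graph, then append each isolated node as its own singleton community. The function names, the adjacency-dict representation, and the singleton strategy are assumptions for illustration only; the actual change in rapidsai#3897 may work differently.

```python
def isolates(adjacency):
    """Return the nodes that have no neighbors (hypothetical helper)."""
    return [node for node, neighbors in adjacency.items() if not neighbors]


def add_isolates_as_singletons(communities, adjacency):
    """Append every isolated node not yet covered as a singleton community.

    `communities` is assumed to be a list of sets produced by a community
    detection step that ignored isolated nodes.
    """
    covered = set().union(*communities)
    result = [set(c) for c in communities]
    for node in isolates(adjacency):
        if node not in covered:
            result.append({node})
    return result


# Toy graph: 0 and 1 are connected, 2 and 3 have no edges.
adjacency = {0: {1}, 1: {0}, 2: set(), 3: set()}
communities = [{0, 1}]
full = add_isolates_as_singletons(communities, adjacency)
```

In this toy example, `full` contains the original community plus one singleton community for each of nodes 2 and 3.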

CC @naimnv @rlratzel

Authors:
  - Erik Welch (https://github.com/eriknw)

Approvers:
  - Rick Ratzel (https://github.com/rlratzel)

URL: rapidsai#3897
Integrates the new CSR bulk sampler output, allowing batches to be read without calling CSC conversion or counting the number of vertices and edges in each batch. This should result in major performance improvements, especially for small batches.
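As a rough illustration of why a compressed sparse layout avoids a separate counting pass, the sketch below reads the vertex and edge counts straight from a generic CSR offsets array. The array names and helper functions are assumptions for illustration; the actual layout of the bulk sampler output is not shown in this PR.

```python
def csr_counts(offsets):
    """Return (num_vertices, num_edges) from a CSR offsets array.

    offsets has one entry per vertex plus a final end marker, so both
    counts fall out of its length and its first and last values.
    """
    num_vertices = len(offsets) - 1
    num_edges = offsets[-1] - offsets[0]
    return num_vertices, num_edges


def neighbors(offsets, indices, v):
    """Return the neighbor list of vertex v as a slice of `indices`."""
    return indices[offsets[v]:offsets[v + 1]]


# Toy batch: vertex 0 -> {1, 2}, vertex 1 -> {2}, vertex 2 -> {}.
offsets = [0, 2, 3, 3]
indices = [1, 2, 2]
counts = csr_counts(offsets)
```

Here `counts` is available in constant time, without scanning `indices`.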

Authors:
  - Alex Barghi (https://github.com/alexbarghi-nv)
  - Seunghwa Kang (https://github.com/seunghwak)
  - Brad Rees (https://github.com/BradReesWork)

Approvers:
  - Brad Rees (https://github.com/BradReesWork)
  - Ray Douglass (https://github.com/raydouglass)
  - Tingyu Wang (https://github.com/tingyu66)

URL: rapidsai#3873
This PR increases the minimum timeout when waiting for the workers to complete their tasks.

Authors:
  - Joseph Nke (https://github.com/jnke2016)

Approvers:
  - Brad Rees (https://github.com/BradReesWork)
  - Vibhu Jawa (https://github.com/VibhuJawa)
  - Rick Ratzel (https://github.com/rlratzel)
  - Jake Awe (https://github.com/AyodeAwe)

URL: rapidsai#3907
… `cugraph.Graph` (rapidsai#3895)

This PR attempts to fix rapidsai#3790 

Please note that I have not been able to reproduce the failure locally, so it is really hard for me to know whether this actually fixes anything.

MRE being used to test locally: https://gist.github.com/VibhuJawa/4b1ec24022b6e2dd7879cd2e8d3fab67


CC: @jnke2016, @rlratzel

CC: @rjzamora, please let me know what I can do better here.

Authors:
  - Vibhu Jawa (https://github.com/VibhuJawa)
  - Brad Rees (https://github.com/BradReesWork)

Approvers:
  - Rick Ratzel (https://github.com/rlratzel)
  - Joseph Nke (https://github.com/jnke2016)

URL: rapidsai#3895
@AyodeAwe added the `improvement` (Improvement / enhancement to an existing function) and `non-breaking` (Non-breaking change) labels Oct 10, 2023
@AyodeAwe changed the title from "[DO NOT MERGE] Reduce parallel" to "[DO NOT MERGE] Reduce parallelism in devcontainer" Oct 10, 2023

AyodeAwe commented Oct 10, 2023

There's a giant memory leak happening somewhere after the C++ build is kicked off in the devcontainer (for the CUDA 12.0, pip variant). Trying to figure out where.

Because of this, it looks like reducing CPU parallelism would be irrelevant here.

@AyodeAwe

Issue fixed, closing PR.

@AyodeAwe AyodeAwe closed this Oct 10, 2023