[GSProcessing] Add thread-parallelism to in-memory repartition implementation. #628

Merged Nov 10, 2023 (3 commits)

thvasilo
Contributor

thvasilo commented Nov 8, 2023

Also add more detailed docs about the re-partitioning step.

Issue #, if available:

Description of changes:

  • Use thread parallelism in the in-memory file repartition implementation to speed up the process. In our experiments with 10B+ edges, we observed at least a 2x speedup.
  • Add more detailed documentation about the re-partitioning step: what it does and why it is needed.
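
For readers unfamiliar with the idea, the following is a minimal sketch of thread-parallel in-memory repartitioning, not the code merged in this PR. It assumes the data fits in memory as plain Python lists and that the goal is to redistribute rows so that each output partition has a requested row count. The function name `repartition_in_memory` and its parameters are hypothetical and not part of the GSProcessing API.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import accumulate


def repartition_in_memory(parts, target_counts, max_workers=4):
    """Redistribute rows so that output i holds target_counts[i] rows.

    Hypothetical helper for illustration only; not the GSProcessing API.
    """
    # Concatenate all input partitions, preserving row order.
    rows = [row for part in parts for row in part]
    if sum(target_counts) != len(rows):
        raise ValueError("target_counts must sum to the total number of rows")

    # Prefix sums give the [start, end) row range of each output partition.
    ends = list(accumulate(target_counts))
    starts = [0] + ends[:-1]

    def build(bounds):
        start, end = bounds
        # Assumption: in a real pipeline, this is where one output file
        # would be written, which is the part that benefits from threads.
        return rows[start:end]

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map returns results in the same order as its inputs.
        return list(pool.map(build, zip(starts, ends)))


print(repartition_in_memory([[1, 2, 3], [4], [5, 6]], [2, 2, 2]))
# [[1, 2], [3, 4], [5, 6]]
```

Threads mainly help here when the per-partition work is I/O-bound (reading or writing files) or runs in code that releases the GIL. The pure-Python slicing above only stands in for that work.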

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

thvasilo added this to the 0.2.1 Release Plan milestone Nov 8, 2023
thvasilo self-assigned this Nov 8, 2023
thvasilo added the ready (able to trigger the CI) and gsprocessing (for issues and PRs related to the GSProcessing library) labels Nov 8, 2023
thvasilo and others added 2 commits November 8, 2023 12:50
thvasilo merged commit 56f8851 into awslabs:main Nov 10, 2023
3 checks passed
thvasilo deleted the parallel-repart branch November 10, 2023 00:44