Improve neighbor allreduce #78

Merged
hanbinhu merged 40 commits into master from improve_neighbor_allreduce
Apr 11, 2021

Conversation

hanbinhu
Collaborator

No description provided.

BichengYing and others added 30 commits February 7, 2021 21:06
…ion Not Implemented, CUDA data_weight problem
@hanbinhu hanbinhu added the enhancement New feature or request label Mar 18, 2021
@hanbinhu hanbinhu requested a review from BichengYing March 18, 2021 09:12
@hanbinhu hanbinhu self-assigned this Mar 18, 2021
Collaborator

@BichengYing BichengYing left a comment

Please merge master into this branch to bring in ci.yml. Could you also add more tests to it?

bluefog/common/common.h
bluefog/common/cuda/cuda_kernels.cu
bluefog/common/mpi_context.h
bluefog/common/mpi_controller.cc
bluefog/common/mpi_controller.cc Outdated
bluefog/torch/mpi_ops.py
test/torch_ops_test.py
bluefog/torch/mpi_ops.py Outdated
bluefog/torch/mpi_ops.py Outdated
bluefog/common/nccl_controller.cc Outdated
bluefog/common/mpi_controller.cc Outdated
bluefog/common/common.h
bluefog/common/mpi_controller.cc Outdated
bluefog/common/nccl_controller.cc Outdated
bluefog/common/operations.cc Outdated
bluefog/common/mpi_controller.cc
bluefog/common/mpi_controller.cc Outdated
bluefog/common/mpi_controller.cc Outdated
bluefog/common/mpi_controller.cc Outdated
bluefog/common/nccl_controller.cc Outdated
@hanbinhu hanbinhu requested a review from BichengYing March 26, 2021 08:06
bluefog/common/mpi_controller.cc
bluefog/common/mpi_controller.cc Outdated
bluefog/common/mpi_controller.cc Outdated
bluefog/common/nccl_controller.cc Outdated
bluefog/common/mpi_controller.cc
bluefog/common/mpi_controller.cc Outdated
bluefog/common/tensor_queue.h Outdated
bluefog/torch/mpi_ops.cc Outdated
bluefog/torch/mpi_ops.cc Outdated
setup.py Outdated
Hanbin Hu and others added 2 commits March 27, 2021 12:49
* Add condition variable to control the loop

* Minor update on topology_setting in global_state

* Add missing <condition_variable> header

* Change cv.wait to cv.wait_for 10 seconds

* Address comment and remove adjusting resetVersionWinMem in ibfrun
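
For readers unfamiliar with the pattern named in the commit messages above (a condition variable controlling a loop, with `cv.wait` replaced by `cv.wait_for`), here is a minimal C++ sketch. It is not taken from the bluefog sources: `LoopState`, `BackgroundLoop`, `Submit`, and `Shutdown` are hypothetical names chosen for illustration, and the timeout is a parameter rather than the fixed 10 seconds mentioned in the commit. It only shows why a timed wait can be preferable: the loop wakes up periodically even if a notification is never sent, so it cannot block forever.

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical shared state for a background loop (illustration only).
struct LoopState {
  std::mutex mu;
  std::condition_variable cv;
  bool has_work = false;   // set by producers when there is something to do
  bool shut_down = false;  // set once to ask the loop to exit
  int processed = 0;       // number of work items handled so far
};

// Runs until shut_down is set. Instead of sleeping unconditionally or
// blocking forever in cv.wait, the loop blocks in cv.wait_for, which returns
// when the predicate becomes true after a notification or when the timeout
// expires, whichever comes first.
void BackgroundLoop(LoopState& s, std::chrono::milliseconds timeout) {
  std::unique_lock<std::mutex> lock(s.mu);
  while (!s.shut_down) {
    s.cv.wait_for(lock, timeout,
                  [&s] { return s.has_work || s.shut_down; });
    if (s.has_work) {
      s.has_work = false;
      ++s.processed;  // stand-in for the real work
    }
  }
}

// Marks work as available and wakes the loop.
void Submit(LoopState& s) {
  {
    std::lock_guard<std::mutex> guard(s.mu);
    s.has_work = true;
  }
  s.cv.notify_one();
}

// Asks the loop to exit and wakes it so it can observe the flag.
void Shutdown(LoopState& s) {
  {
    std::lock_guard<std::mutex> guard(s.mu);
    s.shut_down = true;
  }
  s.cv.notify_one();
}
```

A caller would start `BackgroundLoop` on its own `std::thread`, call `Submit` whenever work arrives, and call `Shutdown` followed by `join()` at exit. The flags are modified under the mutex so that a notification cannot slip in between the predicate check and the wait.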
@hanbinhu hanbinhu merged commit 2f696ed into master Apr 11, 2021
@hanbinhu hanbinhu deleted the improve_neighbor_allreduce branch April 11, 2021 03:40