FEAT: Support pygloo in collective communication #38
Codecov Report
@@            Coverage Diff             @@
##             main      #38      +/-   ##
==========================================
- Coverage   93.85%   93.78%   -0.07%
==========================================
  Files          43       42       -1
  Lines        3399     3361      -38
  Branches      675      672       -3
==========================================
- Hits         3190     3152      -38
  Misses        138      138
  Partials       71       71
Also, we need to support |
Implemented |
We should keep only one `Store` type and wrap the TCPStore in a PrefixStore directly.
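The wrapping suggested above can be sketched as a decorator over a single store interface. This is only an illustration under assumptions: the `Store` interface, the `InMemoryStore` stand-in for TCPStore, the `contains` helper, and the `/` separator are all hypothetical and are not taken from the xoscar sources.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical minimal store interface; the real Store API may differ.
class Store {
public:
    virtual ~Store() = default;
    virtual void set(const std::string &key, const std::vector<char> &data) = 0;
    virtual std::vector<char> get(const std::string &key) = 0;
};

// In-memory stand-in for TCPStore so the sketch runs without a network.
class InMemoryStore : public Store {
public:
    void set(const std::string &key, const std::vector<char> &data) override {
        data_[key] = data;
    }
    std::vector<char> get(const std::string &key) override {
        return data_.at(key);  // throws std::out_of_range if the key is missing
    }
    bool contains(const std::string &key) const { return data_.count(key) > 0; }

private:
    std::map<std::string, std::vector<char>> data_;
};

// Wraps any Store and namespaces every key with a fixed prefix.
class PrefixStore : public Store {
public:
    PrefixStore(std::string prefix, std::shared_ptr<Store> inner)
        : prefix_(std::move(prefix)), inner_(std::move(inner)) {
        assert(inner_ != nullptr);
    }
    void set(const std::string &key, const std::vector<char> &data) override {
        inner_->set(qualify(key), data);
    }
    std::vector<char> get(const std::string &key) override {
        return inner_->get(qualify(key));
    }

private:
    // The separator is an arbitrary choice for this sketch.
    std::string qualify(const std::string &key) const { return prefix_ + "/" + key; }

    std::string prefix_;
    std::shared_ptr<Store> inner_;
};
```

With this shape, callers only ever see `Store`, and the prefixing lives in one place instead of in a second store type.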
std::vector<T> inputbuf(size);
memcpy(inputbuf.data(), input_ptr, size * sizeof(T));
Is this mem copy required by gloo?
gloo's ReduceScatterHalvingDoubling algorithm accepts a vector with just one pointer: facebookincubator/gloo#303
I am not sure whether we can remove this copy in the future.
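A minimal sketch of why the copy shows up, assuming the algorithm works in place on the single buffer it is given. `in_place_op` is a hypothetical stand-in, not gloo's API, and `run_preserving_input` is an illustrative name; the point is only that a scratch buffer keeps the caller's input untouched.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for an in-place collective that takes exactly one
// buffer pointer. It adds each element to itself, as if two ranks had
// contributed identical data.
template <typename T>
void in_place_op(const std::vector<T *> &ptrs, std::size_t count) {
    assert(ptrs.size() == 1);
    for (std::size_t i = 0; i < count; ++i) {
        ptrs[0][i] += ptrs[0][i];
    }
}

// Copies the input into a scratch buffer so the caller's data is not
// overwritten, runs the in-place operation, then copies the result out.
// T must be trivially copyable for memcpy to be valid.
template <typename T>
void run_preserving_input(const T *input_ptr, T *output_ptr, std::size_t size) {
    if (size == 0) {
        return;
    }
    std::vector<T> inputbuf(size);
    std::memcpy(inputbuf.data(), input_ptr, size * sizeof(T));
    std::vector<T *> ptrs{inputbuf.data()};
    in_place_op(ptrs, size);
    std::memcpy(output_ptr, inputbuf.data(), size * sizeof(T));
}
```

If the caller were allowed to have its input overwritten, the first copy could be dropped by passing the input pointer directly; whether that is acceptable here is the open question in this thread.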
@@ -1313,4 +1314,16 @@ void TCPStore::multiSet(const std::vector<std::string> &keys,

bool TCPStore::hasExtendedApi() const { return true; }

void TCPStore::set(const std::string &key, const std::vector<char> &data) {
    std::vector<uint8_t> dataSet(data.begin(), data.end());
This copies the data just to cast the `vector<char>` to `vector<uint8_t>`, but the send function accepts a `const char *` here https://github.com/xorbitsai/xoscar/blob/main/cpp/collective/rendezvous/include/utils.hpp#L121:
::send(socket, (const char *) currentBytes, bytesToSend, flags)
We should unify the data types to char* to avoid copying data.
> Here copying data to cast the `vector<char>` to `vector<uint8_t>`. But the send function accepts a `const char *` type here https://github.com/xorbitsai/xoscar/blob/main/cpp/collective/rendezvous/include/utils.hpp#L121:
> `::send(socket, (const char *) currentBytes, bytesToSend, flags)`
> We should unify the data types to char* to avoid copying data.
See issue pybind/pybind11#1807 in pybind11. Copying data here is just for pybind11.
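For comparison, the two approaches discussed in this thread can be sketched side by side. `send_bytes` is a hypothetical sink standing in for the socket send path (it only appends to a string), and both wrapper names are illustrative. The sketch says nothing about the pybind11 constraint mentioned above, which may still force a copy at the Python boundary.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical sink standing in for the socket send path; like ::send it
// takes a const char * and a byte count. Here it just appends to `wire`.
std::size_t send_bytes(const char *bytes, std::size_t n, std::string &wire) {
    if (n == 0) {
        return 0;
    }
    wire.append(bytes, n);
    return n;
}

// Copying variant: builds a vector<uint8_t> first, costing one allocation
// and an O(n) element-wise copy before anything is sent.
std::size_t send_with_copy(const std::vector<char> &data, std::string &wire) {
    std::vector<std::uint8_t> dataSet(data.begin(), data.end());
    return send_bytes(reinterpret_cast<const char *>(dataSet.data()),
                      dataSet.size(), wire);
}

// Zero-copy variant: the vector's storage is already a char buffer, so it
// can be handed to the sink directly.
std::size_t send_without_copy(const std::vector<char> &data, std::string &wire) {
    return send_bytes(data.data(), data.size(), wire);
}
```

Both variants put the same bytes on the wire; the second simply skips the intermediate buffer.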
LGTM
What do these changes do?
Related issue number
Related #22
Check code requirements