Hi team,
I successfully compiled gloo on macOS with `USE_LIBUV` set to `ON`, but when I test the `reduce_scatter` op, the process crashes with a core dump at runtime.
I use pybind11 to expose a Python interface. Here is the test code:
    import multiprocessing as mp
    import os
    import shutil
    import time

    import numpy as np

    # fileStore_path and system_name are defined elsewhere in the test module (not shown).


    def worker_reduce_scatter(rank):
        from .. import xoscar_pygloo as xp

        # Rank 0 recreates the rendezvous directory; the other ranks wait briefly.
        if rank == 0:
            if os.path.exists(fileStore_path):
                shutil.rmtree(fileStore_path)
            os.makedirs(fileStore_path)
        else:
            time.sleep(0.5)

        context = xp.rendezvous.Context(rank, 3)

        # TCP transport on Linux, libuv transport everywhere else (including macOS).
        if system_name == "Linux":
            attr = xp.transport.tcp.attr("localhost")
            dev = xp.transport.tcp.CreateDevice(attr)
        else:
            attr = xp.transport.uv.attr("localhost")
            dev = xp.transport.uv.CreateDevice(attr)

        fileStore = xp.rendezvous.FileStore(fileStore_path)
        store = xp.rendezvous.PrefixStore(str(3), fileStore)
        context.connectFullMesh(store, dev)

        # Six float32 elements: [1, 2, 3, 4, 5, 6].
        sendbuf = np.array(
            [i + 1 for i in range(sum([j + 1 for j in range(3)]))], dtype=np.float32
        )
        print(f"Send buf: {sendbuf}")
        sendptr = sendbuf.ctypes.data

        # Each of the three ranks receives two elements.
        recvbuf = np.zeros(2, dtype=np.float32)
        recvptr = recvbuf.ctypes.data
        recvElems = [2, 2, 2]

        data_size = (
            sendbuf.size if isinstance(sendbuf, np.ndarray) else sendbuf.numpy().size
        )
        print(f"Data size: {data_size}")
        datatype = xp.glooDataType_t.glooFloat32
        op = xp.ReduceOp.SUM

        xp.reduce_scatter(context, sendptr, recvptr, data_size, recvElems, datatype, op)

        print(f"rank {rank} sends {sendbuf}, receives {recvbuf}")


    def test_reduce_scatter():
        process1 = mp.Process(target=worker_reduce_scatter, args=(0,))
        process1.start()
        process2 = mp.Process(target=worker_reduce_scatter, args=(1,))
        process2.start()
        process3 = mp.Process(target=worker_reduce_scatter, args=(2,))
        process3.start()

        process1.join()
        process2.join()
        process3.join()
This test fails on macOS but works on Linux.
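For reference, the result a passing run should print can be sketched in plain Python, without gloo. This is only an illustration: the helper name `reference_reduce_scatter` is hypothetical, and it assumes the usual reduce-scatter semantics (element-wise sum across ranks, then each rank receives its slice according to `recvElems`). It does not reproduce or explain the crash.

```python
def reference_reduce_scatter(send_bufs, recv_elems):
    """Sum all ranks' buffers element-wise, then split the result by recv_elems.

    Hypothetical helper for illustration only; it is not part of gloo or
    xoscar_pygloo.
    """
    if sum(recv_elems) != len(send_bufs[0]):
        raise ValueError("recv_elems must sum to the send buffer length")
    # Element-wise SUM across ranks.
    reduced = [sum(values) for values in zip(*send_bufs)]
    # Scatter: rank r gets the r-th consecutive slice of the reduced buffer.
    chunks, offset = [], 0
    for count in recv_elems:
        chunks.append(reduced[offset:offset + count])
        offset += count
    return chunks


world_size = 3
# Every rank sends the same buffer as in the test: [1, 2, 3, 4, 5, 6].
send_bufs = [[float(i + 1) for i in range(6)] for _ in range(world_size)]
expected = reference_reduce_scatter(send_bufs, [2, 2, 2])
for rank, chunk in enumerate(expected):
    print(f"rank {rank} should receive {chunk}")
```

Under these assumptions, rank 0 should receive `[3.0, 6.0]`, rank 1 `[9.0, 12.0]`, and rank 2 `[15.0, 18.0]`.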
Could you help me understand why this happens? Thank you very much.