Fuse packing/unpacking kernels for reshape3d_alltoall #24

mabraham · 2023-05-12T11:58:53Z

Currently, reshape3d_alltoall for N ranks runs N packing kernels before the MPI_Alltoall and N unpacking kernels after it. As the rank count grows, the overhead of launching and waiting on those kernels grows linearly with N. In sufficiently regular cases, the loop over ranks in heffte::reshape3d_alltoall::apply_base() can be lowered into the device kernel. I have working SYCL code that does this and shows a clear performance improvement even for small N. Is this an optimization you'd consider incorporating if I contribute it?


mkstoyanov · 2023-05-12T13:49:52Z

As a general rule, anything that improves performance is worth considering and should probably be included. If you want, you can point me to the prototype before you bother making a formal PR, or you can even just give me the kernel so I can handle the integration with the rest of the code and the other backends.

On the other hand, I don't recommend running so many nodes with so little data-per-node that the kernel launch will cause issues, but then again, it is a valid use case.

If you merge the for-loop into the kernel, then each iteration of the loop will manage a different amount of data, which in itself can lead to performance issues. This is precisely why I didn't do it using CUDA, and calling one packing kernel at a time makes it easier to pipeline packing and sending. I can certainly see how the SYCL logic will be easier to generalize (hopefully without loss of performance), and all-to-all doesn't pipeline, so we could get a performance boost here.
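
To make the trade-off concrete, here is a minimal CPU-only sketch of the two packing strategies discussed in this thread. It is not heFFTe code and not the SYCL prototype: `Box`, `pack_per_rank`, and `pack_fused` are hypothetical names, the array is assumed to be stored with x fastest, and a plain loop stands in for a device kernel launch. The fused version maps one flat index space onto all destination ranks using a prefix sum of the per-rank element counts, so ranks with uneven box sizes simply own uneven slices of that index space.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the sub-box of the local array sent to one rank.
struct Box {
    int lo[3];    // low corner (x, y, z), inclusive
    int size[3];  // extent in x, y, z
    std::size_t count() const {
        return std::size_t(size[0]) * size[1] * size[2];
    }
};

// Baseline: one "kernel launch" (here, one loop nest) per destination rank.
std::vector<double> pack_per_rank(const std::vector<double>& data, int nx, int ny,
                                  const std::vector<Box>& boxes) {
    std::vector<double> buffer;
    for (const Box& b : boxes)  // N launches for N ranks
        for (int k = 0; k < b.size[2]; ++k)
            for (int j = 0; j < b.size[1]; ++j)
                for (int i = 0; i < b.size[0]; ++i)
                    buffer.push_back(
                        data[(std::size_t(b.lo[2] + k) * ny + (b.lo[1] + j)) * nx + (b.lo[0] + i)]);
    return buffer;
}

// Fused: a single flat index space covers the boxes of all ranks.
std::vector<double> pack_fused(const std::vector<double>& data, int nx, int ny,
                               const std::vector<Box>& boxes) {
    // Exclusive prefix sum: offset[r] is where rank r's data starts in the buffer.
    std::vector<std::size_t> offset(boxes.size() + 1, 0);
    for (std::size_t r = 0; r < boxes.size(); ++r)
        offset[r + 1] = offset[r] + boxes[r].count();
    assert(offset.back() <= data.size());

    std::vector<double> buffer(offset.back());
    // Each value of g plays the role of one work-item in a single launch.
    for (std::size_t g = 0; g < buffer.size(); ++g) {
        // Find the rank r with offset[r] <= g < offset[r + 1] (skips empty boxes).
        std::size_t r = std::upper_bound(offset.begin(), offset.end(), g) - offset.begin() - 1;
        const Box& b = boxes[r];
        std::size_t local = g - offset[r];
        int i = int(local % b.size[0]);
        int j = int((local / b.size[0]) % b.size[1]);
        int k = int(local / (std::size_t(b.size[0]) * b.size[1]));
        buffer[g] = data[(std::size_t(b.lo[2] + k) * ny + (b.lo[1] + j)) * nx + (b.lo[0] + i)];
    }
    return buffer;
}
```

Both functions should produce the same send buffer; the difference is that the fused form needs one launch regardless of N, at the cost of a per-item rank lookup and of work-items in the same launch touching boxes of different shapes. Whether that cost matters on a real device would have to be measured.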
