
MPICH + CUDA #10

Open
mkstoyanov opened this issue Mar 13, 2023 · 4 comments

@mkstoyanov
Collaborator

Some tests, e.g., long long, fail when using MPICH and CUDA-aware MPI.

mkstoyanov self-assigned this Mar 13, 2023
@mkstoyanov
Collaborator Author

Some issues were resolved in #11, but alltoall (no-v) still fails when using empty boxes.

The test is disabled, since it is a fringe use case (a subcommunicator implies few ranks, so point-to-point should work better).

Tests should now pass under MPICH + CUDA-aware MPI, but the alltoall behavior needs further investigation.
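
For context, the point-to-point alternative mentioned above could look roughly like the sketch below (hypothetical names, not the library's actual code): ranks whose boxes are empty are simply skipped instead of being padded.

```cpp
// Sketch: exchange boxes over a small sub-communicator with point-to-point
// messages, skipping ranks whose boxes are empty (counts == 0).
#include <mpi.h>
#include <vector>

void exchange_p2p(MPI_Comm subcomm,
                  const std::vector<double>& send, const std::vector<int>& send_counts,
                  const std::vector<int>& send_offsets,
                  std::vector<double>& recv, const std::vector<int>& recv_counts,
                  const std::vector<int>& recv_offsets) {
    int nranks;
    MPI_Comm_size(subcomm, &nranks);
    std::vector<MPI_Request> requests;
    requests.reserve(2 * static_cast<size_t>(nranks));
    for (int r = 0; r < nranks; r++)
        if (recv_counts[r] > 0) {  // post receives only for non-empty boxes
            requests.emplace_back();
            MPI_Irecv(recv.data() + recv_offsets[r], recv_counts[r], MPI_DOUBLE,
                      r, 0, subcomm, &requests.back());
        }
    for (int r = 0; r < nranks; r++)
        if (send_counts[r] > 0) {  // empty boxes send nothing at all
            requests.emplace_back();
            MPI_Isend(send.data() + send_offsets[r], send_counts[r], MPI_DOUBLE,
                      r, 0, subcomm, &requests.back());
        }
    MPI_Waitall(static_cast<int>(requests.size()), requests.data(), MPI_STATUSES_IGNORE);
}
```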

@ax3l
Contributor

ax3l commented Aug 21, 2024

Thanks for testing this. We (WarpX & ImpactX) use GPU-aware MPI heavily on DOE Exascale machines, which are currently all HPE/Cray and thus MPICH. With the current releases, is there anything we should look out for?

We do R2C forward and C2R backward FFTs in 1D to 3D.

@mkstoyanov
Collaborator Author

This should not affect you. The problem happens when we use alltoall (no-v), which means we pad the MPI messages to the same size. There appears to be an MPI-specific issue when the boxes are empty and we only send padding (i.e., we push around fake data). I doubt it will affect you, and it may be a non-issue on newer installations of MPICH; we found this in the version installed from apt on Ubuntu 22.04.
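
To illustrate the pattern (a sketch, not the library's actual code): with alltoall (no-v) every pair of ranks exchanges exactly the same padded count, so a rank whose box for some destination is empty still ships a full chunk of placeholder values that the receiver ignores.

```cpp
// Sketch of the padded alltoall: every pair of ranks exchanges exactly
// pad_count doubles; an empty box (boxes[r].empty()) still contributes
// pad_count placeholder values.
#include <mpi.h>
#include <algorithm>
#include <vector>

void exchange_padded(MPI_Comm comm, const std::vector<std::vector<double>>& boxes,
                     int pad_count, std::vector<double>& recv) {
    int nranks;
    MPI_Comm_size(comm, &nranks);
    std::vector<double> send(static_cast<size_t>(nranks) * pad_count, 0.0);
    for (int r = 0; r < nranks; r++)  // real data (possibly none) goes at the front of each slot
        std::copy(boxes[r].begin(), boxes[r].end(),
                  send.begin() + static_cast<std::ptrdiff_t>(r) * pad_count);
    recv.resize(static_cast<size_t>(nranks) * pad_count);
    MPI_Alltoall(send.data(), pad_count, MPI_DOUBLE,
                 recv.data(), pad_count, MPI_DOUBLE, comm);
}
```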

Other than that, check the Cray documentation about GPU-aware MPI. ROCm machines require special environment variables and compiler flags to enable it, sometimes at both compile time and runtime.
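
For example (an assumption about the typical Cray MPICH setup, not something specific to this issue), the runtime side is usually controlled by the MPICH_GPU_SUPPORT_ENABLED environment variable, which an application can sanity-check before handing GPU buffers to MPI:

```cpp
// Sketch: warn at startup if Cray MPICH's GPU support was not requested at
// runtime (the compile-time side is typically handled by the craype-accel-*
// modules and linking the GTL library).
#include <mpi.h>
#include <cstdio>
#include <cstdlib>
#include <cstring>

static bool gpu_aware_mpi_requested() {
    const char* flag = std::getenv("MPICH_GPU_SUPPORT_ENABLED");
    return flag != nullptr && std::strcmp(flag, "1") == 0;
}

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    if (rank == 0 && !gpu_aware_mpi_requested())
        std::fprintf(stderr, "warning: MPICH_GPU_SUPPORT_ENABLED=1 is not set; "
                             "passing GPU buffers to MPI may crash\n");
    // ... set up and run the FFTs ...
    MPI_Finalize();
    return 0;
}
```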

@ax3l
Contributor

ax3l commented Aug 23, 2024

Thank you for the summary!

> Other than that, check the Cray documentation about GPU-aware MPI. ROCm machines require special environment variables and compiler flags to enable it, sometimes at both compile time and runtime.

Yes, that's correct. For Cray/HPE machines, we control/request it at compile time so we can activate it at runtime.
