Calling regridder causes mpirun to fail #305
-
I'm writing a pipeline that needs to do some 'small' regridding tasks, and one 'big' one. I call However, when running the xesmf regridder, subsequent
Replies: 4 comments
-
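This is happening because the ESMF being used from xesmf is configured with MPI support. When the regridder is called, MPI is initialised within the context of the Python process. OpenMPI doesn't support recursively running MPI, so the subsequent mpirun aborts immediately (related: open-mpi/ompi#9729). I think the RegridWeightGen step needs to be performed either before, or external to, the Python script which performs the regridding (see https://xesmf.readthedocs.io/en/latest/large_problems_on_HPC.html for suggestions).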
-
Thanks @angus-g! Yeah, that's a good fix for now. I wonder if there's a way to get your kernel to purge MPI after a regridder call? Calling orte-clean doesn't seem to do anything, and you can't kill orted without killing the whole kernel. Or is it possible to run xesmf without mpirun at all for the smaller tasks?
-
Hi, unless there's a proposal for a specific change to xESMF, I'd close this. Thoughts?
-
I think it's probably fine to close this. At most a note about why this occurs could go somewhere, but maybe people will stumble on this thread anyway!
This is happening because the ESMF being used from xesmf is configured with MPI support. When the regridder is called, MPI is initialised within the context of the Python process. OpenMPI doesn't support recursively running MPI, so it aborts immediately (related: open-mpi/ompi#9729).
I think the RegridWeightGen step needs to be performed either before, or external to, the Python script which performs the regridding (see https://xesmf.readthedocs.io/en/latest/large_problems_on_HPC.html for suggestions).
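
One possible way to keep the regridding "external to the Python script" is to run each small xesmf task in a child interpreter, so that MPI is initialised in the child and the parent process stays free to launch mpirun later. The following is a minimal sketch under that assumption, not something confirmed in this thread: `run_isolated`, `SMALL_REGRID`, and the file paths are hypothetical names, and whether this fully avoids the abort may depend on the OpenMPI and ESMF builds in use.

```python
import subprocess
import sys
import textwrap


def run_isolated(source: str, *args: str) -> subprocess.CompletedProcess:
    """Run Python source in a fresh interpreter.

    Any MPI state initialised by the code (e.g. via ESMF) belongs to the
    child process and disappears when it exits. Raises CalledProcessError
    if the child exits with a non-zero status.
    """
    return subprocess.run(
        [sys.executable, "-c", textwrap.dedent(source), *args],
        capture_output=True,
        text=True,
        check=True,
    )


# Hypothetical small regridding task, kept as a string so that xesmf is
# only ever imported inside the child interpreter, never in the parent.
SMALL_REGRID = """
    import sys
    import xarray as xr
    import xesmf as xe

    src_path, dst_path, out_path = sys.argv[1:4]
    ds_in = xr.open_dataset(src_path)
    ds_out = xr.open_dataset(dst_path)

    # Building the regridder is the step that triggers ESMF (and so MPI).
    regridder = xe.Regridder(ds_in, ds_out, "bilinear")
    regridder(ds_in).to_netcdf(out_path)
"""

# Example call (paths are placeholders):
# run_isolated(SMALL_REGRID, "source.nc", "target_grid.nc", "regridded.nc")
```

The parent script only passes file paths to the child and reads the result back from disk, so it never imports xesmf itself. The "big" task can then be launched with mpirun from the same parent as usual.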