Replies: 9 comments
-
@rokuingh can you please take a look?
-
I just tried to reproduce this issue, but was unable to install one of the dependencies, pooch, due to a number of sub-dependency conflicts. The rc code 545 generally indicates unmapped points. However, the log file has a message about a VM not being available, which is a much more serious issue that is most likely due to something outside of ESMPy.
-
@rokuingh - if you have access to a conda (or mamba) installation, this command will get you an exact replica of the environment this test was made in:
-
@jhamman We are happy to provide support for ESMPy, but we don't currently have the resources (or expertise) to support the additional layers within xESMF. Could you please provide an ESMPy-only reproducer that exhibits the issue? We can definitely work from that.
-
@rsdunlapiv and @rokuingh - here's a reproducer that does not use xesmf or xarray:

```python
import ESMF
import numpy as np
import dask

@dask.delayed
def make_grid(shape):
    g = ESMF.Grid(
        np.array(shape),
        staggerloc=ESMF.StaggerLoc.CENTER,
        coord_sys=ESMF.CoordSys.SPH_DEG,
        num_peri_dims=None,  # without this, ESMF seems to seg fault (clue?)
    )
    return g

tasks = [make_grid((59, 87)), make_grid((60, 88))]

# this works
dask.compute(tasks, scheduler='single-threaded')

# this fails
dask.compute(tasks, scheduler='threads')
```
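If the crash under the threaded scheduler really is a thread-safety issue inside ESMF, one workaround worth trying (an assumption on my part, not a confirmed fix) is to guard every call into the library with a single process-wide lock, so dask's thread pool never enters ESMF concurrently. A minimal sketch of the pattern, with a plain-Python stand-in for the `ESMF.Grid` call so it runs without ESMF installed:

```python
import threading

ESMF_LOCK = threading.Lock()  # hypothetical process-wide guard around ESMF calls

_in_flight = 0     # how many threads are currently "inside the library"
max_in_flight = 0  # the peak value of _in_flight we ever observe

def make_grid(shape):
    """Stand-in for the ESMF.Grid(...) call in the reproducer above."""
    global _in_flight, max_in_flight
    with ESMF_LOCK:  # only one thread may be inside library code at a time
        _in_flight += 1
        max_in_flight = max(max_in_flight, _in_flight)
        # ... the real ESMF.Grid(np.array(shape), ...) call would run here ...
        _in_flight -= 1
    return shape

threads = [threading.Thread(target=make_grid, args=((59, 87),)) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(max_in_flight)  # 1: the lock keeps at most one call in flight at once
```

This trades away any concurrency inside ESMF, of course, but if it makes the threaded scheduler stop crashing it would confirm the library-level thread-safety hypothesis.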
-
@jhamman one question is whether we would even expect the second one to work. I am not sure of the semantics of Dask 'single-threaded' versus 'threads'. In general, ESMF is going to be multi-process, but not multi-threaded. So Dask would need to give ESMF/ESMPy multiple MPI processes in order to run in parallel. Can this be done through Dask?
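For what it's worth, the semantics of the two schedulers can be checked directly: 'single-threaded' runs every task in the calling thread, while 'threads' dispatches tasks to a thread pool. A small sketch (assuming dask is installed; no ESMF involved):

```python
import threading
import dask

@dask.delayed
def which_thread():
    # record the ident of the thread that actually executes the task
    return threading.get_ident()

main_ident = threading.get_ident()
tasks = [which_thread(), which_thread()]

# 'single-threaded': tasks execute in the calling (main) thread
(sync_idents,) = dask.compute(tasks, scheduler="single-threaded")

# 'threads': tasks execute in worker threads from a pool
(pool_idents,) = dask.compute(tasks, scheduler="threads")

print(all(i == main_ident for i in sync_idents))  # True
print(any(i != main_ident for i in pool_idents))  # True
```

So under 'threads' multiple tasks can be inside ESMPy's C layer at the same time, which a multi-process-but-single-threaded library would not expect.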
-
@rsdunlapiv - I'm not sure if this should work or not. If it's not supposed to work, it may be nice to put a thread lock in place, or alternatively, raise a more informative error. Also, I think it would be good to restate my intended parallel behavior here. I want to generate regridding weights for two datasets in parallel. I do not want ESMF to do anything in parallel (or with MPI). I will say that my first example (most important) does work if called with the ...

```
/srv/conda/envs/notebook/lib/python3.9/site-packages/cloudpickle/cloudpickle_fast.py in dump()
    600     def dump(self, obj):
    601         try:
--> 602             return Pickler.dump(self, obj)
    603         except RuntimeError as e:
    604             if "recursion" in e.args[0]:

ValueError: ctypes objects containing pointers cannot be pickled
```
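The ValueError at the bottom of that traceback is reproducible without ESMF at all: cloudpickle falls back to the standard pickle machinery, and CPython's ctypes refuses to pickle any object containing a pointer. A self-contained sketch (the string buffer here is just a hypothetical stand-in for the handles ESMPy keeps on its underlying C/Fortran objects):

```python
import ctypes
import pickle

# A ctypes object that contains a pointer, standing in for the kind of
# handle ESMPy wraps around its native grid objects.
buf = ctypes.create_string_buffer(b"grid data")
handle = ctypes.pointer(buf)

try:
    pickle.dumps(handle)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)  # "ctypes objects containing pointers cannot be pickled"
```

This suggests the failure mode in any scheduler that has to serialize task results (like dask.distributed): returning an `ESMF.Grid` from a delayed function means shipping raw C pointers between processes, which cannot work; only picklable derived data (e.g. the computed weights as NumPy arrays) can cross that boundary.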
-
In case others are having this same problem and find this thread as I did: I was getting this same error and log message when using the distributed scheduler:

```python
from dask.distributed import Client
client = Client()
```
-
@rsdunlapiv was there any resolution to this? I'm finding that xESMF still fails even when applied to embarrassingly parallel problems, and given one thread per dask worker. It really prevents using the package at scale. I also noticed this weird failure mode where submitting N regridding tasks (via ...). See, for example, the result of submitting 4 identical regridding tasks to a dask ...
-
What happened:
When using xesmf inside a parallel framework, an opaque error is raised. I've observed this behavior using dask's threaded and distributed schedulers.
What you expected to happen:
I expected to be able to use xesmf within multiple processes. Or, if this is not possible, a descriptive error and/or documentation on the subject.
Minimal Complete Verifiable Example:
This simple example is just a slightly modified version of the basic example from the xesmf docs.
The traceback is here:
The `ESMF_LogFile` includes the following lines:

Anything else we need to know?:
xref: JiaweiZhuang/xESMF#88
Environment:
Output of xr.show_versions() + xesmf + esmf
INSTALLED VERSIONS
commit: None
python: 3.9.7 | packaged by conda-forge | (default, Sep 29 2021, 19:20:46)
[GCC 9.4.0]
python-bits: 64
OS: Linux
OS-release: 5.4.0-1062-azure
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: C.UTF-8
LANG: C.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.12.1
libnetcdf: 4.8.1
xarray: 0.20.1
pandas: 1.3.4
numpy: 1.21.4
scipy: 1.7.3
netCDF4: 1.5.8
pydap: installed
h5netcdf: 0.11.0
h5py: 3.6.0
Nio: None
zarr: 2.10.3
cftime: 1.5.1.1
nc_time_axis: 1.4.0
PseudoNetCDF: None
rasterio: 1.2.10
cfgrib: None
iris: None
bottleneck: 1.3.2
dask: 2021.10.0
distributed: 2021.10.0
matplotlib: 3.5.0
cartopy: 0.20.1
seaborn: 0.11.2
numbagg: None
fsspec: 2021.11.1
cupy: None
pint: None
sparse: 0.13.0
setuptools: 59.4.0
pip: 21.3.1
conda: None
pytest: 6.2.5
IPython: 7.30.1
sphinx: None
xesmf: 0.6.2
ESMF: 8.2.0
cc @rokuingh, @norlandrhagen, @theurich