Ambiguous behavior with coordinates when appending to Zarr store with append_dim #8427
Labels: bug, design question, needs discussion
Comments:
- +1
- We could have
- That sounds good Deepak! Feel free to pick up #8428 and extend it with those options.
- #8428 now allow We could transition the default to
What happened?
There are two quite different scenarios covered by "append" with Zarr: adding new variables to an existing store (no append_dim) and extending existing arrays along a dimension (with append_dim). This issue is about what should happen when using append_dim with variables that do not contain append_dim. Here's the current behavior.
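A minimal sketch of the scenario (the store path test.zarr and the names foo, x, time are illustrative, not taken from the original report):

```python
import numpy as np
import xarray as xr

# Dataset with an append dimension "time" and a coordinate "x"
# that does NOT contain the append dimension.
ds = xr.Dataset(
    {"foo": (("time", "x"), np.arange(6.0).reshape(2, 3))},
    coords={"time": [0, 1], "x": [10, 20, 30]},
)

# Initial write.
ds.to_zarr("test.zarr", mode="w")

# Append along "time". The coordinate "x" is unchanged,
# yet it is written to the store again on every append.
ds.assign_coords(time=[2, 3]).to_zarr("test.zarr", append_dim="time")
```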
Currently, we always write all data variables in this scenario. That includes overwriting the coordinates every time we append. That makes appending more expensive than it needs to be. I don't think that is the behavior most users want or expect.
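One way to see the extra I/O, continuing from the sketch above and assuming a Zarr v2 directory store (so the single chunk of "x" sits at test.zarr/x/0; that path is an assumption about the store layout, not something stated in the report):

```python
import os
import time

chunk_path = "test.zarr/x/0"  # assumed location of the single chunk of "x"
before = os.path.getmtime(chunk_path)

time.sleep(1)
ds.assign_coords(time=[4, 5]).to_zarr("test.zarr", append_dim="time")

# Per the current behavior described above, the coordinate chunk is
# rewritten even though "x" itself did not change.
print(os.path.getmtime(chunk_path) > before)
```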
What did you expect to happen?
There are a couple of different options we could consider for how to handle this "extending" situation (with append_dim):
1. For coordinates and data variables that do not contain append_dim:
   a. [current behavior] Overwrite coordinates with new data
   b. Keep original coordinates
   c. Force the user to explicitly drop the coordinates, as we do for region operations (sketched below)
2. Additionally, we could check the non-append_dim coordinates for consistency:
   a. Fail if coordinates don't match
   b. Extend the arrays to replicate the behavior of concat
We currently do 1a. I propose to switch to 1b. I think it is closer to what users want, and it requires less I/O.
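For comparison, option 1c would look roughly like the existing region workflow from the user's side. A sketch, continuing from the example above; whether to_zarr currently accepts an append_dim write from a dataset that omits "x" is an assumption here, not something this report confirms:

```python
# Option 1c (sketch): the user explicitly drops coordinates that do not
# contain the append dimension, so only "foo" and "time" are written.
ds.assign_coords(time=[6, 7]).drop_vars("x").to_zarr(
    "test.zarr", append_dim="time"
)
```

Under the proposed default (1b), this explicit drop would be unnecessary: coordinates without the append dimension would simply be left untouched.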
Anything else we need to know?
No response
Environment
INSTALLED VERSIONS
commit: None
python: 3.11.6 | packaged by conda-forge | (main, Oct 3 2023, 10:40:35) [GCC 12.3.0]
python-bits: 64
OS: Linux
OS-release: 5.10.176-157.645.amzn2.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: C.UTF-8
LANG: C.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.14.2
libnetcdf: 4.9.2
xarray: 2023.10.1
pandas: 2.1.2
numpy: 1.24.4
scipy: 1.11.3
netCDF4: 1.6.5
pydap: installed
h5netcdf: 1.2.0
h5py: 3.10.0
Nio: None
zarr: 2.16.0
cftime: 1.6.2
nc_time_axis: 1.4.1
PseudoNetCDF: None
iris: None
bottleneck: 1.3.7
dask: 2023.10.1
distributed: 2023.10.1
matplotlib: 3.8.0
cartopy: 0.22.0
seaborn: 0.13.0
numbagg: 0.6.0
fsspec: 2023.10.0
cupy: None
pint: 0.22
sparse: 0.14.0
flox: 0.8.1
numpy_groupies: 0.10.2
setuptools: 68.2.2
pip: 23.3.1
conda: None
pytest: 7.4.3
mypy: None
IPython: 8.16.1
sphinx: None