Merge branch 'main' into namedarray_chunkmanager
Illviljan committed Jul 11, 2024
2 parents 6530440 + a69815f commit 2f1a6ca
Showing 16 changed files with 70 additions and 75 deletions.
5 changes: 5 additions & 0 deletions .github/release.yml
@@ -0,0 +1,5 @@
changelog:
exclude:
authors:
- dependabot
- pre-commit-ci
5 changes: 2 additions & 3 deletions ci/install-upstream-wheels.sh
@@ -13,7 +13,7 @@ $conda remove -y numba numbagg sparse
# temporarily remove numexpr
$conda remove -y numexpr
# temporarily remove backends
$conda remove -y cf_units hdf5 h5py netcdf4 pydap
$conda remove -y pydap
# forcibly remove packages to avoid artifacts
$conda remove -y --force \
numpy \
@@ -37,8 +37,7 @@ python -m pip install \
numpy \
scipy \
matplotlib \
pandas \
h5py
pandas
# for some reason pandas depends on pyarrow already.
# Remove once a `pyarrow` version compiled with `numpy>=2.0` is on `conda-forge`
python -m pip install \
2 changes: 1 addition & 1 deletion ci/requirements/all-but-dask.yml
@@ -22,7 +22,7 @@ dependencies:
- netcdf4
- numba
- numbagg
- numpy<2
- numpy
- packaging
- pandas
- pint>=0.22
2 changes: 1 addition & 1 deletion ci/requirements/doc.yml
@@ -21,7 +21,7 @@ dependencies:
- nbsphinx
- netcdf4>=1.5
- numba
- numpy>=1.21,<2
- numpy>=2
- packaging>=21.3
- pandas>=1.4,!=2.1.0
- pooch
2 changes: 1 addition & 1 deletion ci/requirements/environment-windows.yml
@@ -23,7 +23,7 @@ dependencies:
- netcdf4
- numba
- numbagg
- numpy<2
- numpy
- packaging
- pandas
# - pint>=0.22
2 changes: 1 addition & 1 deletion ci/requirements/environment.yml
@@ -26,7 +26,7 @@ dependencies:
- numba
- numbagg
- numexpr
- numpy<2
- numpy
- opt_einsum
- packaging
- pandas
4 changes: 2 additions & 2 deletions doc/getting-started-guide/faq.rst
@@ -352,9 +352,9 @@ Some packages may have additional functionality beyond what is shown here. You c
How does xarray handle missing values?
--------------------------------------

**xarray can handle missing values using ``np.NaN``**
**xarray can handle missing values using ``np.nan``**

- ``np.NaN`` is used to represent missing values in labeled arrays and datasets. It is a commonly used standard for representing missing or undefined numerical data in scientific computing. ``np.NaN`` is a constant value in NumPy that represents "Not a Number" or missing values.
- ``np.nan`` is used to represent missing values in labeled arrays and datasets. It is a commonly used standard for representing missing or undefined numerical data in scientific computing. ``np.nan`` is a constant value in NumPy that represents "Not a Number" or missing values.

- Most of xarray's computation methods are designed to automatically handle missing values appropriately.
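
Reviewer note: the FAQ text above describes ``np.nan`` as the missing-value marker. The sketch below illustrates that behavior with plain NumPy only (no xarray objects); the sample values are made up for illustration.

```python
import numpy as np

# Hypothetical sample values; np.nan marks the missing entry.
values = np.array([1.0, np.nan, 3.0])

# NaN never compares equal to itself, so use np.isnan to locate missing entries.
missing = np.isnan(values)

# NaN-aware reductions skip the missing entry instead of propagating it.
total = np.nansum(values)
```

The lowercase spelling ``np.nan`` is available in both NumPy 1.x and 2.x, which is presumably why the docs switch to it here.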

4 changes: 2 additions & 2 deletions doc/user-guide/computation.rst
@@ -426,7 +426,7 @@ However, the functions also take missing values in the data into account:

.. ipython:: python
data = xr.DataArray([np.NaN, 2, 4])
data = xr.DataArray([np.nan, 2, 4])
weights = xr.DataArray([8, 1, 1])
data.weighted(weights).mean()
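
Reviewer note: the docs example above says weighted reductions take missing values into account. A minimal NumPy-only sketch of that idea follows. It is an assumption about the general technique (zeroing the weight of missing entries), not a copy of xarray's implementation, and the variable names are illustrative.

```python
import numpy as np

data = np.array([np.nan, 2.0, 4.0])
weights = np.array([8.0, 1.0, 1.0])

# Entries that are NaN contribute neither to the numerator nor to the weights.
valid = ~np.isnan(data)
masked_weights = np.where(valid, weights, 0.0)
weighted_sum = np.sum(np.where(valid, data, 0.0) * masked_weights)

# Normalize by the sum of the weights that belong to valid entries only.
weighted_mean = weighted_sum / masked_weights.sum()
```

With these inputs the large weight attached to the missing entry is ignored, so the result is the mean of the two remaining values.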
@@ -444,7 +444,7 @@ If the weights add up to to 0, ``sum`` returns 0:
data.weighted(weights).sum()
and ``mean``, ``std`` and ``var`` return ``NaN``:
and ``mean``, ``std`` and ``var`` return ``nan``:

.. ipython:: python
4 changes: 2 additions & 2 deletions doc/user-guide/interpolation.rst
@@ -292,8 +292,8 @@ Let's see how :py:meth:`~xarray.DataArray.interp` works on real data.
axes[0].set_title("Raw data")
# Interpolated data
new_lon = np.linspace(ds.lon[0], ds.lon[-1], ds.sizes["lon"] * 4)
new_lat = np.linspace(ds.lat[0], ds.lat[-1], ds.sizes["lat"] * 4)
new_lon = np.linspace(ds.lon[0].item(), ds.lon[-1].item(), ds.sizes["lon"] * 4)
new_lat = np.linspace(ds.lat[0].item(), ds.lat[-1].item(), ds.sizes["lat"] * 4)
dsi = ds.interp(lat=new_lat, lon=new_lon)
dsi.air.plot(ax=axes[1])
@savefig interpolation_sample3.png width=8in
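
Reviewer note: the change above passes ``.item()`` values to ``np.linspace`` instead of zero-dimensional array objects. The sketch below shows the same pattern with a plain NumPy array standing in for ``ds.lon``; the coordinate values are hypothetical.

```python
import numpy as np

# Hypothetical longitude coordinate standing in for ds.lon.
lon = np.array([200.0, 202.5, 205.0, 207.5])

# .item() converts the first and last elements to plain Python floats.
start = lon[0].item()
stop = lon[-1].item()

# Four times as many points as the original coordinate, endpoints included.
new_lon = np.linspace(start, stop, lon.size * 4)
```

Passing Python scalars keeps the call independent of how a given NumPy version treats array-like endpoints.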
5 changes: 3 additions & 2 deletions doc/user-guide/testing.rst
@@ -239,9 +239,10 @@ If the array type you want to generate has an array API-compliant top-level name
you can use this neat trick:

.. ipython:: python
:okwarning:
from numpy import array_api as xp # available in numpy 1.26.0
import numpy as xp # compatible in numpy 2.0
# use `import numpy.array_api as xp` in numpy>=1.23,<2.0
from hypothesis.extra.array_api import make_strategies_namespace
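
Reviewer note: the comment in the change above mentions two import paths depending on the NumPy version. One way to pick between them at runtime is sketched below. The version check and the warning filter are assumptions added for illustration and are not part of the commit; the fallback branch assumes ``numpy>=1.23,<2.0`` as stated in the diff comment.

```python
import warnings

import numpy as np

# Major version of the installed NumPy, e.g. 1 or 2.
numpy_major = int(np.__version__.split(".")[0])

if numpy_major >= 2:
    # The main namespace is used directly, as in the updated docs.
    import numpy as xp
else:
    # Older releases ship the experimental numpy.array_api module,
    # which may emit a warning on import.
    with warnings.catch_warnings():
        warnings.simplefilter("ignore")
        import numpy.array_api as xp

x = xp.asarray([1, 2, 3])
```

Either branch leaves ``xp`` bound to a namespace that provides ``asarray``, which can then be handed to ``make_strategies_namespace``.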
4 changes: 3 additions & 1 deletion doc/whats-new.rst
@@ -45,6 +45,8 @@ Bug fixes
By `Pontus Lurcock <https://github.com/pont-us>`_.
- Allow diffing objects with array attributes on variables (:issue:`9153`, :pull:`9169`).
By `Justus Magin <https://github.com/keewis>`_.
- ``numpy>=2`` compatibility in the ``netcdf4`` backend (:pull:`9136`).
By `Justus Magin <https://github.com/keewis>`_ and `Kai Mühlbauer <https://github.com/kmuehlbauer>`_.
- Promote floating-point numeric datetimes before decoding (:issue:`9179`, :pull:`9182`).
By `Justus Magin <https://github.com/keewis>`_.
- Address regression introduced in :pull:`9002` that prevented objects returned
@@ -67,7 +69,7 @@ Documentation
- Adds a flow-chart diagram to help users navigate help resources (`Discussion #8990 <https://github.com/pydata/xarray/discussions/8990>`_).
By `Jessica Scheick <https://github.com/jessicas11>`_.
- Improvements to Zarr & chunking docs (:pull:`9139`, :pull:`9140`, :pull:`9132`)
By `Maximilian Roos <https://github.com/max-sixty>`_
By `Maximilian Roos <https://github.com/max-sixty>`_.


Internal Changes
16 changes: 10 additions & 6 deletions xarray/coding/variables.py
@@ -516,10 +516,13 @@ def encode(self, variable: Variable, name: T_Name = None) -> Variable:
dims, data, attrs, encoding = unpack_for_encoding(variable)

pop_to(encoding, attrs, "_Unsigned")
signed_dtype = np.dtype(f"i{data.dtype.itemsize}")
# we need the on-disk type here
# trying to get it from encoding, resort to an int with the same precision as data.dtype if not available
signed_dtype = np.dtype(encoding.get("dtype", f"i{data.dtype.itemsize}"))
if "_FillValue" in attrs:
new_fill = signed_dtype.type(attrs["_FillValue"])
attrs["_FillValue"] = new_fill
new_fill = np.array(attrs["_FillValue"])
# use view here to prevent OverflowError
attrs["_FillValue"] = new_fill.view(signed_dtype).item()
data = duck_array_ops.astype(duck_array_ops.around(data), signed_dtype)

return Variable(dims, data, attrs, encoding, fastpath=True)
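
Reviewer note: the new code reinterprets the fill value with ``.view(...)`` rather than constructing a scalar of the target type, and the inline comment says this is to prevent ``OverflowError``. The sketch below shows the reinterpretation on its own, outside of xarray. The variable names are illustrative; as far as I know, NumPy 2 raises for out-of-range Python integers such as ``np.int8(255)``, which is the situation the view avoids.

```python
import numpy as np

# An unsigned fill value whose bit pattern is 0xFF.
unsigned_fill = np.uint8(255)

# Reinterpret the same byte as a signed 8-bit integer: no range check involved.
signed_fill = np.array(unsigned_fill).view(np.int8).item()

# The reverse direction recovers the original unsigned value.
round_tripped = np.array(signed_fill, dtype=np.int8).view(np.uint8).item()
```

The view only works when both dtypes have the same item size, which holds in the commit because the signed and unsigned dtypes are built from the same width.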
@@ -535,10 +538,11 @@ def decode(self, variable: Variable, name: T_Name = None) -> Variable:
if unsigned == "true":
unsigned_dtype = np.dtype(f"u{data.dtype.itemsize}")
transform = partial(np.asarray, dtype=unsigned_dtype)
data = lazy_elemwise_func(data, transform, unsigned_dtype)
if "_FillValue" in attrs:
new_fill = unsigned_dtype.type(attrs["_FillValue"])
attrs["_FillValue"] = new_fill
new_fill = np.array(attrs["_FillValue"], dtype=data.dtype)
# use view here to prevent OverflowError
attrs["_FillValue"] = new_fill.view(unsigned_dtype).item()
data = lazy_elemwise_func(data, transform, unsigned_dtype)
elif data.dtype.kind == "u":
if unsigned == "false":
signed_dtype = np.dtype(f"i{data.dtype.itemsize}")
47 changes: 1 addition & 46 deletions xarray/core/computation.py
@@ -1066,7 +1066,7 @@ def apply_ufunc(
supported:
>>> magnitude(3, 4)
5.0
np.float64(5.0)
>>> magnitude(3, np.array([0, 4]))
array([3., 5.])
>>> magnitude(array, 0)
@@ -1589,15 +1589,6 @@ def cross(
array([-3, 6, -3])
Dimensions without coordinates: dim_0
Vector cross-product with 2 dimensions, returns in the perpendicular
direction:
>>> a = xr.DataArray([1, 2])
>>> b = xr.DataArray([4, 5])
>>> xr.cross(a, b, dim="dim_0")
<xarray.DataArray ()> Size: 8B
array(-3)
Vector cross-product with 3 dimensions but zeros at the last axis
yields the same results as with 2 dimensions:
@@ -1608,42 +1599,6 @@ def cross(
array([ 0, 0, -3])
Dimensions without coordinates: dim_0
One vector with dimension 2:
>>> a = xr.DataArray(
... [1, 2],
... dims=["cartesian"],
... coords=dict(cartesian=(["cartesian"], ["x", "y"])),
... )
>>> b = xr.DataArray(
... [4, 5, 6],
... dims=["cartesian"],
... coords=dict(cartesian=(["cartesian"], ["x", "y", "z"])),
... )
>>> xr.cross(a, b, dim="cartesian")
<xarray.DataArray (cartesian: 3)> Size: 24B
array([12, -6, -3])
Coordinates:
* cartesian (cartesian) <U1 12B 'x' 'y' 'z'
One vector with dimension 2 but coords in other positions:
>>> a = xr.DataArray(
... [1, 2],
... dims=["cartesian"],
... coords=dict(cartesian=(["cartesian"], ["x", "z"])),
... )
>>> b = xr.DataArray(
... [4, 5, 6],
... dims=["cartesian"],
... coords=dict(cartesian=(["cartesian"], ["x", "y", "z"])),
... )
>>> xr.cross(a, b, dim="cartesian")
<xarray.DataArray (cartesian: 3)> Size: 24B
array([-10, 2, 5])
Coordinates:
* cartesian (cartesian) <U1 12B 'x' 'y' 'z'
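
Reviewer note: this file's summary lists 46 deletions, and the doctests that used 2-element vectors appear to be among them, presumably because NumPy 2 deprecates 2-element inputs to ``np.cross``. The docstring text kept above states that padding with a zero in the last position yields the same result, which can be sketched in plain NumPy as follows. ``pad_to_3d`` is a hypothetical helper, not part of xarray.

```python
import numpy as np


def pad_to_3d(vector):
    """Hypothetical helper: append a zero z-component to a 2-element vector."""
    return np.pad(np.asarray(vector), (0, 1))


a = pad_to_3d([1, 2])
b = pad_to_3d([4, 5])

# Only the z-component is non-zero; it equals the old 2-D scalar result.
result = np.cross(a, b)
```

The last component of ``result`` matches the ``array(-3)`` shown in the 2-dimensional doctest.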
Multiple vector cross-products. Note that the direction of the
cross product vector is defined by the right-hand rule:
4 changes: 2 additions & 2 deletions xarray/plot/facetgrid.py
@@ -774,7 +774,7 @@ def _get_largest_lims(self) -> dict[str, tuple[float, float]]:
>>> ds = xr.tutorial.scatter_example_dataset(seed=42)
>>> fg = ds.plot.scatter(x="A", y="B", hue="y", row="x", col="w")
>>> round(fg._get_largest_lims()["x"][0], 3)
-0.334
np.float64(-0.334)
"""
lims_largest: dict[str, tuple[float, float]] = dict(
x=(np.inf, -np.inf), y=(np.inf, -np.inf), z=(np.inf, -np.inf)
@@ -817,7 +817,7 @@ def _set_lims(
>>> fg = ds.plot.scatter(x="A", y="B", hue="y", row="x", col="w")
>>> fg._set_lims(x=(-0.3, 0.3), y=(0, 2), z=(0, 4))
>>> fg.axs[0, 0].get_xlim(), fg.axs[0, 0].get_ylim()
((-0.3, 0.3), (0.0, 2.0))
((np.float64(-0.3), np.float64(0.3)), (np.float64(0.0), np.float64(2.0)))
"""
lims_largest = self._get_largest_lims()

6 changes: 3 additions & 3 deletions xarray/plot/utils.py
@@ -811,11 +811,11 @@ def _update_axes(
def _is_monotonic(coord, axis=0):
"""
>>> _is_monotonic(np.array([0, 1, 2]))
True
np.True_
>>> _is_monotonic(np.array([2, 1, 0]))
True
np.True_
>>> _is_monotonic(np.array([0, 2, 1]))
False
np.False_
"""
if coord.shape[axis] < 3:
return True
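
Reviewer note: the doctest updates in this file and the two before it track how NumPy 2 prints scalars (``np.True_``, ``np.float64(5.0)``). Where version-independent output is wanted, converting to built-in Python types is one option, sketched below. This is an illustration with made-up inputs and not something the commit itself does.

```python
import numpy as np

coord = np.array([0, 1, 2])

# np.all returns a NumPy boolean scalar, whose repr differs between NumPy 1 and 2.
is_increasing = np.all(np.diff(coord) > 0)

# Built-in types print the same way regardless of the NumPy version.
as_python_bool = bool(is_increasing)

magnitude = np.sqrt(np.float64(3) ** 2 + np.float64(4) ** 2)
as_python_float = float(magnitude)
```

The commit instead updates the expected doctest output, which keeps the functions' return types unchanged.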
33 changes: 31 additions & 2 deletions xarray/tests/test_backends.py
@@ -166,7 +166,7 @@ def create_encoded_masked_and_scaled_data(dtype: np.dtype) -> Dataset:

def create_unsigned_masked_scaled_data(dtype: np.dtype) -> Dataset:
encoding = {
"_FillValue": 255,
"_FillValue": np.int8(-1),
"_Unsigned": "true",
"dtype": "i1",
"add_offset": dtype.type(10),
@@ -925,6 +925,35 @@ def test_roundtrip_mask_and_scale(self, decoded_fn, encoded_fn, dtype) -> None:
assert decoded.variables[k].dtype == actual.variables[k].dtype
assert_allclose(decoded, actual, decode_bytes=False)

@pytest.mark.parametrize("fillvalue", [np.int8(-1), np.uint8(255)])
def test_roundtrip_unsigned(self, fillvalue):
# regression/numpy2 test for
encoding = {
"_FillValue": fillvalue,
"_Unsigned": "true",
"dtype": "i1",
}
x = np.array([0, 1, 127, 128, 254, np.nan], dtype=np.float32)
decoded = Dataset({"x": ("t", x, {}, encoding)})

attributes = {
"_FillValue": fillvalue,
"_Unsigned": "true",
}
# Create unsigned data corresponding to [0, 1, 127, 128, 254, 255] unsigned
sb = np.asarray([0, 1, 127, -128, -2, -1], dtype="i1")
encoded = Dataset({"x": ("t", sb, attributes)})

with self.roundtrip(decoded) as actual:
for k in decoded.variables:
assert decoded.variables[k].dtype == actual.variables[k].dtype
assert_allclose(decoded, actual, decode_bytes=False)

with self.roundtrip(decoded, open_kwargs=dict(decode_cf=False)) as actual:
for k in encoded.variables:
assert encoded.variables[k].dtype == actual.variables[k].dtype
assert_allclose(encoded, actual, decode_bytes=False)
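
Reviewer note: the new test pairs decoded values ``[0, 1, 127, 128, 254, nan]`` with signed on-disk bytes ``[0, 1, 127, -128, -2, -1]``. The sketch below reproduces that correspondence in plain NumPy. It assumes a fill value of 255 in unsigned terms and is a simplified model of the ``_Unsigned`` convention, not the xarray coder itself.

```python
import numpy as np

decoded = np.array([0, 1, 127, 128, 254, np.nan], dtype=np.float32)
unsigned_fill = 255  # assumed fill value, i.e. -1 when viewed as int8

# Encode: replace NaN by the fill value, store as uint8, reinterpret as int8.
filled = np.where(np.isnan(decoded), unsigned_fill, decoded)
on_disk = filled.astype(np.uint8).view(np.int8)

# Decode: reinterpret the bytes as uint8 and mask the fill value again.
as_unsigned = on_disk.view(np.uint8)
restored = np.where(
    as_unsigned == unsigned_fill, np.nan, as_unsigned.astype(np.float32)
)
```

Both parametrized fill values in the test, ``np.int8(-1)`` and ``np.uint8(255)``, describe the same byte, which is why the sketch needs only one of them.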

@staticmethod
def _create_cf_dataset():
original = Dataset(
@@ -4285,7 +4314,7 @@ def test_roundtrip_coordinates_with_space(self) -> None:
def test_roundtrip_numpy_datetime_data(self) -> None:
# Override method in DatasetIOBase - remove not applicable
# save_kwargs
times = pd.to_datetime(["2000-01-01", "2000-01-02", "NaT"])
times = pd.to_datetime(["2000-01-01", "2000-01-02", "NaT"], unit="ns")
expected = Dataset({"t": ("t", times), "t0": times[0]})
with self.roundtrip(expected) as actual:
assert_identical(expected, actual)