Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for temperature differences in CF attributes #1836

Merged
merged 21 commits into from
Oct 17, 2024
Merged
Show file tree
Hide file tree
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 2 additions & 2 deletions CHANGELOG.rst
Original file line number Diff line number Diff line change
Expand Up @@ -8,8 +8,8 @@ Contributors to this version: David Huard (:user:`huard`).

Internal changes
^^^^^^^^^^^^^^^^
* Changed french translation of "wet days" from "jours mouillés" to "jours pluvieux". (:issue:`1825`, :pull:`1826`).

* Add attribute ``units_metadata`` to outputs representing a difference between temperatures. This is needed to disambiguate temperature differences from absolute temperature, and affects indicators using functions ``cumulative_difference``, ``temperature_sum``, ``interday_diurnal_temperature_range`` and `` extreme_temperature_range``. Added ``pint2cfattrs`` function to convert pint units to a dictionary of CF attributes. ``units2pint`` is also modified to support ``units_metadata`` attributes in DataArrays. Some SDBA properties and measures previously returning units of ``delta_degC`` will now return the original input DataArray units accompanied with the ``units_metadata`` attribute. (:issue:`1822`, :pull:`1830`).
* Changed french translation of "wet days" from "jours mouillés" to "jours pluvieux". (:issue:`1825`, :pull:`1836`).
huard marked this conversation as resolved.
Show resolved Hide resolved

v0.51.0 (2024-07-04)
--------------------
Expand Down
42 changes: 42 additions & 0 deletions docs/notebooks/units.ipynb
Original file line number Diff line number Diff line change
Expand Up @@ -223,6 +223,48 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## Temperature differences vs absolute temperature\n",
"\n",
"Temperature anomalies and biases as well as degree-days indicators are all *differences* between temperatures. If we assign those differences units of degrees Celsius, then converting to Kelvins or Fahrenheits will yield nonsensical values. ``pint`` solves this using *delta* units such as ``delta_degC`` and ``delta_degF``. \n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# If we have a DataArray storing a temperature anomaly of 2°C, converting to Kelvin will yield a nonsensical value 0f 275.15.\n",
"# Fortunately, pint has delta units to represent temperature differences.\n",
"display(units.convert_units_to(xr.DataArray([2], attrs={\"units\": \"delta_degC\"}), \"K\"))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The issue for ``xclim`` is that there are no equivalent delta units in the CF convention. To resolve this ambiguity, the [CF convention](https://cfconventions.org/Data/cf-conventions/cf-conventions-1.11/cf-conventions.html#temperature-units) recommends including a ``units_metadata`` attribute set to ``\"temperature: difference\"``, and this is supported in ``xclim`` as of version 0.52. The function ``units2pint`` interprets the ``units_metadata`` attribute and returns a ``pint`` delta unit as needed. To convert a ``pint`` delta unit to CF attributes, use the function ``pint2cfattrs``, which returns a dictionary with the ``units`` and ``units_metadata`` attributes (``pint2cfunits`` cannot support the convention because it only returns the unit string). "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"delta = xr.DataArray(\n",
" [2], attrs={\"units\": \"K\", \"units_metadata\": \"temperature: difference\"}\n",
")\n",
"units.convert_units_to(delta, \"delta_degC\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"\n",
"\n",
"## Other utilities\n",
"\n",
"Many helper functions are defined in `xclim.core.units`, see [Unit handling module](../api.rst#units-handling-submodule).\n"
Expand Down
6 changes: 6 additions & 0 deletions tests/test_generic.py
Original file line number Diff line number Diff line change
Expand Up @@ -332,6 +332,12 @@ def test_forbidden(self, tas_series):
with pytest.raises(NotImplementedError):
generic.cumulative_difference(tas, threshold="10 degC", op="!=")

def test_delta_units(self, tas_series):
tas = tas_series(np.array([-10, 15, 20, 3, 10]) + K2C)

out = generic.cumulative_difference(tas, threshold="10 degC", op=">=")
assert "units_metadata" in out.attrs


class TestFirstDayThreshold:
@pytest.mark.parametrize(
Expand Down
6 changes: 4 additions & 2 deletions tests/test_sdba/test_properties.py
Original file line number Diff line number Diff line change
Expand Up @@ -287,7 +287,8 @@ def test_annual_cycle(self, open_dataset):

assert amp.long_name.startswith("Absolute amplitude of the annual cycle")
assert phase.long_name.startswith("Phase of the annual cycle")
assert amp.units == "delta_degC"
assert amp.units == "K"
assert amp.units_metadata == "temperature: difference"
assert relamp.units == "%"
assert phase.units == ""

Expand Down Expand Up @@ -330,7 +331,8 @@ def test_annual_range(self, open_dataset):

assert amp.long_name.startswith("Average annual absolute amplitude")
assert phase.long_name.startswith("Average annual phase")
assert amp.units == "delta_degC"
assert amp.units == "K"
assert amp.units_metadata == "temperature: difference"
assert relamp.units == "%"
assert phase.units == ""

Expand Down
27 changes: 27 additions & 0 deletions tests/test_units.py
Original file line number Diff line number Diff line change
Expand Up @@ -18,6 +18,7 @@
declare_units,
infer_context,
lwethickness2amount,
pint2cfattrs,
pint2cfunits,
pint_multiply,
rate2amount,
Expand Down Expand Up @@ -117,6 +118,14 @@ def test_cf_conversion_amount2lwethickness_amount2rate(self):
assert out.attrs["units"] == "kg d-1 m-2" # CF equivalent unit
assert out.attrs["standard_name"] == "rainfall_flux"

def test_temperature_difference(self):
delta = xr.DataArray(
[2], attrs={"units": "K", "units_metadata": "temperature: difference"}
)
out = convert_units_to(source=delta, target="delta_degC")
assert out == 2
assert out.attrs["units"] == "degC"


class TestUnitConversion:
def test_pint2cfunits(self):
Expand Down Expand Up @@ -347,3 +356,21 @@ def test_to_agg_units(in_u, opfunc, op, exp, exp_u):
out = to_agg_units(getattr(da, opfunc)(), da, op)
np.testing.assert_allclose(out, exp)
assert out.attrs["units"] == exp_u


def test_pint2cfattrs():
attrs = pint2cfattrs(units.degK, is_difference=True)
assert attrs == {"units": "K", "units_metadata": "temperature: difference"}


def test_temp_difference_rountrip():
"""Test roundtrip of temperature difference units."""
attrs = {"units": "degC", "units_metadata": "temperature: difference"}
da = xr.DataArray([1], attrs=attrs)
pu = units2pint(da)
# Confirm that we get delta pint units
assert pu == units.delta_degC

# and that converting those back to cf attrs gives the same result
attrs = pint2cfattrs(pu)
assert attrs == {"units": "degC", "units_metadata": "temperature: difference"}
64 changes: 57 additions & 7 deletions xclim/core/units.py
Original file line number Diff line number Diff line change
Expand Up @@ -134,11 +134,18 @@ def units2pint(value: xr.DataArray | str | units.Quantity) -> pint.Unit:
-------
pint.Unit
Units of the data array.

Notes
-----
To avoid ambiguity related to differences in temperature vs absolute temperatures, set the `units_metadata`
attribute to `"temperature: difference"` or `"temperature: on_scale"` on the DataArray.
"""
metadata = None
if isinstance(value, str):
unit = value
elif isinstance(value, xr.DataArray):
unit = value.attrs["units"]
metadata = value.attrs.get("units_metadata", None)
elif isinstance(value, units.Quantity):
# This is a pint.PlainUnit, which is not the same as a pint.Unit
return cast(pint.Unit, value.units)
Expand All @@ -164,7 +171,10 @@ def units2pint(value: xr.DataArray | str | units.Quantity) -> pint.Unit:
"Remove white space from temperature units, e.g. use `degC`."
)

return units.parse_units(unit)
pu = units.parse_units(unit)
if metadata == "temperature: difference":
return (1 * pu - 1 * pu).units
return pu


def pint2cfunits(value: units.Quantity | units.Unit) -> str:
Expand All @@ -185,11 +195,46 @@ def pint2cfunits(value: units.Quantity | units.Unit) -> str:

# Issue originally introduced in https://github.com/hgrecco/pint/issues/1486
# Should be resolved in pint v0.24. See: https://github.com/hgrecco/pint/issues/1913
# Relies on cf_xarray's unit string formatting logic.
with warnings.catch_warnings():
warnings.simplefilter("ignore", category=DeprecationWarning)
return f"{value:cf}".replace("dimensionless", "")


def pint2cfattrs(value: units.Quantity | units.Unit, is_difference=None) -> dict:
"""Return CF-compliant units attributes from a `pint` unit.

Parameters
----------
value : pint.Unit
Input unit.
is_difference : bool
Whether the value represent a difference in temperature, which is ambiguous in the case of absolute
temperature scales like Kelvin or Rankine. It will automatically be set to True if units are "delta_*"
units.

Returns
-------
dict
Units following CF-Convention, using symbols.
"""
s = pint2cfunits(value)
if "delta_" in s:
is_difference = True
s = s.replace("delta_", "")

attrs = {"units": s}
if "[temperature]" in value.dimensionality:
aulemahal marked this conversation as resolved.
Show resolved Hide resolved
if is_difference:
attrs["units_metadata"] = "temperature: difference"
elif is_difference is False:
attrs["units_metadata"] = "temperature: on_scale"
else:
attrs["units_metadata"] = "temperature: unknown"

return attrs


def ensure_cf_units(ustr: str) -> str:
"""Ensure the passed unit string is CF-compliant.

Expand Down Expand Up @@ -325,7 +370,7 @@ def convert_units_to( # noqa: C901

if isinstance(source, xr.DataArray):
source_unit = units2pint(source)
target_cf_unit = pint2cfunits(target_unit)
target_cf_attrs = pint2cfattrs(target_unit)

# Automatic pre-conversions based on the dimensionalities and CF standard names
standard_name = source.attrs.get("standard_name")
Expand Down Expand Up @@ -360,12 +405,12 @@ def convert_units_to( # noqa: C901
out: xr.DataArray
if source_unit == target_unit:
# The units are the same, but the symbol may not be.
out = source.assign_attrs(units=target_cf_unit)
out = source.assign_attrs(**target_cf_attrs)
return out

with units.context(context or "none"):
out = source.copy(data=units.convert(source.data, source_unit, target_unit))
out = out.assign_attrs(units=target_cf_unit)
out = out.assign_attrs(**target_cf_attrs)
return out

# TODO remove backwards compatibility of int/float thresholds after v1.0 release
Expand Down Expand Up @@ -560,20 +605,23 @@ def to_agg_units(
out.attrs.update(
units="", is_dayofyear=np.int32(1), calendar=get_calendar(orig)
)
out.attrs.pop("units_metadata", None)

elif op in ["count", "integral"]:
m, freq_u_raw = infer_sampling_units(orig[dim])
orig_u = str2pint(ensure_absolute_temperature(orig.units))
freq_u = str2pint(freq_u_raw)
out = out * m
with xr.set_options(keep_attrs=True):
out = out * m

if op == "count":
out.attrs["units"] = freq_u_raw
elif op == "integral":
if "[time]" in orig_u.dimensionality:
# We need to simplify units after multiplication
out_units = (orig_u * freq_u).to_reduced_units()
out = out * out_units.magnitude
with xr.set_options(keep_attrs=True):
out = out * out_units.magnitude
out.attrs["units"] = pint2cfunits(out_units)
else:
out.attrs["units"] = pint2cfunits(orig_u * freq_u)
Expand Down Expand Up @@ -1328,7 +1376,7 @@ def wrapper(*args, **kwargs):
return dec


def ensure_delta(unit: str) -> str:
def ensure_delta(unit: xr.DataArray | str | units.Quantity) -> str:
"""Return delta units for temperature.

For dimensions where delta exist in pint (Temperature), it replaces the temperature unit by delta_degC or
Expand All @@ -1344,6 +1392,8 @@ def ensure_delta(unit: str) -> str:
#
delta_unit = pint2cfunits(d - d)
# replace kelvin/rankine by delta_degC/F
# Note (DH): This will fail if dimension is [temperature]^-1 or [temperature]^2 (e.g. variance)
# Recent versions of pint have a `to_preferred` method that could be used here (untested).
if "kelvin" in u._units:
delta_unit = pint2cfunits(u / units2pint("K") * units2pint("delta_degC"))
if "degree_Rankine" in u._units:
Expand Down
12 changes: 7 additions & 5 deletions xclim/indices/generic.py
Original file line number Diff line number Diff line change
Expand Up @@ -701,6 +701,7 @@ def temperature_sum(

out = (data - threshold).where(cond).resample(time=freq).sum()
out = direction * out
out.attrs["units_metadata"] = "temperature: difference"
return to_agg_units(out, data, "integral")
huard marked this conversation as resolved.
Show resolved Hide resolved


Expand All @@ -718,7 +719,6 @@ def interday_diurnal_temperature_range(
freq : str
Resampling frequency defining the periods as defined in :ref:`timeseries.resampling`.


Returns
-------
xr.DataArray
Expand All @@ -728,8 +728,8 @@ def interday_diurnal_temperature_range(
vdtr = abs((high_data - low_data).diff(dim="time"))
out = vdtr.resample(time=freq).mean(dim="time")

u = str2pint(low_data.units)
out.attrs["units"] = pint2cfunits(u - u)
out.attrs["units"] = low_data.attrs["units"]
out.attrs["units_metadata"] = "temperature: difference"
return out


Expand All @@ -755,8 +755,8 @@ def extreme_temperature_range(

out = high_data.resample(time=freq).max() - low_data.resample(time=freq).min()

u = str2pint(low_data.units)
out.attrs["units"] = pint2cfunits(u - u)
out.attrs["units"] = low_data.attrs["units"]
out.attrs["units_metadata"] = "temperature: difference"
return out


Expand Down Expand Up @@ -892,6 +892,8 @@ def cumulative_difference(
if freq is not None:
diff = diff.resample(time=freq).sum(dim="time")

diff.attrs["units_metadata"] = "temperature: difference"

return to_agg_units(diff, data, op="integral")


Expand Down
9 changes: 5 additions & 4 deletions xclim/sdba/measures.py
Original file line number Diff line number Diff line change
Expand Up @@ -15,7 +15,7 @@
import xarray as xr

from xclim.core.indicator import Indicator, base_registry
from xclim.core.units import convert_units_to, ensure_delta, units2pint
from xclim.core.units import convert_units_to, units2pint
from xclim.core.utils import InputKind

from .base import Grouper
Expand Down Expand Up @@ -173,7 +173,8 @@ def _bias(sim: xr.DataArray, ref: xr.DataArray) -> xr.DataArray:
Absolute bias
"""
out = sim - ref
out.attrs["units"] = ensure_delta(ref.attrs["units"])
out.attrs["units"] = ref.attrs["units"]
out.attrs["units_metadata"] = "temperature: difference"
return out


Expand Down Expand Up @@ -296,7 +297,7 @@ def _rmse_internal(_sim: xr.DataArray, _ref: xr.DataArray) -> xr.DataArray:
input_core_dims=[["time"], ["time"]],
dask="parallelized",
)
out = out.assign_attrs(units=ensure_delta(ref.units))
out = out.assign_attrs(units=ref.units, units_metadata={"temperature: difference"})
huard marked this conversation as resolved.
Show resolved Hide resolved
return out


Expand Down Expand Up @@ -343,7 +344,7 @@ def _mae_internal(_sim: xr.DataArray, _ref: xr.DataArray) -> xr.DataArray:
input_core_dims=[["time"], ["time"]],
dask="parallelized",
)
out = out.assign_attrs(units=ensure_delta(ref.units))
out = out.assign_attrs(units=ref.units, units_metadata="temperature: difference")
return out


Expand Down
6 changes: 4 additions & 2 deletions xclim/sdba/properties.py
Original file line number Diff line number Diff line change
Expand Up @@ -583,7 +583,8 @@ def _annual_cycle(
# TODO: In April 2024, use a match-case.
if stat == "absamp":
out = ac.max("dayofyear") - ac.min("dayofyear")
out.attrs["units"] = xc.core.units.ensure_delta(units)
out.attrs["units"] = units
out.attrs["units_metadata"] = "temperature: difference"
huard marked this conversation as resolved.
Show resolved Hide resolved
elif stat == "relamp":
out = (ac.max("dayofyear") - ac.min("dayofyear")) * 100 / ac.mean("dayofyear")
out.attrs["units"] = "%"
Expand Down Expand Up @@ -699,7 +700,8 @@ def _annual_statistic(

if stat == "absamp":
out = yrs.max() - yrs.min()
out.attrs["units"] = xc.core.units.ensure_delta(units)
out.attrs["units"] = units
out.attrs["units_metadata"] = "temperature: difference"
huard marked this conversation as resolved.
Show resolved Hide resolved
elif stat == "relamp":
out = (yrs.max() - yrs.min()) * 100 / yrs.mean()
out.attrs["units"] = "%"
Expand Down
Loading