split out CFDatetimeCoder, deprecate use_cftime as kwarg #9901

Open
wants to merge 8 commits into
base: main
Changes from 2 commits
11 changes: 11 additions & 0 deletions doc/api.rst
@@ -1096,6 +1096,17 @@ DataTree methods
.. Missing:
.. ``open_mfdatatree``

Encoding/Decoding
=================

Coder objects
-------------

.. autosummary::
:toctree: generated/

coders.CFDatetimeCoder

Coordinates objects
===================

7 changes: 7 additions & 0 deletions doc/whats-new.rst
@@ -27,6 +27,9 @@ New Features
- Add ``unit`` keyword argument to :py:func:`date_range` and ``microsecond`` parsing to
iso8601-parser (:pull:`9885`).
By `Kai Mühlbauer <https://github.com/kmuehlbauer>`_.
- Split out ``CFDatetimeCoder`` into ``xr.coders`` and let the ``decode_times`` keyword
argument accept a ``CFDatetimeCoder`` instance.


Breaking changes
~~~~~~~~~~~~~~~~
@@ -42,6 +45,10 @@ Deprecations
- Finalize deprecation of ``closed`` parameters of :py:func:`cftime_range` and
:py:func:`date_range` (:pull:`9882`).
By `Kai Mühlbauer <https://github.com/kmuehlbauer>`_.
- The time-decoding keyword argument ``use_cftime`` is deprecated. Pass
``decode_times=CFDatetimeCoder(use_cftime=True)`` to the respective functions
instead.
By `Kai Mühlbauer <https://github.com/kmuehlbauer>`_.

Bug fixes
~~~~~~~~~
3 changes: 2 additions & 1 deletion xarray/__init__.py
@@ -1,6 +1,6 @@
from importlib.metadata import version as _version

from xarray import groupers, testing, tutorial, ufuncs
from xarray import coders, groupers, testing, tutorial, ufuncs
from xarray.backends.api import (
load_dataarray,
load_dataset,
@@ -66,6 +66,7 @@
# `mypy --strict` running in projects that import xarray.
__all__ = ( # noqa: RUF022
# Sub-packages
"coders",
"groupers",
"testing",
"tutorial",
47 changes: 35 additions & 12 deletions xarray/backends/api.py
@@ -33,6 +33,7 @@
_normalize_path,
)
from xarray.backends.locks import _get_scheduler
from xarray.coders import CFDatetimeCoder
from xarray.core import indexing
from xarray.core.combine import (
_infer_concat_order_from_positions,
@@ -481,7 +482,10 @@ def open_dataset(
cache: bool | None = None,
decode_cf: bool | None = None,
mask_and_scale: bool | Mapping[str, bool] | None = None,
decode_times: bool | Mapping[str, bool] | None = None,
decode_times: bool
| CFDatetimeCoder
| Mapping[str, bool | CFDatetimeCoder]
| None = None,
decode_timedelta: bool | Mapping[str, bool] | None = None,
use_cftime: bool | Mapping[str, bool] | None = None,
concat_characters: bool | Mapping[str, bool] | None = None,
@@ -543,9 +547,9 @@ def open_dataset(
be replaced by NA. Pass a mapping, e.g. ``{"my_variable": False}``,
to toggle this feature per-variable individually.
This keyword may not be supported by all the backends.
decode_times : bool or dict-like, optional
decode_times : bool, CFDatetimeCoder or dict-like, optional
If True, decode times encoded in the standard NetCDF datetime format
into datetime objects. Otherwise, leave them encoded as numbers.
into datetime objects. If a ``CFDatetimeCoder`` instance, decode times
with that coder. If False, leave them encoded as numbers.
Pass a mapping, e.g. ``{"my_variable": False}``,
to toggle this feature per-variable individually.
This keyword may not be supported by all the backends.
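
The per-variable behaviour described in this docstring can be sketched without xarray. The snippet below is only an illustration: `TimeCoderSketch` and `resolve_for_variable` are hypothetical names, and the fallback to `True` for variables missing from the mapping is an assumption, not something this diff shows.

```python
from collections.abc import Mapping


class TimeCoderSketch:
    """Hypothetical stand-in for xarray.coders.CFDatetimeCoder."""

    def __init__(self, use_cftime=None):
        self.use_cftime = use_cftime


def resolve_for_variable(decode_times, name, default=True):
    """Return the decode_times setting that applies to one variable.

    A mapping is looked up by variable name and falls back to `default`
    (an assumed fallback); any other value, bool or coder, applies to
    every variable unchanged.
    """
    if isinstance(decode_times, Mapping):
        return decode_times.get(name, default)
    return decode_times


coder = TimeCoderSketch(use_cftime=True)
settings = {"time": coder, "my_variable": False}

for var_name in ("time", "my_variable", "other"):
    print(var_name, "->", resolve_for_variable(settings, var_name))
```

With a mapping, each variable can receive its own coder or be switched off, while a bare bool or coder is shared by all variables.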
@@ -569,6 +573,8 @@ def open_dataset(
raise an error. Pass a mapping, e.g. ``{"my_variable": False}``,
to toggle this feature per-variable individually.
This keyword may not be supported by all the backends.
Passing ``use_cftime`` as a keyword argument is deprecated. Instead, set it
when initializing ``CFDatetimeCoder`` and pass the coder as ``decode_times``.
concat_characters : bool or dict-like, optional
If True, concatenate along the last dimension of character arrays to
form string arrays. Dimensions will only be concatenated over (and
@@ -698,7 +704,10 @@ def open_dataarray(
cache: bool | None = None,
decode_cf: bool | None = None,
mask_and_scale: bool | None = None,
decode_times: bool | None = None,
decode_times: bool
| CFDatetimeCoder
| Mapping[str, bool | CFDatetimeCoder]
| None = None,
decode_timedelta: bool | None = None,
use_cftime: bool | None = None,
concat_characters: bool | None = None,
@@ -761,9 +770,11 @@ def open_dataarray(
`missing_value` attribute contains multiple values a warning will be
issued and all array values matching one of the multiple values will
be replaced by NA. This keyword may not be supported by all the backends.
decode_times : bool, optional
decode_times : bool, CFDatetimeCoder or dict-like, optional
If True, decode times encoded in the standard NetCDF datetime format
into datetime objects. Otherwise, leave them encoded as numbers.
into datetime objects. If a ``CFDatetimeCoder`` instance, decode times
with that coder. If False, leave them encoded as numbers.
Pass a mapping, e.g. ``{"my_variable": False}``,
to toggle this feature per-variable individually.
This keyword may not be supported by all the backends.
decode_timedelta : bool, optional
If True, decode variables and coordinates with time units in
@@ -781,6 +792,8 @@ def open_dataarray(
represented using ``np.datetime64[ns]`` objects. If False, always
decode times to ``np.datetime64[ns]`` objects; if this is not possible
raise an error. This keyword may not be supported by all the backends.
Passing ``use_cftime`` as a keyword argument is deprecated. Instead, set it
when initializing ``CFDatetimeCoder`` and pass the coder as ``decode_times``.
concat_characters : bool, optional
If True, concatenate along the last dimension of character arrays to
form string arrays. Dimensions will only be concatenated over (and
@@ -903,7 +916,10 @@ def open_datatree(
cache: bool | None = None,
decode_cf: bool | None = None,
mask_and_scale: bool | Mapping[str, bool] | None = None,
decode_times: bool | Mapping[str, bool] | None = None,
decode_times: bool
| CFDatetimeCoder
| Mapping[str, bool | CFDatetimeCoder]
| None = None,
decode_timedelta: bool | Mapping[str, bool] | None = None,
use_cftime: bool | Mapping[str, bool] | None = None,
concat_characters: bool | Mapping[str, bool] | None = None,
@@ -961,9 +977,9 @@ def open_datatree(
be replaced by NA. Pass a mapping, e.g. ``{"my_variable": False}``,
to toggle this feature per-variable individually.
This keyword may not be supported by all the backends.
decode_times : bool or dict-like, optional
decode_times : bool, CFDatetimeCoder or dict-like, optional
If True, decode times encoded in the standard NetCDF datetime format
into datetime objects. Otherwise, leave them encoded as numbers.
into datetime objects. If a ``CFDatetimeCoder`` instance, decode times
with that coder. If False, leave them encoded as numbers.
Pass a mapping, e.g. ``{"my_variable": False}``,
to toggle this feature per-variable individually.
This keyword may not be supported by all the backends.
@@ -987,6 +1003,8 @@ def open_datatree(
raise an error. Pass a mapping, e.g. ``{"my_variable": False}``,
to toggle this feature per-variable individually.
This keyword may not be supported by all the backends.
Passing ``use_cftime`` as a keyword argument is deprecated. Instead, set it
when initializing ``CFDatetimeCoder`` and pass the coder as ``decode_times``.
concat_characters : bool or dict-like, optional
If True, concatenate along the last dimension of character arrays to
form string arrays. Dimensions will only be concatenated over (and
@@ -1118,7 +1136,10 @@ def open_groups(
cache: bool | None = None,
decode_cf: bool | None = None,
mask_and_scale: bool | Mapping[str, bool] | None = None,
decode_times: bool | Mapping[str, bool] | None = None,
decode_times: bool
| CFDatetimeCoder
| Mapping[str, bool | CFDatetimeCoder]
| None = None,
decode_timedelta: bool | Mapping[str, bool] | None = None,
use_cftime: bool | Mapping[str, bool] | None = None,
concat_characters: bool | Mapping[str, bool] | None = None,
@@ -1180,9 +1201,9 @@ def open_groups(
be replaced by NA. Pass a mapping, e.g. ``{"my_variable": False}``,
to toggle this feature per-variable individually.
This keyword may not be supported by all the backends.
decode_times : bool or dict-like, optional
decode_times : bool, CFDatetimeCoder or dict-like, optional
If True, decode times encoded in the standard NetCDF datetime format
into datetime objects. Otherwise, leave them encoded as numbers.
into datetime objects. If a ``CFDatetimeCoder`` instance, decode times
with that coder. If False, leave them encoded as numbers.
Pass a mapping, e.g. ``{"my_variable": False}``,
to toggle this feature per-variable individually.
This keyword may not be supported by all the backends.
@@ -1206,6 +1227,8 @@ def open_groups(
raise an error. Pass a mapping, e.g. ``{"my_variable": False}``,
to toggle this feature per-variable individually.
This keyword may not be supported by all the backends.
Passing ``use_cftime`` as a keyword argument is deprecated. Instead, set it
when initializing ``CFDatetimeCoder`` and pass the coder as ``decode_times``.
concat_characters : bool or dict-like, optional
If True, concatenate along the last dimension of character arrays to
form string arrays. Dimensions will only be concatenated over (and
10 changes: 10 additions & 0 deletions xarray/coders.py
@@ -0,0 +1,10 @@
"""
This module provides coder objects that encapsulate the
"encoding/decoding" process.
"""

from xarray.coding.times import CFDatetimeCoder

__all__ = [
"CFDatetimeCoder",
]
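
To illustrate what "coder object" means here, the following is a minimal toy sketch and not xarray's `CFDatetimeCoder`. The class name `HoursSinceCoderSketch`, its `reference` parameter, and the list-based interface are all assumptions made for the example; the real coder works on `Variable` objects and reads units from attributes.

```python
from datetime import datetime, timedelta


class HoursSinceCoderSketch:
    """Hypothetical toy coder; not xarray's CFDatetimeCoder."""

    def __init__(self, reference: datetime) -> None:
        # Configuration is fixed at construction time, in the same way
        # that `use_cftime` is stored on the real coder.
        self.reference = reference

    def decode(self, values: list[float]) -> list[datetime]:
        # Turn "hours since <reference>" numbers into datetime objects.
        return [self.reference + timedelta(hours=v) for v in values]

    def encode(self, dates: list[datetime]) -> list[float]:
        # Inverse of decode: datetimes back to hours since the reference.
        return [(d - self.reference) / timedelta(hours=1) for d in dates]


coder = HoursSinceCoderSketch(reference=datetime(2000, 1, 1))
decoded = coder.decode([0, 12, 36])
roundtrip = coder.encode(decoded)
print(decoded)
print(roundtrip)
```

Bundling the options with the `encode`/`decode` pair is what lets a single object be handed to `decode_times` instead of a growing set of separate keyword arguments.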
21 changes: 17 additions & 4 deletions xarray/coding/times.py
@@ -36,7 +36,11 @@
except ImportError:
cftime = None

from xarray.core.types import CFCalendar, NPDatetimeUnitOptions, T_DuckArray
from xarray.core.types import (
CFCalendar,
NPDatetimeUnitOptions,
T_DuckArray,
)

T_Name = Union[Hashable, None]

@@ -204,7 +208,10 @@ def _unpack_time_units_and_ref_date(units: str) -> tuple[str, pd.Timestamp]:


def _decode_cf_datetime_dtype(
data, units: str, calendar: str | None, use_cftime: bool | None
data,
units: str,
calendar: str | None,
use_cftime: bool | None,
) -> np.dtype:
# Verify that at least the first and last date can be decoded
# successfully. Otherwise, tracebacks end up swallowed by
@@ -311,7 +318,10 @@ def _decode_datetime_with_pandas(


def decode_cf_datetime(
num_dates, units: str, calendar: str | None = None, use_cftime: bool | None = None
num_dates,
units: str,
calendar: str | None = None,
use_cftime: bool | None = None,
) -> np.ndarray:
"""Given an array of numeric dates in netCDF format, convert it into a
numpy array of date time objects.
@@ -974,7 +984,10 @@ def _lazily_encode_cf_timedelta(


class CFDatetimeCoder(VariableCoder):
def __init__(self, use_cftime: bool | None = None) -> None:
def __init__(
self,
use_cftime: bool | None = None,
) -> None:
self.use_cftime = use_cftime

def encode(self, variable: Variable, name: T_Name = None) -> Variable:
46 changes: 36 additions & 10 deletions xarray/conventions.py
@@ -7,6 +7,7 @@

import numpy as np

from xarray.coders import CFDatetimeCoder
from xarray.coding import strings, times, variables
from xarray.coding.variables import SerializationWarning, pop_to
from xarray.core import indexing
@@ -88,7 +89,7 @@ def encode_cf_variable(
ensure_not_multiindex(var, name=name)

for coder in [
times.CFDatetimeCoder(),
CFDatetimeCoder(),
times.CFTimedeltaCoder(),
variables.CFScaleOffsetCoder(),
variables.CFMaskCoder(),
@@ -109,7 +110,7 @@ def decode_cf_variable(
var: Variable,
concat_characters: bool = True,
mask_and_scale: bool = True,
decode_times: bool = True,
decode_times: bool | CFDatetimeCoder = True,
decode_endianness: bool = True,
stack_char_dim: bool = True,
use_cftime: bool | None = None,
@@ -136,7 +137,7 @@ def decode_cf_variable(
Lazily scale (using scale_factor and add_offset) and mask
(using _FillValue). If the _Unsigned attribute is present
treat integer arrays as unsigned.
decode_times : bool
decode_times : bool or CFDatetimeCoder
Decode cf times ("hours since 2000-01-01") to np.datetime64.
decode_endianness : bool
Decode arrays from non-native to native endianness.
@@ -154,6 +155,8 @@ def decode_cf_variable(
represented using ``np.datetime64[ns]`` objects. If False, always
decode times to ``np.datetime64[ns]`` objects; if this is not possible
raise an error.
Passing ``use_cftime`` as a keyword argument is deprecated. Instead, set it
when initializing ``CFDatetimeCoder`` and pass the coder as ``decode_times``.

Returns
-------
@@ -167,7 +170,7 @@ def decode_cf_variable(
original_dtype = var.dtype

if decode_timedelta is None:
decode_timedelta = decode_times
decode_timedelta = bool(decode_times)

if concat_characters:
if stack_char_dim:
@@ -191,7 +194,28 @@ def decode_cf_variable(
if decode_timedelta:
var = times.CFTimedeltaCoder().decode(var, name=name)
if decode_times:
var = times.CFDatetimeCoder(use_cftime=use_cftime).decode(var, name=name)
# remove checks after end of deprecation cycle
if not isinstance(decode_times, CFDatetimeCoder):
if use_cftime is not None:
from warnings import warn

warn(
"Usage of 'use_cftime' as kwarg is deprecated. "
"Please initialize it with CFDatetimeCoder and "
"'decode_times' kwarg.",
Contributor Author:
Suggested change
"'decode_times' kwarg.",
"'decode_times' kwarg.\n",
"Example usage:\n",
" time_coder = xr.coders.CFDatetimeCoder(use_cftime=True)\n",
" ds = xr.open_dataset(decode_times=time_coder)\n",

Contributor Author:
@mathause Would that help in the DeprecationWarning?

Collaborator:
Yes, that's good.

DeprecationWarning,
stacklevel=2,
)
decode_times = CFDatetimeCoder(use_cftime=use_cftime)
else:
if use_cftime is not None:
raise TypeError(
"Usage of 'use_cftime' as kwarg is not allowed, "
"if 'decode_times' is initialized with "
"CFDatetimeCoder. Please add 'use_cftime' "
"when initializing CFDatetimeCoder."
)
var = decode_times.decode(var, name=name)

if decode_endianness and not var.dtype.isnative:
var = variables.EndianCoder().decode(var)
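
The deprecation handling in this hunk can be sketched in isolation. The snippet below is a simplified illustration that assumes a stand-in class `TimeCoderSketch` and a hypothetical helper `resolve_time_coder`; it mirrors the branch structure shown above but is not the xarray implementation, and the `TypeError` wording is shortened.

```python
import warnings


class TimeCoderSketch:
    """Hypothetical stand-in for xarray.coders.CFDatetimeCoder."""

    def __init__(self, use_cftime=None):
        self.use_cftime = use_cftime


def resolve_time_coder(decode_times, use_cftime=None):
    """Mirror the branch structure of the hunk above.

    Returns a coder, or None when time decoding is switched off.
    """
    if not decode_times:
        return None
    if isinstance(decode_times, TimeCoderSketch):
        if use_cftime is not None:
            # Two sources for the same option: refuse to guess.
            raise TypeError(
                "Pass 'use_cftime' when initializing the coder, "
                "not as a separate keyword argument."
            )
        return decode_times
    if use_cftime is not None:
        # Adjacent string literals are joined into one message; commas
        # between them would turn them into separate arguments.
        warnings.warn(
            "Usage of 'use_cftime' as kwarg is deprecated. "
            "Please initialize it with CFDatetimeCoder and "
            "'decode_times' kwarg.",
            DeprecationWarning,
            stacklevel=2,
        )
    return TimeCoderSketch(use_cftime=use_cftime)


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    legacy = resolve_time_coder(True, use_cftime=True)

print(legacy.use_cftime, [w.category.__name__ for w in caught])
```

The legacy call still works but warns, an explicit coder passes through untouched, and mixing both spellings raises, so there is never ambiguity about which `use_cftime` value wins.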
@@ -302,7 +326,7 @@ def decode_cf_variables(
attributes: T_Attrs,
concat_characters: bool | Mapping[str, bool] = True,
mask_and_scale: bool | Mapping[str, bool] = True,
decode_times: bool | Mapping[str, bool] = True,
decode_times: bool | CFDatetimeCoder | Mapping[str, bool | CFDatetimeCoder] = True,
decode_coords: bool | Literal["coordinates", "all"] = True,
drop_variables: T_DropVariables = None,
use_cftime: bool | Mapping[str, bool] | None = None,
@@ -439,7 +463,7 @@ def decode_cf(
obj: T_DatasetOrAbstractstore,
concat_characters: bool = True,
mask_and_scale: bool = True,
decode_times: bool = True,
decode_times: bool | CFDatetimeCoder = True,
decode_coords: bool | Literal["coordinates", "all"] = True,
drop_variables: T_DropVariables = None,
use_cftime: bool | None = None,
@@ -458,7 +482,7 @@ def decode_cf(
mask_and_scale : bool, optional
Lazily scale (using scale_factor and add_offset) and mask
(using _FillValue).
decode_times : bool, optional
decode_times : bool or CFDatetimeCoder, optional
Decode cf times (e.g., integers since "hours since 2000-01-01") to
np.datetime64.
decode_coords : bool or {"coordinates", "all"}, optional
@@ -483,6 +507,8 @@ def decode_cf(
represented using ``np.datetime64[ns]`` objects. If False, always
decode times to ``np.datetime64[ns]`` objects; if this is not possible
raise an error.
Passing ``use_cftime`` as a keyword argument is deprecated. Instead, set it
when initializing ``CFDatetimeCoder`` and pass the coder as ``decode_times``.
decode_timedelta : bool, optional
If True, decode variables and coordinates with time units in
{"days", "hours", "minutes", "seconds", "milliseconds", "microseconds"}
@@ -536,7 +562,7 @@ def cf_decoder(
attributes: T_Attrs,
concat_characters: bool = True,
mask_and_scale: bool = True,
decode_times: bool = True,
decode_times: bool | CFDatetimeCoder = True,
) -> tuple[T_Variables, T_Attrs]:
"""
Decode a set of CF encoded variables and attributes.
@@ -553,7 +579,7 @@ def cf_decoder(
mask_and_scale : bool
Lazily scale (using scale_factor and add_offset) and mask
(using _FillValue).
decode_times : bool
decode_times : bool or CFDatetimeCoder
Decode cf times ("hours since 2000-01-01") to np.datetime64.

Returns