
Commit

Address a few FutureWarnings (#380)
### Pull Request Checklist:
- [x] This PR addresses an already opened issue (for bug fixes / features)
    - This PR fixes #xyz
- [x] (If applicable) Documentation has been added / updated (for bug fixes / features).
- [x] (If applicable) Tests have been added.
- [x] This PR does not seem to break the templates.
- [x] CHANGES.rst has been updated (with summary of main changes).
- [x] Link to issue (:issue:`number`) and pull request (:pull:`number`) has been added.

### What kind of change does this PR introduce?

* Addresses a few FutureWarnings that I encountered recently:
  * `groupby` will change its default to `observed=True`. I think that our implementation here does not care about `observed`, even when we use categoricals, but I'm not 100% sure. We could use `observed=False` to ensure no breaking change (see the sketch after this list).
    * https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.groupby.html
    * https://towardsdatascience.com/be-careful-when-using-pandas-groupby-with-categorical-data-type-a1d31f66b162
  * Changed a few of the old `pandas` frequency codes that had been missed.
  * Changed `pd.unique` to `np.unique`.
    * `pd.unique with argument that is not not a Series, Index, ExtensionArray, or np.ndarray is deprecated and will raise in a future version.`
  * `intake_esm` no longer spams the `applymap` warning, so our fix for it was removed. It still emits the "observed=True" spam, however.
  * Changed an implementation of inplace modifications to a DataFrame.
    * `FutureWarning: A value is trying to be set on a copy of a DataFrame or Series through chained assignment using an inplace method. The behavior will change in pandas 3.0. This inplace method will never work because the intermediate object on which we are setting values always behaves as a copy. For example, when doing 'df[col].method(value, inplace=True)', try using 'df.method({col: value}, inplace=True)' or df[col] = df[col].method(value) instead, to perform the operation inplace on the original object.`
  * Added a temporary fix for the `flox` spam in the documentation.
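
For reference, a minimal standalone sketch of the `observed` behaviour (not taken from the xscen codebase; the data is made up):

```python
import pandas as pd

df = pd.DataFrame(
    {
        "cat": pd.Categorical(["a", "a", "b"], categories=["a", "b", "c"]),
        "val": [1, 2, 3],
    }
)

# pandas 2.x warns that the default of `observed` will flip from False to
# True in pandas 3.0 when grouping on a categorical column.
df.groupby("cat")["val"].sum()

# Passing the argument explicitly silences the warning: observed=True drops
# the unobserved category "c", while observed=False keeps it (with a sum of 0).
df.groupby("cat", observed=True)["val"].sum()
```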

### Does this PR introduce a breaking change?

- No: to avoid breaking changes, 'Y' and 'M' are still allowed in `date_parser` alongside the new 'YE' and 'ME'.

### Other information:

RondeauG authored Apr 11, 2024
2 parents fc0ae6e + 3d38b8c commit 32b360f
Showing 10 changed files with 100 additions and 25 deletions.
1 change: 1 addition & 0 deletions CHANGES.rst
@@ -21,6 +21,7 @@ Internal changes
* Added more tests. (:pull:`366`, :pull:`367`, :pull:`372`).
* Refactored ``xs.spatial.subset`` into smaller functions. (:pull:`367`).
* An `encoding` argument was added to ``xs.config.load_config``. (:pull:`370`).
+ * Various small fixes to the code to address FutureWarnings. (:pull:`380`).

Bug fixes
^^^^^^^^^
21 changes: 20 additions & 1 deletion docs/notebooks/2_getting_started.ipynb
@@ -1,5 +1,24 @@
{
"cells": [
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "eb10a72a-9ea1-4414-922b-0ea1aaea0648",
+ "metadata": {
+ "nbsphinx": "hidden"
+ },
+ "outputs": [],
+ "source": [
+ "# Remove flox spam\n",
+ "\n",
+ "import logging\n",
+ "\n",
+ "# Get the logger for the 'flox' package\n",
+ "logger = logging.getLogger(\"flox\")\n",
+ "# Set the logging level to WARNING\n",
+ "logger.setLevel(logging.WARNING)"
+ ]
+ },
{
"cell_type": "markdown",
"id": "4f220a85",
@@ -1481,7 +1500,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.12"
"version": "3.12.2"
}
},
"nbformat": 4,
21 changes: 20 additions & 1 deletion docs/notebooks/3_diagnostics.ipynb
@@ -1,5 +1,24 @@
{
"cells": [
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "d513b8c4-0cb4-429b-b169-e0d8d40c795f",
+ "metadata": {
+ "nbsphinx": "hidden"
+ },
+ "outputs": [],
+ "source": [
+ "# Remove flox spam\n",
+ "\n",
+ "import logging\n",
+ "\n",
+ "# Get the logger for the 'flox' package\n",
+ "logger = logging.getLogger(\"flox\")\n",
+ "# Set the logging level to WARNING\n",
+ "logger.setLevel(logging.WARNING)"
+ ]
+ },
{
"cell_type": "code",
"execution_count": null,
@@ -484,7 +503,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.12"
"version": "3.12.2"
}
},
"nbformat": 4,
24 changes: 22 additions & 2 deletions docs/notebooks/4_ensembles.ipynb
@@ -1,5 +1,23 @@
{
"cells": [
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "nbsphinx": "hidden"
+ },
+ "outputs": [],
+ "source": [
+ "# Remove flox spam\n",
+ "\n",
+ "import logging\n",
+ "\n",
+ "# Get the logger for the 'flox' package\n",
+ "logger = logging.getLogger(\"flox\")\n",
+ "# Set the logging level to WARNING\n",
+ "logger.setLevel(logging.WARNING)"
+ ]
+ },
{
"cell_type": "markdown",
"metadata": {},
@@ -36,7 +54,9 @@
"\n",
"for d in datasets:\n",
" ds = open_dataset(datasets[d]).isel(lon=slice(0, 4), lat=slice(0, 4))\n",
" ds = xs.climatological_mean(ds, window=30, periods=[[1981, 2010], [2021, 2050]])\n",
" ds = xs.climatological_op(\n",
" ds, op=\"mean\", window=30, periods=[[1981, 2010], [2021, 2050]]\n",
" )\n",
" datasets[d] = xs.compute_deltas(ds, reference_horizon=\"1981-2010\")\n",
" datasets[d].attrs[\"cat:id\"] = d # Required by build_reduction_data\n",
" datasets[d].attrs[\"cat:xrfreq\"] = \"AS-JAN\""
@@ -270,7 +290,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.5"
"version": "3.12.2"
}
},
"nbformat": 4,
21 changes: 20 additions & 1 deletion docs/notebooks/5_warminglevels.ipynb
@@ -1,5 +1,24 @@
{
"cells": [
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "f1899896-70a1-4efb-80e6-8765b95f4388",
+ "metadata": {
+ "nbsphinx": "hidden"
+ },
+ "outputs": [],
+ "source": [
+ "# Remove flox spam\n",
+ "\n",
+ "import logging\n",
+ "\n",
+ "# Get the logger for the 'flox' package\n",
+ "logger = logging.getLogger(\"flox\")\n",
+ "# Set the logging level to WARNING\n",
+ "logger.setLevel(logging.WARNING)"
+ ]
+ },
{
"cell_type": "markdown",
"id": "3e311475",
@@ -483,7 +502,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.12"
"version": "3.12.2"
}
},
"nbformat": 4,
4 changes: 2 additions & 2 deletions docs/notebooks/6_config.ipynb
Original file line number Diff line number Diff line change
@@ -277,7 +277,7 @@
"import xarray as xr\n",
"\n",
"# Create a dummy dataset\n",
"time = pd.date_range(\"1951-01-01\", \"2100-01-01\", freq=\"AS-JAN\")\n",
"time = pd.date_range(\"1951-01-01\", \"2100-01-01\", freq=\"YS-JAN\")\n",
"da = xr.DataArray([0] * len(time), coords={\"time\": time})\n",
"da.name = \"test\"\n",
"ds = da.to_dataset()\n",
@@ -378,7 +378,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.11"
"version": "3.12.2"
}
},
"nbformat": 4,
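
The `freq` change above follows the pandas 2.2 rename of the year-start alias; a hedged one-line comparison (plain pandas, nothing xscen-specific):

```python
import pandas as pd

time = pd.date_range("1951-01-01", "2100-01-01", freq="YS-JAN")  # no warning
# pd.date_range("1951-01-01", "2100-01-01", freq="AS-JAN")  # FutureWarning on pandas >= 2.2
```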
6 changes: 0 additions & 6 deletions xscen/__init__.py
@@ -75,9 +75,3 @@ def warning_on_one_line(
"Pass observed=False to retain current behavior or observed=True to adopt the future default "
"and silence this warning.",
)
- warnings.filterwarnings(
- "ignore",
- category=FutureWarning,
- module="intake_esm",
- message="DataFrame.applymap has been deprecated. Use DataFrame.map instead.",
- )
12 changes: 7 additions & 5 deletions xscen/catutils.py
@@ -634,11 +634,13 @@ def parse_directory( # noqa: C901

# translate xrfreq into frequencies and vice-versa
if {"xrfreq", "frequency"}.issubset(df.columns):
df["xrfreq"].fillna(
df["frequency"].apply(CV.frequency_to_xrfreq, default=pd.NA), inplace=True
df.fillna(
{"xrfreq": df["frequency"].apply(CV.frequency_to_xrfreq, default=pd.NA)},
inplace=True,
)
df["frequency"].fillna(
df["xrfreq"].apply(CV.xrfreq_to_frequency, default=pd.NA), inplace=True
df.fillna(
{"frequency": df["xrfreq"].apply(CV.xrfreq_to_frequency, default=pd.NA)},
inplace=True,
)

# Parse dates
@@ -757,7 +759,7 @@ def parse_from_ds( # noqa: C901
attrs["variable"] = tuple(sorted(variables))
elif name in ("frequency", "xrfreq") and time is not None and time.size > 3:
# round to the minute to catch floating point imprecision
freq = xr.infer_freq(time.round("T"))
freq = xr.infer_freq(time.round("min"))
if freq:
if "xrfreq" in names:
attrs["xrfreq"] = freq
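
A standalone sketch of the chained-assignment fix applied above, with made-up values:

```python
import pandas as pd

df = pd.DataFrame({"xrfreq": [None, "D"], "frequency": ["day", None]})

# Warns in pandas 2.x and will silently do nothing in pandas 3.0, because
# df["xrfreq"] is an intermediate object that behaves as a copy:
# df["xrfreq"].fillna("D", inplace=True)

# Equivalent call that operates on the frame itself, as in the hunk above:
df.fillna({"xrfreq": "D"}, inplace=True)
```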
5 changes: 3 additions & 2 deletions xscen/extract.py
@@ -175,7 +175,7 @@ def extract_dataset( # noqa: C901
)

out_dict = {}
- for xrfreq in pd.unique([x for y in variables_and_freqs.values() for x in y]):
+ for xrfreq in np.unique([x for y in variables_and_freqs.values() for x in y]):
ds = xr.Dataset()
attrs = {}
# iterate on the datasets, in reverse timedelta order
@@ -814,7 +814,8 @@ def search_data_catalogs( # noqa: C901
valid_tp = []
for var, group in varcat.df.groupby(
varcat.esmcat.aggregation_control.groupby_attrs
+ ["variable"]
+ ["variable"],
observed=True,
):
valid_tp.append(
subset_file_coverage(
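
A minimal illustration of the `pd.unique` deprecation and its replacement (made-up data; note that `np.unique` sorts its result, which `pd.unique` does not):

```python
import numpy as np
import pandas as pd

variables_and_freqs = {"tas": ["D", "MS"], "pr": ["D"]}
flat = [x for y in variables_and_freqs.values() for x in y]

# pd.unique(flat)  # FutureWarning in pandas 2.x: a plain list argument
#                  # is deprecated and will raise in a future version.
np.unique(flat)  # np.unique accepts any array-like
```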
10 changes: 5 additions & 5 deletions xscen/utils.py
@@ -172,7 +172,7 @@ def date_parser( # noqa: C901
date : str, cftime.datetime, pd.Timestamp, datetime.datetime, pd.Period
Date to be converted
end_of_period : bool or str
- If 'Y' or 'M', the returned date will be the end of the year or month that contains the received date.
+ If 'YE' or 'ME', the returned date will be the end of the year or month that contains the received date.
If True, the period is inferred from the date's precision, but `date` must be a string, otherwise nothing is done.
out_dtype : str
Choices are 'datetime', 'period' or 'str'
@@ -245,12 +245,12 @@ def _parse_date(date, fmts):

if isinstance(end_of_period, str) or (end_of_period is True and fmt):
quasiday = (pd.Timedelta(1, "d") - pd.Timedelta(1, "s")).as_unit(date.unit)
if end_of_period == "Y" or "m" not in fmt:
if end_of_period in ["Y", "YE"] or "m" not in fmt:
date = (
pd.tseries.frequencies.to_offset("A-DEC").rollforward(date) + quasiday
pd.tseries.frequencies.to_offset("YE-DEC").rollforward(date) + quasiday
)
elif end_of_period == "M" or "d" not in fmt:
date = pd.tseries.frequencies.to_offset("M").rollforward(date) + quasiday
elif end_of_period in ["M", "ME"] or "d" not in fmt:
date = pd.tseries.frequencies.to_offset("ME").rollforward(date) + quasiday
# TODO: Implement subdaily ?

if out_dtype == "str":
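
A hedged sketch of the renamed offset aliases used above ('A-DEC' -> 'YE-DEC', 'M' -> 'ME'), assuming pandas >= 2.2:

```python
import pandas as pd

date = pd.Timestamp("2001-06-15")
quasiday = pd.Timedelta(1, "d") - pd.Timedelta(1, "s")

# End of the year containing `date`: 2001-12-31 23:59:59
pd.tseries.frequencies.to_offset("YE-DEC").rollforward(date) + quasiday
# End of the month containing `date`: 2001-06-30 23:59:59
pd.tseries.frequencies.to_offset("ME").rollforward(date) + quasiday
```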
