Added tests to utils.py and fixed bugs (#450)
### Pull Request Checklist:
- [x] This PR addresses an already opened issue (for bug fixes /
features)
    - This PR fixes #xyz
- [x] (If applicable) Documentation has been added / updated (for bug
fixes / features).
- [x] (If applicable) Tests have been added.
- [x] This PR does not seem to break the templates.
- [ ] CHANGELOG.rst has been updated (with summary of main changes).
- [ ] Link to issue (:issue:`number`) and pull request (:pull:`number`)
has been added.

### What kind of change does this PR introduce?

* Adds tests for all functions located in `utils.py`. With the exception
of a few lines impossible to test, everything should be covered.
* The `mask` argument in `stack_drop_nans` can now be a list of
dimensions. In that case, a `dropna(how='all')` operation will be used
to create the mask on-the-fly.
* `convert_calendar` in `xs.utils.clean_up` now uses `xarray` instead of
`xclim`.
* `attrs_to_remove` and `remove_all_attrs_except` in `xs.utils.clean_up`
now use real regex.
* Smaller changes:
  * `minimum_calendar` now accepts a list as input.
  * More calendars are recognized in `translate_time_chunk`
* Multiple entries can now be given for `change_attr_prefix` in
`xs.utils.clean_up`.
* `new_dim` in `unstack_dates` is now None by default and changes
depending on the frequency. It becomes `month` if the data is exactly
monthly, and keeps the old default of `season` otherwise.
* Updated the list of libraries in `show_versions` to reflect our
current environment.

* Bug fixes:
  * `maybe_unstack` now works if the dimension name is not the default.
  * `xs.utils.clean_up` no longer modifies the original dataset.
  * `unstack_dates` now works correctly for yearly datasets when
    `winter_starts_year=True`.
  * `unstack_dates` now works correctly for multi-year frequencies.


### Does this PR introduce a breaking change?

* `convert_calendar` in `xs.utils.clean_up` now uses `xarray` instead of
`xclim`. Keywords aren't compatible between the two, but given that
`xclim` will abandon its function, no backwards compatibility was
sought.
* `attrs_to_remove` and `remove_all_attrs_except` in `xs.utils.clean_up`
now use real regex. It should not be too breaking since a `fullmatch()`
is used, but `*` is now `.*`.
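
To illustrate the regex change above, here is a minimal sketch using only Python's standard `re` module. It does not call `xs.utils.clean_up`; the helper `matching_attrs` and the attribute names are hypothetical and only show how a `fullmatch` treats the old and new pattern styles.

```python
import re


def matching_attrs(attr_names, patterns):
    """Return the names fully matched by at least one regex pattern.

    Hypothetical helper for illustration only; it is not part of xscen.
    """
    return [
        name
        for name in attr_names
        if any(re.fullmatch(pattern, name) for pattern in patterns)
    ]


# Made-up attribute names, chosen to resemble typical dataset attributes.
attrs = ["standard_name", "long_name", "units", "cat:source", "cat:id"]

# New style: explicit regex. '.*name.*' matches any name containing 'name'.
print(matching_attrs(attrs, [".*name.*"]))  # ['standard_name', 'long_name']

# New style: 'cat:.*' matches any name starting with 'cat:'.
print(matching_attrs(attrs, ["cat:.*"]))  # ['cat:source', 'cat:id']

# Old style pattern read as a real regex: 'name*' means 'nam' followed by
# zero or more 'e', so with fullmatch it matches none of these names.
print(matching_attrs(attrs, ["name*"]))  # []

# Exact attribute names still work, since they fully match themselves.
print(matching_attrs(attrs, ["units"]))  # ['units']
```

In practice, this suggests that existing configurations relying on a trailing `*` or a leading `^` need to be rewritten with `.*`, while exact attribute names should be unaffected.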

### Other information:

There are 2 lines in `unstack_fill_nan` that I simply don't
understand... They will be covered next week when I can talk to the
author.
RondeauG authored Sep 10, 2024
2 parents d165269 + f8749e2 commit 7a4aed0
Showing 6 changed files with 1,428 additions and 179 deletions.
22 changes: 22 additions & 0 deletions CHANGELOG.rst
@@ -6,13 +6,29 @@ v0.9.2 (unreleased)
-------------------
Contributors to this version: Juliette Lavoie (:user:`juliettelavoie`), Pascal Bourgault (:user:`aulemahal`), Gabriel Rondeau-Genesse (:user:`RondeauG`).

New features and enhancements
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
* The `mask` argument in ``stack_drop_nans`` can now be a list of dimensions. In that case, a `dropna(how='all')` operation will be used to create the mask on-the-fly. (:pull:`450`).
* Few changes to ``clean_up``:
    * The `convert_calendar` function now uses `xarray` instead of `xclim`. (:pull:`450`).
    * The `attrs_to_remove` and `remove_all_attrs_except` arguments now use real regex. (:pull:`450`).
    * Multiple entries can now be given for `change_attr_prefix`. (:pull:`450`).
* ``minimum_calendar`` now accepts a list as input. (:pull:`450`).
* More calendars are now recognized in ``translate_time_chunk``. (:pull:`450`).
* `new_dim` in ``unstack_dates`` is now None by default and changes depending on the frequency. It becomes `month` if the data is exactly monthly, and keep the old default of `season` otherwise. (:pull:`450`).
* Updated the list of libraries in `show_versions` to reflect our current environment. (:pull:`450`).

Bug fixes
^^^^^^^^^
* Fixed bug with reusing weights. (:issue:`411`, :pull:`414`).
* Fixed bug in `update_from_ds` when "time" is a coordinate, but not a dimension. (:pull: `417`).
* Avoid modification of mutable arguments in ``search_data_catalogs`` (:pull:`413`).
* ``ensure_correct_time`` now correctly handles cases where timesteps are missing. (:pull:`440`).
* If using the argument `tile_buffer` with a `shape` method in ``spatial.subset``, the shapefile will now be reprojected to a WGS84 grid before the buffer is applied. (:pull:`440`).
* ``maybe_unstack`` now works if the dimension name is not the default. (:pull:`450`).
* ``unstack_fill_nan`` now works if given a dictionary that contains both dimensions and coordinates. (:pull:`450`).
* ``clean_up`` no longer modifies the original dataset. (:pull:`450`).
* ``unstack_dates`` now works correctly for yearly datasets when `winter_starts_year=True`, as well as multi-year datasets. (:pull:`450`).

Internal changes
^^^^^^^^^^^^^^^^
@@ -21,6 +37,12 @@ Internal changes
* Add ``.zip`` and ``.zarr.zip`` as possible file extensions for Zarr datasets. (:pull:`426`).
* Explicitly assign coords of multiindex in `xs.unstack_fill_nan`. (:pull:`427`).
* French translations are compiled offline. A new check ensures no PR are merged with missing messages. (:issue:`342`, :pull:`443`).
* Continued work to add tests. (:pull:`450`).

Breaking changes
^^^^^^^^^^^^^^^^
* `convert_calendar` in ``clean_up`` now uses `xarray` instead of `xclim`. Keywords aren't compatible between the two, but given that `xclim` will abandon its function, no backwards compatibility was sought. (:pull:`450`).
* `attrs_to_remove` and `remove_all_attrs_except` in ``clean_up`` now use real regex. It should not be too breaking since a `fullmatch()` is used, but `*` is now `.*`. (:pull:`450`).

v0.9.1 (2024-06-04)
-------------------
17 changes: 7 additions & 10 deletions docs/notebooks/2_getting_started.ipynb
@@ -1389,7 +1389,7 @@
"metadata": {},
"outputs": [],
"source": [
-"convert_calendar_kwargs = {\"target\": \"standard\"}\n",
+"convert_calendar_kwargs = {\"calendar\": \"standard\"}\n",
"missing_by_var = {\"tas\": \"interpolate\"}"
]
},
@@ -1404,14 +1404,11 @@
"\n",
"It is possible to write a list of attributes to remove with `attrs_to_remove`, or a list of attributes to keep and remove everything else with `remove_all_attrs_except`. Both take the shape of a dictionnary where the keys are the variables (and 'global' for global attrs) and the values are the list.\n",
"\n",
-"The element of the list can be exact matches for the attribute names or use the same regex matching rules as `intake_esm`:\n",
-"\n",
-"- ending with a '*' means checks if the substring is contained in the string\n",
-"- starting with a '^' means check if the string starts with the substring.\n",
+"The element of the list can be exact matches for the attribute names or use regex matching rules (using a `fullmatch`):\n",
"\n",
"Attributes can also be added to datasets using `add_attrs`. This is a dictionary where the keys are the variables and the values are a another dictionary of attributes.\n",
"\n",
-"It is also possible to modify the catalogue prefix 'cat:' by a new string with `change_attr_prefix`. Don't use this if this is not the last step of your workflow.\n"
+"It is also possible to modify the catalogue prefix 'cat:' by a new string with `change_attr_prefix`. Don't use this if this is not the last step of your workflow, as it may break some functions that rely on those prefixes to find the right dataset attributes.\n"
]
},
{
@@ -1422,15 +1419,15 @@
"outputs": [],
"source": [
"attrs_to_remove = {\n",
-" \"tas\": [\"name*\"]\n",
+" \"tas\": [\".*name.*\"]\n",
"} # remove tas attrs that contain the substring 'name'\n",
"remove_all_attrs_except = {\n",
-" \"global\": [\"^cat:\"]\n",
+" \"global\": [\"cat:.*\"]\n",
"} # remove all the global attrs EXCEPT for the one starting with cat:\n",
"add_attrs = {\n",
" \"tas\": {\"notes\": \"some crucial information\"}\n",
"} # add a new tas attribute named 'notes' with value 'some crucial information'\n",
-"change_attr_prefix = \"dataset:\" # change /cat to dataset:"
+"change_attr_prefix = \"dataset:\" # change 'cat': to 'dataset:'"
]
},
{
Expand Down Expand Up @@ -1500,7 +1497,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.12.2"
"version": "3.12.5"
}
},
"nbformat": 4,