Merge branch 'main' into pre-commit-ci-update-config
Zeitsperre authored Sep 13, 2023
2 parents c1e5624 + f189b60 commit acb50f5
Showing 41 changed files with 1,018 additions and 34,843 deletions.
2 changes: 1 addition & 1 deletion .cruft.json
@@ -11,7 +11,7 @@
"project_slug": "xscen",
"project_short_description": "A climate change scenario-building analysis framework, built with xclim/xarray.",
"pypi_username": "RondeauG",
"version": "0.7.3-beta",
"version": "0.7.5-beta",
"use_pytest": "y",
"use_black": "y",
"add_pyup_badge": "n",
2 changes: 1 addition & 1 deletion .readthedocs.yml
@@ -10,7 +10,7 @@ formats:
build:
os: ubuntu-22.04
tools:
python: "mambaforge-4.10"
python: "mambaforge-22.9"
jobs:
post_create_environment:
- pip install . --no-deps
11 changes: 9 additions & 2 deletions HISTORY.rst
@@ -12,20 +12,27 @@ Announcements

New features and enhancements
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
* N/A
* Added the ability to search for simulations that reach a given warming level. (:pull:`251`).

Breaking changes
^^^^^^^^^^^^^^^^
* N/A

Bug fixes
^^^^^^^^^
* N/A
* Fixed a bug in ``xs.search_data_catalogs`` when searching for fixed fields and specific experiments/members. (:pull:`251`).

Internal changes
^^^^^^^^^^^^^^^^
* Continued work on adding tests. (:pull:`251`).
* Fixed pre-commit's pretty-format-json so it ignores notebooks. (:pull:`254`).
* Fixed the labeler so docs/CI isn't automatically added for contributions by new collaborators. (:pull:`254`).
* Made it so that `tests` are no longer treated as an installable package. (:pull:`248`).
* Renamed the pytest marker from `requires_docs` to `requires_netcdf`. (:pull:`248`).
* Included the documentation in the source distribution, while excluding the NetCDF files. (:pull:`248`).
* Reduced the size of the files in /docs/notebooks/samples and changed the Notebooks and tests accordingly. (:issue:`247`, :pull:`248`).
* Added a new `xscen.testing` module with the `datablock_3d` function previously located in `/tests/conftest.py`. (:pull:`248`).
* New function `xscen.testing.fake_data` to generate fake data for testing. (:pull:`248`).

v0.7.1 (2023-08-23)
-------------------
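For context on the *Bug fixes* entry above: the fix concerns calls to `xs.search_data_catalogs` that combine a fixed field (frequency `fx`) with criteria such as specific experiments or members. A minimal sketch of that kind of query is shown below; the catalog path is borrowed from the documentation notebook further down and is only an illustration, not part of the changelog.

```python
from pathlib import Path

import xscen as xs

# Combine a fixed field (land fraction, frequency "fx") with a time-varying
# variable while also restricting the experiment -- the combination covered
# by the fix above.
cat = xs.search_data_catalogs(
    data_catalogs=[f"{Path().absolute()}/samples/pangeo-cmip6.json"],
    variables_and_freqs={"tasmin": "D", "sftlf": "fx"},
    other_search_criteria={"experiment": ["ssp585"]},
    match_hist_and_fut=True,
)
print(list(cat))  # one entry per unique dataset ID
```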
16 changes: 4 additions & 12 deletions MANIFEST.in
@@ -3,27 +3,19 @@ include CONTRIBUTING.rst
include HISTORY.rst
include LICENSE
include README.rst
include requirements_dev.txt
include requirements_docs.txt

recursive-include xscen *.json *.yml *.py *.csv
recursive-include tests *
recursive-include docs notebooks *.rst conf.py Makefile make.bat *.jpg *.png *.gif *.ipynb *.csv *.json *.yml *.md
recursive-include docs notebooks samples *.csv *.json

recursive-exclude * __pycache__
recursive-exclude * *.py[co]
recursive-exclude docs notebooks *.rst conf.py Makefile make.bat *.jpg *.png *.gif *.ipynb *.csv *.json *.yml *.md
recursive-exclude docs notebooks samples *.csv *.json
recursive-exclude conda *.yml
recursive-exclude templates *.csv *.json *.py *.yml
recursive-exclude docs notebooks samples tutorial *.nc

exclude .cruft.json
exclude .editorconfig
exclude .gitlab-ci.yml
exclude .gitmodules
exclude .pre-commit-config.yaml
exclude .readthedocs.yml
exclude .secrets.baseline
exclude .yamllint.yaml
exclude .*
exclude Makefile
exclude environment.yml
exclude environment-dev.yml
6 changes: 3 additions & 3 deletions README.rst
@@ -70,6 +70,6 @@ This package was created with Cookiecutter_ and the `Ouranosinc/cookiecutter-pyp
:target: https://pypi.python.org/pypi/xscen
:alt: Supported Python Versions

.. |status| image:: https://www.repostatus.org/badges/latest/wip.svg
:target: https://www.repostatus.org/#wip
:alt: Project Status: WIP – Initial development is in progress, but there has not yet been a stable, usable release suitable for the public.
.. |status| image:: https://www.repostatus.org/badges/latest/active.svg
:target: https://www.repostatus.org/#active
:alt: Project Status: Active The project has reached a stable, usable state and is being actively developed.
116 changes: 87 additions & 29 deletions docs/notebooks/1_catalog.ipynb
@@ -71,12 +71,11 @@
"\n",
"from xscen import DataCatalog, ProjectCatalog\n",
"\n",
"# Prepare a dummy folder where data will be put\n",
"output_folder = Path().absolute() / \"_data\"\n",
"output_folder.mkdir(exist_ok=True)\n",
"\n",
"\n",
"DC = DataCatalog(f\"{Path().absolute()}/samples/pangeo-cmip6.json\")\n",
"\n",
"DC"
]
},
@@ -96,7 +95,7 @@
"outputs": [],
"source": [
"# Access the catalog\n",
"DC.df"
"DC.df[0:3]"
]
},
{
@@ -173,7 +172,7 @@
"metadata": {},
"outputs": [],
"source": [
"# Regex: Find all entries that start with \"rcp\"\n",
"# Regex: Find all entries that start with \"ssp\"\n",
"print(DC.search(experiment=\"^ssp\").unique(\"experiment\"))"
]
},
@@ -195,8 +194,8 @@
"metadata": {},
"outputs": [],
"source": [
"# Regex: Find all experiments except the exact string \"ssp245\"\n",
"print(DC.search(experiment=\"^(?!ssp245$).*$\").unique(\"experiment\"))"
"# Regex: Find all experiments except the exact string \"ssp126\"\n",
"print(DC.search(experiment=\"^(?!ssp126$).*$\").unique(\"experiment\"))"
]
},
{
@@ -255,14 +254,14 @@
"- `allow_conversion` is used to allow searching for calculable variables, in the case where the requested variable would not be available.\n",
"- `restrict_resolution` is used to limit the results to the finest or coarsest resolution available for each source.\n",
"- `restrict_members` is used to limit the results to a maximum number of realizations for each source.\n",
"- `restrict_warming_level` is used to limit the results to only datasets that are present in the csv used for calculating warming levels.\n",
"- `restrict_warming_level` is used to limit the results to only datasets that are present in the csv used for calculating warming levels. You can also pass a dict to verify that a given warming level is reached.\n",
"\n",
"Note that compared to `search`, the result of `search_data_catalog` is a dictionary with one entry per unique ID. A given unique ID might contain multiple datasets as per `intake-esm`'s definition, because it groups catalog lines per *id - domain - processing_level - xrfreq*. Thus, it separates model data that exists at different frequencies.\n",
"Note that compared to `search`, the result of `search_data_catalog` is a dictionary with one entry per unique ID. A given unique ID might contain multiple datasets as per `intake-esm`'s definition, because it groups catalog lines per *id - domain - processing_level - xrfreq*. Thus, it would separate model data that exists at different frequencies.\n",
"\n",
"\n",
"#### Example 1: Simple dataset\n",
"#### Example 1: Multiple variables and frequencies + Historical and future\n",
"\n",
"Let's start by searching for CMIP6 data that has subdaily precipitation, daily temperature and the land fraction data. The main difference compared to searching for reference datasets is that in most cases, `match_hist_and_fut` will be required to match *historical* simulations to their future counterparts. This works for both CMIP5 and CMIP6 nomenclatures."
"Let's start by searching for CMIP6 data that has subdaily precipitation, daily minimum temperature and the land fraction data. The main difference compared to searching for reference datasets is that in most cases, `match_hist_and_fut` will be required to match *historical* simulations to their future counterparts. This works for both CMIP5 and CMIP6 nomenclatures."
]
},
{
@@ -276,8 +275,8 @@
"source": [
"import xscen as xs\n",
"\n",
"variables_and_freqs = {\"tas\": \"D\", \"pr\": \"3H\", \"sftlf\": \"fx\"}\n",
"other_search_criteria = {\"institution\": [\"NOAA-GFDL\"], \"experiment\": [\"ssp585\"]}\n",
"variables_and_freqs = {\"tasmin\": \"D\", \"pr\": \"3H\", \"sftlf\": \"fx\"}\n",
"other_search_criteria = {\"institution\": [\"NOAA-GFDL\"]}\n",
"\n",
"cat_sim = xs.search_data_catalogs(\n",
" data_catalogs=[f\"{Path().absolute()}/samples/pangeo-cmip6.json\"],\n",
@@ -289,12 +288,34 @@
"cat_sim"
]
},
{
"cell_type": "markdown",
"id": "82535e6c",
"metadata": {},
"source": [
"If required, at this stage, a dataset can be looked at in more details. If we examine the results (look at the 'date_start' and 'date_end' columns), we'll see that it successfully found historical simulations in the *CMIP* activity and renamed both their *activity* and *experiment* to match the future simulations."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a6e5bd7e",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"cat_sim[\"ScenarioMIP_NOAA-GFDL_GFDL-CM4_ssp585_r1i1p1f1_gr1\"].df"
]
},
{
"cell_type": "markdown",
"id": "85ee34fe",
"metadata": {},
"source": [
"Two simulations correspond to the search criteria, but as can be seen from the results, it is the same simulation on 2 different grids (`gr1` and `gr2`). If desired, `restrict_resolution` can be called to choose the finest or coarsest grid in such cases."
"#### Example 2: Restricting results\n",
"\n",
"The two previous search results were the same simulation, but on 2 different grids (`gr1` and `gr2`). If desired, `restrict_resolution` can be called to choose the finest or coarsest grid."
]
},
{
@@ -306,7 +327,7 @@
},
"outputs": [],
"source": [
"variables_and_freqs = {\"tas\": \"D\", \"pr\": \"3H\", \"sftlf\": \"fx\"}\n",
"variables_and_freqs = {\"tasmin\": \"D\", \"pr\": \"3H\", \"sftlf\": \"fx\"}\n",
"other_search_criteria = {\"institution\": [\"NOAA-GFDL\"], \"experiment\": [\"ssp585\"]}\n",
"\n",
"cat_sim = xs.search_data_catalogs(\n",
@@ -322,30 +343,67 @@
},
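The rest of this cell is collapsed in the diff view; a sketch of the complete call follows. The value passed to `restrict_resolution` is an assumption for illustration (a keyword selecting the finest available grid) and is not shown in this commit.

```python
from pathlib import Path

import xscen as xs

variables_and_freqs = {"tasmin": "D", "pr": "3H", "sftlf": "fx"}
other_search_criteria = {"institution": ["NOAA-GFDL"], "experiment": ["ssp585"]}

# Same search as in Example 1, but keep only the finest grid per source.
# "finest" is an assumed value; check the xscen documentation for the
# accepted options.
cat_sim = xs.search_data_catalogs(
    data_catalogs=[f"{Path().absolute()}/samples/pangeo-cmip6.json"],
    variables_and_freqs=variables_and_freqs,
    other_search_criteria=other_search_criteria,
    match_hist_and_fut=True,
    restrict_resolution="finest",
)
cat_sim
```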
{
"cell_type": "markdown",
"id": "82535e6c",
"id": "fcd847c0-0ea8-46ad-bc28-9b73edd627bc",
"metadata": {},
"source": [
"If required, at this stage a dataset can be looked at in more details. If we examine the results (look at the 'date_start' and 'date_end' columns), we'll see that it successfully found historical simulations in the *CMIP* activity and renamed both their *activity* and *experiment* to match the future simulations."
"Similarly, if we search for historical NorESM2-MM data, we'll find that it has 3 members. If desired, `restrict_members` can be called to choose a maximum number of realization per model."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a6e5bd7e",
"metadata": {
"tags": []
},
"id": "6c5ce0fc-2f25-4b55-bf65-5f140f07e331",
"metadata": {},
"outputs": [],
"source": [
"cat_sim[\"ScenarioMIP_NOAA-GFDL_GFDL-CM4_ssp585_r1i1p1f1_gr1\"].df"
"variables_and_freqs = {\"tasmin\": \"D\"}\n",
"other_search_criteria = {\"source\": [\"NorESM2-MM\"], \"experiment\": [\"historical\"]}\n",
"\n",
"cat_sim = xs.search_data_catalogs(\n",
" data_catalogs=[f\"{Path().absolute()}/samples/pangeo-cmip6.json\"],\n",
" variables_and_freqs=variables_and_freqs,\n",
" other_search_criteria=other_search_criteria,\n",
" restrict_members={\"ordered\": 2},\n",
")\n",
"\n",
"cat_sim"
]
},
{
"cell_type": "markdown",
"id": "4fd28f58-5ab7-4d65-8906-2197592c8c94",
"metadata": {},
"source": [
"Finally, `restrict_warming_level` can be used to be sure that the results either exist in `xscen`'s warming level database (if a boolean), or reach a given warming level."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "3300b1d7-e37b-4aa4-991e-99609bb1adea",
"metadata": {},
"outputs": [],
"source": [
"variables_and_freqs = {\"tasmin\": \"D\"}\n",
"\n",
"cat_sim = xs.search_data_catalogs(\n",
" data_catalogs=[f\"{Path().absolute()}/samples/pangeo-cmip6.json\"],\n",
" variables_and_freqs=variables_and_freqs,\n",
" match_hist_and_fut=True,\n",
" restrict_warming_level={\n",
" \"wl\": 2\n",
" }, # SSP126 gets eliminated, since it doesn't reach +2°C by 2100.\n",
")\n",
"\n",
"cat_sim"
]
},
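The markdown cell above also mentions a boolean form of `restrict_warming_level`; a sketch of what that would look like, assuming `True` simply requires the dataset to be present in the warming-level database:

```python
from pathlib import Path

import xscen as xs

# Keep only datasets that appear in xscen's warming-level database,
# without requiring a specific warming level to be reached.
cat_sim_db = xs.search_data_catalogs(
    data_catalogs=[f"{Path().absolute()}/samples/pangeo-cmip6.json"],
    variables_and_freqs={"tasmin": "D"},
    match_hist_and_fut=True,
    restrict_warming_level=True,
)
cat_sim_db
```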
{
"cell_type": "markdown",
"id": "6bddc58d",
"metadata": {},
"source": [
"#### Example 2: Advanced search\n",
"#### Example 3: Search for data that can be computed from what's available\n",
"\n",
"`allow_resampling` and `allow_conversion` are powerful search tools to find data that doesn't explicitely exist in the catalog, but that can easily be computed."
]
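The code cell for this example is collapsed in the diff below; here is a sketch of the kind of query it makes, with the requested variables and frequencies assumed from the surrounding prose (*evspsblpot* at a daily frequency and *tas* at a yearly one):

```python
from pathlib import Path

import xscen as xs

# Neither daily evspsblpot nor yearly tas is stored as-is in the sample catalog:
# evspsblpot can be derived from tasmin/tasmax (allow_conversion), and daily
# data can be aggregated to a yearly frequency (allow_resampling).
cat_sim_adv = xs.search_data_catalogs(
    data_catalogs=[f"{Path().absolute()}/samples/pangeo-cmip6.json"],
    variables_and_freqs={"evspsblpot": "D", "tas": "YS"},
    other_search_criteria={
        "source": ["NorESM2-MM"],
        "processing_level": ["raw"],
        "experiment": ["ssp585"],
    },
    match_hist_and_fut=True,
    allow_resampling=True,
    allow_conversion=True,
)
cat_sim_adv
```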
@@ -375,7 +433,7 @@
"id": "b33b2ad7",
"metadata": {},
"source": [
"If we examine the SSP5-8.5 results, we'll see that while it failed to find *evspsblpot*, it successfully understood that *tasmin* and *tasmax* can be used to compute it. It also understood that daily *tas* is a valid search result for `{tas: YS}`, since it can be aggregated."
"If we examine the SSP5-8.5 results, we'll see that while it failed to find *evspsblpot*, it successfully understood that *tasmin* and *tasmax* can be used to compute it. It also understood that daily *tasmin* and *tasmax* is a valid search result for `{tas: YS}`, since it can be computed first, then aggregated to a yearly frequency."
]
},
{
@@ -413,15 +471,15 @@
" other_search_criteria={\n",
" \"source\": [\"NorESM2-MM\"],\n",
" \"processing_level\": [\"raw\"],\n",
" \"experiment\": [\"ssp370\"],\n",
" \"experiment\": [\"ssp585\"],\n",
" },\n",
" match_hist_and_fut=True,\n",
" allow_resampling=True,\n",
" allow_conversion=True,\n",
")\n",
"print(\n",
" cat_sim_adv_multifreq[\n",
" \"ScenarioMIP_NCC_NorESM2-MM_ssp370_r1i1p1f1_gn\"\n",
" \"ScenarioMIP_NCC_NorESM2-MM_ssp585_r1i1p1f1_gn\"\n",
" ]._requested_variable_freqs\n",
")"
]
@@ -435,7 +493,7 @@
"\n",
"The `allow_conversion` argument is built upon `xclim`'s virtual indicators module and `intake-esm`'s [DerivedVariableRegistry](https://ncar.github.io/esds/posts/2021/intake-esm-derived-variables/) in a way that should be seamless to the user. It works by using the methods defined in `xscen/xclim_modules/conversions.yml` to add a registry of *derived* variables that exist virtually through computation methods.\n",
"\n",
"In the example above, we can see that the search failed to find *evspsblpot* within *NORESM2-MM*, but understood that *tasmin* and *tasmax* could be used to estimate it using `xclim`'s `potential_evapotranspiration`.\n",
"In the example above, we can see that the search failed to find *evspsblpot* within *NorESM2-MM*, but understood that *tasmin* and *tasmax* could be used to estimate it using `xclim`'s `potential_evapotranspiration`.\n",
"\n",
"Most use cases should already be covered by the aforementioned file. The preferred way to add new methods is to [submit a new indicator to xclim](https://xclim.readthedocs.io/en/stable/contributing.html), and then to add a call to that indicator in `conversions.yml`. In the case where this is not possible or where the transformation would be out of scope for `xclim`, the calculation can be implemented into `xscen/xclim_modules/conversions.py` instead.\n",
"\n",
@@ -678,7 +736,7 @@
"outputs": [],
"source": [
"# Create fake files for the example:\n",
"root = Path(\".\").absolute() / \"parser_examples\"\n",
"root = Path(\".\").absolute() / \"_data\" / \"parser_examples\"\n",
"root.mkdir(exist_ok=True)\n",
"\n",
"paths = [\n",
@@ -1006,7 +1064,7 @@
"import shutil as sh\n",
"\n",
"# Create the destination folder\n",
"root = Path(\".\").absolute() / \"path_builder_examples\"\n",
"root = Path(\".\").absolute() / \"_data\" / \"path_builder_examples\"\n",
"root.mkdir(exist_ok=True)\n",
"\n",
"# Get new names:\n",
@@ -1044,7 +1102,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.4"
"version": "3.10.12"
}
},
"nbformat": 4,