Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fast MBCn (a la groupies) #1580

Merged
merged 113 commits into from
Jul 24, 2024
Merged
Show file tree
Hide file tree
Changes from 60 commits
Commits
Show all changes
113 commits
Select commit Hold shift + click to select a range
5fce024
npdf_transform relying more on numpy
coxipi Dec 7, 2023
60c58f1
Fix check rot_matrices is None in NpdfTransform
coxipi Dec 8, 2023
c915cd7
allow rotation_matrices input in fast_npdf (for real)
coxipi Dec 8, 2023
4a5fcb2
not None -> None (duh)
coxipi Dec 8, 2023
2767b56
modular training/adjustment
coxipi Dec 8, 2023
3351727
handle default kws with .setdefault
coxipi Dec 9, 2023
9221fac
put back None -> {}
coxipi Dec 9, 2023
c00558e
convert quantiles to array
coxipi Dec 9, 2023
5c0041d
npdf_adj 'sim' dataset with movingwin dim
coxipi Dec 9, 2023
a05eefb
fusion _single_qdm and single_qdm (much less code)
coxipi Dec 9, 2023
674e1b0
forgotten indices
coxipi Dec 9, 2023
55b4e03
fixes to fast_npdf_adjust support for movingwin
coxipi Dec 9, 2023
58903cd
opti loop order, fix slice bug np, (2,1,10950) -> (10950,1)
coxipi Dec 11, 2023
351f73e
_adj -> _adjust
coxipi Dec 11, 2023
254f4d4
Minimal working example, fast npdf a la xr
coxipi Dec 16, 2023
f0a8ab3
npdf_train/adjust methods for NpdfTransform
coxipi Dec 17, 2023
7308f20
fix ungrouping (group -> group.name)
coxipi Dec 17, 2023
ae5bc2d
optimize interp / quantile computation
coxipi Dec 17, 2023
0c50fc1
assign correct values to quantiles in af_q
coxipi Dec 18, 2023
8288c92
use map_blocks
coxipi Dec 19, 2023
388995f
revert to no map_blocks
coxipi Dec 19, 2023
e4ebf64
adjust now performs reordering
coxipi Dec 20, 2023
dce3241
map_blocks in training
coxipi Dec 20, 2023
f0fb4dc
map_blocks for adjust too
coxipi Dec 20, 2023
62aad22
fixed train map_blocks, adjust map_blocks still problematic
coxipi Dec 20, 2023
bd0a415
remove map_blocks, chunk group dimension (e.g. doy)
coxipi Dec 20, 2023
f6a1a5e
control number of points in group chunks
coxipi Dec 20, 2023
a01e465
rotations in numpy call
coxipi Dec 20, 2023
660b0ec
train with map groups
coxipi Dec 21, 2023
d2e00ea
loop over groups
coxipi Dec 21, 2023
aadcfd3
apply_ufunc needs to know size of quantiles
coxipi Dec 21, 2023
912ac0d
standardize before doing npdft
coxipi Dec 21, 2023
13320c1
Merge branch 'master' of github.com:Ouranosinc/xclim into npdf_gpies
coxipi Jan 9, 2024
6574b85
update CHANGES
coxipi Jan 9, 2024
64af439
remove comments and unused functions
coxipi Jan 9, 2024
e4764a8
adjust a la groupies too
coxipi Jan 10, 2024
1d7a50a
remove bug in last iteration skip
coxipi Jan 10, 2024
cea04a3
fix _npdf_train, mbcn in train_adjust
coxipi Jan 12, 2024
ffba740
QDM scen in npdf_adjust
coxipi Jan 12, 2024
f672e1b
more comments and refactoring
coxipi Jan 12, 2024
45cc4c6
more documentation, refactoring
coxipi Jan 13, 2024
84c2944
use mbcn for functions name and bring back NpdfTransform
coxipi Jan 15, 2024
be0c3fb
escores in MBCn
coxipi Jan 15, 2024
5e2c969
use _sortquantile
coxipi Jan 15, 2024
4ed7cad
update CHANGES
coxipi Jan 15, 2024
92b8a85
min implementation accepting different base_kws for diff vars
coxipi Jan 15, 2024
aefd71f
Refactoring, more comments
coxipi Jan 16, 2024
eb74ba4
finish the change of argument name
coxipi Jan 16, 2024
79d7798
sdba notebook uses MBCn appropriately
coxipi Jan 16, 2024
5351ad1
public functions in MBCn
coxipi Jan 16, 2024
d98cb83
Merge remote-tracking branch 'origin/master' into npdf_gpies
coxipi Jan 16, 2024
e225f57
Merge remote-tracking branch 'origin/fix-yamale-schema' into npdf_gpies
coxipi Jan 16, 2024
29eddb1
added back scens computation
coxipi Jan 16, 2024
14165ca
typo, missing variable to select value in dict
coxipi Jan 18, 2024
33b8afd
Merge branch 'main' of https://github.com/Ouranosinc/xclim into npdf_…
coxipi Apr 23, 2024
31c1154
deactivate warning for missing output in _escores
coxipi Apr 24, 2024
f2240b7
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Apr 24, 2024
4c19bf6
Merge branch 'main' into npdf_gpies
Zeitsperre Apr 25, 2024
f366cf3
better handling of units & refactor
coxipi Jun 3, 2024
671dba5
Merge branch 'npdf_gpies' of github.com:Ouranosinc/xclim into npdf_gpies
coxipi Jun 3, 2024
60076b5
format docstrings
coxipi Jun 11, 2024
44ff0db
Merge branch 'main' into npdf_gpies
Zeitsperre Jun 11, 2024
e03815a
fix bad merge
Zeitsperre Jun 11, 2024
1e2e06e
refactor
coxipi Jun 11, 2024
30eaca6
Merge branch 'npdf_gpies' of github.com:Ouranosinc/xclim into npdf_gpies
coxipi Jun 11, 2024
418148d
simplify & remove fast_interp
coxipi Jun 12, 2024
d223862
change names, more doc
coxipi Jun 12, 2024
a8792e9
use old quantile backend
coxipi Jun 12, 2024
d19249e
update CHANGES
coxipi Jun 12, 2024
7f27f4f
Merge branch 'main' into npdf_gpies
Zeitsperre Jun 13, 2024
ff6bd3b
Merge branch 'main' of github.com:Ouranosinc/xclim into npdf_gpies
coxipi Jun 19, 2024
812a97e
multivariate adjustment classes
coxipi Jun 19, 2024
a4de9a9
Dataset -> xr.Dataset
coxipi Jun 19, 2024
1324ebe
sim: optional in MultivariateAdjust
coxipi Jun 21, 2024
16b23f4
use base cls properly
coxipi Jun 21, 2024
0596788
standardize in npdf_train
coxipi Jun 25, 2024
dab437e
fix input check
juliettelavoie Jun 25, 2024
8c3c288
pin netcdf4
juliettelavoie Jun 25, 2024
bbdba8a
Merge branch 'npdf_gpies' of github.com:Ouranosinc/xclim into npdf_gpies
coxipi Jun 25, 2024
c0bc251
Merge branch 'main' into npdf_gpies
Zeitsperre Jun 25, 2024
3196688
fix grouped_time_indexes
coxipi Jun 26, 2024
26e0959
Merge branch 'main' of github.com:Ouranosinc/xclim into npdf_gpies
coxipi Jun 26, 2024
bce768c
work with stacked DataArrays
coxipi Jun 26, 2024
f28abfa
`multivar` ==> thresh must be converted early
coxipi Jun 27, 2024
70c0efb
updates CHANGES
coxipi Jun 27, 2024
eecf10f
revert unwanted changes
coxipi Jun 27, 2024
d49916e
Merge branch 'main' of github.com:Ouranosinc/xclim into npdf_gpies
coxipi Jun 27, 2024
ecdd4ee
MBCn minimal test
coxipi Jun 27, 2024
fff35d4
don't keep indexes
coxipi Jun 28, 2024
87504cb
Merge branch 'main' of github.com:Ouranosinc/xclim into npdf_gpies
coxipi Jun 28, 2024
8466344
Merge branch 'main' into npdf_gpies
Zeitsperre Jul 2, 2024
bfc6866
format docstrings
coxipi Jul 19, 2024
3956044
Merge branch 'npdf_gpies' of github.com:Ouranosinc/xclim into npdf_gpies
coxipi Jul 19, 2024
605f068
jitter_under_thresh defaults to None
coxipi Jul 19, 2024
2232ea8
simple harmonize_units & better docstrings
coxipi Jul 19, 2024
2868ae1
rem list comprehension
coxipi Jul 19, 2024
a734038
directly code adjustment factor
coxipi Jul 19, 2024
ce44d8a
better doc part1
coxipi Jul 19, 2024
385cdab
improve doc part2
coxipi Jul 19, 2024
5f6e7f5
Merge branch 'npdf_gpies' of github.com:Ouranosinc/xclim into npdf_gpies
coxipi Jul 19, 2024
9043d5d
fast path _harmonize_units_multivariate
coxipi Jul 19, 2024
53380ac
test harmonize_units
coxipi Jul 19, 2024
fcdaf86
fillna(-1) before selecting group indexes
coxipi Jul 19, 2024
9d46a6a
update doc in nb
coxipi Jul 19, 2024
aa311c2
Merge branch 'main' of github.com:Ouranosinc/xclim into npdf_gpies
coxipi Jul 19, 2024
e4f248f
rem unintended change in DQM docstring
coxipi Jul 19, 2024
aa48579
Merge branch 'main' of github.com:Ouranosinc/xclim into npdf_gpies
coxipi Jul 22, 2024
c7c93c1
Merge branch 'npdf_gpies' of github.com:Ouranosinc/xclim into npdf_gpies
coxipi Jul 22, 2024
f6cd662
Merge branch 'main' into npdf_gpies
coxipi Jul 23, 2024
a9fc293
Merge branch 'main' into npdf_gpies
Zeitsperre Jul 23, 2024
13ce8ab
multivar da need units in test (no skip check)
coxipi Jul 23, 2024
2116c63
_quantile: default behaviour == np.nanquantile & better error msg
coxipi Jul 23, 2024
02f3c55
don't see units for multivariate scen
coxipi Jul 24, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
6 changes: 6 additions & 0 deletions CHANGES.rst
Original file line number Diff line number Diff line change
Expand Up @@ -10,6 +10,11 @@ Announcements
^^^^^^^^^^^^^
* `xclim` has migrated its development branch name from `master` to `main`. (:issue:`1667`, :pull:`1669`).


New features and enhancements
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
* New `TrainAdjust` and class `MBCn` giving a faster and more accurate implementation of the 'MBCn' algorithm (:issue:`1551`, :pull:`1580`).

New indicators
^^^^^^^^^^^^^^
* New ``snw_season_length`` and ``snd_season_length`` computing the duration between the start and the end of the snow season, both defined as the first day of a continuous period with snow above/under a threshold. Previous versions of these indicators were renamed ``snw_days_above`` and ``snd_days_above`` to better reflect what they computed : the number of days with snow above a given threshold (with no notion of continuity). (:issue:`1703`, :pull:`1708`).
Expand Down Expand Up @@ -83,6 +88,7 @@ New features and enhancements
* Added a new ``xclim.indices.generic.select_rolling_resample_op`` function to allow for computing rolling statistics. (:issue:`1480`, :pull:`1643`).
* Add the possibility to use a group with a window in ``xc.sdba.processing.reordering``. (:pull:`1566`).


Breaking changes
^^^^^^^^^^^^^^^^
* `xclim` base Python version has been raised to Python3.9. Python3.9+ coding conventions are now supported. (:issue:`1268`, :pull:`1565`).
Expand Down
148 changes: 29 additions & 119 deletions docs/notebooks/sdba.ipynb
Original file line number Diff line number Diff line change
Expand Up @@ -481,7 +481,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"##### Perform an initial univariate adjustment."
"##### Perform the multivariate adjustment (MBCn)."
]
},
{
Expand All @@ -490,132 +490,42 @@
"metadata": {},
"outputs": [],
"source": [
"# additive for tasmax\n",
"QDMtx = sdba.QuantileDeltaMapping.train(\n",
" dref.tasmax, dhist.tasmax, nquantiles=20, kind=\"+\", group=\"time\"\n",
"ADJ = sdba.MBCn.train(\n",
" dref,\n",
" dhist,\n",
" base_kws={\"nquantiles\": 20, \"group\": \"time\"},\n",
" adj_kws={\"interp\": \"nearest\", \"extrapolation\": \"constant\"},\n",
" n_iter=20, # perform 20 iteration\n",
" n_escore=1000, # only send 1000 points to the escore metric\n",
")\n",
"# Adjust both hist and sim, we'll feed both to the Npdf transform.\n",
"scenh_tx = QDMtx.adjust(dhist.tasmax)\n",
"scens_tx = QDMtx.adjust(dsim.tasmax)\n",
"\n",
"# remove == 0 values in pr:\n",
"dref[\"pr\"] = sdba.processing.jitter_under_thresh(dref.pr, \"0.01 mm d-1\")\n",
"dhist[\"pr\"] = sdba.processing.jitter_under_thresh(dhist.pr, \"0.01 mm d-1\")\n",
"dsim[\"pr\"] = sdba.processing.jitter_under_thresh(dsim.pr, \"0.01 mm d-1\")\n",
"\n",
"# multiplicative for pr\n",
"QDMpr = sdba.QuantileDeltaMapping.train(\n",
" dref.pr, dhist.pr, nquantiles=20, kind=\"*\", group=\"time\"\n",
")\n",
"# Adjust both hist and sim, we'll feed both to the Npdf transform.\n",
"scenh_pr = QDMpr.adjust(dhist.pr)\n",
"scens_pr = QDMpr.adjust(dsim.pr)\n",
"\n",
"scenh = xr.Dataset(dict(tasmax=scenh_tx, pr=scenh_pr))\n",
"scens = xr.Dataset(dict(tasmax=scens_tx, pr=scens_pr))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"##### Stack the variables to multivariate arrays and standardize them\n",
"The standardization process ensure the mean and standard deviation of each column (variable) is 0 and 1 respectively.\n",
"\n",
"`scenh` and `scens` are standardized together, so the two series are coherent. As we'll see further, we do not need to keep the mean and standard deviation, as we only keep the rank order information from the `NpdfTransform` output."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Stack the variables (tasmax and pr)\n",
"ref = sdba.processing.stack_variables(dref)\n",
"scenh = sdba.processing.stack_variables(scenh)\n",
"scens = sdba.processing.stack_variables(scens)\n",
"\n",
"# Standardize\n",
"ref, _, _ = sdba.processing.standardize(ref)\n",
"\n",
"allsim_std, _, _ = sdba.processing.standardize(xr.concat((scenh, scens), \"time\"))\n",
"scenh_std = allsim_std.sel(time=scenh.time)\n",
"scens_std = allsim_std.sel(time=scens.time)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"##### Perform the N-dimensional probability density function transform\n",
"\n",
"The NpdfTransform will iteratively randomly rotate our arrays in the \"variables\" space and apply the univariate adjustment before rotating it back. In Cannon (2018) and Pitié et al. (2005), it can be seen that the source array's joint distribution converges toward the target's joint distribution when a large number of iterations is done."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from xclim import set_options\n",
"\n",
"# See the advanced notebook for details on how this option work\n",
"with set_options(sdba_extra_output=True):\n",
" out = sdba.adjustment.NpdfTransform.adjust(\n",
" ref,\n",
" scenh_std,\n",
" scens_std,\n",
" base=sdba.QuantileDeltaMapping, # Use QDM as the univariate adjustment.\n",
" base_kws={\"nquantiles\": 20, \"group\": \"time\"},\n",
" n_iter=20, # perform 20 iteration\n",
" n_escore=1000, # only send 1000 points to the escore metric (it is really slow)\n",
"scenh, scens = (\n",
" ADJ.adjust(\n",
" sim=ds,\n",
" ref=dref,\n",
" hist=dhist,\n",
" base=sdba.QuantileDeltaMapping,\n",
" base_kws_vars={\n",
" \"pr\": {\n",
" \"kind\": \"*\",\n",
" \"jitter_under_thresh_value\": \"0.01 mm d-1\",\n",
" \"adapt_freq_thresh\": \"0.1 mm d-1\",\n",
" },\n",
" \"tasmax\": {\"kind\": \"+\"},\n",
" },\n",
" adj_kws={\"interp\": \"nearest\", \"extrapolation\": \"constant\"},\n",
" )\n",
"\n",
"scenh_npdft = out.scenh.rename(time_hist=\"time\") # Bias-adjusted historical period\n",
"scens_npdft = out.scen # Bias-adjusted future period\n",
"extra = out.drop_vars([\"scenh\", \"scen\"])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"##### Restoring the trend\n",
"\n",
"The NpdfT has given us new \"hist\" and \"sim\" arrays with a correct rank structure. However, the trend is lost in this process. We reorder the result of the initial adjustment according to the rank structure of the NpdfT outputs to get our final bias-adjusted series.\n",
"\n",
"`sdba.processing.reordering`: 'ref' the argument that provides the order, 'sim' is the argument to reorder."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"scenh = sdba.processing.reordering(scenh_npdft, scenh, group=\"time\")\n",
"scens = sdba.processing.reordering(scens_npdft, scens, group=\"time\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"scenh = sdba.processing.unstack_variables(scenh)\n",
"scens = sdba.processing.unstack_variables(scens)"
" for ds in [dhist, dsim]\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"##### There we are!\n",
"##### Let's trigger all the computations.\n",
"\n",
"Let's trigger all the computations. The use of `dask.compute` allows the three DataArrays to be computed at the same time, avoiding repeating the common steps."
"The use of `dask.compute` allows the three DataArrays to be computed at the same time, avoiding repeating the common steps."
]
},
{
Expand All @@ -629,7 +539,7 @@
"\n",
"with ProgressBar():\n",
" scenh, scens, escores = compute(\n",
" scenh.isel(location=2), scens.isel(location=2), extra.escores.isel(location=2)\n",
" scenh.isel(location=2), scens.isel(location=2), ADJ.ds.escores.isel(location=2)\n",
" )"
]
},
Expand Down Expand Up @@ -686,7 +596,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.4"
"version": "3.11.7"
},
"toc": {
"base_numbering": 1,
Expand Down
Loading