Clustering
Clustering is a family of unsupervised learning algorithms. The aim of cluster analysis is to partition a given set of observations into subgroups whose members are as similar as possible to each other. Clustering techniques are routinely used in climate science for circulation weather typing. Circulation types (CTs) and weather types (WTs) provide a classification of atmospheric circulation patterns in synoptic climatology. In a nutshell, CTs refer to clusters of variables at atmospheric levels (including surface atmospheric pressure), and WTs refer to clusters of variables at surface level (e.g., precipitation or near-surface temperature).
The function `clusterGrid` is the workhorse for applying the different clustering techniques. Its primary input is a climate4R grid, possibly containing multiple variables (a multigrid) and/or members. The basic preprocessing operations are handled internally to ease its use; for example, the input variables are scaled via the `scaleGrid` function. In addition to model training (i.e., cluster analysis of a given dataset), prediction on new data is straightforward via the `newdata` argument, which makes the function suitable for applications such as seasonal forecasting or climate change studies. The output is a climate4R grid containing either the training or the prediction data, with the results of the cluster analysis stored as attributes. Of these, the attribute `wt.index` is likely to be of most interest. It is also possible to assign the computed CTs, on a daily basis, to an additional input grid of an arbitrary variable in order to obtain WTs (argument `y`).
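The train-then-predict workflow described above can be illustrated with a minimal base R sketch. It does not call `clusterGrid` and is not its implementation: the synthetic matrices, the helper `assign_cluster`, and the choice of k-means with three clusters are assumptions made purely for illustration. It only mirrors the ideas in the text: scale the data, cluster it, store the cluster index as a `wt.index` attribute, assign new data to the trained clusters, and summarise a second variable by cluster.

```r
set.seed(1)

# Synthetic stand-in for a flattened grid: 60 "days" x 4 "grid points",
# drawn around three well-separated centres (hypothetical data).
centres <- rbind(c(0, 0, 0, 0), c(10, 10, 10, 10), c(-10, 10, -10, 10))
truth <- rep(1:3, each = 20)
train <- centres[truth, ] + matrix(rnorm(60 * 4), ncol = 4)

# Scale the training data and keep the parameters so that new data
# can be scaled in exactly the same way.
mu <- colMeans(train)
sigma <- apply(train, 2, sd)
train_scaled <- scale(train, center = mu, scale = sigma)

# Training step: k-means with 3 clusters (k is an arbitrary choice here).
fit <- kmeans(train_scaled, centers = 3, nstart = 25)

# Store the per-day cluster index as an attribute of the data,
# analogous in spirit to the wt.index attribute mentioned in the text.
result <- train
attr(result, "wt.index") <- fit$cluster

# Prediction step (hypothetical helper): assign each new observation
# to the nearest trained centroid in the scaled space.
assign_cluster <- function(x, fit, mu, sigma) {
  x_scaled <- scale(x, center = mu, scale = sigma)
  unname(apply(x_scaled, 1, function(row) {
    which.min(colSums((t(fit$centers) - row)^2))
  }))
}

newdata <- centres[c(3, 1, 2), ] + matrix(rnorm(3 * 4, sd = 0.1), ncol = 4)
new_index <- assign_cluster(newdata, fit, mu, sigma)

# Summarise a second daily variable by cluster, loosely analogous to
# mapping circulation types onto a surface variable.
y <- rnorm(60, mean = truth * 5)
composites <- tapply(y, attr(result, "wt.index"), mean)
```

Reusing the training means and standard deviations for the new data is the key design choice: scaling new data with its own statistics would shift it relative to the trained centroids and could change the assignment.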
The specific clustering techniques currently available through `clusterGrid` are listed next. Follow the links for more specific details and worked examples.
transformeR - Santander MetGroup (Univ. Cantabria - CSIC)