Example showing sea surface temperature decomposed via EOF analysis, Varimax rotation and Promax rotation.
Empirical orthogonal function (EOF) analysis, more commonly known as principal component analysis (PCA), is one of the most popular methods for dimension reduction and structure identification in Earth system sciences. Due to this popularity, a number of different EOF variants have been developed over the last few years, either to mitigate some pitfalls of ordinary EOF analysis (e.g. orthogonality, interpretability, linearity) or to broaden its scope (e.g. multivariate variants).
Currently, there are several implementations of EOF analysis on GitHub that facilitate the adoption and application of this method by the broader scientific community. Each of these implementations has its own strengths, which are highlighted below (please let me know if I forgot any):
Package | xeofs | eofs | pyEOF | xeof | xMCA | xmca2 |
---|---|---|---|---|---|---|
EOF analysis | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
Rotated EOF analysis | ✅ | ❌ | ✅ | ❌ | ❌ | ✅ |
Complex EOF analysis | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ |
ROCK-PCA | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ |
Multivariate EOF | ✅ | ✅ | ❌ | ❌ | ❌ | ❌ |
MCA | ✅ | ❌ | ❌ | ❌ | ✅ | ✅ |
Rotated MCA | ✅ | ❌ | ❌ | ❌ | ❌ | ✅ |
Complex MCA | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ |
Multivariate MCA | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ |
Package | xeofs | eofs | pyEOF | xeof | xMCA | xmca2 |
---|---|---|---|---|---|---|
numpy interface | ✅ | ✅ | ❌ | ❌ | ❌ | ✅ |
pandas interface | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ |
xarray interface | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
Fast algorithm | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ |
Dask support | ❌ | ✅ | ❌ | ✅ | ❌ | ❌ |
Multi-dimensional | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ |
Significance analysis | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ |
The goal of xeofs is to merge these different implementations and to simplify the integration of other existing and future variants of EOF analysis thanks to its modular code structure.
The official name is deliberately chosen to be similar to the other implementations to make it clear that xeofs is nothing revolutionary in itself. The point is not to distinguish this implementation from the others, but rather to unify (and extend) already existing implementations.
This project is intended to be a collaborative project of the scientific community and the contribution of EOF variants in the form of pull requests is explicitly encouraged. If you are interested, just contact me or open an Issue.
If you are using conda, it is recommended to install via:
conda install -c conda-forge xeofs
Alternatively, you can install the package through pip:
pip install xeofs
The documentation is a work in progress. Meanwhile, check out some examples to get started:
- EOF analysis (S-mode)
- EOF analysis (T-mode)
- Rotated EOF analysis (Varimax, Promax)
- Weighted EOF analysis
- Multivariate EOF analysis
- Significance analysis via bootstrapping
- Maximum Covariance Analysis
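To make the first and third items above more concrete, here is a minimal sketch of ordinary EOF analysis and Varimax rotation written in plain NumPy. It is an illustration of the underlying linear algebra only and does not use or reproduce the xeofs API; the function names `eof_analysis` and `varimax`, their parameters, and the random input data are assumptions made for this example.

```python
import numpy as np


def eof_analysis(data, n_modes):
    """EOF analysis of a (n_time, n_space) array via SVD (illustrative helper)."""
    # Remove the temporal mean at every spatial point
    anomalies = data - data.mean(axis=0)
    # Thin SVD: anomalies = u @ diag(s) @ vt
    u, s, vt = np.linalg.svd(anomalies, full_matrices=False)
    # Eigenvalues of the sample covariance matrix
    variance = s**2 / (data.shape[0] - 1)
    eofs = vt[:n_modes].T               # spatial patterns, orthonormal columns
    pcs = u[:, :n_modes] * s[:n_modes]  # time series (projections onto the EOFs)
    ratio = variance[:n_modes] / variance.sum()
    return eofs, pcs, variance[:n_modes], ratio


def varimax(loadings, max_iter=100, tol=1e-8):
    """Orthogonal Varimax rotation of a (n_space, n_modes) loading matrix."""
    n_rows, n_modes = loadings.shape
    rotation = np.eye(n_modes)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        # Gradient of the Varimax criterion with respect to the rotation
        column_power = np.diag((rotated**2).sum(axis=0))
        grad = loadings.T @ (rotated**3 - rotated @ column_power / n_rows)
        u, s, vt = np.linalg.svd(grad)
        rotation = u @ vt  # closest orthogonal matrix to the gradient
        previous, criterion = criterion, s.sum()
        if previous != 0 and criterion / previous < 1 + tol:
            break
    return loadings @ rotation, rotation


# Synthetic stand-in for a (time, space) field; not real SST data
rng = np.random.default_rng(0)
data = rng.standard_normal((120, 30))

eofs, pcs, variance, ratio = eof_analysis(data, n_modes=3)
# Scale EOFs by the square root of their variance before rotating
loadings = eofs * np.sqrt(variance)
rotated, rotation = varimax(loadings)
```

The rotation is orthogonal, so the total variance carried by the retained modes is unchanged; it is only redistributed among them to obtain more localized patterns. Promax would add an oblique step on top of the Varimax solution, which is not shown here.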
- to Andrew Dawson for the first and fundamental Python package for EOF analysis
- to Yefee, from whom I took some inspiration to implement MCA
- to James Chapman who created a great Python package for Canonical Correlation Analysis
- to Diego Bueso for his open-source ROCK-PCA implementation in Matlab
- to yngvem for how to organize the project folder structure
- to all the developers of NumPy, pandas & xarray for their invaluable contributions to science
Please make sure that when using xeofs you always cite the original source of the method used. Additionally, if you find xeofs useful for your research, you may cite it as follows:
@software{rieger_xeofs_2022,
  title = {xeofs: Multi-dimensional {EOF} analysis and variants in xarray},
  url = {https://github.com/nicrie/xeofs},
  version = {0.6.0},
  author = {Rieger, Niclas},
  date = {2022},
  doi = {10.5281/zenodo.6323011}
}