---
title: "stackr package"
output: github_document
---

<!-- badges: start -->
![GitHub R package version](https://img.shields.io/github/r-package/v/epiforecasts/stackr)
[![R-CMD-check](https://github.com/epiforecasts/stackr/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/epiforecasts/stackr/actions/workflows/R-CMD-check.yaml)
[![codecov](https://codecov.io/github/epiforecasts/stackr/branch/main/graph/badge.svg?token=rYeyG3kFIa)](https://codecov.io/github/epiforecasts/stackr)
![GitHub contributors](https://img.shields.io/github/contributors/epiforecasts/stackr)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
<!-- badges: end -->

# Overview
The `stackr` package provides an easy way to combine predictions
from individual time series or panel data models into an
ensemble. `stackr` stacks models according to the Continuous Ranked Probability
Score (CRPS) over k-step-ahead predictions, which makes it especially
suited for time series and panel data. A function for
leave-one-out CRPS may be added in the future. Predictions need to be
predictive distributions represented by predictive samples. Usually, these will
be sets of posterior predictive simulation draws generated by an MCMC
algorithm.
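
To illustrate how a CRPS can be computed from predictive samples alone, here is a minimal base R sketch of the standard sample-based estimator. The helper `crps_from_samples()` is hypothetical and only meant for illustration; it is not part of `stackr` and is not necessarily how the package computes scores internally.

```r
# Sample-based CRPS estimate for a single observation `y`, given a numeric
# vector `x` of predictive draws. Hypothetical helper, not part of stackr.
crps_from_samples <- function(y, x) {
  # average absolute error of the draws against the observation
  term_error <- mean(abs(x - y))
  # average absolute difference between all pairs of draws
  # (includes the zero diagonal, i.e. the simple 1 / m^2 estimator)
  term_spread <- mean(abs(outer(x, x, "-")))
  term_error - 0.5 * term_spread
}

set.seed(1)
draws_sharp <- rnorm(1000, mean = 0, sd = 1)
draws_wide <- rnorm(1000, mean = 0, sd = 5)

# lower is better: the sharper forecast centred on the observation scores lower
crps_from_samples(0, draws_sharp)
crps_from_samples(0, draws_wide)
```

For a forecast whose draws are all identical, the spread term vanishes and the estimate reduces to the absolute error.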

# Installation
Install the package from GitHub using

``` {r eval = FALSE}
devtools::install_github("epiforecasts/stackr")
```

# CRPS Stacking 
Given some training data with true observed values as well as predictive samples
generated from different models, `stackr` finds the weights that are optimal, in
the sense of minimizing expected cross-validation predictive error, for forming
an ensemble of these models. Using these weights, `stackr` can then provide
samples from the optimal model mixture by drawing from the predictive samples
of those models in the correct proportion. This gives a mixture model
based solely on predictive samples, which in this regard makes it superior to
other ensembling techniques such as Bayesian Model Averaging. More information
can be found in the package vignette.
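
As a rough sketch of what "optimal" means here: write $F_{k,t}$ for the predictive distribution of model $k$ at training point $t$ and $y_t$ for the observed value. CRPS stacking in the spirit of Yao et al. (2018) then looks for weights on the simplex that minimize the summed CRPS of the mixture. The notation is introduced for illustration only and is not taken from the package. The objective actually implemented may include further terms, so treat the vignette as authoritative.

$$
\hat{w} = \arg\min_{w \in \Delta^{K}} \sum_{t=1}^{T} \mathrm{CRPS}\Big(\sum_{k=1}^{K} w_k F_{k,t},\; y_t\Big),
\qquad
\Delta^{K} = \Big\{ w : w_k \ge 0,\ \sum_{k=1}^{K} w_k = 1 \Big\}
$$

$$
\mathrm{CRPS}\Big(\sum_{k} w_k F_{k},\; y\Big)
= \sum_{k} w_k \, \mathbb{E}\,|X_k - y|
- \frac{1}{2} \sum_{k} \sum_{l} w_k w_l \, \mathbb{E}\,|X_k - X'_l|,
\qquad X_k \sim F_k,\ X'_l \sim F_l \text{ independent}
$$

The second identity is what makes a purely sample-based approach possible: both expectations can be estimated from predictive draws.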

Weights are generated using the `crps_weights` function. With these weights 
and predictive samples, the `mixture_from_samples` function can be used to obtain 
predictive samples from the optimal mixture model.
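
The idea of drawing from the models "in the correct proportion" can be sketched in base R as follows. `draw_mixture()` is a hypothetical helper written for illustration under simple assumptions (one matrix column of draws per model, random rather than deterministic allocation). It is not the implementation behind `mixture_from_samples`.

```r
# Draw from a mixture of models given their predictive samples and weights.
# Hypothetical helper, not the stackr implementation.
# `samples`: numeric matrix with one column of predictive draws per model.
# `weights`: non-negative model weights that sum to one.
draw_mixture <- function(samples, weights, n_draws = nrow(samples)) {
  stopifnot(ncol(samples) == length(weights), all(weights >= 0))
  stopifnot(isTRUE(all.equal(sum(weights), 1)))
  # choose a model for every mixture draw with probability equal to its weight
  model_idx <- sample(seq_along(weights), size = n_draws,
                      replace = TRUE, prob = weights)
  # choose a random predictive draw from the selected model
  row_idx <- sample(nrow(samples), size = n_draws, replace = TRUE)
  samples[cbind(row_idx, model_idx)]
}

set.seed(42)
samples <- cbind(
  model_a = rnorm(2000, mean = 0),
  model_b = rnorm(2000, mean = 10)
)
mixture <- draw_mixture(samples, weights = c(0.8, 0.2))

# share of mixture draws that come from model_b, which should be close to 0.2
mean(mixture > 5)
```

Because the mixture is built only from existing draws, no model needs to be refitted once the weights are known.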

# Usage
## Load example data and split into train and test data
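``` {r eval = FALSE}
library("stackr")
library("data.table") # the subsetting below uses data.table syntax

# use everything up to the split date for training, the rest for testing
splitdate <- as.Date("2020-03-28")
traindata <- example_data[date <= splitdate]
testdata <- example_data[date > splitdate]
```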


## Get weights and create mixture 
``` {r eval = FALSE}
weights <- crps_weights(traindata)
test_mixture <- mixture_from_samples(testdata, weights = weights)
```

## Score predictions
``` {r eval = FALSE}
library("data.table") # provides rbindlist()
library("scoringutils")

# combine the mixture predictions with the predictions from the individual models
score_df <- rbindlist(list(testdata, test_mixture), fill = TRUE)

# score all predictions using scoringutils (github.com/epiforecasts/scoringutils)
score_df[, crps := crps(unique(observed), t(predicted)),
  by = .(geography, model, date)
]

# summarise scores: mean CRPS per model
score_df[, .(CRPS = mean(crps)), by = model]
```

# References
- Using Stacking to Average Bayesian Predictive Distributions, Yuling Yao, Aki Vehtari, Daniel Simpson, and Andrew Gelman, 2018, Bayesian Analysis 13, Number 3, pp. 917–1007, DOI 10.1214/17-BA1091
- Strictly Proper Scoring Rules, Prediction, and Estimation, 
Tilmann Gneiting and Adrian E. Raftery, 2007, Journal of the American
Statistical Association, Volume 102, Issue 477, DOI 10.1198/016214506000001437
- Comparing Bayes Model Averaging and Stacking When Model Approximation Error Cannot be Ignored, 
Bertrand Clarke, 2003, Journal of Machine Learning Research 4
- Bayesian Model Weighting: The Many Faces of Model Averaging, 
Marvin Höge, Anneli Guthke and Wolfgang Nowak, 2020, Water, DOI 10.3390/w12020309
- Bayesian Stacking and Pseudo-BMA weights using the loo package, 
Aki Vehtari and Jonah Gabry, 2019, https://mc-stan.org/loo/articles/loo2-weights.html


Contributors
---


<!-- ALL-CONTRIBUTORS-LIST:START - Do not remove or modify this section -->
<!-- prettier-ignore-start -->
<!-- markdownlint-disable -->

All contributions to this project are gratefully acknowledged using the [`allcontributors` package](https://github.com/ropensci/allcontributors) following the [all-contributors](https://allcontributors.org) specification. Contributions of any kind are welcome!

### Code


<a href="https://github.com/epiforecasts/stackr/commits?author=nikosbosse">nikosbosse</a>, 
<a href="https://github.com/epiforecasts/stackr/commits?author=sbfnk">sbfnk</a>, 
<a href="https://github.com/epiforecasts/stackr/commits?author=seabbs">seabbs</a>


### Issues


<a href="https://github.com/epiforecasts/stackr/issues?q=is%3Aissue+commenter%3Ajonathonmellor">jonathonmellor</a>


<!-- markdownlint-enable -->
<!-- prettier-ignore-end -->
<!-- ALL-CONTRIBUTORS-LIST:END -->