
Ensembling Models


The following is a brief demonstration of the ensemble models in Darts. Starting from the examples provided in the Quickstart Notebook, some advanced features and subtleties will be detailed.


The following topics are covered in this notebook:

  • Basics & references
  • Naive Ensembling
      ◦ Deterministic
      ◦ Covariates & multivariate series
      ◦ Probabilistic
  • Learned Ensembling
      ◦ Deterministic
      ◦ Probabilistic
      ◦ Bootstrapping
  • Pre-trained Ensembling

[1]:
# fix python path if working locally
from utils import fix_pythonpath_if_working_locally

fix_pythonpath_if_working_locally()
[2]:
%load_ext autoreload
%autoreload 2
%matplotlib inline
import matplotlib.pyplot as plt

from darts.models import (
    ExponentialSmoothing,
    KalmanForecaster,
    LinearRegressionModel,
    NaiveDrift,
    NaiveEnsembleModel,
    NaiveSeasonal,
    RandomForest,
    RegressionEnsembleModel,
    TCNModel,
)
from darts.metrics import mape
from darts.datasets import AirPassengersDataset
from darts.utils.timeseries_generation import (
    datetime_attribute_timeseries as dt_attr,
)
from darts.dataprocessing.transformers import Scaler

import warnings

warnings.filterwarnings("ignore")

import logging

logging.disable(logging.CRITICAL)

Basics & references


Ensembling combines the forecasts of several “weak” models to obtain a more robust and accurate model.


All of Darts’ ensembling models rely on the stacking technique (reference). They provide the same functionalities as the other forecasting models. Depending on the ensembled models, they can:

  • leverage covariates
  • be trained on multiple series
  • predict multivariate targets
  • generate probabilistic forecasts
  • and more…
[3]:
# using the AirPassengers dataset, directly available in darts
ts_air = AirPassengersDataset().load()
ts_air.plot()
[3]:
<Axes: xlabel='Month'>

../_images/examples_19-EnsembleModel-examples_4_1.png

Naive Ensembling


Naive ensembling simply takes the average (mean) of the forecasts generated by the ensembled forecasting models. Darts’ NaiveEnsembleModel accepts both local and global forecasting models (as well as combinations of the two, with some additional limitations).
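
As a quick illustration (a minimal sketch, not one of the notebook cells), the naive ensemble forecast is simply the average of the individual models’ forecasts:

# fit two naive models on the same series and average their predictions;
# this mirrors what NaiveEnsembleModel computes
m1, m2 = NaiveSeasonal(K=12), NaiveDrift()
m1.fit(ts_air)
m2.fit(ts_air)
mean_pred = (m1.predict(12) + m2.predict(12)) / 2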

[4]:
naive_ensemble = NaiveEnsembleModel(
    forecasting_models=[NaiveSeasonal(K=12), NaiveDrift()]
)

backtest = naive_ensemble.historical_forecasts(ts_air, start=0.6, forecast_horizon=3)

ts_air.plot(label="series")
backtest.plot(label="prediction")
print("NaiveEnsemble (naive) MAPE:", round(mape(backtest, ts_air), 5))

NaiveEnsemble (naive) MAPE: 11.87818

../_images/examples_19-EnsembleModel-examples_6_1.png

Note: after looking at each model’s MAPE, one would notice that NaiveSeasonal actually performs better on its own than ensembled with NaiveDrift. Checking the performance of the individual models, as sketched below, is in general good practice before defining an ensemble.
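
As a quick check, these individual scores can be obtained with the same backtesting call; a minimal sketch reusing ts_air and mape from the cells above:

# backtest each naive model on its own to compare with the ensemble's MAPE
for model in [NaiveSeasonal(K=12), NaiveDrift()]:
    bt = model.historical_forecasts(ts_air, start=0.6, forecast_horizon=3)
    print(f"{model.__class__.__name__} MAPE: {round(mape(bt, ts_air), 5)}")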


Before creating the new NaiveEnsemble, we will screen models to identify which ones would do well together. The candidates are:

  • LinearRegressionModel: classic and simple model
  • ExponentialSmoothing: moving window model
  • KalmanForecaster: a filter-based model
  • RandomForest: decision trees model

[5]:
candidates_models = {
    "LinearRegression": (LinearRegressionModel, {"lags": 12}),
    "ExponentialSmoothing": (ExponentialSmoothing, {}),
    "KalmanForecaster": (KalmanForecaster, {"dim_x": 12}),
    "RandomForest": (RandomForest, {"lags": 12, "random_state": 0}),
}

backtest_models = []

for model_name, (model_cls, model_kwargs) in candidates_models.items():
    model = model_cls(**model_kwargs)
    backtest_models.append(
        model.historical_forecasts(ts_air, start=0.6, forecast_horizon=3)
    )
    print(f"{model_name} MAPE: {round(mape(backtest_models[-1], ts_air), 5)}")

LinearRegression MAPE: 4.64008
ExponentialSmoothing MAPE: 4.44874
KalmanForecaster MAPE: 4.5539
RandomForest MAPE: 8.02264
[6]:
fig, axes = plt.subplots(2, 2, figsize=(9, 6))
for ax, backtest, model_name in zip(
    axes.flatten(),
    backtest_models,
    list(candidates_models.keys()),
):
    ts_air[-len(backtest) :].plot(ax=ax, label="ground truth")
    backtest.plot(ax=ax, label=model_name)

    ax.set_title(model_name)
    ax.set_ylim([250, 650])
plt.tight_layout()

../_images/examples_19-EnsembleModel-examples_9_0.png

The historical forecasts obtained with the LinearRegressionModel and KalmanForecaster look quite similar, whereas ExponentialSmoothing tends to underestimate the true values and RandomForest fails to capture the peaks. To benefit from the ensemble, we will favor diversity and continue with the LinearRegressionModel and ExponentialSmoothing.

[7]:
ensemble = NaiveEnsembleModel(
    forecasting_models=[LinearRegressionModel(lags=12), ExponentialSmoothing()]
)

backtest = ensemble.historical_forecasts(ts_air, start=0.6, forecast_horizon=3)

ts_air[-len(backtest) :].plot(label="series")
backtest.plot(label="prediction")
plt.ylim([250, 650])
print("NaiveEnsemble (v2) MAPE:", round(mape(backtest, ts_air), 5))

NaiveEnsemble (v2) MAPE: 4.04297

../_images/examples_19-EnsembleModel-examples_11_1.png

Compared to the individual models’ MAPE scores, 4.64008 for the LinearRegressionModel and 4.44874 for the ExponentialSmoothing, ensembling improved the accuracy to 4.04297!


Using covariates & predicting multivariate series


Depending on the forecasting models used, the EnsembleModel can of course also leverage covariates or forecast multivariate series! The covariates are passed only to the forecasting models supporting them.


In the example below, the ExponentialSmoothing model does not support any covariates, whereas the LinearRegressionModel supports future_covariates.

[8]:
ensemble = NaiveEnsembleModel(
    [LinearRegressionModel(lags=12, lags_future_covariates=[0]), ExponentialSmoothing()]
)

# encoding the months as integers, normalised
future_cov = dt_attr(ts_air.time_index, "month", add_length=12) / 12
backtest = ensemble.historical_forecasts(
    ts_air, future_covariates=future_cov, start=0.6, forecast_horizon=3
)

ts_air[-len(backtest) :].plot(label="series")
backtest.plot(label="prediction")
plt.ylim([250, 650])
print("NaiveEnsemble (w/ future covariates) MAPE:", round(mape(backtest, ts_air), 5))

NaiveEnsemble (w/ future covariates) MAPE: 4.07502

../_images/examples_19-EnsembleModel-examples_13_1.png

Probabilistic naive ensembling


Combining models that support probabilistic forecasts results in a probabilistic NaiveEnsembleModel! We can easily tweak the models used above to make them probabilistic and obtain confidence intervals in our forecasts:

[9]:
ensemble_probabilistic = NaiveEnsembleModel(
    forecasting_models=[
        LinearRegressionModel(
            lags=12,
            likelihood="quantile",
            quantiles=[0.05, 0.5, 0.95],
        ),
        ExponentialSmoothing(),
    ]
)

# must pass num_samples > 1 to obtain a probabilistic forecast
backtest = ensemble_probabilistic.historical_forecasts(
    ts_air, start=0.6, forecast_horizon=3, num_samples=100
)

ts_air[-len(backtest) :].plot(label="ground truth")
backtest.plot(label="prediction")
[9]:
<Axes: xlabel='time'>

../_images/examples_19-EnsembleModel-examples_15_1.png

Learned Ensembling


Ensembling can also be cast as a supervised regression problem: given a set of forecasts (features), find a model that combines them in order to minimise errors on the target. This is what the RegressionEnsembleModel does. The three main parameters are:

  • forecasting_models is a list of forecasting models whose predictions we want to ensemble.
  • regression_train_n_points is the number of time steps to use for fitting the “ensemble regression” model (i.e., the inner model that combines the forecasts).
  • regression_model is, optionally, a sklearn-compatible regression model or a Darts RegressionModel to be used for the ensemble regression. If not specified, Darts’ LinearRegressionModel is used. Using a sklearn model is easy out-of-the-box, but using one of Darts’ regression models makes it possible to take arbitrary lags of the individual forecasts as inputs of the regression model (see the sketch after this list).
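
For the third parameter, a minimal sketch (assuming the behavior described above): the individual forecasts are passed to the ensembling regression model as future covariates, so a Darts regression model used here takes them through lags_future_covariates, just as the probabilistic example further below does:

ensemble_custom = RegressionEnsembleModel(
    forecasting_models=[LinearRegressionModel(lags=12), ExponentialSmoothing()],
    regression_train_n_points=12,
    # a Darts regression model as the ensembling regressor; lag 0 means
    # "use each forecasting model's forecast for the current time step"
    regression_model=LinearRegressionModel(lags_future_covariates=[0]),
)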

Once these elements are in place, a RegressionEnsembleModel can be used like a regular forecasting model:

[10]:
ensemble_model = RegressionEnsembleModel(
    forecasting_models=[NaiveSeasonal(K=12), NaiveDrift()],
    regression_train_n_points=12,
)

backtest = ensemble_model.historical_forecasts(ts_air, start=0.6, forecast_horizon=3)

ts_air.plot()
backtest.plot()

print("RegressionEnsemble (naive) MAPE:", round(mape(backtest, ts_air), 5))

RegressionEnsemble (naive) MAPE: 4.85142

../_images/examples_19-EnsembleModel-examples_17_1.png

Compared to the MAPE of 11.87818 obtained at the beginning of the naive ensembling section, adding a LinearRegressionModel on top of the two naive models does improve the quality of the forecast.


Now, let’s see if we can observe a similar gain when the RegressionEnsemble’s forecasting models are not naive:

[11]:
ensemble = RegressionEnsembleModel(
    forecasting_models=[LinearRegressionModel(lags=12), ExponentialSmoothing()],
    regression_train_n_points=12,
)

backtest = ensemble.historical_forecasts(ts_air, start=0.6, forecast_horizon=3)

ts_air.plot(label="series")
backtest.plot(label="prediction")

print("RegressionEnsemble (v2) MAPE:", round(mape(backtest, ts_air), 5))

RegressionEnsemble (v2) MAPE: 4.63334

../_images/examples_19-EnsembleModel-examples_19_1.png

Interestingly, even though the MAPE improved compared to the RegressionEnsemble relying on naive models (MAPE: 4.85142), it does not outperform the NaiveEnsemble using similar forecasting models (MAPE: 4.04297).


This performance gap is partially caused by the points set aside to train the ensembling LinearRegression: the two forecasting models (LinearRegression and ExponentialSmoothing) cannot access the latest values of the series, which contain a marked upward trend.

Out of curiosity, we can use the Ridge regression model from the sklearn library to ensemble the forecasts:

[12]:
from sklearn.linear_model import Ridge

ensemble = RegressionEnsembleModel(
    forecasting_models=[LinearRegressionModel(lags=12), ExponentialSmoothing()],
    regression_train_n_points=12,
    regression_model=Ridge(),
)

backtest = ensemble.historical_forecasts(ts_air, start=0.6, forecast_horizon=3)

print("RegressionEnsemble (Ridge) MAPE:", round(mape(backtest, ts_air), 5))

RegressionEnsemble (Ridge) MAPE: 6.46803

In this particular scenario, using a regression model with a regularization term deteriorated the forecasts, but there might be other cases where it improves them.


Training using historical forecasts


When predicting a number of values greater than their output_chunk_length, GlobalForecastingModels rely on auto-regression (using their own output as input) to forecast values far in the future. However, the quality of the forecasts can decrease considerably as the predicted timestamps get further from the end of the observations. During the training of RegressionEnsemble’s regression model, the forecasting models generate forecasts for timestamps where the ground truth is actually known and available, making it possible to use historical_forecasts() instead of predict().


This can be activated with train_using_historical_forecasts=True.


Under the hood, the ensemble model will trigger historical forecasting for each model with forecast_horizon=model.output_chunk_length, stride=model.output_chunk_length, last_points_only=False, and overlap_end=False to predict the last regression_train_n_points points of the target series.
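
To make this concrete, here is a minimal sketch of the equivalent call issued for each forecasting model (model stands for one of the ensembled global models; this illustrates the description above, not the exact internal code):

# historical forecasts with non-overlapping horizons covering the
# last regression_train_n_points points of the target series
hist_fct = model.historical_forecasts(
    ts_air,
    forecast_horizon=model.output_chunk_length,
    stride=model.output_chunk_length,
    last_points_only=False,
    overlap_end=False,
)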

[13]:
# replacing the ExponentialSmoothing (local) with RandomForest (global)
ensemble = RegressionEnsembleModel(
    forecasting_models=[
        LinearRegressionModel(lags=12),
        RandomForest(lags=12, random_state=0),
    ],
    regression_train_n_points=12,
    train_using_historical_forecasts=False,
)
backtest = ensemble.historical_forecasts(ts_air, start=0.6, forecast_horizon=3)

ensemble_hist_fct = RegressionEnsembleModel(
    forecasting_models=[
        LinearRegressionModel(lags=12),
        RandomForest(lags=12, random_state=0),
    ],
    regression_train_n_points=12,
    train_using_historical_forecasts=True,
)
backtest_hist_fct = ensemble_hist_fct.historical_forecasts(
    ts_air, start=0.6, forecast_horizon=3
)

print("RegressionEnsemble (no hist_fct) MAPE:", round(mape(backtest, ts_air), 5))
print("RegressionEnsemble (hist_fct) MAPE:", round(mape(backtest_hist_fct, ts_air), 5))

RegressionEnsemble (no hist_fct) MAPE: 5.7016
RegressionEnsemble (hist_fct) MAPE: 5.12017

As expected, using the forecasting models’ historical forecasts to train the regression model produces better forecasts.


Probabilistic regression ensemble


In order to be probabilistic, the RegressionEnsembleModel must use a probabilistic ensembling regression model (see the table in the README):

[14]:
ensemble = RegressionEnsembleModel(
    forecasting_models=[LinearRegressionModel(lags=12), ExponentialSmoothing()],
    regression_train_n_points=12,
    regression_model=LinearRegressionModel(
        lags_future_covariates=[0], likelihood="quantile", quantiles=[0.05, 0.5, 0.95]
    ),
)

backtest = ensemble.historical_forecasts(
    ts_air, start=0.6, forecast_horizon=3, num_samples=100
)

ts_air[-len(backtest) :].plot(label="ground truth")
backtest.plot(label="prediction")

print("RegressionEnsemble (probabilistic) MAPE:", round(mape(backtest, ts_air), 5))

RegressionEnsemble (probabilistic) MAPE: 5.15071

../_images/examples_19-EnsembleModel-examples_25_1.png

Bootstrapping regression ensemble


When the forecasting models of a RegressionEnsembleModel are probabilistic, the samples dimension of their forecasts is reduced and used as covariates for the ensembling regression. Since the ensembling regression model is deterministic, the generated forecast is deterministic as well.

[15]:
ensemble = RegressionEnsembleModel(
    forecasting_models=[
        LinearRegressionModel(
            lags=12, likelihood="quantile", quantiles=[0.05, 0.5, 0.95]
        ),
        ExponentialSmoothing(),
    ],
    regression_train_n_points=12,
    regression_train_num_samples=100,
    regression_train_samples_reduction="median",
)

backtest = ensemble.historical_forecasts(ts_air, start=0.6, forecast_horizon=3)

ts_air[-len(backtest) :].plot(label="ground truth")
backtest.plot(label="prediction")

print("RegressionEnsemble (bootstrap) MAPE:", round(mape(backtest, ts_air), 5))

RegressionEnsemble (bootstrap) MAPE: 5.10138

../_images/examples_19-EnsembleModel-examples_27_1.png

Pre-trained Ensembling


As both NaiveEnsembleModel and RegressionEnsembleModel accept GlobalForecastingModels as forecasting models, they can be used to ensemble pre-trained deep learning and regression models. Note that this functionality is only supported if all the ensembled forecasting models are instances of the GlobalForecastingModel class and are already trained when creating the ensemble.


Disclaimer: be careful not to pre-train the models with data used during validation, as this would introduce considerable bias.


Note: the parameters for the TCNModel are heavily inspired by the TCNModel example notebook.

[16]:
# holding out values for validation
train, val = ts_air.split_after(0.8)

# scaling the target
scaler = Scaler()
train = scaler.fit_transform(train)
val = scaler.transform(val)

# use the month as a covariate
month_series = dt_attr(ts_air.time_index, attribute="month", one_hot=True)
scaler_month = Scaler()
month_series = scaler_month.fit_transform(month_series)

# training a regular linear regression, without any covariates
linreg_model = LinearRegressionModel(lags=24)
linreg_model.fit(train)

# instantiating a TCN model with parameters optimized for the AirPassengers dataset
tcn_model = TCNModel(
    input_chunk_length=24,
    output_chunk_length=12,
    n_epochs=500,
    dilation_base=2,
    weight_norm=True,
    kernel_size=5,
    num_filters=3,
    random_state=0,
)
tcn_model.fit(train, past_covariates=month_series)
[16]:
TCNModel(kernel_size=5, num_filters=3, num_layers=None, dilation_base=2, weight_norm=True, dropout=0.2, input_chunk_length=24, output_chunk_length=12, n_epochs=500, random_state=0)

As a sanity check, we will look at the forecasts of each model taken individually:

[17]:
# individual model forecasts
pred_linreg = linreg_model.predict(24)
pred_tcn = tcn_model.predict(24, verbose=False)

# scaling them back
pred_linreg_rescaled = scaler.inverse_transform(pred_linreg)
pred_tcn_rescaled = scaler.inverse_transform(pred_tcn)

# plotting
ts_air[-24:].plot(label="ground truth")
pred_linreg_rescaled.plot(label="LinearRegressionModel")
pred_tcn_rescaled.plot(label="TCNModel")
plt.show()

../_images/examples_19-EnsembleModel-examples_31_0.png

Now that we have a good idea of the individual performance of each of these models, we can ensemble them. We must make sure to set train_forecasting_models=False, or the ensemble will need to be fitted before predict() can be called.


Advice: use the save() method to export your model and keep a copy of your weights, as sketched below.
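
For example, a minimal sketch using Darts’ save() and load() (the file names are arbitrary):

# persist the pre-trained models to disk
tcn_model.save("tcn_model.pt")
linreg_model.save("linreg_model.pkl")

# ...restore them later, e.g. in another session
tcn_model_loaded = TCNModel.load("tcn_model.pt")
linreg_model_loaded = LinearRegressionModel.load("linreg_model.pkl")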

[18]:
naive_ensemble = NaiveEnsembleModel(
    forecasting_models=[tcn_model, linreg_model], train_forecasting_models=False
)
# NaiveEnsemble initialized with pre-trained models can call predict() directly,
# the `series` argument must however be provided
pred_naive = naive_ensemble.predict(len(val), train)

pretrain_ensemble = RegressionEnsembleModel(
    forecasting_models=[tcn_model, linreg_model],
    regression_train_n_points=24,
    train_forecasting_models=False,
    train_using_historical_forecasts=False,
)
# RegressionEnsemble must train the ensemble model, even if the forecasting models are already trained
pretrain_ensemble.fit(train)
pred_ens = pretrain_ensemble.predict(len(val))

# scaling back the predictions
pred_naive_rescaled = scaler.inverse_transform(pred_naive)
pred_ens_rescaled = scaler.inverse_transform(pred_ens)

# plotting
plt.figure(figsize=(8, 5))
scaler.inverse_transform(val).plot(label="ground truth")
pred_naive_rescaled.plot(label="pre-trained NaiveEnsemble")
pred_ens_rescaled.plot(label="pre-trained RegressionEnsemble")
plt.ylim([250, 650])

# MAPE
print("LinearRegression MAPE:", round(mape(pred_linreg_rescaled, ts_air), 5))
print("TCNModel MAPE:", round(mape(pred_tcn_rescaled, ts_air), 5))
print("NaiveEnsemble (pre-trained) MAPE:", round(mape(pred_naive_rescaled, ts_air), 5))
print(
    "RegressionEnsemble (pre-trained) MAPE:", round(mape(pred_ens_rescaled, ts_air), 5)
)

LinearRegression MAPE: 3.91311
TCNModel MAPE: 4.70491
NaiveEnsemble (pre-trained) MAPE: 3.82837
RegressionEnsemble (pre-trained) MAPE: 3.61749

../_images/examples_19-EnsembleModel-examples_33_1.png

Conclusion


Ensembling the pre-trained LinearRegression and TCNModel models allowed us to outperform the single models, and training a linear regression on top of these two models’ forecasts further improved the MAPE score.


While the gains remain limited on this small dataset, ensembling is a powerful technique that can yield impressive results; it was notably used by the winners of the 4th edition of the Makridakis Competition (website, github repository).
