- Issue when writing extremely large forecasts to disk
- Tidymodels speed up
- Added external regressor support for ARIMA by introducing a new model option of `arimax`, which uses engineered features in addition to any external regressors supplied.
- Automated feature selection, refer to feature selection vignette for more details
- Error handling in hierarchical forecast reconciliation
- Box-cox and differencing transformations
- Added new function, `list_models()`, that lists available models in the package
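
As a quick illustration, the call below prints the available models. It is a minimal sketch that assumes the finnts package is installed and that `list_models()` takes no arguments and returns the model names as a vector.

```r
library(finnts)

# Assumption: list_models() needs no arguments and returns model names
models <- list_models()
print(models)
```

The returned names are the kind of values one would expect to pass to a `models_to_run` argument, though the exact set depends on the installed version.
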
- Best model selection
- Hierarchical forecast reconciliation
- Spark data frame support. Initial input data can now be a spark data frame, enabling millions of time series to be run across a spark compute cluster.
- Updated train/validation/test process for multivariate ML models.
- In addition to existing `forecast_time_series()`, added new sub components of the finnts forecast process that can be called separately or in a production pipeline. Allows for more control of the forecast process.
  - `prep_data()`
  - `prep_models()`
  - `train_models()`
  - `ensemble_models()`
  - `final_models()`
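
The sub components above can be chained roughly as follows. This is a sketch under stated assumptions, not the package's reference workflow: the argument names are assumed to match the current function documentation (check `?prep_data` and the other help pages), the sample data comes from `timetk::m4_monthly`, and the experiment and run names are made up for illustration.

```r
library(finnts)

# Sample monthly data; finnts is assumed to expect a date column named "Date"
hist_data <- timetk::m4_monthly
hist_data <- dplyr::rename(hist_data, Date = date)
hist_data <- dplyr::mutate(hist_data, id = as.character(id))

# Hypothetical experiment/run names; outputs go to a temporary location
run_info <- set_run_info(
  experiment_name = "finn_sub_component_demo",
  run_name = "demo_run"
)

# Step 1: clean the data and engineer features
prep_data(
  run_info = run_info,
  input_data = hist_data,
  combo_variables = c("id"),
  target_variable = "value",
  date_type = "month",
  forecast_horizon = 3
)

# Step 2: set up model workflows and hyperparameters
prep_models(
  run_info = run_info,
  models_to_run = c("arima", "ets"),
  num_hyperparameters = 1
)

# Steps 3-5: train individual models, build ensembles, pick final models
train_models(run_info = run_info)
ensemble_models(run_info = run_info)
final_models(run_info = run_info)

# Read the forecast back from the run's outputs
finn_output_tbl <- get_forecast_data(run_info = run_info)
```

Because each step reads and writes through `run_info`, a failed step can in principle be rerun without repeating the earlier ones.
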
- Automated read and write capabilities. Intermediate and final Finn outputs are now automatically written to disk (see options below). This creates better MLOps capabilities, easier scale on spark, and better fault tolerance by not needing to start the whole forecast process over from scratch if an error occurred.
  - Temporary location on local machine, which will then get deleted after R session is closed.
  - Path on local machine or a mounted Azure Data Lake Storage path in spark to save the intermediate and final Finn run results.
  - Azure Blob Storage to store non-spark runs on a data lake.
  - SharePoint/OneDrive storage to store non-spark runs within M365.
- New MLOps features that allow you to retrieve the final trained models through `get_trained_models()`, get specific run information through `get_run_info()`, and even retrieve the initial feature engineered data through `get_prepped_data()`.
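
A hedged sketch of how these retrieval functions might be used after a run. The argument names (`run_info`, `experiment_name`, `recipe`) and the `"R1"` recipe label are assumptions based on the package documentation, and the experiment and run names are hypothetical.

```r
library(finnts)

# Sample monthly data; finnts is assumed to expect a date column named "Date"
hist_data <- timetk::m4_monthly
hist_data <- dplyr::rename(hist_data, Date = date)
hist_data <- dplyr::mutate(hist_data, id = as.character(id))

# Hypothetical names; with no path supplied, outputs go to a temporary location
run_info <- set_run_info(
  experiment_name = "finn_mlops_demo",
  run_name = "demo_run"
)

# Run a small forecast so there is something to retrieve
forecast_time_series(
  run_info = run_info,
  input_data = hist_data,
  combo_variables = c("id"),
  target_variable = "value",
  date_type = "month",
  forecast_horizon = 3,
  models_to_run = c("arima", "ets"),
  run_global_models = FALSE
)

# Final trained models for the run
trained_models_tbl <- get_trained_models(run_info = run_info)

# Logged run information for the experiment (assumed to search the same
# default temporary location used by set_run_info())
run_info_tbl <- get_run_info(experiment_name = "finn_mlops_demo")

# Feature engineered data for one recipe (assumed label "R1")
prepped_data_tbl <- get_prepped_data(run_info = run_info, recipe = "R1")
```
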
- `run_model_parallel` has been replaced with `inner_parallel` within `forecast_time_series()`
- Data is no longer returned as a list when running `forecast_time_series()`. Instead, please use `get_forecast_data()` to retrieve Finn forecast outputs.
- Azure Batch parallel processing is no longer supported; please use spark instead.
- Parallel processing through spark now needs a mounted Azure Data Lake Storage path supplied through `set_run_info()`. Please refer to the vignettes for more details.
- Fixed dependency issue with timetk.
- Removed package dependency modeltime.gluonts and its deep learning models because the package is no longer on CRAN.
- Fixed hierarchical forecast reconciliation issues for certain forecasts that have high residuals.
- Compliant with latest dplyr v1.1.0
- Fixed feature engineering issue around NaN/Inf values when computing log values of negative external regressor values.
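
To see why negative regressor values cause trouble, the base R sketch below takes logs of a made-up regressor and then replaces the non-finite results with `NA`. This illustrates the problem and one possible guard; it is not necessarily how the package itself handles it.

```r
# Made-up external regressor with a negative value and a zero
xreg <- c(-5, 0, 10, 100)

# log() of a negative number is NaN (with a warning); log(0) is -Inf
logged <- suppressWarnings(log(xreg))

# One possible guard: keep finite values, turn NaN/Inf/-Inf into NA
cleaned <- ifelse(is.finite(logged), logged, NA_real_)

print(logged)
print(cleaned)
```
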
- Fixed issue of ensuring random seed is set correctly in parallel processing.
- Added spark support to run Finn in parallel on Azure Databricks or Azure Synapse.
- Added error handling when creating simple model averages. Should allow forecast to keep running even if there are memory issues when averaging individual forecast models, which helps on large data sets.
- Expanded Azure Batch task timeout from one day to one week. Prevents errors when running large forecasts that take over a day to run in Azure Batch.
- Deprecated azure_batch parallel compute option within forecast_time_series function since the Azure Batch R packages are deprecated. Please use the new integration with spark on Azure.
- Changed default behavior to only run R1 feature engineering recipe when the argument run_global_models is set to TRUE or NULL and recipes_to_run is set to NULL in the forecast_time_series function. Running R2 recipe with global models on large data sets often results in RAM issues when running in Azure Batch.
- Fixed error when converting infinite values to NA values after model forecasts are created.
- Changed the cubist model to reference the new cubist model definition in parsnip package.
- Fixed bug in hierarchical forecasting. Missing values in the hierarchy are converted from NA to zero, which fixes how data is aggregated at various levels of hierarchy.
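
The effect of missing values on aggregation can be shown with a small base R sketch. The data frame and its column names are made up for illustration, and this is not the package's internal code.

```r
# Made-up bottom-level data: two regions, four stores, one missing value
sales <- data.frame(
  region = c("east", "east", "west", "west"),
  store = c("a", "b", "c", "d"),
  value = c(10, NA, 5, 7)
)

# Summing with an NA present makes the whole "east" total NA
before <- tapply(sales$value, sales$region, sum)

# Converting NA to zero first lets the higher level aggregate cleanly
sales$value[is.na(sales$value)] <- 0
after <- tapply(sales$value, sales$region, sum)

print(before)
print(after)
```
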
- Initial CRAN Release