Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feat/specify lags per component for RegressionModel #1962

Merged
merged 38 commits into from
Sep 14, 2023

Conversation

madtoinou
Copy link
Collaborator

@madtoinou madtoinou commented Aug 21, 2023

Fixes #1110.

Summary

  • RegressionModel's constructor accepts Dict[str, Sequence[int]] for the arguments lags, lags_past_covariates and lags_future_covariates. The keys in the dictionary must correspond to the name of the components of the series used for training (or the first series when using multiple series) as well as the components generated by the encoders (when applicable).
  • the key "default_lags" can be used to define a set of lags for the components not indicated in the dictionnary
  • lagged_features_name attribute was updated accordingly
  • added tests, making sure the forecasts are the same when using equivalent definitions of the lags using int/list versus dict.
  • added sanity checks for multiple series training, checking that they all contain the same number of components.

Other Information

  • The order of the components (as well as coefficient & features names) is different when using the dictionary definition of lags : instead of being grouped by lags, they are grouped by components.
  • only _create_lagged_data_by_moving_window supports extracting the values for each component independently, it's the function used by default in RegressionModel._create_lagged_data and the user cannot change it from the API.
  • The ShapExplainer is working as intented with component-specific lags.

@madtoinou madtoinou requested a review from dennisbader as a code owner August 21, 2023 15:38
@codecov-commenter
Copy link

codecov-commenter commented Aug 21, 2023

Codecov Report

Patch coverage is 97.65% of modified lines.

❗ Current head edf8554 differs from pull request most recent head 1235e59. Consider uploading reports for the commit 1235e59 to get more accurate results

❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Files Changed Coverage
darts/models/forecasting/regression_model.py 96.90%
darts/explainability/shap_explainer.py 100.00%
darts/models/forecasting/lgbm.py 100.00%
...arts/models/forecasting/linear_regression_model.py 100.00%
darts/models/forecasting/random_forest.py 100.00%
darts/models/forecasting/xgboost.py 100.00%
darts/utils/data/tabularization.py 100.00%

📢 Thoughts on this report? Let us know!.

Copy link
Collaborator

@dennisbader dennisbader left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot @madtoinou, this looks great 🚀

Only had some suggestions about simplifying the lag handling in the model constructor, and some docs rephrasing here and there.

darts/models/forecasting/forecasting_model.py Outdated Show resolved Hide resolved
darts/models/forecasting/regression_model.py Outdated Show resolved Hide resolved
darts/models/forecasting/regression_model.py Outdated Show resolved Hide resolved
darts/models/forecasting/regression_model.py Outdated Show resolved Hide resolved
darts/models/forecasting/regression_model.py Outdated Show resolved Hide resolved
darts/models/forecasting/regression_model.py Outdated Show resolved Hide resolved
darts/utils/data/tabularization.py Outdated Show resolved Hide resolved
darts/utils/data/tabularization.py Outdated Show resolved Hide resolved
darts/tests/models/forecasting/test_regression_models.py Outdated Show resolved Hide resolved
@madtoinou madtoinou requested a review from dennisbader August 28, 2023 11:51
Copy link
Collaborator

@dennisbader dennisbader left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, just a last few suggestions, then we can merge :)

darts/utils/data/tabularization.py Show resolved Hide resolved
# cannot use 'default_lags' because it's converted in `fit()`, before calling `_created_lagged_data`
model_instance = RegressionModel(
lags={"0-trgt-0": [-4, -3], "0-trgt-1": [-3, -2], "0-trgt-2": [-2, -1]},
lags_past_covariates={"0-pcov-0": [-10], "0-pvoc-1": [-7]},
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't the second component's name be "0-pcov-1"? Is it a bug that the model doesn't raise an error for a wrong component name?

Suggested change
lags_past_covariates={"0-pcov-0": [-10], "0-pvoc-1": [-7]},
lags_past_covariates={"0-pcov-0": [-10], "0-pcov-1": [-7]},

darts/tests/models/forecasting/test_regression_models.py Outdated Show resolved Hide resolved
Dockerfile Outdated Show resolved Hide resolved
darts/models/forecasting/regression_model.py Outdated Show resolved Hide resolved
darts/models/forecasting/regression_model.py Outdated Show resolved Hide resolved
darts/models/forecasting/regression_model.py Show resolved Hide resolved
darts/models/forecasting/regression_model.py Outdated Show resolved Hide resolved
darts/models/forecasting/regression_model.py Outdated Show resolved Hide resolved
Copy link
Collaborator

@dennisbader dennisbader left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great now, thanks a lot @madtoinou 🚀
After we check the issue from our discussion today, we can merge this one

Copy link
Collaborator

@dennisbader dennisbader left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice, thanks a lot @madtoinou, everything looks great now 💯

🚀

@dennisbader dennisbader merged commit b3498bf into master Sep 14, 2023
@dennisbader dennisbader deleted the feat/lags_per_component branch September 14, 2023 09:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Released
Development

Successfully merging this pull request may close these issues.

Different lags for each past_covariate in past_covariates
3 participants