CHANGELOG.md

1.6.2 - 2024-01-29

Fixed

  • Swapped or removed deprecated LightGBM input parameters
  • The 'shap' package is now a required dependency

1.6.1 - 2022-06-07

Fixed

  • Forecasts now default to last period if test set is empty
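
Read together with the 1.6.0 change listed below, the default forecast period can be pictured as a two-branch rule. The sketch below is only an illustration of that rule; `default_forecast_period` and its arguments are hypothetical names, not part of FIFE's API.

```python
def default_forecast_period(all_periods, test_periods):
    """Pick the period to forecast from (hypothetical helper, not FIFE code).

    If a test set exists, use its first period; otherwise fall back to the
    last period observed in the data.
    """
    if test_periods:
        return min(test_periods)
    return max(all_periods)


periods = [1, 2, 3, 4, 5]
with_test_set = default_forecast_period(periods, test_periods=[4, 5])
without_test_set = default_forecast_period(periods, test_periods=[])
```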

1.6.0 - 2022-04-29

Changed

  • Forecasts now default to first period of test set if a test set exists

1.5.2 - 2022-03-23

Changed

  • TensorFlow is now an optional dependency

Fixed

  • LGBModelers now correctly handle datetime categories

1.5.1 - 2021-03-25

Added

  • Interacted fixed effects state and exit modelers
  • "_label" is now a reserved column name; FIFE may not work if a column in your data has a reserved name

Fixed

StateModeler and ExitModeler

  • Now work as intended when time identifiers are not a progression of non-negative integers

Changed

  • Interacted fixed effects modelers now predict NaN instead of the mean of all predictions

1.5.0 - 2021-01-31

Added

Command-line Interface

  • Can now specify TIME_ID_AS_FEATURE as false to exclude the time identifier from the set of features

Fixed

StateModeler and ExitModeler

  • Observations with NaN outcome values now excluded from R-squared calculation
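
As a rough illustration of this fix, the sketch below computes R-squared after dropping every pair whose actual outcome is NaN. The function name and sample values are made up for this example and do not reflect FIFE's internal implementation.

```python
import math


def r_squared_ignoring_nan(actuals, predictions):
    """R-squared over the pairs whose actual outcome is not NaN."""
    pairs = [(a, p) for a, p in zip(actuals, predictions) if not math.isnan(a)]
    if not pairs:
        return float("nan")
    mean_actual = sum(a for a, _ in pairs) / len(pairs)
    ss_residual = sum((a - p) ** 2 for a, p in pairs)
    ss_total = sum((a - mean_actual) ** 2 for a, _ in pairs)
    # R-squared is undefined when the actual values do not vary.
    return 1 - ss_residual / ss_total if ss_total else float("nan")


actuals = [1.0, 2.0, float("nan"), 4.0]
predictions = [1.0, 2.0, 3.0, 3.0]
score = r_squared_ignoring_nan(actuals, predictions)  # third pair is skipped
```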

Changed

  • build_packages.bat and requirements.txt updated to Python 3.8

StateModeler and ExitModeler

  • Prediction DataFrames for categorical outcomes now include future state in the index

1.4.2 - 2021-01-21

Added

StateModeler and ExitModeler

  • Outcome categories now accessible through class_values attribute

Fixed

ExitModeler

  • If the outcome is categorical, only labels associated with an exit (i.e., that appear in the last observation of a spell) are used for training
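
One way to picture this rule: for each spell, keep only the label attached to its last observation. The sketch below assumes observations arrive as `(individual, spell, period, label)` tuples; that layout and the function name are assumptions for illustration, not FIFE's data model.

```python
def exit_labels(observations):
    """Map each (individual, spell) to the label on its last observed period."""
    last_seen = {}
    for individual, spell, period, label in observations:
        key = (individual, spell)
        if key not in last_seen or period > last_seen[key][0]:
            last_seen[key] = (period, label)
    return {key: label for key, (_, label) in last_seen.items()}


observations = [
    ("a", 0, 1, "active"),
    ("a", 0, 2, "active"),
    ("a", 0, 3, "retired"),
    ("b", 0, 1, "active"),
    ("b", 0, 2, "resigned"),
]
labels_by_spell = exit_labels(observations)
# "active" never closes a spell here, so it is not a training label.
training_labels = set(labels_by_spell.values())
```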

1.4.1 - 2020-12-30

Added

Modelers

  • Can now specify observation weights through the argument weight_col. The specified column will not be used as a feature, but will be used to weight observations during training and evaluation.
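
The effect of a weight column can be sketched without any modeling library: the column is left out of the feature list, and each observation contributes to a metric in proportion to its weight. The column names and the accuracy metric below are illustrative assumptions, not what FIFE computes internally.

```python
rows = [
    {"tenure": 1, "weight": 1.0, "exited": 1, "predicted": 1},
    {"tenure": 5, "weight": 3.0, "exited": 0, "predicted": 1},
    {"tenure": 9, "weight": 1.0, "exited": 0, "predicted": 0},
]
weight_col = "weight"

# The weight column is excluded from the features, alongside the outcome columns.
non_features = {weight_col, "exited", "predicted"}
feature_names = [name for name in rows[0] if name not in non_features]

# Each row counts in proportion to its weight during evaluation.
total_weight = sum(row[weight_col] for row in rows)
weighted_accuracy = (
    sum(row[weight_col] for row in rows if row["exited"] == row["predicted"])
    / total_weight
)
unweighted_accuracy = sum(
    row["exited"] == row["predicted"] for row in rows
) / len(rows)
```

Here the heavily weighted second row is misclassified, so the weighted accuracy falls below the unweighted one.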

Fixed

  • Area under the receiver operating characteristic curve (AUROC) now computed for multiclass if no class is entirely positive. Classes with no positive values are excluded.
  • ExitModeler outcome labeling
  • Two hyperparameter prior distribution lower bounds now 2 ** -5 instead of 2e-5.
  • LGBModelers now handle datetime categories
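
The lower-bound fix above is easy to misread because the two expressions look alike. A quick check of the arithmetic (variable names are illustrative only):

```python
intended_lower_bound = 2 ** -5  # two to the power of minus five: 0.03125
mistaken_lower_bound = 2e-5     # scientific notation for 2 * 10**-5: 0.00002

# The intended bound is more than a thousand times larger than the mistaken one.
ratio = intended_lower_bound / mistaken_lower_bound
```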

Changed

1.4.0 - 2020-12-11

Added

Modelers

  • Can now specify allow_gaps=True to remove the restriction that individuals be observed in every future period over the given time horizon. For example, for a time horizon of 2, the default behavior of the StateModeler is to train and evaluate only on observations where the same individual was observed in the next 2 periods. allow_gaps=True will instead only require that the same individual be observed 2 periods into the future, thereby allowing a gap where the individual is not observed 1 period into the future.
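
The difference between the two behaviors can be sketched for a single individual. The helper below is a hypothetical illustration that assumes consecutive integer periods; it is not FIFE's implementation.

```python
def eligible_periods(observed_periods, horizon, allow_gaps=False):
    """Return the periods usable for training at the given time horizon."""
    observed = set(observed_periods)
    eligible = []
    for period in sorted(observed):
        if allow_gaps:
            # Only the period exactly `horizon` steps ahead must be observed.
            usable = period + horizon in observed
        else:
            # Every period from 1 to `horizon` steps ahead must be observed.
            usable = all(
                period + step in observed for step in range(1, horizon + 1)
            )
        if usable:
            eligible.append(period)
    return eligible


observed = [1, 2, 4, 5, 6]  # the individual is missing in period 3
strict = eligible_periods(observed, horizon=2)
relaxed = eligible_periods(observed, horizon=2, allow_gaps=True)
```

With the gap at period 3, the strict rule keeps only period 4, while the relaxed rule also keeps period 2 because period 4 is observed.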

PanelDataProcessor

  • Now produces "_spell" column, which reports the number of gaps in observing the given individual up to the observed time period.
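
A minimal sketch of such a gap counter for one individual, assuming consecutive integer periods so that any jump larger than 1 counts as a gap. The function name is hypothetical and the logic is an illustration, not the processor's actual code.

```python
def spell_numbers(observed_periods):
    """Count the gaps seen so far at each observed period."""
    spells = []
    gaps = 0
    previous = None
    for period in sorted(observed_periods):
        if previous is not None and period - previous > 1:
            gaps += 1  # the individual went unobserved for at least one period
        spells.append(gaps)
        previous = period
    return spells


spells = spell_numbers([1, 2, 4, 5, 8])
```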

1.3.4 - 2020-12-09

Added

Command-line Interface

  • Can now use BY_FEATURE to produce separate Metrics.csv files for each value of a selected feature

Fixed

  • Number of classes now correctly specified for multiclass outcomes during hyperoptimization

1.3.3 - 2020-09-24

Changed

  • SHAP is now an optional dependency; install fife with pip install fife[shap] to ensure you can produce SHAP plots

Removed

  • Dask optional dependencies except cloudpickle and toolz

1.3.2 - 2020-09-22

Removed

  • Bokeh dependency

1.3.1 - 2020-09-08

Added

Changed

  • modeler.evaluate() now defaults to evaluating on the earliest period of test set observations instead of all observations

1.3.0 - 2020-08-25

Added

  • LGBStateModeler, which forecasts the value of a feature conditional on survival ("multivariate time series forecasting")

  • LGBExitModeler, which forecasts the circumstances of exit conditional on exit ("competing risks")

Deprecated

  • GradientBoostedTreesModeler, now called "LGBSurvivalModeler"

  • Standalone functions in the processors module, their responsibility having moved to the modeler method transform_features()

1.2.0 - 2020-08-17

Added

GradientBoostedTreesModeler

  • modeler.build_model() and modeler.train() now parallelize training over time horizons

PanelDataProcessor

  • processor.build_processed_data() and processor.process_all_columns() now parallelize processing over columns

Command-line Interface

  • Command-line execution now produces calibration and forecast error outputs

Utils

  • Option within create_example_data() to specify number of persons and time periods in dataset

Fixed

  • Null category added to columns of pandas Categorical type in PanelDataProcessor
  • Command-line execution now trains the modeler for the specified number of test intervals, if one is given

1.1.0 - 2020-07-20

Added

GradientBoostedTreesModeler and FeedforwardNeuralNetworkModeler

  • Support for hyperoptimization with modeler.hyperoptimize()
  • Options within modeler.build_model() and modeler.train() for:
    • hyperparameters (such as those returned by hyperoptimization)
    • toggling off validation early stopping (using params argument in the case of build_model())
    • training on subset
  • Defaults for all configuration parameters
  • Default option to represent datetime features as YYYYMMDD integers
  • Option to represent datetime features as nanoseconds
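
The two datetime representations mentioned above can be illustrated with the standard library alone. The helper names are hypothetical, and the nanosecond version assumes a count from the Unix epoch in UTC, which may differ from what the modelers do internally.

```python
from datetime import datetime, timezone


def to_yyyymmdd(moment):
    """Encode the date part as an integer such as 20200720."""
    return moment.year * 10000 + moment.month * 100 + moment.day


def to_nanoseconds(moment):
    """Encode a timezone-aware datetime as nanoseconds since the Unix epoch."""
    epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
    delta = moment - epoch
    whole_seconds = delta.days * 86400 + delta.seconds
    return whole_seconds * 10**9 + delta.microseconds * 1000


moment = datetime(2020, 7, 20, tzinfo=timezone.utc)
as_integer_date = to_yyyymmdd(moment)
as_nanoseconds = to_nanoseconds(moment)
```

The integer form keeps dates human-readable and sortable, while the nanosecond form preserves time-of-day detail.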

PanelDataProcessor

  • "_period" and "_maximum_lead" columns, which replace computation of "factorized time ids" in various methods
  • Defaults for all configuration parameters
  • Categorical feature conversion to pandas Categorical type

Command-line Interface

  • Option to execute from command line without configuration file
  • Option to specify individual parameter values
  • Default configuration for processors and modelers
  • Command-line execution now uses data file in current directory if there is only one file with a matching extension

Removed

PanelDataProcessor

  • Numeric feature normalization
  • Homebrewed categorical feature integer mapping
  • Raw subsetting

Command-line Interface

  • Interacted fixed effects modeling
  • Metrics-related output when no test set specified
  • Forecast-related output when test set specified

Fixed

  • Validation and test sets no longer overlap
  • modeler.evaluate() now reports correct metrics for subsets in which maximum observable period varies (e.g., train and test set combined)
  • First period of test set now considered observed for computing training set outcomes
  • ProportionalHazards models can now be saved to files
  • Code now formatted using Black
  • Command-line interface now evaluates on earliest period of test set instead of validation set