Display resampling accuracy metrics by ID. It is possible to see the results of the resampling folds by ID, that is, for each predicted time series using the function modeltime_resample_accuracy #15

forecastingEDs · 2023-05-25T21:45:08Z

There is the possibility of viewing the metrics in the resampling forecast by ID, I saw that summary_fns = NULL #1 #3 provides the results by resampling fold, but they are global results. I think it would be very valuable to be able to access the local metrics for each ID and the results per fold resampling. I saw that #4 already requested something similar. If this is already possible and you can help me visualize these results, I would appreciate it. Follow the scripts:

Link to download the database used: https://github.com/forecastingEDs/Forecasting-of-admissions-in-the-emergency-departments/blob/131bd23723a39724ad4f88ad6b8e5a58f42a7960/datasets.xlsx

data_tbl <- datasets %>%
  select(id, Date, attendences, average_temperature, min, max,  sunday, monday, tuesday, wednesday, thursday, friday, saturday, Jan, Feb, Mar, Apr, May, Jun, Jul, Aug, Sep, Oct, Nov, Dec) %>%
  set_names(c("id", "date", "value","tempe_verage", "tempemin", "tempemax", "sunday", "monday", "tuesday", "wednesday", "thursday", "friday", "saturday", "Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"))

data_tbl

Full = Training + Forecast Datasets

full_data_tbl <- datasets %>%
  select(id, Date, attendences, average_temperature, min, max,  sunday, monday, tuesday, wednesday, thursday, friday, saturday, Jan, Feb, Mar, Apr, May, Jun, Jul, Aug, Sep, Oct, Nov, Dec) %>%
  set_names(c("id", "date", "value","tempe_verage", "tempemin", "tempemax", "sunday", "monday", "tuesday", "wednesday", "thursday", "friday", "saturday", "Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")) %>%

Apply Group-wise Time Series Manipulations

group_by(id) %>%
 future_frame(
   .date_var   = date,
   .length_out = "3 days",
   .bind_data  = TRUE
 ) %>%
 ungroup() %>%

Consolidate IDs

mutate(id = fct_drop(id))

Training Data

data_prepared_tbl <- full_data_tbl %>%
  filter(!is.na(value))

Forecast Data

future_tbl <- full_data_tbl %>%
  filter(is.na(value))

emergency_tscv <- data_prepared_tbl %>%
  time_series_cv(
    date_var    = date, 
    assess      = "3 days",
    skip        = "30 days",
    cumulative  = TRUE,
    slice_limit = 5
  )

emergency_tscv

test data preprocessing for ML ----

recipe_spec <- recipe(value ~ ., 
                      data = training(emergency_tscv$splits[[1]])) %>%
  step_timeseries_signature(date) %>%
  step_rm(matches("(.iso$)|(.xts$)|(hour)|(minute)|(second)|(am.pm)")) %>%
  step_mutate(data = factor(value, ordered = TRUE))%>%
  step_dummy(all_nominal(), one_hot = TRUE)%>%
  step_normalize (date_index.num,tempe_verage,tempemin,tempemax,date_year, -all_outcomes())

Model 1: Xgboost ----

wflw_fit_xgboost <- workflow() %>%
  add_model(
    boost_tree("regression") %>% set_engine("xgboost") 
  ) %>%
  add_recipe(recipe_spec %>% step_rm(date)) %>%
  fit(training(emergency_tscv$splits[[1]]))

Model 2: LightGBM ----

wflw_fit_lightgbm <- workflow() %>%
  add_model(
    boost_tree("regression") %>% set_engine("lightgbm")
  ) %>%
  add_recipe(recipe_spec %>% step_rm(date)) %>%
  fit(training(emergency_tscv$splits[[1]]))

---- MODELTIME TABLE ----

model_tbl <- modeltime_table(
  wflw_fit_xgboost,
  wflw_fit_lightgbm
)

model_tbl

resample_results <- model_tbl %>%
  modeltime_fit_resamples(
    resamples = emergency_tscv,
    control   = control_resamples(allow_par = TRUE, verbose = TRUE)
  )

resample_results

This step I need the results by ID but I only get the global results by fold resampling. Can you help me?

resample_results %>%
  modeltime_resample_accuracy(summary_fns = NULL, yardstick::metric_set(mape, smape, mase, rmse)) %>%
  table_modeltime_accuracy(.interactive = FALSE)

The text was updated successfully, but these errors were encountered:

forecastingEDs · 2023-07-14T23:40:24Z

Hello, @mdancho84 @AlbertoAlmuinha
Can you please help me with this question?

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Display resampling accuracy metrics by ID. It is possible to see the results of the resampling folds by ID, that is, for each predicted time series using the function modeltime_resample_accuracy #15

Display resampling accuracy metrics by ID. It is possible to see the results of the resampling folds by ID, that is, for each predicted time series using the function modeltime_resample_accuracy #15

forecastingEDs commented May 25, 2023 •

edited

Loading

forecastingEDs commented Jul 14, 2023

Display resampling accuracy metrics by ID. It is possible to see the results of the resampling folds by ID, that is, for each predicted time series using the function modeltime_resample_accuracy #15

Display resampling accuracy metrics by ID. It is possible to see the results of the resampling folds by ID, that is, for each predicted time series using the function modeltime_resample_accuracy #15

Comments

forecastingEDs commented May 25, 2023 • edited Loading

Link to download the database used: https://github.com/forecastingEDs/Forecasting-of-admissions-in-the-emergency-departments/blob/131bd23723a39724ad4f88ad6b8e5a58f42a7960/datasets.xlsx

Full = Training + Forecast Datasets

Apply Group-wise Time Series Manipulations

Consolidate IDs

Training Data

Forecast Data

test data preprocessing for ML ----

Model 1: Xgboost ----

Model 2: LightGBM ----

---- MODELTIME TABLE ----

This step I need the results by ID but I only get the global results by fold resampling. Can you help me?

forecastingEDs commented Jul 14, 2023

forecastingEDs commented May 25, 2023 •

edited

Loading