```python
# 2. Now let's loop over each location (site_id) and find the valid periods.
# Should we have a different option if there are no NaNs?
sites = datasets_dict["site"]
site_ids = sites.site_id.values
site_config = config.input_data.site
valid_t0_and_site_ids = []
for site_id in site_ids:
    site = sites.sel(site_id=site_id)
    # Drop any NaN values (not sure this is right?)
    site = site.dropna(dim="time_utc")
```
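For context, a minimal sketch (not the repository's actual implementation) of how contiguous valid periods could be detected from a site's timestamps; the 15-minute frequency and variable names here are assumptions:

```python
import numpy as np
import pandas as pd

# Hypothetical: detect contiguous runs of timestamps at the expected frequency,
# so each unbroken block yields one (start, end) valid period.
times = pd.DatetimeIndex(
    ["2024-01-01 00:00", "2024-01-01 00:15", "2024-01-01 00:30",
     "2024-01-01 01:15", "2024-01-01 01:30"]  # gap after 00:30
)
freq = pd.Timedelta("15min")

# Indices where the step between consecutive timestamps exceeds the frequency.
gaps = np.flatnonzero(np.diff(times.values) > freq.to_timedelta64())
starts = np.r_[0, gaps + 1]
ends = np.r_[gaps, len(times) - 1]
periods = [(times[s], times[e]) for s, e in zip(starts, ends)]
print(periods)  # two periods: (00:00-00:30) and (01:15-01:30)
```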
I don't think we should be doing dropna here. It makes a block of dates with even one missing value discontinuous, which I think is less useful than keeping the block and filling in the missing timestamp later (e.g. one missing point can cost something like 45 potential t0s with 3h history and 8h forecast). I've used data with about 3% missing, sometimes in considerable chunks, and the model seemed to do fine without being distracted by the NaN infills.
There is a greater discussion to be had around how much missing data we allow to be infilled and at what times, but I think that belongs in preprocessing anyway, not here; I'd remove it for now.
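To illustrate the cost mentioned above, a rough sketch assuming 15-minute data (which is what makes the numbers work out to 45 t0s per gap), along with the reindex-and-infill alternative to dropna; the limit value is a placeholder for whatever infill cap gets agreed on:

```python
import numpy as np
import pandas as pd

# Assumed 15-minute site data; a 3h + 8h window then spans 45 timestamps.
freq = pd.Timedelta("15min")
times = pd.date_range("2024-01-01", "2024-01-03", freq=freq)
values = pd.Series(np.arange(len(times), dtype=float), index=times)
missing = times[100]
values = values.drop(missing)  # simulate one missing observation

history, forecast = pd.Timedelta("3h"), pd.Timedelta("8h")

# Candidate t0s whose full [t0 - history, t0 + forecast] window fits the data,
# and the subset whose window covers the missing timestamp.
t0s = times[(times >= times[0] + history) & (times <= times[-1] - forecast)]
invalid = t0s[(t0s - history <= missing) & (t0s + forecast >= missing)]
print(len(invalid))  # 45, i.e. (3h + 8h) / 15min + 1

# The alternative to dropna: reindex back to the full grid and infill the gap,
# capping how long a run of missing points may be interpolated.
filled = values.reindex(times).interpolate(limit=4)
assert not filled.isna().any()
```

This also shows where the infill cap would live (`limit=`), which is the knob the "how much missing data do we allow" discussion would eventually set.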
Missed this on merge, but re: comments here:
> I don't think we should be doing dropna here. It makes a block of dates with even one missing value discontinuous, which I think is less useful than keeping the block and filling in the missing timestamp later (e.g. one missing point can cost something like 45 potential t0s with 3h history and 8h forecast). I've used data with about 3% missing, sometimes in considerable chunks, and the model seemed to do fine without being distracted by the NaN infills.
>
> There is a greater discussion to be had around how much missing data we allow to be infilled and at what times, but I think that belongs in preprocessing anyway, not here; I'd remove it for now.