-
-
Notifications
You must be signed in to change notification settings - Fork 8
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Satellite normalisation #97
base: main
Are you sure you want to change the base?
Conversation
Override anaconda-client issue via setting version to 3.11
I just re opened it to try and trigger the tests, but weirdly they dont seem to be running |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for doing this @felix-e-h-p! It looks good, but I think there are a few things that we should clean up first
return xr.DataArray( | ||
data, | ||
coords=coords, | ||
dims=["init_time_utc", "step", "y_osgb", "x_osgb"], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is strictly not the data format that process_and_combine_datasets()
expects. It expects that the NWP data will have been passed through the select_time_slice_nwp()
function function which changes the dimensions from ["init_time_utc", "step", "channel", "x", "y"]
to ["target_time_utc", "channel", "x", y"]
.
Although admittedly the process_and_combine_datasets()
doesn't really check this since it is only ever supposed to be used as part of the pipeline where select_time_slice_nwp()
has been run
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated the test to ensure that NWP data is passed through the select_time_slice_nwp function initially before process and combine is called - would appreciate your feedback on this section if all OK
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For now I think I'd bypass the select_time_slice_nwp()
function here and just update this fixture. It means this test is independent of errors in select_time_slice_nwp()
.
something like:
data = np.random.rand(3, 3, 4, 4)
coords = {
"target_time_utc": pd.date_range("2023-01-01 12:00", freq="1h", periods=3),
"channel": ["t", "dswrf", "lcc"],
"x_osgb": np.arange(4),
"y_osgb": np.arange(4),
}
xr.DataArray(
data,
coords=coords,
dims=["target_time_utc", "channel", "y_osgb", "x_osgb"],
)
Python version set removed
…atefix/ocf-data-sampler into satellite-normalisation
numpy_modalities.append(convert_satellite_to_numpy_batch(da_sat)) | ||
sat_numpy_modalities = convert_satellite_to_numpy_batch(da_sat) | ||
# Combine the Satellite into NumpyBatch | ||
numpy_modalities.append({SatelliteBatchKey.satellite_actual: sat_numpy_modalities}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This line, could be changed to back to numpy_modalities.append(convert_satellite_to_numpy_batch(da_sat)
, and then I think itll be fine
Would you be able to pull |
@@ -119,8 +125,9 @@ def process_and_combine_site_sample_dict( | |||
data_arrays.append((f"nwp-{provider}", da_nwp)) | |||
|
|||
if "sat" in dataset_dict: | |||
# TODO add some satellite normalisation | |||
# Satellite normalisation added |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@felix-e-h-p this is picky but can we change this comment to just # Standardise
or something. I don't think "Satellite normalisation added" makes sense out of context now the TODO is gone
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #97 +/- ##
==========================================
- Coverage 95.44% 94.72% -0.73%
==========================================
Files 37 39 +2
Lines 989 1061 +72
==========================================
+ Hits 944 1005 +61
- Misses 45 56 +11 ☔ View full report in Codecov by Sentry. |
Normalisation introduced for satellite data utilising 2020 values for std and mean
Not entirely sure epsilon factor is truly necessary - yet left in as a safe guarding mechanism