This repository has been archived by the owner on Sep 11, 2023. It is now read-only.

Use minimal data types for PV #624

Closed
JackKelly opened this issue Mar 24, 2022 · 3 comments · Fixed by #641
Labels
data New data source or feature; or modification of existing data source enhancement New feature or request good first issue Good for newcomers

Comments

@JackKelly
Member

This is what a PV batch looks like in v15. All the 64-bit data types could be 32-bit:

image

@JackKelly JackKelly added enhancement New feature or request good first issue Good for newcomers data New data source or feature; or modification of existing data source labels Mar 24, 2022
@JackKelly JackKelly added this to the v17 dataset milestone Mar 24, 2022
@JackKelly JackKelly moved this to Todo in Nowcasting Mar 24, 2022
@peterdudfield
Contributor

I'll have a look at v16, just in case it's been done already.

@peterdudfield
Contributor

peterdudfield commented Apr 8, 2022

Looking in /mnt/storage_ssd_4tb/data/ocf/solar_pv_nowcasting/nowcasting_dataset_pipeline/prepared_ML_training_data/v16/test/pv

import xarray as xr

# Load the first PV batch from the directory above to inspect its dtypes.
pv = xr.load_dataset('000000.nc', engine='h5netcdf')

Screenshot 2022-04-07 at 13 43 23

So the variables that need changing are:

  • x_coords -> float32
  • y_coords -> float32
  • id -> int32
  • example_index
  • id_index
  • time_index

@peterdudfield peterdudfield mentioned this issue Apr 8, 2022
6 tasks
@JackKelly
Member Author

Sounds good!

I think we need to use float32 for id (so we can use an ID value of NaN to represent "missing" PV systems).
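A small NumPy sketch of why a float dtype is needed here: integer arrays cannot hold NaN, whereas float32 can. The ID values are made up. One caveat worth keeping in mind (not raised in the thread) is that float32 only represents integers exactly up to 2**24, so this assumes PV system IDs stay below that.

```python
import numpy as np

# Hypothetical PV system IDs.
ids_int = np.array([101, 102, 103], dtype=np.int32)

# An integer array cannot store NaN: NumPy raises ValueError.
try:
    ids_int[1] = np.nan
    int_accepts_nan = True
except ValueError:
    int_accepts_nan = False

# A float32 array can, so NaN can mark a "missing" PV system.
ids = ids_int.astype(np.float32)
ids[1] = np.nan
missing = np.isnan(ids)

# float32 has a 24-bit significand, so integers above 2**24 are not all
# exactly representable: 2**24 + 1 rounds to 2**24.
largest_exact = 2**24
collides = bool(np.float32(largest_exact + 1) == np.float32(largest_exact))

print(int_accepts_nan, missing.tolist(), largest_exact, collides)
```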

Also, I think we can completely throw away example_index, id_index, and time_index because they're just repeats of the example, id, and time dimensions 🙂 Please see issues #629 and #630 for more context on throwing away example_index, id_index, and time_index.
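As a rough illustration of dropping the redundant index variables, here is a sketch using `Dataset.drop_vars` on a synthetic dataset. The `power` variable, the dimension names, and the shapes are hypothetical; only the three `*_index` names come from the thread, and the actual change may well be made elsewhere in the pipeline.

```python
import numpy as np
import xarray as xr

# Synthetic stand-in; "power", the dimension names and the shapes are hypothetical.
pv = xr.Dataset(
    {
        "power": (("example", "time", "pv_system"),
                  np.zeros((2, 3, 4), dtype=np.float32)),
        "example_index": ("example", np.arange(2)),
        "id_index": ("pv_system", np.arange(4)),
        "time_index": ("time", np.arange(3)),
    }
)

# Drop the variables that only repeat the positions along each dimension.
redundant = ["example_index", "id_index", "time_index"]
slim = pv.drop_vars(redundant)

print(list(slim.data_vars), dict(slim.sizes))
```

The dimensions themselves survive because `power` still uses them; only the duplicate index variables are removed.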

Status: Done