-
-
Notifications
You must be signed in to change notification settings - Fork 12
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix PV system IDs #220
Comments
@jacobbieker and @peterdudfield, thoughts? |
I would be inclined to our own encode_label() option, as we use more PV systems, it could keep the code a bit easier than multiple ID systems. Keeping them consistent might be a bit of work, but probably good long term too |
Thanks for putting this all together, its really nice having it all laid out. Ive put some thought into here - openclimatefix/nowcasting_dataset#691, but never finished it. Few comments
|
I agree that
Also, unless I'm mistaken, for loading the PV data from the database we set the Something like 3 might be the best bet |
Describe the bug
When using the Passiv PV for training and loading it using
ocf_datapipes.load.pv.pv.load_everything_into_ram()
, the function takes thess_id
ID and renames it aspv_system_id
in the xarray.Dataset returned. In the input metadata, there is also a column namedsystem_id
, which is not carried through to the xarray.Dataset. Both of thess_id
andsystem_id
initially come from the sheffield solar API and they are different.Currently, only the
system_id
ID is available in our database, although it is under thepv_system_id
column there. Also, when we load from the database we encode this ID liken -> int(f"{n}1")
.All this makes it quite difficult to match between the same systems in training and production. Being able to match will be important for adding Passiv inputs to PVNet as per openclimatefix/PVNet#68
Some potential fixes
ss_id
ss_id
available in the databasess_id
is unique for each system across Enphase, Passiv, and PVOutput. This means that we would need to change the library if we wanted to use PV systems outside of these sources.ss_id
(sheffield solar ID?) already hard coded intoload_everything_into_ram()
, so we would likely need to change that anyway.system_id
.get_pv_power_from_database()
and maybe think about the encoding.system_id
throughload_everything_into_ram()
, and that may involve some renaming of the IDs variables.encode_label()
and apply this everywhere consistently.system_id
to the dataset created inload_everything_into_ram()
and select systems based on this.get_pv_power_from_database()
but just select the IDs likesystem_id -> int(f"{system_id}1")
The text was updated successfully, but these errors were encountered: