This repository has been archived by the owner on Sep 11, 2023. It is now read-only.

Refactor on-disk PV datasets #630

Open
1 task
JackKelly opened this issue Apr 1, 2022 · 0 comments
Labels
data New data source or feature; or modification of existing data source

Comments

@JackKelly
Member

JackKelly commented Apr 1, 2022

v15 currently looks like this:

image

I'd propose making these changes:

# Drop redundant coordinates (these are redundant because they
# just repeat the contents of each *dimension*):
dataset = dataset.drop_vars(["example", "id_index", "time_index"])

# Rename coords to be more explicit about exactly what some coordinates hold:
# Note that, in v15 of the dataset, the keys are incorrectly named
# power_mw and capacity_mwp, even though the power and capacity are both in watts.
# See https://github.com/openclimatefix/nowcasting_dataset/issues/530
dataset = dataset.rename_vars(
    {
        "time": "time_utc",
        "power_mw": "power_w",
        "capacity_mwp": "capacity_wp",
        "id": "pv_system_id",
        "y_coords": "y_osgb",
        "x_coords": "x_osgb",
    }
)

# Rename dimensions. Standardize on the singular (time, channel, etc.).
# Remove redundant "index" from the dim name. These are *dimensions* so,
# by definition, they are indices!
dataset = dataset.rename_dims(
    {
        "time_index": "time",
        "id_index": "pv_system",
    }
)

# Setting coords won't be necessary once this is fixed:
# https://github.com/openclimatefix/nowcasting_dataset/issues/627
dataset = dataset.set_coords(
    ["time_utc", "pv_system_id", "pv_system_row_number", "y_osgb", "x_osgb"])

dataset = dataset.transpose("example", "time", "pv_system")

Which ends up looking like this:

image
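
To try the steps above without the v15 files on disk, here is a minimal sketch that runs them on a toy dataset. The layout of `v15` below (shapes, dtypes, values, and the assumption that `pv_system_row_number` already exists as a variable) is a guess for illustration only, not the real v15 schema. `refactor_pv` is a hypothetical helper name that just wraps the proposed calls.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Toy stand-in for a v15 PV batch. Shapes and values are made up;
# the real v15 files may differ.
n_example, n_time, n_id = 2, 3, 4
times = pd.date_range("2022-04-01", periods=n_time, freq="5min")
per_system = ("example", "id_index")
ids = np.tile(np.arange(n_id), (n_example, 1))

v15 = xr.Dataset(
    data_vars={
        "power_mw": (
            ("example", "time_index", "id_index"),
            np.zeros((n_example, n_time, n_id), dtype=np.float32),
        ),
        "capacity_mwp": (per_system, np.ones((n_example, n_id), dtype=np.float32)),
        "time": (("example", "time_index"), np.tile(times.values, (n_example, 1))),
        "id": (per_system, ids),
        # Assumed to already exist in v15, because the proposal passes it
        # to set_coords without renaming anything to it.
        "pv_system_row_number": (per_system, ids),
        "x_coords": (per_system, np.zeros((n_example, n_id))),
        "y_coords": (per_system, np.zeros((n_example, n_id))),
    },
    coords={
        "example": np.arange(n_example),
        "time_index": np.arange(n_time),
        "id_index": np.arange(n_id),
    },
)


def refactor_pv(dataset: xr.Dataset) -> xr.Dataset:
    """Apply the proposed refactor (hypothetical helper name)."""
    # Dropping the index coordinates leaves the dimensions themselves intact.
    dataset = dataset.drop_vars(["example", "id_index", "time_index"])
    dataset = dataset.rename_vars(
        {
            "time": "time_utc",
            "power_mw": "power_w",
            "capacity_mwp": "capacity_wp",
            "id": "pv_system_id",
            "y_coords": "y_osgb",
            "x_coords": "x_osgb",
        }
    )
    # This has to come after rename_vars: the "time" variable must be
    # renamed before a dimension can take the name "time".
    dataset = dataset.rename_dims({"time_index": "time", "id_index": "pv_system"})
    dataset = dataset.set_coords(
        ["time_utc", "pv_system_id", "pv_system_row_number", "y_osgb", "x_osgb"]
    )
    return dataset.transpose("example", "time", "pv_system")


new = refactor_pv(v15)
print(new)
```

Note that the order matters: `rename_vars` has to run before `rename_dims`, otherwise renaming the `time_index` dimension to `time` would clash with the existing `time` variable.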
