Update HF upload script and model cards #239

Merged
merged 7 commits into from
Aug 7, 2024
19 changes: 15 additions & 4 deletions pvnet/models/base_model.py
@@ -230,6 +230,7 @@ def save_pretrained(
data_config: Optional[Union[str, Path]],
repo_id: Optional[str] = None,
push_to_hub: bool = False,
wandb_repo: Optional[str] = None,
wandb_ids: Optional[Union[list[str], str]] = None,
card_template_path=None,
**kwargs,
@@ -248,9 +249,10 @@ def save_pretrained(
ID of your repository on the Hub. Used only if `push_to_hub=True`. Will default to
the folder name if not provided.
push_to_hub (`bool`, *optional*, defaults to `False`):
Whether or not to push your model to the Huggingface Hub after saving it.
Whether or not to push your model to the HuggingFace Hub after saving it.
wandb_repo: Identifier of the repo on wandb.
wandb_ids: Identifier(s) of the model on wandb.
card_template_path: Path to the huggingface model card template. Defaults to card in
card_template_path: Path to the HuggingFace model card template. Defaults to card in
PVNet library if set to None.
kwargs:
Additional keyword arguments passed along to the
@@ -277,19 +279,28 @@ def save_pretrained(
# Tailor the data config to the model being saved
minimize_data_config(new_data_config_path, new_data_config_path, self)

# Get appropriate model card
model_name = repo_id.split("/")[1]
if model_name == "windnet_india":
model_card = "wind_india_model_card_template.md"
elif model_name == "pvnet_india":
model_card = "pv_india_model_card_template.md"
else:
model_card = "pv_uk_regional_model_card_template.md"
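
Note that `repo_id` is declared `Optional[str]`, and the upload script in this PR passes `repo_id=None` when `push_to_hub` is false, so `repo_id.split("/")[1]` would raise in that case (and also for an id with no `/`). A minimal sketch of a more defensive lookup follows. The helper `select_model_card` and the `MODEL_CARDS` mapping are hypothetical names for illustration, not part of this PR; only the template filenames come from the diff, and falling back to the UK regional card when no id is given is an assumption.

```python
from typing import Optional

# Hypothetical mapping from model name to template file. The filenames are
# the ones added in this PR; the dict itself is only an illustration.
MODEL_CARDS = {
    "windnet_india": "wind_india_model_card_template.md",
    "pvnet_india": "pv_india_model_card_template.md",
}
DEFAULT_MODEL_CARD = "pv_uk_regional_model_card_template.md"


def select_model_card(repo_id: Optional[str]) -> str:
    """Return the template filename for a repo id such as 'org/name'."""
    if not repo_id:
        # Assumption: with no repo id, fall back to the UK regional card.
        return DEFAULT_MODEL_CARD
    # rsplit keeps this safe for ids without an organisation prefix.
    model_name = repo_id.rsplit("/", 1)[-1]
    return MODEL_CARDS.get(model_name, DEFAULT_MODEL_CARD)
```

A mapping also means a new region only needs one new dictionary entry rather than another `elif` branch.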

# Creating and saving model card.
card_data = ModelCardData(language="en", license="mit", library_name="pytorch")
if card_template_path is None:
card_template_path = (
f"{os.path.dirname(os.path.abspath(__file__))}/model_card_template.md"
f"{os.path.dirname(os.path.abspath(__file__))}/model_cards/{model_card}"
)

if isinstance(wandb_ids, str):
wandb_ids = [wandb_ids]

wandb_links = ""
for wandb_id in wandb_ids:
link = f"https://wandb.ai/openclimatefix/pvnet2.1/runs/{wandb_id}"
link = f"https://wandb.ai/{wandb_repo}/runs/{wandb_id}"
wandb_links += f" - [{link}]({link})\n"
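
Both `wandb_repo` and `wandb_ids` default to `None` in the `save_pretrained` signature. With `wandb_ids=None` the loop above would raise, and with `wandb_repo=None` the link would contain the literal text `None`. A hedged sketch of the same link formatting with those cases handled is shown here; `format_wandb_links` is a hypothetical helper name, not something this PR defines, and returning an empty string for missing inputs is an assumption.

```python
from typing import Optional, Union


def format_wandb_links(
    wandb_repo: Optional[str],
    wandb_ids: Optional[Union[list[str], str]],
) -> str:
    """Build the markdown bullet list of wandb run links for the model card."""
    # Assumption: with no repo or no run ids, the card gets no links.
    if not wandb_repo or not wandb_ids:
        return ""
    # Accept a single id as well as a list, as the original code does.
    if isinstance(wandb_ids, str):
        wandb_ids = [wandb_ids]
    lines = []
    for wandb_id in wandb_ids:
        link = f"https://wandb.ai/{wandb_repo}/runs/{wandb_id}"
        lines.append(f" - [{link}]({link})\n")
    return "".join(lines)
```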

card = ModelCard.from_template(
51 changes: 51 additions & 0 deletions pvnet/models/model_cards/pv_india_model_card_template.md
@@ -0,0 +1,51 @@
---
{{ card_data }}
---






# PVNet India

## Model Description

<!-- Provide a longer summary of what this model is/does. -->
This model class uses numerical weather predictions from providers such as ECMWF to forecast the PV power in North West India over the next 48 hours. More information can be found in the model repo [1] and experimental notes [here](https://github.com/openclimatefix/PVNet/tree/main/experiments/india).


- **Developed by:** openclimatefix
- **Model type:** Fusion model
- **Language(s) (NLP):** en
- **License:** mit


# Training Details

## Data

<!-- This should link to a Data Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->

The model is trained on data from 2019-2022 and validated on data from 2022-2023. See experimental notes [here](https://github.com/openclimatefix/PVNet/tree/main/experiments/india).


### Preprocessing

Data is prepared with the `ocf_datapipes.training.pvnet_site` datapipe [2].


## Results

The training logs for the current model can be found here:
{{ wandb_links }}


### Hardware

Trained on a single NVIDIA Tesla T4

### Software

- [1] https://github.com/openclimatefix/PVNet
- [2] https://github.com/openclimatefix/ocf_datapipes
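
The `{{ card_data }}` and `{{ wandb_links }}` placeholders in this template are filled in when `save_pretrained` builds the card via `ModelCard.from_template`. As a rough illustration of what that substitution produces, here is a standard-library stand-in that uses a regex instead of the real template engine. The `render` function, the shortened `TEMPLATE`, and the run id `abc123` are all made up for this example; the `card_data` values mirror the `ModelCardData(...)` call in `base_model.py`.

```python
import re

# Shortened stand-in for the template above (illustration only).
TEMPLATE = """---
{{ card_data }}
---

# PVNet India

The training logs for the current model can be found here:
{{ wandb_links }}
"""


def render(template: str, **values: str) -> str:
    """Replace each `{{ name }}` placeholder with the matching value."""

    def substitute(match: re.Match) -> str:
        # A missing key raises KeyError, which surfaces template typos early.
        return values[match.group(1)]

    return re.sub(r"\{\{\s*(\w+)\s*\}\}", substitute, template)


card_text = render(
    TEMPLATE,
    card_data="language: en\nlicense: mit\nlibrary_name: pytorch",
    # "abc123" is a hypothetical run id.
    wandb_links=" - [run](https://wandb.ai/openclimatefix/pvnet2.1/runs/abc123)\n",
)
```

The YAML front matter between the `---` lines is what the Hub reads as card metadata, so `card_data` has to render as valid YAML.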
@@ -12,7 +12,7 @@
## Model Description

<!-- Provide a longer summary of what this model is/does. -->
This model class uses satellite data, numericl weather predictions, and recent Grid Service Point( GSP) PV power output to forecast the near-term (~8 hours) PV power output at all GSPs. More information can be found in the model repo [1] and experimental notes in [this google doc](https://docs.google.com/document/d/1fbkfkBzp16WbnCg7RDuRDvgzInA6XQu3xh4NCjV-WDA/edit?usp=sharing).
This model class uses satellite data, numerical weather predictions, and recent Grid Service Point (GSP) PV power output to forecast the near-term (~8 hours) PV power output at all GSPs. More information can be found in the model repo [1] and experimental notes in [this google doc](https://docs.google.com/document/d/1fbkfkBzp16WbnCg7RDuRDvgzInA6XQu3xh4NCjV-WDA/edit?usp=sharing).

- **Developed by:** openclimatefix
- **Model type:** Fusion model
51 changes: 51 additions & 0 deletions pvnet/models/model_cards/wind_india_model_card_template.md
@@ -0,0 +1,51 @@
---
{{ card_data }}
---






# WindNet

## Model Description

<!-- Provide a longer summary of what this model is/does. -->
This model class uses numerical weather predictions from providers such as ECMWF to forecast the wind power in North West India over the next 48 hours at 15-minute granularity. More information can be found in the model repo [1] and experimental notes [here](https://github.com/openclimatefix/PVNet/tree/main/experiments/india).


- **Developed by:** openclimatefix
- **Model type:** Fusion model
- **Language(s) (NLP):** en
- **License:** mit


# Training Details

## Data

<!-- This should link to a Data Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->

The model is trained on data from 2019-2022 and validated on data from 2022-2023. See experimental notes [here](https://github.com/openclimatefix/PVNet/tree/main/experiments/india).


### Preprocessing

Data is prepared with the `ocf_datapipes.training.windnet` datapipe [2].


## Results

The training logs for the current model can be found here:
{{ wandb_links }}


### Hardware

Trained on a single NVIDIA Tesla T4

### Software

- [1] https://github.com/openclimatefix/PVNet
- [2] https://github.com/openclimatefix/ocf_datapipes
17 changes: 10 additions & 7 deletions scripts/checkpoint_to_huggingface.py
@@ -2,33 +2,35 @@

use:
python checkpoint_to_huggingface.py "path/to/model/checkpoints" \
--huggingface-repo="openclimatefix/pvnet_uk_region" \
--wandb-repo="openclimatefix/pvnet2.1" \
--local-path="~/tmp/this_model" \
--no-push-to-hub
"""

import tempfile
from typing import Optional

import typer
import wandb

from pvnet.load_model import get_model_from_checkpoints

wandb_repo = "openclimatefix/pvnet2.1"
huggingface_repo = "openclimatefix/pvnet_uk_region"


def push_to_huggingface(
checkpoint_dir_paths: list[str],
huggingface_repo: str = "openclimatefix/pvnet_uk_region", # e.g. openclimatefix/windnet_india
wandb_repo: str | None = "openclimatefix/pvnet2.1",
val_best: bool = True,
wandb_ids: Optional[list[str]] = None,
local_path: Optional[str] = None,
wandb_ids: list[str] | None = [],
local_path: str | None = None,
push_to_hub: bool = True,
):
"""Push a local model to pvnet_v2 huggingface model repo
"""Push a local model to a huggingface model repo

Args:
checkpoint_dir_paths: Path(s) of the checkpoint directory(ies)
huggingface_repo: Name of the HuggingFace repo to push the model to
wandb_repo: Name of the wandb repo which has training logs
val_best: Use best model according to val loss, else last saved model
wandb_ids: The wandb ID code(s)
local_path: Where to save the local copy of the model
@@ -65,6 +67,7 @@ def push_to_huggingface(
model_output_dir,
config=model_config,
data_config=data_config,
wandb_repo=wandb_repo,
wandb_ids=wandb_ids,
push_to_hub=push_to_hub,
repo_id=huggingface_repo if push_to_hub else None,