Update HF upload script and model cards #239

Sukh-P · 2024-07-24T16:31:16Z

Pull Request

Description

I have run into this a couple of times now: when I ran the push-checkpoint-to-HF script with only the checkpoint_dir_paths parameter, I got an error about wandb_ids being None. I realised the script never enters the logic that looks up the run ID (starting at line 43), because that logic checks whether wandb_ids is an empty list, whereas the default is None. If the model is not an ensemble, it then runs into an error at

if not is_ensemble:
    wandb_ids = wandb_ids[0]

so I think changing its default to an empty list might work a little more smoothly.
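To illustrate the failure mode described above, here is a minimal sketch rather than the script's actual code. `resolve_wandb_ids` and `_run_id_from_path` are hypothetical names, and deriving the run ID from the last path component is an assumption made purely for the demo. The point is that a falsiness check (`not wandb_ids`) covers both `None` and `[]`, so the lookup runs whichever default is used.

```python
from typing import Callable, Optional, Union


def _run_id_from_path(checkpoint_dir_path: str) -> str:
    """Hypothetical stand-in for the real run ID lookup (assumption for this demo)."""
    return checkpoint_dir_path.rstrip("/").split("/")[-1]


def resolve_wandb_ids(
    checkpoint_dir_paths: list[str],
    wandb_ids: Optional[list[str]] = None,
    is_ensemble: bool = False,
    lookup_run_id: Callable[[str], str] = _run_id_from_path,
) -> Union[str, list[str]]:
    """Return the run ID (single model) or the list of run IDs (ensemble)."""
    # `not wandb_ids` is true for both None and an empty list,
    # so the lookup is no longer skipped when the default is None.
    if not wandb_ids:
        wandb_ids = [lookup_run_id(path) for path in checkpoint_dir_paths]

    # Safe to index now: wandb_ids is guaranteed to be a non-empty list
    # as long as at least one checkpoint directory was given.
    if not is_ensemble:
        return wandb_ids[0]
    return wandb_ids
```

With this shape, calling the function with only the checkpoint paths behaves the same whether the caller passes `None`, `[]`, or nothing at all.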

Also removed some hard-coded values used in the model card and added separate model cards for the different model types.

Fixes #237 and #235

AUdaltsova · 2024-08-05T17:05:24Z

scripts/checkpoint_to_huggingface.py

@@ -21,7 +21,7 @@
 def push_to_huggingface(
    checkpoint_dir_paths: list[str],
    val_best: bool = True,
-    wandb_ids: Optional[list[str]] = None,
+    wandb_ids: list[str] = [],


Out of curiosity: why not Optional anymore?

That was a mistake; I've added it back in, thanks! I also updated the type hint slightly to the more up-to-date Python style, so we don't need to use Optional.

dfulu · 2024-08-06T09:59:14Z

This is a partial fix and it might be worth implementing a more complete one or making an issue to do so.

The script adds the link to the wandb run in the huggingface model card. For this it uses a hard-coded wandb project and hf repo since it was written when we only used PVNet for UK regional. Having that automatically generated link between the model on HF and the actual training run is really good for traceability of models for us, and is also good for our open source-ness.

I presume that you must be changing some of these hard coded paths locally in order to run this script anyway?
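A rough sketch of how the run links could be generated from parameters instead of hard-coded values. The function names are hypothetical, and the entity, project and run ID values in the usage are placeholders rather than the real ones. The URL layout `https://wandb.ai/<entity>/<project>/runs/<run_id>` is the standard form for a wandb run page.

```python
def wandb_run_url(entity: str, project: str, run_id: str) -> str:
    """Build the web URL of a single wandb run."""
    return f"https://wandb.ai/{entity}/{project}/runs/{run_id}"


def wandb_links_markdown(entity: str, project: str, run_ids: list[str]) -> str:
    """Render one markdown bullet per training run, ready to paste into a model card."""
    return "\n".join(
        f" - {wandb_run_url(entity, project, run_id)}" for run_id in run_ids
    )


# Placeholder values for illustration only
links = wandb_links_markdown("example-entity", "example-project", ["abc123", "def456"])
```

Passing the entity and project in explicitly means a model trained under a different wandb project still gets links that resolve.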

dfulu · 2024-08-06T10:06:11Z

I've just seen that the windnet and pvnet-india HF repos have used the model cards from PVNet UK which mention the model being used for UK regional GSP predictions. Also the links to the training runs lead to nowhere because of these hard coded values.

So I think this should probably be a separate issue. Misusing the model cards like this is a bit messy of us

Sukh-P · 2024-08-06T12:41:00Z

> This is a partial fix and it might be worth implementing a more complete one or making an issue to do so.
>
> The script adds the link to the wandb run in the huggingface model card. For this it uses a hard-coded wandb project and hf repo since it was written when we only used PVNet for UK regional. Having that automatically generated link between the model on HF and the actual training run is really good for traceability of models for us, and is also good for our open source-ness.
>
> I presume that you must be changing some of these hard coded paths locally in order to run this script anyway?

Yep, we edit these locally when running the script, as otherwise the model wouldn't be put in the correct HF repo (for the HF one). I guess this could be done more cleanly by including these as required parameters of the function itself, to reduce the chance of someone forgetting to change them.

Sukh-P · 2024-08-06T12:42:36Z

> I've just seen that the windnet and pvnet-india HF repos have used the model cards from PVNet UK which mention the model being used for UK regional GSP predictions. Also the links to the training runs lead to nowhere because of these hard coded values.
>
> So I think this should probably be a separate issue. Misusing the model cards like this is a bit messy of us

Yep, we agree. @AUdaltsova and I noticed this too, and Alex has created an issue for it here: #235

AUdaltsova · 2024-08-06T12:58:30Z

> Yep, we edit these locally when running the script, as otherwise the model wouldn't be put in the correct HF repo (for the HF one). I guess this could be done more cleanly by including these as required parameters of the function itself, to reduce the chance of someone forgetting to change them.

Can we maybe add choices/hints for it? Like "must be one of: path-to-pvnet-repo, path-to-windnet-repo" etc., or even just in comments somewhere, so you don't have to go and look up the specific name on HF? Not that it's a lot of work, but I'm a big fan of accommodating human laziness.
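One possible shape for the "must be one of" hint suggested above, as a sketch only. `KNOWN_HF_REPOS`, `resolve_hf_repo` and the repo IDs are all hypothetical placeholders, not the real repository names.

```python
# Hypothetical mapping from a short model name to its Hugging Face repo ID.
# The repo IDs are placeholders for illustration only.
KNOWN_HF_REPOS: dict[str, str] = {
    "pvnet": "example-org/path-to-pvnet-repo",
    "windnet": "example-org/path-to-windnet-repo",
}


def resolve_hf_repo(model_name: str) -> str:
    """Return the repo ID for `model_name`, or fail with the list of valid choices."""
    if model_name not in KNOWN_HF_REPOS:
        choices = ", ".join(sorted(KNOWN_HF_REPOS))
        raise ValueError(
            f"model_name must be one of: {choices} (got {model_name!r})"
        )
    return KNOWN_HF_REPOS[model_name]
```

Making `model_name` a required argument with no default means the script fails immediately with the valid options listed, rather than silently pushing to whichever repo happens to be hard-coded.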

Sukh-P · 2024-08-07T10:51:28Z

FYI I am trying to tackle #235 in this PR too just to try and clean this up a bit more in one go

Sukh-P · 2024-08-07T11:42:22Z

Okay should be good to go now for a review with the additional changes, thanks

pvnet/models/base_model.py

AUdaltsova

Looks good to me!

pvnet/models/model_cards/pvnet_india_model_card_template.md

dfulu

Looks really good!

Update default for wandb_ids

1a5cfdc

Sukh-P changed the title ~~Update default for wandb_ids~~ Update default for wandb_ids in HF upload script Jul 24, 2024

Sukh-P requested a review from dfulu July 25, 2024 15:24

Sukh-P marked this pull request as ready for review July 25, 2024 15:24

Sukh-P requested a review from AUdaltsova July 30, 2024 13:46

AUdaltsova reviewed Aug 5, 2024

AUdaltsova approved these changes Aug 5, 2024

Update type hints

e2b27ac

Sukhil Patel and others added 2 commits August 7, 2024 11:42

Fix model cards

91e26ad

[pre-commit.ci] auto fixes from pre-commit.com hooks

8c00aab

for more information, see https://pre-commit.ci

linting

cc9cc41

Sukh-P changed the title ~~Update default for wandb_ids in HF upload script~~ Update HF upload script and model cards Aug 7, 2024

Sukh-P requested a review from AUdaltsova August 7, 2024 11:41

AUdaltsova reviewed Aug 7, 2024

pvnet/models/base_model.py (resolved)

AUdaltsova approved these changes Aug 7, 2024

AUdaltsova reviewed Aug 7, 2024

pvnet/models/model_cards/pvnet_india_model_card_template.md (outdated, resolved)

Correct model cards

8742ebc

dfulu approved these changes Aug 7, 2024

Remove duplicated parameter

5e07cc1

Sukh-P merged commit d908019 into main Aug 7, 2024
3 checks passed

Sukh-P deleted the update_to_hf_script branch November 5, 2024 11:37
