Benchmark #27
Comments
Hi! Could I take this one? Is there any deadline? I would aim to do it in the coming weeks.
Thanks @felipewhitaker, there is no deadline, so we really appreciate you taking this on.
Hello, can anyone please guide me on how to perform this the correct way?
@ombhojane, it is quite common to use Mean Absolute Error (MAE) for evaluating models, including in the weather research area. Another common metric is the Continuous Ranked Probability Score (CRPS), which generalizes MAE to take scenarios into consideration (properscoring has an implementation of it). Independent of the metric, what do you expect a correct way to look like? When comparing models, it is important that both are compared on a dataset that neither has used for learning (a test dataset), and that the comparison is fair (it doesn't make much sense to compare two models that predict different things).
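For concreteness, here is a minimal sketch of how both metrics can be computed with numpy and properscoring. The arrays are made-up placeholder values, not data from this repo:

```python
# Minimal sketch: MAE for a point forecast, CRPS for an ensemble of scenarios.
# Requires: pip install numpy properscoring
import numpy as np
import properscoring as ps

y_true = np.array([0.8, 1.2, 0.0, 2.5])  # observed generation (kW), placeholder
y_pred = np.array([1.0, 1.0, 0.1, 2.0])  # point forecast (kW), placeholder

# Mean Absolute Error for a point forecast
mae = np.mean(np.abs(y_true - y_pred))

# CRPS generalizes MAE to probabilistic forecasts: each row holds an
# ensemble of scenarios for one timestamp (shape: n_times x n_members).
scenarios = np.column_stack([y_pred * f for f in (0.8, 1.0, 1.2)])
crps = ps.crps_ensemble(y_true, scenarios).mean()

print(f"MAE:  {mae:.3f}")
print(f"CRPS: {crps:.3f}")
```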
After exploring
I think ideally it would be similar to this: https://github.com/openclimatefix/Open-Source-Quartz-Solar-Forecast/blob/main/quartz_solar_forecast/forecast.py#L11. Does this answer your question?
Or perhaps something like this: https://github.com/openclimatefix/Open-Source-Quartz-Solar-Forecast/blob/main/quartz_solar_forecast/forecasts/v1.py#L12. It would also be good to be able to swap it into the evaluation script more easily, here: https://github.com/openclimatefix/Open-Source-Quartz-Solar-Forecast/blob/main/quartz_solar_forecast/eval/forecast.py#L19. A sketch of one possible interface is below.
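For illustration only, here is one possible shape for such a pluggable interface; the names (`ForecastFn`, `evaluate`, `forecast_v1`, `forecast_v2`) are hypothetical and not the repo's actual API:

```python
# Sketch of a pluggable forecast interface so the evaluation script can
# swap models easily. All names here are hypothetical.
from typing import Callable
import pandas as pd

# A forecast model is just a callable: (site config, weather) -> predictions.
ForecastFn = Callable[[dict, pd.DataFrame], pd.Series]

def evaluate(forecast_fn: ForecastFn, site: dict,
             weather: pd.DataFrame, observed: pd.Series) -> float:
    """Score one model against observed generation using MAE."""
    predictions = forecast_fn(site, weather)
    return (predictions - observed).abs().mean()

# Swapping models then becomes a one-line change in the eval script, e.g.
#   evaluate(forecast_v1, site, weather, observed)
#   evaluate(forecast_v2, site, weather, observed)
```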
It does help, thanks! I might've missed some details there. Moreover, is there a file containing how the current model was trained (which I believe is in
The running of the model is in here: https://github.com/openclimatefix/Open-Source-Quartz-Solar-Forecast/blob/main/quartz_solar_forecast/forecasts/v1.py. I'm hoping we can make v2, v3, etc. A really simple benchmark could be a model whose prediction is always half the capacity, and then run the evaluation on that. Obviously it would be a very bad model, but it helps give an impression of what the MAE numbers mean. A sketch of that baseline is below.
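A minimal sketch of that half-capacity baseline, assuming a simple Series-based interface; the function name, column name, and capacity value are made up for illustration:

```python
# Trivial baseline suggested above: always predict 50% of rated capacity.
import numpy as np
import pandas as pd

def baseline_half_capacity(capacity_kw: float,
                           timestamps: pd.DatetimeIndex) -> pd.Series:
    """Predict half the rated capacity at every timestamp."""
    return pd.Series(capacity_kw / 2, index=timestamps, name="power_kw")

# Example: score the baseline with MAE against made-up observations.
idx = pd.date_range("2024-01-01", periods=4, freq="h")
observed = pd.Series([0.0, 1.1, 2.4, 0.9], index=idx)
pred = baseline_half_capacity(capacity_kw=3.0, timestamps=idx)
print("baseline MAE:", np.abs(pred - observed).mean())
```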
Detailed Description
It would be great to benchmark the model.
Context
Always good to benchmark
Possible Implementation