Tryolabs hourly predictions (#125)
* quartz_solar_forecast/weather/open_meteo.py

* hourly predictions adapt dates

* hourly time freq

* add image model comparison

* change method name

* filter possible negative model output to 0

* change example image in readme
froukje authored Jun 4, 2024
1 parent 3853e59 commit e9021a6
Showing 6 changed files with 18 additions and 13 deletions.
6 changes: 4 additions & 2 deletions README.md
@@ -1,7 +1,9 @@
# Quartz Solar Forecast

<!-- ALL-CONTRIBUTORS-BADGE:START - Do not remove or modify this section -->

[![All Contributors](https://img.shields.io/badge/all_contributors-17-orange.svg?style=flat-square)](#contributors-)

<!-- ALL-CONTRIBUTORS-BADGE:END -->

The aim of the project is to build an open source PV forecast that is free and easy to use.
@@ -131,15 +133,15 @@ To use this model specify `model="xgb"` in `run_forecast(site=site, model="xgb",

 The following plot shows example predictions of both models for the same time period. Additionally, for the Gradient Boosting model (default), the results from the two different data sources are shown.

-![model comparison](images/model_data_comparison.png)
+![model comparison](images/model_data_comparison_hr.png)
 _Predictions using the two different models and different data sources._

 ## Known restrictions

 - The model is trained on [UK MetOffice](https://www.metoffice.gov.uk/services/data/met-office-weather-datahub) NWPs, but when running inference we use [GFS](https://www.ncei.noaa.gov/products/weather-climate-models/global-forecast) data from [Open-meteo](https://open-meteo.com/). The differences between GFS and UK MetOffice could lead to some odd behaviours.
 - Depending on whether the timestamp for the prediction lies more than 90 days in the past, different data sources for the NWP are used. If we predict within the last 90 days, we can use ICON or GFS from the open-meteo Weather Forecast API. Since ICON doesn't provide visibility, this parameter is queried from GFS in any case. If the date for the prediction is further back in time, a reanalysis model of historical data is used (open-meteo | Historical Weather API). The Historical Weather API doesn't provide visibility at all, which is why it is set to a maximum of 24000 meters in this case. This can lead to some loss of precision.
 - The model was trained and tested only over the UK; applying it to other geographical regions should be done with caution.
-- When using the XGBoost model, only predictions within the last 90 days are available for data consistency.
+- When using the XGBoost model, only hourly predictions within the last 90 days are available for data consistency.

 ## Evaluation
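The 90-day rule described in the "Known restrictions" list of this README hunk can be sketched as a small selector. This is only an illustration under stated assumptions, not the package's code: `choose_nwp_source`, `FORECAST_API_WINDOW`, and the returned labels are hypothetical names, and the cutoff is taken from the README text.

```python
from datetime import datetime, timedelta
from typing import Optional

# Cutoff described in the README text; the constant name is hypothetical.
FORECAST_API_WINDOW = timedelta(days=90)


def choose_nwp_source(ts: datetime, now: Optional[datetime] = None) -> str:
    """Return a label for the weather source that would serve `ts`.

    Hypothetical helper: the labels are illustrative, not identifiers
    used by quartz_solar_forecast.
    """
    now = now or datetime.now()
    if now - ts > FORECAST_API_WINDOW:
        # More than 90 days back: reanalysis data, no visibility available.
        return "historical"
    # Within the last 90 days (or ahead of now): ICON or GFS forecast data.
    return "forecast"


now = datetime(2024, 6, 4, 12, 0)
recent = choose_nwp_source(datetime(2024, 5, 1), now=now)
old = choose_nwp_source(datetime(2023, 12, 1), now=now)
print(recent, old)
```

Passing `now` explicitly keeps the helper deterministic, which is convenient when checking the boundary in a test.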

Binary file added images/model_data_comparison_hr.png
4 changes: 2 additions & 2 deletions quartz_solar_forecast/forecast.py
@@ -51,10 +51,10 @@ def predict_tryolabs(
     # set start and end time, if no time is given use current time
     if ts is None:
         start_date = pd.Timestamp.now().strftime("%Y-%m-%d")
-        start_time = pd.Timestamp.now().round("15min")
+        start_time = pd.Timestamp.now().round(freq='h')
     else:
         start_date = pd.Timestamp(ts).strftime("%Y-%m-%d")
-        start_time = pd.Timestamp(ts).round("15min")
+        start_time = pd.Timestamp(ts).round(freq='h')

     end_time = start_time + pd.Timedelta(hours=48)
     start_date_datetime = datetime.strptime(start_date, "%Y-%m-%d")
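To make the effect of this hunk concrete, here is a minimal sketch of the two rounding rules applied to the same timestamp. The timestamp is an arbitrary example chosen for illustration; only `pd.Timestamp.round` and `pd.Timedelta`, both used in the diff, are involved.

```python
import pandas as pd

# Arbitrary example timestamp (an assumption for illustration).
ts = "2024-06-04 10:12:00"

# Old behaviour: round to the nearest quarter hour.
quarter = pd.Timestamp(ts).round("15min")

# New behaviour: round to the nearest full hour.
start_time = pd.Timestamp(ts).round(freq="h")

# The forecast window still ends 48 hours after the rounded start.
end_time = start_time + pd.Timedelta(hours=48)

print(quarter, start_time, end_time)
```

The two roundings can land on different instants, which is why a test comparing against a 15-minute timestamp no longer fits the xgb model.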
6 changes: 4 additions & 2 deletions quartz_solar_forecast/forecasts/v2.py
@@ -124,7 +124,7 @@ def get_data(

         weather_service = WeatherService()

-        weather_data = weather_service.get_15_minutely_weather(
+        weather_data = weather_service.get_hourly_weather(
             latitude, longitude, start_date, end_date
         )

@@ -220,7 +220,9 @@ def predict_power_output(
         predictions_df = pd.DataFrame(predictions, columns=["prediction"])
         final_data = cleaned_data.join(predictions_df)
         # set night predictions to 0
-        final_data.loc[final_data["is_day"]==0, "prediction"] = 0
+        final_data.loc[final_data["is_day"] == 0, "prediction"] = 0
+        # set negative output to 0
+        final_data.loc[final_data["prediction"] < 0, "prediction"] = 0
         df = final_data[[self.DATE_COLUMN, "prediction"]]
         df = df.rename(columns={"prediction": "power_wh"})
         return df
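The two masking steps in this hunk can be tried on a toy frame. The column names follow the diff; the values are made up for illustration and do not come from the model.

```python
import pandas as pd

# Toy data; column names follow the diff, the values are invented.
final_data = pd.DataFrame(
    {
        "is_day": [0, 1, 1, 1],
        "prediction": [0.3, -0.05, 0.8, 1.2],
    }
)

# Night-time rows are forced to 0 regardless of the model output.
final_data.loc[final_data["is_day"] == 0, "prediction"] = 0
# Any remaining negative output is floored at 0.
final_data.loc[final_data["prediction"] < 0, "prediction"] = 0

print(final_data["prediction"].tolist())
```

`final_data["prediction"].clip(lower=0)` would be another way to express the second step; the boolean mask used in the commit mirrors the style of the night-time filter just above it.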
8 changes: 4 additions & 4 deletions quartz_solar_forecast/weather/open_meteo.py
@@ -43,7 +43,7 @@ def _build_url(
         str
             The URL for the OpenMeteo API.
         """
-        url = "https://api.open-meteo.com/v1/forecast?latitude={latitude}&longitude={longitude}&minutely_15={variables}&start_date={start_date}&end_date={end_date}&timezone=GMT".format(
+        url = "https://api.open-meteo.com/v1/forecast?latitude={latitude}&longitude={longitude}&hourly={variables}&start_date={start_date}&end_date={end_date}&timezone=GMT".format(
             latitude=latitude,
             longitude=longitude,
             variables=",".join(variables),
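As a rough illustration of the URL this hunk produces, the sketch below builds the same query string with `urllib.parse.urlencode` rather than `str.format`. `build_hourly_url` is a hypothetical stand-in for `_build_url`, and the two variable names are example values assumed for the demo; the real variable list is not shown in the diff.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.open-meteo.com/v1/forecast"


def build_hourly_url(latitude, longitude, start_date, end_date, variables):
    """Hypothetical stand-in for `_build_url`, using urlencode instead of str.format."""
    params = {
        "latitude": latitude,
        "longitude": longitude,
        # The commit switches this key from `minutely_15` to `hourly`.
        "hourly": ",".join(variables),
        "start_date": start_date,
        "end_date": end_date,
        "timezone": "GMT",
    }
    # safe="," keeps the comma-separated variable list unescaped in the URL.
    return f"{BASE_URL}?{urlencode(params, safe=',')}"


# Example variable names are assumptions for this demo.
url = build_hourly_url(
    51.75, -1.25, "2024-06-04", "2024-06-06", ["temperature_2m", "cloud_cover"]
)
print(url)
```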
@@ -99,11 +99,11 @@ def _validate_date_format(self, start_date: str, end_date: str) -> None:
                 f"Invalid date format or range. Please use YYYY-MM-DD and ensure end_date is greater than start_date. Error: {str(e)}"
             )

-    def get_15_minutely_weather(
+    def get_hourly_weather(
         self, latitude: float, longitude: float, start_date: str, end_date: str
     ) -> pd.DataFrame:
         """
-        Get 15 minutely weather data ranging from 3 months ago up to 15 days ahead (forecast).
+        Get hourly weather data ranging from 3 months ago up to 15 days ahead (forecast).

         Parameters
         ----------
@@ -150,7 +150,7 @@ def get_15_minutely_weather(
         ]
         url = self._build_url(latitude, longitude, start_date, end_date, variables)
         response = requests.get(url)
-        data = response.json()["minutely_15"]
+        data = response.json()["hourly"]

         df = pd.DataFrame(data)
         df["time"] = pd.to_datetime(df["time"])
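The parsing step in this hunk can be exercised without a network call. The payload below is a hand-written stand-in for `response.json()`: its values are made up, and its shape (one list per variable under the `"hourly"` key, alongside a `"time"` list) is an assumption inferred from how the diff indexes the response.

```python
import pandas as pd

# Hand-written stand-in for `response.json()`; values are invented and the
# shape is assumed from the diff, not taken from a real API response.
payload = {
    "hourly": {
        "time": ["2024-06-04T10:00", "2024-06-04T11:00", "2024-06-04T12:00"],
        "temperature_2m": [15.1, 16.0, 16.8],
    }
}

# Same three steps as in the hunk: pick the block, build a frame, parse times.
data = payload["hourly"]
df = pd.DataFrame(data)
df["time"] = pd.to_datetime(df["time"])

# Spacing between consecutive rows.
steps = df["time"].diff().dropna()
print(df)
print(steps.unique())
```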
7 changes: 4 additions & 3 deletions tests/test_forecast_no_ts.py
@@ -8,8 +8,9 @@ def test_run_forecast_no_ts():
     site = PVSite(latitude=51.75, longitude=-1.25, capacity_kwp=1.25)

     current_ts = pd.Timestamp.now().round("15min")
+    current_hr = pd.Timestamp.now().round(freq='h')

-    # run ocf model with no ts
+    # run gradient boosting model with no ts
     predications_df = run_forecast(site=site, model="gb")
     # check current ts agrees with dataset
     assert predications_df.index.min() == current_ts
@@ -18,10 +19,10 @@ def test_run_forecast_no_ts():
     print(f"Current time: {current_ts}")
     print(f"Max: {predications_df['power_wh'].max()}")

-    # run tryolabs model with no ts
+    # run xgb model with no ts
     predications_df = run_forecast(site=site, model="xgb")
     # check current ts agrees with dataset
-    assert predications_df.index.min() == current_ts
+    assert predications_df.index.min() == current_hr

     print(predications_df)
     print(f"Current time: {current_ts}")