
pvsite-datamodel integration & fake forecasts (#1)
* Set min python version to 3.11 and included pvsite-datamodel as dependency

* Added dummy model for generating fake forecasts. App uses this to output the forecasts for sites

* Added test fixtures for DB connection

* Updated version of docker-release shared workflow in github action

* Updated python version in github action to match project - 3.11

* formatted code

* Fixed test coverage config error

* Updated pvsite-model to latest version

* Fleshed out the _get_site_ids function

* Updated readme with instructions for spinning up a local DB

* Added script to seed local DB

* Saving forecasts to DB

* Simplified linting/formatting using just ruff

* tests for some app functions

* Updated pvsite-model to latest version

* Added remaining tests for app functions

* Added a few extra logging statements and updated Dockerfile
confusedmatrix authored Jan 25, 2024
1 parent 5abf2f7 commit 0e4431a
Showing 14 changed files with 1,714 additions and 116 deletions.
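For context on the new dummy model (its implementation file, `india_forecast_app/model.py`, is not part of this excerpt): the `app.py` diff below calls `model.predict(site_id=..., timestamp=...)`, reads `model.version`, and feeds the returned values into a DataFrame with a `start_utc` column. A minimal sketch consistent with that usage might look like this; the field names and 15-minute/24-hour horizon are assumptions, not the actual implementation:

```python
import datetime as dt
import random


class DummyModel:
    """Hypothetical stand-in for the app's DummyModel, inferred from app.py usage."""

    version = "0.0.0"

    def predict(self, site_id: str, timestamp: dt.datetime) -> list[dict]:
        """Return fake 15-minute forecast values covering the next 24 hours."""
        step = dt.timedelta(minutes=15)
        values = []
        for i in range(96):  # 96 x 15 min = 24 h
            start_utc = timestamp + i * step
            values.append(
                {
                    "start_utc": start_utc,
                    "end_utc": start_utc + step,
                    "forecast_power_kw": random.randint(0, 1000),
                }
            )
        return values
```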
16 changes: 8 additions & 8 deletions .github/workflows/ci.yml
@@ -1,4 +1,4 @@
-name: CI Pipeline for SDK - Python
+name: CI pipeline for India Forecast App

on: push

@@ -8,16 +8,16 @@ jobs:
    runs-on: ubuntu-latest
    steps:
      - name: Check out the repo
-       uses: actions/checkout@v2
+       uses: actions/checkout@v4

      - name: Install poetry
        run: pipx install poetry==1.7.1

      - name: Install python
-       uses: actions/setup-python@v4
+       uses: actions/setup-python@v5
        with:
-         python-version: '3.9'
-         cache: poetry
+         python-version: '3.11'
+         cache: 'poetry'

      - name: Install python dependencies
        run: poetry install
@@ -34,7 +34,7 @@ jobs:
  release:
    needs: [lint_and_test]
    if: github.ref_name == 'main'
-   uses: openclimatefix/.github/.github/workflows/docker-release.yml@v1.2.0
+   uses: openclimatefix/.github/.github/workflows/docker-release.yml@v1.8.1
    secrets:
      DOCKERHUB_USERNAME: ${{ secrets.DOCKERHUB_USERNAME }}
      DOCKERHUB_TOKEN: ${{ secrets.DOCKERHUB_TOKEN }}
5 changes: 4 additions & 1 deletion Dockerfile
@@ -1,4 +1,4 @@
-FROM python:3.9-slim as base
+FROM python:3.11-slim as base

ENV PYTHONFAULTHANDLER=1 \
    PYTHONHASHSEED=random \
@@ -8,6 +8,9 @@ WORKDIR /app

FROM base as builder

+RUN apt-get update
+RUN apt-get install -y gdal-bin libgdal-dev g++
+
ENV PIP_DEFAULT_TIMEOUT=100 \
    PIP_DISABLE_PIP_VERSION_CHECK=1 \
    PIP_NO_CACHE_DIR=1 \
7 changes: 1 addition & 6 deletions Makefile
@@ -1,20 +1,15 @@
#
# This mostly contains shortcut for multi-command steps.
#
-SRC = india_forecast_app tests
+SRC = india_forecast_app scripts tests

.PHONY: lint
lint:
	poetry run ruff $(SRC)
-	poetry run black --check $(SRC)
-	poetry run isort --check $(SRC)
-

.PHONY: format
format:
	poetry run ruff --fix $(SRC)
-	poetry run black $(SRC)
-	poetry run isort $(SRC)

.PHONY: test
test:
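The Makefile change above drops `black` and `isort` in favour of ruff alone. If isort-style import sorting is still wanted, ruff can take it over via its "I" rule set. A hypothetical `pyproject.toml` fragment to that effect (the project's actual ruff configuration is not shown in this diff):

```toml
[tool.ruff]
line-length = 100
# "I" enables isort-compatible import-sorting rules,
# so a separate isort run is no longer needed
select = ["E", "F", "I"]
```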
29 changes: 29 additions & 0 deletions README.md
@@ -26,6 +26,35 @@ make format
make test
```

+## Running the app locally
+Replace `{DB_URL}` with a postgres DB connection string (see below for setting up an ephemeral local DB)
+
+If testing on a local DB, you may use the following script to seed the DB with a dummy user, site and site_group.
+```
+DB_URL={DB_URL} poetry run seeder
+```
+⚠️ Note this is a destructive script and will drop all tables before recreating them to ensure a clean slate. DO NOT RUN IN PRODUCTION ENVIRONMENTS
+
+This example invokes app.py and passes the help flag
+```
+DB_URL={DB_URL} poetry run app --help
+```
+
+### Starting a local database using docker
+
+```bash
+docker run \
+  -it --rm \
+  -e POSTGRES_USER=postgres \
+  -e POSTGRES_PASSWORD=postgres \
+  -p 54545:5432 postgres:14-alpine \
+  postgres
+```
+
+The corresponding `DB_URL` will be
+
+`postgresql://postgres:postgres@localhost:54545/postgres`
+
## Building and running in [Docker](https://www.docker.com/)

Build the Docker image
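As a sanity check on the README instructions above: each piece of that `DB_URL` maps directly onto a `docker run` flag (user and password from the `-e` variables, port from the host side of `-p 54545:5432`). An illustrative breakdown using only the standard library:

```python
from urllib.parse import urlsplit

# The connection string from the README section above
db_url = "postgresql://postgres:postgres@localhost:54545/postgres"

parts = urlsplit(db_url)
# username/password come from POSTGRES_USER/POSTGRES_PASSWORD,
# 54545 is the host side of the `-p 54545:5432` port mapping,
# and the trailing path names the database
print(parts.username, parts.password, parts.hostname, parts.port, parts.path)
```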
2 changes: 1 addition & 1 deletion india_forecast_app/__init__.py
@@ -1,2 +1,2 @@
"""India Forecast App"""
-__version__ = "0.1.0"
\ No newline at end of file
+__version__ = "0.1.0"
188 changes: 183 additions & 5 deletions india_forecast_app/app.py
@@ -1,10 +1,188 @@
"""
Main forecast app entrypoint
"""

import datetime as dt
import logging
import os
import sys

import click
import pandas as pd
from pvsite_datamodel import DatabaseConnection
from pvsite_datamodel.read import get_sites_by_country
from pvsite_datamodel.write import insert_forecast_values
from sqlalchemy.orm import Session

from .model import DummyModel

log = logging.getLogger(__name__)


def get_site_ids(db_session: Session) -> list[str]:
    """
    Gets all available site_ids in India
    Args:
        db_session: A SQLAlchemy session
    Returns:
        A list of site_ids
    """

    sites = get_sites_by_country(db_session, country="india")

    return [s.site_uuid for s in sites]


def get_model():
    """
    Instantiates and returns the forecast model ready for running inference
    Returns:
        A forecasting model
    """

    model = DummyModel()
    return model


def run_model(model, site_id: str, timestamp: dt.datetime):
    """
    Runs inference on model for the given site & timestamp
    Args:
        model: A forecasting model
        site_id: A specific site ID
        timestamp: timestamp to run a forecast for
    Returns:
        A forecast or None if model inference fails
    """

    try:
        forecast = model.predict(site_id=site_id, timestamp=timestamp)
    except Exception:
        log.error(
            f"Error while running model.predict for site_id={site_id}. Skipping",
            exc_info=True,
        )
        return None

    return forecast


def save_forecast(db_session: Session, forecast, write_to_db: bool):
    """
    Saves a forecast for a given site & timestamp
    Args:
        db_session: A SQLAlchemy session
        forecast: a forecast dict containing forecast meta and predicted values
        write_to_db: If true, forecast values are written to db, otherwise to stdout
    Raises:
        IOError: An error if database save fails
    """

    forecast_meta = {
        "site_uuid": forecast["meta"]["site_id"],
        "timestamp_utc": forecast["meta"]["timestamp"],
        "forecast_version": forecast["meta"]["version"],
    }
    forecast_values_df = pd.DataFrame(forecast["values"])
    forecast_values_df["horizon_minutes"] = (
        (forecast_values_df["start_utc"] - forecast_meta["timestamp_utc"])
        / pd.Timedelta("60s")
    ).astype("int")

    if write_to_db:
        insert_forecast_values(db_session, forecast_meta, forecast_values_df)
    else:
        log.info(
            f'site_id={forecast_meta["site_uuid"]}, \
            timestamp={forecast_meta["timestamp_utc"]}, \
            version={forecast_meta["forecast_version"]}, \
            forecast values={forecast_values_df.to_string()}'
        )


@click.command()
-@click.option("--site", help="Site ID")
-def app(site):
-    """Runs the forecast for a given site"""
-    print(f"Running forecast for site: {site}")
@click.option(
    "--date",
    "-d",
    "timestamp",
    type=click.DateTime(formats=["%Y-%m-%d-%H-%M"]),
    default=None,
    help='Date-time (UTC) at which we make the prediction. \
        Format should be YYYY-MM-DD-HH-mm. Defaults to "now".',
)
@click.option(
    "--write-to-db",
    is_flag=True,
    default=False,
    help="Set this flag to actually write the results to the database.",
)
@click.option(
    "--log-level",
    default="info",
    help="Set the python logging log level",
    show_default=True,
)
def app(timestamp: dt.datetime | None, write_to_db: bool, log_level: str):
    """
    Main function for running forecasts for sites in India
    """
    logging.basicConfig(stream=sys.stdout, level=getattr(logging, log_level.upper()))

    if timestamp is None:
        timestamp = dt.datetime.now(tz=dt.UTC)
        log.info('Timestamp omitted - will generate forecasts for "now"')
    else:
        # Ensure timestamp is UTC (datetime.replace returns a new object,
        # so the result must be assigned back)
        timestamp = timestamp.replace(tzinfo=dt.UTC)

    # 0. Initialise DB connection
    url = os.environ["DB_URL"]

    db_conn = DatabaseConnection(url, echo=False)

    with db_conn.get_session() as session:

        # 1. Get sites
        log.info("Getting sites...")
        site_ids = get_site_ids(session)
        log.info(f"Found {len(site_ids)} sites")

        # 2. Load model
        log.info("Loading model...")
        model = get_model()
        log.info("Loaded model")

        # 3. Run model for each site
        for site_id in site_ids:
            log.info(f"Running model for site={site_id}...")
            forecast_values = run_model(model=model, site_id=site_id, timestamp=timestamp)

            if forecast_values is None:
                log.info(f"No forecast values for site_id={site_id}")
            else:
                # 4. Write forecast to DB or stdout
                log.info(f"Writing forecast for site_id={site_id}")
                forecast = {
                    "meta": {
                        "site_id": site_id,
                        "version": model.version,
                        "timestamp": timestamp,
                    },
                    "values": forecast_values,
                }
                save_forecast(
                    session,
                    forecast=forecast,
                    write_to_db=write_to_db,
                )


if __name__ == "__main__":
-    app()
\ No newline at end of file
+    app()
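The `horizon_minutes` computation in `save_forecast` above divides the gap between each forecast window's `start_utc` and the forecast timestamp by one minute. The same arithmetic in miniature, with made-up timestamps (pandas assumed available, as it is in the app):

```python
import datetime as dt

import pandas as pd

# Hypothetical forecast time and three forecast windows
timestamp = dt.datetime(2024, 1, 25, 12, 0, tzinfo=dt.timezone.utc)
df = pd.DataFrame(
    {
        "start_utc": [
            timestamp,
            timestamp + dt.timedelta(minutes=15),
            timestamp + dt.timedelta(hours=1),
        ]
    }
)

# Same expression as in save_forecast: whole minutes between
# each window start and the time the forecast was made
df["horizon_minutes"] = ((df["start_utc"] - timestamp) / pd.Timedelta("60s")).astype("int")
print(df["horizon_minutes"].tolist())  # [0, 15, 60]
```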
