Formatting Data with Tests #25

siddharth7113 · 2025-01-02T05:39:02Z

Pull Request

Description

This pull request addresses the functionality required for:

Formatting NESO solar forecast data into OCF-compatible Forecast objects.

It tackles the following issues:

Fixes format into Forecast object #4

Key changes include:

Added format_to_forecast_sql to transform fetched forecast data into ForecastSQL objects.
Applied code formatting (Black)

How Has This Been Tested?

Tests were added for both:

Formatting function (format_to_forecast_sql).

The tests verify:

The correctness of data transformation and structure.

To reproduce:

Run pytest tests/test_format_forecast.py to validate the formatting logic.

Checklist:

My code follows [OCF's coding style guidelines](https://github.com/openclimatefix/.github/blob/main/coding_style.md).
I have performed a self-review of my own code.
I have made corresponding changes to the documentation.
I have added tests that prove my fix is effective or that my feature works.
I have checked my code and corrected any misspellings.

Added necessary dependencies (excluding `pandas`) to support project functionality, testing, and CI workflows.

- Added `format_to_forecast_sql` in `format_forecast.py` to convert NESO solar forecast data into a `ForecastSQL` object. - Fetches model metadata, location, and input data last updated timestamp. - Processes rows to create `ForecastValue` objects and aggregates them into a single `ForecastSQL` object. - Logs key steps for improved traceability and debugging. - Added `test_format_to_forecast_sql_real` in `test_format_forecast.py` to validate the functionality of `format_to_forecast_sql`. - Fetches real data using `fetch_data` and ensures it is correctly formatted into a `ForecastSQL` object. - Performs validation to ensure the number of `ForecastValue` entries matches the filtered DataFrame. - Added `conftest.py` to provide shared test fixtures for the test suite: - `postgres_container` fixture spins up a PostgreSQL container for isolated testing using `testcontainers`. - `db_session` fixture sets up a clean database schema and a test ML model for each test function. - `test_config` fixture provides shared configuration constants for tests. These changes establish a robust framework for processing and testing solar forecast data effectively.

…tion - Replaced hardcoded variables (`resource_id`, `limit`, `columns`, `rename_columns`, `sql_query`) with `test_config` fixture for better reusability and maintainability. - Updated all test cases (`test_fetch_data_api`, `test_fetch_data_sql`, `test_data_consistency`) to use `test_config` for dynamic parameterization. - Ensured SQL queries and API calls use values from `test_config`. - Improved code readability and consistency across test cases.

Ran Black on the codebase to standardize formatting for this PR. No functional changes were made.

siddharth7113 · 2025-01-02T05:48:13Z

Hi @peterdudfield ,

I’ve made the changes and added the formatting function, but I’m encountering some strange behavior. Initially, I created this PR feeling a bit stuck because the formatting test wasn’t passing on my local setup. Specifically, it seems to create 5 objects with 1 value each, instead of 1 object with 5 values.

However, for some reason, these tests are passing in CI, and I’m not sure why. I’m concerned this discrepancy could lead to issues in the future, so I wanted to give you a heads-up and ask for your guidance. I’ve been struggling with this for a few days, and any insights you have would be greatly appreciated!

peterdudfield · 2025-01-02T09:05:35Z

Thanks @siddharth7113 for all this. Ive had a quick loo at the PR and it looks good.

What do you mean creates 5 objects instead of 1? and the other way round?

siddharth7113 · 2025-01-02T09:31:48Z

Thanks @siddharth7113 for all this. Ive had a quick loo at the PR and it looks good.

What do you mean creates 5 objects instead of 1? and the other way round?

OK as in the issue you mentioned to create 1 ForecastSQL object which contains all the Forecast values and accordingly I wrote the test and code for it, but when I was running it on my local system for every df row it was getting from API , it was creating an object in DB, and it contained 1 ForecastValue for some reason I was always getting the Assertion Error which I wrote in the test to check, but when I opened the PR, the test was passing in the CI.

pyproject.toml

peterdudfield · 2025-01-02T09:52:18Z

Thanks @siddharth7113 for all this. Ive had a quick loo at the PR and it looks good.
What do you mean creates 5 objects instead of 1? and the other way round?

OK as in the issue you mentioned to create 1 ForecastSQL object which contains all the Forecast values and accordingly I wrote the test and code for it, but when I was running it on my local system for every df row it was getting from API , it was creating an object in DB, and it contained 1 ForecastValue for some reason I was always getting the Assertion Error which I wrote in the test to check, but when I opened the PR, the test was passing in the CI.

hmm, do you want to make sure you have no local changes, and CI and local tests should be the same.
Apart form that, the code looks ready to merge for me

siddharth7113 · 2025-01-02T10:27:27Z

Thanks @siddharth7113 for all this. Ive had a quick loo at the PR and it looks good.
What do you mean creates 5 objects instead of 1? and the other way round?

OK as in the issue you mentioned to create 1 ForecastSQL object which contains all the Forecast values and accordingly I wrote the test and code for it, but when I was running it on my local system for every df row it was getting from API , it was creating an object in DB, and it contained 1 ForecastValue for some reason I was always getting the Assertion Error which I wrote in the test to check, but when I opened the PR, the test was passing in the CI.

hmm, do you want to make sure you have no local changes, and CI and local tests should be the same. Apart form that, the code looks ready to merge for me

Yeah, can you give me few hours, to be completely sure that I am not missing anything, most probably it would some configuration issue, but I want to be 100% sure before we merge this PR..

peterdudfield · 2025-01-02T10:31:52Z

sure no problem, and no rush

- Updated `pyproject.toml` to ensure `nowcasting_datamodel` is restricted to version 1.5.56.

…lar-consumer into formatting

siddharth7113 · 2025-01-03T01:28:37Z

sure no problem, and no rush

I performed a clean build locally, and the tests are now passing, which resolves the previous issue. However, the current test failures seem to be caused by an API-side error. I wanted to propose modifying the tests to focus on unit testing by mocking the API data, as this approach would help avoid similar issues in the future. This seems like a good opportunity to make that change.

For now, we could proceed with merging this and open a new issue to address transitioning the tests to use mocked data.

peterdudfield · 2025-01-03T08:38:56Z

sure no problem, and no rush

I performed a clean build locally, and the tests are now passing, which resolves the previous issue. However, the current test failures seem to be caused by an API-side error. I wanted to propose modifying the tests to focus on unit testing by mocking the API data, as this approach would help avoid similar issues in the future. This seems like a good opportunity to make that change.

For now, we could proceed with merging this and open a new issue to address transitioning the tests to use mocked data.

Thats a good idea, you ok to make an issue for that?

siddharth7113 · 2025-01-03T09:10:16Z

sure no problem, and no rush

I performed a clean build locally, and the tests are now passing, which resolves the previous issue. However, the current test failures seem to be caused by an API-side error. I wanted to propose modifying the tests to focus on unit testing by mocking the API data, as this approach would help avoid similar issues in the future. This seems like a good opportunity to make that change.
For now, we could proceed with merging this and open a new issue to address transitioning the tests to use mocked data.

Thats a good idea, you ok to make an issue for that?

Sure, will make that and start working on it.

peterdudfield · 2025-01-03T11:03:35Z

@all-contributors please add @siddharth7113 for code

allcontributors · 2025-01-03T11:03:44Z

@peterdudfield

I've put up a pull request to add @siddharth7113! 🎉

siddharth7113 added 5 commits January 2, 2025 10:30

feat(dependencies): add required dependencies to pyproject.toml

5a32560

Added necessary dependencies (excluding `pandas`) to support project functionality, testing, and CI workflows.

style: apply Black formatting to ensure consistent code style

f4ef298

Ran Black on the codebase to standardize formatting for this PR. No functional changes were made.

Merge branch 'openclimatefix:main' into formatting

671f79f

peterdudfield reviewed Jan 2, 2025

View reviewed changes

pyproject.toml Outdated Show resolved Hide resolved

siddharth7113 added 2 commits January 3, 2025 06:28

fix: pin nowcasting_datamodel dependency to version 1.5.56

87d70d5

- Updated `pyproject.toml` to ensure `nowcasting_datamodel` is restricted to version 1.5.56.

Merge branch 'formatting' of https://github.com/siddharth7113/neso-so…

dd19b6a

…lar-consumer into formatting

peterdudfield closed this Jan 3, 2025

peterdudfield reopened this Jan 3, 2025

peterdudfield merged commit 49412c7 into openclimatefix:main Jan 3, 2025
2 of 4 checks passed

allcontributors bot mentioned this pull request Jan 3, 2025

docs: add siddharth7113 as a contributor for code #26

Merged

siddharth7113 deleted the formatting branch January 4, 2025 08:15

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Formatting Data with Tests #25

Formatting Data with Tests #25

siddharth7113 commented Jan 2, 2025

siddharth7113 commented Jan 2, 2025

peterdudfield commented Jan 2, 2025

siddharth7113 commented Jan 2, 2025

peterdudfield commented Jan 2, 2025

siddharth7113 commented Jan 2, 2025

peterdudfield commented Jan 2, 2025

siddharth7113 commented Jan 3, 2025

peterdudfield commented Jan 3, 2025

siddharth7113 commented Jan 3, 2025

peterdudfield commented Jan 3, 2025

allcontributors bot commented Jan 3, 2025

Formatting Data with Tests #25

Formatting Data with Tests #25

Conversation

siddharth7113 commented Jan 2, 2025

Pull Request

Description

How Has This Been Tested?

Checklist:

siddharth7113 commented Jan 2, 2025

peterdudfield commented Jan 2, 2025

siddharth7113 commented Jan 2, 2025

peterdudfield commented Jan 2, 2025

siddharth7113 commented Jan 2, 2025

peterdudfield commented Jan 2, 2025

siddharth7113 commented Jan 3, 2025

peterdudfield commented Jan 3, 2025

siddharth7113 commented Jan 3, 2025

peterdudfield commented Jan 3, 2025

allcontributors bot commented Jan 3, 2025