This repository has been archived by the owner on Sep 30, 2024. It is now read-only.
-
Notifications
You must be signed in to change notification settings - Fork 1
Restore the time series to the variable summaries #408
Merged
esheehan-gsl
merged 11 commits into
main
from
350-restore-the-time-series-to-the-variable-summaries
Oct 23, 2023
Merged
Restore the time series to the variable summaries #408
esheehan-gsl
merged 11 commits into
main
from
350-restore-the-time-series-to-the-variable-summaries
Oct 23, 2023
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Update the tests and the implementation for retrieving historical data. In the process, I simplified the response type. I think, in the future, if we want to select unadjusted O - F or just the historical observations, we should build that into the URL, rather than sending all of the data back to the client and letting the client pick what to view. This cuts down on data transfer and allows us to simplify our components when they read data, since they don't have to parse a nested JavaScript object.
Removed some of the dataclasses and functions that had been part of the history API which are no longer used, now that we have the Parquet implementation.
The diag.history function fails when there's a file:// prefix in the diag file path because apparently Pandas gets a little confused and tries to pass it to urllib instead of just opening the file.
This reverts commit 716a400.
Instead of deriving the Parquet file path from the Zarr path, the route reads in the FLASK_DIAG_PARQUET variable from the environment and passes that explicitly to diag.history. diag.history uses that to find the Parquet file for this model, and strips out `file://` if it's present because Pandas is a bit silly about that.
The charts really need to be refactored (in part) to eliminate these hard-coded property accesses, but for now we can just update the properties that the time series component expects so that we can include test out the time series data.
Instead of deriving the path to save parquet fixtures for tests, we use tmp_path, basically the same way we would for production.
It seems like pathlib.Path is having an issue with the `file://` protocol on our environment variables, although I don't think it used to. It's easy enough to strip this, although this may require more robust path/uri handling in our tests.
When we process diag files in our pipeline, we convert the integers in the diag files for is_used into booleans, so should treat them the same way in our test fixtures. I think this is a sign that our test fixtures are poorly set up, since they can get out of sync with reality. The result of this problem was that either the application worked, or our tests passed. Prior to this change, we needed to compare is_used to a boolean for Parquet files generated with our ETL code, but to an integer for test files generated with our fixtures.
ian-noaa
approved these changes
Oct 20, 2023
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good. I agree that it's disquieting that the test fixtures can drift from reality. An immediate solution doesn't come to mind. Though there is part of me that is a little surprised that Python's typing didn't catch this for us at some point.
esheehan-gsl
deleted the
350-restore-the-time-series-to-the-variable-summaries
branch
October 23, 2023 16:14
Sign up for free
to subscribe to this conversation on GitHub.
Already have an account?
Sign in.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Restore the time series charts to the diagnostics display using the Parquet files.