-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DAS-2236 - Updates to methods that calculate the points used to create the dimensions #18
base: DAS-2208-SMAP-L3
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I took a look, and I appreciate this PR being another more bite-sized piece of progress towards the end goal.
I think there are few themes to my comments:
- You haven't added sufficient unit tests to provide thorough coverage of your changes. In most cases, you've got some tests for new functions, but not all. The tests you've added are generally good, but often are only testing the happy path. It's pretty important to also test realistic edge-cases, which isn't really done in this PR.
- There are quite a lot of places where you are performing very similar tasks, but in some cases along a row, rather than up-and-down a column. Some of those tasks are basically the same thing (performing some operation on a 1-D slice of an array), so it would be cleaner to write a function to do something to a 1-D slice and then invoke it with the correct 1-D slice for either a row or a column.
- I think generally, there are a few places where the documentation strings for each function could do with a bit more descriptive text. This will be the number one location that future developers try to read to understand what the code is meant to do.
- General coding comments.
hoss/coordinate_utilities.py
Outdated
@@ -3,16 +3,16 @@ | |||
coordinate variable data to projected x/y dimension values | |||
""" | |||
|
|||
from typing import Dict |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: I think the better way to go is just to use the Python built-in dict
, much as you already do for list
and tuple
. That means you don't need to import typing.Dict
here. (When we don't have a big feature on the boil for HOSS, I'm going to go through and clean up the other modules that currently use typing.Dict
, typing.List
, etc, for consistency)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated. - 1c29221
hoss/coordinate_utilities.py
Outdated
@@ -101,20 +101,23 @@ def get_coordinate_variables( | |||
CF-Convention coordinates metadata attribute. It returns them in a specific | |||
order [latitude, longitude]" | |||
""" | |||
|
|||
coordinate_variables_list = varinfo.get_references_for_attribute( | |||
# varinfo returns a set and not a list |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you really need this comment? It seems a bit superfluous.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated - 1c29221
hoss/coordinate_utilities.py
Outdated
@@ -101,20 +101,23 @@ def get_coordinate_variables( | |||
CF-Convention coordinates metadata attribute. It returns them in a specific | |||
order [latitude, longitude]" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know this is a comment on the previously written documentation string, but this description strongly implies the return of this function is a list containing a single latitude and longitude variable. Instead it returns a list with two sub-lists (one each of latitude and longitude variable names).
Also,
This function returns latitude and longitude variables...
probably would be clearer as:
This function returns latitude and longitude variable names...
(E.g., it specifically returns the names of the variables, not the variables themselves)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done - 1c29221
for coordinate in coordinate_variables_list | ||
if varinfo.get_variable(coordinate).is_latitude() | ||
for coordinate in coordinate_variables | ||
if varinfo.get_variable(coordinate) is not None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Same comment for the similar change in the list comprehension for longitude_coordinate_variables
)
Where do you catch any issues if there is a reference to a variable in a coordinates
metadata attribute that isn't present in the file? If it's pointing to a latitude or longitude variable that isn't present in the file, at some point that's going to cause a failure when you try to access that variable. Is that gracefully handled somewhere?
(It's likely this is already nicely supported, but it's been a while since the previous PR and I just can't remember)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Separate comment - you have not added unit tests to prove this case is being handled as expected. (Needed both for latitude or longitude)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- this method is first called before prefetch. it should not fail if the variables don't exist.
- if it is present in the variable and not in the file - it will fail during prefetch. That is when it is first accessed from the file. - will check if I have a test on that
- There is a unit test for the function - will add subtests if the variables don't exist
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are unit tests to check for the above cases in line 205 of test_coordinate_utilities.py
https://github.com/nasa/harmony-opendap-subsetter/blob/57744a6298adf081d0bf06c9a6acd0b98cccf99c/tests/unit/test_coordinate_utilities.py#L205C1-L219C1
hoss/coordinate_utilities.py
Outdated
x_y_points = list(zip(points_x, points_y)) | ||
|
||
return x_y_points |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can avoid declaring a variable for a single immediate use and just do:
return list(zip(points_x, points))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done. - 1c29221
hoss/coordinate_utilities.py
Outdated
points: list[tuple], # list of data points as tuple in (lon, lat) order | ||
crs: CRS, # CRS object from PyProj | ||
) -> list[tuple]: # list of X-Y points as tuple in (X, Y) order |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is another place where the in-line comments should go in the documentation string.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done - 2f07d1a
"""Ensure that two valid lat/lon points returned by the method | ||
with a set of lat/lon coordinates as input | ||
|
||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice tests. I think there are a couple of additional missing test-cases, though:
- There is only a single valid pixel.
- There are no valid pixels.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Possibly also what happens when there are only valid points in a single row or single column (for geographic things).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated unit tests - a3680c5
returned by the method with a set of lat/lon coordinates as input | ||
|
||
""" | ||
with self.subTest('Get two sets of valid indices from coordinates dataset'): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar comments to a couple of other places. This unit test is only really testing the happy-path. You aren't really throwing any potential edge-cases at the function. Other test cases are:
- Only a single valid pixel.
- Only valid pixels in a single row.
- Only valid pixels in a single column.
- No valid pixels.
These seem like overkill, but it's really important to know how this code behaves in realistic edge-cases, so we know failure is graceful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated unit tests - 57744a6
@@ -328,7 +328,7 @@ def test_get_fill_slice(self): | |||
|
|||
@patch('hoss.dimension_utilities.add_bounds_variables') | |||
@patch('hoss.dimension_utilities.get_opendap_nc4') | |||
def test_prefetch_dimension_variables( | |||
def test_get_prefetch_variables( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that you've added conditional logic to the get_prefetch_variables
function, you need to add more test-cases for it. The get_prefetch_variables
function will need a subtest (or even separate tests if you prefer) that runs through each branch of code.
Right now, I think this test just ensures variables with dimensions, but not bounds, behave nicely. There's also test_prefetch_dimensions_with_bounds
lower down in the file that ensures variables with dimensions and bounds work well. But there's nothing invoking prefetch_variables
for requested variables with anonymous dimensions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated - 3b0a149
I'll check in after Owen has had a chance to review. I've reviewed my suggestions and updates made are ok. |
Description
Jira Issue ID
DAS-2236
Local Test Steps
http://localhost:3000/C1268452378-EEDTEST/ogc-api-coverages/1.0.0/collections/parameter_vars/coverage/rangeset?forceAsync=true&granuleId=G1268452388-EEDTEST&subset=lat(70%3A80)&subset=lon(-160%3A-150)&variable=Soil_Moisture_Retrieval_Data_AM%2Fstatic_water_body_fraction&skipPreview=true
and verify you can check the outputs in panoply
PR Acceptance Checklist
CHANGELOG.md
updated to include high level summary of PR changes.docker/service_version.txt
updated if publishing a release.