Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add tests for Xarray io functions #30

Merged
merged 21 commits into from
Dec 14, 2024
Merged

Add tests for Xarray io functions #30

merged 21 commits into from
Dec 14, 2024

Conversation

Jack-Hayes
Copy link
Member

Related: #22
Related: #23

This PR implements returning an xarray.core.dataset.Dataset raster for cop30 and worldcover datasets rather than the respective item metadata. There are pretty big issues with efficiency in terms of loading in these rasters for large AOIs. I haven't implemented any GDAL enhancements (OSGeo/gdal#11317) or addressed odc-stac for accepting the geodataframe/stac-geoparquet directly (I'm still using to_pystac_items here).

@Jack-Hayes Jack-Hayes requested a review from scottyhq November 27, 2024 19:29
@scottyhq
Copy link
Member

scottyhq commented Nov 29, 2024

Thanks for starting this @Jack-Hayes ! If adding a new run-time dependency (odc.stac) you'll want to add it here (with version constraints)

dependencies = [

And then also lower in that file (with "*") so that pixi recognizes it:

[tool.pixi.dependencies]

I moved the 'contributing' guide from the readme to this page in the docs recently https://coincident.readthedocs.io/en/latest/user_guide/contribute.html


# Used to access formatters
from pystac_client.item_search import ItemSearch as _ItemSearch
from shapely.validation import make_valid
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice. I didn't know about this function. Currently, the code just checks for ccw ordering

if (
not shapely_geometry.exterior.is_ccw
): # Apparently NASA CMR enforces polygon CCW order

Out of curiosity, were did you encounter an invalid geometry where this was needed?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A lot of flight extent polygons from the NOAA Costal LiDAR catalog are snake-like and non-contiguous which resulted 'invalid' polygons even after a simplify() with a tolerance of 0.01 (and I believe some of the really bad flight extent polygons were still invalid after simplify(0.01).make_valid(), so I buffered those by a small amount before searching for PCD sites).
There were also issues with simplify(0.01) making holes in some of the NEON flight extent polygons which were created by merging all of the tile extents in the flight, and make_valid() fixed some of these (see one of our Slack threads on this!). I ended up using a minor buffer on these as well for initial PCD site searches rather than simplify.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gotcha thanks! Could leave this for a separate PR, but if including, you could add one of these polygons to test (can add as a geojson in a 'tests/data' subfolder).

max(y for x, y in coordinates),
)
items = stac.to_pystac_items(gf)
ds = odc.stac.load(
Copy link
Member

@scottyhq scottyhq Dec 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will be slow for large areas because by default chunks=None. chunks='auto' chunks={} requires dask, but should to be much faster.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, great point 😊 I'm assuming we'll eventually add support for dask so I might as well incorporate it here?

@scottyhq scottyhq mentioned this pull request Dec 6, 2024
7 tasks
tests/test_xarray.py Outdated Show resolved Hide resolved
tests/test_xarray.py Outdated Show resolved Hide resolved
tests/test_xarray.py Outdated Show resolved Hide resolved
tests/test_xarray.py Outdated Show resolved Hide resolved
tests/test_xarray.py Outdated Show resolved Hide resolved
tests/test_xarray.py Outdated Show resolved Hide resolved
tests/test_xarray.py Outdated Show resolved Hide resolved
@scottyhq scottyhq changed the title Implement xarray dataset raster return for cop30 and worldcover datasets Add tests for Xarray io functions Dec 14, 2024
@scottyhq scottyhq merged commit f8b0973 into main Dec 14, 2024
11 checks passed
@scottyhq scottyhq deleted the hayes-dev branch December 14, 2024 13:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants