List of beam PTransforms to implement to recreate XarrayZarrRecipe #376
Comments
Re: |
One implementation note, related to A sister team of mine has recently had great success using SDFs to read GFS data, and I'm starting to look into them in my own projects (google/weather-tools#189).
When the Beam refactor is complete (#256), the Pangeo Forge public API will look quite different. Broadly, we will export three main things:
A sketch of how 2 and 3 might look is already underway in #375.
Let's use this issue to brainstorm all of the PTransforms we imagine we will need to implement to reach feature parity with XarrayZarrRecipe:
- FilePatternSource: Right now we are just using `beam.Create(pattern.items())` to start our pipelines. It might be better to follow the docs for Custom I/O connectors.
- OpenWithFSSpec: Turns whatever comes out of the FilePattern into an FSSpec OpenFile object. Implemented in Improved Beam Opener PTransforms #375.
- OpenWithXarray: Turns an OpenFile or URL PCollection into an Xarray dataset PCollection (without loading into memory if possible). Implemented in Improved Beam Opener PTransforms #375.
- InferXarraySchema: Takes a collection of Xarray datasets and figures out a schema for the target dataset. Implemented in Schema aggregation #377.
- PrepareZarrTarget: Takes that schema and uses it to initialize the Zarr target. This is a singleton operation (single-item PCollection). Can we make that explicit in Beam? Implemented in Initialize target #379.
- RechunkForTarget: Takes the Xarray Dataset PCollection, plus a specification of the target chunks, and returns a new PCollection that is evenly aligned with the target chunks.
- ToZarr: Here we could possibly just use xarray-beam's ChunksToZarr. In progress in Zarr fragment writers #391.
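To make the RechunkForTarget idea concrete, here is a minimal pure-Python sketch of the alignment arithmetic along a single dimension. It is not the Pangeo Forge implementation: the function name `split_by_target_chunks` and its parameters are hypothetical, and it assumes each input fragment covers a contiguous half-open index range `[start, stop)`. The idea is that each fragment gets cut at target chunk boundaries and every piece is tagged with the index of the target chunk it belongs to; in a Beam pipeline those tags could serve as keys for a grouping step that reassembles whole target chunks.

```python
from typing import Iterator, Tuple


def split_by_target_chunks(
    start: int, stop: int, target_chunk: int
) -> Iterator[Tuple[int, slice]]:
    """Cut the index range [start, stop) at target chunk boundaries.

    Hypothetical helper for illustration only. Yields pairs of
    (target_chunk_index, slice), where each slice lies entirely
    inside one target chunk of size `target_chunk`.
    """
    if target_chunk <= 0:
        raise ValueError("target_chunk must be positive")
    if stop < start:
        raise ValueError("stop must be >= start")

    pos = start
    while pos < stop:
        # Index of the target chunk that contains position `pos`.
        index = pos // target_chunk
        # First position belonging to the next target chunk.
        chunk_end = (index + 1) * target_chunk
        piece_stop = min(stop, chunk_end)
        yield index, slice(pos, piece_stop)
        pos = piece_stop


# A source fragment covering indices 5..22, with target chunks of size 10,
# is cut into three pieces that fall in target chunks 0, 1 and 2.
pieces = list(split_by_target_chunks(start=5, stop=23, target_chunk=10))
for chunk_index, piece in pieces:
    print(chunk_index, piece)
```

Pieces that share a target chunk index, possibly coming from different input files, would then need to be grouped and concatenated before writing, which is one reason the target chunk specification has to be known before this step runs.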