add GRIB2ReferenceRecipe #387
Comments
Or what about a generic
This seems pretty useful; I've been playing around with this during the evening, and it looks like fsspec/kerchunk#198 would unblock writing recipes that read from remotely hosted GRIB files. Directly translating the example HRRR concatenation into a recipe here would be a really cool addition. Happy to play around with this and propose a refactor for
A quick update - I was able to hack out a minimal working example of using the existing In a nutshell, it was pretty straightforward to modify the existing codepaths for scanning GRIB files. A major departure from the HDF formulation is that each GRIB message gets its own reference, so we need a more elegant way to combine these so that a single GRIB file maps to a single Zarr file. I opted just to flatten all the messages, assuming metadata about coords remains constant, but there are other places this could be fixed, namely in the The other big change would be the actual interface for For a quick demo, here's a simple gist which subsets from a HRRR forecast and concatenates the output. This workflow is vastly superior to most others that I've seen for extracting useful data from the online HRRR archive - it's fast, easy to modify, and very concise thanks to the great mechanics inside pangeo-forge.
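To make the flattening idea above concrete, here is a minimal sketch. It assumes each scanned GRIB message yields a kerchunk-style reference set, i.e. a dict with a `refs` mapping from Zarr keys to either metadata strings or `[url, offset, length]` triples. The helper name `flatten_message_refs`, the file name, the variable names, and the byte ranges are all hypothetical; this is not the code from the gist or from pangeo-forge-recipes.

```python
from typing import Any


def flatten_message_refs(message_refs: list[dict[str, Any]]) -> dict[str, Any]:
    """Merge per-message reference sets into one (hypothetical helper).

    Keys shared between messages (e.g. coordinate metadata) must be identical,
    mirroring the assumption that coord metadata stays constant across messages.
    """
    merged: dict[str, Any] = {}
    for msg in message_refs:
        for key, value in msg["refs"].items():
            # Refuse to silently overwrite a key with a different value.
            if key in merged and merged[key] != value:
                raise ValueError(f"conflicting reference for key {key!r}")
            merged[key] = value
    return {"version": 1, "refs": merged}


# Toy input: two "messages" from the same made-up file, sharing one coordinate.
messages = [
    {"version": 1, "refs": {
        ".zgroup": '{"zarr_format": 2}',
        "latitude/0.0": ["example.grib2", 0, 100],
        "t2m/0.0": ["example.grib2", 100, 500],
    }},
    {"version": 1, "refs": {
        ".zgroup": '{"zarr_format": 2}',
        "latitude/0.0": ["example.grib2", 0, 100],
        "u10/0.0": ["example.grib2", 600, 500],
    }},
]

combined = flatten_message_refs(messages)
print(sorted(combined["refs"]))
```

A plain merge like this only holds up while shared keys agree. Messages that reuse a variable name at a different level or forecast step would collide and raise here, which is presumably where a proper combine step is needed instead.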
Another quick update - I started refactoring a generic
Thanks for working on this @darothen! For reference, we are in the process of completely refactoring the internals of pangeo forge recipes. (See #376 and #391 for latest progress.) That is happening on the That said, I'm 👍 on doing whatever is necessary to make GRIB work with the current code.
No worries @rabernat. This project is really just an excuse for me to get much more comfortable with the internals of
Just a quick note to say #486 will resolve this issue. 🎉 This integration test, which is directly modeled on @darothen's gist linked in #387 (comment), is passing in CI and seems to confirm that at least a minimally-functional version of the GRIB2 case will soon be supported in our
This issue, identified by Daniel above, was indeed the crux of the matter. We've addressed it in #486 with the

    with beam.Pipeline(options=options) as p:
        (
            p
            | beam.Create(pattern.items())
            | OpenWithKerchunk(...)
            | CombineReferences(..., precombine_inputs=True)
            | WriteCombinedReference(...)
        )

When I am certainly a GRIB (and kerchunk) neophyte, so very much welcome feedback on this from Daniel, @rsignell-usgs, or others with deeper domain knowledge. Just now cleaning up #486 with hopes to merge it later today. Assuming you may not get a chance to take a look today, I'd love to follow up post-merge to get both of you to test drive 🚗 💨 these new transforms.
Resolved by #486. See this file for the merged version of the integration test mentioned in previous comment. Here's an excerpt demonstrating the GRIB2 -> kerchunk pipeline there:
As mentioned elsewhere, would love feedback on this from anyone working with GRIB2, in the form of issues, bug reports, etc.
Hey @cisaacstern thanks for the awesome update here! It sounds like the solution you landed on is a great path forward. I'll try to test-drive this with one of my personal data processing pipelines. Not sure when I'll find the time, but I'll try to report back here with results if I can squeeze in the work some evening this week.
We have HDFReferenceRecipe, but kerchunk also handles GRIB2. Perhaps GRIB2ReferenceRecipe?