Nothing about this is specific to GRIB #2

TomNicholas · 2025-01-10T21:05:07Z

Great write up, really interesting. I just want to give my 2 cents:

Nothing about the problems described is specific to GRIB.

Once you know where the bytes in each chunk are and how to decode them, every problem you have described applies to any chunked multi-dimensional array dataset.

Therefore, any solution to this should not be specific to GRIB at all! The only part that needs to be specific to GRIB is the part that determines the locations of the bytes and how to decode them. Every step after that (and I agree with a lot of your ideas) should not have anything to do with GRIB specifically. All your cool caching layer ideas can be implemented entirely in the language of zarr / icechunk / seamless arrays.
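To make that split concrete, here is a minimal sketch of the idea, not how VirtualiZarr or icechunk actually represent things. The names `ChunkRef`, `Manifest`, `read_chunk`, `fetch_from_memory`, and `decode_float32` are all hypothetical, and the "file" is just bytes held in memory. The format-specific work is limited to building the manifest and supplying `decode`; everything else only sees (path, offset, length) triples.

```python
import struct
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass(frozen=True)
class ChunkRef:
    """Where one chunk's bytes live (hypothetical type, for illustration)."""
    path: str
    offset: int
    length: int


# Format-agnostic manifest: chunk grid index -> byte location.
Manifest = Dict[Tuple[int, ...], ChunkRef]


def read_chunk(
    manifest: Manifest,
    key: Tuple[int, ...],
    fetch: Callable[[str, int, int], bytes],
    decode: Callable[[bytes], Tuple[float, ...]],
) -> Tuple[float, ...]:
    """Generic read path: look up the byte range, fetch it, decode it."""
    ref = manifest[key]
    return decode(fetch(ref.path, ref.offset, ref.length))


# Toy stand-ins. In a real system, `fetch` would issue a byte-range request
# and `decode` would be the only format-specific piece (e.g. a GRIB decoder).
FAKE_STORE = {"a.bin": struct.pack("<8f", *(float(i) for i in range(8)))}


def fetch_from_memory(path: str, offset: int, length: int) -> bytes:
    return FAKE_STORE[path][offset:offset + length]


def decode_float32(raw: bytes) -> Tuple[float, ...]:
    # Assumes the chunk is a run of little-endian float32 values.
    return struct.unpack(f"<{len(raw) // 4}f", raw)


manifest: Manifest = {
    (0,): ChunkRef("a.bin", 0, 16),
    (1,): ChunkRef("a.bin", 16, 16),
}
second_chunk = read_chunk(manifest, (1,), fetch_from_memory, decode_float32)
```

Swapping in a different file format would mean changing only how `manifest` is built and which `decode` is passed in; `read_chunk` stays the same.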

One way to do this would be to write a GRIB reader for VirtualiZarr. That's the only GRIB-specific step. This just brings us back to zarr-developers/VirtualiZarr#238, and thus finishes my attempt to nerd-snipe you :)

JackKelly · 2025-01-13T12:26:30Z

> Great write up, really interesting

Thanks! I'm really glad you like it!

> Nothing about the problems described is specific to GRIB.

I agree 100%! (Sorry, I should've written more about this in the text!)

Some of this text was originally in the hypergrib repo. But, as you say, it's clear that a lot of this isn't specific to GRIB. So I (clumsily) created this new repo to draw a distinction between GRIB-specific things (like hypergrib) and ideas that are more general.

I definitely agree that the caching should be a separate project! (And you could imagine an MVP caching system being pretty simple: maybe just ~100 lines of Python, connecting together existing tools).
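As a rough illustration of how small such an MVP could be, here is a sketch of an in-memory LRU cache that wraps any byte-range fetch function. It is an assumption about one possible shape, not the design discussed in the repo: `ByteRangeCache`, `slow_fetch`, and the `(path, offset, length)` signature are all hypothetical, and a real version would more likely wire together existing tools than hand-roll the eviction logic.

```python
from collections import OrderedDict
from typing import Callable, Tuple

Fetch = Callable[[str, int, int], bytes]


class ByteRangeCache:
    """LRU cache keyed on (path, offset, length); hypothetical sketch."""

    def __init__(self, fetch: Fetch, max_bytes: int) -> None:
        self._fetch = fetch
        self._max_bytes = max_bytes
        self._entries: "OrderedDict[Tuple[str, int, int], bytes]" = OrderedDict()
        self._size = 0
        self.hits = 0
        self.misses = 0

    def __call__(self, path: str, offset: int, length: int) -> bytes:
        key = (path, offset, length)
        if key in self._entries:
            # Mark as most recently used and serve from memory.
            self._entries.move_to_end(key)
            self.hits += 1
            return self._entries[key]
        self.misses += 1
        data = self._fetch(path, offset, length)
        self._entries[key] = data
        self._size += len(data)
        # Evict least recently used entries until we are back under budget.
        while self._size > self._max_bytes and self._entries:
            _, evicted = self._entries.popitem(last=False)
            self._size -= len(evicted)
        return data


# Toy backing store standing in for a slow object-store request.
BLOB = bytes(range(32))


def slow_fetch(path: str, offset: int, length: int) -> bytes:
    return BLOB[offset:offset + length]


cached = ByteRangeCache(slow_fetch, max_bytes=16)
first = cached("a.bin", 0, 8)   # miss: goes to the backing store
again = cached("a.bin", 0, 8)   # hit: served from the cache
```

Because the cache has the same call signature as the function it wraps, it can be dropped in wherever a plain fetch function is expected, and it knows nothing about GRIB or any other file format.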

I'm deliberately keeping hypergrib as a GRIB-specific thing (for now, at least) if only because my brain isn't capable of designing a general-purpose thing until I've built several special-purpose things! And because I want to see how fast we can go if we "cheat" and create a special-purpose multi-file GRIB reader. Although I'm optimistic that, over time, we can make it more and more general (whilst maintaining performance!).

JackKelly added a commit that referenced this issue Jan 13, 2025

State that the plan is to be more general purpose than just GRIB. #2

29d4cfe
