Write script to convert `.nat` EUMETSAT files to Zarr intermediate #14
@jacobbieker and @peterdudfield what do you guys think about keeping the script to convert EUMETSAT
Yeah, I agree, keeping as much of the satellite-specific stuff in Satip makes sense. I generally like keeping the packages as separate as possible.
On the topic of compression, it might be worth checking out zfp. I haven't tried it! Definitely not that important though! zstd is probably fine!
Features of this script
- `.nat` files; and target directory for the Zarr.
- `.nat` data: When the script starts, it checks through all the `.nat` files (recursively), and checks through the existing Zarr, and only converts data which is present in the `.nat` files but absent in the Zarr. I think you can append to Zarr stores using something like `xr.Dataset.to_zarr(mode='a', append_dim='time')`. Definitely have a look at the xarray docs on appending to Zarr. It's possible that appending to Zarr only works correctly if data is appended in order, but I'm not certain! (Zarr's fragility when it comes to appending data might be one strong argument for swapping to using GeoTIFF or individual NetCDF files per EUMETSAT timestep, instead of Zarr... But let's try to get Zarr to work because it does seem to enable the fastest reads.)
- `int16`, using only 10 bits per pixel per channel, i.e., re-scale each channel to [0, 1023], and save in `np.int16` dtype. This results in really good compression (better than using `float16`), and is probably more precise (see the raw benchmark results here. I benchmarked a bunch of compression algorithms. `compressor = numcodecs.Blosc(cname="zstd", clevel=5)` was the best setting I found. If we want to be really ambitious we could try compressing with a lossless, modern image compression algorithm like AVIF or WebP. Some more notes about these options in Benchmark candidate intermediate file formats for EUMETSAT data #13. But, for now, `zstd` is probably fine.)

Related:
- `.nat` files to NetCDF