When calling `to_dask` with caching as in pangeo-data/pangeo-datastore#113, `fsspec.open_local` first downloads the whole dataset and only then opens the data in `xarray`, still with chunks, but after the time has already been spent on downloading.
Is there a way to circumvent this in `intake-xarray`, or is this a consequence of `fsspec` caching that cannot be changed from `intake-xarray`?
It would be great to just call `to_dask()` without spending the time to download, and only cache when `xarray` runs `compute`.
Whilst this may be possible, it would be tricky. Dask wants to open the file to assess the chunking; in theory, that could be done on the original file, with the file only being cached when the data is actually loaded. There is a block-wise cacher in fsspec, which only downloads the parts of a file that are accessed, as they are accessed, but that only works with a library that expects Python file-like objects (i.e., there's a reason to call `open_local`: the library wants a real local file). You could do something with FUSE, where the file looks real to the OS but uses block-wise chunking internally. I'm pretty sure this kind of thing has never been tried.
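To make the block-wise idea concrete, here is a minimal standard-library sketch of a read-only file-like object that fetches fixed-size blocks only when they are read. It is not fsspec's implementation: `BlockCachedFile`, `fetch_range`, and the in-memory `remote` bytes are hypothetical names standing in for a real remote store.

```python
class BlockCachedFile:
    """Hypothetical read-only file-like object that fetches blocks on demand."""

    def __init__(self, fetch_range, size, block_size=10):
        self._fetch_range = fetch_range  # callable(start, end) -> bytes
        self.size = size
        self.block_size = block_size
        self._blocks = {}  # block index -> cached bytes
        self._pos = 0

    def _block(self, index):
        # Fetch a block from the "remote" only the first time it is needed.
        if index not in self._blocks:
            start = index * self.block_size
            end = min(start + self.block_size, self.size)
            self._blocks[index] = self._fetch_range(start, end)
        return self._blocks[index]

    def seek(self, offset, whence=0):
        base = {0: 0, 1: self._pos, 2: self.size}[whence]
        self._pos = max(0, min(self.size, base + offset))
        return self._pos

    def tell(self):
        return self._pos

    def read(self, n=-1):
        start = self._pos
        end = self.size if n < 0 else min(start + n, self.size)
        if start >= end:
            return b""
        first = start // self.block_size
        last = (end - 1) // self.block_size
        data = b"".join(self._block(i) for i in range(first, last + 1))
        offset = start - first * self.block_size
        self._pos = end
        return data[offset:offset + (end - start)]


# Stand-in for a remote file: 100 bytes held in memory (an assumption
# made purely for illustration).
remote = bytes(range(100))
fetched = []  # records every range requested from the "remote"


def fetch_range(start, end):
    fetched.append((start, end))
    return remote[start:end]


f = BlockCachedFile(fetch_range, len(remote), block_size=10)
f.seek(42)
chunk = f.read(5)   # touches only the block covering bytes 40..49
f.seek(45)
again = f.read(3)   # served from the cached block, no new fetch
```

This works only because the consumer calls `read` and `seek` on a Python object. A library that insists on a path to a real local file cannot use such an object, which is why `open_local` has to download everything first.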