[Feature]: Define partial chunk shape for GenericDataChunkIterator #995
Labels: category: enhancement, priority: medium
What would you like to see added to HDMF?
Right now, for the `GenericDataChunkIterator`, it's possible to define `chunk_mb` or `chunk_shape`. I would like to enable a hybrid approach, where a user could input `chunk_mb=10.0, chunk_shape=(None, 64)`, and the `GenericDataChunkIterator` would identify the remaining dimension that gets the chunk close to the target size.
Is your feature request related to a problem?
It is pretty common for users to have some insight into the likely read patterns of a dataset.
What solution would you like?
I would like `GenericDataChunkIterator` to find the maximum chunk size (product of dimensions) that is <= the target size. I would also like the chunk to be as cube-like as possible, so I would minimize the sum of the chunk's dimensions. Previously, we tried building chunks that were scaled-down versions of the data shape, similar to h5py, but experience with Jeremy has shown that this approach is poorly suited for common data-reading routines, and I think a better naive assumption is that (hyper-)cube chunks are a good default.
Do you have any interest in helping implement the feature?
Yes.
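To make the proposal concrete, here is a minimal sketch of how the hybrid mode could fill in the `None` axes. The `fill_chunk_shape` helper and its signature are hypothetical, not part of HDMF; it assumes a MiB-based target and gives every unspecified axis the same edge length, capped by the data extent:

```python
import math

def fill_chunk_shape(data_shape, chunk_shape, chunk_mb, itemsize):
    # Hypothetical sketch: fill in the None axes of a partially specified
    # chunk_shape so the chunk stays at or under chunk_mb while the free
    # axes are as cube-like (equal edge length) as possible.
    target_elements = int(chunk_mb * 1024**2 / itemsize)
    fixed = math.prod(d for d in chunk_shape if d is not None)
    free_axes = [i for i, d in enumerate(chunk_shape) if d is None]
    # Element budget left over for the unspecified axes
    budget = max(target_elements // fixed, 1)
    # Equal edge length per free axis, capped below by 1
    edge = max(int(budget ** (1 / len(free_axes))), 1)
    filled = list(chunk_shape)
    for i in free_axes:
        filled[i] = min(edge, data_shape[i])
    return tuple(filled)

# e.g. chunk_mb=10.0, chunk_shape=(None, 64) for int16 data
print(fill_chunk_shape((1_000_000, 64), (None, 64), chunk_mb=10.0, itemsize=2))
# → (81920, 64): 81920 * 64 * 2 bytes is exactly 10 MiB
```

With two free axes, e.g. `chunk_shape=(None, None, 64)`, the same budget is split evenly, yielding a square 286 x 286 slab per 64-sample block, which is the cube-like behavior described above.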