(feat): read_elem_as_dask method #1469

Merged on Jul 23, 2024 (160 commits)
Commits
d111f04
(feat): `read_elem_lazy` method
ilan-gold Apr 11, 2024
00be7f0
(revert): error message
ilan-gold Apr 11, 2024
fd635d7
(refactor): declare `is_csc` reading elem directly in h5
ilan-gold Apr 11, 2024
f5e7fda
(chore): `read_elem_lazy` -> `read_elem_as_dask`
ilan-gold Apr 12, 2024
ae5396c
(chore): remove string handling
ilan-gold Apr 12, 2024
664336a
(refactor): use `elem` for h5 where posssble
ilan-gold Apr 12, 2024
2370215
Merge branch 'main' into ig/read_dask_elem
ilan-gold Apr 17, 2024
52002b6
(chore): remove invlaud syntax
ilan-gold Apr 17, 2024
5ab1ad1
Merge branch 'ig/read_dask_elem' of github.com:scverse/anndata into i…
ilan-gold Apr 17, 2024
aa1006e
(fix): put dask import inside function
ilan-gold Apr 17, 2024
dda7d83
(refactor): try maybe open?
ilan-gold Apr 17, 2024
fd418f0
Merge branch 'main' into ig/read_dask_elem
ilan-gold May 27, 2024
23b0bfd
Merge branch 'main' into ig/read_dask_elem
ilan-gold May 27, 2024
97b8031
Merge branch 'main' into ig/read_dask_elem
ilan-gold Jun 3, 2024
1fc4cc3
(fix): revert `encoding-version`
ilan-gold Jun 3, 2024
5ca71ea
(chore): document `create_sparse_store` test function
ilan-gold Jun 3, 2024
3672c18
(chore): sort indices to prevent warning
ilan-gold Jun 3, 2024
33c3599
(fix): remove utility function `make_dask_array`
ilan-gold Jun 3, 2024
157e710
(chore): `read_sparse_as_dask_h5` -> `read_sparse_as_dask`
ilan-gold Jun 3, 2024
375000d
(feat): make params of `h5_chunks` and `stride`
ilan-gold Jun 3, 2024
241904a
(chore): add distributed test
ilan-gold Jun 3, 2024
42d0d22
(fix): `TypeVar` bind
ilan-gold Jun 3, 2024
0bba2c0
(chore): release note
ilan-gold Jun 4, 2024
0d0b43a
(chore): `0.10.8` -> `0.11.0`
ilan-gold Jun 5, 2024
762d4c6
Merge branch 'main' into ig/read_dask_elem
ilan-gold Jun 26, 2024
c935fe0
(fix): `ruff` for default `pytest.fixture` `scope`
ilan-gold Jun 26, 2024
23e0ea2
Apply suggestions from code review
ilan-gold Jul 1, 2024
5b96c77
(fix): `Any` to `DaskArray`
ilan-gold Jul 1, 2024
0907a4e
(fix): type `make_index` + fix undeclared
ilan-gold Jul 1, 2024
20ced16
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Jul 1, 2024
36ae8f2
Merge branch 'main' into ig/read_dask_elem
ilan-gold Jul 1, 2024
bb6607e
fix rest
flying-sheep Jul 1, 2024
419691b
(fix): use `chunks` kwarg
ilan-gold Jul 2, 2024
a23df34
Merge branch 'main' into ig/read_dask_elem
ilan-gold Jul 2, 2024
fd2376a
(feat): expose `chunks` as an option to `read_elem_as_dask` via `data…
ilan-gold Jul 2, 2024
ae723d0
Merge branch 'ig/read_dask_elem' of github.com:scverse/anndata into i…
ilan-gold Jul 2, 2024
42b1093
(fix): `test_read_dispatched_null_case` test
ilan-gold Jul 2, 2024
78de057
(fix): disallowed spread syntax?
ilan-gold Jul 2, 2024
717b997
(refactor): reuse `compute_chunk_layout_for_axis_shape` functionality
ilan-gold Jul 2, 2024
2b86293
(fix): remove unneeded `slice` arguments
ilan-gold Jul 3, 2024
8d5a9df
(fix): revert message
ilan-gold Jul 3, 2024
449fc1a
(refactor): `make_index` -> `make_block_indexer`
ilan-gold Jul 3, 2024
1522de3
(fix): export from `experimental`
ilan-gold Jul 3, 2024
71c150d
(fix): `callback` signature for `test_read_dispatched_null_case
ilan-gold Jul 3, 2024
b441366
(chore): `get_elem_name` helper
ilan-gold Jul 3, 2024
0307a1d
(chore): use `H5Group` consistently
ilan-gold Jul 3, 2024
ee075cd
(refactor): make `chunks` public facing API instead of `dataset_kwargs`
ilan-gold Jul 3, 2024
89acec4
(fix): regsiter for group not array
ilan-gold Jul 3, 2024
48b7630
(chore): add warning test
ilan-gold Jul 3, 2024
8712582
(chore): make arg order consistent
ilan-gold Jul 3, 2024
cda8aa7
(feat): add `callback` typing for `read_dispatched`
ilan-gold Jul 5, 2024
e8f62f4
(chore): use `npt.NDArray`
ilan-gold Jul 5, 2024
f6e48ac
(fix): remove uneceesary union
ilan-gold Jul 5, 2024
4de3246
(chore): release note
ilan-gold Jul 5, 2024
ba817e0
(fix); try protocol docs
ilan-gold Jul 5, 2024
438d28d
(feat): create `InMemoryElem` + `DictElemType` to remove `Any`
ilan-gold Jul 5, 2024
296ea3f
(chore): refactor `DictElemType` -> `InMemoryArrayOrScalarType` for r…
ilan-gold Jul 5, 2024
cf13a57
(fix): use `Union`
ilan-gold Jul 5, 2024
d02ba49
(fix): more `Union`
ilan-gold Jul 5, 2024
6970a97
(refactor): `InMemoryElem` -> `InMemoryReadElem`
ilan-gold Jul 5, 2024
2282351
(chore): add needed types to public export + docs fix
ilan-gold Jul 5, 2024
810cd0a
Merge branch 'main' into ig/read_dask_elem
flying-sheep Jul 8, 2024
a996081
(chore): type `write_elem` functions
ilan-gold Jul 8, 2024
f6e457b
(chore): create `write_callback` protocol
ilan-gold Jul 8, 2024
a0b4057
Merge branch 'main' into ig/protocol_for_callback
ilan-gold Jul 8, 2024
4416526
(chore): export + docs
ilan-gold Jul 8, 2024
fbe44f0
(fix): add string descriptions
ilan-gold Jul 8, 2024
8c1f01d
(fix): try sphinx protocol doc
ilan-gold Jul 8, 2024
a7d412a
(fix): try ignoring exports
ilan-gold Jul 8, 2024
4d56396
(fix): remap callback internal usages
ilan-gold Jul 8, 2024
2012ee5
(fix): add docstring
ilan-gold Jul 8, 2024
f65f065
Discard changes to pyproject.toml
flying-sheep Jul 9, 2024
8f6ea49
re-add dep
flying-sheep Jul 9, 2024
155a21e
Fix docs
flying-sheep Jul 9, 2024
daae3e5
Almost works
flying-sheep Jul 9, 2024
c415ae4
works!
flying-sheep Jul 9, 2024
00010b8
(chore): use pascal-case
ilan-gold Jul 9, 2024
0bd87fc
(feat): type read/write funcs in callback
ilan-gold Jul 9, 2024
5997678
(fix): use generic for `Read` as well.
ilan-gold Jul 9, 2024
f208332
(fix): need more aliases
ilan-gold Jul 9, 2024
eb69fcb
Split table, format
flying-sheep Jul 9, 2024
477bbef
(refactor): move to `_types` file
ilan-gold Jul 9, 2024
103cad6
Merge branch 'ig/protocol_for_callback' of github.com:scverse/anndata…
ilan-gold Jul 9, 2024
8d23f6f
bump scanpydoc
flying-sheep Jul 9, 2024
9b647c2
Some basic syntax fixes
flying-sheep Jul 9, 2024
d6d01bc
Merge branch 'ig/protocol_for_callback' into ig/read_dask_elem
ilan-gold Jul 9, 2024
5ef93e1
(fix): change `Read{Callback}` type for kwargs
ilan-gold Jul 9, 2024
9cfe908
(chore): test `chunks `argument
ilan-gold Jul 9, 2024
99fc6db
(fix): type `read_recarray`
ilan-gold Jul 9, 2024
b5bccc3
(fix): `GroupyStorageType` not `StorageType`
ilan-gold Jul 9, 2024
e5ea2b0
(fix): little type fixes
ilan-gold Jul 9, 2024
6ac72d6
(fix): clarify `H5File` typing
ilan-gold Jul 9, 2024
989dc65
(fix): dask doc
ilan-gold Jul 9, 2024
36b0207
(fix): dask docs
ilan-gold Jul 9, 2024
dadfb4d
Merge branch 'ig/protocol_for_callback' into ig/read_dask_elem
ilan-gold Jul 9, 2024
ca6cf66
(fix): typing
ilan-gold Jul 9, 2024
eabaf35
(fix): handle case when `chunks` is `None`
ilan-gold Jul 9, 2024
4c398c3
(feat): add string-array reading
ilan-gold Jul 9, 2024
d6fc8a4
(fix): remove `string-array` because it is not tested
ilan-gold Jul 9, 2024
33aebb2
(refactor): clean up tests
ilan-gold Jul 10, 2024
701cd85
(fix): overfetching problem
ilan-gold Jul 10, 2024
43b21a2
Fix circular import
flying-sheep Jul 11, 2024
0e22449
add some typing
flying-sheep Jul 11, 2024
ec546f4
fix mapping types
flying-sheep Jul 11, 2024
7c2e4da
Fix Read/Write
flying-sheep Jul 11, 2024
1ba5b99
Fix one more
flying-sheep Jul 11, 2024
49c0d49
unify names
flying-sheep Jul 11, 2024
3666735
claift ReadCallback signature
flying-sheep Jul 11, 2024
3a332ad
Fix type aliases
flying-sheep Jul 11, 2024
d0f4d13
(fix): clean up typing to use `RWAble`
ilan-gold Jul 11, 2024
6e89e14
Merge branch 'main' into ig/protocol_for_callback
ilan-gold Jul 11, 2024
ea29cfa
(fix): use `Union`
ilan-gold Jul 11, 2024
f4ff236
(fix): add qualname override
ilan-gold Jul 11, 2024
f50b286
(fix): ignore dask and masked array
ilan-gold Jul 11, 2024
712e085
(fix): ignore erroneous class warning
ilan-gold Jul 11, 2024
24dd18b
(fix): upgrade `scanpydoc`
ilan-gold Jul 11, 2024
79d3fdc
(fix): use `MutableMapping` instead of `dict` due to broken docstring
ilan-gold Jul 11, 2024
9a2be00
Merge branch 'ig/protocol_for_callback' into ig/read_dask_elem
ilan-gold Jul 11, 2024
d3bcddf
Add data docs
flying-sheep Jul 11, 2024
84fdc96
Revert "(fix): use `MutableMapping` instead of `dict` due to broken d…
flying-sheep Jul 11, 2024
2608bc3
(fix): add clarification
ilan-gold Jul 11, 2024
e551e18
Simplify
flying-sheep Jul 11, 2024
13e3bb1
Merge branch 'ig/protocol_for_callback' into ig/read_dask_elem
ilan-gold Jul 11, 2024
2935e45
Merge branch 'main' into ig/read_dask_elem
ilan-gold Jul 11, 2024
bf0be15
Merge branch 'ig/read_dask_elem' of github.com:scverse/anndata into i…
ilan-gold Jul 11, 2024
9d37fc8
Merge branch 'main' into ig/read_dask_elem
ilan-gold Jul 12, 2024
1ffe43e
(fix): remove double `dask` intersphinx
ilan-gold Jul 12, 2024
f9df5bc
(fix): remove `_types.DaskArray` from type checking block
ilan-gold Jul 12, 2024
a85da39
(refactor): use `block_info` for resolving fetch location
ilan-gold Jul 15, 2024
3bef77c
Merge branch 'ig/read_dask_elem' of github.com:scverse/anndata into i…
ilan-gold Jul 15, 2024
899184f
(fix): dtype for reading
ilan-gold Jul 15, 2024
efb70ec
(fix): ignore import cycle problem (why??)
ilan-gold Jul 16, 2024
118f43c
(fix): add issue
ilan-gold Jul 16, 2024
f742a0a
(fix): subclass `Reader` to remove `datasetkwargs`
ilan-gold Jul 18, 2024
ae68731
(fix): add message tp errpr
ilan-gold Jul 18, 2024
f5e7760
Update tests/test_io_elementwise.py
ilan-gold Jul 18, 2024
96b13a3
(fix): correct `self.callback` check
ilan-gold Jul 18, 2024
9c68e36
(fix): erroneous diffs
ilan-gold Jul 18, 2024
410aeda
(fix): extra `read_elem` `dataset_kwargs`
ilan-gold Jul 18, 2024
31a30c4
(fix): remove more `dataset_kwargs` nonsense
ilan-gold Jul 18, 2024
80fe8cb
(chore): add docs
ilan-gold Jul 18, 2024
b314248
(fix): use `block_info` for dense
ilan-gold Jul 18, 2024
02d4735
(fix): more erroneous diffs
ilan-gold Jul 18, 2024
6e5534a
(fix): use context again
ilan-gold Jul 18, 2024
d26cfe8
(fix): change size by dimension in tests
ilan-gold Jul 22, 2024
94e43a3
(refactor): clean up `get_elem_name`
ilan-gold Jul 22, 2024
5160016
(fix): try new sphinx for error
ilan-gold Jul 22, 2024
43da9a3
(fix): return type
ilan-gold Jul 22, 2024
9735ced
(fix): protocol for reading
ilan-gold Jul 22, 2024
f1730c3
(fix): bring back ignored warning
ilan-gold Jul 22, 2024
9861b56
Fix docs
flying-sheep Jul 22, 2024
235096a
almost fix typing
flying-sheep Jul 22, 2024
dce9f07
add wrapper
flying-sheep Jul 22, 2024
2725ef2
move into type checking
flying-sheep Jul 22, 2024
ffe89f0
(fix): small type fxes
ilan-gold Jul 22, 2024
6cb231e
Merge branch 'main' into ig/read_dask_elem
ilan-gold Jul 22, 2024
75a64fc
block info types
flying-sheep Jul 22, 2024
3f734fe
simplify
flying-sheep Jul 22, 2024
c4c2356
rename
flying-sheep Jul 22, 2024
cc67a9b
simplify more
flying-sheep Jul 22, 2024

(refactor): clean up tests
ilan-gold committed Jul 10, 2024
commit 33aebb26c6100e74e24124d483b2f47f4b198480
17 changes: 4 additions & 13 deletions tests/test_io_elementwise.py
@@ -245,7 +245,7 @@ def test_read_lazy_2d_dask(sparse_format, store):
         (2, None),
     ],
 )
-def test_read_lazy_nd_dask(store, n_dims, chunks):
+def test_read_lazy_subsets_nd_dask(store, n_dims, chunks):
     arr_store = create_dense_store(store, n_dims)
     X_dask_from_disk = read_elem_as_dask(arr_store["X"], chunks=chunks)
     X_from_disk = read_elem(arr_store["X"])
@@ -285,11 +285,7 @@ def test_read_lazy_h5_cluster(sparse_format, tmp_path):
         ("csr", None),
     ],
 )
-def test_read_lazy_h5_chunk_kwargs(arr_type, chunks, tmp_path):
-    import dask.distributed as dd
-
-    file = h5py.File(tmp_path / "test.h5", "w")
-    store = file["/"]
+def test_read_lazy_2d_chunk_kwargs(store, arr_type, chunks):
     if arr_type == "dense":
         arr_store = create_dense_store(store)
         X_dask_from_disk = read_elem_as_dask(arr_store["X"], chunks=chunks)
@@ -302,15 +298,10 @@ def test_read_lazy_h5_chunk_kwargs(arr_type, chunks, tmp_path):
         # assert that sparse chunks are set correctly by default
         assert X_dask_from_disk.chunksize[bool(arr_type == "csr")] == SIZE
     X_from_disk = read_elem(arr_store["X"])
-    file.close()
-    with (
-        dd.LocalCluster(n_workers=1, threads_per_worker=1) as cluster,
-        dd.Client(cluster) as _client,
-    ):
-        assert_equal(X_from_disk, X_dask_from_disk)
+    assert_equal(X_from_disk, X_dask_from_disk)


-def test_read_lazy_h5_bad_chunk_kwargs(tmp_path):
+def test_read_lazy_bad_chunk_kwargs(tmp_path):
     arr_type = "csr"
     file = h5py.File(tmp_path / "test.h5", "w")
     store = file["/"]
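
The default-chunk assertion in the last hunk indexes `chunksize` with a boolean. Below is a minimal sketch of that idiom in plain Python. It assumes, as the test comment suggests, that a sparse array's default chunks span the whole minor axis. `SIZE`, the chunk tuples, and `minor_axis_index` are illustrative stand-ins, not anndata code.

```python
SIZE = 50  # hypothetical stand-in for the test module's SIZE constant


def minor_axis_index(arr_type: str) -> int:
    """Return the axis that a compressed-sparse format does not compress.

    CSR compresses rows (axis 0), so its minor axis is 1;
    CSC compresses columns (axis 1), so its minor axis is 0.
    """
    # bool is a subclass of int, so the comparison result maps to 1 or 0.
    return int(arr_type == "csr")


# Hypothetical chunk sizes: split along the major axis, whole along the minor axis.
csr_chunksize = (10, SIZE)
csc_chunksize = (SIZE, 10)

assert csr_chunksize[minor_axis_index("csr")] == SIZE
assert csc_chunksize[minor_axis_index("csc")] == SIZE
```

Naming the index, rather than indexing with `bool(...)` inline, makes the intent of the assertion easier to read.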

Unchanged files with check annotations Beta

@singledispatch
def get_elem_name(x):
    raise NotImplementedError(f"Not implemented for {type(x)}")

Check warning on line 82 in src/anndata/_io/specs/lazy_methods.py (Codecov / codecov/patch): added line #L82 was not covered by tests.
@get_elem_name.register(H5Group)
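
The Codecov warning above points at the `singledispatch` fallback, which only runs when an unregistered type is passed. Below is a minimal sketch of how such a line could be exercised, using only the standard library. `FakeGroup` and `fallback_raises` are hypothetical names for illustration and stand in for the real `H5Group` registration, which is not shown here.

```python
from functools import singledispatch


@singledispatch
def get_elem_name(x):
    # Fallback for unregistered types; stays uncovered unless a test
    # deliberately passes an unsupported value.
    raise NotImplementedError(f"Not implemented for {type(x)}")


class FakeGroup:
    """Hypothetical stand-in for an HDF5 group; not part of anndata or h5py."""

    def __init__(self, name: str) -> None:
        self.name = name


@get_elem_name.register(FakeGroup)
def _(x: FakeGroup) -> str:
    return x.name


def fallback_raises(value) -> bool:
    """Return True if get_elem_name rejects `value` via the fallback."""
    try:
        get_elem_name(value)
    except NotImplementedError:
        return True
    return False


assert get_elem_name(FakeGroup("/X")) == "/X"
assert fallback_raises(42)
```

In a pytest suite the same check would typically be written with `pytest.raises(NotImplementedError)`.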