Type chunkmanagers #9227

Draft
wants to merge 61 commits into base: main


@Illviljan (Contributor) commented Jul 10, 2024

Use chunkedduckarray and duckarray in chunkmanagers now that they are used in namedarray.

Split from #8933
Requires #9226, #9225

 def encode_cf_datetime(
-    dates: T_DuckArray,  # type: ignore
+    dates: duckarray | chunkedduckarray,
Collaborator

You should still use TypeVars and not the protocol directly.

Contributor Author

I don't think that works yet; it would need Higher-Kinded TypeVars. Using T_DuckArray does retain the np.ndarray typing (good), but it also retains the input's dtype and shape typing (bad/wrong).

In theory we don't really care whether it is an ndarray or a cupy array, only that it passes the minimal requirements for a duckarray. But we do care that the dtype and shape are correct after the calculation.

duckarray is actually duckarray[_ShapeType, _DType], where the dtype and shape are TypeVars, in the same fashion as numpy; see #8294. I am intentionally lazy about this in non-namedarray files, though, to avoid the PRs touching every file.
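
To illustrate the trade-off described above, here is a minimal sketch using only the standard library. Everything in it is hypothetical: `DuckArray`, `ToyArray`, `Datetime64`, `Int64`, `encode_typevar` and `encode_protocol` are stand-ins, not xarray's actual `duckarray` protocol or `encode_cf_datetime`. It assumes an encoder that keeps the shape but changes the dtype.

```python
from typing import Any, Generic, Protocol, TypeVar

_ShapeT_co = TypeVar("_ShapeT_co", covariant=True)
_DTypeT_co = TypeVar("_DTypeT_co", covariant=True)


class DuckArray(Protocol[_ShapeT_co, _DTypeT_co]):
    """Hypothetical minimal protocol, parametrized by shape and dtype."""

    @property
    def shape(self) -> _ShapeT_co: ...

    @property
    def dtype(self) -> _DTypeT_co: ...


_S = TypeVar("_S")
_D = TypeVar("_D")


class Datetime64:
    """Stand-in dtype for the input."""


class Int64:
    """Stand-in dtype for the encoded output."""


class ToyArray(Generic[_S, _D]):
    """Stand-in for a concrete array type (e.g. an ndarray-like class)."""

    def __init__(self, shape: _S, dtype: _D) -> None:
        self._shape = shape
        self._dtype = dtype

    @property
    def shape(self) -> _S:
        return self._shape

    @property
    def dtype(self) -> _D:
        return self._dtype


T_Duck = TypeVar("T_Duck", bound=DuckArray[Any, Any])


def encode_typevar(dates: T_Duck) -> T_Duck:
    # The signature promises "same type out as in", so a type checker would
    # report the input's dtype parameter for the result, although the
    # runtime dtype changes. Hence the ignore on the return statement.
    return ToyArray(dates.shape, Int64())  # type: ignore[return-value]


def encode_protocol(dates: DuckArray[_S, Any]) -> DuckArray[_S, Int64]:
    # The signature keeps the shape parameter and states the new dtype,
    # but no longer says which concrete array class is returned.
    return ToyArray(dates.shape, Int64())


a = ToyArray((3,), Datetime64())
out = encode_protocol(a)
print(out.shape, type(out.dtype).__name__)
```

With the bounded TypeVar, a checker would treat the result of `encode_typevar(a)` as `ToyArray[tuple[int], Datetime64]`, which no longer matches the runtime dtype. With the protocol annotation the dtype is stated correctly, at the cost of losing the concrete array class.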

@TomNicholas added the topic-chunked-arrays label (Managing different chunked backends, e.g. dask) on Jul 23, 2024
@Illviljan
Contributor Author

pd.Index is apparently not a "duckarray":

    def test_pd_index_duckarray(self) -> None:
        import pandas as pd

        a: duckarray[Any, Any] = pd.Index([])
        check_duck_array_typevar(a)
xarray/tests/test_namedarray.py: note: In member "test_pd_index_duckarray" of class "TestNamedArray":
xarray/tests/test_namedarray.py:366: error: Incompatible types in assignment (expression has type "Index[int]", variable has type "_arrayfunction[Any, Any] | _arrayapi[Any, Any]")  [assignment]
FAILED xarray/tests/test_namedarray.py::TestNamedArray::test_pd_index_duckarray - TypeError: a (<class 'pandas.core.indexes.base.Index'>) is not a valid _arrayfunction or _arrayapi
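
For context, a runtime failure like the one above can come from a structural `isinstance` check against runtime-checkable protocols. The sketch below is an assumption about the general mechanism, not xarray's actual `check_duck_array_typevar`. The protocol members shown, and the stand-in classes `NumpyLike` and `IndexLike`, are hypothetical; the thread does not say which member `pd.Index` is missing.

```python
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class ArrayFunctionLike(Protocol):
    """Hypothetical protocol for NEP 18 style arrays."""

    def __array_function__(self, func: Any, types: Any, args: Any, kwargs: Any) -> Any: ...

    def __array_ufunc__(self, ufunc: Any, method: Any, *inputs: Any, **kwargs: Any) -> Any: ...


@runtime_checkable
class ArrayAPILike(Protocol):
    """Hypothetical protocol for array API standard arrays."""

    def __array_namespace__(self) -> Any: ...


def check_duck(a: Any) -> Any:
    # isinstance on a runtime-checkable protocol only checks that every
    # protocol member exists on the object; signatures are not compared.
    if not isinstance(a, (ArrayFunctionLike, ArrayAPILike)):
        raise TypeError(
            f"a ({type(a)}) is not a valid ArrayFunctionLike or ArrayAPILike"
        )
    return a


class NumpyLike:
    """Stand-in that implements both NEP 18 hooks."""

    def __array_function__(self, func, types, args, kwargs):
        return NotImplemented

    def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
        return NotImplemented


class IndexLike:
    """Stand-in that implements only one hook (assumed for illustration)."""

    def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
        return NotImplemented


check_duck(NumpyLike())  # passes
try:
    check_duck(IndexLike())
except TypeError as err:
    print(err)
```

An object missing any single member of both protocols fails the check, which is consistent with the `TypeError` reported by the test.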

Illviljan and others added 27 commits August 5, 2024 15:07
Labels: topic-chunked-arrays (Managing different chunked backends, e.g. dask), topic-typing
5 participants