TypeError: Unsupported type <class 'numpy.ndarray'>
#1405
Comments
I'm no expert when it comes to Dask dataframes, but after some investigation I've found the problem with the code above is

import dask
from dask import array as da
from dask import dataframe as dd
from dask_cuda import LocalCUDACluster
from distributed import Client, LocalCluster
import numpy as np
import cupy as cp


def main(client: Client) -> None:
    rng = da.random.default_rng(1994)
    X = rng.random(size=(2048, 4))
    df = dd.from_dask_array(X, columns=[f"f{i}" for i in range(4)])
    df["qid"] = rng.integers(low=0, high=4, size=(2048, ), dtype=np.int64)
    # Count rows per qid, convert to a Dask array with CuPy meta, then take the running total.
    s = da.cumsum(
        df.groupby("qid").qid.count().to_dask_array(lengths=True, meta=cp.array(()))
    ).compute()
    print(s)


if __name__ == "__main__":
    with LocalCUDACluster() as cluster:
        with Client(cluster) as client:
            with dask.config.set(
                {"array.backend": "cupy", "dataframe.backend": "cudf"}
            ):
                main(client)

During the investigation I could not find many uses of In any case, specifying
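For reference, the value `s` in the script above is the running total of per-`qid` row counts. A minimal CPU-only sketch of that quantity, using only the standard library, may help when checking the GPU result. The `group_offsets` helper and the sample `qids` list are hypothetical and not part of the thread, and the sketch assumes the groups come back ordered by `qid`.

```python
from collections import Counter
from itertools import accumulate


def group_offsets(qids):
    """Return cumulative row counts per query id, in sorted qid order."""
    counts = Counter(qids)  # rows per qid, like groupby("qid").qid.count()
    return list(accumulate(counts[q] for q in sorted(counts)))  # like cumsum


# Hypothetical qid column, for illustration only.
qids = [0, 0, 1, 2, 2, 2, 3]
offsets = group_offsets(qids)
print(offsets)  # [2, 3, 6, 7]
```

The last entry always equals the total number of rows, which gives a quick sanity check on the output of the Dask version.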
Thank you for looking into this @pentschev ! I will leave this open in case this is considered a bug.
Thanks for raising this @trivialfis - There are some known rough edges in DataFrame <-> Array conversion. I believe this is indeed a bug caused by the ongoing removal of legacy Dask DataFrame (dask/dask-expr#1168).
FWIW, this is reproducible with
Okay, thanks - That's useful info. The fix will require an upstream change to
Would this just be the inverse of the logic in xref: dask/dask#9579 |
Script:
Version: