Single-partition Dask executor for cuDF-Polars #17262
…to cudf-polars-dask-simple
Some small suggestions, but I think this doesn't need another look from me after.
# Return reconstructed node and partition-info dict
partition = PartitionInfo(count=1)
new_node = ir.reconstruct(children)
partition_info[new_node] = partition
return new_node, partition_info
nit: No need to address here, but just to note for a followup. This does "unnecessary" reconstruction if the children are unchanged. We could consider making `reconstruct` return `self` if the children match.
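A minimal sketch of that suggestion, assuming a toy `Node` class (hypothetical, not the actual cudf-polars IR class) and assuming "match" means the children are the same objects:

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """Toy stand-in for an IR node (hypothetical, not the cudf-polars class)."""

    name: str
    children: tuple[Node, ...] = ()

    def reconstruct(self, children: list[Node]) -> Node:
        # Reuse this node when every new child is the very same object
        # as the corresponding existing child (identity, not equality).
        if len(children) == len(self.children) and all(
            new is old for new, old in zip(children, self.children)
        ):
            return self
        # Otherwise build a fresh node around the new children.
        return Node(self.name, tuple(children))


scan = Node("scan")
select = Node("select", (scan,))

same = select.reconstruct([scan])                   # children unchanged
changed = select.reconstruct([Node("other_scan")])  # child replaced
```

Here `same is select` holds, while `changed` is a new object. An identity check is cheap; comparing children by equality would be a different trade-off and might walk whole subtrees.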
@rjzamora I've addressed Lawrence's comments on the code I've added. For now I've left the others up to you, but I can work on those too if you prefer.
Thanks @pentschev! Hopefully I've addressed everything else now.
Giving you a `ci-codeowners` approval, the changes to testing configuration make sense to me.
/merge
Follow-up to #17262

Adds support for parallel `DataFrameScan` operations.

Authors:
- Richard (Rick) Zamora (https://github.com/rjzamora)

Approvers:
- Lawrence Mitchell (https://github.com/wence-)

URL: #17441
Description
The goal here is to lay down the initial foundation for dask-based evaluation of `IR` graphs in cudf-polars. The first pass will only support single-partition workloads. This functionality could be achieved with much less-complicated changes to cudf-polars. However, we do want to build multi-partition support on top of this.
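To illustrate the idea only, here is a rough sketch of how a single-partition node tree could be turned into a dask-style task graph and evaluated. Everything in it (`Node`, `task_graph`, `evaluate`, the `(name, 0)` key scheme, and this `PartitionInfo` stand-in) is hypothetical and written without dask itself; it is not the implementation in this PR.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PartitionInfo:
    """Hypothetical stand-in: number of partitions a node produces."""

    count: int


@dataclass(frozen=True)
class Node:
    """Toy IR node: `op` is called with the results of `children`."""

    name: str
    op: object  # any callable
    children: tuple = ()


def task_graph(node, partition_info, graph=None):
    """Build a dask-style graph mapping key -> (callable, *dependency_keys)."""
    graph = {} if graph is None else graph
    # This sketch handles single-partition nodes only.
    assert partition_info[node].count == 1
    child_keys = [task_graph(c, partition_info, graph)[0] for c in node.children]
    key = (node.name, 0)  # (name, partition index); index is always 0 here
    graph[key] = (node.op, *child_keys)
    return key, graph


def evaluate(graph, key):
    """Tiny recursive scheduler: compute dependencies, then call the task."""
    func, *deps = graph[key]
    return func(*(evaluate(graph, d) for d in deps))


scan = Node("scan", lambda: [1, 2, 3])
total = Node("sum", lambda rows: sum(rows), (scan,))
info = {scan: PartitionInfo(count=1), total: PartitionInfo(count=1)}

key, graph = task_graph(total, info)
result = evaluate(graph, key)
```

Keying tasks by `(name, partition index)` even when the index is always 0 is the kind of structure that would let a multi-partition version add more keys per node without changing the shape of the graph.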