-
Notifications
You must be signed in to change notification settings - Fork 912
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor mixed_semi_join using cuco::static_set #16230
Refactor mixed_semi_join using cuco::static_set #16230
Conversation
As rapids-cmake uses the latest cuco which uses key_equal as lhs, thus build and probe table needed to swaped accordingly.
@@ -160,22 +121,16 @@ std::unique_ptr<rmm::device_uvector<size_type>> mixed_join_semi( | |||
auto const preprocessed_probe = | |||
experimental::row::equality::preprocessed_table::create(probe, stream); | |||
auto const row_comparator = | |||
cudf::experimental::row::equality::two_table_comparator{preprocessed_probe, preprocessed_build}; | |||
cudf::experimental::row::equality::two_table_comparator{preprocessed_build, preprocessed_probe}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is there a reason for this order change build and probe?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I need to dig into why but without this change the gtests start failing. Maybe @srinivasyadav18 can provide us a quick insight.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Update: We changed this order and added swap_tables=true
at mixed_join_kernels_semi.cu:+72 (tagged you there) to fix the left and right directionality in the equality in single_expression_equality::operator()
. Specifically, the single_expression_equality
expects indices in {build, probe}
order and we have also reset the hasher and equality functors for the cuco::static_set
which swaps the internal left
and right
directions for {build, probe}
hence us using swap_table=true
.
I know it's quite confusing. 😅
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is mainly due to a breaking change in NVIDIA/cuCollections#479 where we always put query keys on the left-hand side for comparison.
Please provide benchmark showing what benefit this PR will bring to us. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One minor improvement otherwise looks great to me.
@mhaseeb123 Thank you for stepping in to help resolve the issue and finalize the PR! 🔥 🔥 🔥
@@ -160,22 +121,16 @@ std::unique_ptr<rmm::device_uvector<size_type>> mixed_join_semi( | |||
auto const preprocessed_probe = | |||
experimental::row::equality::preprocessed_table::create(probe, stream); | |||
auto const row_comparator = | |||
cudf::experimental::row::equality::two_table_comparator{preprocessed_probe, preprocessed_build}; | |||
cudf::experimental::row::equality::two_table_comparator{preprocessed_build, preprocessed_probe}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is mainly due to a breaking change in NVIDIA/cuCollections#479 where we always put query keys on the left-hand side for comparison.
Co-authored-by: Yunsong Wang <[email protected]>
Performance results from Spark (spark-h). Thanks to @zpuller for running this. Queries of interest: q16 and q94 that use
|
Shorter for easy reading:
|
/merge |
This reverts commit e68f55c.
Reverting #16230 as this PR leads to #16852. Authors: - Muhammad Haseeb (https://github.com/mhaseeb123) Approvers: - Nghia Truong (https://github.com/ttnghia) - Yunsong Wang (https://github.com/PointKernel) - Bradley Dice (https://github.com/bdice) URL: #16855
This PR reapplies changes from #16230 and adds bug fixes and performance improvements for mixed_semi_join. Authors: - Muhammad Haseeb (https://github.com/mhaseeb123) Approvers: - Yunsong Wang (https://github.com/PointKernel) - MithunR (https://github.com/mythrocks) - Nghia Truong (https://github.com/ttnghia) URL: #16859
This PR reapplies changes from #16230 and adds bug fixes and performance improvements for mixed_semi_join. Authors: - Muhammad Haseeb (https://github.com/mhaseeb123) Approvers: - Yunsong Wang (https://github.com/PointKernel) - MithunR (https://github.com/mythrocks) - Nghia Truong (https://github.com/ttnghia) URL: #16859
Description
This PR refactors
mixed_semi_join
by replacing cuco legacystatic_map
with lateststatic_set
.Contributes to #12261.
Checklist