Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor histogram reduction using cuco::static_set::insert_and_find #16485

Merged

Conversation

srinivasyadav18
Copy link
Contributor

@srinivasyadav18 srinivasyadav18 commented Aug 2, 2024

Description

Refactors histogram reduce and groupby aggregations using cuco::static_set::insert_and_find. Speed improvement results here and here.

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@github-actions github-actions bot added libcudf Affects libcudf (C++/CUDA) code. CMake CMake build issue labels Aug 2, 2024
@srinivasyadav18 srinivasyadav18 added Performance Performance related issue non-breaking Non-breaking change improvement Improvement / enhancement to an existing function labels Aug 2, 2024
cpp/benchmarks/reduction/histogram.cpp Outdated Show resolved Hide resolved
cpp/src/groupby/sort/group_histogram.cu Outdated Show resolved Hide resolved
@srinivasyadav18

This comment was marked as resolved.

@srinivasyadav18

This comment was marked as resolved.

cpp/src/reductions/histogram.cu Outdated Show resolved Hide resolved
cpp/src/reductions/histogram.cu Outdated Show resolved Hide resolved
cpp/src/reductions/histogram.cu Outdated Show resolved Hide resolved
@mhaseeb123 mhaseeb123 self-assigned this Sep 24, 2024
@mhaseeb123 mhaseeb123 changed the base branch from branch-24.10 to branch-24.12 September 24, 2024 21:15
@mhaseeb123 mhaseeb123 changed the title Refactor histogram reduction using cuco::static_map::insert_or_apply Refactor histogram reduction using cuco::static_set::insert_and_find Oct 4, 2024
@rapidsai rapidsai deleted a comment from copy-pr-bot bot Oct 7, 2024
@mhaseeb123 mhaseeb123 marked this pull request as ready for review October 7, 2024 23:42
@mhaseeb123 mhaseeb123 requested a review from a team as a code owner October 7, 2024 23:42
@mhaseeb123 mhaseeb123 requested a review from davidwendt October 7, 2024 23:42
@mhaseeb123 mhaseeb123 added 3 - Ready for Review Ready for review by team and removed 2 - In Progress Currently a work in progress labels Oct 7, 2024
@@ -1,169 +0,0 @@
/*
* Copyright (c) 2022-2024, NVIDIA CORPORATION.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No longer in use

@PointKernel PointKernel requested a review from ttnghia October 8, 2024 20:54
Copy link
Member

@PointKernel PointKernel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Requesting @ttnghia's review as he worked on the original hash reduce-by-row implementation.

Comment on lines +135 to +136
// Hard set the tparam `has_nested_columns` = false for now as we don't yet support nested columns
auto const key_equal = row_comp.equal_to<false>(has_nulls, null_equality::EQUAL, value_comp);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is error prone. Suppose that now we want to support the nested types, it will be difficult to trace back to this line of code to change it. So I would rather implement the code path ready for nested types here, like what it was before:

if (has_nested_columns) {
    auto const key_equal = row_comp.equal_to<true>(has_nulls, null_equality::EQUAL, value_comp);
    map.insert(pair_iter, pair_iter + input.num_rows(), key_hasher, key_equal, stream.value());
  } else {
    auto const key_equal = row_comp.equal_to<false>(has_nulls, null_equality::EQUAL, value_comp);
    map.insert(pair_iter, pair_iter + input.num_rows(), key_hasher, key_equal, stream.value());
  }

Copy link
Member

@PointKernel PointKernel Oct 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point about the implicit use of non-nested key equality. Instead of adding a code path for nested types that isn't used by any existing code, I suggest leaving the code as is and adding a CUDF_EXPECTS check to ensure no nested types are involved in the calculation. This approach also avoids templating functions, which can help reduce build time.

Copy link
Contributor

@ttnghia ttnghia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The performance improvement is fantastic. There's just one small request.

@mhaseeb123 mhaseeb123 requested a review from ttnghia October 8, 2024 23:29
@rapidsai rapidsai deleted a comment from copy-pr-bot bot Oct 8, 2024
@rapidsai rapidsai deleted a comment from copy-pr-bot bot Oct 8, 2024
@rapidsai rapidsai deleted a comment from copy-pr-bot bot Oct 9, 2024
@mhaseeb123
Copy link
Member

mhaseeb123 commented Oct 9, 2024

The performance improvement is fantastic. There's just one small request.

Reverting the added functor which enabled the unused code path in light of Yunsong's comment.

Copy link

copy-pr-bot bot commented Oct 9, 2024

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@mhaseeb123
Copy link
Member

/merge

@rapids-bot rapids-bot bot merged commit a6853f4 into rapidsai:branch-24.12 Oct 9, 2024
107 checks passed
@mhaseeb123 mhaseeb123 added 5 - Ready to Merge Testing and reviews complete, ready to merge and removed 3 - Ready for Review Ready for review by team labels Oct 9, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
5 - Ready to Merge Testing and reviews complete, ready to merge CMake CMake build issue improvement Improvement / enhancement to an existing function libcudf Affects libcudf (C++/CUDA) code. non-breaking Non-breaking change Performance Performance related issue
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants