-
Notifications
You must be signed in to change notification settings - Fork 912
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix assert failure for range window functions #14168
Fix assert failure for range window functions #14168
Conversation
Fixes rapidsai#13855. This commit fixes an erroneous assert check in range window functions, where the wrong data-type is checked for timestamp order-by columns. For timestamps, it's the duration::rep type that should be checked, when accessing the order by column. This fix allows ROLLING_TEST to complete correctly, in DEBUG mode.
// For timestamp types, T must match the range rep type for the order-by column. | ||
cudf_assert((type_id_matches_device_storage_type<T>(col.type().id()) or | ||
cudf::type_dispatcher(col.type(), is_correct_range_rep{})) && | ||
"data type mismatch when accessing the order-by column"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I agree with retrieving the data (via element()
) using the rep-type and doing operations on said rep-type. I would rather the actual type be read (via element()
) and used directly.
I feel the original error is correct and this is a flaw in the design of this code perhaps.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason we're comparing the rep-type values here is that the order-by columns are timestamps, and the preceding/following values are durations (e.g. 2 days ago).
For integral order-by columns, preceding/following are of the same types. But not for timestamps.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm. That said, maybe I could stand to tighten this up a little more... I'll see about a stricter check for timestamps.
I did take a look. By this point, the duration types for preceding
and following
are normalized with the order-by's timestamp types already, upstream.
For this release, I'd rather we didn't dismantle the logic in range_window_bounds
(i.e. using rep-types) as part of fixing this assert condition. I could tackle this as a follow-on, if that's agreeable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I just realized that one unintended consequence of doing order-by comparisons without switching to the rep-types is that all those templates will be instantiated for all timestamp types, duration types, fixed-point, etc. :/
/ok to test |
/ok to test |
/merge |
This change has been merged. Thank you all for the reviews. |
Fixes #13855.
This commit fixes an erroneous assert check in range window functions, where the wrong data-type is checked for timestamp order-by columns.
For timestamps, it's the
duration::rep
type that should be checked, when accessing the order by column.This fix allows ROLLING_TEST to complete correctly, in DEBUG mode.
Description
Checklist