-
Notifications
You must be signed in to change notification settings - Fork 846
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support Binary
arrays in starts_with
, ends_with
and contains
#6926
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am sorry in the delay reviewing this PR -- it is hard to find time reviewing such a large PR
I wonder what the usecase is for using LIKE on binary data? I as because it seems to me that LIKE is mostly useful for character strings.
I can see the usecase for starts_with
/ ends_with
and contains
for binary data,
Perhaps instead of trying to inject binary array into the code for handling strings, we could simply have simpler prefix/suffix matching for binary -- it might have some more repetition but would be simpler to understand any avoid any potential performance issues related to this code 🤔
@@ -59,6 +59,16 @@ pub struct FixedSizeBinaryArray { | |||
} | |||
|
|||
impl FixedSizeBinaryArray { | |||
/// Returns true if all data within this array is ASCII | |||
pub fn is_ascii(&self) -> bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand the need to check a binary array for ASCII -- there shouldn't be any optimizations that rely on the data being ASCII
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
because I did not want to duplicate the whole like and predicate file I implemented like and this needed for the like impl.
But instead I only implemented the contains, start with, and ends with
Marking as draft as I think this PR is no longer waiting on feedback. Please mark it as ready for review when it is ready for another look |
5f9a6c8
to
30b432e
Compare
@alamb this PR is now ready for review and is much smaller. thanks for the feedback |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @rluvaton -- I looked through this and I think it looks quite nice and easy to follow now and I think the risk of regressions in performance for strings is low
Thank you for your patience
Binary
arrays in starts_with
, ends_with
and contains
I plan to merge this tomorrow unless anyone else would like time to review |
Thanks again @rluvaton |
(ignore branch name)
Which issue does this PR close?
Closes #6923
Closes #6924
What changes are included in this PR?
PredicateImpl
trait to work with the predicate regardless of string or binaryPredicateImpl
for the oldPredicate
and the newBinaryPredicate
using macro (I don't really like this as it seem less maintainable, but not sure what's better, duplicating or macro, or another approach)Are there any user-facing changes?
Yes, allow users to pass binary arrays to like/starts with/contains and more