-
Notifications
You must be signed in to change notification settings - Fork 3.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Introduce natural comparator for types that don't have a StringComparator #15145
Conversation
public static StringComparator getStringComparatorForValueType(ColumnType valueType) | ||
{ | ||
if (valueType.isNumeric()) { | ||
return StringComparators.NUMERIC; | ||
} else if (valueType.is(ValueType.STRING)) { | ||
return StringComparators.LEXICOGRAPHIC; | ||
} else { | ||
throw new ISE("Unrecognized valueType[%s]", valueType); | ||
return StringComparators.NATURAL; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm, we might need to be careful with this, for the two things that use this, i know bound filters do not support array types, and pretty sure topN doesn't either. That said, bound filters are only used if druid.generic.useDefaultValueForNull=true
which is no longer default, or if useBoundsAndSelectors
context flag is set to true, so i'm not entirely sure how big of a deal that part is, but because of this filtering with arrays is really only well supported if sql compatible nulls are enabled, since we use equality and range filters which do support arrays instead of selectors and bounds which do not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. I still think it makes sense that we plan the queries through the planner instead of failing them with the exception, and do this handling at the filter level.
NATURAL
represents no type. If array ordering is not supported in the topN & the bound filter part, I guess it should be handled already (unless it bypasses it which should be a diff in this patch).
That's why I felt comfortable with this change - The places where we are already hitting this would leave users with an error message. And if we are actually using the comparator.compareTo(), we will get a different error message (which we can then refine at the site where it surfaces from)
The only place where this would change the logic would be where we are setting it to NATURAL and also bypassing it, effectively leaving the orderBy useless, which should be the changes in the diff (where we bypass). I'll revert some equality I did related to the bound filter so that this doesn't happen.
NATURAL
for comparator refers to the comparator of the type, therefore the comparator shouldn't be used to gatekeep what types are allowed, there should be clear explicit type checks, I think. Also, as long as we don't bypass the natural comparator inadvertently, we won't be producing incorrect results, and throw back an error to the user. If we do see these cases, we can perhaps handle them at their call site then, instead of here.
Thanks for the review @clintropolis |
…ator (apache#15145) Fixes a bug when executing queries with the ordering of arrays
…ator (apache#15145) Fixes a bug when executing queries with the ordering of arrays
…ator (apache#15145) Fixes a bug when executing queries with the ordering of arrays
Description
Planning queries with ordering on arrays throws an error, since there isn't a
StringComparator
associated with it.StringComparator
is an old concept, when dimensions were strings only, and a string dimension could have its ordering defined based on what the string represented. For example, for numeric strings, it could be set toNUMERIC
.However, in the native engine, the comparators are bypassed altogether if the type of the dimension is the natural comparator for that dimension. In the case of numeric dimensions, the
NUMERIC
comparator is the natural comparator. SQL always sets the comparator such that it gets bypassed in the native layer, therefore comparators can used only by used via native queries.With MSQ, we can now have order by's with scan queries on arrays. To represent a natural ordering on a type for which we don't have a defined ordering, we add a
NATURAL
comparator, which is similar to setting theNUMERIC
comparator on thenumeric
types, i.e. it should get bypassed and the natural ordering should be used. Alternatively is a dummy value for representingStringComparator = null
and its a no-op.Fixes a bug when executing queries like
This PR has: