-
Notifications
You must be signed in to change notification settings - Fork 1.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Supports additional query timing types for profiling plugin query components #17146
base: 2.x
Are you sure you want to change the base?
Conversation
ContextIndexSearcher Signed-off-by: Tejas Shah <[email protected]>
Signed-off-by: Tejas Shah <[email protected]>
❌ Gradle check result for 51f1446: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
@@ -48,7 +48,9 @@ public enum QueryTimingType { | |||
SCORE, | |||
SHALLOW_ADVANCE, | |||
COMPUTE_MAX_SCORE, | |||
SET_MIN_COMPETITIVE_SCORE; | |||
SET_MIN_COMPETITIVE_SCORE, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@shatejas the core does not know anything about k-nn plugin (or any other plugin per se), this has to be part of the plugin related instrumentation. We may need to think how the profile phases could be extended / customized though, if required.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@reta I understand, whats the recommendation in that case? this is one way I found to be able to have additional components in profile query
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@shatejas we may never run into a need to have such an extensibility feature, so we may have to design in a plugin neural way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we may never run into a need to have such an extensibility feature
Currently there is a need to have time_in_nanos
for knn. KNN query is relatively complex, both ann and exact search as well as the filter query inside the knn are major components and there is no visibility on these making it extremely difficult to debug performance issues.
Currently I wasn't able to find a hook to have knn components in query breakdown without these changes.
so we may have to design in a plugin neural way.
can you elaborate whats involved here? if its major change in knn plugin it might have to be iterative and this change might work till then
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@reta I think agree on designing this in a plugin neutral way. @shatejas lets have extension points in core that can be used by Plugins to provide their QueryTimingTyes.
One idea I can think of here is QueryTimingType
would be getting used to put in some string in the profile output. We can create another enum/class which collects all the TimingType from all the plugins and then put them at right place during serialization.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great work @shatejas
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@shatejas I am not sure (just my opinion) this is the way solve the problem. The query timing types are context dependent: if we run k-nn query, they have to be there, otherwise - nothing should be in profile (otherwise we are polluting profiling summary with irrelevant details). The ability to add context dependent profiler (supported out of the box) solves it nicely I believe.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@reta One thing I tried was to have types
in query builder, for normal case it works fine. But the problem comes during complex queries example
{
"profile": "true",
"query": {
"bool": {
"should": [
{
"range": {
"rating": {
"gte": 8,
"lte": 10
}
}
},
{
"knn": {
"location": {
"vector": [
4.2,
4.6
],
"k": 3
}
}
}
]
}
}
}
In this scenario, it doesn't recognize knn specific strings and throw npe for profiling since the top level boolean query will have empty set.
Another option is we simply filter 0 values in the breakdown which will remove the verbosity
Overall, context might be helpful but it might not be a bad thing to always tell the user that a particular component 0 nanos
The ability to add context dependent profiler (supported out of the box) solves it nicely I believe.
I will need recommendation here. QueryBuilder and Plugin interfaces are exposed to plugins so, I tried those. Anything apart from these two that is exposed to plugin that I can leverage?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will need recommendation here. QueryBuilder and Plugin interfaces are exposed to plugins so, I tried those. Anything apart from these two that is exposed to plugin that I can leverage?
I listed them here: #17146 (comment) . We do support injection of multiple profilers into SearchContext
, all these APIs are open to the plugins.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
QueryProfiler<T extends Enum> (or alternatively, introduce the intermediate abstraction like QueryProfilerBase<T extends Enum> and let QueryProfiler extend it).
The reason I did not go this route is because we want to add a few types in addition to QueryTimingType. Making it generic forces to have additional enums and have duplicate values of QueryTimingType since plugins like knn just want to piggy back on existing profiler
I am also not sure having a plugin specific profiler solves the context problem - providing custom profiler through querybuilder or plugin will require this profiler to set it in index searcher when the query is being executed.
SET_MIN_COMPETITIVE_SCORE; | ||
SET_MIN_COMPETITIVE_SCORE, | ||
EXACT_KNN_SEARCH, | ||
ANN_SEARCH; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't tried this so idk about the viability of the idea, but is it possible to change QueryTimingType to an interface
interface QueryTimingType {
String getValue();
}
and then have core define a default
enum DefaultQueryTimingType implements QueryTimingType {
CREATE_WEIGHT,
BUILD_SCORER,
NEXT_DOC,
ADVANCE,
MATCH,
SCORE,
SHALLOW_ADVANCE,
COMPUTE_MAX_SCORE,
SET_MIN_COMPETITIVE_SCORE
@Override
public String getValue() {
return name();
}
@Override
public String toString() {
return name().toLowerCase(Locale.ROOT);
}
}
There could be an extension point for plugins to also define QueryTimingTypes and then a mechanism to collect them across the plugins and create a registry.
Signed-off-by: Tejas Shah <[email protected]>
❌ Gradle check result for 568cfe2: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
On the unrelated note, @shatejas please target |
Description
Adds enums related to knn to be able to profile ann query. Currently its difficult to debug latencies for knn, this will help increase visibility on knn query
KNN PR: opensearch-project/k-NN#2450
Related Issues
Resolves opensearch-project/k-NN#2286
Sample response
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.