Improve performance of the bitmap filtering #16936
server/src/main/java/org/opensearch/search/query/BitmapIndexQuery.java
❌ Gradle check result for e80b830: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
Can we add test cases for the following?
1. Empty bitmaps.
2. Queries with high and low cardinality.
3. Query behaviour under multi-threaded execution.
e80b830
to
0697100
Compare
❌ Gradle check result for 0697100: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
6a8a0d0
to
941a5c2
Compare
❌ Gradle check result for 941a5c2: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
❌ Gradle check result for e152a01: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
e152a01
to
8de4f5b
Compare
❕ Gradle check result for 8de4f5b: UNSTABLE
Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure. |
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #16936 +/- ##
============================================
- Coverage 72.31% 72.19% -0.12%
+ Complexity 65346 65333 -13
============================================
Files 5301 5302 +1
Lines 303805 303899 +94
Branches 44030 44045 +15
============================================
- Hits 219702 219407 -295
- Misses 66055 66466 +411
+ Partials 18048 18026 -22
☔ View full report in Codecov by Sentry.
6d11f30
to
d7db6da
Compare
❌ Gradle check result for 97ee7e0: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
Signed-off-by: bowenlan-amzn <[email protected]>
97ee7e0
to
ed293c4
Compare
Really nice work on the BitmapIndexQuery! That's a nice, elegant solution.
--------- Signed-off-by: bowenlan-amzn <[email protected]> (cherry picked from commit ba0c4f3) Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
--------- (cherry picked from commit ba0c4f3) Signed-off-by: bowenlan-amzn <[email protected]> Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Description
This change adds a new bitmap index query that addresses the performance issue described in the related issue below: in short, the time spent in the constructor and in cost estimation.
The bitmap index query takes advantage of the index structure (the points) of the numeric field and traverses the points to return an iterator over the matching doc IDs. A doc matches when its value is contained in the bitmap.
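To make the matching rule concrete, here is a minimal, self-contained sketch. It is not the code in BitmapIndexQuery.java: sorted int arrays stand in for the points of the field and for the bitmap, and the class and method names (BitmapPointsMatchSketch, matchingDocs) are made up for illustration. It walks both sorted sequences in tandem and collects the doc IDs whose value appears in the bitmap.

```java
import java.util.Arrays;
import java.util.BitSet;

// Hypothetical stand-in for the idea only; the real query works on Lucene points
// and a bitmap, not on plain arrays.
final class BitmapPointsMatchSketch {

    // sortedValues[i] is the value of the i-th point in ascending value order and
    // docIds[i] is the doc that point belongs to. sortedBitmapValues plays the
    // role of the bitmap contents, also in ascending order.
    static int[] matchingDocs(int[] sortedValues, int[] docIds, int[] sortedBitmapValues) {
        BitSet result = new BitSet();
        int p = 0; // position in the points
        int b = 0; // position in the bitmap values
        while (p < sortedValues.length && b < sortedBitmapValues.length) {
            if (sortedValues[p] < sortedBitmapValues[b]) {
                p++; // point value is not in the bitmap
            } else if (sortedValues[p] > sortedBitmapValues[b]) {
                b++; // bitmap value has no more points
            } else {
                result.set(docIds[p]); // value is in the bitmap, so the doc matches
                p++; // stay on the same bitmap value: several docs may share it
            }
        }
        return result.stream().toArray(); // matching doc IDs in ascending order
    }

    public static void main(String[] args) {
        int[] values = {3, 5, 5, 9, 12};
        int[] docs = {4, 0, 7, 2, 1};
        int[] bitmap = {5, 12, 20};
        System.out.println(Arrays.toString(matchingDocs(values, docs, bitmap)));
    }
}
```

The work done here is bounded by the number of points visited plus the bitmap cardinality. An empty bitmap yields no matches without touching any points.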
The main reason a bitmap index query is needed is to support IndexOrDocValuesQuery.
IndexOrDocValuesQuery decides at runtime which query, index or doc values, supplies the scorer, by comparing the cost of the chosen lead iterator (c_lead) with the cost of the index query (c_iq). For example, suppose we have a term filter that matches 1% of the total documents and a numeric range IndexOrDocValuesQuery that matches 10% of the total documents. The term filter becomes the lead iterator since it matches fewer docs. More importantly, IndexOrDocValuesQuery chooses the doc values query at runtime because c_iq = 10 × c_lead. Note that IndexOrDocValuesQuery has a heuristic that chooses the doc values query only when the cost of the index query is more than 8 times the lead iterator cost. https://github.com/apache/lucene/blob/ddf538d43e94a814f783fcc40728ee45038a03a1/lucene/core/src/java/org/apache/lucene/search/IndexOrDocValuesQuery.java#L170-L174
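The decision described above can be sketched in a few lines. This is a standalone illustration of the 8x rule as I read the linked Lucene source, not the Lucene class itself; the names IndexOrDocValuesChoiceSketch, Choice, and choose are invented for the example.

```java
// Hypothetical illustration of the 8x heuristic; not the Lucene implementation.
final class IndexOrDocValuesChoiceSketch {

    enum Choice { INDEX, DOC_VALUES }

    // Doc values are chosen only when the index query cost divided by 8
    // still exceeds the cost of the lead iterator.
    static Choice choose(long indexQueryCost, long leadCost) {
        long threshold = indexQueryCost >>> 3; // indexQueryCost / 8
        return threshold <= leadCost ? Choice.INDEX : Choice.DOC_VALUES;
    }

    public static void main(String[] args) {
        long totalDocs = 100_000_000L;
        long leadCost = totalDocs / 100;      // term filter matching 1% of the docs
        long indexQueryCost = totalDocs / 10; // range query matching 10% of the docs
        // c_iq = 10 x c_lead, which is above the 8x threshold
        System.out.println(choose(indexQueryCost, leadCost));
    }
}
```

With the numbers from the example (1% lead, 10% index query) the sketch picks doc values. At exactly 8x it still picks the index query, because the comparison is inclusive on the index side.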
Another reason the bitmap index query is useful is that it is much faster than the doc values query when the number of queried terms is small, for example only 0.01% of the total docs. The index structure can locate a small matching set directly, whereas the doc values query needs to iterate over all documents. This helps both when the bitmap query is used alone and when it is chosen as the lead iterator.
Benchmark
I use a 3-node cluster: one node is purely a client, and the other two nodes each hold one primary shard.
The index has 100 million ordered numeric documents.
These are the results of running a simple terms query, using index or doc values. The X axis represents the number of IDs that are queried. For example, 10^5 means randomly choosing 100K different numbers from 100M, which is 0.1% of the total documents. As the number of queried IDs grows, the bitmap queries perform better, mostly because of the saved network bandwidth.
If we look at just the two bitmap queries, index and doc values, you may wonder why doc values becomes much faster than index when the number of queried terms becomes large. I turned on track_total_hits, and the result is below, as expected. By default the query stops iterating once there are 10K matches, and the larger the set of queried terms, the denser the matched documents that doc values can iterate over, which I think is the reason.
Choose the Cost
The cost of the bitmap index query is based on the cardinality of the bitmap, because we do iterate over the items in the bitmap. Below are the results of using the cardinality as the cost and running a conjunction query of a term filter plus the bitmap query. Note that the legend has a small mistake: (index) actually represents the bitmap IndexOrDocValuesQuery. The red dotted line indicates the query size of the term filter.
For a term filter query size of 100K (0.1%), the IndexOrDocValuesQuery always beats the pure DocValuesQuery.
However, for a term filter query size of 1M (1%), the IndexOrDocValuesQuery seems either to take over as the lead iterator at the wrong time or to not choose the doc values query when it should.
Under this condition, pure doc values is already better at 25K, but the big speedup comes late, between 250K and 500K. So I decided to add a 20x penalty to the cost, which is currently just the cardinality, and re-ran the benchmark.
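As a rough illustration of what the penalty changes, the sketch below combines a cost of 20 × cardinality with the 8x rule from the linked Lucene heuristic. It assumes the penalty is a plain multiplication and that the lead cost equals the term filter's match count. Neither assumption is confirmed by the description, and the names (BitmapCostPenaltySketch, cost, usesDocValues) are hypothetical.

```java
// Hypothetical arithmetic sketch; the actual cost computation in the PR may differ.
final class BitmapCostPenaltySketch {

    static final long PENALTY = 20; // the 20x factor mentioned above

    // Assumed cost model: the bitmap cardinality, optionally multiplied by the penalty.
    static long cost(long bitmapCardinality, boolean penalized) {
        return penalized ? bitmapCardinality * PENALTY : bitmapCardinality;
    }

    // Same 8x rule as the Lucene heuristic: doc values only when cost / 8 > lead cost.
    static boolean usesDocValues(long indexQueryCost, long leadCost) {
        return (indexQueryCost >>> 3) > leadCost;
    }

    public static void main(String[] args) {
        long leadCost = 1_000_000L; // term filter matching 1% of 100M docs (assumed lead)
        long[] cardinalities = {25_000L, 250_000L, 500_000L};
        for (long c : cardinalities) {
            System.out.printf("cardinality=%d plain=%b penalized=%b%n",
                c,
                usesDocValues(cost(c, false), leadCost),
                usesDocValues(cost(c, true), leadCost));
        }
    }
}
```

Under these assumptions, the plain cardinality cost never triggers the switch to doc values at these sizes, while the penalized cost switches once the cardinality exceeds 0.4 × the lead cost (400K for a 1M lead).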
Related Issues
Resolves #16317
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.