Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extensible design to add new query and field type support for Star Tree #17100

Closed

Conversation

expani
Copy link
Contributor

@expani expani commented Jan 23, 2025

Description

This PR introduces constructs to help convert user queries into the equivalent star tree traversal by generating the appropriate DimensionFilter based on the type of query.

DimensionFilterMapper acts as an equivalent mapper for Star Tree supported fields and queries.

StarTreeFilterProvider is responsible for converting the user queries ( represented by QueryBuilder )

I have changed all the older tests to point towards the new classes to ensure that we get coverage for at least numeric term queries.

We have tested term, terms and range queries on numeric and keyword over http_logs for sanity of the code and fixed some tricky bugs on different cases. Thanks to @bharath-techie for help with creating end to end query test with randomness.

I will be adding Unit tests for Range and other new classes that are not covered by existing tests in the coming days.

Most of the TODO left around the code will be picked up in future when supporting Filter Aggregations and Boolean queries. If that's not the case, please comment on the same.

Related Issues

#16537
#16538
#16539

Copy link
Contributor

❌ Gradle check result for b85183d: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

low++;
}

if (lastMatchedNode instanceof FixedLengthStarTreeNode) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we move this up ?

FixedLengthStarTreeNode highStarTreeNode = binarySearchChild(high, false, lowStarTreeNode);
if (highStarTreeNode != null) {
for (int lowNodeId = lowStarTreeNode.nodeId(); lowNodeId <= highStarTreeNode.nodeId(); ++lowNodeId) {
collector.collectStarNode(new FixedLengthStarTreeNode(in, lowNodeId));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

collectStarNode is confusing since we have special star nodes as well - can it simply be collectNode or collect since collector is starTreeNodeCollector

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops thats a typo.
I will change StarTreeNodeCollector

}
}

private FixedLengthStarTreeNode binarySearchChild(long dimensionValue, boolean matchNextHighest, StarTreeNode lastMatchedNode)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Subjective but can't we rather name matchNextHighest to be match/queryLower or something similar. inverse of MatchNextHighest doesn't tell us that we are looking for match next lowest

Copy link
Contributor Author

@expani expani Jan 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. Incidentally, @msfroh had asked the same.
I will add the inverse statement while adding Java docs or let me know if there are any variable names that captures the sentiment better.

}

public Map<String, Long> getQueryMap() {
return queryMap;
public FixedBitSet getStarTreeValue(LeafReaderContext ctx) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit : can we make the naming consistent :

Technically its getStarTreeValues and getStarTreeValuesArr - since starTreeValues is the name of the variable - we can also rename to be something better like getCachedStarTreeValues/Arr as well

FLOAT,
new FloatFieldMapperNumeric(),
DOUBLE,
new DoubleFieldMapperNumeric()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lets add todo for unsigned long.

Also how does scaled float get handled here ? scaled float is a supported dimension which is present as part of modules

Copy link
Contributor Author

@expani expani Jan 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ack will add a TODO for unsigned long.

Scaled Float along with other mapped field types ( that can be registered from plugins by extending MapperPlugin ) will have this problem since they are not present in OS Core.

So, we will have to expose methods via the existing MappedFieldType which will overridden by the custom mapped types of plugin.

Given this might be a significant change, I think we should pick it up separately rather than accommodating within this PR post 2.19

return null;
}

abstract long convertToDocValues(Number parsedValue);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit : can we rename this to something better - convertToIndexedValue or something similar

private Long lowOrdinal;
private Long highOrdinal;

private boolean skipRangeCollection = false;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add comments on what this does?

Also java docs we will need to add for all new classes - which you already might be doing

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

skipCollectionForOutOfBounds maybe ?

Copy link
Contributor

❌ Gradle check result for 973edf5: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Copy link
Contributor

❌ Gradle check result for 5e7a711: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

junweid62 and others added 23 commits January 27, 2025 08:48
…t#17098)

* change version to 2.19

Signed-off-by: Junwei Dai <[email protected]>

* bring back the constructor InternalSearchResponse and SearchResponseSections

Signed-off-by: Junwei Dai <[email protected]>

---------

Signed-off-by: Junwei Dai <[email protected]>
Co-authored-by: Junwei Dai <[email protected]>
Signed-off-by: expani <[email protected]>
Introduce template query that holds the content of query which can contain placeholders and can be filled by the variables from PipelineProcessingContext produced by search processors. This allows query rewrite by the search processors.

---------

Signed-off-by: Mingshi Liu <[email protected]>
Co-authored-by: Michael Froh <[email protected]>
Signed-off-by: expani <[email protected]>
…tor (opensearch-project#17080)

Signed-off-by: Ganesh Ramadurai <[email protected]>
Co-authored-by: Ganesh Ramadurai <[email protected]>
Signed-off-by: expani <[email protected]>
…g settings on restore snapshot (opensearch-project#16957)

* Add index.knn setting to list of unmodifiable settings when restore snapshot

Signed-off-by: AnnTian Shao <[email protected]>

* Add index.knn setting to list of unmodifiable settings when restore snapshot

Signed-off-by: AnnTian Shao <[email protected]>

* Add new Setting property UnmodifiableOnRestore to mark setting as immutable on restore snapshot

Signed-off-by: AnnTian Shao <[email protected]>

* Add tests for new Setting property UnmodifiableOnRestore

Signed-off-by: AnnTian Shao <[email protected]>

* fixes and added more tests for new setting property UnmodifiableOnRestore

Signed-off-by: AnnTian Shao <[email protected]>

* fix test failures

Signed-off-by: AnnTian Shao <[email protected]>

* Revert "fix test failures"

This reverts commit 252100c.

Signed-off-by: AnnTian Shao <[email protected]>

* fixes by adding back USER_UNMODIFIABLE_SETTINGS for settings without Setting implementation

Signed-off-by: AnnTian Shao <[email protected]>

* testing CI config for registering plugin settings

Signed-off-by: AnnTian Shao <[email protected]>

* Revert "testing CI config for registering plugin settings"

This reverts commit 9ebab5a.

Signed-off-by: AnnTian Shao <[email protected]>

* Add UnmodifiableOnRestore only to unmodifiable settings that are already registered, will raise separate PR for settings not yet registered. Add validation check in Setting.java. Add UnmodifiableOnRestore settings cannot be removed on restore

Signed-off-by: AnnTian Shao <[email protected]>

* fixes and move tests to RestoreSnapshotIT

Signed-off-by: AnnTian Shao <[email protected]>

* Add back testInvalidRestoreRequestScenarios test method

Signed-off-by: AnnTian Shao <[email protected]>

---------

Signed-off-by: AnnTian Shao <[email protected]>
Signed-off-by: Tommy Shao <[email protected]>
Co-authored-by: AnnTian Shao <[email protected]>
Signed-off-by: expani <[email protected]>
Move AcknowledgedResponse, AcknowledgedRequest, AcknowledgedRequestBuilder,and
ShardsAcknowledgedResponse to "clustermanager" Java package. This is a purely
structural move done via the IDE with no logic changes.

Signed-off-by: Andrew Ross <[email protected]>
Signed-off-by: expani <[email protected]>
…-project#17005)

Prior to this fix, the _msearch API would keep running search requests
even after being canceled. With this change, we explicitly check if
the task has been canceled before kicking off subsequent requests.

---------

Signed-off-by: Michael Froh <[email protected]>
Signed-off-by: expani <[email protected]>
Signed-off-by: expani <[email protected]>
@expani expani force-pushed the startree_query_field_support branch from 97a21b1 to 629866d Compare January 27, 2025 16:48
Copy link
Contributor

❌ Gradle check result for 629866d: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@expani
Copy link
Contributor Author

expani commented Jan 27, 2025

Closing this as the replacement with clean branch is created
#17137

@expani expani closed this Jan 27, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.