Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix NPE in ConcurrentQueryProfile while computing the breakdown map for slices #10111

Merged
merged 2 commits into from
Sep 21, 2023

Conversation

sohami
Copy link
Collaborator

@sohami sohami commented Sep 18, 2023

Description

There can be cases where one or more slice may not have timing related information for its leaves in contexts map. During creation of slice and query level breakdown map it needs to handle such cases by using default values correctly. Also updating the min/max/avg sliceNodeTime to not include time to create weight and wait times by slice threads. It will reflect the min/max/avg execution time of each slice whereas totalNodeTime will reflect the total query time.

Related Issues

Resolves #9815

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Commits are signed per the DCO using --signoff
  • Commit changes are listed out in CHANGELOG.md file (See: Changelog)

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@github-actions
Copy link
Contributor

github-actions bot commented Sep 18, 2023

Compatibility status:

Checks if related components are compatible with change 9fb011c

Incompatible components

Incompatible components: [https://github.com/opensearch-project/k-nn.git]

Skipped components

Compatible components

Compatible components: [https://github.com/opensearch-project/security-analytics.git, https://github.com/opensearch-project/security.git, https://github.com/opensearch-project/custom-codecs.git, https://github.com/opensearch-project/opensearch-oci-object-storage.git, https://github.com/opensearch-project/index-management.git, https://github.com/opensearch-project/geospatial.git, https://github.com/opensearch-project/sql.git, https://github.com/opensearch-project/job-scheduler.git, https://github.com/opensearch-project/notifications.git, https://github.com/opensearch-project/observability.git, https://github.com/opensearch-project/neural-search.git, https://github.com/opensearch-project/cross-cluster-replication.git, https://github.com/opensearch-project/alerting.git, https://github.com/opensearch-project/performance-analyzer.git, https://github.com/opensearch-project/performance-analyzer-rca.git, https://github.com/opensearch-project/anomaly-detection.git, https://github.com/opensearch-project/asynchronous-search.git, https://github.com/opensearch-project/ml-commons.git, https://github.com/opensearch-project/common-utils.git, https://github.com/opensearch-project/reporting.git]

@sohami
Copy link
Collaborator Author

sohami commented Sep 18, 2023

@ticheng-aws @reta Can you please help with review.

@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

@sohami
Copy link
Collaborator Author

sohami commented Sep 19, 2023

We need to update CHANGELOG.md as well.

Don't see a need for Changelog as it is a bug fix for unreleased changes.

@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

  • RESULT: UNSTABLE ❕
  • TEST FAILURES:
      1 org.opensearch.smoketest.SmokeTestMultiNodeClientYamlTestSuiteIT.test {yaml=pit/10_basic/Delete all}

…or slices.

There can be cases where one or more slice may not have timing related information for its leaves in
contexts map. During creation of slice and query level breakdown map it needs to handle such cases by using
default values correctly. Also updating the min/max/avg sliceNodeTime to not include time to create
weight and wait times by slice threads. It will reflect the min/max/avg execution time of each slice
whereas totalNodeTime will reflect the total query time.

Signed-off-by: Sorabh Hamirwasia <[email protected]>
Signed-off-by: Sorabh Hamirwasia <[email protected]>
@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

@sohami
Copy link
Collaborator Author

sohami commented Sep 21, 2023

Gradle Check (Jenkins) Run Completed with:

All known issues:
#5030
#10133

@sohami sohami merged commit 9f0e017 into opensearch-project:main Sep 21, 2023
@reta
Copy link
Collaborator

reta commented Sep 21, 2023

@sohami I understand the frustration regarding failing checks but we should not merge pull requests that have them failed

@reta reta added the backport 2.x Backport to 2.x branch label Sep 21, 2023
@opensearch-trigger-bot
Copy link
Contributor

The backport to 2.x failed:

The process '/usr/bin/git' failed with exit code 128

To backport manually, run these commands in your terminal:

# Navigate to the root of your repository
cd $(git rev-parse --show-toplevel)
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/OpenSearch/backport-2.x 2.x
# Navigate to the new working tree
pushd ../.worktrees/OpenSearch/backport-2.x
# Create a new branch
git switch --create backport/backport-10111-to-2.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 9f0e01743de1bf23264945d54d0d5ab20593020b
# Push it to GitHub
git push --set-upstream origin backport/backport-10111-to-2.x
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/OpenSearch/backport-2.x

Then, create a pull request where the base branch is 2.x and the compare/head branch is backport/backport-10111-to-2.x.

sarthakaggarwal97 pushed a commit to sarthakaggarwal97/OpenSearch that referenced this pull request Sep 24, 2023
…or slices (opensearch-project#10111)

* Fix NPE in ConcurrentQueryProfile while computing the breakdown map for slices.

There can be cases where one or more slice may not have timing related information for its leaves in
contexts map. During creation of slice and query level breakdown map it needs to handle such cases by using
default values correctly. Also updating the min/max/avg sliceNodeTime to not include time to create
weight and wait times by slice threads. It will reflect the min/max/avg execution time of each slice
whereas totalNodeTime will reflect the total query time.

Signed-off-by: Sorabh Hamirwasia <[email protected]>

* Address review comments

Signed-off-by: Sorabh Hamirwasia <[email protected]>

---------

Signed-off-by: Sorabh Hamirwasia <[email protected]>
brusic pushed a commit to brusic/OpenSearch that referenced this pull request Sep 25, 2023
…or slices (opensearch-project#10111)

* Fix NPE in ConcurrentQueryProfile while computing the breakdown map for slices.

There can be cases where one or more slice may not have timing related information for its leaves in
contexts map. During creation of slice and query level breakdown map it needs to handle such cases by using
default values correctly. Also updating the min/max/avg sliceNodeTime to not include time to create
weight and wait times by slice threads. It will reflect the min/max/avg execution time of each slice
whereas totalNodeTime will reflect the total query time.

Signed-off-by: Sorabh Hamirwasia <[email protected]>

* Address review comments

Signed-off-by: Sorabh Hamirwasia <[email protected]>

---------

Signed-off-by: Sorabh Hamirwasia <[email protected]>
Signed-off-by: Ivan Brusic <[email protected]>
vikasvb90 pushed a commit to vikasvb90/OpenSearch that referenced this pull request Oct 10, 2023
…or slices (opensearch-project#10111)

* Fix NPE in ConcurrentQueryProfile while computing the breakdown map for slices.

There can be cases where one or more slice may not have timing related information for its leaves in
contexts map. During creation of slice and query level breakdown map it needs to handle such cases by using
default values correctly. Also updating the min/max/avg sliceNodeTime to not include time to create
weight and wait times by slice threads. It will reflect the min/max/avg execution time of each slice
whereas totalNodeTime will reflect the total query time.

Signed-off-by: Sorabh Hamirwasia <[email protected]>

* Address review comments

Signed-off-by: Sorabh Hamirwasia <[email protected]>

---------

Signed-off-by: Sorabh Hamirwasia <[email protected]>
sohami added a commit to sohami/OpenSearch that referenced this pull request Oct 24, 2023
…or slices (opensearch-project#10111)

* Fix NPE in ConcurrentQueryProfile while computing the breakdown map for slices.

There can be cases where one or more slice may not have timing related information for its leaves in
contexts map. During creation of slice and query level breakdown map it needs to handle such cases by using
default values correctly. Also updating the min/max/avg sliceNodeTime to not include time to create
weight and wait times by slice threads. It will reflect the min/max/avg execution time of each slice
whereas totalNodeTime will reflect the total query time.

Signed-off-by: Sorabh Hamirwasia <[email protected]>

* Address review comments

Signed-off-by: Sorabh Hamirwasia <[email protected]>

---------

Signed-off-by: Sorabh Hamirwasia <[email protected]>
sohami added a commit to sohami/OpenSearch that referenced this pull request Oct 30, 2023
…or slices (opensearch-project#10111)

* Fix NPE in ConcurrentQueryProfile while computing the breakdown map for slices.

There can be cases where one or more slice may not have timing related information for its leaves in
contexts map. During creation of slice and query level breakdown map it needs to handle such cases by using
default values correctly. Also updating the min/max/avg sliceNodeTime to not include time to create
weight and wait times by slice threads. It will reflect the min/max/avg execution time of each slice
whereas totalNodeTime will reflect the total query time.

Signed-off-by: Sorabh Hamirwasia <[email protected]>

* Address review comments

Signed-off-by: Sorabh Hamirwasia <[email protected]>

---------

Signed-off-by: Sorabh Hamirwasia <[email protected]>
sohami added a commit to sohami/OpenSearch that referenced this pull request Oct 31, 2023
…or slices (opensearch-project#10111)

* Fix NPE in ConcurrentQueryProfile while computing the breakdown map for slices.

There can be cases where one or more slice may not have timing related information for its leaves in
contexts map. During creation of slice and query level breakdown map it needs to handle such cases by using
default values correctly. Also updating the min/max/avg sliceNodeTime to not include time to create
weight and wait times by slice threads. It will reflect the min/max/avg execution time of each slice
whereas totalNodeTime will reflect the total query time.

Signed-off-by: Sorabh Hamirwasia <[email protected]>

* Address review comments

Signed-off-by: Sorabh Hamirwasia <[email protected]>

---------

Signed-off-by: Sorabh Hamirwasia <[email protected]>
reta pushed a commit that referenced this pull request Oct 31, 2023
…#10898)

* Add support for query profiler with concurrent aggregation (#9248)

* Add support for query profiler with concurrent aggregation (#9248)

Signed-off-by: Ticheng Lin <[email protected]>

* Refactor and work on the PR comments

Signed-off-by: Ticheng Lin <[email protected]>

* Update collectorToLeaves mapping for children breakdowns post profile metric collection and before creating the results

Signed-off-by: Sorabh Hamirwasia <[email protected]>

* Refactor logic to compute the slice level breakdown stats and query level breakdown stats for concurrent search case

Signed-off-by: Sorabh Hamirwasia <[email protected]>

* Fix QueryProfilePhaseTests and QueryProfileTests, and parameterize QueryProfilerIT with concurrent search enabled

Signed-off-by: Ticheng Lin <[email protected]>

* Handle the case when there are no leaf context to compute the profile stats to return default stats
for all breakdown type along with min/max/avg values. Replace queryStart and queryEnd time with queryNodeTime

Signed-off-by: Sorabh Hamirwasia <[email protected]>

* Add UTs for ConcurrentQueryProfileBreakdown

Signed-off-by: Sorabh Hamirwasia <[email protected]>

* Add concurrent search stats test into the QueryProfilerIT

Signed-off-by: Ticheng Lin <[email protected]>

* Address review comments

Signed-off-by: Sorabh Hamirwasia <[email protected]>

---------

Signed-off-by: Ticheng Lin <[email protected]>
Signed-off-by: Sorabh Hamirwasia <[email protected]>
Co-authored-by: Sorabh Hamirwasia <[email protected]>

* Fix NPE in ConcurrentQueryProfile while computing the breakdown map for slices (#10111)

* Fix NPE in ConcurrentQueryProfile while computing the breakdown map for slices.

There can be cases where one or more slice may not have timing related information for its leaves in
contexts map. During creation of slice and query level breakdown map it needs to handle such cases by using
default values correctly. Also updating the min/max/avg sliceNodeTime to not include time to create
weight and wait times by slice threads. It will reflect the min/max/avg execution time of each slice
whereas totalNodeTime will reflect the total query time.

Signed-off-by: Sorabh Hamirwasia <[email protected]>

* Address review comments

Signed-off-by: Sorabh Hamirwasia <[email protected]>

---------

Signed-off-by: Sorabh Hamirwasia <[email protected]>

* Fix flaky query profile phase tests with concurrent search enabled (#10547) (#10547)

Signed-off-by: Ticheng Lin <[email protected]>

* Introduce ConcurrentQueryProfiler to profile query using concurrent segment search path and support concurrency during rewrite and create weight (#10352)

* Fix timer race condition in profile rewrite and create weight for concurrent segment search (#10352)

Signed-off-by: Ticheng Lin <[email protected]>

* Refactor and work on the PR comments (#10352)

Signed-off-by: Ticheng Lin <[email protected]>

---------

Signed-off-by: Ticheng Lin <[email protected]>

---------

Signed-off-by: Ticheng Lin <[email protected]>
Signed-off-by: Sorabh Hamirwasia <[email protected]>
Co-authored-by: Ticheng Lin <[email protected]>
@ticheng-aws
Copy link
Contributor

Resolves #9869

shiv0408 pushed a commit to Gaurav614/OpenSearch that referenced this pull request Apr 25, 2024
…or slices (opensearch-project#10111)

* Fix NPE in ConcurrentQueryProfile while computing the breakdown map for slices.

There can be cases where one or more slice may not have timing related information for its leaves in
contexts map. During creation of slice and query level breakdown map it needs to handle such cases by using
default values correctly. Also updating the min/max/avg sliceNodeTime to not include time to create
weight and wait times by slice threads. It will reflect the min/max/avg execution time of each slice
whereas totalNodeTime will reflect the total query time.

Signed-off-by: Sorabh Hamirwasia <[email protected]>

* Address review comments

Signed-off-by: Sorabh Hamirwasia <[email protected]>

---------

Signed-off-by: Sorabh Hamirwasia <[email protected]>
Signed-off-by: Shivansh Arora <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport 2.x Backport to 2.x branch backport-failed bug Something isn't working flaky-test Random test failure that succeeds on second run skip-changelog
Projects
None yet
3 participants