Limit RW separation to remote store enabled clusters and update recovery flow #16760

Merged
merged 15 commits into from
Jan 10, 2025

Conversation

mch2 commented Dec 2, 2024

Description

This PR includes multiple changes to search replica recovery to further decouple these shards from primaries.

  1. Recover search replicas from an empty store instead of from a peer. This runs a store recovery that syncs segments directly from the remote store and eliminates all communication with the primary.
  2. Remove search replicas from the in-sync allocation ID set and update the routing table to exclude them from allAllocationIds. This ensures primaries neither track search replicas nor validate the routing table for their presence.
  3. Simplify RW separation by limiting it to remote store enabled clusters. Versions of the above changes are still possible with primary-based node-to-node replication, but they require additional public API changes, and I don't think we need them at this time.
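
As a rough illustration of the three changes above, here is a minimal, self-contained sketch. It is not the code in this PR: `SearchReplicaRecoverySketch`, `ReplicaCopy`, `RecoveryKind`, `validate`, `recoveryKindFor`, and `trackedAllocationIds` are hypothetical names invented for the example, and the assumption that a regular write replica keeps using peer recovery is a simplification.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical, simplified model. None of these types are OpenSearch classes.
final class SearchReplicaRecoverySketch {

    /** How a replica copy obtains its initial data. */
    enum RecoveryKind { EMPTY_STORE, PEER }

    /** A non-primary shard copy, identified by its allocation ID. */
    static final class ReplicaCopy {
        final String allocationId;
        final boolean searchOnly;

        ReplicaCopy(String allocationId, boolean searchOnly) {
            this.allocationId = allocationId;
            this.searchOnly = searchOnly;
        }
    }

    /** Item 3: search replicas are only accepted on remote store enabled clusters. */
    static void validate(ReplicaCopy copy, boolean remoteStoreEnabled) {
        if (copy.searchOnly && !remoteStoreEnabled) {
            throw new IllegalArgumentException("search replicas require a remote store enabled cluster");
        }
    }

    /** Item 1: search replicas start from an empty store; write replicas (assumed) recover from a peer. */
    static RecoveryKind recoveryKindFor(ReplicaCopy copy, boolean remoteStoreEnabled) {
        validate(copy, remoteStoreEnabled);
        return copy.searchOnly ? RecoveryKind.EMPTY_STORE : RecoveryKind.PEER;
    }

    /** Item 2: the set of allocation IDs the primary tracks leaves out search replicas. */
    static Set<String> trackedAllocationIds(String primaryAllocationId, List<ReplicaCopy> replicas) {
        Set<String> ids = new LinkedHashSet<>();
        ids.add(primaryAllocationId);
        for (ReplicaCopy replica : replicas) {
            if (!replica.searchOnly) {
                ids.add(replica.allocationId);
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        List<ReplicaCopy> replicas = Arrays.asList(
            new ReplicaCopy("write-1", false),
            new ReplicaCopy("search-1", true)
        );
        for (ReplicaCopy replica : replicas) {
            System.out.println(replica.allocationId + " -> " + recoveryKindFor(replica, true));
        }
        System.out.println("tracked by primary: " + trackedAllocationIds("primary-0", replicas));
    }
}
```

In this sketch the search replica never appears in the tracked set, which mirrors the intent that a primary does not need to know about search replicas at all.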

Related Issues

Resolves #15952

Check List

  • Functionality includes testing.
  • API changes companion pull request created, if applicable.
  • Public documentation issue/PR created, if applicable.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

github-actions bot added the v2.19.0 label Dec 2, 2024

github-actions bot commented Dec 2, 2024

❌ Gradle check result for a932d59:

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

github-actions bot commented Dec 3, 2024

✅ Gradle check result for a932d59: SUCCESS

codecov bot commented Dec 3, 2024

Codecov Report

Attention: Patch coverage is 69.35484% with 19 lines in your changes missing coverage. Please review.

Project coverage is 72.10%. Comparing base (f6dc4a6) to head (1c95833).
Report is 3 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...in/java/org/opensearch/index/shard/IndexShard.java | 44.44% | 2 Missing and 3 partials ⚠️ |
| ...search/cluster/routing/IndexShardRoutingTable.java | 42.85% | 2 Missing and 2 partials ⚠️ |
| .../opensearch/cluster/routing/IndexRoutingTable.java | 78.57% | 2 Missing and 1 partial ⚠️ |
| ...llocation/decider/ThrottlingAllocationDecider.java | 84.61% | 0 Missing and 2 partials ⚠️ |
| ...a/org/opensearch/cluster/routing/ShardRouting.java | 66.66% | 0 Missing and 1 partial ⚠️ |
| ...uster/routing/allocation/IndexMetadataUpdater.java | 75.00% | 0 Missing and 1 partial ⚠️ |
| ...org/opensearch/index/seqno/ReplicationTracker.java | 0.00% | 0 Missing and 1 partial ⚠️ |
| ...a/org/opensearch/index/shard/ReplicationGroup.java | 50.00% | 0 Missing and 1 partial ⚠️ |
| ...java/org/opensearch/index/shard/StoreRecovery.java | 80.00% | 0 Missing and 1 partial ⚠️ |

Additional details and impacted files
@@             Coverage Diff              @@
##               main   #16760      +/-   ##
============================================
- Coverage     72.17%   72.10%   -0.08%     
+ Complexity    65251    65197      -54     
============================================
  Files          5301     5301              
  Lines        303662   303723      +61     
  Branches      43989    44006      +17     
============================================
- Hits         219181   218994     -187     
- Misses        66552    66730     +178     
- Partials      17929    17999      +70     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

mch2 added the backport 2.x label Dec 3, 2024

github-actions bot commented Dec 3, 2024

✅ Gradle check result for 8935bc7: SUCCESS

mch2 added 9 commits January 8, 2025 17:13
… the AllAllocationIds set in the routing table

Signed-off-by: Marc Handalian <[email protected]>
…e store cluster.

This check had previously only checked for segrep

Signed-off-by: Marc Handalian <[email protected]>
Signed-off-by: Marc Handalian <[email protected]>
… a remote store cluster."

reverting this, we already check for remote store earlier.

This reverts commit 48ca1a3.

Signed-off-by: Marc Handalian <[email protected]>
Signed-off-by: Marc Handalian <[email protected]>
This commit adds PR feedback and recovery tests post node restart.

Signed-off-by: Marc Handalian <[email protected]>

mch2 commented Jan 9, 2025

apologies for the rebase, had to fix DCO check on an old commit

github-actions bot commented Jan 9, 2025

❌ Gradle check result for cf68380: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Signed-off-by: Marc Handalian <[email protected]>

github-actions bot commented Jan 9, 2025

❌ Gradle check result for eaa38d9: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

github-actions bot commented Jan 9, 2025

✅ Gradle check result for eaa38d9: SUCCESS

❌ Gradle check result for 002323e: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

✅ Gradle check result for 002323e: SUCCESS

✅ Gradle check result for 1c95833: SUCCESS

mch2 merged commit 8191de8 into opensearch-project:main Jan 10, 2025
36 of 37 checks passed
@opensearch-trigger-bot

The backport to 2.x failed:

The process '/usr/bin/git' failed with exit code 128

To backport manually, run these commands in your terminal:

# Navigate to the root of your repository
cd $(git rev-parse --show-toplevel)
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/OpenSearch/backport-2.x 2.x
# Navigate to the new working tree
pushd ../.worktrees/OpenSearch/backport-2.x
# Create a new branch
git switch --create backport/backport-16760-to-2.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 8191de85856d291507d09a7fd425908843ed8675
# Push it to GitHub
git push --set-upstream origin backport/backport-16760-to-2.x
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/OpenSearch/backport-2.x

Then, create a pull request where the base branch is 2.x and the compare/head branch is backport/backport-16760-to-2.x.

mch2 added a commit to mch2/OpenSearch that referenced this pull request Jan 10, 2025
…ery flow (opensearch-project#16760)

* Update search only replica recovery flow

This PR includes multiple changes to search replica recovery.
1. Change search-only replica copies to recover as empty store instead of PEER. This runs a store recovery that syncs segments directly from the remote store and eliminates any primary communication.
2. Remove search replicas from the in-sync allocation ID set and update the routing table to exclude them from allAllocationIds. This ensures primaries aren't tracking or validating the routing table for any search replica's presence.
3. Change search replica validation to require remote store. Versions of the above changes are still possible with primary-based node-to-node replication, but I don't think they are worth making at this time.

Signed-off-by: Marc Handalian <[email protected]>

* more coverage

Signed-off-by: Marc Handalian <[email protected]>

* add changelog entry

Signed-off-by: Marc Handalian <[email protected]>

* add assertions that Search Replicas are not in the in-sync id set nor the AllAllocationIds set in the routing table

Signed-off-by: Marc Handalian <[email protected]>

* update async task to only run if the FF is enabled and we are a remote store cluster.

This check had previously only checked for segrep

Signed-off-by: Marc Handalian <[email protected]>

* clean up max shards logic

Signed-off-by: Marc Handalian <[email protected]>

* remove search replicas from check during renewPeerRecoveryRetentionLeases

Signed-off-by: Marc Handalian <[email protected]>

* Revert "update async task to only run if the FF is enabled and we are a remote store cluster."

reverting this, we already check for remote store earlier.

This reverts commit 48ca1a3.

Signed-off-by: Marc Handalian <[email protected]>

* Add more tests for failover case

Signed-off-by: Marc Handalian <[email protected]>

* Update remotestore restore logic and add test ensuring we can restore only writers when red

Signed-off-by: Marc Handalian <[email protected]>

* Fix Search replicas to honor node level recovery limits

Signed-off-by: Marc Handalian <[email protected]>

* Fix translog UUID mismatch on existing store recovery.

This commit adds PR feedback and recovery tests post node restart.

Signed-off-by: Marc Handalian <[email protected]>

* Fix spotless

Signed-off-by: Marc Handalian <[email protected]>

* Fix bug with remote restore and add more tests

Signed-off-by: Marc Handalian <[email protected]>

---------

Signed-off-by: Marc Handalian <[email protected]>
(cherry picked from commit 8191de8)
msfroh pushed a commit that referenced this pull request Jan 10, 2025
…ery flow (#16760) (#16999)

Labels
backport 2.x Backport to 2.x branch backport-failed v2.19.0 Issues and PRs related to version 2.19.0
Development

Successfully merging this pull request may close these issues.

[RW Separation] Change search replica recovery flow
4 participants