Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Remote Store] Changes to perform repository registration during bootstrap via node attributes. #9105

Merged
merged 34 commits into from
Sep 6, 2023
Merged
Show file tree
Hide file tree
Changes from 33 commits
Commits
Show all changes
34 commits
Select commit Hold shift + click to select a range
0041386
[Remote Store] Changes to perform repository registration during boot…
psychbot Aug 26, 2023
aaa8fd1
Code changes to do the repository verification at the time of node bo…
psychbot Aug 26, 2023
ee7d06a
Fixing repository verification logic
psychbot Aug 26, 2023
58466a7
Updating javadoc and RepoRegistration IT
psychbot Aug 26, 2023
d72aa58
Using CoundDownLatch
psychbot Aug 27, 2023
bf5bfda
Fixing tests
psychbot Aug 27, 2023
f225122
Updating CompatibilityMode enum
psychbot Aug 27, 2023
64d0e31
Updating REMOTE_STORE_COMPATIBILITY_MODE_SETTING value enum javadoc
psychbot Aug 27, 2023
6284bb3
Addressed few comments, updated javadocs and added verify logic to be…
psychbot Aug 28, 2023
edd2148
Introducing verifylocally to perform verification without cluster bei…
psychbot Aug 28, 2023
c6a3135
Updating/Fixing tests
psychbot Aug 29, 2023
efeef9e
Fixing ITs
psychbot Aug 29, 2023
1482544
Removing verifyLocally and move to existing verify methods. If the re…
psychbot Aug 30, 2023
a295af6
Fixing Additional Tests, there might still be some flaky ones
psychbot Aug 30, 2023
d67801b
Removing remote store specific cluster settings instead rely on node …
psychbot Aug 31, 2023
8d9df13
Removing unnecessary diffs
psychbot Aug 31, 2023
7e95361
Fixing all UTs and ITs, Addressing Comments
psychbot Sep 2, 2023
f26b1a2
Removing unintended changes
psychbot Sep 2, 2023
f1785f1
Fixing setup issues and assertions for a few ITs
psychbot Sep 2, 2023
4e05540
Reverting deleted settings and fixing tests post rebase
psychbot Sep 2, 2023
7de82b4
Addressing Comments
psychbot Sep 3, 2023
b0a4523
Addressing comments
psychbot Sep 3, 2023
401b6ec
Adding logic to clean up repository if create or verification fails
psychbot Sep 3, 2023
8bed219
Fixing nit issues to reduce overall code changes
psychbot Sep 4, 2023
eb0b049
fix
psychbot Sep 4, 2023
c3717b0
Reverting the exception handling during repository creation and verif…
psychbot Sep 4, 2023
c4cb780
Adding cluster manager node creation to avoid flaky test as this had …
psychbot Sep 4, 2023
b06dab5
Addressing comments
psychbot Sep 4, 2023
e243e18
Using local node to compute repository metadata till we only support …
psychbot Sep 5, 2023
46ffa47
Using existing nodes instead of local node
psychbot Sep 5, 2023
41e78a7
Using stream find first
psychbot Sep 5, 2023
c43a112
Changes to update RepositoriesService repositories with system reposi…
psychbot Sep 5, 2023
899ae5d
Fixing precommit
psychbot Sep 5, 2023
8c54a86
Fixing flaky test #9776
psychbot Sep 6, 2023
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -103,6 +103,7 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
- Add average concurrency metric for concurrent segment search ([#9670](https://github.com/opensearch-project/OpenSearch/issues/9670))
- [Remote state] Integrate remote cluster state in publish/commit flow ([#9665](https://github.com/opensearch-project/OpenSearch/pull/9665))
- [Segment Replication] Adding segment replication statistics rolled up at index, node and cluster level ([#9709](https://github.com/opensearch-project/OpenSearch/pull/9709))
- [Remote Store] Changes to introduce repository registration during bootstrap via node attributes. ([#9105](https://github.com/opensearch-project/OpenSearch/pull/9105))

### Dependencies
- Bump `org.apache.logging.log4j:log4j-core` from 2.17.1 to 2.20.0 ([#8307](https://github.com/opensearch-project/OpenSearch/pull/8307))
Expand Down
4 changes: 2 additions & 2 deletions distribution/src/config/opensearch.yml
Original file line number Diff line number Diff line change
Expand Up @@ -92,10 +92,10 @@ ${path.logs}
# cluster.remote_store.enabled: true
#
# Repository to use for segment upload while enforcing remote store for an index
# cluster.remote_store.segment.repository: my-repo-1
# node.attr.remote_store.segment.repository: my-repo-1
#
# Repository to use for translog upload while enforcing remote store for an index
# cluster.remote_store.translog.repository: my-repo-1
# node.attr.remote_store.translog.repository: my-repo-1
#
# ---------------------------------- Experimental Features -----------------------------------
#
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -26,10 +26,15 @@
import java.nio.file.Path;
import java.util.Arrays;
import java.util.Collections;
import java.util.Locale;
import java.util.Set;
import java.util.stream.Collectors;

import static org.opensearch.remotestore.RemoteStoreBaseIntegTestCase.remoteStoreClusterSettings;
import static org.opensearch.indices.IndicesService.CLUSTER_REPLICATION_TYPE_SETTING;
import static org.opensearch.node.remotestore.RemoteStoreNodeAttribute.REMOTE_STORE_REPOSITORY_SETTINGS_ATTRIBUTE_KEY_PREFIX;
import static org.opensearch.node.remotestore.RemoteStoreNodeAttribute.REMOTE_STORE_REPOSITORY_TYPE_ATTRIBUTE_KEY_FORMAT;
import static org.opensearch.node.remotestore.RemoteStoreNodeAttribute.REMOTE_STORE_SEGMENT_REPOSITORY_NAME_ATTRIBUTE_KEY;
import static org.opensearch.node.remotestore.RemoteStoreNodeAttribute.REMOTE_STORE_TRANSLOG_REPOSITORY_NAME_ATTRIBUTE_KEY;
import static org.opensearch.test.hamcrest.OpenSearchAssertions.assertAcked;

public abstract class AbstractRemoteStoreMockRepositoryIntegTestCase extends AbstractSnapshotIntegTestCase {
Expand All @@ -47,7 +52,6 @@ protected Settings featureFlagSettings() {
public void setup() {
FeatureFlagSetter.set(FeatureFlags.REMOTE_STORE);
FeatureFlagSetter.set(FeatureFlags.SEGMENT_REPLICATION_EXPERIMENTAL);
internalCluster().startClusterManagerOnlyNode(remoteStoreClusterSettings(REPOSITORY_NAME, TRANSLOG_REPOSITORY_NAME));
}

@Override
Expand All @@ -66,6 +70,43 @@ protected Settings remoteStoreIndexSettings(int numberOfReplicas) {
.build();
}

public Settings buildRemoteStoreNodeAttributes(Path repoLocation, double ioFailureRate, String skipExceptionBlobList, long maxFailure) {
String segmentRepoTypeAttributeKey = String.format(
Locale.getDefault(),
"node.attr." + REMOTE_STORE_REPOSITORY_TYPE_ATTRIBUTE_KEY_FORMAT,
REPOSITORY_NAME
);
String translogRepoTypeAttributeKey = String.format(
Locale.getDefault(),
"node.attr." + REMOTE_STORE_REPOSITORY_TYPE_ATTRIBUTE_KEY_FORMAT,
TRANSLOG_REPOSITORY_NAME
);
String segmentRepoSettingsAttributeKeyPrefix = String.format(
Locale.getDefault(),
"node.attr." + REMOTE_STORE_REPOSITORY_SETTINGS_ATTRIBUTE_KEY_PREFIX,
REPOSITORY_NAME
);
String translogRepoSettingsAttributeKeyPrefix = String.format(
Locale.getDefault(),
"node.attr." + REMOTE_STORE_REPOSITORY_SETTINGS_ATTRIBUTE_KEY_PREFIX,
TRANSLOG_REPOSITORY_NAME
);

return Settings.builder()
.put("node.attr." + REMOTE_STORE_SEGMENT_REPOSITORY_NAME_ATTRIBUTE_KEY, REPOSITORY_NAME)
.put(segmentRepoTypeAttributeKey, "mock")
.put(segmentRepoSettingsAttributeKeyPrefix + "location", repoLocation)
.put(segmentRepoSettingsAttributeKeyPrefix + "random_control_io_exception_rate", ioFailureRate)
.put(segmentRepoSettingsAttributeKeyPrefix + "skip_exception_on_verification_file", true)
.put(segmentRepoSettingsAttributeKeyPrefix + "skip_exception_on_list_blobs", true)
.put(segmentRepoSettingsAttributeKeyPrefix + "skip_exception_on_blobs", skipExceptionBlobList)
.put(segmentRepoSettingsAttributeKeyPrefix + "max_failure_number", maxFailure)
.put("node.attr." + REMOTE_STORE_TRANSLOG_REPOSITORY_NAME_ATTRIBUTE_KEY, TRANSLOG_REPOSITORY_NAME)
.put(translogRepoTypeAttributeKey, "mock")
.put(translogRepoSettingsAttributeKeyPrefix + "location", repoLocation)
.build();
}

protected void deleteRepo() {
logger.info("--> Deleting the repository={}", REPOSITORY_NAME);
assertAcked(clusterAdmin().prepareDeleteRepository(REPOSITORY_NAME));
Expand All @@ -74,26 +115,18 @@ protected void deleteRepo() {
}

protected String setup(Path repoLocation, double ioFailureRate, String skipExceptionBlobList, long maxFailure) {
logger.info("--> Creating repository={} at the path={}", REPOSITORY_NAME, repoLocation);
// The random_control_io_exception_rate setting ensures that 10-25% of all operations to remote store results in
/// IOException. skip_exception_on_verification_file & skip_exception_on_list_blobs settings ensures that the
// repository creation can happen without failure.
createRepository(
REPOSITORY_NAME,
"mock",
Settings.builder()
.put("location", repoLocation)
.put("random_control_io_exception_rate", ioFailureRate)
.put("skip_exception_on_verification_file", true)
.put("skip_exception_on_list_blobs", true)
// Skipping is required for metadata as it is part of recovery
.put("skip_exception_on_blobs", skipExceptionBlobList)
.put("max_failure_number", maxFailure)
);
logger.info("--> Creating repository={} at the path={}", TRANSLOG_REPOSITORY_NAME, repoLocation);
createRepository(TRANSLOG_REPOSITORY_NAME, "mock", Settings.builder().put("location", repoLocation));
Settings.Builder settings = Settings.builder()
.put(buildRemoteStoreNodeAttributes(repoLocation, ioFailureRate, skipExceptionBlobList, maxFailure));

if (randomBoolean()) {
settings.put(CLUSTER_REPLICATION_TYPE_SETTING.getKey(), ReplicationType.SEGMENT);
}

String dataNodeName = internalCluster().startDataOnlyNodes(1).get(0);
internalCluster().startClusterManagerOnlyNode(settings.build());
String dataNodeName = internalCluster().startDataOnlyNode(settings.build());
createIndex(INDEX_NAME);
logger.info("--> Created index={}", INDEX_NAME);
ensureYellowAndNoInitializingShards(INDEX_NAME);
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -75,8 +75,8 @@ public void testDefaultRemoteStoreNoUserOverrideExceptReplicationTypeSegment() t
verifyRemoteStoreIndexSettings(
indexSettings,
"true",
"my-segment-repo-1",
"my-translog-repo-1",
REPOSITORY_NAME,
REPOSITORY_2_NAME,
ReplicationType.SEGMENT.toString(),
IndexSettings.DEFAULT_REMOTE_TRANSLOG_BUFFER_INTERVAL
);
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -13,12 +13,9 @@
import org.opensearch.cluster.metadata.IndexMetadata;
import org.opensearch.common.settings.Settings;
import org.opensearch.common.unit.TimeValue;
import org.opensearch.common.util.FeatureFlags;
import org.opensearch.index.IndexSettings;
import org.opensearch.indices.replication.common.ReplicationType;
import org.opensearch.test.FeatureFlagSetter;
import org.opensearch.test.OpenSearchIntegTestCase;
import org.junit.After;
import org.junit.Before;

import java.util.Locale;
Expand All @@ -28,53 +25,15 @@
import static org.opensearch.cluster.metadata.IndexMetadata.SETTING_REMOTE_TRANSLOG_STORE_REPOSITORY;
import static org.opensearch.cluster.metadata.IndexMetadata.SETTING_REPLICATION_TYPE;
import static org.opensearch.index.IndexSettings.INDEX_REMOTE_TRANSLOG_BUFFER_INTERVAL_SETTING;
import static org.opensearch.remotestore.RemoteStoreBaseIntegTestCase.remoteStoreClusterSettings;
import static org.opensearch.test.hamcrest.OpenSearchAssertions.assertAcked;
import static org.hamcrest.Matchers.containsString;

@OpenSearchIntegTestCase.ClusterScope(scope = OpenSearchIntegTestCase.Scope.TEST)
public class CreateRemoteIndexIT extends OpenSearchIntegTestCase {

@After
public void teardown() {
assertAcked(clusterAdmin().prepareDeleteRepository("my-segment-repo-1"));
assertAcked(clusterAdmin().prepareDeleteRepository("my-translog-repo-1"));
assertAcked(clusterAdmin().prepareDeleteRepository("my-custom-repo"));
}

@Override
protected Settings nodeSettings(int nodeOriginal) {
Settings settings = super.nodeSettings(nodeOriginal);
Settings.Builder builder = Settings.builder()
.put(remoteStoreClusterSettings("my-segment-repo-1", "my-translog-repo-1"))
.put(settings);
return builder.build();
}

@Override
protected Settings featureFlagSettings() {
return Settings.builder().put(super.featureFlagSettings()).put(FeatureFlags.REMOTE_STORE, "true").build();
}
public class CreateRemoteIndexIT extends RemoteStoreBaseIntegTestCase {

@Before
public void setup() {
FeatureFlagSetter.set(FeatureFlags.REMOTE_STORE);
internalCluster().startClusterManagerOnlyNode();
assertAcked(
clusterAdmin().preparePutRepository("my-segment-repo-1")
.setType("fs")
.setSettings(Settings.builder().put("location", randomRepoPath().toAbsolutePath()))
);
assertAcked(
clusterAdmin().preparePutRepository("my-translog-repo-1")
.setType("fs")
.setSettings(Settings.builder().put("location", randomRepoPath().toAbsolutePath()))
);
assertAcked(
clusterAdmin().preparePutRepository("my-custom-repo")
.setType("fs")
.setSettings(Settings.builder().put("location", randomRepoPath().toAbsolutePath()))
);
public void setup() throws Exception {
internalCluster().startNodes(2);
}

public void testDefaultRemoteStoreNoUserOverride() throws Exception {
Expand All @@ -91,8 +50,8 @@ public void testDefaultRemoteStoreNoUserOverride() throws Exception {
verifyRemoteStoreIndexSettings(
indexSettings,
"true",
"my-segment-repo-1",
"my-translog-repo-1",
REPOSITORY_NAME,
REPOSITORY_2_NAME,
ReplicationType.SEGMENT.toString(),
IndexSettings.DEFAULT_REMOTE_TRANSLOG_BUFFER_INTERVAL
);
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -27,30 +27,38 @@
import org.opensearch.test.OpenSearchIntegTestCase;
import org.opensearch.test.disruption.NetworkDisruption;
import org.opensearch.test.transport.MockTransportService;
import org.junit.Before;

import java.nio.file.Path;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;
import java.util.stream.Collectors;
import java.util.stream.Stream;

import static org.opensearch.test.hamcrest.OpenSearchAssertions.assertAcked;
import static org.opensearch.test.hamcrest.OpenSearchAssertions.assertHitCount;
import static org.opensearch.test.hamcrest.OpenSearchAssertions.assertNoFailures;
import static org.hamcrest.Matchers.equalTo;

@OpenSearchIntegTestCase.ClusterScope(scope = OpenSearchIntegTestCase.Scope.TEST, numDataNodes = 0)

public class PrimaryTermValidationIT extends RemoteStoreBaseIntegTestCase {

private static final String INDEX_NAME = "remote-store-test-idx-1";
protected Path absolutePath;
protected Path absolutePath2;

@Override
protected Collection<Class<? extends Plugin>> nodePlugins() {
return Arrays.asList(MockTransportService.TestPlugin.class);
}

@Before
public void setup() {
absolutePath = randomRepoPath().toAbsolutePath();
absolutePath2 = randomRepoPath().toAbsolutePath();
}

public void testPrimaryTermValidation() throws Exception {
// Follower checker interval is lower compared to leader checker so that the cluster manager can remove the node
// with network partition faster. The follower check retry count is also kept 1.
Expand All @@ -61,20 +69,12 @@ public void testPrimaryTermValidation() throws Exception {
.put(FollowersChecker.FOLLOWER_CHECK_TIMEOUT_SETTING.getKey(), "1s")
.put(FollowersChecker.FOLLOWER_CHECK_INTERVAL_SETTING.getKey(), "1s")
.put(FollowersChecker.FOLLOWER_CHECK_RETRY_COUNT_SETTING.getKey(), 1)
.put(remoteStoreClusterSettings(REPOSITORY_NAME, REPOSITORY_2_NAME, true))
.put(remoteStoreClusterSettings(REPOSITORY_NAME, absolutePath, REPOSITORY_2_NAME, absolutePath2))
.build();
internalCluster().startClusterManagerOnlyNode(clusterSettings);

// Create repository
absolutePath = randomRepoPath().toAbsolutePath();
assertAcked(
clusterAdmin().preparePutRepository(REPOSITORY_NAME).setType("fs").setSettings(Settings.builder().put("location", absolutePath))
);
absolutePath2 = randomRepoPath().toAbsolutePath();
putRepository(absolutePath2, REPOSITORY_2_NAME);

// Start data nodes and create index
internalCluster().startDataOnlyNodes(2, clusterSettings);

// Create index
createIndex(INDEX_NAME, remoteStoreIndexSettings(1));
ensureYellowAndNoInitializingShards(INDEX_NAME);
ensureGreen(INDEX_NAME);
Expand Down Expand Up @@ -156,6 +156,7 @@ public void testPrimaryTermValidation() throws Exception {
// received the following exception.
ShardNotFoundException exception = assertThrows(ShardNotFoundException.class, () -> indexSameDoc(primaryNode, INDEX_NAME));
assertTrue(exception.getMessage().contains("no such shard"));
internalCluster().clearDisruptionScheme();
ensureStableCluster(3);
ensureGreen(INDEX_NAME);
}
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -18,7 +18,6 @@
import java.nio.file.Path;

import static org.opensearch.remotestore.RemoteStoreBaseIntegTestCase.remoteStoreClusterSettings;
import static org.opensearch.test.hamcrest.OpenSearchAssertions.assertAcked;

@OpenSearchIntegTestCase.ClusterScope(scope = OpenSearchIntegTestCase.Scope.TEST)
public class RemoteIndexPrimaryRelocationIT extends IndexPrimaryRelocationIT {
Expand All @@ -29,15 +28,12 @@ public class RemoteIndexPrimaryRelocationIT extends IndexPrimaryRelocationIT {

public void setup() {
absolutePath = randomRepoPath().toAbsolutePath();
assertAcked(
clusterAdmin().preparePutRepository(REPOSITORY_NAME).setType("fs").setSettings(Settings.builder().put("location", absolutePath))
);
}

protected Settings nodeSettings(int nodeOrdinal) {
return Settings.builder()
.put(super.nodeSettings(nodeOrdinal))
.put(remoteStoreClusterSettings(REPOSITORY_NAME, REPOSITORY_NAME, false))
.put(remoteStoreClusterSettings(REPOSITORY_NAME, absolutePath))
.build();
}

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -31,11 +31,19 @@ public class RemoteIndexRecoveryIT extends IndexRecoveryIT {

protected static final String REPOSITORY_NAME = "test-remote-store-repo";

protected Path absolutePath;
protected Path repositoryPath;

@Before
public void setup() {
repositoryPath = randomRepoPath().toAbsolutePath();
}

@Override
protected Settings nodeSettings(int nodeOrdinal) {
return Settings.builder().put(super.nodeSettings(nodeOrdinal)).put(remoteStoreClusterSettings(REPOSITORY_NAME)).build();
return Settings.builder()
.put(super.nodeSettings(nodeOrdinal))
.put(remoteStoreClusterSettings(REPOSITORY_NAME, repositoryPath))
.build();
}

@Override
Expand All @@ -47,17 +55,6 @@ protected Settings featureFlagSettings() {
.build();
}

@Before
@Override
public void setUp() throws Exception {
super.setUp();
internalCluster().startClusterManagerOnlyNode();
absolutePath = randomRepoPath().toAbsolutePath();
assertAcked(
clusterAdmin().preparePutRepository(REPOSITORY_NAME).setType("fs").setSettings(Settings.builder().put("location", absolutePath))
);
}

@Override
public Settings indexSettings() {
return Settings.builder()
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -28,6 +28,7 @@
import org.opensearch.snapshots.AbstractSnapshotIntegTestCase;
import org.opensearch.snapshots.SnapshotState;
import org.opensearch.test.InternalTestCluster;
import org.opensearch.test.OpenSearchIntegTestCase;
import org.junit.After;
import org.junit.Before;

Expand All @@ -43,14 +44,14 @@
import static org.hamcrest.Matchers.equalTo;
import static org.hamcrest.Matchers.greaterThan;

@OpenSearchIntegTestCase.ClusterScope(scope = OpenSearchIntegTestCase.Scope.TEST, numDataNodes = 0)
public class RemoteRestoreSnapshotIT extends AbstractSnapshotIntegTestCase {
private static final String BASE_REMOTE_REPO = "test-rs-repo" + TEST_REMOTE_STORE_REPO_SUFFIX;
private Path remoteRepoPath;

@Before
public void setup() {
remoteRepoPath = randomRepoPath().toAbsolutePath();
createRepository(BASE_REMOTE_REPO, "fs", remoteRepoPath);
}

@After
Expand All @@ -63,7 +64,7 @@ protected Settings nodeSettings(int nodeOrdinal) {
return Settings.builder()
.put(super.nodeSettings(nodeOrdinal))
.put(FeatureFlags.REMOTE_STORE, "true")
.put(remoteStoreClusterSettings(BASE_REMOTE_REPO))
.put(remoteStoreClusterSettings(BASE_REMOTE_REPO, remoteRepoPath))
.build();
}

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -27,7 +27,6 @@ public class RemoteSegmentStatsFromNodesStatsIT extends RemoteStoreBaseIntegTest
@Before
public void setup() {
setupCustomCluster();
setupRepo(false);
}

private void setupCustomCluster() {
Expand Down
Loading