Add instructions for deploying migration assistant. #8798

Merged
merged 6 commits into from
Dec 2, 2024
154 changes: 45 additions & 109 deletions _migrations/deploying-migration-assistant/configuration-options.md
Expand Up @@ -7,66 +7,35 @@

# Configuration options

This page outlines the configuration options for three key migrations:
1. **Metadata Migration**
2. **Backfill Migration with Reindex-from-Snapshot (RFS)**
3. **Live Capture Migration with Capture and Replay (C&R)**
This page outlines the configuration options for three key migration scenarios:

Each of these migrations may depend on either a snapshot or a capture proxy. The CDK context blocks below are shown as separate context blocks for each migration type for simplicity. If performing multiple migration types, combine these options, as the actual execution of each migration is controlled from the Migration Console.
1. **Metadata migration**
2. **Backfill migration with `Reindex-from-Snapshot` (RFS)**
3. **Live capture migration with Capture and Replay (C&R)**
Collaborator

In the intro file, we do not use C&R as an abbreviation for "Capture and Replay". Please make consistent across files.

Collaborator Author

Switched to capitalized.


It also has a section describing how to specify the auth details for the source and target cluster (no auth, basic auth with a username and password, or sigv4 auth).
Each of these migrations depends on either a snapshot or a capture proxy. The following example `cdk.context.json` configurations are used by AWS Cloud Development Kit (AWS CDK) to deploy and configure Migration Assistant for OpenSearch, shown as separate blocks for each migration type. If you are performing a migration applicable to multiple scenarios, these options can be combined.
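
As a rough illustration of combining options for more than one scenario, the following Python sketch merges per-scenario option blocks into a single context entry. The option names come from this page; the context label `my-migration`, the helper name `combine_context_options`, and the shallow-merge behavior are assumptions made for the example, not part of the Migration Assistant tooling.

```python
import json


def combine_context_options(*option_blocks):
    """Shallow-merge option dicts; later blocks win when a key repeats."""
    combined = {}
    for block in option_blocks:
        combined.update(block)
    return combined


# Options shared by every scenario (placeholder value kept as a string).
shared_options = {"stage": "dev", "vpcId": "<VPC_ID>"}

# Scenario-specific options named on this page.
backfill_options = {"reindexFromSnapshotServiceEnabled": True}
live_capture_options = {
    "captureProxyServiceEnabled": True,
    "trafficReplayerServiceEnabled": True,
}

# "my-migration" is a hypothetical context label chosen for this sketch.
context = {
    "my-migration": combine_context_options(
        shared_options, backfill_options, live_capture_options
    )
}

rendered = json.dumps(context, indent=2)
print(rendered)
```

The merge is intentionally shallow: nested blocks such as `sourceCluster` would be replaced wholesale rather than merged key by key, which may or may not be what a given deployment needs.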

> [!TIP]
For a complete list of configuration options, please refer to the [opensearch-migrations options.md](https://github.com/opensearch-project/opensearch-migrations/blob/main/deployment/cdk/opensearch-service-migration/options.md) but please open an issue for consultation if changing an option that is not listed on this page.

Options for the source cluster endpoint, target cluster endpoint, and existing VPC should be configured for the Migration tools to function effectively.
For a complete list of configuration options, see [opensearch-migrations-options.md](https://github.com/opensearch-project/opensearch-migrations/blob/main/deployment/cdk/opensearch-service-migration/options.md). If you need a configuration option that is not found on this page, create an issue in the [OpenSearch Migrations repository](https://github.com/opensearch-project/opensearch-migrations/issues).

Check failure on line 19 in _migrations/deploying-migration-assistant/configuration-options.md (GitHub Actions / vale):
[Vale.Terms] Use 'OpenSearch' instead of 'opensearch'. (line 19, column 52; severity: ERROR)
{: .tip }
Collaborator

Line 19: We should probably provide a link to the repo.


Options for the source cluster endpoint, target cluster endpoint, and existing virtual private cloud (VPC) should be configured in order for the migration tools to function effectively.

## Metadata Migration Options
## Shared configuration options

## Sample Metadata Migration CDK Options
Each migration configuration shares the following options.

```json
{
  "metadata-migration": {
    "stage": "dev",
    "vpcId": <VPC_ID>,
    "sourceCluster": {
      "endpoint": <SOURCE_CLUSTER_ENDPOINT>,
      "version": "ES 7.10",
      "auth": {"type": "none"}
    },
    "targetCluster": {
      "endpoint": <TARGET_CLUSTER_ENDPOINT>,
      "auth": {
        "type": "basic",
        "username": <TARGET_CLUSTER_USERNAME>,
        "passwordFromSecretArn": <TARGET_CLUSTER_PASSWORD_SECRET>
      }
    },
    "reindexFromSnapshotServiceEnabled": true,
    "artifactBucketRemovalPolicy": "DESTROY"
  }
}
```

There are currently no CDK options specific to Metadata migrations, which are performed from the Migration Console. This migration requires an existing snapshot, which can be created from the Migration Console.

<details>
<summary><b>Shared configuration options table</b>
</summary>
| Name | Example | Description |
| :--- | :--- | :--- |
| `sourceClusterEndpoint` | `"https://source-cluster.elb.us-east-1.endpoint.com"` | The endpoint for the source cluster. |
| `targetClusterEndpoint` | `"https://vpc-demo-opensearch-cluster-cv6hggdb66ybpk4kxssqt6zdhu.us-west-2.es.amazonaws.com:443"` | The endpoint for the target cluster. Required if using an existing target cluster for the migration instead of creating a new one. |
| `vpcId` | `"vpc-123456789abcdefgh"` | The ID of the existing VPC in which the migration resources will be stored. The VPC must have at least two private subnets that span two Availability Zones. |

Check failure on line 33 in _migrations/deploying-migration-assistant/configuration-options.md (GitHub Actions / vale):
[OpenSearch.Spelling] Error: subnets. If you are referencing a setting, variable, format, function, or repository, surround it with tic marks. (line 33, column 157; severity: ERROR)

| Name | Example | Description |
|-----------------------|-----------------------------------------------------------------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| `sourceClusterEndpoint` | `"https://source-cluster.elb.us-east-1.endpoint.com"` | The endpoint for the source cluster. |
| `targetClusterEndpoint` | `"https://vpc-demo-opensearch-cluster-cv6hggdb66ybpk4kxssqt6zdhu.us-west-2.es.amazonaws.com:443"` | The endpoint for the target cluster. Required if using an existing target cluster for the migration instead of creating a new one. |
| `vpcId` | `"vpc-123456789abcdefgh"` | The ID of the existing VPC where the migration resources will be placed. The VPC must have at least two private subnets that span two availability zones. |

</details>
## Backfill migration using RFS

## Backfill Migration with Reindex-from-Snapshot (RFS) Options

### Sample Backfill Migration CDK Options
The following CDK performs a backfill migration using RFS:

```json
{
Expand All @@ -93,22 +62,21 @@
}
```

Performing a Reindex-from-Snapshot backfill migration requires an existing snapshot. The CDK options specific to backfill migrations are listed below. To view all available arguments for `reindexFromSnapshotExtraArgs`, see [here](https://github.com/opensearch-project/opensearch-migrations/blob/main/DocumentsFromSnapshotMigration/README.md#arguments). At a minimum, no extra arguments may be needed.
Performing an RFS backfill migration requires an existing snapshot.


<details>
<summary><b>Backfill specific configuration options table</b>
</summary>
The RFS configuration uses the following options. All options are optional.

| Name | Example | Description |
|---------------------------------|-----------------------------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| `reindexFromSnapshotServiceEnabled` | `true` | Enables deploying and configuring the RFS ECS service. |
| `reindexFromSnapshotExtraArgs` | `"--target-aws-region us-east-1 --target-aws-service-signing-name es"` | Extra arguments for the Document Migration command, with space separation. See the [RFS Extra Arguments](https://github.com/opensearch-project/opensearch-migrations/blob/main/DocumentsFromSnapshotMigration/README.md#arguments) for more details. You can pass `--no-insecure` to remove the `--insecure` flag. |
| Name | Example | Description |
| :--- | :--- | :--- |
| `reindexFromSnapshotServiceEnabled` | `true` | Enables deployment and configuration of the RFS ECS service. |
| `reindexFromSnapshotExtraArgs` | `"--target-aws-region us-east-1 --target-aws-service-signing-name es"` | Extra arguments for the Document Migration command, with space separation. See [RFS Extra Arguments](https://github.com/opensearch-project/opensearch-migrations/blob/main/DocumentsFromSnapshotMigration/README.md#arguments) for more information. You can pass `--no-insecure` to remove the `--insecure` flag. |

Check warning on line 73 in _migrations/deploying-migration-assistant/configuration-options.md (GitHub Actions / vale):
[OpenSearch.AcronymParentheses] 'README': Spell out acronyms the first time that you use them on a page and follow them with the acronym in parentheses. Subsequently, use the acronym alone. (line 73, column 311; severity: WARNING)

</details>
To view all available arguments for `reindexFromSnapshotExtraArgs`, see [Snapshot migrations README](https://github.com/opensearch-project/opensearch-migrations/blob/main/DocumentsFromSnapshotMigration/README.md#arguments). At a minimum, no extra arguments may be needed.
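
Because `reindexFromSnapshotExtraArgs` is a single space-separated string, it can help to assemble it from individual flags. The sketch below is one way to do that in Python; the helper name `build_extra_args` is hypothetical, and the flags shown are only the ones that appear in the table above.

```python
def build_extra_args(flags):
    """Join flag names and values into one space-separated string.

    A value of None marks a bare flag that takes no argument.
    """
    parts = []
    for name, value in flags.items():
        parts.append(name)
        if value is not None:
            parts.append(str(value))
    return " ".join(parts)


extra_args = build_extra_args(
    {
        "--target-aws-region": "us-east-1",
        "--target-aws-service-signing-name": "es",
    }
)
print(extra_args)
```

This simple join assumes that no flag value contains spaces; values that do would need whatever quoting the underlying command expects, which is not covered here.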

## Live Capture Migration with Capture and Replay (C&R) Options
## Live capture migration with C&R

### Sample Live Capture Migration CDK Options
The following sample CDK performs a live capture migration with C&R:

```json
{
Expand Down Expand Up @@ -137,28 +105,26 @@
}
```

Performing a live capture migration requires that a Capture Proxy be configured to capture incoming traffic and send it to the target cluster via the Traffic Replayer service. For arguments available in `captureProxyExtraArgs`, refer to the `@Parameter` fields [here](https://github.com/opensearch-project/opensearch-migrations/blob/main/TrafficCapture/trafficCaptureProxyServer/src/main/java/org/opensearch/migrations/trafficcapture/proxyserver/CaptureProxy.java). For `trafficReplayerExtraArgs`, refer to the `@Parameter` fields [here](https://github.com/opensearch-project/opensearch-migrations/blob/main/TrafficCapture/trafficReplayer/src/main/java/org/opensearch/migrations/replay/TrafficReplayer.java). At a minimum, no extra arguments may be needed.
Performing a live capture migration requires that a Capture Proxy be configured to capture incoming traffic and send it to the target cluster using the Traffic Replayer service. For arguments available in `captureProxyExtraArgs`, refer to the `@Parameter` fields [here](https://github.com/opensearch-project/opensearch-migrations/blob/main/TrafficCapture/trafficCaptureProxyServer/src/main/java/org/opensearch/migrations/trafficcapture/proxyserver/CaptureProxy.java). For `trafficReplayerExtraArgs`, refer to the `@Parameter` fields [here](https://github.com/opensearch-project/opensearch-migrations/blob/main/TrafficCapture/trafficReplayer/src/main/java/org/opensearch/migrations/replay/TrafficReplayer.java). At a minimum, no extra arguments may be needed.

Check failure on line 108 in _migrations/deploying-migration-assistant/configuration-options.md (GitHub Actions / vale):
[OpenSearch.Spelling] Error: Replayer. If you are referencing a setting, variable, format, function, or repository, surround it with tic marks. (line 108, column 161; severity: ERROR)

<details>
<summary><b>Capture and Replay specific configuration options table</b>
</summary>

| Name | Example | Description |
|--------------------------------|----------------------------------------------------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| `captureProxyServiceEnabled` | `true` | Enables the Capture Proxy service deployment via a new CloudFormation stack. |
| `captureProxyExtraArgs` | `"--suppressCaptureForHeaderMatch user-agent .*elastic-java/7.17.0.*"` | Extra arguments for the Capture Proxy command, including options specified by the [Capture Proxy](https://github.com/opensearch-project/opensearch-migrations/blob/main/TrafficCapture/trafficCaptureProxyServer/src/main/java/org/opensearch/migrations/trafficcapture/proxyserver/CaptureProxy.java). |
| `trafficReplayerServiceEnabled` | `true` | Enables the Traffic Replayer service deployment via a new CloudFormation stack. |
| Name | Example | Description |
| :--- | :--- | :--- |
| `captureProxyServiceEnabled` | `true` | Enables the Capture Proxy service deployment using an AWS CloudFormation stack. |
| `captureProxyExtraArgs` | `"--suppressCaptureForHeaderMatch user-agent .*elastic-java/7.17.0.*"` | Extra arguments for the Capture Proxy command, including options specified by the [Capture Proxy](https://github.com/opensearch-project/opensearch-migrations/blob/main/TrafficCapture/trafficCaptureProxyServer/src/main/java/org/opensearch/migrations/trafficcapture/proxyserver/CaptureProxy.java). |
| `trafficReplayerServiceEnabled` | `true` | Enables the Traffic Replayer service deployment using a CloudFormation stack. |

Check failure on line 115 in _migrations/deploying-migration-assistant/configuration-options.md (GitHub Actions / vale):
[OpenSearch.Spelling] Error: Replayer. If you are referencing a setting, variable, format, function, or repository, surround it with tic marks. (line 115, column 67; severity: ERROR)

| `trafficReplayerExtraArgs` | `"--sigv4-auth-header-service-region es,us-east-1 --speedup-factor 5"` | Extra arguments for the Traffic Replayer command, including options for auth headers and other parameters specified by the [Traffic Replayer](https://github.com/opensearch-project/opensearch-migrations/blob/main/TrafficCapture/trafficReplayer/src/main/java/org/opensearch/migrations/replay/TrafficReplayer.java). |

</details>

## Cluster Authentication Options
For arguments available in `captureProxyExtraArgs`, see the `@Parameter` fields in [`CaptureProxy.java`](https://github.com/opensearch-project/opensearch-migrations/blob/main/TrafficCapture/trafficCaptureProxyServer/src/main/java/org/opensearch/migrations/trafficcapture/proxyserver/CaptureProxy.java). For `trafficReplayerExtraArgs`, see the `@Parameter` fields in [TrafficReplayer.java](https://github.com/opensearch-project/opensearch-migrations/blob/main/TrafficCapture/trafficReplayer/src/main/java/org/opensearch/migrations/replay/TrafficReplayer.java).


## Cluster authentication options

Both the source and target cluster can use no authentication (e.g. limited to the VPC), basic authentication with a username and password, or SigV4 scoped to a user or role.
Both the source and target clusters can use no authentication, authentication limited to the VPC, basic authentication with a username and password, or AWS Signature Version 4 scoped to a user or role.

Examples of each of these are below.
### No authentication

No auth:
```
"sourceCluster": {
"endpoint": <SOURCE_CLUSTER_ENDPOINT>,
Expand All @@ -167,7 +133,8 @@
}
```

Basic auth:
### Basic authentication

```
"sourceCluster": {
"endpoint": <SOURCE_CLUSTER_ENDPOINT>,
Expand All @@ -180,7 +147,8 @@
}
```

SigV4 auth:
### Signature Version 4 authentication

Check failure on line 150 in _migrations/deploying-migration-assistant/configuration-options.md (GitHub Actions / vale):
[OpenSearch.HeadingCapitalization] 'Signature Version 4 authentication' is a heading and should be in sentence case. (line 150, column 5; severity: ERROR)

```
"sourceCluster": {
"endpoint": <SOURCE_CLUSTER_ENDPOINT>,
Expand All @@ -195,40 +163,8 @@

The `serviceSigningName` can be `es` for an Elasticsearch or OpenSearch domain, or `aoss` for an OpenSearch Serverless collection.
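
A small pre-deployment check can catch mistakes in an `auth` block before the CDK runs. The following Python sketch is illustrative only: the keys `type`, `username`, `passwordFromSecretArn`, and `serviceSigningName` and the type values `none` and `basic` appear on this page, while the type value `"sigv4"` and the helper name `find_auth_problems` are assumptions, since the full Signature Version 4 block is collapsed in this diff.

```python
ALLOWED_SIGNING_NAMES = {"es", "aoss"}
BASIC_REQUIRED_KEYS = {"username", "passwordFromSecretArn"}


def find_auth_problems(auth):
    """Return a list of human-readable problems found in an auth block."""
    auth_type = auth.get("type")
    if auth_type == "none":
        return []
    if auth_type == "basic":
        missing = sorted(BASIC_REQUIRED_KEYS - auth.keys())
        return [f"basic auth is missing '{key}'" for key in missing]
    if auth_type == "sigv4":  # assumed literal; not shown in this diff
        name = auth.get("serviceSigningName")
        if name not in ALLOWED_SIGNING_NAMES:
            return [f"serviceSigningName must be 'es' or 'aoss', got {name!r}"]
        return []
    return [f"unknown auth type: {auth_type!r}"]


print(find_auth_problems({"type": "sigv4", "serviceSigningName": "aoss"}))
print(find_auth_problems({"type": "basic", "username": "admin"}))
```

The check deliberately ignores any other keys a Signature Version 4 block may need, because they are not visible in this section.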

All of these auth mechanisms apply to both source and target clusters.

## Troubleshooting

### Restricted Permissions
When deploying if part of an [AWS Organization](https://docs.aws.amazon.com/organizations/latest/userguide/orgs_introduction.html) ↗ some permissions / resources might not be allowed. The full list can be generated from the synthesized cdk output with the awsFeatureUsage script.

```
/opensearch-migrations/deployment/cdk/opensearch-service-migration/awsFeatureUsage.sh [contextId]
```

<details>
<summary><b>Capture and Replay specific configuration options table</b>
</summary>

```shell
$ /opensearch-migrations/deployment/cdk/opensearch-service-migration/awsFeatureUsage.sh default
Synthesizing all stacks...
Synthesizing stack: networkStack-default
Synthesizing stack: migrationInfraStack
Synthesizing stack: reindexFromSnapshotStack
Synthesizing stack: migration-console
Finding resource usage from synthesized stacks...
-----------------------------------
IAM Policy Actions:
cloudwatch:GetMetricData
...
-----------------------------------
Resources Types:
AWS::CDK::Metadata
...
```
</details>
All of these authentication options apply to both source and target clusters.

## Network configuration

### Network Configuration
The migration tooling expects the source cluster, target cluster, and migration resources to exist in the same VPC. If this is not the case, manual networking setup outside of this documentation is likely required.