Add migration phases pages (#8828)
* Add first three migration phases pages

Signed-off-by: Archer <[email protected]>

* Apply suggestions from code review

Signed-off-by: Naarcha-AWS <[email protected]>

* Add backfill page.

Signed-off-by: Archer <[email protected]>

* Update backfill.md

Signed-off-by: Naarcha-AWS <[email protected]>

* Add replayer page.

Signed-off-by: Archer <[email protected]>

* Fix grammar.

Signed-off-by: Archer <[email protected]>

* Add final migration phases page.

Signed-off-by: Archer <[email protected]>

* Apply suggestions from code review

Signed-off-by: Naarcha-AWS <[email protected]>

* Apply suggestions from code review

Signed-off-by: Naarcha-AWS <[email protected]>

* Update _migrations/migration-phases/verifying-tools-for-migration.md

Signed-off-by: Naarcha-AWS <[email protected]>

* Update _migrations/migration-phases/backfill.md

Signed-off-by: Naarcha-AWS <[email protected]>

* Add migration phase links

Signed-off-by: Archer <[email protected]>

* Edit migration console section

Signed-off-by: Archer <[email protected]>

* Apply suggestions from code review

Signed-off-by: Naarcha-AWS <[email protected]>

* Update migrating-metadata.md

Signed-off-by: Naarcha-AWS <[email protected]>

* Apply suggestions from code review

Signed-off-by: Naarcha-AWS <[email protected]>

* Rename traffic replacer.

Signed-off-by: Archer <[email protected]>

* Update _migrations/migration-phases/verifying-tools-for-migration.md

Signed-off-by: Naarcha-AWS <[email protected]>

* Add sentence about live traffic capture

Signed-off-by: Archer <[email protected]>

* Apply suggestions from code review

Co-authored-by: Nathan Bower <[email protected]>
Signed-off-by: Naarcha-AWS <[email protected]>

* Update _migrations/migration-phases/backfill.md

Co-authored-by: Nathan Bower <[email protected]>
Signed-off-by: Naarcha-AWS <[email protected]>

* Update _migrations/migration-phases/verifying-tools-for-migration.md

Co-authored-by: Nathan Bower <[email protected]>
Signed-off-by: Naarcha-AWS <[email protected]>

* Update _migrations/migration-phases/verifying-tools-for-migration.md

Co-authored-by: Nathan Bower <[email protected]>
Signed-off-by: Naarcha-AWS <[email protected]>

* Add editorial review.

Signed-off-by: Archer <[email protected]>

* Additional editorial comments.

Signed-off-by: Archer <[email protected]>

* Editorial for infra and traffic

Signed-off-by: Archer <[email protected]>

* Editorial comments for using traffic replayer.

Signed-off-by: Archer <[email protected]>

* Add final editorial comments.

Signed-off-by: Archer <[email protected]>

* Apply suggestions from code review

Co-authored-by: Nathan Bower <[email protected]>
Signed-off-by: Naarcha-AWS <[email protected]>

---------

Signed-off-by: Archer <[email protected]>
Signed-off-by: Naarcha-AWS <[email protected]>
Co-authored-by: Nathan Bower <[email protected]>
(cherry picked from commit 1fcf278)
Naarcha-AWS committed Dec 3, 2024
1 parent 73e7132 commit 2419258
Showing 32 changed files with 1,250 additions and 953 deletions.
330 changes: 330 additions & 0 deletions _migrations/getting-started-data-migration.md
@@ -0,0 +1,330 @@
---
layout: default
title: "Quickstart: Data migration"
nav_order: 10
---

# Getting started: Data migration

This quickstart outlines how to deploy Migration Assistant for OpenSearch and execute an existing data migration using `Reindex-from-Snapshot` (RFS). It uses AWS for illustrative purposes. However, the steps can be modified for use with other cloud providers.


## Prerequisites and assumptions

Before using this quickstart, make sure you fulfill the following prerequisites:

* Verify that your migration path [is supported](https://opensearch.org/docs/latest/migrations/is-migration-assistant-right-for-you/#supported-migration-paths). Note that we test with the exact versions specified, but you should be able to migrate data on alternative minor versions as long as the major version is supported.
* The source cluster must be deployed with the Amazon Simple Storage Service (Amazon S3) plugin.
* The target cluster must be deployed.

The steps in this guide assume the following:

* A snapshot will be taken and stored in Amazon S3. The following assumptions are made about this snapshot:
  * The `_source` flag is enabled on all indexes to be migrated.
  * The snapshot includes the global cluster state (`include_global_state` is `true`).
  * Shard sizes of up to approximately 80 GB are supported. Larger shards cannot be migrated. If this presents challenges for your migration, contact the [migration team](https://opensearch.slack.com/archives/C054JQ6UJFK).
* Migration Assistant will be installed in the same AWS Region and have access to both the source snapshot and target cluster.

---

## Step 1: Installing Bootstrap on an Amazon EC2 instance (~10 minutes)

To begin your migration, use the following steps to install a `bootstrap` box on an Amazon Elastic Compute Cloud (Amazon EC2) instance. The instance uses AWS CloudFormation to create and manage the stack.

1. Log in to the target AWS account in which you want to deploy Migration Assistant.
2. From the browser where you are logged in to your target AWS account, right-click [here](https://console.aws.amazon.com/cloudformation/home?region=us-east-1#/stacks/new?templateURL=https://solutions-reference.s3.amazonaws.com/migration-assistant-for-amazon-opensearch-service/latest/migration-assistant-for-amazon-opensearch-service.template&redirectId=SolutionWeb) to load the CloudFormation template in a new browser tab.
3. Follow the CloudFormation stack wizard:
   * **Stack Name:** `MigrationBootstrap`
   * **Stage Name:** `dev`
   * Choose **Next** after each step > **Acknowledge** > **Submit**.
4. Verify that the Bootstrap stack exists and is set to `CREATE_COMPLETE`. This process takes around 10 minutes to complete.

---

## Step 2: Setting up Bootstrap instance access (~5 minutes)

Use the following steps to set up Bootstrap instance access:

1. After deployment, find the EC2 instance ID for the `bootstrap-dev-instance`.
2. Create an AWS Identity and Access Management (IAM) policy using the following snippet, replacing `<aws-region>`, `<aws-account>`, `<stage>`, and `<ec2-instance-id>` with your information:

```json
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "ssm:StartSession",
            "Resource": [
                "arn:aws:ec2:<aws-region>:<aws-account>:instance/<ec2-instance-id>",
                "arn:aws:ssm:<aws-region>:<aws-account>:document/BootstrapShellDoc-<stage>-<aws-region>"
            ]
        }
    ]
}
```

3. Name the policy, for example, `SSM-OSMigrationBootstrapAccess`, and then create the policy by selecting **Create policy**.
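
As a minimal sketch, the policy document shown in step 2 could be generated from shell variables instead of being edited by hand. The function name `render_ssm_policy` and the sample values are hypothetical; only the JSON shape comes from the snippet above.

```bash
# Hypothetical helper: prints the IAM policy from step 2 with the
# placeholders filled in. Arguments: Region, account ID, stage, instance ID.
render_ssm_policy() {
  local region="$1" account="$2" stage="$3" instance_id="$4"
  cat <<EOF
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "ssm:StartSession",
            "Resource": [
                "arn:aws:ec2:${region}:${account}:instance/${instance_id}",
                "arn:aws:ssm:${region}:${account}:document/BootstrapShellDoc-${stage}-${region}"
            ]
        }
    ]
}
EOF
}

# Example call with made-up values; the policy is printed to standard output.
render_ssm_policy us-east-1 123456789012 dev i-0123456789abcdef0
```

If the output is saved to a file, it could then be passed to `aws iam create-policy` using the `--policy-document file://<path>` option instead of creating the policy in the console.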

---

## Step 3: Logging in to Bootstrap and building Migration Assistant (~15 minutes)

Next, log in to Bootstrap and build Migration Assistant using the following steps.

### Prerequisites

To use these steps, make sure you fulfill the following prerequisites:

* The AWS Command Line Interface (AWS CLI) and AWS Session Manager plugin are installed on your instance.
* The AWS credentials are configured (`aws configure`) for your instance.

### Steps

1. Load AWS credentials into your terminal.
2. Log in to the instance using the following command, replacing `<stage>`, `<instance-id>`, and `<aws-region>` with your stage name, instance ID, and Region:

```bash
aws ssm start-session --document-name BootstrapShellDoc-<stage>-<aws-region> --target <instance-id> --region <aws-region> [--profile <profile-name>]
```

3. Once logged in, run the following command from the shell of the Bootstrap instance in the `/opensearch-migrations` directory:

```bash
./initBootstrap.sh && cd deployment/cdk/opensearch-service-migration
```

4. After a successful build, note the path for infrastructure deployment, which will be used in the next step.
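
The login command in step 2 uses the stage and Region twice, once in the document name and once as options. The following sketch, which assumes a hypothetical helper named `bootstrap_session_cmd`, prints the assembled command so that it can be reviewed before being run. The instance ID shown is a made-up value.

```bash
# Hypothetical helper: prints the start-session command for a given stage,
# Region, and instance ID. It does not run the command.
bootstrap_session_cmd() {
  local stage="$1" region="$2" instance_id="$3"
  printf 'aws ssm start-session --document-name BootstrapShellDoc-%s-%s --target %s --region %s\n' \
    "$stage" "$region" "$instance_id" "$region"
}

# Example call with made-up values.
bootstrap_session_cmd dev us-east-1 i-0123456789abcdef0
```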

---

## Step 4: Configuring and deploying RFS (~20 minutes)

Use the following steps to configure and deploy RFS:

1. Add the target cluster password to AWS Secrets Manager as an unstructured string. Be sure to copy the secret Amazon Resource Name (ARN) for use during deployment.
2. From the same shell as the Bootstrap instance, modify the `cdk.context.json` file located in the `/opensearch-migrations/deployment/cdk/opensearch-service-migration` directory:

```json
{
    "migration-assistant": {
        "vpcId": "<TARGET CLUSTER VPC ID>",
        "targetCluster": {
            "endpoint": "<TARGET CLUSTER ENDPOINT>",
            "auth": {
                "type": "basic",
                "username": "<TARGET CLUSTER USERNAME>",
                "passwordFromSecretArn": "<TARGET CLUSTER PASSWORD SECRET>"
            }
        },
        "sourceCluster": {
            "endpoint": "<SOURCE CLUSTER ENDPOINT>",
            "auth": {
                "type": "basic",
                "username": "<SOURCE CLUSTER USERNAME>",
                "passwordFromSecretArn": "<SOURCE CLUSTER PASSWORD SECRET>"
            }
        },
        "reindexFromSnapshotExtraArgs": "<RFS PARAMETERS (see below)>",
        "stage": "dev",
        "otelCollectorEnabled": true,
        "migrationConsoleServiceEnabled": true,
        "reindexFromSnapshotServiceEnabled": true,
        "migrationAssistanceEnabled": true
    }
}
```

The source and target cluster authorization can be configured to have no authorization, `basic` with a username and password, or `sigv4`.

3. Bootstrap the account with the following command:

```bash
cdk bootstrap --c contextId=migration-assistant --require-approval never
```

4. Deploy the stacks:

```bash
cdk deploy "*" --c contextId=migration-assistant --require-approval never --concurrency 5
```

5. Verify that all CloudFormation stacks were installed successfully.
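
Because the context file contains several `<...>` placeholders, it can help to confirm that none remain before deploying. The following is a minimal sketch of such a check; the function name `check_placeholders` is hypothetical and is not part of Migration Assistant.

```bash
# Hypothetical helper: lists the lines of a context file that still contain
# an unreplaced "<PLACEHOLDER>" value.
# Returns 0 if none are found, 1 if any remain, and 2 if the file is unreadable.
check_placeholders() {
  local file="$1"
  if [ ! -r "$file" ]; then
    echo "Cannot read ${file}" >&2
    return 2
  fi
  # grep -n prints each offending line with its line number.
  if grep -n '"<[^>"]*>"' "$file"; then
    echo "Unreplaced placeholders found in ${file}" >&2
    return 1
  fi
  return 0
}
```

For example, `check_placeholders cdk.context.json && cdk deploy "*" --c contextId=migration-assistant --require-approval never --concurrency 5` would run the deployment only when no placeholders remain.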

### RFS parameters

If you're creating a snapshot using migration tooling, these parameters are automatically configured. If you're using an existing snapshot, modify the `reindexFromSnapshotExtraArgs` setting with the following values:

```bash
--s3-repo-uri s3://<bucket-name>/<repo> --s3-region <region> --snapshot-name <name>
```

You will also need to give the `migrationconsole` and `reindexFromSnapshot` TaskRoles permissions to the S3 bucket.
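
As an illustration, the value for `reindexFromSnapshotExtraArgs` could be assembled from the snapshot details as follows. The helper name `rfs_extra_args` and the bucket, repository, and snapshot names are hypothetical; the three flags are the ones listed above.

```bash
# Hypothetical helper: assembles the value for reindexFromSnapshotExtraArgs.
# Arguments: bucket name, repository path, Region, snapshot name.
rfs_extra_args() {
  local bucket="$1" repo="$2" region="$3" snapshot="$4"
  printf '%s\n' "--s3-repo-uri s3://${bucket}/${repo} --s3-region ${region} --snapshot-name ${snapshot}"
}

# Example call with made-up values.
rfs_extra_args my-snapshot-bucket rfs-repo us-east-1 rfs-snapshot-1
```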

---

## Step 5: Deploying Migration Assistant

To deploy Migration Assistant, use the following steps:

1. Bootstrap the account:

```bash
cdk bootstrap --c contextId=migration-assistant --require-approval never --concurrency 5
```

2. After `cdk.context.json` is fully configured, deploy the stacks:

```bash
cdk deploy "*" --c contextId=migration-assistant --require-approval never --concurrency 3
```

These commands deploy the following stacks:

* Migration Assistant network stack
* `Reindex-from-snapshot` stack
* Migration console stack

---

## Step 6: Accessing the migration console

Run the following command to access the migration console:

```bash
./accessContainer.sh migration-console dev <region>
```


`accessContainer.sh` is located in `/opensearch-migrations/deployment/cdk/opensearch-service-migration/` on the Bootstrap instance. To learn more, see [Accessing the migration console].
{: .note}

---

## Step 7: Verifying the connection to the source and target clusters

To verify the connection to the clusters, run the following command:

```bash
console clusters connection-check
```

You should receive the following output:

```
* **Source Cluster:** Successfully connected!
* **Target Cluster:** Successfully connected!
```

To learn more about migration console commands, see [Migration commands].
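
When scripting the migration, the connection check can act as a gate before later steps. The following sketch assumes that the command output matches the sample shown above; the function name `both_clusters_connected` is hypothetical.

```bash
# Hypothetical helper: reads connection-check output on standard input and
# succeeds only if both the source and target clusters report a connection.
both_clusters_connected() {
  local output
  output="$(cat)"
  printf '%s\n' "$output" | grep -q 'Source Cluster.*Successfully connected' &&
    printf '%s\n' "$output" | grep -q 'Target Cluster.*Successfully connected'
}
```

For example, `console clusters connection-check | both_clusters_connected && echo "Both clusters reachable"` prints the message only when both lines are present.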

---

## Step 8: Snapshot creation

Run the following command to initiate snapshot creation from the source cluster:

```bash
console snapshot create [...]
```

To check the snapshot creation status, run the following command:

```bash
console snapshot status [...]
```

To get more detailed information about the snapshot, run the following command:

```bash
console snapshot status --deep-check [...]
```

Wait for snapshot creation to complete before moving to step 9.

To learn more about snapshot creation, see [Snapshot Creation].
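
Waiting for the snapshot can be automated by polling the status command. The following is a generic sketch, not a feature of the migration console. The function name `wait_for_output` is hypothetical, and the text that marks a finished snapshot is an assumption that you should confirm against the actual status output.

```bash
# Hypothetical helper: reruns a command until its output contains a given
# substring or the attempt limit is reached.
# Arguments: substring, maximum attempts, delay in seconds, then the command.
wait_for_output() {
  local expected="$1" max_attempts="$2" delay="$3"
  shift 3
  local attempt=1 output
  while [ "$attempt" -le "$max_attempts" ]; do
    # Capture stdout and stderr; a failing command counts as "not ready yet".
    output="$("$@" 2>&1 || true)"
    case "$output" in
      *"$expected"*) return 0 ;;
    esac
    attempt=$((attempt + 1))
    sleep "$delay"
  done
  return 1
}
```

For example, `wait_for_output 'SUCCESS' 60 30 console snapshot status` would poll every 30 seconds for up to 60 attempts, assuming that the status output contains `SUCCESS` once the snapshot is complete.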

---

## Step 9: Metadata migration

Run the following command to migrate metadata:

```bash
console metadata migrate [...]
```

For more information, see [Migrating metadata]({{site.url}}{{site.baseurl}}/migrations/migration-phases/migrating-metadata/).

---

## Step 10: RFS document migration

You can now use RFS to migrate documents from your original cluster:

1. To start the migration from RFS, start a `backfill` using the following command:

```bash
console backfill start
```

2. _(Optional)_ To speed up the migration, increase the number of documents processed simultaneously by using the following command:

```bash
console backfill scale <NUM_WORKERS>
```

3. To check the status of the document backfill, use the following command:

```bash
console backfill status
```

4. If you need to stop the backfill process, use the following command:

```bash
console backfill stop
```

For more information, see [Backfill]({{site.url}}{{site.baseurl}}/migrations/migration-phases/backfill/).

---

## Step 11: Backfill monitoring

Use the following command for detailed monitoring of the backfill process:

```bash
console backfill status --deep-check
```

You should receive the following output:

```
BackfillStatus.RUNNING
Running=9
Pending=1
Desired=10
Shards total: 62
Shards completed: 46
Shards incomplete: 16
Shards in progress: 11
Shards unclaimed: 5
```

Logs and metrics are available in Amazon CloudWatch in the `OpenSearchMigrations` log group.
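
The shard counts in the deep-check output can be reduced to a single progress figure. The following sketch assumes that the output uses the `Shards total:` and `Shards completed:` labels shown above; the function name `backfill_percent_complete` is hypothetical.

```bash
# Hypothetical helper: reads deep-check output on standard input and prints
# the percentage of shards completed, rounded down to a whole number.
backfill_percent_complete() {
  awk -F': *' '
    /^Shards total/     { total = $2 + 0 }
    /^Shards completed/ { completed = $2 + 0 }
    END {
      if (total > 0) {
        printf "%d\n", (completed * 100) / total
      } else {
        print "unknown"
        exit 1
      }
    }'
}
```

For example, `console backfill status --deep-check | backfill_percent_complete` prints `74` for the sample output above (46 of 62 shards completed).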

---

## Step 12: Verify that all documents were migrated

Use the following query in CloudWatch Logs Insights to identify failed documents:

```
fields @message
| filter @message like "Bulk request succeeded, but some operations failed."
| sort @timestamp desc
| limit 10000
```

If any failed documents are identified, you can index the failed documents directly as opposed to using RFS.
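
The same query can also be started from the AWS CLI instead of the console. The following is a sketch under stated assumptions: the function names are hypothetical, the `OpenSearchMigrations` log group name is taken from step 11, and the 24-hour default time window is arbitrary.

```bash
# Prints the Logs Insights query from this step; the row limit is adjustable.
failed_docs_query() {
  local limit="${1:-10000}"
  printf '%s\n' \
    'fields @message' \
    '| filter @message like "Bulk request succeeded, but some operations failed."' \
    '| sort @timestamp desc' \
    "| limit ${limit}"
}

# Hypothetical wrapper: starts the query over the last N hours (default 24)
# and prints the query ID returned by CloudWatch Logs.
start_failed_docs_query() {
  local hours="${1:-24}" now
  now="$(date +%s)"
  aws logs start-query \
    --log-group-name OpenSearchMigrations \
    --start-time "$((now - hours * 3600))" \
    --end-time "$now" \
    --query-string "$(failed_docs_query)" \
    --query 'queryId' --output text
}
```

The returned query ID can then be passed to `aws logs get-query-results --query-id <query-id>` to retrieve the matching log lines.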

29 changes: 11 additions & 18 deletions _migrations/migration-console/accessing-the-migration-console.md
@@ -1,12 +1,15 @@
---
layout: default
title: Accessing the migration console
nav_order: 35
parent: Migration console
---

# Accessing the migration console

The Bootstrap box deployed through Migration Assistant contains a script that simplifies access to the migration console through that instance.

The Migrations Assistant deployment includes an ECS task that hosts tools to run different phases of the migration and check the progress or results of the migration.

## SSH into the Migration Console
Following the AWS Solutions deployment, the bootstrap box contains a script that simplifies access to the migration console through that instance.

To access the Migration Console, use the following commands:
To access the migration console, use the following commands:

```shell
export STAGE=dev
@@ -16,13 +19,7 @@ export AWS_REGION=us-west-2

When you open the console, the message `Welcome to the Migration Assistant Console` appears above the command prompt.

<details>

<summary>
<b>SSH from any machine into Migration Console</b>
</summary>

On a machine with the [AWS CLI](https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html) ↗ and the [AWS Session Manager Plugin](https://docs.aws.amazon.com/systems-manager/latest/userguide/session-manager-working-with-install-plugin.html) ↗, you can directly connect to the migration console. Ensure you've run `aws configure` with credentials that have access to the environment.
On a machine with the [AWS Command Line Interface (AWS CLI)](https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html) and the [AWS Session Manager plugin](https://docs.aws.amazon.com/systems-manager/latest/userguide/session-manager-working-with-install-plugin.html), you can directly connect to the migration console. Ensure that you've run `aws configure` with credentials that have access to the environment.

Use the following commands:

@@ -32,10 +29,6 @@ export SERVICE_NAME=migration-console
export TASK_ARN=$(aws ecs list-tasks --cluster migration-${STAGE}-ecs-cluster --family "migration-${STAGE}-${SERVICE_NAME}" | jq --raw-output '.taskArns[0]')
aws ecs execute-command --cluster "migration-${STAGE}-ecs-cluster" --task "${TASK_ARN}" --container "${SERVICE_NAME}" --interactive --command "/bin/bash"
```
</details>

## Troubleshooting

### Deployment Stage

Typically, `STAGE` is `dev`, but this may vary based on what the user specified during deployment.
Typically, `STAGE` is equivalent to a standard `dev` environment, but this may vary based on what the user specified during deployment.
7 changes: 6 additions & 1 deletion _migrations/migration-console/index.md
@@ -3,4 +3,9 @@ layout: default
title: Migration console
nav_order: 30
has_children: true
---
---

The Migration Assistant deployment includes an Amazon Elastic Container Service (Amazon ECS) task that hosts tools that run different phases of the migration and check the progress or results of the migration. This ECS task is called the **migration console**. The migration console is a command line interface used to interact with the deployed components of the solution.

This section provides information about how to access the migration console and what commands are supported.
