diff --git a/_migrations/getting-started-data-migration.md b/_migrations/getting-started-data-migration.md new file mode 100644 index 00000000000..8ae1a7f4570 --- /dev/null +++ b/_migrations/getting-started-data-migration.md @@ -0,0 +1,331 @@ +--- +layout: default +title: "Quickstart: Data migration" +nav_order: 10 +--- + +# Getting started: Data migration + +This quickstart outlines how to deploy Migration Assistant for OpenSearch and execute an existing data migration using `Reindex-from-Snapshot` (RFS). It uses AWS for illustrative purposes. However, the steps can be modified for use with other cloud providers. + + +## Prerequisites and assumptions + +Before using this quickstart, make sure you fulfill the following prerequisites: + +* Verify that your migration path [is supported](https://opensearch.org/docs/latest/migrations/is-migration-assistant-right-for-you/#supported-migration-paths). Note that we test with the exact versions specified, but you should be able to migrate data on alternative minor versions as long as the major version is supported. +* The source cluster must be deployed with the Amazon Simple Storage Service (Amazon S3) plugin. +* The target cluster must be deployed. + +The steps in this guide assume the following: + +* A snapshot will be taken and stored in Amazon S3; the following assumptions are made about this snapshot: + * The `_source` flag is enabled on all indexes to be migrated. + * The snapshot includes the global cluster state (`include_global_state` is `true`). + * Shard sizes of up to approximately 80 GB are supported. Larger shards cannot be migrated. If this presents challenges for your migration, contact the [migration team](https://opensearch.slack.com/archives/C054JQ6UJFK). +* Migration Assistant will be installed in the same AWS Region and have access to both the source snapshot and target cluster. 
+ +--- + +## Step 1: Installing Bootstrap on an Amazon EC2 instance (~10 minutes) + +To begin your migration, use the following steps to install a `bootstrap` box on an Amazon Elastic Compute Cloud (Amazon EC2) instance. The instance uses AWS CloudFormation to create and manage the stack. + +1. Log in to the target AWS account in which you want to deploy Migration Assistant. +2. From the browser where you are logged in to your target AWS account, right-click [here](https://console.aws.amazon.com/cloudformation/home?region=us-east-1#/stacks/new?templateURL=https://solutions-reference.s3.amazonaws.com/migration-assistant-for-amazon-opensearch-service/latest/migration-assistant-for-amazon-opensearch-service.template&redirectId=SolutionWeb) to load the CloudFormation template from a new browser tab. +3. Follow the CloudFormation stack wizard: + * **Stack Name:** `MigrationBootstrap` + * **Stage Name:** `dev` + * Choose **Next** after each step > **Acknowledge** > **Submit**. +4. Verify that the Bootstrap stack exists and is set to `CREATE_COMPLETE`. This process takes around 10 minutes to complete. + +--- + +## Step 2: Setting up Bootstrap instance access (~5 minutes) + +Use the following steps to set up Bootstrap instance access: + +1. After deployment, find the EC2 instance ID for the `bootstrap-dev-instance`. +2. Create an AWS Identity and Access Management (IAM) policy using the following snippet, replacing ``, ``, ``, and `` with your information: + + ```json + { + "Version": "2012-10-17", + "Statement": [ + { + "Effect": "Allow", + "Action": "ssm:StartSession", + "Resource": [ + "arn:aws:ec2:::instance/", + "arn:aws:ssm:::document/BootstrapShellDoc--" + ] + } + ] + } + ``` + +3. Name the policy, for example, `SSM-OSMigrationBootstrapAccess`, and then create the policy by selecting **Create policy**. 
+ +--- + +## Step 3: Logging in to Bootstrap and building Migration Assistant (~15 minutes) + +Next, log in to Bootstrap and build Migration Assistant using the following steps. + +### Prerequisites + +To use these steps, make sure you fulfill the following prerequisites: + +* The AWS Command Line Interface (AWS CLI) and AWS Session Manager plugin are installed on your instance. +* The AWS credentials are configured (`aws configure`) for your instance. + +### Steps + +1. Load AWS credentials into your terminal. +2. Log in to the instance using the following command, replacing `` and `` with your instance ID and Region: + + ```bash + aws ssm start-session --document-name BootstrapShellDoc-- --target --region [--profile ] + ``` + +3. Once logged in, run the following command from the shell of the Bootstrap instance in the `/opensearch-migrations` directory: + + ```bash + ./initBootstrap.sh && cd deployment/cdk/opensearch-service-migration + ``` + +4. After a successful build, note the path for infrastructure deployment, which will be used in the next step. + +--- + +## Step 4: Configuring and deploying RFS (~20 minutes) + +Use the following steps to configure and deploy RFS: + +1. Add the target cluster password to AWS Secrets Manager as an unstructured string. Be sure to copy the secret Amazon Resource Name (ARN) for use during deployment. +2. 
From the same shell on the Bootstrap instance, modify the `cdk.context.json` file located in the `/opensearch-migrations/deployment/cdk/opensearch-service-migration` directory: + + ```json + { + "migration-assistant": { + "vpcId": "", + "targetCluster": { + "endpoint": "", + "auth": { + "type": "basic", + "username": "", + "passwordFromSecretArn": "" + } + }, + "sourceCluster": { + "endpoint": "", + "auth": { + "type": "basic", + "username": "", + "passwordFromSecretArn": "" + } + }, + "reindexFromSnapshotExtraArgs": "", + "stage": "dev", + "otelCollectorEnabled": true, + "migrationConsoleServiceEnabled": true, + "reindexFromSnapshotServiceEnabled": true, + "migrationAssistanceEnabled": true + } + } + ``` + + The source and target cluster authentication can be configured as none, `basic` with a username and password, or `sigv4`. + +3. Bootstrap the account with the following command: + + ```bash + cdk bootstrap --c contextId=migration-assistant --require-approval never + ``` + +4. Deploy the stacks: + + ```bash + cdk deploy "*" --c contextId=migration-assistant --require-approval never --concurrency 5 + ``` + +5. Verify that all CloudFormation stacks were installed successfully. + +### RFS parameters + +If you're creating a snapshot using migration tooling, these parameters are automatically configured. If you're using an existing snapshot, modify the `reindexFromSnapshotExtraArgs` setting with the following values: + + ```bash + --s3-repo-uri s3:/// --s3-region --snapshot-name + ``` + +You must also grant the `migrationconsole` and `reindexFromSnapshot` TaskRoles permissions to access the S3 bucket. + +--- + +## Step 5: Deploying Migration Assistant + +To deploy Migration Assistant, use the following steps: + +1. Bootstrap the account: + + ```bash + cdk bootstrap --c contextId=migration-assistant --require-approval never --concurrency 5 + ``` +2. 
Deploy the stacks when `cdk.context.json` is fully configured: + + ```bash + cdk deploy "*" --c contextId=migration-assistant --require-approval never --concurrency 3 + ``` + +These commands deploy the following stacks: + +* Migration Assistant network stack +* Reindex From Snapshot stack +* Migration console stack + +--- + +## Step 6: Accessing the migration console + +Run the following command to access the migration console: + +```bash +./accessContainer.sh migration-console dev +``` + + +`accessContainer.sh` is located in `/opensearch-migrations/deployment/cdk/opensearch-service-migration/` on the Bootstrap instance. To learn more, see [Accessing the migration console]. +{: .note} + +--- + +## Step 7: Verifying the connection to the source and target clusters + +To verify the connection to the clusters, run the following command: + +```bash +console clusters connection-check +``` + +You should receive the following output: + +``` +* **Source Cluster:** Successfully connected! +* **Target Cluster:** Successfully connected! +``` + +To learn more about migration console commands, see [Migration commands]. + +--- + +## Step 8: Snapshot creation + +Run the following command to initiate snapshot creation from the source cluster: + +```bash +console snapshot create [...] +``` + +To check the snapshot creation status, run the following command: + +```bash +console snapshot status [...] +``` + +To view more detailed information about the snapshot, run the following command: + +```bash +console snapshot status --deep-check [...] +``` + +Wait for snapshot creation to complete before moving to step 9. + +To learn more about snapshot creation, see [Snapshot Creation]. + +--- + +## Step 9: Metadata migration + +Run the following command to migrate metadata: + +```bash +console metadata migrate [...] +``` + +For more information, see [Metadata migration]. + +--- + +## Step 10: RFS document migration + +You can now use RFS to migrate documents from your source cluster: + +1. 
To start the migration using RFS, start a `backfill` using the following command: + + ```bash + console backfill start + ``` + +2. _(Optional)_ To speed up the migration, increase the number of documents processed simultaneously by using the following command: + + ```bash + console backfill scale + ``` + +3. To check the status of the document backfill, use the following command: + + ```bash + console backfill status + ``` + +4. If you need to stop the backfill process, use the following command: + + ```bash + console backfill stop + ``` + +For more information, see [Backfill execution]. + +--- + +## Step 11: Backfill monitoring + +Use the following command for detailed monitoring of the backfill process: + +```bash +console backfill status --deep-check +``` + +You should receive output similar to the following: + +```text +BackfillStatus.RUNNING +Running=9 +Pending=1 +Desired=10 +Shards total: 62 +Shards completed: 46 +Shards incomplete: 16 +Shards in progress: 11 +Shards unclaimed: 5 +``` + +Logs and metrics are available in Amazon CloudWatch in the `OpenSearchMigrations` log group. + +--- + +## Step 12: Verify that all documents were migrated + +Use the following query in CloudWatch Logs Insights to identify failed documents: + +```bash +fields @message +| filter @message like "Bulk request succeeded, but some operations failed." +| sort @timestamp desc +| limit 10000 +``` + +If any failed documents are identified, you can index them directly instead of using RFS. + +For more information, see [Backfill migration]. 
diff --git a/_migrations/quick-start-data-migration.md b/_migrations/quick-start-data-migration.md deleted file mode 100644 index 62b13292e73..00000000000 --- a/_migrations/quick-start-data-migration.md +++ /dev/null @@ -1,262 +0,0 @@ ---- -layout: default -title: Quickstart - Data migration -nav_order: 10 ---- - -# Quickstart - Data migration - -This document outlines how to deploy the Migration Assistant and execute an existing data migration using Reindex-from-Snapshot (RFS). Note that this does not include steps for deploying and capturing live traffic, which is necessary for a zero-downtime migration. Please refer to the "Phases of a Migration" section in the wiki navigation bar for a complete end-to-end migration process, including metadata migration, live capture, Reindex-from-Snapshot, and replay. - -## Prerequisites and Assumptions -* Verify your migration path [is supported](https://github.com/opensearch-project/opensearch-migrations/wiki/Is-Migration-Assistant-Right-for-You%3F#supported-migration-paths). Note that we test with the exact versions specified, but you should be able to migrate data on alternative minor versions as long as the major version is supported. -* Source cluster must be deployed with the S3 plugin. -* Target cluster must be deployed. -* A snapshot will be taken and stored in S3 in this guide, and the following assumptions are made about this snapshot: - * The `_source` flag is enabled on all indices to be migrated. - * The snapshot includes the global cluster state (`include_global_state` is `true`). - * Shard sizes up to approximately 80GB are supported. Larger shards will not be able to migrate. If this is a blocker, please consult the migrations team. -* Migration Assistant will be installed in the same region and have access to both the source snapshot and target cluster. - ---- - -## Step 1 - Installing Bootstrap EC2 Instance (~10 mins) -1. Log into the target AWS account where you want to deploy the Migration Assistant. -2. 
From the browser where you are logged into your target AWS account right-click [here](https://console.aws.amazon.com/cloudformation/home?region=us-east-1#/stacks/new?templateURL=https://solutions-reference.s3.amazonaws.com/migration-assistant-for-amazon-opensearch-service/latest/migration-assistant-for-amazon-opensearch-service.template&redirectId=SolutionWeb) ↗ to load the CloudFormation (Cfn) template from a new browser tab. -3. Follow the CloudFormation stack wizard: - * **Stack Name:** `MigrationBootstrap` - * **Stage Name:** `dev` - * Hit **Next** on each step, acknowledge on the fourth screen, and hit **Submit**. -4. Verify that the bootstrap stack exists and is set to `CREATE_COMPLETE`. This process takes around 10 minutes. - ---- - -## Step 2 - Setup Bootstrap Instance Access (~5 mins) -1. After deployment, find the EC2 instance ID for the `bootstrap-dev-instance`. -2. Create an IAM policy using the snippet below, replacing ``, ``, ``, and ``: - -```json -{ - "Version": "2012-10-17", - "Statement": [ - { - "Effect": "Allow", - "Action": "ssm:StartSession", - "Resource": [ - "arn:aws:ec2:::instance/", - "arn:aws:ssm:::document/SSM--BootstrapShell" - ] - } - ] -} -``` - -3. Name the policy, e.g., `SSM-OSMigrationBootstrapAccess`, and create the policy. - ---- - -## Step 3 - Login to Bootstrap and Build (~15 mins) -### Prerequisites: -* AWS CLI and AWS Session Manager Plugin installed. -* AWS credentials configured (`aws configure`). - -1. Load AWS credentials into your terminal. -2. Login to the instance using the command below, replacing `` and ``: -```bash -aws ssm start-session --document-name SSM-dev-BootstrapShell --target --region [--profile ] -``` -3. Once logged in, run the following command from the shell of the bootstrap instance (within the /opensearch-migrations directory): -```bash -./initBootstrap.sh && cd deployment/cdk/opensearch-service-migration -``` -4. 
After a successful build, remember the path for infrastructure deployment in the next step. - ---- - -## Step 4 - Configuring and Deploying for RFS Use Case (~20 mins) -1. Add the target cluster password to AWS Secrets Manager as an unstructured string. Be sure to copy the secret ARN for use during deployment. -2. From the same shell on the bootstrap instance, modify the cdk.context.json file located in the `/opensearch-migrations/deployment/cdk/opensearch-service-migration` directory: - -```json -{ - "migration-assistant": { - "vpcId": "", - "targetCluster": { - "endpoint": "", - "auth": { - "type": "basic", - "username": "", - "passwordFromSecretArn": "" - } - }, - "sourceCluster": { - "endpoint": "", - "auth": { - "type": "basic", - "username": "", - "passwordFromSecretArn": "" - } - }, - "reindexFromSnapshotExtraArgs": "", - "stage": "dev", - "otelCollectorEnabled": true, - "migrationConsoleServiceEnabled": true, - "reindexFromSnapshotServiceEnabled": true, - "migrationAssistanceEnabled": true - } -} -``` - -The source and target cluster authorization can be configured to have none, `basic` with a username and password, or `sigv4`. There are examples of each available [here](https://github.com/opensearch-project/opensearch-migrations/wiki/Configuration-Options#cluster-authentication-options). - -3. Bootstrap the account with the following command: -```bash -cdk bootstrap --c contextId=migration-assistant --require-approval never -``` -4. Deploy the stacks: -```bash -cdk deploy "*" --c contextId=migration-assistant --require-approval never --concurrency 5 -``` -5. Verify that all CloudFormation stacks were installed successfully. - -#### ReindexFromSnapshot Parameters -* If you're creating a snapshot using migration tooling, these parameters are auto-configured. 
If you're using an existing snapshot, modify `reindexFromSnapshotExtraArgs` with the following values: -```bash ---s3-repo-uri s3:/// --s3-region --snapshot-name -``` -Note, you will also need to give access to the migrationconsole and reindexFromSnapshot taskRole permissions to the bucket - ---- - -## Step 5 - Deploying the Migration Assistant -1. Bootstrap the account: -```bash -cdk bootstrap --c contextId=migration-assistant --require-approval never --concurrency 5 -``` -2. Deploy the stacks when `cdk.context.json` is fully configured: -```bash -cdk deploy "*" --c contextId=migration-assistant --require-approval never --concurrency 3 -``` - -### Stacks Deployed: -* Migration Assistant Network stack -* Reindex From Snapshot stack -* Migration Console stack - ---- - -## Step 6 - Accessing the Migration Console -Run the following command to access the migration console: -```bash -./accessContainer.sh migration-console dev -``` ->[!NOTE] ->`accessContainer.sh` is located in `/opensearch-migrations/deployment/cdk/opensearch-service-migration/` on the bootstrap instance. - -_Learn more [[Accessing the Migration Console]]_ - ---- - -## Step 7 - Checking Connection to Source & Target Clusters -To verify the connection to the clusters, run: -```bash -console clusters connection-check -``` - -### Expected Output: -* **Source Cluster:** Successfully connected! -* **Target Cluster:** Successfully connected! - -_Learn more [[Console commands reference|Migration-Console-commands-references]]_ - ---- - -## Step 8 - Snapshot Creation -Run the following to initiate creating a snapshot from the source cluster -``` -console snapshot create [...] -``` - -To check on the progress, -``` -console snapshot status [...] -``` -or, for more detail, -``` -console snapshot status --deep-check [...] -``` - -Wait for the snapshot to complete before moving to the next step. 
- -_Learn more [[Snapshot Creation Verification]] [[Snapshot Creation]]_ - ---- - -## Step 9 - Metadata Migration -Run the following command to migrate metadata: -```bash -console metadata migrate [...] -``` - -_Learn more [[Metadata Migration]]_ - ---- - -## Step 10 - RFS Document Migration -Start the backfill process: -```bash -console backfill start -``` - -Scale up the number of workers: -```bash -console backfill scale -``` - -Check the status: -```bash -console backfill status -``` - -To stop the workers: -```bash -console backfill stop -``` - -_Learn more [[Backfill Execution]]_ - ---- - -## Step 11 - Monitoring -Use the following command for detailed monitoring: -```bash -console backfill status --deep-check -``` - -### Example Output: -```text -BackfillStatus.RUNNING -Running=9 -Pending=1 -Desired=10 -Shards total: 62 -Shards completed: 46 -Shards incomplete: 16 -Shards in progress: 11 -Shards unclaimed: 5 -``` - -Logs and metrics are available in CloudWatch in the OpenSearchMigrations log group. - ---- - -## Step 12 - Verify all documents were migrated -Use the following query in CloudWatch Logs Insights to identify failed documents: -```bash -fields @message -| filter @message like "Bulk request succeeded, but some operations failed." -| sort @timestamp desc -| limit 10000 -``` - -_Learn more [[Backfill Result Validation]]_ \ No newline at end of file