From e3db40a65b507aa78ac606549fba07a3953adb45 Mon Sep 17 00:00:00 2001
From: Ronald Suplina <111508615+rsuplina@users.noreply.github.com>
Date: Tue, 8 Oct 2024 10:10:39 +0100
Subject: [PATCH] Rearrange readme with clearer instructions (#45)

Signed-off-by: rsuplina
Signed-off-by: Jim Enright
Co-authored-by: Jim Enright
---
 README.md | 248 +++++++++++++++++++++++++++++-------------------------
 1 file changed, 133 insertions(+), 115 deletions(-)

diff --git a/README.md b/README.md
index c96a93a..308661d 100644
--- a/README.md
+++ b/README.md
@@ -1,184 +1,202 @@
-# CDP quickstart using the Terraform Module for CDP Prerequisites
-This repository contains Terraform resource files to quickly deploy Cloudera Data Platform (CDP) Public Cloud and associated pre-requisite Cloud Service Provider (CSP) resources. It uses the [CDP Terraform Modules](https://github.com/cloudera-labs/terraform-cdp-modules) to do this.
+# Cloudera on cloud Quickstart Using Terraform
+
+This repository provides Terraform resources to quickly deploy **Cloudera on Cloud** and associated pre-requisite **Cloud Service Provider (CSP)** resources. It uses the [CDP Terraform Modules](https://github.com/cloudera-labs/terraform-cdp-modules) to do this.
 A summary requirements, configuration and execution steps to use this repository is given below.
-## Prerequisites
+## ⚠️ Prerequisites
-To use the module provided here, you will need the following prerequisites:
+To use the module provided here, you will need:
-* An AWS, Azure or GCP Cloud account;
-* A CDP Public Cloud account (you can sign up for a [60-day free pilot](https://www.cloudera.com/campaign/try-cdp-public-cloud.html) );
+* An AWS, Azure, or GCP Cloud account;
+* A Cloudera on cloud account (you can sign up for a [60-day free pilot](https://www.cloudera.com/campaign/try-cdp-public-cloud.html) );
 * A recent version of Terraform software (version 0.13 or higher).
-## Deployment steps
+## 🔧 Configure Local Prerequisites
+
+* Terraform can be installed by following the instructions at https://developer.hashicorp.com/terraform/downloads.
-### Configure local prerequisites
+* If you have not yet configured your `~/.cdp/credentials` file, follow the steps for [Generating an API access key](https://docs.cloudera.com/cdp-public-cloud/cloud/cli/topics/mc-cli-generating-an-api-access-key.html).
 * To create resources in the Cloud Provider, access credentials or service account are needed for authentication.
-  * For **AWS** access keys are required to be able to create the Cloud resources via the Terraform aws provider. See the [AWS documentation for Managing access keys for IAM users](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_credentials_access-keys.html).
-  * For **Azure**, authentication with the Azure subscription is required. There are a number of ways to do this outlined in the [Azure Terraform Provider Documentation](https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs#authenticating-to-azure).
-  * For **GCP**, authentication with the GCP API is required. There are a number of ways to do this outlined in the [Google Terraform Provider Documentation](https://registry.terraform.io/providers/hashicorp/google/latest/docs/guides/provider_reference#authentication).
+  * For **AWS**, access keys are required to create the Cloud resources via the Terraform aws provider. See the [AWS documentation for Managing access keys](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_credentials_access-keys.html).
+  * For **Azure**, authentication with the Azure subscription is required. There are a number of ways to do this outlined in the [Azure Terraform Provider documentation](https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs#authenticating-to-azure).
+  * For **GCP**, authentication with the GCP API is required. There are a number of ways to do this outlined in the [Google Terraform Provider documentation](https://registry.terraform.io/providers/hashicorp/google/latest/docs/guides/provider_reference#authentication).
-* If you have not yet configured your `~/.cdp/credentials file`, follow the steps for [Generating an API access key](https://docs.cloudera.com/cdp-public-cloud/cloud/cli/topics/mc-cli-generating-an-api-access-key.html).
+> [!NOTE]
+> See the [Additional Authentication & Configuration Notes](#additional-authentication--configuration-notes) section for further details on authentication with the Cloud Providers.
-* Terraform can be installed by following the instructions at https://developer.hashicorp.com/terraform/downloads
+## 📖 Quickstart Guide
-#### Notes on AWS authentication
+> [!IMPORTANT]
+> Make sure your Cloudera and your Cloud provider credentials are properly configured before proceeding.
-* Details of the different methods to authenticate with AWS are available in the [aws Terraform provider docs](https://registry.terraform.io/providers/hashicorp/aws/latest/docs#authentication-and-configuration).
+### 1. Clone the Repository
-* The most common ways to specify AWS access and secret keys are:
-  * via environment variables (i.e. setting the `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY`) or;
-  * via shared configuration/credential files (e.g. the `$HOME/.aws/credentials` file). The `AWS_PROFILE` environment variable can be set to specify a named AWS profile.
+```bash
+git clone https://github.com/cloudera-labs/cdp-tf-quickstarts.git
+cd cdp-tf-quickstarts
+```
-* Note that the AWS region to use should always be specifed as a Terraform input variable (with the `aws_region` variable). This region variable is also used an input to the CDP deploy module used to identify the Cloud Provider region.
+### 2. Configure Variables
-#### Notes on Azure authentication
+Change to the required cloud provider directory and create a `terraform.tfvars` file with the variable configuration for your deployment.
-* Where you have more than one Azure Subscription the id to use can be passed via the the `ARM_SUBSCRIPTION_ID` environment variable.
+Refer to the `terraform.tfvars.template` in each cloud provider directory and to the sample contents below, which indicate the values to change.
-* When using a Service Principal (SP) to authenticate with Azure, it is not possible to authenticate with azuread Terraform Provider (the provider used to create the Azure Cross Account AD Application) with the command az login --service-principal. We found the the best way to authenticate using an SP is by setting environment variables. Details of required environment variables are in the [azuread docs](https://registry.terraform.io/providers/hashicorp/azuread/latest/docs/guides/service_principal_client_secret#environment-variables) and [azurerm docs](https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/guides/service_principal_client_secret#configuring-the-service-principal-in-terraform) and summarized below.
 ```bash
-export ARM_CLIENT_ID=""
-export ARM_CLIENT_SECRET=""
-export ARM_TENANT_ID=""
-export ARM_SUBSCRIPTION_ID=""
-```
+# Change into cloud provider directory, e.g. for aws
+cd aws
-* The Azure API permissions listed are required by the provisioning account to create the Azure pre-requisite resources. Note that all permissions are of type Application (rather than Delegated).
+cp terraform.tfvars.template terraform.tfvars -| API Permission | Purpose | -| ------------------| ------- | -| Microsoft Graph - Application.Read.All | Read all applications | -| Microsoft Graph - Application.ReadWrite.All | Read and write all applications | -| Microsoft Graph - Application.ReadWrite.OwnedBy | Manage apps that this app creates or owns | -| Microsoft Graph - Directory.ReadWrite.All | Read and write directory data | -| Microsoft Graph - User.Read.All | Read all users' full profiles | +vi terraform.tfvars +``` -#### Notes on GCP authentication +
+ Expand for AWS configuration file -* The [Getting Started Docs for Google Terraform Provider](https://registry.terraform.io/providers/hashicorp/google/latest/docs/guides/getting_started#adding-credentials) gives details on the two recommended ways to authenticate with the GCP API. - 1. The Google Cloud SDK (`gcloud`) can be installed and a User Application Default Credentials ("ADCs") can be created by running the command `gcloud auth application-default login` - 1. A Google Cloud Service Account key file can be generated and downloaded. The `GOOGLE_APPLICATION_CREDENTIALS` environment variable can then be set to the location of the file. + ```yaml + # ------- Global settings ------- + env_prefix = "" # Required name prefix for cloud and CDP resources, e.g. cldr1 - ```bash - export GOOGLE_APPLICATION_CREDENTIALS= + # ------- Cloud Settings ------- + aws_region = "" # Change this to specify Cloud Provider region, e.g. eu-west-1 + + # ------- CDP Environment Deployment ------- + deployment_template = "" # Specify the deployment pattern below. Options are public, semi-private or private ``` -* The Google Cloud IAM roles listed below are required by the provisioning account to create the GCP pre-requisite resources. +
+
+
+ Expand for Azure configuration file -| IAM Role | -| ------------------------- | -| Compute Network Admin | -| Compute Security Admin | -| Role Administrator | -| Security Admin | -| Service Account Admin | -| Service Account Key Admin | -| Storage Admin | -| Viewer | + ```yaml + # ------- Global settings ------- + env_prefix = "" # Required name prefix for cloud and CDP resources, e.g. cldr1 -* The Google project Id can be specified via the `gcp_project` input variable, the `GOOGLE_PROJECT` environment variable or the default project set via the Cloud SDK. This is described in the [Google Provider Default Values Configuration](https://registry.terraform.io/providers/hashicorp/google/latest/docs/guides/provider_reference#provider-default-values-configuration) documentation. + # ------- Cloud Settings ------- + azure_region = "" # Change this to specify Cloud Provider region, e.g. eastus -### Input file configuration + # ------- CDP Environment Deployment ------- + deployment_template = "" # Specify the deployment pattern below. Options are public, semi-private or private + ``` + +
+
+
+ Expand for GCP configuration file -The `terraform.tfvars.template` file in the required cloud provider directory contains the user-facing configuration. Edit this file to match your particular deployment. + ```yaml + # ------- Global settings ------- + env_prefix = "" # Required name prefix for cloud and CDP resources, e.g. cldr1 -Sample contents with indicators of values to change are shown below. + # ------- Cloud Settings ------- + gcp_project = "" # Change this to specify the GCP Project ID -#### Sample Configuration file for AWS + gcp_region = "" # Change this to specify Cloud Provider region, e.g. europe-west2 -```yaml -# ------- Global settings ------- -env_prefix = "" # Required name prefix for cloud and CDP resources, e.g. cldr1 + # ------- CDP Environment Deployment ------- + deployment_template = "" # Specify the deployment pattern below. Options are public, semi-private or private + ``` -# ------- Cloud Settings ------- -aws_region = "" # Change this to specify Cloud Provider region, e.g. eu-west-1 +
+
-# ------- CDP Environment Deployment -------
-deployment_template = "" # Specify the deployment pattern below. Options are public, semi-private or private
+### 3. Deploy Infrastructure
+```bash
+terraform init
+terraform apply
 ```
-#### Sample Configuration file for Azure
+> ⏱️ **Note:** The deployment can take up to **60 minutes**.
-```yaml
-# ------- Global settings -------
-env_prefix = "" # Required name prefix for cloud and CDP resources, e.g. cldr1
+### 4. Monitor Progress
-# ------- Cloud Settings -------
-azure_region = "" # Change this to specify Cloud Provider region, e.g. eastus
+You can follow the deployment process on the Cloudera on cloud Management Console from your browser at [cdp.cloudera.com](https://cdp.cloudera.com).
-# ------- CDP Environment Deployment -------
-deployment_template = "" # Specify the deployment pattern below. Options are public, semi-private or private
-```
+After it completes, you can add [Data Hubs and Data Services](https://docs.cloudera.com/cdp-public-cloud/cloud/overview/topics/cdp-services.html) to your newly deployed environment from the Management Console UI or using the CLI.
-#### Sample Configuration file for GCP
+### Clean Up Resources
-```yaml
-# ------- Global settings -------
-env_prefix = "" # Required name prefix for cloud and CDP resources, e.g. cldr1
+If you no longer need the infrastructure and Cloudera on cloud environment that's provisioned by Terraform, run the following command to remove the deployment infrastructure and terminate all resources.
-# ------- Cloud Settings -------
-gcp_project = "" # Change this to specify the GCP Project ID
+```bash
+terraform destroy
+```
-gcp_region = "" # Change this to specify Cloud Provider region, e.g. europe-west2
+> ⏱️ **Note:** Cleanup of the deployment will take about 20 minutes.
-# ------- CDP Environment Deployment -------
-deployment_template = "" # Specify the deployment pattern below. Options are public, semi-private or private
-```
+## Additional Authentication & Configuration Notes
-#### SSH keys
+### SSH keys
-By default the Terraform quickstarts will create a new SSH keypair that will be associated with all nodes provisioned by CDP. The private key will be stored in the `-ssh-key.pem` file of the Terraform cloud provider project directory.
+By default the Terraform quickstarts will create a new SSH keypair that will be associated with all nodes provisioned by Cloudera on cloud. The private key will be stored in the `-ssh-key.pem` file of the Terraform cloud provider project directory.
 To use an existing SSH key, set the keypair name (for AWS) or public key text (for Azure and GCP) variable in the `terraform.tvars` file.
-#### Access to UI and API endpoints
+### Access to UI and API endpoints
-By default inbound access to the UI and API endpoints of your deployment will be allowed from the public IP of executing host.
+By default, inbound access to the UI and API endpoints of your deployment will be allowed from the public IP of the executing host. To add additional CIDRs or IP ranges, set the optional `ingress_extra_cidrs_and_ports` variable in the `terraform.tfvars` file.
-### Create infrastructure
+### Notes on AWS authentication
-1. Clone this repository using the following commands:
+* Details of the different methods to authenticate with AWS are available in the [aws Terraform provider docs](https://registry.terraform.io/providers/hashicorp/aws/latest/docs#authentication-and-configuration).
-```bash
-# Git clone
-git clone https://github.com/cloudera-labs/cdp-tf-quickstarts.git
-# Change to directory with the cloned repo
-cd cdp-tf-quickstarts
-```
+* The most common ways to specify AWS access and secret keys are:
+  * via environment variables (i.e. setting the `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY`) or;
+  * via shared configuration/credential files (e.g. the `$HOME/.aws/credentials` file). The `AWS_PROFILE` environment variable can be set to specify a named AWS profile.
-2. In the cloned repo, change to required cloud provider directory and create a `terraform.tfvars` file with variable definitions to run the module. Reference the `terraform.tfvars.template` in each cloud provider directory and the example contents discussed in the section above.
+* Note that the AWS region to use should always be specified as a Terraform input variable (with the `aws_region` variable). This region variable is also used as an input to the CDP deploy module to identify the Cloud Provider region.
-```bash
-# Change into cloud provider directory, e.g. for aws
-cd aws
+### Notes on Azure authentication
-#Copy terraform.tfvars.template into terraform.tfvars
-cp terraform.tfvars.template terraform.tfvars
+* Where you have more than one Azure Subscription, the ID to use can be passed via the `ARM_SUBSCRIPTION_ID` environment variable.
-# Edit the terraform.tfvars file as needed, e.g. using vi
-vi terraform.tfvars
-```
+* When using a Service Principal (SP) to authenticate with Azure, it is not possible to authenticate with the azuread Terraform Provider (the provider used to create the Azure Cross Account AD Application) using the command `az login --service-principal`. We found that the best way to authenticate using an SP is by setting environment variables. Details of required environment variables are in the [azuread docs](https://registry.terraform.io/providers/hashicorp/azuread/latest/docs/guides/service_principal_client_secret#environment-variables) and [azurerm docs](https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/guides/service_principal_client_secret#configuring-the-service-principal-in-terraform) and summarized below.
-3. To create Cloud resources and CDP environment, in the cloud provider directory, run the Terraform commands to initialize and apply the changes:
+  ```bash
+  export ARM_CLIENT_ID=""
+  export ARM_CLIENT_SECRET=""
+  export ARM_TENANT_ID=""
+  export ARM_SUBSCRIPTION_ID=""
+  ```
-```bash
-terraform init
-terraform apply
-```
+* The Azure API permissions listed below are required by the provisioning account to create the Azure pre-requisite resources. Note that all permissions are of type Application (rather than Delegated).
+
+| API Permission | Purpose |
+| ------------------| ------- |
+| Microsoft Graph - Application.Read.All | Read all applications |
+| Microsoft Graph - Application.ReadWrite.All | Read and write all applications |
+| Microsoft Graph - Application.ReadWrite.OwnedBy | Manage apps that this app creates or owns |
+| Microsoft Graph - Directory.ReadWrite.All | Read and write directory data |
+| Microsoft Graph - User.Read.All | Read all users' full profiles |
-Once the creation of the CDP environment and data lake starts, you can follow the deployment process on the CDP Management Console from your browser in ( [https://cdp.cloudera.com/](https://cdp.cloudera.com/) ). After it completes, you can add CDP [Data Hubs and Data Services](https://docs.cloudera.com/cdp-public-cloud/cloud/overview/topics/cdp-services.html) to your newly deployed environment from the Management Console UI or using the CLI.
+### Notes on GCP authentication
-### Clean up the CDP environment and infrastructure
+* The [Getting Started Docs for Google Terraform Provider](https://registry.terraform.io/providers/hashicorp/google/latest/docs/guides/getting_started#adding-credentials) gives details on the two recommended ways to authenticate with the GCP API.
+  1. The Google Cloud SDK (`gcloud`) can be installed and User Application Default Credentials ("ADCs") can be created by running the command `gcloud auth application-default login`.
+  1. A Google Cloud Service Account key file can be generated and downloaded. The `GOOGLE_APPLICATION_CREDENTIALS` environment variable can then be set to the location of the file.
-If you no longer need the infrastructure and CDP environment that’s provisioned by Terraform, run the following command to remove the deployment infrastructure and terminate all resources.
+  ```bash
+  export GOOGLE_APPLICATION_CREDENTIALS=
+  ```
-```bash
-terraform destroy
-```
+* The Google Cloud IAM roles listed below are required by the provisioning account to create the GCP pre-requisite resources.
+
+  | IAM Role |
+  | ------------------------- |
+  | Compute Network Admin |
+  | Compute Security Admin |
+  | Role Administrator |
+  | Security Admin |
+  | Service Account Admin |
+  | Service Account Key Admin |
+  | Storage Admin |
+  | Viewer |
+
+* The Google project ID can be specified via the `gcp_project` input variable, the `GOOGLE_PROJECT` environment variable or the default project set via the Cloud SDK. This is described in the [Google Provider Default Values Configuration](https://registry.terraform.io/providers/hashicorp/google/latest/docs/guides/provider_reference#provider-default-values-configuration) documentation.