CDP Multinode Cluster on Docker with Existing Instances ( CDF (NiFi/Kafka) and CDSW (ML) )

This CDP multinode script uses Docker on Mac/Windows 10 to create CDP DC on 6 existing AWS instances (2 m5.4xlarge and 4 m5.2xlarge, each with 100 GB of storage). Important to note: all instances should be clean with only the OS installed on them; traces of a previous installation may result in script failure.

Updated on March 12, 2020

Assumptions

	1> This document assumes that you have access to an AWS account.
	2> Request a Cloudera trial license from the partner portal.
	3> Access to valid cloudera.com credentials to download binaries.
	4> Access to the following versions of Docker for Mac OS and Windows 10 Pro:
	https://hub.docker.com/editions/community/docker-ce-desktop-mac/
	https://hub.docker.com/editions/community/docker-ce-desktop-windows/

AWS Dependencies:

	1> Note the AWS keypair name (.pem file) to use with the scripts.
	2> Note the AWS region and AZ. They must be the same across all instances (us-east-1 is used in this example).
	3> 6 instances are required for setting up CDP DC, all with the same OS image and in the same AZ, VPC, subnet and Security Group
	   (example OS image: ami-02eac2c0129f6376b # CentOS-7 x86_64).
	   - 4 instances of m5.2xlarge with 100 GB storage will be used as 3 worker nodes and 1 CDF node.
	   - 2 instances of m5.4xlarge with 100 GB storage will be used for the main master (CM) and CDSW.
	   Make a note of all the existing private and public IPs of the instances along with the AZ, VPC, subnet and SG.
	4> Give names and add tags to each instance so you know its purpose,
	   eg: CDF_master, main_master, CDSW, Workernode1,2,3.
	5> ssh into every instance to make sure the pem file works and you have access to all 6 nodes (see the sketch below).
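
A minimal connectivity check, assuming the CentOS 7 AMI above (the default SSH user for CentOS AMIs is usually centos) and an example key name of sunita_field.pem:

	$ ssh -i ~/sunita_field.pem centos@<public-ip-of-instance>
	# repeat for each of the 6 public IPs; exit once the login succeeds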

Download and license info:

	1> Download the scripts. Save the files to your home directory (e.g. /Users/ssharma).
	NOTE: For Windows, avoid using spaces in folder names.
	2> Copy the license file to this directory. You should have requested a trial license from the partner portal.
	3> Copy the AWS .pem file into the home directory (/Users/ssharma).
	4> Create a directory, say mn-script, and download or unzip the script into this folder, eg: /Users/ssharma/mn-script.
	5> Create another directory under mn-script, $ mkdir bins, eg: /Users/ssharma/mn-script/bins
	6> Download the following CSDs (or the latest versions) into the "bins" directory (a download sketch follows the note below):

	https://archive.cloudera.com/CFM/csd/1.0.0.0/NIFI-1.9.0.1.0.0.0-90.jar
	https://archive.cloudera.com/CFM/csd/1.0.0.0/NIFICA-1.9.0.1.0.0.0-90.jar
	https://archive.cloudera.com/CFM/csd/1.0.0.0/NIFIREGISTRY-0.3.0.1.0.0.0-90.jar

	Download the following files into the "bins" directory:
	CSP: https://www.cloudera.com/downloads/cdf/csp-trial.html (Version 0.8 - sha, parcel files and CSD, or latest)
	CSM: https://www.cloudera.com/downloads/cdf/csm-trial.html (Version 2.0 - sha, parcel files and CSD, or latest)

NOTE: Make sure the SCHEMAREGISTRY, STREAMS_MESSAGING_MANAGER and STREAMS_REPLICATION_MANAGER files are 
in the “bins” directory before executing the ansible playbook. 
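
A minimal download sketch for the CFM CSDs, assuming the bins directory created above and curl on the laptop (the CSP and CSM downloads on cloudera.com require a browser login, so fetch those manually into the same directory):

	$ cd /Users/ssharma/mn-script/bins
	$ curl -LO https://archive.cloudera.com/CFM/csd/1.0.0.0/NIFI-1.9.0.1.0.0.0-90.jar
	$ curl -LO https://archive.cloudera.com/CFM/csd/1.0.0.0/NIFICA-1.9.0.1.0.0.0-90.jar
	$ curl -LO https://archive.cloudera.com/CFM/csd/1.0.0.0/NIFIREGISTRY-0.3.0.1.0.0.0-90.jar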

Docker Setup:

On both Windows and Mac OS, the following commands are used to set up the environment. We will execute the scripts to set up the 6-node CDP DC cluster with all the relevant services. Kerberos and TLS will be set up by default. For documentation on Docker, refer to this link: https://docs.docker.com/v17.12/docker-for-mac/

1> Ensure Docker Desktop has been installed and is running without any issues on your laptop.
2> Open a terminal on Mac or a command prompt on a Windows machine. The set of instructions works on both Mac OS and Windows.
3> $docker run -it fedora /bin/bash
4> Make a note of the container ID shown in the subsequent prompt. Use that ID to run the next command.
5> $docker commit 77d2b4577cfb myfedora (use the ID from the command above)
6> Mount your local Mac drive /Users/<dir> to Docker /home/<dir>:
	
Mac Example: $docker run -it --volume /Users/ssharma:/home/ssharma myfedora /bin/bash
Windows Example: $docker run -it --volume C:\Users\ssharma:/home/ssharma myfedora /bin/bash
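
An optional check that the volume mapping worked (the container prompt and /home/ssharma path are the examples used throughout this guide):

	[root@2e3f9e83cf7a  ~]# ls /home/ssharma
	# you should see the mn-script folder, the .pem file and the license file from your laptop home directory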
	
7> At this time, you have a Docker container with all the relevant files mapped to your home directory,
eg: /home/ssharma. Next, we will prep the Docker container and customize these files.
8> Install Ansible, Python 3 and boto3 in your Docker container:

	[root@2e3f9e83cf7a  ~]# dnf update -y
	[root@2e3f9e83cf7a  ~]# dnf install -y ansible python3-pip git  
	[root@2e3f9e83cf7a  ~]# pip3 install boto boto3
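
An optional verification that the tooling installed cleanly (version numbers will vary):

	[root@2e3f9e83cf7a  ~]# ansible --version
	[root@2e3f9e83cf7a  ~]# python3 -c "import boto3; print(boto3.__version__)"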

9> Add the SSH key on Docker (this is a 2-step process).
NOTE: On windows, you will need to copy the .pem file to a native docker folder and run these commands. 

Step 1: This step produces the agent pid as shown below:
	[root@2e3f9e83cf7a  ~]# eval `ssh-agent -s`
SSH_AUTH_SOCK=/var/folders/3m/xs2m6r7x7_qg8wp11ggy8l000000gp/T//ssh-ASHkKOqJ6PpS/agent.51910; export SSH_AUTH_SOCK;
	SSH_AGENT_PID=51911; export SSH_AGENT_PID;
   		echo Agent pid 51911;
	
Step 2: Use the ssh-add command and provide the pem file location:
	[root@2e3f9e83cf7a  ~]# ssh-add /home/ssharma/sunita_field.pem
	Identity added: /home/ssharma/sunita_field.pem
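
An optional check that the key is now loaded into the agent:

	[root@2e3f9e83cf7a  ~]# ssh-add -l
	# lists the fingerprint of the identity added above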

  10> Adding the key vault: Create the ansible vault file in the root directory to store the private key.
      NOTE: It will ask for a password to create the vault; we will store this password in a password file in the next step.
      Replace the key name with your own in the example below, using <username>_keys.vault as the file name.
 
 [root@2e3f9e83cf7a  ~]#ansible-vault create ssharma_keys.vault 
   
  11> This will open up an editor similar to vi. Copy and paste your .pem file contents, paying close attention
      to the indentation: give the key name followed by ": |", then indent each line of the key below it by 2 spaces.

  For example, in ssharma_keys.vault, use a key name such as sunita_key, as shown below:

sunita_key: |
  -----BEGIN RSA PRIVATE KEY-----
  Madsfdasagafgfdgfdsgadhdjasvfgaertqrecsf
  [...]
  dfasdgretwreaqghaduogihafdkghareoighfdk=
  -----END RSA PRIVATE KEY-----

NOTE: Record the private key name (eg: sunita_key) which will be used later in the config files

You will be asked to enter a password. Save the password. You can use this password in case you want to 
view or edit the file at a later stage. Use ansible-vault view or ansible-vault edit to make changes
   
	[root@2e3f9e83cf7a  ~]#ls -ltr /home/ssharma/ssharma_keys.vault (verify)
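
To view or edit the vault later (an optional check; you will be prompted for the vault password chosen above):

	[root@2e3f9e83cf7a  ~]# ansible-vault view /home/ssharma/ssharma_keys.vault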
	
12> On Docker, now create a simple file to store the vault password so you won't be prompted at runtime.
Create the file under your home directory:
	[root@2e3f9e83cf7a  ~]#echo "YourPassword" > vault-password-file
	[root@2e3f9e83cf7a  ~]#chmod 400 vault-password-file
	
NOTE: Record the file path and file name. We will use it in the config files

13> On Docker, export variables for the AWS keys as below:
        
    export AWS_ACCESS_KEY_ID=AKIAQxxxxxx
    export AWS_SECRET_ACCESS_KEY=uOI3N5KQZ8zbxxxxxxxxxx
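
An optional sanity check that the exported credentials are picked up, using the boto3 library installed earlier (the region is the example region from this guide):

    [root@2e3f9e83cf7a  ~]# python3 -c "import boto3; print(boto3.client('sts', region_name='us-east-1').get_caller_identity()['Arn'])"
    # prints the ARN of the IAM user/role the keys belong to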

Modify the configuration file:

At this point, you should have the script under a folder called mn-script. This folder should contain the bins directory. We will also need access to the vault, pem and password files stored in the home directory. The home directory should be accessible via the Docker mapping of the folders.

1> Open the ../config/stock.infra.aws.yml file.
2> Make changes to the parameters where it says <replace me>,
eg: owner, project, enddate, vpc, region, subnet and security group.

 region: us-east-1 <replace me>
 subnet: subnet-76505a3cxx <replace me>
 security_group: sg-010c70ad828ad9axx <replace me>
 image: ami-02eac2c0129f6376b <replace me> # CentOS-7 x86_64

 tags:
   owner: user.test <replace me>
   enddate: "01312020" <replace me>
   project: ansible-test <replace me>

3> Open stock.cluster.krb.yml and modify the filepath for the license where it says <replace me>.
4> Open /etc/ansible/ansible.cfg, make the following changes and save (a sketch of the resulting lines follows below):

	a> Uncomment "host_key_checking = False"
	b> Uncomment vault_password_file and specify the location of your vault password file.
	c> Uncomment inventory and provide the ansible_hosts.yml path,
		eg: inventory = /home/ssharma/Downloads/partnersRepo/ansible_hosts.yml
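
After the edits, the relevant lines in /etc/ansible/ansible.cfg should look roughly like this (the paths are illustrative; use your own vault password file and ansible_hosts.yml locations):

	[defaults]
	host_key_checking = False
	vault_password_file = /home/ssharma/vault-password-file
	inventory = /home/ssharma/mn-script/ansible_hosts.yml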



5> Open /etc/ansible/hosts, add the following 2 lines as below and save:

[local]
localhost

6> Change the following information in config/stock.cluster.krb.yml:
   a> Add the private_key value, eg: {{ sunita_key }}
   b> Provide the location of the CSDs and parcels downloaded into the bins folder (Step 6 of Download and license info)
      where it says <replace me>.
      Example:
     local_csds:
       - /home/ssharma/bins/SCHEMAREGISTRY-0.8.0.jar <replace me>
       - /home/ssharma/bins/STREAMS_MESSAGING_MANAGER-2.1.0.jar <replace me>
     local_parcels:
       - /home/ssharma/bins/SCHEMAREGISTRY-0.8.0.2.0.0.0-135-el7.parcel <replace me>
       - /home/ssharma/bins/SCHEMAREGISTRY-0.8.0.2.0.0.0-135-el7.parcel.sha <replace me>
       - /home/ssharma/bins/STREAMS_MESSAGING_MANAGER-2.1.0.2.0.0.0-135-el7.parcel <replace me>
       - /home/ssharma/bins/STREAMS_MESSAGING_MANAGER-2.1.0.2.0.0.0-135-el7.parcel.sha <replace me>

   c> For Auto-TLS, you will need a CDP DC license file from Cloudera. Set the filepath in stock.cluster.krb.yml
      where it says <replace me>:

     licence:
       type: enterprise
       filepath: test_2019_2020_Licenseinfo/test_2019_2020_cloudera_license.txt <replace me>
   
7> Open ansible_hosts.yml:
    a> Replace the .pem file path,
       eg: ansible_private_key_file: "/Users/ssharma/sunita_field.pem"
    b> Provide the public and private IPs as needed and save.
       The first 3 entries are worker nodes and 1 is the CDF node (2xlarge); the last 2 are the main master and CDSW (4xlarge).
       Replace all the IPs and re-check to make sure the correct IPs are assigned.
       Example:
	    ## 2xlarge/100gb vol for worker node#1
	    3.94.167.42:
	      ansible_host: 3.94.167.42

	      private_hostname: ip-172-31-16-186.ec2.internal
	      private_ip: 172.31.16.186
	      public_hostname: ec2-3-94-167-42.compute-1.amazonaws.com
	      public_ip: 3.94.167.42
	      
and more, as below. cm_server/db_server/main_master/krb5_server share the same IP:
main_master (4xlarge, 100gb), CDSW (4xlarge, 100gb), CDF: 1 node (2xlarge, 100gb), and 3 worker nodes (2xlarge, 100gb).
	      
	    cm_server:
	      hosts:
	        54.91.49.29:
	    db_server:
	      hosts:
	        54.91.49.29:
	    main_master:
	      hosts:
	        54.91.49.29:

	    krb5_server:
	      hosts:
	        54.91.49.29:

	    workers:
	      hosts:
	        3.94.167.42:
	        52.90.154.199:
	        54.208.14.90:
	    cdf:
	      hosts:
	        54.85.168.49:
	    cdsw_master:
	      hosts:
	        100.24.8.58:
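
Before running the playbook, an optional sanity check that the inventory parses and that all 6 nodes are reachable over SSH (the ping test assumes the SSH user and private key are set correctly in ansible_hosts.yml):

	[root@2e3f9e83cf7a  ~]# ansible-inventory -i ansible_hosts.yml --list
	[root@2e3f9e83cf7a  ~]# ansible all -i ansible_hosts.yml -m ping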

Now you are ready to execute the ansible playbook from the mn-script folder.

$ansible-playbook site.yml -e "infra=config/stock.infra.aws.yml" -e "cluster=config/stock.cluster.krb.yml" -e "vault=<path-to-keys.vault-file>" -e "cdpdc_teardown=" -e "public_key=<name_of_public_key_AWS>"

Example:

ansible-playbook site.yml -e "infra=config/stock.infra.aws.yml" -e "cluster=config/stock.cluster.krb.yml" -e "vault=/root/ashish_keys.vault" -e "cdpdc_teardown=sunita-03122020" -e "public_key=sunita-pse-sandbox"

After a successful execution, you will see something like the recap below:

TASK [cdpdc_cm_server : reset var _api_command] **********************************************************************************************************

ok: [54.91.49.29]
PLAY RECAP *****************************************************************************************************
100.24.8.58                : ok=31   changed=17   unreachable=0    failed=0    skipped=0    rescued=0    ignored=1   
3.94.167.42                : ok=31   changed=17   unreachable=0    failed=0    skipped=0    rescued=0    ignored=1   
52.90.154.199              : ok=31   changed=17   unreachable=0    failed=0    skipped=0    rescued=0    ignored=1   
54.208.14.90               : ok=31   changed=17   unreachable=0    failed=0    skipped=0    rescued=0    ignored=1   
54.85.168.49               : ok=57   changed=37   unreachable=0    failed=0    skipped=0    rescued=0    ignored=1   
54.91.49.29                : ok=118  changed=56   unreachable=0    failed=0    skipped=2    rescued=0    ignored=1   

Use the CM node (4xlarge) to log in to Cloudera Manager and verify the cluster status. The example above shows 54.91.49.29 as the CM server:

https://54.91.49.29:7183/cmf/login (login: admin, password: admin)
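
An optional check that Cloudera Manager is up before logging in through the browser (-k skips verification of the self-signed TLS certificate; replace the IP with your own CM server):

	$ curl -k -I https://54.91.49.29:7183/cmf/login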

Log in to AWS and check the EC2 instances; you will see the 6 instances used by the cluster: 3 worker nodes (2xlarge + 100gb), 1 CDF (2xlarge + 100gb), 1 CDSW (4xlarge + 100gb) and 1 main master (4xlarge + 100gb).
