Ansible playbooks for deploying Hortonworks Data Platform and DataFlow

ansible-hortonworks

These Ansible playbooks will build a Hortonworks cluster (either Hortonworks Data Platform or Hortonworks DataFlow) using Ambari Blueprints.

This includes building the cloud infrastructure and taking care of the prerequisites.

The aim is to first build the nodes in a Cloud environment, prepare them (OS settings, etc.), and then install Ambari and create the cluster using Ambari Blueprints.

If the infrastructure already exists, it can also use a static inventory.

It can use a static blueprint or generate a blueprint based on the components selected in the Ansible variables file.

For a full list of supported features, see the Features section below.

Installation

  • AWS: See INSTALL.md for AWS-specific build instructions.
  • Azure: See INSTALL.md for Azure-specific build instructions.
  • Google Compute Engine: See INSTALL.md for GCE-specific build instructions.
  • OpenStack: See INSTALL.md for OpenStack-specific build instructions.
  • Static inventory: See INSTALL.md for static inventory build instructions.

Requirements

  • Ansible >= 2.2.1, < 2.4.0

  • Expects CentOS/RHEL, Ubuntu, Amazon Linux or SLES hosts
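
The Ansible version range above can be checked before running anything. The following is a minimal sketch, not part of these playbooks: the function names are hypothetical, and it assumes a plain dotted numeric version string (a suffix such as `rc1` would need extra handling).

```python
def parse_version(text):
    """Turn a dotted version string such as '2.3.2.0' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def ansible_version_supported(version, minimum="2.2.1", maximum_exclusive="2.4.0"):
    """Return True when minimum <= version < maximum_exclusive."""
    low = parse_version(minimum)
    high = parse_version(maximum_exclusive)
    return low <= parse_version(version) < high


if __name__ == "__main__":
    # Illustrative version strings only; pass in the version reported by `ansible --version`.
    for candidate in ("2.2.0", "2.2.1", "2.3.2.0", "2.4.0"):
        print(candidate, ansible_version_supported(candidate))
```

Comparing tuples of integers avoids the pitfalls of comparing version strings lexically.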

Description

Currently, these playbooks are divided into the following parts:

  1. Build the Cloud nodes

    Run the build_cloud.sh script to build the nodes. Refer to the Cloud-specific INSTALL guides for more information.

  2. Install the cluster

    Run the install_cluster.sh script that will install the HDP or HDF cluster using Blueprints while taking care of the necessary prerequisites.

...or, alternatively, run each step separately:

  1. Build the Cloud nodes

    Run the build_cloud.sh script to build the nodes. Refer to the Cloud-specific INSTALL guides for more information.

  2. Prepare the Cloud nodes

    Run the prepare_nodes.sh script to prepare the nodes.

    This installs the required OS packages, applies the recommended OS settings and adds the Ambari repositories.

  3. Install Ambari

    Run the install_ambari.sh script to install Ambari on the nodes.

    This installs the Ambari Agent on all nodes and the Ambari Server on the designated node. Ambari Agents are configured to register to the Ambari Server.

  4. Configure Ambari

    Run the configure_ambari.sh script to configure Ambari.

    This playbook currently sets the repository URLs in Ambari. It is also intended to handle other settings, such as the Alert options or the admin user password.

  5. Apply Blueprint

    Run the apply_blueprint.sh script to install HDP or HDF based on an Ambari Blueprint.

    This uploads the Ambari Blueprint and Cluster Creation Template and launches a cluster create request to Ambari. It can also wait for the cluster to be built.

  6. Post Install

    Run the post_install.sh script to execute any actions after the cluster is built.
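
The step-by-step sequence above can be driven from a small wrapper. This is a sketch under stated assumptions: the script names come from the steps listed here, but the `script_dir` location, the helper names, and the dry-run default are hypothetical. Any settings the scripts expect are described in the INSTALL guides and are not handled here.

```python
import subprocess

# Script names as listed in the steps above, in execution order.
STEPS = [
    "build_cloud.sh",
    "prepare_nodes.sh",
    "install_ambari.sh",
    "configure_ambari.sh",
    "apply_blueprint.sh",
    "post_install.sh",
]


def plan_steps(start_at=None, script_dir="."):
    """Return the commands to run, optionally resuming from a given script."""
    first = STEPS.index(start_at) if start_at else 0
    return [["bash", f"{script_dir}/{name}"] for name in STEPS[first:]]


def run_steps(commands, dry_run=True):
    """Print each command in order; execute it unless dry_run is set."""
    for command in commands:
        print(" ".join(command))
        if not dry_run:
            # check=True stops the sequence at the first failing script.
            subprocess.run(command, check=True)


if __name__ == "__main__":
    # Dry run: only shows what would be executed.
    run_steps(plan_steps(), dry_run=True)
```

Resuming with `plan_steps(start_at="install_ambari.sh")` skips the steps that already succeeded, which is the main reason to run the steps separately.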

Features

Infrastructure support

  • Pre-built infrastructure (using a static inventory file)
  • OpenStack nodes
  • OpenStack Block Storage (Cinder)
  • AWS nodes (with root EBS only)
  • AWS Block Storage (additional EBS)
  • Azure nodes
  • Azure Block Storage (VHDs)
  • Google Compute Engine nodes (with root Persistent Disks only)
  • Google Compute Engine Block Storage (additional Persistent Disks)

OS support

  • CentOS/RHEL 6 support
  • CentOS/RHEL 7 support
  • Ubuntu 14 support
  • Ubuntu 16 support
  • Amazon AMI (2016.09 and 2017.03) support
  • SUSE Linux Enterprise Server 11 support
  • SUSE Linux Enterprise Server 12 support

Prerequisites

  • Install and start NTP
  • Create /etc/hosts mappings
  • Set nofile and nproc limits
  • Set swappiness
  • Disable SELinux
  • Disable THP
  • Set Ambari repositories
  • Install OpenJDK or Oracle JDK
  • Install and prepare MySQL
  • Install and prepare PostgreSQL
  • Install and configure local MIT KDC
  • Partition and mount additional storage
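
Two of the prerequisites above (THP and swappiness) can be inspected on a prepared node. The sketch below is an illustration, not how the playbooks verify anything: the helper names are hypothetical, and the two kernel paths are the usual Linux locations, which may differ on some distributions (RHEL 6, for example, uses a different THP path).

```python
from pathlib import Path

# Usual Linux locations; assumed here, not taken from the playbooks.
THP_PATH = Path("/sys/kernel/mm/transparent_hugepage/enabled")
SWAPPINESS_PATH = Path("/proc/sys/vm/swappiness")


def active_thp_mode(text):
    """Return the bracketed (active) value from text like 'always madvise [never]'."""
    for word in text.split():
        if word.startswith("[") and word.endswith("]"):
            return word[1:-1]
    return None


def read_text_or_none(path):
    """Read a file, returning None when it is missing or unreadable."""
    try:
        return path.read_text()
    except OSError:
        return None


def check_node():
    """Collect the current THP mode and swappiness, or None where unavailable."""
    thp = read_text_or_none(THP_PATH)
    swappiness = read_text_or_none(SWAPPINESS_PATH)
    return {
        "thp_mode": active_thp_mode(thp) if thp is not None else None,
        "swappiness": int(swappiness) if swappiness is not None else None,
    }


if __name__ == "__main__":
    print(check_node())
```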

Cluster build

  • Install Ambari Agent and Server with embedded JDK and databases
  • Configure Ambari Server with OpenJDK or Oracle JDK
  • Configure Ambari Server with advanced database options
  • Configure Ambari Server with SSL
  • Configure custom Repositories
  • Build HDP clusters
  • Build HDF clusters
  • Build HDP clusters with HDF nodes
  • Build clusters with a specific JSON blueprint (static)
  • Build clusters with a generated JSON blueprint (dynamic based on Jinja2 template and variables)
  • Wait for the cluster to be built
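
Building a cluster from a blueprint comes down to two calls against the Ambari REST API: registering the blueprint, then posting the cluster creation template. The sketch below only constructs those requests and does not send them. The host, credentials, names, and helper functions are hypothetical, and it is not a copy of what the playbooks do internally.

```python
import base64
import json
import urllib.request


def ambari_request(base_url, path, payload, user, password):
    """Build (but do not send) a POST request for the Ambari REST API."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        url=f"{base_url}/api/v1/{path}",
        data=json.dumps(payload).encode(),
        method="POST",
        headers={
            "Authorization": f"Basic {token}",
            # Ambari rejects modifying requests that lack this header.
            "X-Requested-By": "ambari",
        },
    )


def cluster_build_requests(base_url, blueprint_name, blueprint,
                           cluster_name, template, user, password):
    """Return the two requests, in the order they would be sent."""
    return [
        ambari_request(base_url, f"blueprints/{blueprint_name}", blueprint, user, password),
        ambari_request(base_url, f"clusters/{cluster_name}", template, user, password),
    ]


if __name__ == "__main__":
    # Placeholder values for illustration only.
    requests_to_send = cluster_build_requests(
        "http://ambari.example.com:8080", "my_blueprint", {},
        "my_cluster", {"blueprint": "my_blueprint"}, "admin", "admin",
    )
    for request in requests_to_send:
        print(request.get_method(), request.full_url)
```

Each request could then be sent with `urllib.request.urlopen`. Waiting for the build would mean polling the request resource that Ambari returns for the cluster creation call, which is left out here.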

Dynamic blueprint

  • HA NameNode
  • HA ResourceManager
  • HA Hive
  • HA HBase Master
  • HA Oozie
  • Secure clusters with MIT KDC (Ambari managed)
  • Secure clusters with Microsoft AD (Ambari managed)
  • Install Ranger and enable plugins
  • Ranger AD integration
  • Hadoop SSL
  • Hadoop AD integration
  • NiFi SSL
  • NiFi AD integration
  • Basic memory settings tuning
  • Make use of additional storage for HDP workers
  • Make use of additional storage for master services
  • Configure additional storage for NiFi
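
To make the idea of a generated blueprint concrete, the sketch below assembles a minimal Ambari Blueprint from a selection of components per host group. The playbooks generate theirs from a Jinja2 template and Ansible variables; this example swaps in plain Python to show the shape of the output. The host group names, stack version, component selection, and function name are all illustrative assumptions.

```python
import json


def build_blueprint(stack_name, stack_version, host_groups):
    """Assemble an Ambari Blueprint dict from {group_name: [component, ...]}."""
    return {
        "Blueprints": {"stack_name": stack_name, "stack_version": stack_version},
        "configurations": [],
        "host_groups": [
            {
                "name": group,
                "components": [{"name": component} for component in components],
            }
            for group, components in host_groups.items()
        ],
    }


# Hypothetical selection; a real one would come from the Ansible variables file.
selected = {
    "hdp-master": ["NAMENODE", "RESOURCEMANAGER", "ZOOKEEPER_SERVER"],
    "hdp-worker": ["DATANODE", "NODEMANAGER"],
}
blueprint = build_blueprint("HDP", "2.6", selected)

if __name__ == "__main__":
    print(json.dumps(blueprint, indent=2))
```

Features such as HA or security add further components and entries under `configurations`, which is why a template driven by variables is more practical than maintaining many static JSON files.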
