Showing 17 changed files with 224 additions and 8 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,27 @@ | ||
# Node decommissioning | ||
|
||
The procedure is the same as the one described in the [Cloudera documentation](https://docs.cloudera.com/HDPDocuments/HDP3/HDP-3.1.4/administration/content/decommissioning-slave-nodes.html). | ||
|
||
Before starting a decommissioning process, check the application, Nodemanager and Datanode statuses by executing the playbook `hadoop-component-decommissioning-check.yml`. The same playbook can be run again at any time after the decommissioning process has begun to follow its progress. | ||
|
||
To see which application is running on which node, execute `yarn app -status <application-id>` on a node that has the Yarn client installed. | ||
|
||
## Yarn Nodemanager decommissioning | ||
|
||
List the hostnames of the Nodemanagers to decommission, separated by commas, in the `yarn_nodemanagers_decommission` variable of the `excluded_nodes.yml` file in the Yarn tdp_variables, then set the timeout for the graceful decommissioning in `graceful_decommission_timeout_seconds`. A node is decommissioned once all applications running on it have terminated or once the timeout expires; in the latter case, the applications still running on it are restarted on another node. The value `-1` means an infinite timeout. Then execute the playbook `hadoop-components-decommissioning/yarn_resourcemanager_decomm_nodemanager.yml`. | ||
|
||
## HDFS Datanode decommissioning | ||
|
||
List the hostnames of the Datanodes to decommission, separated by commas, in the `hdfs_datanodes_decommission` variable of the `excluded_nodes.yml` file in the HDFS tdp_variables, then execute the playbook `hadoop-components-decommissioning/hdfs_namenode_decomm_datanode.yml`. | ||
|
||
*NB*: the decommissioning of the HDFS datanode can take several hours depending on the size of the file system. | ||
|
||
## Hadoop decommissioning | ||
|
||
The playbook `hadoop-decommissioning.yml` executes both playbooks above, decommissioning the Yarn Nodemanagers and then the HDFS Datanodes. Before that, it executes the `yarn_capacity_scheduler.yml` playbook to reconfigure the Yarn capacity scheduler. | ||
|
||
## Recommissioning a node | ||
|
||
For HDFS, just delete the node from `hdfs_datanodes_decommission` and execute the playbook `hadoop-components-decommissioning/hdfs_namenode_decomm_datanode.yml`. | ||
|
||
For Yarn, delete the node from `yarn_nodemanagers_decommission`, execute the playbook `hadoop-components-decommissioning/yarn_resourcemanager_decomm_nodemanager.yml`, restart the decommissioned Nodemanager with the playbook `yarn_nodemanager_restart.yml`, and finally execute `hadoop-components-decommissioning/yarn_resourcemanager_decomm_nodemanager.yml` again. |
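The three Yarn recommissioning steps could be chained in one wrapper playbook. The following is only a sketch under stated assumptions: the file name `yarn-recommissioning.yml` is hypothetical and not part of this commit, the wrapper is assumed to sit next to `hadoop-decommissioning.yml`, and the location of `yarn_nodemanager_restart.yml` (one directory up, like `yarn_capacity_scheduler.yml`) is a guess to adjust to the real layout.

```yaml
# yarn-recommissioning.yml (hypothetical wrapper, not part of this commit)
# Run it only after removing the node from yarn_nodemanagers_decommission.
---
# 1. Re-render the exclude file without the node and refresh the ResourceManager
- ansible.builtin.import_playbook: hadoop-components-decommissioning/yarn_resourcemanager_decomm_nodemanager.yml
# 2. Restart the previously decommissioned Nodemanager (path is an assumption)
- ansible.builtin.import_playbook: ../yarn_nodemanager_restart.yml
# 3. Refresh the ResourceManager a second time, as the procedure requires
- ansible.builtin.import_playbook: hadoop-components-decommissioning/yarn_resourcemanager_decomm_nodemanager.yml
```

Importing the same playbook twice is allowed by `import_playbook`, so the wrapper keeps the documented order without duplicating any task.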
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,7 @@ | ||
# Copyright 2022 TOSIT.IO | ||
# SPDX-License-Identifier: Apache-2.0 | ||
|
||
--- | ||
hdfs_datanodes_decommission: [] | ||
yarn_nodemanagers_decommission: [] | ||
graceful_decommission_timeout_seconds: -1 |
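As an illustration, a filled-in version of this file might look like the sketch below. The hostnames `worker-03` and `worker-04` are hypothetical, and the entries are assumed to be inventory hostnames because the roles pass each one through `tosit.tdp.access_fqdn(hostvars)`. The one-hour timeout is an arbitrary example value.

```yaml
---
# Hostnames are hypothetical; replace them with the names of your own workers.
hdfs_datanodes_decommission:
  - worker-03
yarn_nodemanagers_decommission:
  - worker-03
  - worker-04
# Wait at most one hour for running applications; -1 would wait indefinitely.
graceful_decommission_timeout_seconds: 3600
```

Both variables are YAML lists, which matches the `[]` defaults and the `loop:` used by the decommissioning roles.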
14 changes: 14 additions & 0 deletions
playbooks/utils/decommissioning/hadoop-component-decommissioning-check.yml
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,14 @@ | ||
# Copyright 2022 TOSIT.IO | ||
# SPDX-License-Identifier: Apache-2.0 | ||
|
||
--- | ||
- name: Hadoop Yarn Nodemanager and HDFS Datanode check | ||
  hosts: hdfs_nn, yarn_rm | ||
  tasks: | ||
    - tosit.tdp.resolve: # noqa unnamed-task | ||
        node_name: hdfs_namenode, yarn_resourcemanager | ||
    - name: Print application, node and datastorage information | ||
      ansible.builtin.import_role: | ||
        name: tosit.tdp.utils.hadoop_decommissioning_check | ||
        tasks_from: main | ||
    - ansible.builtin.meta: clear_facts # noqa unnamed-task |
15 changes: 15 additions & 0 deletions
...utils/decommissioning/hadoop-components-decommissioning/hdfs_namenode_decomm_datanode.yml
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,15 @@ | ||
# Copyright 2022 TOSIT.IO | ||
# SPDX-License-Identifier: Apache-2.0 | ||
|
||
--- | ||
- name: Hadoop HDFS datanode Decommissioning | ||
  hosts: hdfs_nn | ||
  vars_files: ../excluded_nodes.yml | ||
  tasks: | ||
    - tosit.tdp.resolve: # noqa unnamed-task | ||
        node_name: hdfs_namenode | ||
    - name: Decommission HDFS datanode | ||
      ansible.builtin.import_role: | ||
        name: tosit.tdp.utils.hdfs_namenode_decommissioning | ||
        tasks_from: main | ||
    - ansible.builtin.meta: clear_facts # noqa unnamed-task |
15 changes: 15 additions & 0 deletions
...mmissioning/hadoop-components-decommissioning/yarn_resourcemanager_decomm_nodemanager.yml
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,15 @@ | ||
# Copyright 2022 TOSIT.IO | ||
# SPDX-License-Identifier: Apache-2.0 | ||
|
||
--- | ||
- name: Hadoop Yarn resourcemanager decommissioning | ||
  hosts: yarn_rm | ||
  vars_files: ../excluded_nodes.yml | ||
  tasks: | ||
    - tosit.tdp.resolve: # noqa unnamed-task | ||
        node_name: yarn_resourcemanager | ||
    - name: Decommission YARN Nodemanager | ||
      ansible.builtin.import_role: | ||
        name: tosit.tdp.utils.yarn_resourcemanager_decommissioning | ||
        tasks_from: main | ||
    - ansible.builtin.meta: clear_facts # noqa unnamed-task |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,9 @@ | ||
# Copyright 2022 TOSIT.IO | ||
# SPDX-License-Identifier: Apache-2.0 | ||
|
||
--- | ||
- ansible.builtin.import_playbook: ../yarn_capacity_scheduler.yml | ||
- ansible.builtin.import_playbook: hadoop-components-decommissioning/yarn_resourcemanager_decomm_nodemanager.yml | ||
# Decommission Yarn nodemanager | ||
- ansible.builtin.import_playbook: hadoop-components-decommissioning/hdfs_namenode_decomm_datanode.yml | ||
# Decommission HDFS datanode |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,36 @@ | ||
# Copyright 2022 TOSIT.IO | ||
# SPDX-License-Identifier: Apache-2.0 | ||
|
||
--- | ||
- name: Check yarn node status | ||
  ansible.builtin.command: yarn node -list -all | ||
  register: node_output | ||
  become: true | ||
  become_user: yarn | ||
  changed_when: false | ||
|
||
- name: Print output of node status | ||
  ansible.builtin.debug: | ||
    msg: "{{ node_output.stdout }}" | ||
|
||
- name: Check running applications on Yarn | ||
  ansible.builtin.command: yarn app -list | ||
  register: app_output | ||
  become: true | ||
  become_user: yarn | ||
  changed_when: false | ||
|
||
- name: Print output of running applications | ||
  ansible.builtin.debug: | ||
    msg: "{{ app_output.stdout }}" | ||
|
||
- name: Check HDFS datanodes usage | ||
  ansible.builtin.command: hdfs dfsadmin -report | ||
  register: storage_output | ||
  become: true | ||
  become_user: hdfs | ||
  changed_when: false | ||
|
||
- name: Print output of HDFS datanodes usage | ||
  ansible.builtin.debug: | ||
    msg: "{{ storage_output.stdout }}" |
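The README mentions `yarn app -status <application-id>` for inspecting a single application. A pair of tasks in the same style as the checks above might wrap that command. This is a sketch only: the variable `application_id` and the register name `app_status_output` are hypothetical and do not exist in this commit.

```yaml
# Hypothetical extra tasks; application_id must be supplied by the caller,
# for example with --extra-vars "application_id=<application-id>".
- name: Check the status of a single Yarn application
  ansible.builtin.command: yarn app -status {{ application_id }}
  register: app_status_output
  become: true
  become_user: yarn
  changed_when: false

- name: Print the status of the application
  ansible.builtin.debug:
    msg: "{{ app_status_output.stdout }}"
```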
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,5 @@ | ||
# Copyright 2022 TOSIT.IO | ||
# SPDX-License-Identifier: Apache-2.0 | ||
|
||
--- | ||
excluded_nodes: "{{ hdfs_datanodes_decommission }}" |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,41 @@ | ||
# Copyright 2022 TOSIT.IO | ||
# SPDX-License-Identifier: Apache-2.0 | ||
|
||
--- | ||
- name: Render dfs.exclude file | ||
  ansible.builtin.template: | ||
    src: dfs.exclude.j2 | ||
    dest: "{{ hdfs_site['dfs.hosts.exclude'] }}" | ||
    owner: root | ||
    group: root | ||
    mode: "644" | ||
|
||
- name: Update exclude nodes file | ||
  ansible.builtin.lineinfile: | ||
    path: "{{ hdfs_site['dfs.hosts.exclude'] }}" | ||
    line: "{{ item | tosit.tdp.access_fqdn(hostvars) }}" | ||
    state: present | ||
  loop: "{{ excluded_nodes }}" | ||
|
||
- name: kinit hdfs NN | ||
  ansible.builtin.command: kinit -kt /etc/security/keytabs/nn.service.keytab nn/{{ ansible_hostname | tosit.tdp.access_fqdn(hostvars) }}@{{ realm }} | ||
  become: true | ||
  become_user: hdfs | ||
  changed_when: false | ||
|
||
- name: RefreshNodes | ||
  ansible.builtin.command: hdfs dfsadmin -refreshNodes | ||
  become: true | ||
  become_user: hdfs | ||
  changed_when: false | ||
|
||
- name: Check node status | ||
  ansible.builtin.command: hdfs dfsadmin -report -decommissioning | ||
  register: hdfs_output | ||
  become: true | ||
  become_user: hdfs | ||
  changed_when: false | ||
|
||
- name: Print output of node status | ||
  ansible.builtin.debug: | ||
    msg: "{{ hdfs_output.stdout }}" |
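Since Datanode decommissioning can take hours, a polling task could follow the status check instead of re-running the playbook by hand. The sketch below is not part of this commit. It assumes the report prints `Decommission in progress` as the decommission status of a node that is still draining, and the `retries` and `delay` values (about two hours in total) are arbitrary.

```yaml
# Hypothetical follow-up task: poll the namenode until no datanode is
# reported as still being decommissioned. Fails if the retries run out.
- name: Wait for HDFS datanode decommissioning to finish
  ansible.builtin.command: hdfs dfsadmin -report -decommissioning
  register: hdfs_wait_output
  become: true
  become_user: hdfs
  changed_when: false
  # Assumption: nodes still draining show "Decommission in progress".
  until: "'Decommission in progress' not in hdfs_wait_output.stdout"
  retries: 120
  delay: 60
```

On a large file system the retry budget would need to be raised, or the task dropped in favour of the separate check playbook.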
3 changes: 3 additions & 0 deletions
roles/utils/hdfs_namenode_decommissioning/templates/dfs.exclude.j2
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,3 @@ | ||
{% for dn in hdfs_datanodes_decommission %} | ||
{{ dn }} | ||
{% endfor %} |
6 changes: 6 additions & 0 deletions
roles/utils/yarn_resourcemanager_decommissioning/defaults/main.yml
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
# Copyright 2022 TOSIT.IO | ||
# SPDX-License-Identifier: Apache-2.0 | ||
|
||
--- | ||
excluded_nodes: "{{ yarn_nodemanagers_decommission }}" | ||
timeout_seconds: "{{ graceful_decommission_timeout_seconds }}" |
41 changes: 41 additions & 0 deletions
roles/utils/yarn_resourcemanager_decommissioning/tasks/main.yml
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,41 @@ | ||
# Copyright 2022 TOSIT.IO | ||
# SPDX-License-Identifier: Apache-2.0 | ||
|
||
--- | ||
- name: Render yarn.exclude file | ||
  ansible.builtin.template: | ||
    src: yarn.exclude.j2 | ||
    dest: "{{ yarn_site['yarn.resourcemanager.nodes.exclude-path'] }}" | ||
    owner: root | ||
    group: root | ||
    mode: "644" | ||
|
||
- name: Update exclude nodes file | ||
  ansible.builtin.lineinfile: | ||
    path: "{{ yarn_site['yarn.resourcemanager.nodes.exclude-path'] }}" | ||
    line: "{{ item | tosit.tdp.access_fqdn(hostvars) }}" | ||
    state: present | ||
  loop: "{{ excluded_nodes }}" | ||
|
||
- name: kinit yarn RM | ||
  ansible.builtin.command: kinit -kt /etc/security/keytabs/rm.service.keytab rm/{{ ansible_hostname | tosit.tdp.access_fqdn(hostvars) }}@{{ realm }} | ||
  become: true | ||
  become_user: yarn | ||
  changed_when: false | ||
|
||
- name: RefreshNodes | ||
  ansible.builtin.command: yarn rmadmin -refreshNodes -g "{{ timeout_seconds }}" -server | ||
  become: true | ||
  become_user: yarn | ||
  changed_when: false | ||
|
||
- name: Check node status | ||
  ansible.builtin.command: yarn node -list -all | ||
  register: yarn_output | ||
  become: true | ||
  become_user: yarn | ||
  changed_when: false | ||
|
||
- name: Print output of node status | ||
  ansible.builtin.debug: | ||
    msg: "{{ yarn_output.stdout }}" |
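With a graceful timeout, Nodemanagers stay in a transitional state while their applications finish. A polling task along the following lines could wait for that state to clear. It is a sketch, not part of this commit: it assumes `yarn node -list -all` shows the state `DECOMMISSIONING` for such nodes, and the `retries` and `delay` values (about thirty minutes in total) are arbitrary.

```yaml
# Hypothetical follow-up task: poll the ResourceManager until no node is
# listed in the DECOMMISSIONING state. Fails if the retries run out.
- name: Wait for Yarn Nodemanager decommissioning to finish
  ansible.builtin.command: yarn node -list -all
  register: yarn_wait_output
  become: true
  become_user: yarn
  changed_when: false
  # Assumption: draining nodes are listed with the state DECOMMISSIONING.
  until: "'DECOMMISSIONING' not in yarn_wait_output.stdout"
  retries: 60
  delay: 30
```

With `graceful_decommission_timeout_seconds: -1` the wait has no upper bound on the Yarn side, so the retry budget here would decide when the task gives up.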
3 changes: 3 additions & 0 deletions
roles/utils/yarn_resourcemanager_decommissioning/templates/yarn.exclude.j2
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,3 @@ | ||
{% for nm in yarn_nodemanagers_decommission %} | ||
{{ nm }} | ||
{% endfor %} |