Automate cloud status prior to upgrade #366

BjoernT · 2019-05-30T15:59:34Z

Based on discussions with @galstrom21, I think it would be beneficial to automate the environment checks I usually do before scheduling upgrades:

Are the nova, neutron, cinder, and heat services checking in (service-list and agent health commands)?
Is VXLAN in use while neutron still runs in containers?
Hosts configured in the OSA inventory but not accessible
Custom cinder drivers in use that are not included upstream
Hardware not on Ubuntu HCL
Open MAAS/Monitoring alerts
Customizations like forked OSA playbooks/code
Customizations of OpenStack policies
When OS patches were last installed and how many are missing
Galera and RabbitMQ cluster checks are mandatory
Mismatch between the endpoint catalog and external_lb_vip_address/internal_lb_vip_address
Are Holland backups working (or is the control plane backed up by other means)?
Do multiple containers of the same type exist without being used?
Zombie processes
iSCSI errors on nova-compute consoles
Did any API service report recent errors?
Intrusive SSH options, such as session limits, in sshd_config or authorized_keys
git repo origin doesn't point to upstream
git repo branch tracking is misconfigured
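
To make two of the simpler items concrete (zombie processes and the git origin check), here is a rough Python sketch. It is only an illustration, not existing tooling: the function names are made up, and the clone path `/opt/openstack-ansible` and the upstream URL are assumptions that would need to match the actual deployment.

```python
import subprocess

# Assumptions for illustration only: adjust to the real deployment.
ASSUMED_REPO_PATH = "/opt/openstack-ansible"
ASSUMED_UPSTREAM = "https://github.com/openstack/openstack-ansible"


def find_zombies(ps_output: str) -> list[tuple[int, str]]:
    """Return (pid, command) for each process whose state starts with 'Z'.

    Expects the output of `ps -eo pid,stat,comm`.
    """
    zombies = []
    for line in ps_output.splitlines():
        parts = line.split(None, 2)
        if len(parts) < 3 or not parts[0].isdigit():
            continue  # skip the header row and malformed lines
        pid, stat, comm = parts
        if stat.startswith("Z"):
            zombies.append((int(pid), comm))
    return zombies


def origin_matches_upstream(remote_url: str, upstream_url: str) -> bool:
    """Compare two git URLs, ignoring case, a trailing slash, and '.git'."""

    def normalize(url: str) -> str:
        url = url.strip().rstrip("/")
        if url.endswith(".git"):
            url = url[: -len(".git")]
        return url.lower()

    return normalize(remote_url) == normalize(upstream_url)


def collect_findings() -> list[str]:
    """Run both checks on the local host and return readable findings."""
    findings = []
    ps_out = subprocess.run(
        ["ps", "-eo", "pid,stat,comm"],
        capture_output=True, text=True, check=True,
    ).stdout
    for pid, comm in find_zombies(ps_out):
        findings.append(f"zombie process {pid} ({comm})")

    origin = subprocess.run(
        ["git", "-C", ASSUMED_REPO_PATH, "config", "--get", "remote.origin.url"],
        capture_output=True, text=True, check=True,
    ).stdout
    if not origin_matches_upstream(origin, ASSUMED_UPSTREAM):
        findings.append(f"git origin is {origin.strip()}, not upstream")
    return findings
```

Calling `collect_findings()` on a deployment host would return a list of strings to include in a pre-upgrade report. The parsing is kept separate from the subprocess calls so it can be tested without a live environment.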

antonym · 2019-05-30T17:13:17Z

#363 splits out the preflight checks so that both leap and incremental upgrades can use them, and then we could build on top of that framework.
I'd imagine a lot of those are just automating the manual steps in the runbook docs.

There are also some healthcheck plays we might be able to leverage for service checks so that we don't have to write everything from scratch:

https://github.com/openstack/openstack-ansible/blob/stable/stein/playbooks/healthcheck-hosts.yml
https://github.com/openstack/openstack-ansible/blob/stable/stein/playbooks/healthcheck-infrastructure.yml
https://github.com/openstack/openstack-ansible/blob/stable/stein/playbooks/healthcheck-openstack.yml
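
As a rough idea of how a preflight step could wrap those three plays, here is a small Python sketch. It assumes the `openstack-ansible` wrapper command is on the PATH and that the playbooks live under `/opt/openstack-ansible/playbooks`; both are assumptions about the deployment host, and the function names are hypothetical.

```python
import subprocess
from typing import Callable, Sequence

# Playbook file names taken from the links above.
HEALTHCHECK_PLAYBOOKS = (
    "healthcheck-hosts.yml",
    "healthcheck-infrastructure.yml",
    "healthcheck-openstack.yml",
)

# Assumption: location of the playbooks on the deployment host.
ASSUMED_PLAYBOOK_DIR = "/opt/openstack-ansible/playbooks"


def run_playbook(cmd: Sequence[str]) -> int:
    """Run one command in the assumed playbook directory; return its exit code."""
    return subprocess.run(list(cmd), cwd=ASSUMED_PLAYBOOK_DIR).returncode


def run_healthchecks(
    runner: Callable[[Sequence[str]], int] = run_playbook,
) -> dict[str, bool]:
    """Run every healthcheck playbook and map its name to pass/fail."""
    return {
        playbook: runner(["openstack-ansible", playbook]) == 0
        for playbook in HEALTHCHECK_PLAYBOOKS
    }
```

The `runner` argument is there so the pass/fail bookkeeping can be exercised with a fake runner, without touching a real environment.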
