Automatic replacement of MC NotReady nodes #1541

whites11 · 2022-10-20T02:23:47Z

As we already do for WCs, we should automatically replace MC nodes when they become NotReady.

T-Kukawka · 2022-10-20T09:15:23Z

The ideal approach would be an operator that recycles nodes regardless of the provider. It could then be used for CAPI as well. The operator would have to be HA, since the node it runs on can itself be affected.
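To make the provider-independent part concrete, here is a minimal sketch of the detection step only. It assumes node objects shaped like the Kubernetes Node API (`status.conditions` with a `Ready` entry). The function names, the 10-minute grace period, and the node names are hypothetical and not taken from any existing operator. The actual replacement would still be provider-specific.

```python
from datetime import datetime, timedelta, timezone


def ready_condition(node):
    """Return the node's Ready condition dict, or None if it is missing."""
    for cond in node.get("status", {}).get("conditions", []):
        if cond.get("type") == "Ready":
            return cond
    return None


def nodes_to_recycle(nodes, now, grace=timedelta(minutes=10)):
    """Names of nodes whose Ready condition has not been "True" for longer than `grace`."""
    stale = []
    for node in nodes:
        cond = ready_condition(node)
        if cond is None:
            # No Ready condition reported at all: this sketch leaves that case alone.
            continue
        if cond["status"] == "True":
            continue
        # Kubernetes timestamps end in "Z"; older Python versions need an explicit offset.
        since = datetime.fromisoformat(cond["lastTransitionTime"].replace("Z", "+00:00"))
        if now - since > grace:
            stale.append(node["metadata"]["name"])
    return stale


def _node(name, status, since):
    """Build a minimal Node-like dict for the example below."""
    return {
        "metadata": {"name": name},
        "status": {"conditions": [
            {"type": "Ready", "status": status, "lastTransitionTime": since},
        ]},
    }


now = datetime(2022, 10, 20, 9, 30, tzinfo=timezone.utc)
nodes = [
    _node("master-0", "True", "2022-10-20T02:00:00Z"),
    _node("master-1", "Unknown", "2022-10-20T09:00:00Z"),  # not ready for 30 minutes
    _node("master-2", "False", "2022-10-20T09:25:00Z"),    # not ready for 5 minutes
]
print(nodes_to_recycle(nodes, now))  # ['master-1']
```

Keeping the decision a pure function of node state and time means any replica can run it, which fits the HA requirement.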

whites11 · 2022-10-20T09:25:59Z

@giantswarm/team-hydra is there any feature like this implemented in the CAPI world?

alex-dabija · 2022-10-20T09:32:53Z

CAPI does have the concept of health checks for machines. Unfortunately, it only works with machine deployments and not with machine pools because a machine resource CR (or machine set CR) needs to be present.

T-Kukawka · 2022-10-24T09:05:45Z

Waiting for CAPA to stabilize implementation of machinepools/machinedeployments

whites11 · 2023-04-06T14:56:53Z

Made some progress in giantnetes-terraform 14.12.0.
We now run a script on the masters that takes the node down if services are down.
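For readers who have not seen the script, the general idea can be sketched as below. This is not the giantnetes-terraform script itself. The unit names are placeholders, and the sketch stops at the decision because the thread does not say how the node is taken down. It assumes systemd, where `systemctl is-active --quiet` exits with 0 for an active unit.

```python
import subprocess

# Placeholder unit names; the thread does not say which services are checked.
REQUIRED_SERVICES = ("kubelet.service", "containerd.service")


def service_is_active(unit):
    """True if systemd reports the unit as active (exit code 0)."""
    result = subprocess.run(["systemctl", "is-active", "--quiet", unit], check=False)
    return result.returncode == 0


def failed_services(units, is_active=service_is_active):
    """Return the units that are not active, in the order given."""
    return [unit for unit in units if not is_active(unit)]


def should_take_node_down(units, is_active=service_is_active):
    """The node is considered unhealthy as soon as one required service is down."""
    return len(failed_services(units, is_active)) > 0


# Example with a fake probe, so it runs on a machine without systemd.
fake_state = {"kubelet.service": True, "containerd.service": False}
print(failed_services(REQUIRED_SERVICES, is_active=fake_state.get))
print(should_take_node_down(REQUIRED_SERVICES, is_active=fake_state.get))
```

On a real master, the default `service_is_active` probe would be used instead of the fake one.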

nprokopic · 2023-05-08T11:38:43Z

In CAPI all MCs will also be WCs at the same time (with all CAPI CRs, and managed by CAPI), so it will be possible to use MachineHealthChecks (however well or badly they work).

Currently it should be possible to use MachineHealthChecks for control plane nodes and for workers created with machine deployments (CAPG and CAPZ, and I think all on-prem providers).

When it comes to MachinePool support (currently used in CAPA), there is an open PR to implement MachinePoolMachine, which should then make it possible to implement MachineHealthChecks for MachinePool as well.
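For the control plane case mentioned above, a MachineHealthCheck could look roughly like the manifest built below. This is a sketch that assumes the `cluster.x-k8s.io/v1beta1` API. The cluster name, namespace, timeout, and `maxUnhealthy` value are made up for illustration, and the helper function is hypothetical.

```python
import json


def machine_health_check(cluster_name, namespace, match_labels,
                         timeout="10m", max_unhealthy="40%"):
    """Build a MachineHealthCheck manifest as a plain dict."""
    return {
        "apiVersion": "cluster.x-k8s.io/v1beta1",
        "kind": "MachineHealthCheck",
        "metadata": {
            "name": f"{cluster_name}-node-unhealthy",
            "namespace": namespace,
        },
        "spec": {
            "clusterName": cluster_name,
            # Stop remediating if too many selected machines are unhealthy at once.
            "maxUnhealthy": max_unhealthy,
            "selector": {"matchLabels": match_labels},
            # A machine is unhealthy when its node's Ready condition stays
            # False or Unknown for longer than the timeout.
            "unhealthyConditions": [
                {"type": "Ready", "status": "False", "timeout": timeout},
                {"type": "Ready", "status": "Unknown", "timeout": timeout},
            ],
        },
    }


# Select control plane machines through the label CAPI sets on them.
mhc = machine_health_check(
    cluster_name="example-mc",   # illustrative name
    namespace="org-example",     # illustrative namespace
    match_labels={"cluster.x-k8s.io/control-plane": ""},
)
print(json.dumps(mhc, indent=2))
```

For workers in a machine deployment, the selector would instead match the `cluster.x-k8s.io/deployment-name` label.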

fiunchinho · 2023-05-08T11:52:17Z

CAPA is currently using it for the control plane nodes.

fiunchinho · 2024-10-07T21:00:36Z

Nowadays I think this is a duplicate of https://github.com/giantswarm/giantswarm/issues/28006. I think I'd close this one @T-Kukawka

T-Kukawka · 2024-10-09T07:17:27Z

agreed, closing

whites11 changed the title ~~Automatic replacement of NotReady nodes~~ Automatic replacement of MC NotReady nodes Oct 20, 2022

T-Kukawka closed this as completed Oct 9, 2024
