
Machine pool nodes are not drained during upgrade #2170

Closed
bdehri opened this issue Mar 20, 2023 · 5 comments
Assignees: mnitchev
Labels: area/kaas, kind/bug, provider/cluster-api-aws, topic/capi

Comments


bdehri commented Mar 20, 2023

Issue

Machine pool nodes are not drained during a cluster upgrade.

It's probably the root cause of #1993.

@calvix calvix changed the title from "during and machine pool update, the nodes are tear down without draining" to "during a machine pool update, the nodes are tear down without draining" Mar 21, 2023

fiunchinho commented Mar 22, 2023

I did a bit of research on the topic. Draining of worker nodes when using AWSMachinePools is not currently implemented in CAPA. The upstream issue tracking this is "AWSMachinePool graceful scale down".

That feature is blocked by the upstream issue "Graduation of EventBridge Feature", and there is an ADR explaining the proposal.

There was a PR to implement the feature, but it was closed because of the aforementioned "Graduation of EventBridge Feature" issue.

I believe the final implementation would involve deploying this component in our clusters: https://github.com/aws/aws-node-termination-handler
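
For illustration, a rough sketch of what enabling that component's queue-processor mode might look like through its Helm chart values; the eks-charts chart and value names are my assumption of the likely setup, and the queue URL is a placeholder:

```yaml
# Hypothetical values.yaml for the aws-node-termination-handler Helm chart
# (eks-charts). Queue-processor mode drains nodes based on ASG lifecycle /
# EventBridge events delivered through an SQS queue.
enableSqsTerminationDraining: true
# Placeholder queue URL; in practice this would point at the SQS queue
# wired up to the EventBridge rules for the cluster's ASGs.
queueURL: "https://sqs.eu-west-1.amazonaws.com/111111111111/example-nth-queue"
awsRegion: "eu-west-1"
```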

@fiunchinho

There is also kubernetes-sigs/cluster-api-provider-aws#2023


fiunchinho commented Mar 23, 2023

There is a k8s built-in mechanism that we could try: https://kubernetes.io/blog/2021/04/21/graceful-node-shutdown-beta/

The settings needed to configure it (ShutdownGracePeriod and ShutdownGracePeriodCriticalPods) are only available in the kubelet configuration file, not as kubelet command-line flags, so they can't be set through kubeletExtraArgs. That means it wouldn't work with our current approach of configuring the kubelet via kubeadm and the kubeletExtraArgs field. I opened this upstream: kubernetes-sigs/cluster-api#8348
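
For reference, a minimal sketch of where those settings live in the kubelet configuration file (the durations are illustrative values, not what we would necessarily ship):

```yaml
# KubeletConfiguration with graceful node shutdown settings; values are
# illustrative only.
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
# Total time the node delays shutdown so the kubelet can terminate pods.
shutdownGracePeriod: 5m
# Portion of shutdownGracePeriod reserved for critical pods.
shutdownGracePeriodCriticalPods: 1m
```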

@alex-dabija alex-dabija added the area/kaas, team/hydra, topic/capi, provider/cluster-api-aws and kind/bug labels Mar 27, 2023
@alex-dabija alex-dabija changed the title from "during a machine pool update, the nodes are tear down without draining" to "Machine pool nodes are not drain during upgrade" Mar 27, 2023
@mnitchev mnitchev self-assigned this Mar 29, 2023
@mnitchev

Regarding the above ShutdownGracePeriod: kubeadm allows patching the kubelet configuration (see here), but that was only added in 1.25. I'll see if this can be done with a pre/post kubeadm command.
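
As a rough sketch (not our final implementation), the kubeadm patches mechanism could be wired up from a KubeadmConfigTemplate by writing a patch file to the node and pointing the join configuration at the patches directory; the paths and values below are assumptions:

```yaml
# Hypothetical KubeadmConfigTemplate excerpt. kubeadm >= 1.25 supports a
# "kubeletconfiguration" patch target, which merges into the generated
# kubelet config.
spec:
  template:
    spec:
      files:
        - path: /etc/kubernetes/patches/kubeletconfiguration+strategic.yaml
          permissions: "0644"
          content: |
            apiVersion: kubelet.config.k8s.io/v1beta1
            kind: KubeletConfiguration
            shutdownGracePeriod: 5m
            shutdownGracePeriodCriticalPods: 1m
      joinConfiguration:
        patches:
          directory: /etc/kubernetes/patches
```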

@mnitchev mnitchev changed the title from "Machine pool nodes are not drain during upgrade" to "Machine pool nodes are not drained during upgrade" Apr 3, 2023

mnitchev commented Apr 6, 2023

Released in [email protected]
There is a bit more info on how it works in the PR: giantswarm/cluster-aws#276
