
kube-proxy service start failure on boot #626

Closed
vmpjdc opened this issue Aug 26, 2024 · 3 comments
Comments

@vmpjdc

vmpjdc commented Aug 26, 2024

Summary

I've observed a couple of times that when a k8s unit reboots, kube-proxy fails to start. This breaks many functions of the node with somewhat perplexing symptoms.

The core of the problem appears to be:

Aug 24 00:33:53 juju-bd78f7-stg-netbox-30 k8s.kube-proxy[611]: + exec /snap/k8s/313/bin/kube-proxy --cluster-cidr=10.1.0.0/16 --healthz-bind-address=127.0.0.1 --hostname-override=juju-bd78f7-stg-netbox-30 --kubeconfig=/etc/kubernetes/proxy.conf --profiling=false
Aug 24 00:33:53 juju-bd78f7-stg-netbox-30 k8s.kube-proxy[611]: I0824 00:33:53.621814     611 server_linux.go:69] "Using iptables proxy"
Aug 24 00:33:53 juju-bd78f7-stg-netbox-30 k8s.kube-proxy[611]: I0824 00:33:53.651253     611 server.go:1062] "Successfully retrieved node IP(s)" IPs=["10.142.102.91"]
Aug 24 00:33:53 juju-bd78f7-stg-netbox-30 k8s.kube-proxy[611]: I0824 00:33:53.652936     611 conntrack.go:119] "Set sysctl" entry="net/netfilter/nf_conntrack_max" value=131072
Aug 24 00:33:53 juju-bd78f7-stg-netbox-30 k8s.kube-proxy[611]: E0824 00:33:53.653060     611 server.go:558] "Error running ProxyServer" err="open /proc/sys/net/netfilter/nf_conntrack_max: no such file or directory"
Aug 24 00:33:53 juju-bd78f7-stg-netbox-30 k8s.kube-proxy[611]: E0824 00:33:53.653169     611 run.go:74] "command failed" err="open /proc/sys/net/netfilter/nf_conntrack_max: no such file or directory"
Aug 24 00:33:53 juju-bd78f7-stg-netbox-30 systemd[1]: snap.k8s.kube-proxy.service: Main process exited, code=exited, status=1/FAILURE

That is, kube-proxy tries to configure conntrack before the kernel module has loaded.

Here's a gist with the full output from boot (line 319 is where I started it manually): https://gist.github.com/vmpjdc/06913c8125814eb98f8ebda3fd356ab2
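A quick way to confirm this race on an affected node is to check for the sysctl path that kube-proxy fails on. This is a diagnostic sketch (not from the original report); the path is the one shown in the error above:

```shell
# kube-proxy exits with ENOENT when this sysctl path is absent,
# which is the case until the nf_conntrack kernel module is loaded.
SYSCTL=/proc/sys/net/netfilter/nf_conntrack_max
if [ -e "$SYSCTL" ]; then
    echo "nf_conntrack loaded, nf_conntrack_max=$(cat "$SYSCTL")"
else
    echo "nf_conntrack not loaded; kube-proxy would fail as above"
fi
```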

What Should Happen Instead?

The kube-proxy service should start reliably on boot.

Reproduction Steps

Deploy k8s using Juju: juju deploy -n3 --channel 1.30/beta --constraints 'mem=8G root-disk=50G cores=2' k8s

Optionally, deploy some services into the cluster with Juju.

Reboot a k8s unit.

Observe that kube-proxy did not start (Current = inactive):

$ juju exec -u k8s/0 -- snap services k8s.kube-proxy
Service         Startup   Current   Notes
k8s.kube-proxy  enabled   inactive  -
$ _

(I'm not sure whether this happens every single time.)

Run kubectl get pods -A and observe that some pods (probably a cilium pod, maybe others) are in non-Running states, e.g. Unknown, Terminating, etc.

Start kube-proxy: juju exec -u k8s/0 -- snap start k8s.kube-proxy

Observe that the cluster recovers. (If it does not, delete the affected pods and they should respawn quickly.)

System information

The system information script does not exist. Here's what the charm installed:

installed:                v1.30.0            (313) 109MB classic,held

Can you suggest a fix?

If it's possible to customize the systemd units that snapd creates, allowing more retries would probably work around the problem. In the meantime I can work around it locally by installing a suitable override myself; and, for that matter, so could the charm.

Are you interested in contributing with a fix?

No response

@bschimke95
Contributor

Hi @vmpjdc

Thanks for reporting this. We'll look into this.
In the meantime you might get around this issue by forcefully loading the module:

echo nf_conntrack | sudo tee /etc/modules-load.d/nf_conntrack.conf

See also: canonical/microk8s#4462 (comment)
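For completeness, a sketch of the full workaround: persist the module across boots, load it immediately, and verify that the sysctl kube-proxy needs now exists. (Assumes root; on Ubuntu/Debian the module ships in the linux-modules package.)

```shell
# Persist the module load across reboots via modules-load.d:
echo nf_conntrack | sudo tee /etc/modules-load.d/nf_conntrack.conf

# Load it now rather than waiting for the next boot:
sudo modprobe nf_conntrack

# Verify the module is loaded and the sysctl path exists:
lsmod | grep -q '^nf_conntrack' && echo "nf_conntrack loaded"
cat /proc/sys/net/netfilter/nf_conntrack_max
```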

@vmpjdc
Author

vmpjdc commented Oct 2, 2024

I've been testing the following systemd drop-in config, and it seems to work well so far:

[Unit]
StartLimitIntervalSec=0

[Service]
Restart=always
RestartSec=1s
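One way to install a drop-in like this for the snap-generated unit is via the standard systemd drop-in directory (paths below assume the default unit name snap.k8s.kube-proxy.service; adjust as needed, and run as root):

```shell
# Create the drop-in directory for the snap's kube-proxy unit and
# write the override, then reload systemd so it takes effect.
sudo mkdir -p /etc/systemd/system/snap.k8s.kube-proxy.service.d
sudo tee /etc/systemd/system/snap.k8s.kube-proxy.service.d/override.conf <<'EOF'
[Unit]
StartLimitIntervalSec=0

[Service]
Restart=always
RestartSec=1s
EOF
sudo systemctl daemon-reload
sudo systemctl restart snap.k8s.kube-proxy.service
```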

@aznashwan
Contributor

Hey @vmpjdc, and thanks a lot for your report!

Glad to announce that yesterday we merged a patch which should alleviate this issue: #743

NOTE: you will still need to ensure the nf_conntrack module is installed on your host (the Debian package name is linux-modules-$(uname -r) if you're running on Ubuntu/Debian), as the patch only tries to load it on kube-proxy startup.

You should already see it in the edge channel if you'd like to test it, and it will definitely be included in the stable tracks of the snap when those are published.
