
[Bug]: fs.inotify.max_user_watches is too low #325

Open
pierreozoux opened this issue Mar 15, 2024 · 1 comment
Labels
bug Something isn't working
Milestone

Comments

@pierreozoux
Contributor

What happened

In some containers, we see logs like:

too many open files

To fix it, we run the following on the nodes:

sudo sysctl fs.inotify.max_user_instances=1280
sudo sysctl fs.inotify.max_user_watches=655360
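Note that `sysctl -w` style changes are lost on reboot. A sketch of persisting the same limits via a `sysctl.d` drop-in (the file name `99-inotify.conf` is an arbitrary choice, the values are the ones above):

```shell
#!/bin/sh
# Write a sysctl.d fragment with the raised inotify limits.
CONF=99-inotify.conf
cat > "$CONF" <<'EOF'
fs.inotify.max_user_instances = 1280
fs.inotify.max_user_watches = 655360
EOF
# Then install and apply (needs root):
#   sudo cp 99-inotify.conf /etc/sysctl.d/ && sudo sysctl --system
```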

Steps to reproduce

Install Loki and Promtail along with several pods, and you should see the issue.
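To confirm a node is running with the default limits, the live values can be read directly from `/proc` (the defaults vary by distribution and are often much lower than the values above):

```shell
# Current per-user inotify limits on this node.
cat /proc/sys/fs/inotify/max_user_instances
cat /proc/sys/fs/inotify/max_user_watches
```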

Expected to happen

The issue is well documented here:
https://gitlab.mim-libre.fr/rizomo/cluster-prod/-/issues/68

Add anything

My question is: what do you think is the best way to fix this?

Is it at the OMI level?

Is it in the cluster manifest? I could add a line of bash that would be executed when the node joins or when it restarts.

Or should it be some kind of DaemonSet?

What is your opinion on this?
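For the DaemonSet option, a common pattern (just a sketch, names and images are placeholders, not something this project ships) is a privileged init container that sets the sysctls on every node, followed by a pause container:

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: sysctl-inotify
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app: sysctl-inotify
  template:
    metadata:
      labels:
        app: sysctl-inotify
    spec:
      initContainers:
      - name: sysctl
        image: busybox:1.36
        securityContext:
          privileged: true
        command:
        - sh
        - -c
        - sysctl -w fs.inotify.max_user_instances=1280 fs.inotify.max_user_watches=655360
      containers:
      - name: pause
        image: registry.k8s.io/pause:3.9
```

The downside is that the limits are only raised once the DaemonSet pod has run, so workloads scheduled earlier on a fresh node can still hit the old limits.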

Thanks for your help!

cluster-api output

.

Environment

.
@pierreozoux pierreozoux added the bug Something isn't working label Mar 15, 2024
@outscale-hmi outscale-hmi added this to the next Release milestone Apr 22, 2024
@pierreozoux
Contributor Author

In the meantime, I found how to fix it locally on my cluster:

in the kubeadmconfigtemplates.bootstrap.cluster.x-k8s.io:

spec:
  template:
    metadata: {}
    spec:
      files:
      - content: |
          #!/bin/sh

          curl https://github.com/opencontainers/runc/releases/download/v1.1.1/runc.amd64 -Lo /tmp/runc.amd64
          chmod +x /tmp/runc.amd64
          cp -f /tmp/runc.amd64 /usr/local/sbin/runc
        owner: root:root
        path: /tmp/set_runc.sh
        permissions: "0744"
      format: cloud-config
      joinConfiguration:
        discovery: {}
        nodeRegistration:
          imagePullPolicy: IfNotPresent
          kubeletExtraArgs:
            cloud-provider: external
            node-labels: ingress=true
            provider-id: aws:///'{{ ds.meta_data.placement.availability_zone }}'/'{{
              ds.meta_data.instance_id }}'
          name: '{{ ds.meta_data.local_hostname }}'
      preKubeadmCommands:
      - sh /tmp/set_runc.sh
      - sudo sysctl fs.inotify.max_user_instances=1280
      - sudo sysctl fs.inotify.max_user_watches=655360
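Alternatively (a sketch, untested on this provider), the same KubeadmConfigTemplate could persist the limits through a `sysctl.d` file in `files` instead of setting them only at join time; note that `sudo` is unnecessary in `preKubeadmCommands`, which already run as root:

```yaml
spec:
  template:
    spec:
      files:
      - content: |
          fs.inotify.max_user_instances = 1280
          fs.inotify.max_user_watches = 655360
        owner: root:root
        path: /etc/sysctl.d/99-inotify.conf
        permissions: "0644"
      preKubeadmCommands:
      - sysctl --system
```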

I could also label the node the way I needed.

I'll let you decide whether to close this or keep it open for tracking.
