dqlite writing huge amounts of data to local drives #3064

Open
b0n0r opened this issue Apr 14, 2022 · 24 comments

@b0n0r

b0n0r commented Apr 14, 2022

Running a 3 node microk8s instance on baremetal commodity HW, Ubuntu 20.04.

Started out with 2 nodes, adding a 3rd one later in the process, enabling HA.

What I am seeing is very strange behaviour: the dqlite process is wearing out local storage rather quickly.

A particular node has been running for roughly 2 weeks and the dqlite process has written 30TB+ to the drive(s).

My pinpointing process went like this:

  • uptime for reference
    03:47:27 up 13 days, 28 min, 1 user, load average: 2.56, 3.33, 2.74

  • check all processes in /proc and grep for bytes written, sort numerically:
    grep ^write_bytes /proc/*/io | awk '{print $2" "$1}'|sort -n
    the process with the most bytes written in my case:
    37019017576448 /proc/1092/io:write_bytes:
    a quick conversion puts this at roughly 33 terabytes, which is an exceedingly large amount of writes (a combined sketch of this check is shown after this list)

  • putting a face to a name (what is this PID)
    ps auwx|grep 1092
    output in my case:
    root 1092 14.9 4.0 11209648 5295232 ? Ssl Apr01 2810:47 /snap/microk8s/3052/bin/k8s-dqlite --storage-dir=/var/snap/microk8s/3052/var/kubernetes/backend/ --listen=unix:///var/snap/microk8s/3052/var/kubernetes/backend/kine.sock:12379

  • size of the storage-dir of the dqlite process: du -hs /var/snap/microk8s/3052/var/kubernetes/backend/
    3.5G /var/snap/microk8s/3052/var/kubernetes/backend/

  • kubectl get nodes seems fine

$ microk8s.kubectl get nodes -o wide
NAME              STATUS   ROLES    AGE    VERSION                    INTERNAL-IP   EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION      CONTAINER-RUNTIME
testnode-k8s-n2   Ready    <none>   142d   v1.22.8-3+f833e44163b5b1   xxx.x.x.xxx   <none>        Ubuntu 20.04.3 LTS   5.4.0-91-generic    containerd://1.5.2
testnode-k8s-n1   Ready    <none>   142d   v1.22.8-3+f833e44163b5b1   xxx.x.x.xxx   <none>        Ubuntu 20.04.3 LTS   5.4.0-91-generic    containerd://1.5.2
testnode-k8s-n3   Ready    <none>   16d    v1.23.5-2+c812603a312d2b   xxx.x.x.xxx   <none>        Ubuntu 20.04.4 LTS   5.4.0-107-generic   containerd://1.5.9
  • microk8s status
$ microk8s status
microk8s is running
high-availability: yes
  datastore master nodes: xxx.x.x.xxx:19001 xxx.x.x.xxx:19001 xxx.x.x.xxx:19001
  datastore standby nodes: none
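
As mentioned above, here is a rough sketch that combines the write_bytes check and the byte-to-TiB conversion into one script (assumes a standard Linux /proc layout and root access so that every /proc/<pid>/io is readable):

#!/bin/bash
# Sketch: list the top 5 processes by bytes written, with an approximate TiB conversion.
grep -s ^write_bytes /proc/[0-9]*/io \
  | awk -F'[/: ]' '{print $NF, $3}' \
  | sort -rn | head -5 \
  | while read bytes pid; do
      cmd=$(tr '\0' ' ' < /proc/$pid/cmdline 2>/dev/null)
      tib=$(awk -v b="$bytes" 'BEGIN {printf "%.1f", b / (1024^4)}')
      echo "pid=$pid wrote $bytes bytes (~$tib TiB): $cmd"
    done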

just ran this again for reference and got a weird permission denied error for the dqlite storage-dir, though; no idea why (race condition maybe?). All files are owned by root:microk8s with 660 perms:

$ microk8s status
microk8s is running
Error: open servers store: open /var/snap/microk8s/3052/var/kubernetes/backend/cluster.yaml: permission denied
Usage:
  dqlite -s <servers> <database> [command] [flags]

Flags:
  -c, --cert string       public TLS cert
  -f, --format string     output format (tabular, json) (default "tabular")
  -h, --help              help for dqlite
  -k, --key string        private TLS key
  -s, --servers strings   comma-separated list of db servers, or file://<store>

high-availability: yes
  datastore master nodes: xxx.x.x.xxx:19001 xxx.x.x.xxx:19001 xxx.x.x.xxx:19001
  datastore standby nodes: none

What I also see is that the dqlite process is only running on 1 of the 3 nodes - is that expected behaviour?

Please run microk8s inspect and attach the generated tarball to this issue.

Would like to, but I went through the generated tarball and it seems to include a lot of potentially sensitive (internal) data. Is there a way to redact this or somehow share it so it does not become public?

Thank you for microk8s :)

@MathieuBordere

MathieuBordere commented Apr 14, 2022

Can you show the result of ls -alh /var/snap/microk8s/3052/var/kubernetes/backend/?

@petermihalik

Hi Mathieu,

I'm working with @b0n0r on this issue, so here is the requested output:

total 3.5G
drwxrwxr-x 2 root microk8s 4.0K Apr 14 09:30 .
drwxr-xr-x 4 root root     4.0K Mar 28 10:17 ..
-rw-rw---- 1 root microk8s 8.0M Apr 14 09:17 0000000112380813-0000000112381268
-rw-rw---- 1 root microk8s 8.0M Apr 14 09:19 0000000112381269-0000000112381618
-rw-rw---- 1 root microk8s 8.0M Apr 14 09:19 0000000112381619-0000000112382182
-rw-rw---- 1 root microk8s 8.0M Apr 14 09:19 0000000112382183-0000000112382755
-rw-rw---- 1 root microk8s 8.0M Apr 14 09:20 0000000112382756-0000000112383313
-rw-rw---- 1 root microk8s 8.0M Apr 14 09:21 0000000112383314-0000000112383729
-rw-rw---- 1 root microk8s 8.0M Apr 14 09:21 0000000112383730-0000000112384325
-rw-rw---- 1 root microk8s 8.0M Apr 14 09:23 0000000112384326-0000000112384680
-rw-rw---- 1 root microk8s 8.0M Apr 14 09:24 0000000112384681-0000000112385134
-rw-rw---- 1 root microk8s 8.0M Apr 14 09:24 0000000112385135-0000000112385705
-rw-rw---- 1 root microk8s 8.0M Apr 14 09:24 0000000112385706-0000000112386273
-rw-rw---- 1 root microk8s 8.0M Apr 14 09:25 0000000112386274-0000000112386738
-rw-rw---- 1 root microk8s 8.0M Apr 14 09:26 0000000112386739-0000000112387234
-rw-rw---- 1 root microk8s 8.0M Apr 14 09:27 0000000112387235-0000000112387738
-rw-rw---- 1 root microk8s 8.0M Apr 14 09:28 0000000112387739-0000000112388069
-rw-rw---- 1 root microk8s 8.0M Apr 14 09:29 0000000112388070-0000000112388613
-rw-rw---- 1 root microk8s 8.0M Apr 14 09:29 0000000112388614-0000000112389177
-rw-rw---- 1 root microk8s 8.0M Apr 14 09:30 0000000112389178-0000000112389759
-rw-rw---- 1 root microk8s 1.9K Mar 28 10:17 cluster.crt
-rw-rw---- 1 root microk8s 3.2K Mar 28 10:17 cluster.key
-rw-rw---- 1 root microk8s  196 Apr 14 09:30 cluster.yaml
-rw-rw-r-- 1 root microk8s    2 Apr 14 05:55 failure-domain
-rw-rw---- 1 root microk8s   60 Mar 28 10:17 info.yaml
srw-rw---- 1 root microk8s    0 Apr  1 07:19 kine.sock:12379
-rw-rw---- 1 root microk8s   66 Apr  1 07:19 localnode.yaml
-rw-rw---- 1 root microk8s   32 Apr 10 15:09 metadata1
-rw-rw---- 1 root microk8s   32 Apr 10 15:09 metadata2
-rw-rw---- 1 root microk8s 8.0M Apr 14 09:30 open-27604
-rw-rw---- 1 root microk8s 8.0M Apr 14 09:29 open-27605
-rw-rw---- 1 root microk8s 8.0M Apr 14 09:30 open-27606
-rw-rw---- 1 root microk8s 1.7G Apr 14 09:29 snapshot-15-112388421-1131045106
-rw-rw---- 1 root microk8s  128 Apr 14 09:29 snapshot-15-112388421-1131045106.meta
-rw-rw---- 1 root microk8s 1.7G Apr 14 09:30 snapshot-15-112389445-1131085779
-rw-rw---- 1 root microk8s  128 Apr 14 09:30 snapshot-15-112389445-1131085779.meta

@MathieuBordere

Based on this information, your system writes roughly 112389759 - 112380813 = 8946 raft log entries every 12 minutes (112380813 is from file 0000000112380813-0000000112381268 and 112389759 is from file 0000000112389178-0000000112389759), which is roughly 45000 entries every hour, or about 1 million entries per day.
Snapshots are written roughly every 1000 entries, so that means roughly 1000 snapshots per day; with a snapshot size of 1.7GB, that is about 1.7TB of snapshot writes per day, and times 14 for 2 weeks is roughly 24TB, which is broadly in line with the 30TB you are seeing. The deviation is probably because we only see a 12-minute window of the data; it might have been a bit calmer during this period.
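
Spelled out as a quick sketch with shell arithmetic (rounded figures taken from the listing above):

# Back-of-the-envelope estimate of the snapshot write volume (integer approximations).
entries=$((112389759 - 112380813))      # ~8946 raft entries in the ~12 minute window
per_day=$((entries * 5 * 24))           # ~1.07 million entries per day
snapshots_per_day=$((per_day / 1000))   # one snapshot roughly every 1000 entries -> ~1000/day
echo "~$((snapshots_per_day * 17 / 10)) GB of snapshot writes per day"   # x 1.7 GB per snapshot -> ~1.8 TB/day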

dqlite has a way to decrease the frequency of snapshotting, and the microk8s team has been made aware of how to do this. Decreasing the snapshot frequency would decrease the total amount of data written to disk.

@ktsakalozos
Member

Hi @petermihalik,

What could you share about the k8s workload that produces this amount of write operations? I wonder if we could use it as a benchmark for future improvements?

Decreasing the frequency of snapshots would increase the memory footprint used by dqlite. Could you share how much memory dqlite uses right now and how much free memory your system has?

Currently we do not expose the option of manually configuring the raft log threshold size that triggers the snapshot. Would you consider using an external etcd cluster [1] as an option?

[1] https://microk8s.io/docs/external-etcd

@b0n0r
Author

b0n0r commented Apr 19, 2022

hi @ktsakalozos and the team,

this is a mixed workload running some batch processing deployments as well as some simple microservices for text language detection, etc. Nothing too exotic, I think.

What are the steps for me to pinpoint the source of so many write operations, i.e. which part of the workload is generating this?

The dqlite process (still running on only 1 out of 3 nodes) is currently consuming 5.8% of memory according to ps, while these systems have 128G of memory available.

more detailed output for cat /proc/<PID>/status:

$ cat /proc/<PID>/status
Name:	k8s-dqlite
...
VmPeak:	15564948 kB
VmSize:	11507924 kB
VmLck:	       0 kB
VmPin:	       0 kB
VmHWM:	11653992 kB
VmRSS:	 7764292 kB
RssAnon:	 7744276 kB
RssFile:	   20016 kB
RssShmem:	       0 kB
VmData:	 8178700 kB
VmStk:	     132 kB
VmExe:	   18108 kB
VmLib:	    5736 kB
VmPTE:	   15756 kB
VmSwap:	       0 kB
HugetlbPages:	       0 kB
CoreDumping:	0
THP_enabled:	1
Threads:	43
SigQ:	0/514555
...
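
For completeness, a quick way to pull just the memory figures (a sketch; uses pgrep to find the k8s-dqlite PID):

# Sketch: report peak, high-water-mark, and resident memory of the k8s-dqlite process.
pid=$(pgrep -o k8s-dqlite)
grep -E '^(VmPeak|VmHWM|VmRSS):' /proc/$pid/status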

If external etcd is my only option to save the SSDs, I will have to give it a try, I guess? Are there any other options apart from this? My concern is that this behaviour is killing the drives, and since this is a PoC cluster, the consumer drives are wearing out very quickly.

@nilswieber

I'm facing the same problem. dqlite is generating a lot of disk I/O, which is producing wear on the SSDs on which microk8s is running. I reduced the cluster to a single node in order to reduce the impact.

@b0n0r
Author

b0n0r commented Nov 14, 2022

I'm facing the same problem. dqlite is generating a lot of disk I/O, which is producing wear on the SSDs on which microk8s is running. I reduced the cluster to a single node in order to reduce the impact.

try disabling the "prometheus" microk8s addon: microk8s disable prometheus and see if that helps?

@benben

benben commented Dec 1, 2022

Can confirm this behavior on a fresh microk8s cluster, not running any workloads. Virtualization is Proxmox. I've attached some debug info including the inspect tarball.

# microk8s version
MicroK8s v1.25.4 revision 4221
# uname -a
Linux hostname 5.15.0-53-generic #59-Ubuntu SMP Mon Oct 17 18:53:30 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
# microk8s status
microk8s is running
high-availability: yes
  datastore master nodes: 10.0.0.1:19001 10.0.0.2:19001 10.0.0.3:19001
  datastore standby nodes: none
addons:
  enabled:
    ha-cluster           # (core) Configure high availability on the current node
    helm                 # (core) Helm - the package manager for Kubernetes
    helm3                # (core) Helm 3 - the package manager for Kubernetes
  disabled:
    cert-manager         # (core) Cloud native certificate management
    community            # (core) The community addons repository
    dashboard            # (core) The Kubernetes dashboard
    dns                  # (core) CoreDNS
    gpu                  # (core) Automatic enablement of Nvidia CUDA
    host-access          # (core) Allow Pods connecting to Host services smoothly
    hostpath-storage     # (core) Storage class; allocates storage from host directory
    ingress              # (core) Ingress controller for external access
    kube-ovn             # (core) An advanced network fabric for Kubernetes
    mayastor             # (core) OpenEBS MayaStor
    metallb              # (core) Loadbalancer for your Kubernetes cluster
    metrics-server       # (core) K8s Metrics Server for API access to service metrics
    observability        # (core) A lightweight observability stack for logs, traces and metrics
    prometheus           # (core) Prometheus operator for monitoring and logging
    rbac                 # (core) Role-Based Access Control for authorisation
    registry             # (core) Private image registry exposed on localhost:32000
    storage              # (core) Alias to hostpath-storage add-on, deprecated

inspection-report-20221201_100122.tar.gz

@vinayan3

vinayan3 commented Jan 5, 2023

@ktsakalozos are there any updates? I'm also facing this issue and would like a way to reduce the number of writes to the SSD to reduce wear, just like the other folks on this thread.

microk8s status
microk8s is running
high-availability: no
  datastore master nodes: 127.0.0.1:19001
  datastore standby nodes: none
addons:
  enabled:
    cert-manager         # (core) Cloud native certificate management
    dashboard            # (core) The Kubernetes dashboard
    dns                  # (core) CoreDNS
    gpu                  # (core) Automatic enablement of Nvidia CUDA
    ha-cluster           # (core) Configure high availability on the current node
    helm                 # (core) Helm - the package manager for Kubernetes
    helm3                # (core) Helm 3 - the package manager for Kubernetes
    hostpath-storage     # (core) Storage class; allocates storage from host directory
    metrics-server       # (core) K8s Metrics Server for API access to service metrics
    rbac                 # (core) Role-Based Access Control for authorisation
    registry             # (core) Private image registry exposed on localhost:32000
    storage              # (core) Alias to hostpath-storage add-on, deprecated
  disabled:
    community            # (core) The community addons repository
    host-access          # (core) Allow Pods connecting to Host services smoothly
    ingress              # (core) Ingress controller for external access
    kube-ovn             # (core) An advanced network fabric for Kubernetes
    mayastor             # (core) OpenEBS MayaStor
    metallb              # (core) Loadbalancer for your Kubernetes cluster
    observability        # (core) A lightweight observability stack for logs, traces and metrics
    prometheus           # (core) Prometheus operator for monitoring and logging

@duplabe

duplabe commented Feb 3, 2023

I have the same problem.

My stack: Proxmox with 3 VMs, each running microk8s on Ubuntu 22.04 (1 control-plane, 2 workers). iotop shows constant k8s-dqlite writing on the control-plane (not on the workers).

microk8s version
MicroK8s v1.26.0 revision 4390

microk8s status
microk8s is running
high-availability: no
  datastore master nodes: 192.168.123.50:19001
  datastore standby nodes: none
addons:
  enabled:
    community            # (core) The community addons repository
    dns                  # (core) CoreDNS
    ha-cluster           # (core) Configure high availability on the current node
    helm                 # (core) Helm - the package manager for Kubernetes
    helm3                # (core) Helm 3 - the package manager for Kubernetes
    ingress              # (core) Ingress controller for external access
    metrics-server       # (core) K8s Metrics Server for API access to service metrics
  disabled:
    argocd               # (community) Argo CD is a declarative continuous deployment for Kubernetes.
...

@ktsakalozos
Member

With the 1.26 release we exposed [1] a few configuration variables you could use to tune dqlite's behavior.

You can create a tuning.yaml file under /var/snap/microk8s/current/var/kubernetes/backend/ that includes:

snapshot:
  trailing: 8192
  threshold: 1024

... and restart microk8s.

This will instruct dqlite to keep more data in memory but take snapshots less frequently (8x less often).

Note that this is experimental for now and might need some trial and error on your side to make sure it matches your needs/workloads/hardware specs.

[1] canonical/k8s-dqlite#32
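
As a concrete sketch (assuming the standard snap layout under /var/snap/microk8s/current, and restarting with microk8s stop/start):

# Sketch: write the tuning file and restart MicroK8s so dqlite picks it up.
sudo tee /var/snap/microk8s/current/var/kubernetes/backend/tuning.yaml <<'EOF'
snapshot:
  trailing: 8192
  threshold: 1024
EOF
sudo microk8s stop && sudo microk8s start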

@selvex

selvex commented Mar 12, 2023

With the 1.26 release we exposed [1] a few configuration variables you could use to tune dqlite's behavior.

You can create a tuning.yaml file under /var/snap/microk8s/current/var/kubernetes/backend/ that includes:

snapshot:
  trailing: 8192
  threshold: 1024

... and restart microk8s.

This will instruct dqlite to keep more data in memory but take snapshots less frequently (8x less often).

Note that this is experimental for now and might need some trial and error on your side to make sure it matches your needs/workloads/hardware specs.

[1] canonical/k8s-dqlite#32

Tried this, but instead of

snapshot:
  trailing: 8192
  threshold: 1024

which are the defaults according to (canonical/k8s-dqlite#32), I chose

snapshot:
  trailing: 8192
  threshold: 8192

which should have the expected effect of reducing the writes by a factor of 8 according to the PR.

However, after setting the values I did not experience any difference in the huge amount of I/O operations. I am running a single-node cluster with microk8s, and the virtual machine constantly writes 5-6 MB/s.

@davidmonro

I've also tried messing with the tuning.yaml, and from the syslog entry it appears to have been accepted:

microk8s.daemon-k8s-dqlite[3041]: I0402 01:43:19.917841 3041 log.go:198] Raft snapshot parameters set to (threshold=8192, trailing=8192)

but I'm not seeing any significant reduction in write traffic.
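
For anyone wanting to quantify the write rate (rather than the cumulative /proc counters), a sketch using pidstat from the sysstat package, or iotop:

# Sketch: sample per-process write rates; k8s-dqlite's kB_wr/s column shows the ongoing traffic.
sudo pidstat -d 5 12 | grep -E 'Command|k8s-dqlite'   # 12 samples, 5 seconds apart
sudo iotop -aoP                                       # accumulated writes, active processes only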

@MathieuBordere

MathieuBordere commented Apr 3, 2023

@selvex @davidmonro Can you post the contents of your data folder?
ls -alh /var/snap/microk8s/3052/var/kubernetes/backend/ (3052 should probably be replaced by some other value)

@davidmonro

$ ls -alh /var/snap/microk8s/4959/var/kubernetes/backend/
total 244M
drwxrwx--- 2 root microk8s 4.0K Apr  3 19:24 .
drwxr-xr-x 3 root root     4.0K Apr 22  2022 ..
-rw-rw---- 1 root microk8s 8.0M Apr  3 18:47 0000000153727440-0000000153727797
-rw-rw---- 1 root microk8s 8.0M Apr  3 18:48 0000000153727798-0000000153728298
-rw-rw---- 1 root microk8s 8.0M Apr  3 18:48 0000000153728299-0000000153728899
-rw-rw---- 1 root microk8s 8.0M Apr  3 18:50 0000000153728900-0000000153729324
-rw-rw---- 1 root microk8s 8.0M Apr  3 18:53 0000000153729325-0000000153729735
-rw-rw---- 1 root microk8s 8.0M Apr  3 18:53 0000000153729736-0000000153730344
-rw-rw---- 1 root microk8s 8.0M Apr  3 18:55 0000000153730345-0000000153730845
-rw-rw---- 1 root microk8s 8.0M Apr  3 18:58 0000000153730846-0000000153731200
-rw-rw---- 1 root microk8s 8.0M Apr  3 18:58 0000000153731201-0000000153731799
-rw-rw---- 1 root microk8s 8.0M Apr  3 18:59 0000000153731800-0000000153732357
-rw-rw---- 1 root microk8s 8.0M Apr  3 19:02 0000000153732358-0000000153732710
-rw-rw---- 1 root microk8s 8.0M Apr  3 19:03 0000000153732711-0000000153733238
-rw-rw---- 1 root microk8s 8.0M Apr  3 19:03 0000000153733239-0000000153733852
-rw-rw---- 1 root microk8s 8.0M Apr  3 19:06 0000000153733853-0000000153734236
-rw-rw---- 1 root microk8s 8.0M Apr  3 19:08 0000000153734237-0000000153734692
-rw-rw---- 1 root microk8s 8.0M Apr  3 19:08 0000000153734693-0000000153735295
-rw-rw---- 1 root microk8s 5.3M Apr  3 19:09 0000000153735296-0000000153735631
-rw-rw---- 1 root microk8s 8.0M Apr  3 19:12 0000000153735632-0000000153735988
-rw-rw---- 1 root microk8s 8.0M Apr  3 19:13 0000000153735989-0000000153736548
-rw-rw---- 1 root microk8s 8.0M Apr  3 19:13 0000000153736549-0000000153737158
-rw-rw---- 1 root microk8s 8.0M Apr  3 19:16 0000000153737159-0000000153737511
-rw-rw---- 1 root microk8s 8.0M Apr  3 19:18 0000000153737512-0000000153737993
-rw-rw---- 1 root microk8s 8.0M Apr  3 19:18 0000000153737994-0000000153738606
-rw-rw---- 1 root microk8s 8.0M Apr  3 19:20 0000000153738607-0000000153739029
-rw-rw---- 1 root microk8s 8.0M Apr  3 19:23 0000000153739030-0000000153739431
-rw-rw---- 1 root microk8s 8.0M Apr  3 19:23 0000000153739432-0000000153740043
-rw-rw---- 1 root microk8s 8.0M Apr  3 19:24 0000000153740044-0000000153740548
-rw-rw---- 1 root microk8s 1.9K Apr 22  2022 cluster.crt
-rw-rw---- 1 root microk8s 3.2K Apr 22  2022 cluster.key
-rw-rw---- 1 root microk8s   63 Apr  3 19:24 cluster.yaml
-rw-rw-r-- 1 root microk8s    2 Apr  2 01:43 failure-domain
-rw-rw---- 1 root microk8s   57 Apr 22  2022 info.yaml
srw-rw---- 1 root microk8s    0 Apr  2 01:43 kine.sock:12379
-rw-rw---- 1 root microk8s   63 Apr  2 01:43 localnode.yaml
-rw-rw---- 1 root microk8s   32 Apr 22  2022 metadata1
-rw-rw---- 1 root microk8s 8.0M Apr  3 19:25 open-1753
-rw-rw---- 1 root microk8s 8.0M Apr  3 19:23 open-1754
-rw-rw---- 1 root microk8s 8.0M Apr  3 19:24 open-1755
-rw-rw---- 1 root microk8s 3.3M Apr  3 18:43 snapshot-1-153727439-151252375
-rw-rw---- 1 root microk8s   72 Apr  3 18:43 snapshot-1-153727439-151252375.meta
-rw-rw---- 1 root microk8s 3.2M Apr  3 19:09 snapshot-1-153735631-152794636
-rw-rw---- 1 root microk8s   72 Apr  3 19:09 snapshot-1-153735631-152794636.meta
-rw-rw-r-- 1 root microk8s   45 Apr  2 01:08 tuning.yaml

@MathieuBordere

MathieuBordere commented Apr 3, 2023

$ ls -alh /var/snap/microk8s/4959/var/kubernetes/backend/
total 244M
drwxrwx--- 2 root microk8s 4.0K Apr  3 19:24 .
drwxr-xr-x 3 root root     4.0K Apr 22  2022 ..
-rw-rw---- 1 root microk8s 8.0M Apr  3 18:47 0000000153727440-0000000153727797
-rw-rw---- 1 root microk8s 8.0M Apr  3 18:48 0000000153727798-0000000153728298
-rw-rw---- 1 root microk8s 8.0M Apr  3 18:48 0000000153728299-0000000153728899
-rw-rw---- 1 root microk8s 8.0M Apr  3 18:50 0000000153728900-0000000153729324
-rw-rw---- 1 root microk8s 8.0M Apr  3 18:53 0000000153729325-0000000153729735
-rw-rw---- 1 root microk8s 8.0M Apr  3 18:53 0000000153729736-0000000153730344
-rw-rw---- 1 root microk8s 8.0M Apr  3 18:55 0000000153730345-0000000153730845
-rw-rw---- 1 root microk8s 8.0M Apr  3 18:58 0000000153730846-0000000153731200
-rw-rw---- 1 root microk8s 8.0M Apr  3 18:58 0000000153731201-0000000153731799
-rw-rw---- 1 root microk8s 8.0M Apr  3 18:59 0000000153731800-0000000153732357
-rw-rw---- 1 root microk8s 8.0M Apr  3 19:02 0000000153732358-0000000153732710
-rw-rw---- 1 root microk8s 8.0M Apr  3 19:03 0000000153732711-0000000153733238
-rw-rw---- 1 root microk8s 8.0M Apr  3 19:03 0000000153733239-0000000153733852
-rw-rw---- 1 root microk8s 8.0M Apr  3 19:06 0000000153733853-0000000153734236
-rw-rw---- 1 root microk8s 8.0M Apr  3 19:08 0000000153734237-0000000153734692
-rw-rw---- 1 root microk8s 8.0M Apr  3 19:08 0000000153734693-0000000153735295
-rw-rw---- 1 root microk8s 5.3M Apr  3 19:09 0000000153735296-0000000153735631
-rw-rw---- 1 root microk8s 8.0M Apr  3 19:12 0000000153735632-0000000153735988
-rw-rw---- 1 root microk8s 8.0M Apr  3 19:13 0000000153735989-0000000153736548
-rw-rw---- 1 root microk8s 8.0M Apr  3 19:13 0000000153736549-0000000153737158
-rw-rw---- 1 root microk8s 8.0M Apr  3 19:16 0000000153737159-0000000153737511
-rw-rw---- 1 root microk8s 8.0M Apr  3 19:18 0000000153737512-0000000153737993
-rw-rw---- 1 root microk8s 8.0M Apr  3 19:18 0000000153737994-0000000153738606
-rw-rw---- 1 root microk8s 8.0M Apr  3 19:20 0000000153738607-0000000153739029
-rw-rw---- 1 root microk8s 8.0M Apr  3 19:23 0000000153739030-0000000153739431
-rw-rw---- 1 root microk8s 8.0M Apr  3 19:23 0000000153739432-0000000153740043
-rw-rw---- 1 root microk8s 8.0M Apr  3 19:24 0000000153740044-0000000153740548
-rw-rw---- 1 root microk8s 1.9K Apr 22  2022 cluster.crt
-rw-rw---- 1 root microk8s 3.2K Apr 22  2022 cluster.key
-rw-rw---- 1 root microk8s   63 Apr  3 19:24 cluster.yaml
-rw-rw-r-- 1 root microk8s    2 Apr  2 01:43 failure-domain
-rw-rw---- 1 root microk8s   57 Apr 22  2022 info.yaml
srw-rw---- 1 root microk8s    0 Apr  2 01:43 kine.sock:12379
-rw-rw---- 1 root microk8s   63 Apr  2 01:43 localnode.yaml
-rw-rw---- 1 root microk8s   32 Apr 22  2022 metadata1
-rw-rw---- 1 root microk8s 8.0M Apr  3 19:25 open-1753
-rw-rw---- 1 root microk8s 8.0M Apr  3 19:23 open-1754
-rw-rw---- 1 root microk8s 8.0M Apr  3 19:24 open-1755
-rw-rw---- 1 root microk8s 3.3M Apr  3 18:43 snapshot-1-153727439-151252375
-rw-rw---- 1 root microk8s   72 Apr  3 18:43 snapshot-1-153727439-151252375.meta
-rw-rw---- 1 root microk8s 3.2M Apr  3 19:09 snapshot-1-153735631-152794636
-rw-rw---- 1 root microk8s   72 Apr  3 19:09 snapshot-1-153735631-152794636.meta
-rw-rw-r-- 1 root microk8s   45 Apr  2 01:08 tuning.yaml

The impact of decreasing the snapshot frequency on disk I/O will be noticeable when the snapshot files are large, as in the earlier case above where a 1.7GB snapshot was taken with about 1 minute between snapshots. In your case, because the snapshot is so small, it doesn't make a lot of difference.

If you see a lot of writes, that's because k8s/microk8s/someone else is performing a lot of writes on the DB for some reason.
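
Rough numbers behind this, as a sketch (assuming an entry rate in the region of the earlier report; per the listings, each 8 MB segment file holds roughly 350-600 entries):

# Sketch: compare snapshot write volume vs. log-segment write volume (very rounded).
awk 'BEGIN {printf "1.7 GB snapshot every ~1000 entries at ~1M entries/day: ~%d GB/day\n", 1.7 * 1000}'
awk 'BEGIN {printf "3.3 MB snapshot every ~1000 entries at ~1M entries/day: ~%.1f GB/day\n", 0.0033 * 1000}'
awk 'BEGIN {printf "8 MB log segment every ~500 entries at ~1M entries/day:  ~%d GB/day\n", 0.008 * 2000}'

With small snapshots, the raft log segments dominate the write volume, so snapshot tuning alone changes little.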

@davidmonro

Interesting - it is a pretty light workload, so I'm not sure what would be doing that. Can dqlite do query logging?

@ktsakalozos
Member

@davidmonro what exactly do you see? Is the I/O load constant or does it come in spikes? Is this a single-node cluster?

You could add a --debug flag in /var/snap/microk8s/current/args/k8s-dqlite and then restart k8s-dqlite with: sudo systemctl restart snap.microk8s.daemon-k8s-dqlite. Then observe the k8s-dqlite logs with journalctl -fu snap.microk8s.daemon-k8s-dqlite.
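
In command form, a sketch of those steps (assumes the args file takes one flag per line):

# Sketch: enable dqlite debug logging and follow the logs.
echo '--debug' | sudo tee -a /var/snap/microk8s/current/args/k8s-dqlite
sudo systemctl restart snap.microk8s.daemon-k8s-dqlite
journalctl -fu snap.microk8s.daemon-k8s-dqlite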

@davidmonro

davidmonro commented Jul 2, 2023

Sorry I left this so long, I hoped upgrading to microk8s 1.27 and completely rebuilding the node would fix it. It did not.

There's a lot of output, and a lot of it seems to relate to leases. I ran it with debugging for about 150 seconds, which generated about 8000 lines of output, 4700 of which contained the word 'lease'.

Is there anything sensitive in that log, or would it be sensible to attach it here?

@camille-rodriguez

Experienced this with 1.28; the local disk filled up with 500GB of data with not much running on it.

@aalvarado

I also have this issue; dqlite is constantly writing to disk.

@aalvarado

Uninstalling microk8s has fixed the constant writing to disk. It was also filling up the journal with repeated entries.

@raackley

I am also experiencing this issue. Dqlite has already killed 2 NVMe drives and has put considerable wear on their replacements, which are only a few months old. I'm surprised this issue is still ongoing and not getting more attention.

The other issue I have is that dqlite seems to be consuming way more memory than it should. So the experimental workaround of trading more memory consumption for reduced snapshotting doesn't seem very good to me.

If external-etcd fixes these issues, I'd be interested, except for the documentation strongly discouraging you from doing this for running clusters.

I've been mostly happily running microk8s for years, but seems like these issues have backed me into a corner, and I guess I need to look at alternatives.

@RoadRunnr

I'm having the same problem with the v1.31.2 snap on Ubuntu 24.10.

@ktsakalozos dqlite log as requested in #3064 (comment) attached. Single node cluster, no significant load.

dqlite.log.gz

Content of /var/snap/microk8s/current/var/kubernetes/backend/

# ls -alh /var/snap/microk8s/current/var/kubernetes/backend/
total 205M
drwxrwx--- 1 root microk8s 2.1K Nov 21 10:42 .
drwxr-xr-x 1 root root       14 Oct 28 16:56 ..
-rw-rw---- 1 root microk8s 8.0M Nov 21 09:44 0000000004550657-0000000004551032
-rw-rw---- 1 root microk8s 8.0M Nov 21 09:46 0000000004551033-0000000004551403
-rw-rw---- 1 root microk8s 6.0M Nov 21 09:48 0000000004551404-0000000004551680
-rw-rw---- 1 root microk8s 8.0M Nov 21 09:51 0000000004551681-0000000004552048
-rw-rw---- 1 root microk8s 8.0M Nov 21 09:54 0000000004552049-0000000004552419
-rw-rw---- 1 root microk8s 6.3M Nov 21 09:56 0000000004552420-0000000004552704
-rw-rw---- 1 root microk8s 8.0M Nov 21 09:58 0000000004552705-0000000004553069
-rw-rw---- 1 root microk8s 8.0M Nov 21 10:01 0000000004553070-0000000004553441
-rw-rw---- 1 root microk8s 6.3M Nov 21 10:03 0000000004553442-0000000004553728
-rw-rw---- 1 root microk8s 8.0M Nov 21 10:06 0000000004553729-0000000004554106
-rw-rw---- 1 root microk8s 8.0M Nov 21 10:09 0000000004554107-0000000004554480
-rw-rw---- 1 root microk8s 5.9M Nov 21 10:11 0000000004554481-0000000004554752
-rw-rw---- 1 root microk8s 8.0M Nov 21 10:13 0000000004554753-0000000004555125
-rw-rw---- 1 root microk8s 8.0M Nov 21 10:16 0000000004555126-0000000004555505
-rw-rw---- 1 root microk8s 5.8M Nov 21 10:18 0000000004555506-0000000004555776
-rw-rw---- 1 root microk8s 8.0M Nov 21 10:21 0000000004555777-0000000004556146
-rw-rw---- 1 root microk8s 1.8M Nov 21 10:21 0000000004556147-0000000004556228
-rw-rw---- 1 root microk8s 8.0M Nov 21 10:24 0000000004556229-0000000004556595
-rw-rw---- 1 root microk8s 4.6M Nov 21 10:25 0000000004556596-0000000004556800
-rw-rw---- 1 root microk8s 8.0M Nov 21 10:28 0000000004556801-0000000004557171
-rw-rw---- 1 root microk8s 8.0M Nov 21 10:31 0000000004557172-0000000004557539
-rw-rw---- 1 root microk8s 6.2M Nov 21 10:33 0000000004557540-0000000004557824
-rw-rw---- 1 root microk8s 8.0M Nov 21 10:36 0000000004557825-0000000004558193
-rw-rw---- 1 root microk8s 8.0M Nov 21 10:38 0000000004558194-0000000004558568
-rw-rw---- 1 root microk8s 6.0M Nov 21 10:40 0000000004558569-0000000004558848
-rw-rw---- 1 root microk8s 1.9K Oct 28 16:56 cluster.crt
-rw-rw---- 1 root microk8s 3.2K Oct 28 16:56 cluster.key
-rw-rw---- 1 root microk8s   63 Nov 21 10:42 cluster.yaml
-rw-rw---- 1 root microk8s    0 Oct 28 16:56 dqlite-lock
-rw-rw-r-- 1 root microk8s    2 Nov 20 04:57 failure-domain
-rw-rw---- 1 root microk8s   57 Oct 28 16:56 info.yaml
srw-rw---- 1 root microk8s    0 Nov 21 10:21 kine.sock:12379
-rw-rw---- 1 root microk8s   32 Oct 28 16:56 metadata1
-rw-rw---- 1 root microk8s 8.0M Nov 21 10:38 open-10
-rw-rw---- 1 root microk8s 8.0M Nov 21 10:40 open-11
-rw-rw---- 1 root microk8s 8.0M Nov 21 10:42 open-9
-rw-rw---- 1 root microk8s 2.1M Nov 21 10:33 snapshot-1-4557824-1533533180
-rw-rw---- 1 root microk8s   72 Nov 21 10:33 snapshot-1-4557824-1533533180.meta
-rw-rw---- 1 root microk8s 1.8M Nov 21 10:40 snapshot-1-4558848-1533977361
-rw-rw---- 1 root microk8s   72 Nov 21 10:40 snapshot-1-4558848-1533977361.meta
