dqlite writing huge amounts of data to local drives #3064
Can you show the result of
Hi Mathieu, I'm working with @b0n0r on this issue, so here is the requested output:
Based on this information, your system writes roughly 112389759 - 112380813 = 8946 (112380813 is from the file). dqlite has a way to decrease the frequency of the snapshotting, and the microk8s team has been made aware of how to do this. Decreasing the frequency of snapshotting would decrease the total amount of data written to disk.
Hi @petermihalik, what could you share about the k8s workload that produces this amount of write operations? I wonder if we could use it as a benchmark for future improvements. Decreasing the frequency of snapshots would increase the memory footprint used by dqlite. Could you share how much memory dqlite uses right now and how much free memory your system has? Currently we do not expose the option of manually configuring the raft log threshold size that triggers the snapshot. Would you consider using an external etcd cluster [1] as an option?
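A quick way to check that, assuming the process is named k8s-dqlite as it is in the microk8s snap, would be something like:
# Resident and virtual memory of the dqlite process (binary name assumed to be k8s-dqlite):
ps -o pid,rss,vsz,cmd -C k8s-dqlite
# Overall memory situation on the node:
free -h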
Hi @ktsakalozos and the team, this is a mixed workload running some batch processing deployments as well as some simple microservices for text language detection etc., nothing too exotic I think. What are the steps for me to pinpoint the source of so many write operations, i.e. which part of the workload is generating this? The dqlite process (still running on only 1 out of 3 nodes) is currently consuming
more detailed output for
If external etcd is my only option to save the SSDs, I will have to give it a try I guess? Are there any other options apart from this? My concern is that this behaviour is killing the drives, and since this is a PoC cluster, the consumer drives are wearing out very quickly.
I'm facing the same problem. dqlite is generating a lot of disk I/O, which is producing wear on the SSDs on which microk8s is running. I reduced the cluster to a single node in order to reduce the impact.
try disabling the "prometheus" microk8s addon: |
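Assuming the stock addon name, that would be:
microk8s disable prometheus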
Can confirm this behavior on a fresh microk8s cluster, not running any workloads. Virtualization is Proxmox. I've attached some debug info including the inspect tarball.
@ktsakalozos are there any updates? I'm also facing this issue and would like a way to reduce the number of writes to the SSD to reduce wear, just like the other folks on this thread.
I have the same problem. My stack: Proxmox with 3 VMs, each running microk8s on Ubuntu 22.04 (1 control-plane, 2 workers). iotop shows constant
With the 1.26 release we exposed [1] a few configuration variables you could use to tune dqlite's behavior. You can create a
... and restart microk8s. This will instruct dqlite to keep more data in memory but take snapshots less frequently (8x less often). Note that this is experimental for now and might need some trial and error on your side to make sure it matches your needs/workloads/hardware specs.
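As a rough sketch of what such a file could look like (the path, key names, and values below are assumptions drawn from the discussion in canonical/k8s-dqlite#32, not a verified reference), something along these lines followed by a restart:
# Assumed location of the tuning file inside the snap's backend directory:
sudo tee /var/snap/microk8s/current/var/kubernetes/backend/tuning.yaml <<'EOF'
# Hypothetical values: raising the snapshot threshold is what makes snapshots
# less frequent, at the cost of keeping more raft log entries in memory.
snapshot:
  trailing: 8192
  threshold: 8192
EOF
microk8s stop && microk8s start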
Tried this, but instead of
which are the defaults according to (canonical/k8s-dqlite#32), I chose
which should have the expected effect of reducing the writes by a factor of 8 according to the PR. However, after setting the values I did not experience any difference in the huge amount of I/O operations. I am running a single node cluster with microk8s and the virtual machine permanently shows writes of 5-6 MB/s.
I've also tried messing with the tuning.yaml, and from the syslog entry it appears to have been accepted:
but I'm not seeing any significant reduction in write traffic.
@selvex @davidmonro Can you post the contents of your data folder?
The impact of decreasing the snapshot frequency on disk I/O will be noticeable when the snapshot files are large, as in the case above where a 1.7GB snapshot is taken with 1 minute between snapshots. In your case, because the snapshot is so small, it doesn't make a lot of difference. If you see a lot of writes, that's because k8s/microk8s/someone else is performing a lot of writes on the DB for some reason.
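As a rough sanity check on those numbers (illustrative shell arithmetic only, assuming one 1.7 GB snapshot per minute):
echo "$((17 * 60 * 24 / 10)) GB per day"          # ~2448 GB/day
echo "$((17 * 60 * 24 * 14 / 10)) GB per 2 weeks" # ~34000 GB, the same ballpark as the 30TB+ over two weeks reported in the original issue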
Interesting - it is a pretty light workload, so not sure what would be doing that. Can dqlite do query logging?
@davidmonro what exactly do you see? Is the I/O load constant or does it come in spikes? Is this a single-node cluster? You could add a
Sorry I left this so long, I hoped upgrading to microk8s 1.27 and completely rebuilding the node would fix it. It did not. There's a lot of output, and a lot of it seems to relate to leases? I ran it with debugging for about 150 seconds, which generated about 8000 lines of output, 4700 of which contained the word 'lease'. Is there anything sensitive in that log, or would it be sensible to attach it here?
Experienced this with 1.28, local disk filled up with 500GB of data with not much running on it.
I also have this issue, constantly writing to disk.
Uninstalling microk8s has fixed it; before that, it was writing to disk every second. It was also filling up the journal with repeated entries.
I am also experiencing this issue. dqlite has already killed 2 NVMes and has put considerable wear on its replacements, which are only a few months old. I'm surprised this issue is still ongoing and not getting more attention. The other issue I have is that dqlite seems to be consuming way more memory than it should be, so the experimental workaround of trading off more memory consumption for reduced snapshotting doesn't seem very good to me. If external etcd fixes these issues, I'd be interested, except that the documentation strongly discourages doing this for running clusters. I've been mostly happily running microk8s for years, but it seems like these issues have backed me into a corner, and I guess I need to look at alternatives.
I'm having the same problem with the v1.31.2 snap on Ubuntu 24.10. @ktsakalozos dqlite log as requested in #3064 (comment) attached. Single node cluster, no significant load. Content of
Running a 3-node microk8s instance on bare-metal commodity hardware, Ubuntu 20.04.
Started out with 2 nodes, adding a 3rd one later in the process, enabling HA.
What I am seeing is very strange behaviour where the dqlite process is wearing out local storage rather quickly. A particular node has been running for roughly 2 weeks and the dqlite process has written 30TB+ to the drive(s). My pinpointing process went like this:
uptime for reference
03:47:27 up 13 days, 28 min, 1 user, load average: 2.56, 3.33, 2.74
check all processes in /proc and grep for bytes written, sort numerically:
grep ^write_bytes /proc/*/io | awk '{print $2" "$1}'|sort -n
the process in my case with most bytes written:
37019017576448 /proc/1092/io:write_bytes:
a quick conversion amounts this to roughly 33 terabytes, which is an exceedingly big number of writes
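the conversion itself, for anyone repeating the steps (plain shell integer arithmetic):
echo $((37019017576448 / 1024 / 1024 / 1024 / 1024))   # -> 33 (TiB written)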
putting a face to a name (what is this PID)
ps auwx|grep 1092
output in my case:
root 1092 14.9 4.0 11209648 5295232 ? Ssl Apr01 2810:47 /snap/microk8s/3052/bin/k8s-dqlite --storage-dir=/var/snap/microk8s/3052/var/kubernetes/backend/ --listen=unix:///var/snap/microk8s/3052/var/kubernetes/backend/kine.sock:12379
the storage-dir of the dqlite process
du -hs /var/snap/microk8s/3052/var/kubernetes/backend/
3.5G /var/snap/microk8s/3052/var/kubernetes/backend/
kubectl get nodes
seems fine.
microk8s status
just ran this for reference and got a weird permission denied error for the dqlite storage-dir though, no idea why (race condition maybe?); all files are owned by root:microk8s with 660 perms:
What I also see is that the dqlite process is only running on 1 of the 3 nodes - is that expected behaviour?
Please run microk8s inspect and attach the generated tarball to this issue.
Would like to, but I went through the generated tarball and it seems like it includes a lot of potentially sensitive (internal) data. Is there a way to redact this or somehow share it so it does not become public?
Thank you for microk8s :)