WSL2 gets laggy over time - suspend to disk (hibernate) restores normal operation #11885
Logs are required for review from the WSL team. If this is a feature request, please reply with '/feature'. If this is a question, reply with '/question'. How to collect WSL logs: download and execute collect-wsl-logs.ps1 in an administrative PowerShell prompt. The script will output the path of the log file once done. If this is a networking issue, please use collect-networking-logs.ps1 instead. Once completed, please upload the output files to this GitHub issue.
|
The log file doesn't contain any WSL traces. Please make sure that you reproduced the issue while the log collection was running.
|
The issue cannot be "reproduced" and then captured in a log. WSL2 just slows down over time... |
@cvjjm: When WSL gets laggy, can you run:
And inside that shell, run |
When I try to run
When I try to run
Because of the relation to hibernation I tried disabling the virtual memory / page file in Win10, but this did not change anything. |
@cvjjm: The debug shell only works when WSL is already running, was wsl running when you got that error ? |
@cvjjm: Hmm that's weird. Can you share the content of |
Ok thank you @cvjjm. Let's try something else: Can you share the output of |
Thank you @cvjjm. I wonder if that drop cache script has an impact. Can you try to remove it and see if that changes the behavior ? Also FYI WSL will essentially do that automatically now since 2.1.3 ( See |
I have removed all crontab entries and instead put One additional piece of information that I only noticed now: after a fresh restart, WSL seems to run smoothly indefinitely (or at least for a full day). Only after the first sleep/hibernate/suspend to disk/RAM cycle does the lagginess start to increase. |
Behavior stays the same also with AutoMemoryReclaim disabled and no cron job running. |
I stumbled across this issue here carlfriedrich/wsl-kernel-build#1 which is clearly different from mine but also hibernate related, so I have decided to try the custom kernel 5.15.153.1-microsoft-standard-WSL2 provided there and will report back. |
Unfortunately I cannot use Docker Desktop with the custom WSL kernel because of the problem described here docker/for-win#14282. It is thus hard for me to test whether the issue here goes away with the patched custom kernel because I would need to stop working for a day or so... |
Supplementary question to @cvjjm. So in your case, when WSL becomes slower and slower, it never hangs as a result? The first symptoms I observed in my case are, for example, ctrl+r search using |
Yes, it becomes slower and slower and after a few hours unbearably slow, but it never hangs completely. We seem to have very similar hardware! I have a T14 with Ryzen 7 Pro 7840U. I don't think I fully understand the thing with the interrupts, but when WSL is in the responsive state and essentially idling I get the following output:
I.e., 411 HVS interrupts in 10s on CPU0. I will report the number for the laggy state later... |
This is the output when WSL is in the laggy state:
About 28 HVS interrupts in 10s on CPU0. @Baryczka, is this in line with your expectations? Fewer interrupts in the laggy state and more during the responsive state? |
I think using an older version (20 or 22) might fix this problem. |
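The per-CPU interrupt counts quoted above (411 vs. about 28 HVS interrupts in 10s on CPU0) can be collected repeatably with a small script. The following is a minimal Python sketch, assuming it is run inside the WSL distro and that `/proc/interrupts` contains an `HVS` row as in the outputs discussed here; the function names `parse_interrupts`, `interrupt_delta`, and `sample` are made up for this illustration.

```python
import time


def parse_interrupts(text, label="HVS"):
    """Return the per-CPU counters of one row of a /proc/interrupts dump."""
    lines = text.splitlines()
    n_cpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    for line in lines[1:]:
        name, _, rest = line.partition(":")
        if name.strip() == label:
            # The first n_cpus fields are counters; the rest is a description.
            return [int(value) for value in rest.split()[:n_cpus]]
    raise KeyError(f"no {label!r} row found")


def interrupt_delta(before, after, label="HVS"):
    """Per-CPU difference between two /proc/interrupts snapshots."""
    first = parse_interrupts(before, label)
    second = parse_interrupts(after, label)
    return [b - a for a, b in zip(first, second)]


def sample(interval=10.0, label="HVS", path="/proc/interrupts"):
    """Read the interrupt table twice, `interval` seconds apart."""
    with open(path) as f:
        before = f.read()
    time.sleep(interval)
    with open(path) as f:
        after = f.read()
    return interrupt_delta(before, after, label)
```

Calling `sample()` once in the responsive state and once in the laggy state gives one number per CPU for each state; the first entry corresponds to CPU0.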
I would add that I also experience similar symptoms on my Mac Pro 7,1 running Win 11, with very heavy use of Docker. Originally the system was configured with a 24-core CPU and 48 GB of RAM, and I had 10 cores and 32 GB of RAM set for WSL. The Docker containers, or the operating system builds that I do in WSL, pretty much continuously have the CPU and RAM at maximum, and are constantly accessing the disk and network. Other than a significant lack of network speed with Docker (which seems capped at 100Mbps, despite having a 2.5G local Ethernet and a 1G link to the outside world), WSL is pretty much constantly consuming all resources available to it. Typically after a day or two of this, the VM would grind to a halt. A wsl --shutdown would restore normal operation. I have not tried a hibernate, but now it's on my list of possible tools. Upgrading the system memory to 112GB and giving WSL access to 20 cores and 80GB has significantly improved the problem. I still notice it slowing down some after a couple of days, but in the week that I've had that upgrade, it has not come to a halt or slowed down to unacceptable speeds, though when I do notice it slowing down, I restart WSL. I'll drop in again in a few more days and say if hibernate got it back, and maybe do some ping testing like above. |
This issue still persists. |
Windows Version
Microsoft Windows [Version 10.0.19045.4651]
WSL Version
2.2.4.0
Are you using WSL 1 or WSL 2?
Kernel Version
5.15.153.1-microsoft-standard-WSL2
Distro Version
Ubuntu 24.04
Other Software
Docker Desktop 4.33.1 (161083)
Repro Steps
Start the computer and start using WSL. On a time scale of 30 min to 1 h, WSL2 gets increasingly unresponsive and laggy. This causes noticeable input delay when typing in any console connected to WSL2 (irrespective of whether it is the Windows Console or the system PowerShell directly into WSL2, or whether it is, e.g., the console built into Docker Desktop) and equally affects, e.g., graphical applications running inside a container on Docker Desktop, displayed through VcXsrv.
More objectively the problem manifests itself in erratic ping times between the host and WSL. A typical ping after an hour of usage looks like this:
Interestingly the problem can be temporarily resolved by suspending the system to disk (hibernation), but not by a suspend to ram. After a hibernate/wake cycle the ping looks like this, and then again slowly degrades along with the input lag:
Restarting Docker / Docker Desktop or the vEthernet adapters (
Restart-NetAdapter -Name "vEthernet (WSL)"
) does not have the same positive effect. Completely shutting down and restarting WSL2 does temporarily solve the problem, just like a hibernation cycle. The speed and magnitude of the performance problem do not seem to be related to how intensely I use WSL2 (just idling vs. running multi-core workloads inside a container), and the problem also occurs irrespective of whether the laptop is connected to a wired or wireless network and/or VPN.
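To put numbers on the erratic ping times described above, captured `ping` output can be reduced to a few summary statistics and compared before and after a hibernate cycle. This is a minimal Python sketch, assuming the usual `time=12ms` / `time<1ms` (Windows) or `time=0.345 ms` (Linux) reply format; `ping_rtts` and `summarize` are hypothetical helper names, not part of any tool mentioned in this issue.

```python
import re
import statistics

# Matches "time=12ms", "time<1ms" (Windows) and "time=0.345 ms" (Linux).
_RTT = re.compile(r"time[=<]\s*([0-9.]+)\s*ms")


def ping_rtts(output):
    """Extract round-trip times in milliseconds from captured ping output.

    A "time<1ms" reply is recorded as 1.0, i.e. as its upper bound.
    """
    return [float(match.group(1)) for match in _RTT.finditer(output)]


def summarize(rtts):
    """Summarize a list of round-trip times; a large spread means jitter."""
    if not rtts:
        raise ValueError("no ping replies found")
    return {
        "count": len(rtts),
        "min": min(rtts),
        "median": statistics.median(rtts),
        "max": max(rtts),
        "stdev": statistics.pstdev(rtts),
    }
```

Saving the ping output to a file in the responsive and in the laggy state and running `summarize(ping_rtts(text))` on each makes the degradation comparable across sessions.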
No suspiciously high resource usage of the Vmemm process (or any other process) is visible in Task Manager.
I have tried disabling NetBIOS and the "Large Send Offload" feature as suggested in other issues/blog posts, but nothing has made any difference.
I have experimented with the following settings in wsl.conf, but to no avail:
The lag also manifests itself in some docker commands taking a very long time, e.g.
Measure-Command {docker ps}
taking between 1 and 5 seconds vs. less than 200 ms when everything is working. The problem does not exclusively affect Docker (Desktop), however; it affects all of WSL2. This is on a T14s with an AMD Ryzen Pro 7 processor.
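A single `Measure-Command` run can be noisy, so repeating the measurement a few times may give a clearer picture of the slowdown. The sketch below does the equivalent in Python, assuming `docker` is on the `PATH` of whichever side (Windows or WSL) the script is run on; `time_command` is a hypothetical helper written for this illustration.

```python
import subprocess
import time


def time_command(argv, repeats=5):
    """Run a command several times and return each wall-clock duration in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        # Output is discarded; only the elapsed time matters here.
        subprocess.run(
            argv,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            check=False,
        )
        durations.append(time.perf_counter() - start)
    return durations
```

For example, `time_command(["docker", "ps"])` returns five durations that can be compared against the 200 ms baseline mentioned above. If the executable is not found, `subprocess.run` raises `FileNotFoundError`.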
Please let me know which other information/logs could be useful to diagnose the problem.
Expected Behavior
The performance and responsiveness of WSL should not degrade over time.
Actual Behavior
The performance and responsiveness of WSL do degrade over time.
Diagnostic Logs
No response