Reduce CPU usage when idle #775

scottwey · 2024-09-15T01:29:56Z

Currently, the tight loop in Engine causes very high single-core CPU usage when idle. It is also long-running blocking code inside an async task, so it blocks an async worker entirely. On systems with a lower core count, this will probably hurt performance noticeably.
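To illustrate the idle-spin problem in isolation, here is a minimal thread-based sketch using only the standard library. It is an analogy, not the Engine code: `run_spinning`, `run_blocking`, and `spawn_producer` are hypothetical names, and a plain `std::sync::mpsc` channel stands in for the request queue.

```rust
use std::sync::mpsc::{self, Receiver, TryRecvError};
use std::thread;
use std::time::Duration;

/// Busy-polls the channel, burning CPU while it is empty.
/// Returns (messages handled, idle loop iterations).
fn run_spinning(rx: Receiver<u32>) -> (u32, u64) {
    let (mut handled, mut idle_spins) = (0u32, 0u64);
    loop {
        match rx.try_recv() {
            Ok(_) => handled += 1,
            // Nothing to do, but we go around the loop again immediately.
            Err(TryRecvError::Empty) => idle_spins += 1,
            Err(TryRecvError::Disconnected) => return (handled, idle_spins),
        }
    }
}

/// Parks the thread in `recv` while idle instead of spinning.
fn run_blocking(rx: Receiver<u32>) -> u32 {
    let mut handled = 0;
    while rx.recv().is_ok() {
        handled += 1;
    }
    handled
}

/// Hypothetical producer: sends `n` messages 5 ms apart, then hangs up.
fn spawn_producer(n: u32) -> Receiver<u32> {
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        for i in 0..n {
            thread::sleep(Duration::from_millis(5));
            tx.send(i).unwrap();
        }
    });
    rx
}
```

Both consumers handle the same messages, but `run_spinning` typically racks up a very large `idle_spins` count during the gaps between sends, which is the kind of wasted work this PR is trying to avoid.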

From my testing, this change dramatically drops CPU usage with minimal impact to performance, although I have not had a chance to benchmark properly.

github-actions · 2024-09-15T01:31:05Z

Code Metrics Report

===============================================================================
 Language            Files        Lines         Code     Comments       Blanks
===============================================================================
 C Header                2           35           28            0            7
 Dockerfile              1           34           25            0            9
 Happy                   1          442          369            0           73
 JSON                   12          105          104            0            1
 Python                 52         2268         1930           69          269
 TOML                   20          625          559            2           64
 YAML                    2           21           19            2            0
-------------------------------------------------------------------------------
 Jupyter Notebooks       4            0            0            0            0
 |- Markdown             2           77           32           31           14
 |- Python               2          196          169            1           26
 (Total)                            273          201           32           40
-------------------------------------------------------------------------------
 Markdown               38         2765            0         2098          667
 |- BASH                 6          103          100            0            3
 |- JSON                 1           12           12            0            0
 |- Python               5           92           82            0           10
 |- Rust                 9          322          274            0           48
 |- TOML                 2           75           63            0           12
 (Total)                           3369          531         2098          740
-------------------------------------------------------------------------------
 Rust                  260        75712        68239         1548         5925
 |- Markdown           123         1217           25         1117           75
 (Total)                          76929        68264         2665         6000
===============================================================================
 Total                 393        82007        71273         3719         7015
===============================================================================

mistralrs-core/src/engine/mod.rs

scottwey · 2024-09-15T08:46:53Z

@EricLBuehler I switched to using yield_now only when nothing is scheduled instead of sleeping indiscriminately. I also did a bit of refactoring to hopefully make things a bit cleaner. I tested a bit and things are looking good. :)

EricLBuehler · 2024-09-16T01:47:11Z

mistralrs-core/src/engine/mod.rs


-            self.scheduler.free_finished_sequence_groups();
+                    if self.scheduler.waiting_len() == 0 {
+                        tokio::task::yield_now().await;


I like the idea of the tokio::select! addition! I'm just a bit confused about how this function would implement what we want (that is, to not sit in a loop). If I understand correctly, it yields to tokio's runtime, which means we wait for the other arm of the tokio::select! (the request receive arm) to match?

Perhaps you could add a comment here explaining what the logic/flow is?

scottwey · 2024-09-17

You understand my proposed flow correctly. As I understand it, we no longer sit in the loop mostly because of yield_now: it effectively marks the task as Pending for one iteration of Tokio's runtime. On the next iteration, yield_now is marked Ready and the loop can continue. Yielding doesn't take much time, but it is enough to lower CPU usage dramatically.
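The "Pending once, then Ready" behavior described above can be sketched with the standard library alone. This is a simplified stand-in, not Tokio's actual implementation of `yield_now`: `YieldNow`, `CountingWaker`, `poll_to_completion`, and `idle_iterations` are hypothetical names, and the hand-rolled polling loop only imitates what a runtime does.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical stand-in for a yield future: Pending on the first poll,
/// Ready on the second.
struct YieldNow {
    yielded: bool,
}

impl Future for YieldNow {
    type Output = ();

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.yielded {
            return Poll::Ready(());
        }
        self.yielded = true;
        // Ask to be polled again, then hand control back to the caller.
        cx.waker().wake_by_ref();
        Poll::Pending
    }
}

/// Waker that only counts how often it was woken.
struct CountingWaker {
    wakes: AtomicUsize,
}

impl Wake for CountingWaker {
    fn wake(self: Arc<Self>) {
        self.wakes.fetch_add(1, Ordering::SeqCst);
    }
}

/// Polls `fut` until it is Ready; returns (number of polls, number of wakes).
fn poll_to_completion<F: Future<Output = ()>>(fut: F) -> (usize, usize) {
    let counter = Arc::new(CountingWaker { wakes: AtomicUsize::new(0) });
    let waker = Waker::from(Arc::clone(&counter));
    let mut cx = Context::from_waker(&waker);
    let mut fut = Box::pin(fut);
    let mut polls = 0;
    loop {
        polls += 1;
        if fut.as_mut().poll(&mut cx).is_ready() {
            return (polls, counter.wakes.load(Ordering::SeqCst));
        }
    }
}

/// Hypothetical idle loop: yields once per iteration, like an engine with
/// nothing scheduled.
async fn idle_iterations(n: usize) {
    for _ in 0..n {
        YieldNow { yielded: false }.await;
    }
}
```

A single `YieldNow` takes two polls and one wake-up to complete, and `idle_iterations(3)` takes four polls and three wake-ups. In a real runtime, each Pending return is a point where other tasks get a turn before this one is polled again.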

scottwey · 2024-09-18

@EricLBuehler Thoughts on this? Happy to close this and rethink if this change doesn't make sense to you.

EricLBuehler

@scottwey I think that after this change, and once you have done some testing of it, this should be good to merge!

Finally, if you could drop some rough metrics on CPU usage before vs. after, that would be great too.

mistralrs-core/src/engine/mod.rs

Co-authored-by: Eric Buehler <[email protected]>

scottwey · 2024-10-12T10:30:56Z

@EricLBuehler
Thanks for the patience here. I have had 0 time for anything besides work.
I finally had the time to leave this running and use it as a backend for a lot of personal inference and I don't see any obvious issues. Let me know if there are any specific things you want me to be on the lookout for.
In terms of CPU usage, it drops from a single core pegged at 100% down to ~1-2% on that core. Happy to gather more rigorous metrics if you would like. Just let me know what you would want to see.

Otherwise, I will get the conflicts sorted and await further review.

EricLBuehler · 2024-10-13T01:41:21Z

@scottwey thank you for the testing! Great to see the 100% -> ~1% utilization :).

If you could resolve the conflicts, we can absolutely merge.

see if sleeping for a little decreases cpu usage at rest

c447a0f

EricLBuehler reviewed Sep 15, 2024

scottwey added 4 commits September 15, 2024 08:29

attempt to refactor

38f84f9

Merge branch 'EricLBuehler:master' into master

8928c83

only yield when there's no waiting_len

c6f1845

Merge branch 'master' of github.com:scottwey/mistral.rs

c8c727a

EricLBuehler reviewed Sep 16, 2024

add a comment on what yield_now does

f9914fe

EricLBuehler requested changes Sep 18, 2024

Update mistralrs-core/src/engine/mod.rs

e67011e

Co-authored-by: Eric Buehler <[email protected]>
