CP-49078: Preprocess fields into a Hashtbl within get_record (#6114)
Flame graphs indicate that, under load created by parallel `xe vm-list` commands, the DB action `get_record` is hit often. This function constructs an API-level record by marshalling an association list that maps field names to unmarshalled string values. To do this, it serially queries all the field names using `List.assoc`. This incurs a rather large cost in lexicographical string comparisons (`caml_compare` on string keys).

To avoid this, regardless of record size, we preprocess the association lists `__regular_fields` and `__set_refs` into a `(string, string) Hashtbl.t` and query that to construct each record field.

---

The benefit of this change is most notable for large records (such as `VM` and `pool`). The previously generated code, which does a series of `List.assoc` calls, incurs the quadratic cost of list traversal (compounded by the costly lexicographical comparison of strings during the search).

To produce measurements, I sampled xapi under a load of 500 consecutive `xe vm-list` invocations (using the same sampling rate with `perf`) on a host with a single VM.

Without the change, the `get_record` done internally by `xe vm-list` makes up ~33.59% of the samples (33.59% = 1,592,782,255 samples).

With the change, `get_record` accounts for ~7.56% of the samples (of which there are substantially fewer collected: 7.56% = 264,948,239 samples).

So the number of samples for `get_record` has dropped from 1,592,782,255 to 264,948,239 (assuming `perf`'s sampling is reliable).

You can see the visual difference in the flame graphs:

Before:

![{B1FEFF3A-AD91-478B-A828-89DCD19C2BEA}](https://github.com/user-attachments/assets/b7a4d504-3894-4f34-8dbe-b63de0e1d88c)

After:

![{CB28FFEC-944D-4F26-918A-FFB57E4875A3}](https://github.com/user-attachments/assets/fa517307-deb6-4a49-9a18-8217deb38734)

Of course, this is a benefit measured in aggregate (`perf` sampling), so it is quite a fast and loose comparison.

In practice, the `xe vm-list` stress test goes from 7.4s to 6.2s, a modest gain because `get_record` makes up only a small part of the work done for a single `xe vm-list`.
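
As a rough illustration of the preprocessing idea described in this commit message, the following is a minimal sketch and not the generated xapi code. The record type `vm_record`, the helpers `lookup_assoc`, `preprocess`, and `record_of_fields`, and the sample field values are all hypothetical; only `List.assoc` and `Hashtbl` come from the OCaml standard library.

```ocaml
(* Hypothetical sketch: all names and values here are illustrative
   assumptions, not the code generated for xapi. *)

(* Field values as an association list, in the style of __regular_fields. *)
let regular_fields : (string * string) list =
  [("uuid", "0123"); ("name__label", "vm0"); ("power_state", "Running")]

(* Before: each lookup walks the list, comparing string keys as it goes.
   Looking up every field this way is quadratic in the number of fields. *)
let lookup_assoc (fields : (string * string) list) (key : string) : string =
  List.assoc key fields

(* After: build the table once per record. Only the first binding of a key
   is kept, so lookups agree with List.assoc on duplicate keys. *)
let preprocess (fields : (string * string) list) : (string, string) Hashtbl.t =
  let tbl = Hashtbl.create (List.length fields) in
  List.iter
    (fun (k, v) -> if not (Hashtbl.mem tbl k) then Hashtbl.add tbl k v)
    fields ;
  tbl

(* A hypothetical API-level record with three fields. *)
type vm_record = {uuid: string; name_label: string; power_state: string}

(* Construct the record by querying the table instead of the list.
   Hashtbl.find raises Not_found on a missing key, as List.assoc does. *)
let record_of_fields (fields : (string * string) list) : vm_record =
  let tbl = preprocess fields in
  let find key = Hashtbl.find tbl key in
  { uuid= find "uuid"
  ; name_label= find "name__label"
  ; power_state= find "power_state" }

let () =
  let r = record_of_fields regular_fields in
  (* Both lookup strategies return the same value for the same key. *)
  assert (r.name_label = lookup_assoc regular_fields "name__label") ;
  Printf.printf "%s %s %s\n" r.uuid r.name_label r.power_state
```

The table is built in one pass over the list, after which each field lookup is an expected constant-time hash probe rather than a list traversal. With only three fields the difference is negligible; the payoff is expected for large records, which is consistent with the note above about `VM` and `pool`.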
Showing 7 changed files with 201 additions and 64 deletions.