Evaluate using Profile-Guided Optimization (PGO) and Post-Link Optimization (PLO) on VTS #283

Closed
zamazan4ik opened this issue Nov 20, 2023 · 3 comments · Fixed by #288

Comments

@zamazan4ik

Recently I evaluated Profile-Guided Optimization (PGO) improvements on multiple projects. The results are here. E.g. PGO helps with optimizing Envoyproxy. PGO results for other proxies such as HAProxy can be found in the repo above. According to these tests, PGO can improve performance in many other cases as well. Since there are already some performance-oriented requests like #251, I think trying to apply PGO to the VTS module could be worthwhile.

I can suggest the following action points:

  • Perform PGO benchmarks on VTS. If they show improvements, add a note to the documentation about the possible performance gains in VTS with PGO.
  • Provide an easier way (e.g. a build option) to build with PGO. This can help end users and maintainers, since they will be able to optimize VTS for their own workloads.

Maybe testing Post-Link Optimization techniques (like LLVM BOLT) would be interesting too (Clang and Rustc already use BOLT as an addition to PGO) but I recommend starting from the usual PGO.

Here are some examples of how PGO optimization is integrated in other projects:

@u5surf
Collaborator

u5surf commented Dec 31, 2023

@zamazan4ik Thanks for the interesting suggestion.
I think the benefit of applying such an optimization through this module might be limited. This is an nginx module, so its build process is simply the nginx build process, and PGO would effectively be an optimization of nginx itself.
Formally, such a build process should be suggested to the nginx developers rather than to this module.

@u5surf
Collaborator

u5surf commented Jan 1, 2024

@zamazan4ik
I now fully understand what you suggested; the module can be optimized with the approach below. However, I'm not sure how much this process would actually improve it.
https://stackoverflow.com/questions/13881292/what-information-does-gcc-profile-guided-optimization-pgo-collect-and-which-op
Details of the mechanism can also be found in this paper:
https://people.freebsd.org/~lstewart/articles/cpumemory.pdf (section 7.4)

In general, if it improves performance as you expect, we could document the following build process as a tip in the README instead of providing an optimized binary. We would only ask that users build this way at their own risk.

Compile with -fprofile-generate

% pwd
/home/u5surf/nginx
% CC=gcc ./auto/configure --with-cc-opt='-fprofile-generate -fprofile-dir=./objs' --with-ld-opt='-lgcov' --add-module=../nginx-module-vts
% make

Run several test cases

% pwd
/home/u5surf/nginx-module-vts
% sudo PATH=/home/u5surf/nginx/objs:$PATH prove -r t/000.display_html.t
...(at runtime, the instrumented binary records profile data into .gcda files)

Recompile with -fprofile-use

% pwd
/home/u5surf/nginx
% CC=gcc ./auto/configure --with-cc-opt='-fprofile-use -fprofile-dir=../nginx-module-vts/objs' --with-ld-opt='-lgcov' --add-module=../nginx-module-vts
% make
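
Before the recompile step above, it may be worth confirming that the test run actually left profile data behind, since GCC writes .gcda files only when an instrumented process exits normally. The following is a minimal sketch, not part of the original procedure; `count_gcda` and the `PROFILE_DIR` default are hypothetical names chosen for illustration, and the directory should be whatever `-fprofile-dir` resolved to on your system.

```shell
#!/bin/sh
# Count GCC profile data files (*.gcda) anywhere below a directory.
count_gcda() {
    find "$1" -type f -name '*.gcda' | wc -l | tr -d ' '
}

# Assumption: adjust PROFILE_DIR to the directory where -fprofile-dir
# actually placed the data on your machine.
PROFILE_DIR=${PROFILE_DIR:-./objs}

if [ -d "$PROFILE_DIR" ] && [ "$(count_gcda "$PROFILE_DIR")" -gt 0 ]; then
    echo "found $(count_gcda "$PROFILE_DIR") .gcda files in $PROFILE_DIR"
else
    # Without profile data, the -fprofile-use build has nothing to work with.
    echo "no .gcda files in $PROFILE_DIR" >&2
fi
```

If the count is zero, the usual suspects are a mismatched `-fprofile-dir` path between the two builds or nginx processes that were killed rather than stopped cleanly.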

@zamazan4ik
Author

Sorry for the late response - holidays, you know :)

> In general, if it improves performance as you expect, we could document the following build process as a tip in the README instead of providing an optimized binary. We would only ask that users build this way at their own risk.

I agree with your suggestion. Having such documentation somewhere (e.g. in the README file) would be helpful for users.

I have the following suggestions for your documentation about PGO:

  • Clang also supports PGO so there is no need to bind the PGO documentation only to the GCC compiler. PGO with Clang can be enabled with the same -fprofile-generate/-fprofile-use flags.
  • It would be great if in the documentation you put information about the actual performance wins with PGO in the VTS module. But for that, we need to perform some benchmarks.
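
As a rough illustration of the Clang point above, here is a sketch of how the same three steps might look with Clang. One difference from GCC is that Clang's instrumented binaries write raw `.profraw` files, which have to be merged with `llvm-profdata` before `-fprofile-use` can read them. Everything here is an assumption rather than a tested recipe for VTS: the paths, the `PROFILE_DIR` location, and the helper names `run`, `pgo_flag` and `build` are made up for illustration, the `sudo` used in the original test run is omitted, and the script only prints the commands unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Sketch of a Clang PGO build of nginx with the VTS module.
# All paths below are assumptions; adjust them to your checkout.
set -e

NGINX_SRC=${NGINX_SRC:-/home/u5surf/nginx}
VTS_SRC=${VTS_SRC:-/home/u5surf/nginx-module-vts}
PROFILE_DIR=${PROFILE_DIR:-/tmp/vts-pgo}
DRY_RUN=${DRY_RUN:-1}   # 1 = only print the commands

# Print a command, and execute it unless this is a dry run.
run() {
    echo "+ $*"
    if [ "$DRY_RUN" != 1 ]; then "$@"; fi
}

# Compiler/linker flag for each PGO phase.
pgo_flag() {
    case "$1" in
        generate) echo "-fprofile-generate=$PROFILE_DIR" ;;
        use)      echo "-fprofile-use=$PROFILE_DIR/merged.profdata" ;;
        *)        echo "unknown phase: $1" >&2; return 1 ;;
    esac
}

# Configure and build nginx + VTS for the given phase.
build() {
    flag=$(pgo_flag "$1")
    run cd "$NGINX_SRC"
    # Start from a clean build directory so every object is rebuilt
    # with the new flags.
    run rm -rf objs
    run env CC=clang ./auto/configure --with-cc-opt="$flag" \
        --with-ld-opt="$flag" --add-module="$VTS_SRC"
    run make
}

# 1. Instrumented build.
build generate

# 2. Exercise the instrumented binary (here: the module's test suite).
run cd "$VTS_SRC"
run env PATH="$NGINX_SRC/objs:$PATH" prove -r t/

# 3. Merge the raw profiles into the single file that -fprofile-use reads.
run sh -c "llvm-profdata merge -output='$PROFILE_DIR/merged.profdata' '$PROFILE_DIR'/*.profraw"

# 4. Optimized build using the merged profile.
build use
```

The test suite is only a stand-in workload; a profile collected from traffic that resembles production is likely to give more representative results.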

Here I gathered some PGO-related documentation examples:
