
Work on memory allocations #103

Open
fabinsch opened this issue Nov 10, 2022 · 3 comments
Labels
enhancement New feature or request

Comments

@fabinsch
Collaborator

fabinsch commented Nov 10, 2022

With #92 and #100 we start fixing some of the remaining memory allocations in the dense backend. Proxsuite v0.2.7, which contains these fixes, seems to be around 8% faster on our benchmarks. With #93 we add an option to easily check for allocations, and we run the pipeline [conda:macos-latest:Debug:c++17] with the new option CHECK_RUNTIME_MALLOC. We can still see some unit tests failing when we do not allow memory allocations.
In this commit, we identify all the places in the dense backend that need to allow for memory allocations to make all unit tests pass. We should therefore understand if and how these can be fixed.
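For context, this is roughly how Eigen's runtime-malloc guard works; a minimal sketch, assuming an option like CHECK_RUNTIME_MALLOC ultimately relies on Eigen's EIGEN_RUNTIME_NO_MALLOC mechanism (the scaffolding below is illustrative, not the proxsuite test code):

```cpp
// Minimal sketch of catching hidden Eigen allocations at runtime.
// EIGEN_RUNTIME_NO_MALLOC must be defined before the first Eigen include,
// and the check only fires in debug builds (it goes through eigen_assert).
#define EIGEN_RUNTIME_NO_MALLOC
#include <Eigen/Dense>

int main() {
  Eigen::MatrixXd A = Eigen::MatrixXd::Random(10, 10);
  Eigen::VectorXd x = Eigen::VectorXd::Random(10);
  Eigen::VectorXd y(10);  // workspace allocated up front

  // From here on, any Eigen heap allocation triggers an assertion.
  Eigen::internal::set_is_malloc_allowed(false);
  y.noalias() = A * x;  // expected to be allocation-free
  Eigen::internal::set_is_malloc_allowed(true);
  return static_cast<int>(y.size());
}
```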

The remaining allocations seem to be coming from multiplications of Eigen::Maps.
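To make the Map point concrete, here is a small illustrative sketch (not proxsuite code) of three product flavours and their allocation behaviour, as far as I understand Eigen's product evaluation:

```cpp
// Illustrative only: how products on Eigen::Map operands can allocate,
// and how noalias / lazyProduct changes that.
#include <Eigen/Dense>

void product_variants(const double* a_ptr, const double* b_ptr, double* c_ptr, int n) {
  Eigen::Map<const Eigen::MatrixXd> A(a_ptr, n, n);
  Eigen::Map<const Eigen::MatrixXd> B(b_ptr, n, n);
  Eigen::Map<Eigen::MatrixXd> C(c_ptr, n, n);

  // May allocate: Eigen cannot rule out aliasing between C and the operands,
  // so the product is evaluated into a temporary before being copied into C.
  C = A * B;

  // Avoids the result temporary: C is declared not to alias A or B, so the
  // product writes straight into C. The blocked GEMM kernel may still
  // allocate internal packing buffers for larger sizes.
  C.noalias() = A * B;

  // No heap allocation: the product is evaluated lazily, coefficient by
  // coefficient, bypassing the blocked kernel (usually only competitive
  // for small problems).
  C = A.lazyProduct(B);
}
```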

fabinsch added the enhancement label Nov 10, 2022
@fabinsch
Collaborator Author

Note that currently, the pipeline [conda:macos-latest:Debug:c++17] is running with the option CHECK_RUNTIME_MALLOC, which fails for several tests, meaning we still have Eigen memory allocations in the dense solver. To not block PRs, this pipeline is allowed to fail in #117 .

@fabinsch
Collaborator Author

The allocations here and here can be avoided by again using lazyProduct instead of operator*.

In our unit tests, memory allocations only happen above a certain problem dimension. A check with the proxqp_benchmark shows that a small speedup (2-5%) can be achieved by using lazyProduct instead of operator* for small problems, while a significant performance loss is observed for bigger problems. We should maybe allow some dynamic memory allocation and let Eigen decide under the hood how to treat the different sizes.
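For illustration, a size-based dispatch could look roughly like the sketch below; the helper name and the threshold value are made up and not part of proxsuite:

```cpp
// Hypothetical sketch of the trade-off discussed above: use lazyProduct
// (allocation-free, good for small sizes) below a cut-off, and the default
// operator* (blocked GEMM, faster for large sizes, may allocate) above it.
#include <Eigen/Dense>

template <typename Dst, typename Lhs, typename Rhs>
void product_dispatch(Dst& dst, const Lhs& lhs, const Rhs& rhs) {
  constexpr Eigen::Index kLazyThreshold = 32;  // illustrative cut-off only
  if (lhs.rows() <= kLazyThreshold && rhs.cols() <= kLazyThreshold) {
    dst.noalias() = lhs.lazyProduct(rhs);  // allocation-free path
  } else {
    dst.noalias() = lhs * rhs;             // let Eigen's GEMM kernel handle it
  }
}
```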

@josechenf

Could this be an option for the user?

In massively multi-threaded environments, memory allocations can cause a huge performance drop, as they often block other threads from allocating memory at the same time.
Hard real-time/embedded applications may also run into difficulties, as the computation time may no longer be deterministic.
