[SharedCache] Optimize ReadExportNode #6322

Open
bdash wants to merge 1 commit into dev
Conversation

bdash (Contributor) commented Jan 15, 2025

`ReadExportNode` is called frequently during the initial load of the
shared cache and thus impacts how long it takes for the shared cache UI
to become responsive. This is a collection of optimizations that cut
the time spent within `ReadExportNode` by 50%:

1. Pass iterators to `ReadExportNode` rather than a `DataBuffer` + offset.
   The lack of inlining in `DataBuffer`'s `operator[]` kills performance
   (see the first sketch after this list). Ideally this would have used
   `std::span`, but that would require bumping the minimum C++ version to
   C++20.
2. Add `MMappedFileAccessor::ReadSpan` so that `ReadExportNode` can
   operate directly on the mapped data without first copying it (see the
   second sketch after this list).
3. Remove a call to `GetAnalysisFunctionsForAddress` whose result was
   unused.
4. Use `std::find` to find the nul at the end of strings rather than
   assembling the string a character at a time. This avoids repeatedly
   growing the string.
5. Avoid the usual `Symbol` constructor in favor of `BNCreateSymbol`.
   The `Symbol` constructor has overhead in two forms:
   1. It has a `NameSpace` as an argument. Creating and destroying this
      allocates and deallocates memory. We could create a single
      instance and reuse it for all calls, but...
   2. The constructor copies all fields of the `NameSpace` to the heap
      before calling `BNCreateSymbol` and then deallocates them
      afterwards. This seems unnecessary, and adds a non-trivial amount
      of overhead.

   This can go back to directly constructing the `Symbol` once the
   constructors are improved.
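
A minimal sketch of items 1 and 4, not the PR's actual code: the parser
takes a plain iterator pair instead of a `DataBuffer` + offset, and
`std::find` locates the NUL terminator in one step rather than appending
a character at a time. The names `ByteIter`, `ReadCString`, and
`ParseExportEntry` are hypothetical.

```cpp
// Illustrative only; not the code added by this PR.
#include <algorithm>
#include <cstdint>
#include <string>

using ByteIter = const uint8_t*;

// Read a NUL-terminated string starting at `cur` in a single step instead
// of appending byte by byte, which repeatedly grows the destination string.
static std::string ReadCString(ByteIter cur, ByteIter end)
{
    ByteIter nul = std::find(cur, end, uint8_t(0));
    return std::string(reinterpret_cast<const char*>(cur),
                       reinterpret_cast<const char*>(nul));
}

// Hypothetical caller: the parser receives raw iterators into the mapped
// bytes, so every access is a plain pointer dereference rather than a call
// through DataBuffer's non-inlined operator[].
static void ParseExportEntry(ByteIter cur, ByteIter end)
{
    std::string name = ReadCString(cur, end);
    // ... decode the export's flags and address here ...
    (void)name;
}
```

With C++20 available, the `(cur, end)` pair could collapse into a single
`std::span<const uint8_t>` parameter, which is the alternative the
description above rules out for now.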
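For item 2, a rough idea of what a copy-free accessor can look like. The
actual `MMappedFileAccessor::ReadSpan` added by the PR may have a
different signature; the pointer/length return type and the class name
below are assumptions for illustration.

```cpp
// Illustrative only; the real MMappedFileAccessor::ReadSpan may differ.
// The point is that the caller gets a view into the already-mapped pages
// instead of a DataBuffer that copies them first.
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <utility>

class MappedFileSketch
{
    const uint8_t* m_data = nullptr;  // start of the mapped file
    size_t m_length = 0;              // mapped length in bytes

public:
    MappedFileSketch(const uint8_t* data, size_t length)
        : m_data(data), m_length(length) {}

    // Bounds-checked view over [offset, offset + length); no allocation,
    // no memcpy. The caller reads the mapped pages directly.
    std::pair<const uint8_t*, size_t> ReadSpan(size_t offset, size_t length) const
    {
        if (offset > m_length || length > m_length - offset)
            throw std::out_of_range("ReadSpan out of bounds");
        return {m_data + offset, length};
    }
};
```
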
plafosse requested a review from 0cyn on January 22, 2025 at 13:53