Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

i#7113: Add analyzer library to cache instr decode info #7114

Open
wants to merge 10 commits into
base: master
Choose a base branch
from

Conversation

abhinav92003
Copy link
Contributor

@abhinav92003 abhinav92003 commented Dec 9, 2024

Adds a new library to cache information about decoded instructions. This can be used by analysis tools that need to decode the instr encodings in the trace, to avoid overhead of redundant decodings which can get expensive.

The library allows the tools to specify what information they need to cache. Also, it uses instr_noalloc_t when possible to reduce heap usage and allocation/deallocation overhead.

Refactors the invariant checker tool to use this library.

Issue: #7113

Adds a new library to cache information about decoded instructions. This can be
used by analysis tools that need to decode the instr encodings in the trace.

The library allows the tools to specify what information they need to cache.

Refactors the invariant checker tool to use this library.

Issue: #7113
@abhinav92003 abhinav92003 changed the title i#7113: Add library to cache information about decoded instructions i#7113: Add analyzer library to cache instr decode info Dec 10, 2024
clients/drcachesim/tools/instr_decode_cache.cpp Outdated Show resolved Hide resolved
clients/drcachesim/tests/instr_decode_cache_test.cpp Outdated Show resolved Hide resolved
clients/drcachesim/tests/instr_decode_cache_test.cpp Outdated Show resolved Hide resolved
clients/drcachesim/tools/instr_decode_cache.h Outdated Show resolved Hide resolved
clients/drcachesim/tools/instr_decode_cache.h Outdated Show resolved Hide resolved
clients/drcachesim/tools/invariant_checker.cpp Outdated Show resolved Hide resolved
clients/drcachesim/tools/invariant_checker.cpp Outdated Show resolved Hide resolved
clients/drcachesim/tools/invariant_checker.cpp Outdated Show resolved Hide resolved
clients/drcachesim/tools/invariant_checker.h Outdated Show resolved Hide resolved
* they need.
*
* Derived classes must provide a default constructor to make objects
* corresponding to an invalid decoding.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you looked at opcode_mix and ensured this interface works well there?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

opcode_mix just needs the opcode and category, so yes it should work there too. The only complexity with opcode mix is that it still tries to handle traces without encodings, so opcode_mix will still need to have decoding logic for that case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a TODO for now.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If no-encoding traces are still a possibility (very much so w/ our public v1 out there): wouldn't you expect a decoding helper library to include the messy logic to handle those? If I were a user I would not be happy with a library that only works on some traces when the code to handle all traces is sitting there in opcode_mix: I would think "why isn't that in this library".

If we aggressively deprecate (as in delete) public v1, and can make the claim no one should encounter an encoding-free trace, then not tackling that in this library seems reasonable.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we aggressively deprecate (as in delete) public v1, and can make the claim no one should encounter an encoding-free trace, then not tackling that in this library seems reasonable.

But if we do that, we should remove the module mapping code from opcode_mix, right? I guess for me it feels weird to have opcode_mix have that code yet this library not handle that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Supporting backward-compatibility is definitely useful. But that is not as clear when it comes at a non-trivial cost to the new interfaces and the user.

Including the module mapping logic in instr_decode_cache_t comes with more additions to the API that are not trivial, particularly: providing the file type (which isn't available at construction time, so the user would need to remember to provide that for instr_decode_cache_t to figure out what decode logic to use), also the module map init logic is not trivial either.

Since trace encodings have been around 2+ years, and we have the public v2 also coming up very soon, it doesn't seem unreasonable to me for new convenience libraries (decoding is still possible without instr_decode_cache_t) to make assumptions about availability of trace encodings.

However: I agree that it is important to let the user know in case they're using instr_decode_cache_t on an old trace. We could perhaps add some heuristic to detect that: many consecutive decodes all with zeroes?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That said, if we're unable to resolve concerns about potential misuses of instr_decode_cache_t, it may be prudent to instead add it only to private libraries where we can make stronger assumptions about trace versions.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could perhaps add some heuristic to detect that: many consecutive decodes all with zeroes?

I think we don't need a heuristic. Instead: since the reader sets encoding_is_new when it sees any encoding

cur_ref_.instr.encoding_is_new = true;

maybe we can just use that: add to the decode cache only if encoding_is_new. That way, the cache would return nullptr for all instrs for traces without encodings, which is easy enough to spot.

@abhinav92003
Copy link
Contributor Author

Decided to try out an alternate way to support module-mapper-decoding in instr_decode_cache_t that came out of offline discussion. Okay to hold off on the re-review until then (Cannot undo re-request review)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants