i#7113: Add analyzer library to cache instr decode info #7114

abhinav92003 · 2024-12-09T21:54:14Z

Adds a new library to cache information about decoded instructions. This can be used by analysis tools that need to decode the instr encodings in the trace, to avoid overhead of redundant decodings which can get expensive.

The library allows the tools to specify what information they need to cache. Also, it uses instr_noalloc_t when possible to reduce heap usage and allocation/deallocation overhead.

Refactors the invariant checker tool to use this library.

Issue: #7113

Adds a new library to cache information about decoded instructions. This can be used by analysis tools that need to decode the instr encodings in the trace. The library allows the tools to specify what information they need to cache. Refactors the invariant checker tool to use this library. Issue: #7113

clients/drcachesim/tools/instr_decode_cache.cpp

clients/drcachesim/tests/instr_decode_cache_test.cpp

clients/drcachesim/tools/instr_decode_cache.h

clients/drcachesim/tools/invariant_checker.cpp

clients/drcachesim/tools/invariant_checker.h

derekbruening · 2024-12-10T16:50:34Z

clients/drcachesim/tools/instr_decode_cache.h

+ * they need.
+ *
+ * Derived classes must provide a default constructor to make objects
+ * corresponding to an invalid decoding.


Have you looked at opcode_mix and ensured this interface works well there?

opcode_mix just needs the opcode and category, so yes it should work there too. The only complexity with opcode mix is that it still tries to handle traces without encodings, so opcode_mix will still need to have decoding logic for that case.

Added a TODO for now.

If no-encoding traces are still a possibility (very much so w/ our public v1 out there): wouldn't you expect a decoding helper library to include the messy logic to handle those? If I were a user I would not be happy with a library that only works on some traces when the code to handle all traces is sitting there in opcode_mix: I would think "why isn't that in this library".

If we aggressively deprecate (as in delete) public v1, and can make the claim no one should encounter an encoding-free trace, then not tackling that in this library seems reasonable.

If we aggressively deprecate (as in delete) public v1, and can make the claim no one should encounter an encoding-free trace, then not tackling that in this library seems reasonable.

But if we do that, we should remove the module mapping code from opcode_mix, right? I guess for me it feels weird to have opcode_mix have that code yet this library not handle that.

Supporting backward-compatibility is definitely useful. But that is not as clear when it comes at a non-trivial cost to the new interfaces and the user.

Including the module mapping logic in instr_decode_cache_t comes with more additions to the API that are not trivial, particularly: providing the file type (which isn't available at construction time, so the user would need to remember to provide that for instr_decode_cache_t to figure out what decode logic to use), also the module map init logic is not trivial either.

Since trace encodings have been around 2+ years, and we have the public v2 also coming up very soon, it doesn't seem unreasonable to me for new convenience libraries (decoding is still possible without instr_decode_cache_t) to make assumptions about availability of trace encodings.

However: I agree that it is important to let the user know in case they're using instr_decode_cache_t on an old trace. We could perhaps add some heuristic to detect that: many consecutive decodes all with zeroes?

That said, if we're unable to resolve concerns about potential misuses of instr_decode_cache_t, it may be prudent to instead add it only to private libraries where we can make stronger assumptions about trace versions.

We could perhaps add some heuristic to detect that: many consecutive decodes all with zeroes?

I think we don't need a heuristic. Instead: since the reader sets encoding_is_new when it sees any encoding

dynamorio/clients/drcachesim/reader/reader.cpp

Line 231 in eec4721

cur_ref_.instr.encoding_is_new = true;

maybe we can just use that: add to the decode cache only if encoding_is_new. That way, the cache would return nullptr for all instrs for traces without encodings, which is easy enough to spot.

abhinav92003 · 2024-12-11T02:37:40Z

Decided to try out an alternate way to support module-mapper-decoding in instr_decode_cache_t that came out of offline discussion. Okay to hold off on the re-review until then (Cannot undo re-request review)

abhinav92003 added 7 commits December 9, 2024 16:53

Docx improvement, and handle regdeps branch_target case.

abebffc

Use instr_noalloc_t where possible.

18f7028

Remove redundant test.

4487168

move impl to cpp

41595eb

Move impl to cpp

d2e94c7

Cleanup and aarch64 mov fix.

f0f8a74

abhinav92003 changed the title ~~i#7113: Add library to cache information about decoded instructions~~ i#7113: Add analyzer library to cache instr decode info Dec 10, 2024

Fix windows bug

a1b1d63

abhinav92003 requested a review from derekbruening December 10, 2024 03:05

derekbruening reviewed Dec 10, 2024

View reviewed changes

abhinav92003 added 2 commits December 10, 2024 16:43

Reviewer suggested changes

db8a3ad

Cleanup

1e810b5

abhinav92003 requested a review from derekbruening December 11, 2024 02:13

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

i#7113: Add analyzer library to cache instr decode info #7114

i#7113: Add analyzer library to cache instr decode info #7114

abhinav92003 commented Dec 9, 2024 •

edited

Loading

derekbruening Dec 10, 2024

abhinav92003 Dec 10, 2024

abhinav92003 Dec 10, 2024

derekbruening Dec 10, 2024

derekbruening Dec 10, 2024

abhinav92003 Dec 10, 2024

abhinav92003 Dec 10, 2024

abhinav92003 Dec 11, 2024

abhinav92003 commented Dec 11, 2024

i#7113: Add analyzer library to cache instr decode info #7114

Are you sure you want to change the base?

i#7113: Add analyzer library to cache instr decode info #7114

Conversation

abhinav92003 commented Dec 9, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

abhinav92003 commented Dec 11, 2024

abhinav92003 commented Dec 9, 2024 •

edited

Loading