Extend HLO metadata to include class hierarchy information #5715
Conversation
Hi,
I would like to learn more about the intended use case of this feature:
- What / who would be the consumers of this new information?
- How is it consumed? Is it consumed by humans, or by another downstream system that expects a very specific format?
I've responded to the specific code feedback above. In response to the top-level question: this (debug) metadata is consumed first by the compiler, which maintains "backwards pointers" to metadata through the lowering process. Ultimately it gets embedded in an executable archive, allowing a profiler to perform reverse lookups and lay out approximately where in the model we are at a given point in the execution (note that due to compilation there will be points in time where multiple code structures are running at once). This can be particularly useful for optimization, where a user can relate a bottleneck at the machine instruction level back to a class, a variable name, and ultimately a full call stack (discussed in OpenXLA, but we still need to upstream an implementation since it requires changes to the protobuf layout). Presumably other tools (static analysis and the like) could also parse the HLO graph and look back at the source code to perform other tasks.
Hi @mrnikwaws,

My proposal: we can reduce the scope of the PR to just the first piece, i.e. say we add a Python function `attach_custom_op_name`. Suppose you want that API to be called after every ATen op of a user's PyTorch program, without the user calling it directly. Then you can use a tensor subclass (see examples at https://github.com/albanD/subclass_zoo) wrapping XLATensor and call `attach_custom_op_name` for the users. It can look like the following:
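(The code snippet from the original comment was not captured in this page; what follows is a minimal sketch of the idea, assuming a hypothetical `attach_custom_op_name(tensor, name)` helper; the exact entry point is an assumption.)

```python
import torch
from torch.utils._python_dispatch import TorchDispatchMode
from torch.utils._pytree import tree_flatten


class OpNameAnnotator(TorchDispatchMode):
    """Tags every XLA tensor produced by an ATen op with a custom op name."""

    def __torch_dispatch__(self, func, types, args=(), kwargs=None):
        out = func(*args, **(kwargs or {}))
        # Tag each tensor output with the op that produced it.
        for leaf in tree_flatten(out)[0]:
            if isinstance(leaf, torch.Tensor) and leaf.device.type == "xla":
                attach_custom_op_name(leaf, str(func))  # hypothetical helper
        return out
```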
And the end user can do
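(Again, the original snippet was not captured; a sketch assuming the `OpNameAnnotator` mode above:)

```python
import torch
import torch_xla.core.xla_model as xm

device = xm.xla_device()
x = torch.randn(4, 4, device=device)

# Every ATen op dispatched inside this context is annotated automatically,
# without the model code calling attach_custom_op_name itself.
with OpNameAnnotator():
    y = torch.nn.functional.relu(x @ x)
```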
This way both the custom torch dispatch mode and the code to compute the stack trace can remain out of tree.

===

My rationale: either 1) something is part of the standard output of torch_xla's produced HLO, in which case different backend compilers have to agree on its format and on BC/FC guarantees; or 2) something is custom info that will only be consumed by a particular compiler. It seems to me that this falls into category 2), and therefore it should not be emitted by default. So instead, I propose a way that enables you to do what you want, while at the same time enabling other compilers to attach similar but different information for their backends.
This looks like it could work. We already use a Python context for profiling, though it will make some use cases (where folks just use env variables to lower debug HLO) a little more complex. However, I think I can manage that. I wasn't aware that we could plug into the dispatcher like this (thanks!).

I'm also unclear on how you can connect information from a returned tensor (or tensors in some container) to the related XlaNode in XLA's in-memory graph. Assuming you can describe such a mechanism: I am using `UserMetaData` on each `torch::lazy::Node` (https://github.com/pytorch/pytorch/blob/main/torch/csrc/lazy/core/ir_metadata.h#L24-L28), which is freeform metadata on each node that is then lowered once the node is realized. This is needed since torch lazy Nodes, rather than XlaNodes, are passed to the lowering context. If we keep the same mechanism, my existing `ExtendedFrameInfo` struct would minimally become a string. Is this what you were imagining?

If you could help me with an approximate version of how and where to achieve the "custom_op_name" piece (i.e. attaching metadata for an XlaNode from a returned tensor), I'd be OK building it locally and testing. Some of my other comments still stand (e.g. efficient string encoding in the protobuf), but if we can get to parity with the op_name string for now, that solves the core requirement. Likely these other improvements belong in an RFC.
You can actually get the stacktrace from Python (without querying XLANode):
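(A minimal sketch using the standard library's `traceback` module; the original snippet and its sample output were not captured here.)

```python
import traceback

def capture_call_site():
    # extract_stack() returns a StackSummary of FrameSummary objects,
    # each carrying a filename, line number, and function name.
    return traceback.extract_stack()

for frame in capture_call_site():
    print(f"{frame.filename}:{frame.lineno} in {frame.name}")
```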
This gives the sequence of frames (file, line, and function) leading up to the point where the op was dispatched.
So basically we have the location where a particular ATen op is called. You probably want to drop the last frame, as that is the one that points inside the dispatch wrapper itself. Does this help?
As discussed offline, I'm working on the refactor now.
Refactor completed; will correct lint errors shortly.
Fix merge issues
A few nits; overall pretty solid. Thanks!
@@ -0,0 +1,90 @@
import sys
Please add a line in run_test.sh, otherwise this test won't be run by default.
torch_xla/csrc/tensor.cpp (Outdated)

@@ -897,4 +897,17 @@ void XLATensor::MarkDynamicDimension(uint32_t dim) {
  xla_node->MarkDynamicDimension(dim);
}

void XLATensor::SetCustomOpName(const std::string& op_name) {
  auto* xla_node = dynamic_cast<XlaNode*>(CurrentIrValue().node.get());
  if (xla_node != nullptr) xla_node->SetCustomOpName(op_name);
if (xla_node != nullptr) {
  xla_node->SetCustomOpName(op_name);
}
torch_xla/csrc/tensor.cpp (Outdated)

const std::string& XLATensor::GetCustomOpName() const {
  auto* xla_node = dynamic_cast<XlaNode*>(CurrentIrValue().node.get());
  if (xla_node != nullptr)
Let's wrap all the branches with {}
* Add python binding to allow custom op_name metadata for lowered HLO
Summary
This change adds extended metadata to lowered HLO, along with JSON export of HLO.
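A hedged usage sketch (the Python entry point shown is an assumption based on the `XLATensor::SetCustomOpName` method reviewed above; the real binding name may differ):

```python
import torch
import torch_xla
import torch_xla.core.xla_model as xm

device = xm.xla_device()
t = torch.ones(2, 2, device=device)

# Hypothetical binding mirroring XLATensor::SetCustomOpName; the tag
# should then appear in the op_name metadata of the lowered HLO.
torch_xla._XLAC._set_custom_op_name(t, "MyModule.forward")
```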
Notes
Sample HLO as JSON
Limitations
Testing