#8962: modify output_mem_config to be optional in binary ops #9342
Conversation
(force-pushed 36cae55 → 4de237e, then 4de237e → aa56c6f)
```cpp
if (this->in_place && this->output_mem_config.has_value()) {
    TT_FATAL(input_tensor_a.memory_config().memory_layout == this->output_mem_config.value().memory_layout);
```
We probably should have gotten rid of `in_place` before this mem_config change, since that would simplify the further changes.
okay @tt-aho
I hope everyone is okay with removing the `in_place` flag from binary ops and replacing `tt_lib.operations.primary.add` / `ttnn.experimental.operations.primary.add`?
I'll remove the inline Tensor `add` and use binary `add()` with `output_tensor`, i.e. `tt_lib.tensor.add` / `ttnn.experimental.tensor.add`. This will involve changes in the files below; shall I go ahead with it (and should someone be informed)?
- models/demos/t3000/falcon40b/tt/falcon_decoder.py
- models/demos/resnet/tt/metalResnetBlock50.py
- models/demos/t3000/llama2_70b/tt/llama_decoder_galaxy.py
- models/demos/t3000/llama2_70b/tt/llama_decoder_optimized.py
- models/experimental/llama2_70b/tt/llama_decoder_galaxy.py
- models/experimental/llama2_70b/tt/llama_decoder_optimized.py
- models/demos/falcon7b/tests/unit_tests/test_falcon_matmuls_and_bmms_with_mixed_precision.py
```diff
@@ -306,7 +306,7 @@ inline Tensor add(
     EltwiseBinary{
         BinaryOpType::ADD,
         fused_activations,
-        output_mem_config,
+        output_mem_config.value(),
```
I don't think we should be doing `.value()` here.
Is the plan to change the default value for `output_mem_config` to `nullopt` instead of `operation::DEFAULT_OUTPUT_MEMORY_CONFIG`?
I think here we should either:
- pass `output_mem_config` as-is, or
- pass `output_mem_config.value_or(whatever we decide the default should be)`
@eyonland do you have thoughts on this? Should we determine the default memconfig to use when none is provided here, or within the op struct fns?
Yes, the default value of `output_mem_config` will be `nullopt`. I was thinking of assigning it `operation::DEFAULT_OUTPUT_MEMORY_CONFIG` if both `output_tensor` and `output_mem_config` are `nullopt`.
I have an earlier comment from @eyonland regarding this; are you okay with this too?
"If the optional output tensor is provided, then it should be used instead of the dtype or output memconfig, which should not be required but made optional going forward.
The flow should be:
Does an optional output tensor actually exist that is NOT a promise? If so, ignore whatever the user provided for dtype and memconfig.
If dtype and memconfig are provided, use them; if not, use the dtype and memconfig of the first input tensor as the default(s)."
So I'm thinking of adding an if block inside `run_eltwise_binary` that adjusts `output_mem_config` per the conditions above before passing it to the `EltwiseBinary` struct.
(force-pushed aa56c6f → 0574933)
Issue closed: #8962

Modify `output_mem_config` to be `std::optional` in binary ops.

Why this change? For `output_tensor` support: when `output_tensor` is passed, `output_mem_config` is to be ignored.

CI - https://github.com/tenstorrent/tt-metal/actions/runs/9450252404