Fix embedding tests that were previously failing #1128
Conversation
Can you add an embedding test under test/Silicon/TTNN/perf_unit? Other than that, looks great!
 std::optional<int> padToken = std::nullopt;
-::tt::tt_metal::Layout layout = ::ttnn::ROW_MAJOR_LAYOUT;
+::tt::tt_metal::Layout layout = utils::isTilized(op->out())
+                                    ? ::ttnn::TILE_LAYOUT
+                                    : ::ttnn::ROW_MAJOR_LAYOUT;
 auto embeddingsType = ::ttnn::operations::embedding::EmbeddingsType::GENERIC;
-::ttnn::DataType outputDataType = utils::getDataType(op->output());
+::ttnn::DataType outputDataType = utils::getDataType(op->out());
 ::ttnn::MemoryConfig outputMemoryConfig =
-    utils::createMemoryConfig(op->output());
+    utils::createMemoryConfig(op->out());
 ::ttnn::Tensor out =
     ::ttnn::embedding(input, weight, padToken, layout, embeddingsType,
                       outputDataType, outputMemoryConfig);
-tensorPool.insert_or_assign(op->output()->global_id(), out);
+tensorPool.insert_or_assign(op->out()->global_id(), out);
Oh, this is an interesting problem we didn't foresee... So in order to supply the destination arg, we need to supply all the other optional parameters?
In the TTNN Defaults Design Doc you mentioned that:
In TTNN they include all explicit layout info in the parameters, but if the optional DPS output tensor is provided, all the explicit parameters get ignored and they extract all information from the output tensor.
Now I understand what you were talking about... We should raise this with them; it seems like bad API design. The dest arg and the other optional args should live in two separate overloads.
@sdjordjevicTT @nsmithtt you guys have regular syncs with the TTNN folks? Is this something we could ask them to take care of?
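For illustration, here's a hedged sketch of the two-overload split I have in mind (hypothetical signatures, not the current TTNN API):

// Hypothetical overload 1: explicit output parameters, no dest arg.
Tensor embedding(const Tensor &input, const Tensor &weight,
                 const std::optional<int> &pad_token = std::nullopt,
                 const Layout &layout = ttnn::ROW_MAJOR_LAYOUT,
                 const std::optional<DataType> &dtype = std::nullopt,
                 const std::optional<MemoryConfig> &memory_config = std::nullopt);

// Hypothetical overload 2: DPS-style dest arg only; layout, dtype, and
// memory config are all derived from the preallocated output tensor, so
// the two sources of truth can never conflict.
Tensor embedding(const Tensor &input, const Tensor &weight, Tensor &output,
                 const std::optional<int> &pad_token = std::nullopt);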
@svuckovicTT yeah exactly. Seems like this is inconsistent as well. Looking at binary.cpp, we ignore memory_config if we pass in the optional output tensor (this is the point I was making in the document):
inline Tensor binary_impl(
    uint8_t queue_id,
    BinaryOpType binary_op_type,
    const ttnn::Tensor &input_tensor,
    const float scalar,
    const std::optional<ttnn::MemoryConfig> &memory_config = std::nullopt,
    const std::optional<Tensor> &optional_output_tensor = std::nullopt) {
  // output_memory_config is extracted from optional_output_tensor when possible
  auto output_memory_config =
      optional_output_tensor.has_value()
          ? optional_output_tensor.value().memory_config()
          : memory_config.value_or(input_tensor.memory_config());
However in embedding, we ignore the optional_output_tensor completely:
// optional_output_tensor is unused in this function
struct EmbeddingOperation {
  static inline Tensor invoke(
      uint8_t queue_id,
      const Tensor &input_tensor_arg,
      const Tensor &weight_arg,
      const std::optional<int> &pad_token = std::nullopt,
      const Layout &layout = ttnn::ROW_MAJOR_LAYOUT,
      EmbeddingsType embeddings_type = EmbeddingsType::GENERIC,
      const std::optional<const DataType> dtype = std::nullopt,
      const std::optional<MemoryConfig> &memory_config = std::nullopt,
      std::optional<Tensor> optional_output_tensor = std::nullopt) {
    if (pad_token.has_value()) {
      embeddings_type = EmbeddingsType::PADDED;
    }
    auto hidden_embedding_dim = weight_arg.get_shape()[-1];
    auto padded_hidden_embedding_dim =
        weight_arg.get_shape().with_tile_padding()[-1];
    auto weight = ttnn::unsqueeze_to_4D(weight_arg);
    auto batch_size = input_tensor_arg.get_shape()[0];
    auto sentence_size = input_tensor_arg.get_shape()[-1];
    auto input_tensor = ttnn::reshape(
        input_tensor_arg,
        ttnn::SimpleShape{std::array<uint32_t, 4>{batch_size, 1, 1, sentence_size}});
    bool tilized = layout == ttnn::TILE_LAYOUT;
    auto embeddings =
        operation::run(
            Embeddings{
                .output_mem_config = memory_config.value_or(input_tensor.memory_config()),
                .tilized = tilized,
                .embeddings_type = embeddings_type,
                .pad_token = pad_token,
                .output_dtype = dtype.value_or(weight.get_dtype())},
            {input_tensor, weight})
            .at(0);
    embeddings = ttnn::reshape(
        embeddings,
        ttnn::SimpleShape{std::array<uint32_t, 3>{batch_size, sentence_size, hidden_embedding_dim}});
    return embeddings;
  }
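To make that concrete, a hedged sketch based on the invoke() signature quoted above (the preallocated tensor and its setup are hypothetical):

// Sketch: pass a preallocated tensor through the optional dest arg.
std::optional<Tensor> preallocated = /* tensor with its own layout/dtype/memcfg */;
Tensor result = EmbeddingOperation::invoke(
    /*queue_id=*/0, input, weight,
    /*pad_token=*/std::nullopt,
    /*layout=*/ttnn::ROW_MAJOR_LAYOUT,
    /*embeddings_type=*/EmbeddingsType::GENERIC,
    /*dtype=*/std::nullopt,
    /*memory_config=*/std::nullopt,
    /*optional_output_tensor=*/preallocated);
// `preallocated` is never read or written above: the output is built purely
// from the explicit parameters and returned as a fresh tensor.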
Now I understand what you were talking about... We should raise this with them; it seems like bad API design. The dest arg and the other optional args should live in two separate overloads.
So my understanding is that you supply either 1 or 2:
1. Explicit parameters, like output_dtype / output_memory_config / etc.
2. An output tensor (DPS style)
If you supply 2, then 1 is ignored. That makes sense to me; what are they supposed to do if you supply both?
@nsmithtt so in eltwise binary it ignores 1. and uses everything in 2. if you provide both.
However, looking at embedding, it completely ignores 2. whether or not you provide it, and uses 1. exclusively.
So it's currently inconsistent across ops. My other question in the document was whether we would want to model our ops like this as well, since the current goal is to match the modelling of TTNN. Currently, a lot of our ops (for example, eltwise binary) only provide 2. and don't have the option to provide 1. in the tablegen/flatbuffer schemas.
I think we probably want to model it how TTNN does, although that does feel a bit cumbersome. I'd like to get thoughts from @sdjordjevicTT and @svuckovicTT; I'm not sure how else we could model it, though.
@nsmithtt I agree: it makes the parameters explicit, while the output tensor also needs to carry this information because the consumer op will likely need it. One caveat: if we ever force anything, we need to change both the op params and the output tensor. We previously hit an issue in the compiler where forcing row_major/tile updated the op params but not the tensor info.
So my understanding is that you supply either 1 or 2
AFAIK, in C++, you can't supply just 2: if you want to supply 2, you need to supply 1 as well, given they're all optional. All the non-dest optional parameters come before the optional dest param. This makes it cumbersome to use DPS, because you then need to pass placeholder arguments that won't be used. (Sometimes.)
Having two separate overloads here is the solution, though I don't expect we'll get that anytime soon, if ever. Can we talk to them and see if they're willing to canonicalize on respecting the dest param's properties? That would save us from having to specialize each op.
Otherwise, I don't see us being able to do anything besides modelling what TTNN does.
AFAIK, in C++, you can't supply just 2: if you want to supply 2, you need to supply 1 as well, given they're all optional. All the non-dest optional parameters come before the optional dest param. This makes it cumbersome to use DPS, because you then need to pass placeholder arguments that won't be used. (Sometimes.)
Not sure I'm following. If you had the API:
Tensor add(Tensor a, Tensor b, std::optional<DType> output_dtype = std::nullopt, std::optional<Tensor> out = std::nullopt);
Then for cases 1 and 2 we'd do, respectively:
add(a, b, DType::Float16, std::nullopt);
add(a, b, std::nullopt, out);
Not sure I'm following
Of course, it's because I'm not making any sense... When I said optional parameters, I was thinking of default parameters, overlooking that optional params can be explicitly supplied as nullopts. Sincere apologies for wasting time!
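For posterity, the distinction in a self-contained sketch (toy signatures, not TTNN code):

#include <optional>

// Plain default parameters are positional: to supply `out` you must also
// restate every parameter before it, even if you only want its default.
void f(int a, int dtype = 0, int out = -1);
// f(1, 0, 42);  // dtype has to be spelled out just to reach `out`

// std::optional parameters can be explicitly "not supplied" via nullopt,
// which is what the add() example above relies on.
void g(int a, std::optional<int> dtype = std::nullopt,
       std::optional<int> out = std::nullopt);
// g(1, std::nullopt, 42);  // dtype genuinely left unset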
I think we probably want to model it how TTNN does, although that does feel a bit cumbersome.
I'm partial to modelling TTNN accurately as well. It'll simplify the runtimes: no need to work out what was supplied and what wasn't; just "copy-paste" parameters from the IR to the API call (with some default constructors for nullopts, etc.). @sdjordjevicTT anything to add?
I was already thinking of adapting an op here and there for the emitc path, just to make things smoother; wondering if it's okay if we roll this out on a per-op, need-to-have basis.
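Something like this helper is what I'd imagine for the "default constructors for nullopts" part (hedged sketch; fromIR and the attribute accessor are made-up names):

// Hypothetical: turn an absent IR attribute into std::nullopt so each
// runtime handler is a mechanical copy of IR params into the TTNN call.
template <typename T, typename Attr>
std::optional<T> fromIR(const Attr *attr) {
  return attr ? std::optional<T>(attr->value()) : std::nullopt;
}
// e.g. ::ttnn::embedding(input, weight, fromIR<int>(op->pad_token()), ...);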
TTNN embedding requires bf16 and ROW_MAJOR input tensors, so I added EmbeddingOp as a case in shouldForceRowMajor and updated all embedding tests to use bf16. Re-enabled the embedding silicon tests.
Also made EmbeddingOp DPS to match the TTNN implementation.
FYI @ddilbazTT
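For reference, a hedged sketch of the shouldForceRowMajor change described above (the function exists in the compiler; this exact body is illustrative, and other row-major-only cases are elided):

static bool shouldForceRowMajor(::mlir::Operation *op) {
  // TTNN embedding requires bf16, ROW_MAJOR inputs, so tensors feeding an
  // EmbeddingOp must stay untilized; other forced ops elided here.
  return ::mlir::isa<::mlir::tt::ttnn::EmbeddingOp>(op);
}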