Fix embedding tests that were previously failing #1128

Merged: @jnie-TT merged 1 commit into main from jnie/embedding_fix on Nov 4, 2024

Conversation

@jnie-TT jnie-TT (Contributor) commented Oct 31, 2024

TTNN embedding requires bf16, ROW_MAJOR input tensors, so I added embeddingOp as a case in shouldForceRowMajor and updated all embedding tests to use bf16. Re-enabled the embedding silicon tests.

Also made embeddingOp DPS to match the ttnn implementation.
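
A minimal sketch of the compiler-side idea, assuming a helper shaped roughly like the one named above (the signature, namespace, and the elided cases here are illustrative, not the actual tt-mlir code):

// Illustrative sketch only: embedding joins the set of ops whose inputs are
// forced to ROW_MAJOR, since ttnn::embedding requires row-major bf16 inputs.
static bool shouldForceRowMajor(mlir::Operation *op) {
  // ... existing cases for other row-major-only ops ...
  if (mlir::isa<ttir::EmbeddingOp>(op)) {
    return true;
  }
  return false;
}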

FYI @ddilbazTT

@tapspatel tapspatel (Contributor) left a comment

Can you add an embedding test under test/Silicon/TTNN/perf_unit? Other than that, looks great!

@jnie-TT jnie-TT force-pushed the jnie/embedding_fix branch from 323f10c to d2c8cf5 on November 1, 2024 at 01:59
@jnie-TT jnie-TT force-pushed the jnie/embedding_fix branch from d2c8cf5 to e798a17 on November 1, 2024 at 20:42
Comment on lines 20 to +31
  std::optional<int> padToken = std::nullopt;
- ::tt::tt_metal::Layout layout = ::ttnn::ROW_MAJOR_LAYOUT;
+ ::tt::tt_metal::Layout layout = utils::isTilized(op->out())
+                                     ? ::ttnn::TILE_LAYOUT
+                                     : ::ttnn::ROW_MAJOR_LAYOUT;
  auto embeddingsType = ::ttnn::operations::embedding::EmbeddingsType::GENERIC;
- ::ttnn::DataType outputDataType = utils::getDataType(op->output());
+ ::ttnn::DataType outputDataType = utils::getDataType(op->out());
  ::ttnn::MemoryConfig outputMemoryConfig =
-     utils::createMemoryConfig(op->output());
+     utils::createMemoryConfig(op->out());
  ::ttnn::Tensor out =
      ::ttnn::embedding(input, weight, padToken, layout, embeddingsType,
                        outputDataType, outputMemoryConfig);
- tensorPool.insert_or_assign(op->output()->global_id(), out);
+ tensorPool.insert_or_assign(op->out()->global_id(), out);
A contributor commented:

Oh this is an interesting problem we didn't foresee... So in order to supply the destination arg, we need to supply all the other optional parameters?

In the TTNN Defaults Design Doc you mentioned that:

> In TTNN they include all explicit layout info in the parameters, but if the optional dps output tensor is provided, all the explicit parameters get ignored and they extract all information from the output tensor.

Now I understand what you were talking about... We should raise this with them, seems like bad API design. Dest arg and other optional args should live in 2 separate overloads.

@sdjordjevicTT @nsmithtt you guys have regular syncs with ttnn folks? Is this something that we could ask them to take care of?
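
To make the ergonomics concrete, here is a hypothetical DPS-style call, assuming an overload of ::ttnn::embedding that mirrors the invoke signature quoted further down (minus queue_id); this is an illustration, not the verbatim API:

// Hypothetical illustration: to reach the trailing optional_output_tensor,
// every preceding optional parameter has to be spelled out, even though
// `out` already carries layout, dtype, and memory-config information.
::ttnn::Tensor result = ::ttnn::embedding(
    input, weight,
    /*pad_token=*/std::nullopt,
    /*layout=*/::ttnn::ROW_MAJOR_LAYOUT,
    /*embeddings_type=*/::ttnn::operations::embedding::EmbeddingsType::GENERIC,
    /*dtype=*/std::nullopt,
    /*memory_config=*/std::nullopt,
    /*optional_output_tensor=*/out);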

@jnie-TT jnie-TT (Contributor, Author) commented Nov 4, 2024:

@svuckovicTT yeah exactly. Seems like this is inconsistent as well. Looking at binary.cpp, we ignore memory_config if we pass in the optional output tensor (this is the point I was making in the document):

inline Tensor binary_impl(
    uint8_t queue_id,
    BinaryOpType binary_op_type,
    const ttnn::Tensor &input_tensor,
    const float scalar,
    const std::optional<ttnn::MemoryConfig> &memory_config = std::nullopt,
    const std::optional<Tensor> &optional_output_tensor = std::nullopt) {
    // output_memory_config is extracted from optional_output_tensor when possible
    auto output_memory_config = optional_output_tensor.has_value()
                                    ? optional_output_tensor.value().memory_config()
                                    : memory_config.value_or(input_tensor.memory_config());

However in embedding, we ignore the optional_output_tensor completely:

// optional_output_tensor is unused in this function
struct EmbeddingOperation {
    static inline Tensor invoke(
        uint8_t queue_id,
        const Tensor& input_tensor_arg,
        const Tensor& weight_arg,
        const std::optional<int>& pad_token = std::nullopt,
        const Layout& layout = ttnn::ROW_MAJOR_LAYOUT,
        EmbeddingsType embeddings_type = EmbeddingsType::GENERIC,
        const std::optional<const DataType> dtype = std::nullopt,
        const std::optional<MemoryConfig>& memory_config = std::nullopt,
        std::optional<Tensor> optional_output_tensor = std::nullopt) {
        if (pad_token.has_value()) {
            embeddings_type = EmbeddingsType::PADDED;
        }

        auto hidden_embedding_dim = weight_arg.get_shape()[-1];
        auto padded_hidden_embedding_dim = weight_arg.get_shape().with_tile_padding()[-1];
        auto weight = ttnn::unsqueeze_to_4D(weight_arg);

        auto batch_size = input_tensor_arg.get_shape()[0];
        auto sentence_size = input_tensor_arg.get_shape()[-1];
        auto input_tensor =
            ttnn::reshape(input_tensor_arg, ttnn::SimpleShape{std::array<uint32_t, 4>{batch_size, 1, 1, sentence_size}});

        bool tilized = layout == ttnn::TILE_LAYOUT;
        auto embeddings = operation::run(
                              Embeddings{
                                  .output_mem_config = memory_config.value_or(input_tensor.memory_config()),
                                  .tilized = tilized,
                                  .embeddings_type = embeddings_type,
                                  .pad_token = pad_token,
                                  .output_dtype = dtype.value_or(weight.get_dtype())},
                              {input_tensor, weight})
                              .at(0);
        embeddings = ttnn::reshape(
            embeddings, ttnn::SimpleShape{std::array<uint32_t, 3>{batch_size, sentence_size, hidden_embedding_dim}});
        return embeddings;
    }

A contributor commented:

> Now I understand what you were talking about... We should raise this with them, seems like bad API design. Dest arg and other optional args should live in 2 separate overloads.

So my understanding is that you either supply 1 or 2:

  1. Explicit parameters, like output_dtype / output_memory_config / etc.
  2. Output tensor (DPS style)

If you supply 2, then 1 is ignored. That makes sense to me; what are they supposed to do if you supply both?

@jnie-TT jnie-TT (Contributor, Author) commented Nov 4, 2024:

@nsmithtt so in eltwise binary, it ignores 1 and uses everything from 2 if you provide both.
However, looking at embedding, it completely ignores 2 whether or not you provide it, and uses only 1.
So it's currently inconsistent across ops. My other question in the document was whether we would want to model our ops like this as well, since the current goal is to match the modelling of ttnn. Currently, a lot of our ops (for example eltwise binary) only provide 2 and don't have the option to provide 1 in the tablegen/flatbuffer schemas.

A contributor commented:

I think we probably want to model it how TTNN does, although that does feel a bit cumbersome. Would like to get thoughts from @sdjordjevicTT and @svuckovicTT; I'm not sure how else we could model it, though.

@jnie-TT jnie-TT (Contributor, Author) commented:

@nsmithtt I agree. It makes the parameters explicit, while the tensor also needs to carry this information because the consumer op will likely need it. One note: if we ever force anything, we need to change both the op params and the output tensor. We previously hit an issue in the compiler where, when forcing row_major/tile, we updated the op params but not the tensor info.

A contributor commented:

> So my understanding is that you either supply 1 or 2

AFAIK, in C++, you can't supply just 2 - if you want to supply 2, you need to supply 1 as well, given they're both (all) optional. All the non-dest optional parameters come before the optional dest param. This makes it cumbersome to use DPS, because now you need to create random parameters that are not going to be used. (Sometimes.)

Having 2 separate overloads here is the solution, though I don't expect we'll get that anytime soon, if ever. Can we talk to them and see if they're willing to canonicalize to respecting dest param's properties? That would make it easier for us to not have to specialize each op.

Otherwise, I don't see us being able to do anything besides modelling what TTNN does.
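
A sketch of the two-overload split being suggested, using a hypothetical add for brevity (signatures are illustrative, not an existing TTNN API):

// Overload 1: caller supplies explicit output properties.
Tensor add(const Tensor &a, const Tensor &b,
           std::optional<DataType> output_dtype = std::nullopt,
           std::optional<MemoryConfig> memory_config = std::nullopt);

// Overload 2: DPS style; dtype and memory config are read from `out` itself.
// With separate overloads, neither set of parameters can silently be ignored.
Tensor add(const Tensor &a, const Tensor &b, Tensor &out);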

A contributor commented:

> AFAIK, in C++, you can't supply just 2 - if you want to supply 2, you need to supply 1 as well, given they're both (all) optional. All the non-dest optional parameters come before the optional dest param. This makes it cumbersome to use DPS, because now you need to create random parameters that are not going to be used. (Sometimes.)

Not sure I'm following. If you had this API:

Tensor add(Tensor a, Tensor b, optional<DType> output_dtype = nullopt, optional<Tensor> out = nullopt);

Then for each case we'd do respectively:

  1. add(a, b, DType::Float16, nullopt);
  2. add(a, b, nullopt, out);

A contributor commented:

> Not sure I'm following

Of course, it's because I'm not making any sense... When I said optional parameters, I was thinking of default parameters, forgetting that optional params can be supplied as nullopt. Sincere apologies for wasting time!

> I think we probably want to model it how TTNN does, although that does feel a bit cumbersome.

I'm partial to us modelling TTNN accurately as well. It'll simplify runtimes: no need to work around what was supplied and what wasn't, just "copy-paste" parameters from the IR to the API call (with some default constructors for nullopts, etc.). @sdjordjevicTT, anything to add?

I was already thinking of adapting an op here and there for the emitc path, just to make things smoother. I'm wondering if it's okay to roll this out on a per-op, need-to-have basis.

@jnie-TT jnie-TT merged commit 6988418 into main on Nov 4, 2024
18 checks passed