
Add V0 tensor layout generation #518

Merged
merged 10 commits into main from odjuricic/grid-analysis on Sep 6, 2024

Conversation

@odjuricicTT (Contributor) commented Aug 28, 2024

A first implementation of generating possible op configurations, starting with a few sharded tensor layouts for now. Many of the details here are incomplete; this is meant to unblock further optimizer and runtime development.

Waiting on #541 in order to add appropriate TensorMemoryLayout before merge.

Closes #572
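For orientation, the rough shape of what V0 generates is sketched below as a standalone C++ toy; Candidate, MemSpace, and generateV0Candidates are illustrative names, not the PR's LayoutAttr-based API. The idea: one DRAM interleaved layout, one L1 interleaved layout, and a set of L1 block-sharded layouts over candidate grids.

// Standalone sketch only; the real implementation builds LayoutAttr values.
#include <cstdint>
#include <vector>

enum class MemSpace { DRAM, L1 };

struct Candidate {
  MemSpace memSpace;
  int64_t gridH; // 0 means "no grid set", i.e. interleaved
  int64_t gridW;
};

std::vector<Candidate> generateV0Candidates(int64_t maxGridH, int64_t maxGridW) {
  std::vector<Candidate> out;
  out.push_back({MemSpace::DRAM, 0, 0}); // DRAM interleaved
  out.push_back({MemSpace::L1, 0, 0});   // L1 interleaved
  // L1 block-sharded candidates over a 2D grid.
  for (int64_t h = 2; h <= maxGridH; ++h)
    for (int64_t w = 2; w <= maxGridW; ++w)
      out.push_back({MemSpace::L1, h, w});
  return out;
}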

layout.withMemorySpace(op->getContext(), MemorySpace::DeviceL1);

// Block Sharded
for (auto width = 2; width <= analysisInput.maxGrid.getShape()[0]; ++width) {
Contributor

Maybe start with max_width/2. Same for height. Are you sure all of these have a chance to be valid, or do you need to check/factor in the tensor shape first?

Contributor Author

Will implement some basic tensor shape constraint checking on our end.
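One way such a check could look, sketched as plain C++ under the assumption of 32x32 tiles; kTileDim and gridFitsTensor are hypothetical names, not the PR's actual helpers.

#include <cstdint>

constexpr int64_t kTileDim = 32; // assumed tile size, for illustration only

// Keep a candidate grid only if the tile-aligned tensor dims split evenly
// across it; the real constraints may be looser or stricter than this.
bool gridFitsTensor(int64_t rows, int64_t cols, int64_t gridH, int64_t gridW) {
  const int64_t tileRows = (rows + kTileDim - 1) / kTileDim;
  const int64_t tileCols = (cols + kTileDim - 1) / kTileDim;
  return tileRows % gridH == 0 && tileCols % gridW == 0;
}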

// L1 Interleaved (same as above)
LayoutAttr l1Interleaved =
    layout.withMemorySpace(op->getContext(), MemorySpace::DeviceL1);
analysisResult.push_back(l1Interleaved);
Contributor

Is there any attribute that makes this one interleaved compared to sharded L1s below?

Contributor Author

No, the only difference is that interleaved does not have a grid set. Should we make this explicit in LayoutAttr?

Contributor

@nsmithtt Any plans here, what was your intention for representing interleaved modes?

Contributor

@jnie-TT is adding flatbuffer support. Once that's in, I think we can just add a new layout attribute to capture this. Or maybe we should specialize a TTNN layout class? So far it's just one enum, so specializing might be overkill; the metal backend can just assert that this enum is always set to None or something.

Contributor

Great, thanks! Yeah, let's go with a new layout attribute.
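For illustration, the attribute could end up looking roughly like the enum below; the member names are assumptions based on the memory layouts discussed in this thread, not a definitive list, and the real definition lands with #541.

// Hypothetical sketch of a TensorMemoryLayout-style enum; may differ from #541.
enum class TensorMemoryLayout {
  None,          // the metal backend could assert this stays unset
  Interleaved,   // no grid set, pages spread across banks
  HeightSharded,
  WidthSharded,
  BlockSharded,
};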

for (auto width = 2; width <= analysisInput.maxGrid.getShape()[0]; ++width) {
  for (auto height = 2; height <= analysisInput.maxGrid.getShape()[1];
       ++height) {
    analysisResult.push_back(shardedBase.withGrid(
Contributor

There is a shardSpec inside LayoutAttr; is it also automatically updated?

Contributor Author

I cannot seem to find it. Can you point me to what you are referring to?

Contributor

Take a look at llvm::SmallVector<int64_t> LayoutAttr::getShardShape()

Contributor

AttrParameter<"MemRefType", "A memref that describes the physical footprint allocation of the shard. It must also have a shape with rank equal to grid.">:$memref);

Contributor Author

Yes, it's automatically updated.
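For reference, the behaviour being confirmed can be modeled like this; a standalone toy only, while the real logic lives in LayoutAttr (e.g. getShardShape()).

#include <cstdint>
#include <vector>

// Per-core shard shape as a function of tensor shape and grid: setting a new
// grid (as withGrid does) implies a new shard memref. Toy model only.
std::vector<int64_t> shardShape(const std::vector<int64_t> &tensorShape,
                                const std::vector<int64_t> &grid) {
  std::vector<int64_t> shape;
  for (size_t i = 0; i < tensorShape.size(); ++i) {
    const int64_t g = i < grid.size() ? grid[i] : 1;
    shape.push_back((tensorShape[i] + g - 1) / g); // ceil-divide across cores
  }
  return shape;
}
// e.g. shardShape({64, 128}, {8, 8}) == {8, 16}, matching the
// memref<8x16xf32, ...> that appears for an <8x8> grid in the tests below.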

}),
analysisResult.end());

// TODO: Potentially filter out tensors that don't fit into L1 at all.
Contributor

It would be nice to have this for sharded ones.

Contributor Author

Ok, will treat this as a nice to have.

Contributor

Actually, don't do this. :) We can still decide later to split the tensor for a pipeline op, so this will be up to the sharding policy to decide.

// Height Sharded
// TODO: Missing affine mapping to actual grid.
// TODO: Can we have every shape of 1d grid? Probably not, need to check what
// is divisible by grid sides.
Contributor

I recommended above going down to 50% of grid usage.

Contributor

Actually it's 25% if you only go down to half of both dimensions on a two-dimensional grid. :)
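A quick arithmetic check of that 25% figure; the 8x8 worker grid here is just an example value.

#include <cassert>
#include <cstdint>

int main() {
  const int64_t maxH = 8, maxW = 8;                 // example worker grid
  const int64_t minCores = (maxH / 2) * (maxW / 2); // smallest allowed grid
  assert(minCores * 4 == maxH * maxW);              // 16 of 64 cores = 25%
  return 0;
}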

@odjuricicTT (Contributor Author)

Waiting on #541 to land in order to set appropriate TensorMemoryLayout.

bool LegalGridAnalysis::applyOverrides() {
  // Lookup grid size overrides based on location information for current
  // operation.
  //

  // TODO(odjuricic): We may need to infer shard type.
Contributor

Yeah, the override needs to be extended.

Contributor Author

Do we want all LayoutAttr params to be overridable?

Contributor

Because of linking with TT-explorer, most likely yes, but let's find a good enough starting point.

Contributor Author

OK, will add this as a separate issue and tackle it after this.
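As a rough illustration of where that follow-up could go, an extended override might carry something like the fields below; every name and field here is hypothetical, not the PR's actual override struct.

#include <cstdint>
#include <optional>
#include <string>
#include <vector>

// Hypothetical extended override: today only the grid shape is overridable.
struct LayoutOverrideParams {
  std::optional<std::vector<int64_t>> gridShape;
  std::optional<std::string> memorySpace;        // e.g. "dram" or "l1"
  std::optional<std::string> tensorMemoryLayout; // e.g. "interleaved"
  std::optional<std::string> dataType;           // e.g. "f32", "bf16"
};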

@odjuricicTT odjuricicTT force-pushed the odjuricic/grid-analysis branch from eb0c300 to afe37ff on August 30, 2024 13:41
@odjuricicTT odjuricicTT marked this pull request as ready for review August 30, 2024 13:43
@odjuricicTT odjuricicTT changed the title [WIP] Add V0 tensor layout generation Add V0 tensor layout generation Aug 30, 2024
if (not llvm::isa<TTIROp>(op)) {
  return;
}
// Skip operations that don't have output tensors.
Contributor

Can this be used instead:
if (op->getNumResults() == 0)

Contributor Author

Yes, except for ToLayoutOp, which has an output.
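Putting the thread together, the filter could look roughly like this; includes are omitted and TTIROp, ToLayoutOp, and getNumResults are taken from the surrounding discussion, so treat it as a sketch rather than the PR's final code.

// Sketch only; assumes the TTIR dialect ops are in scope.
static bool shouldSkip(mlir::Operation *op) {
  if (!llvm::isa<TTIROp>(op))
    return true;                   // process only TTIR ops
  if (llvm::isa<ToLayoutOp>(op))
    return true;                   // has a result, but is a layout change
  return op->getNumResults() == 0; // nothing to assign a layout to
}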

// This implementation is a placeholder and is meant to just enable testing of
// other components.

// Skip mlir ops.
Contributor

Nit: Update comment to: Process only TTIR ops.

@@ -3,7 +3,7 @@
 module attributes {} {
 func.func @forward(%arg0: tensor<64x128xf32>, %arg1: tensor<64x128xf32>) -> tensor<64x128xf32> {
 %0 = tensor.empty() : tensor<64x128xf32>
-// CHECK: #[[LAYOUT_1:.*]] = #tt.layout<(d0, d1) -> (d0, d1), undef, <8x8>, memref<8x16xf32, #dram>, interleaved>
+// CHECK: #[[LAYOUT_1:.*]] = #tt.layout<(d0, d1) -> (d0, d1), undef, <1x1>, memref<64x128xf32, #dram>, interleaved>
Contributor

Why is this changed to 1x1?

Contributor Author

It defaults to 1x1 if GridAttr is not set. My original thinking was that a tensor layout grid does not make sense when the tensor is in DRAM. When you set 8x8 with DRAM you get an incorrect and confusing memref as well.

ATM I think that this value is not used at all (when in DRAM), but it might be in the coming changes to runtime.

Contributor

@nsmithtt @jnie-TT Do we have any plans on how to infer the computation grid for ops which operate on DRAM data?

Contributor Author

This seems to differ on an op-by-op basis; it is discussed in #450 and #559 and is being addressed in #605.

@@ -15,6 +15,7 @@ struct LegalGridAnalysisInput {
   ChipDescAttr chipDesc;
   GridAttr maxGrid;
   RankedTensorType tensorType;
+  int64_t maxShardedGrids = 64;
Contributor

constexpr

Contributor Author

I was intending for this to be a passable param. Will set it to constexpr for now and change it when needed.

Contributor

Indeed, it should be passed in as a param; make it part of the override issue in a follow-up change.
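The two options discussed read roughly as follows; a sketch only, and the second variant's constructor is hypothetical rather than the PR's code.

#include <cstdint>

// (a) V0: a fixed compile-time cap, as suggested above.
struct LegalGridAnalysisInputA {
  static constexpr int64_t maxShardedGrids = 64;
};

// (b) Follow-up: pass the cap in, so the override mechanism can tune it.
struct LegalGridAnalysisInputB {
  int64_t maxShardedGrids;
  explicit LegalGridAnalysisInputB(int64_t maxGrids = 64)
      : maxShardedGrids(maxGrids) {}
};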

bool LegalGridAnalysis::applyOverrides() {
  // Lookup grid size overrides based on location information for current
  // operation.
  //

  // TODO(odjuricic): Need to override all params, not just grid size.
Contributor

Is this for a follow-up change? File an issue under Optimizer if so.

Contributor Author

Yes, there is an issue filed.

GridAttr::get(op->getContext(), analysisInput.maxGrid.getShape())));
// DRAM
// No grid is set since the tensor is not sharded.
// TODO(odjuricic): We need to set grid here since it will be used as the
Contributor

File an issue; mention the special case where the grid matches the number of DRAM banks.

Contributor Author

Related to the 1x1 comment above, this needs to be updated once we know how runtime and TTNN treat this param.

Contributor Author

Filed an issue.
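For the record, the special case mentioned above could be modeled along these lines once the runtime behaviour is pinned down; the struct, helper name, and bank count are placeholders, not anything defined in this PR.

#include <cstdint>

// Hypothetical: a DRAM-interleaved tensor gets a 1 x numDramBanks "virtual"
// grid instead of the worker grid, one slice per bank.
struct VirtualDramGrid {
  int64_t height;
  int64_t width;
};

VirtualDramGrid dramGridFor(int64_t numDramBanks) {
  // e.g. 12 banks -> a 1x12 virtual grid
  return {1, numDramBanks};
}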

@odjuricicTT odjuricicTT force-pushed the odjuricic/grid-analysis branch from f2efe2d to 9b91a42 on September 5, 2024 10:53
@odjuricicTT odjuricicTT merged commit 6eb09be into main Sep 6, 2024
13 checks passed
Successfully merging this pull request may close these issues.

Implement legal Op configuration generation