Example with experimental syntax #1960

Merged
merged 190 commits into from
Dec 12, 2024
Changes from 184 commits
49bd37f
brainstorming proof of concept
hunhoffe Aug 16, 2024
048d156
Update test.cpp
hunhoffe Oct 14, 2024
095ce35
Clean up code in preparation for additional development
hunhoffe Oct 14, 2024
847be7c
Saving progress
hunhoffe Oct 14, 2024
b52c6ec
Fix formatting
hunhoffe Oct 15, 2024
181368a
Save progress
hunhoffe Oct 15, 2024
696633a
Update python/pyrightconfig.json
hunhoffe Oct 15, 2024
94f1fa4
working on building up runtime sequence
hunhoffe Oct 15, 2024
1a0567e
Merge branch 'main' into erika-iron-brainstorming
hunhoffe Oct 15, 2024
d450878
Merge branch 'main' into erika-iron-brainstorming
hunhoffe Oct 16, 2024
e1fd409
Fix paths from merge
hunhoffe Oct 16, 2024
2e3b923
rewrite passthrough kernel to use dma task operations
hunhoffe Oct 16, 2024
22a8418
First example minimally working again
hunhoffe Oct 16, 2024
a54659a
Rename api as iron2 (for now)
hunhoffe Oct 16, 2024
f2ce079
Merge branch 'main' into erika-iron-brainstorming
hunhoffe Oct 16, 2024
5e6a328
Working with tiler some more
hunhoffe Oct 17, 2024
92ff3d6
Started on DMA transpose example
hunhoffe Oct 17, 2024
ac0ec99
DMA transpose example minimally working
hunhoffe Oct 17, 2024
cb52c07
Add matrix_scalar_add experimental
hunhoffe Oct 17, 2024
57a0049
Merge branch 'main' into erika-iron-brainstorming
hunhoffe Oct 17, 2024
989d755
another example
hunhoffe Oct 17, 2024
745002c
Another example
hunhoffe Oct 17, 2024
4526758
object fifo forward
hunhoffe Oct 17, 2024
0fe4b8b
Working through a few more examples
hunhoffe Oct 17, 2024
ebb2716
Add experimental vector_exp example
hunhoffe Oct 17, 2024
106b702
Merge branch 'main' into erika-iron-brainstorming
hunhoffe Oct 17, 2024
dea7d28
stub of tensor tiler
hunhoffe Oct 18, 2024
00713ea
py fmt and imports
hunhoffe Oct 18, 2024
2650c41
Explore tile helper class
hunhoffe Oct 19, 2024
747ca3a
First version of tensor tiler
hunhoffe Oct 21, 2024
e5518fa
Merge branch 'main' into tiler-helper
hunhoffe Oct 21, 2024
1f86f05
Add some tests for the tiler
hunhoffe Oct 21, 2024
39e0a5d
Some improvements
hunhoffe Oct 21, 2024
03a0741
Merge branch 'main' into tiler-helper
hunhoffe Oct 22, 2024
8d67307
Some small improvements to tensortiler
hunhoffe Oct 22, 2024
637e314
Stub out example
hunhoffe Oct 22, 2024
d18a2ef
Added simple tiling examples
hunhoffe Oct 22, 2024
3c8ffb3
Merge branch 'main' into tiler-helper
hunhoffe Oct 22, 2024
d293c96
Update programming_examples/basic/tiling_exploration/per_tile/aie2.py
hunhoffe Oct 22, 2024
9c2ce5f
Fix makefile typos
hunhoffe Oct 22, 2024
2a3a484
Add tensor tiler tests
hunhoffe Oct 22, 2024
a47df3a
a couple more tests
hunhoffe Oct 22, 2024
babf9e7
Add a few more tests, remove template
hunhoffe Oct 22, 2024
46a487c
Add one more test
hunhoffe Oct 22, 2024
1071ee0
make tensortile test formatting a bit more sane
hunhoffe Oct 22, 2024
192194d
More python formatting
hunhoffe Oct 22, 2024
4f9656a
A few more tests
hunhoffe Oct 22, 2024
34ea2d8
Merge branch 'main' into tiler-helper
hunhoffe Oct 22, 2024
e437776
add visualization example
hunhoffe Oct 22, 2024
c744299
caption more correctly
hunhoffe Oct 22, 2024
d51e5c8
A bit of progress towards matrix_vector
hunhoffe Oct 23, 2024
fe56391
Merge branch 'main' into erika-iron-brainstorming
hunhoffe Oct 23, 2024
87df9a7
Merge branch 'main' into tiler-helper
hunhoffe Oct 23, 2024
880ee2f
update tiler code
hunhoffe Oct 21, 2024
0d63d47
Merge branch 'main' into erika-iron-brainstorming
hunhoffe Oct 23, 2024
5d998cc
Merge branch 'tiler-helper' into erika-iron-brainstorming
hunhoffe Oct 23, 2024
1140074
fix mistake from merge
hunhoffe Oct 23, 2024
3d6c8ea
DMA Transpose working with new TensorTiler
hunhoffe Oct 23, 2024
50d9ce3
Fix small typos in dma transpose designs
hunhoffe Oct 23, 2024
0e17621
Matrix scalar add working with new tiler
hunhoffe Oct 23, 2024
173cea1
Passthrough DMA working with TensorTile, but not TensorTiler2D
hunhoffe Oct 23, 2024
df855ff
passthrough kernel experimental working because of hack in dmatask
hunhoffe Oct 23, 2024
b71677b
add missing transfer length
hunhoffe Oct 23, 2024
b582e85
experimental working with row_wise_bias_add
hunhoffe Oct 23, 2024
6cf794c
experimental vector exp now working
hunhoffe Oct 23, 2024
aa5de3e
Merge branch 'main' into erika-iron-brainstorming
hunhoffe Oct 23, 2024
379b712
Remove development notes and restore unneeded changes
hunhoffe Oct 23, 2024
f4a0335
Use peano and do not pollute source dir in experimental tests
hunhoffe Oct 23, 2024
e9e57ab
Fix typo
hunhoffe Oct 23, 2024
3429d78
Stub out plans for placement
hunhoffe Oct 23, 2024
32e18c5
Merge branch 'main' into erika-iron-brainstorming
hunhoffe Oct 23, 2024
ee8a483
dma transpose with placer working
hunhoffe Oct 23, 2024
44f5c95
Port rest of examples to use SequentialPlacer
hunhoffe Oct 23, 2024
6cb2b6a
Stub out (untestsed) matrix vector
hunhoffe Oct 24, 2024
01731dc
Some notes for demo
hunhoffe Oct 24, 2024
8efd95b
more notes
hunhoffe Oct 24, 2024
d42987d
Merge branch 'main' into erika-iron-brainstorming
hunhoffe Oct 25, 2024
d842023
Small stylistic updates
hunhoffe Oct 25, 2024
1785464
more demo prep
hunhoffe Oct 25, 2024
5b2dd41
Add some composition notes
hunhoffe Oct 25, 2024
a46c8b4
Add some (untested) access count visualizations in addition to the ac…
hunhoffe Oct 25, 2024
6d83cab
Some cleanups for next demo
hunhoffe Oct 31, 2024
49f3962
Merge branch 'main' into erika-iron-brainstorming
hunhoffe Nov 13, 2024
1fd5476
Merge branch 'main' into erika-iron-brainstorming
hunhoffe Nov 14, 2024
96dcd93
Merge branch 'main' into erika-iron-brainstorming
hunhoffe Nov 18, 2024
5018eeb
First fixups after mergin real tensor tiler in
hunhoffe Nov 18, 2024
8807c76
Merge branch 'main' into erika-iron-brainstorming
hunhoffe Nov 18, 2024
34cac3c
Add more demo code notes
hunhoffe Nov 19, 2024
bf3af3b
Merge branch 'main' into erika-iron-brainstorming
hunhoffe Nov 21, 2024
308586f
Merge branch 'main' into erika-iron-brainstorming
hunhoffe Nov 22, 2024
e764711
Start reducing diff from main, cleanup makefiles, first example worki…
hunhoffe Nov 22, 2024
b67625e
Matrix vector with iron api working
hunhoffe Nov 23, 2024
a8b78d2
Fix up a few more examples
hunhoffe Nov 23, 2024
5217a20
Fix up a few more examples
hunhoffe Nov 23, 2024
5b1029d
Finished pass of cleaning up aie2_iron examples
hunhoffe Nov 23, 2024
a211369
A few minor fixes
hunhoffe Nov 23, 2024
7d69c7c
Cleanup syntax
hunhoffe Nov 23, 2024
379f7b8
Merge branch 'main' into erika-iron-brainstorming
hunhoffe Nov 23, 2024
bbdd9e4
fix after merge
hunhoffe Nov 23, 2024
3667162
Merge branch 'main' into erika-iron-brainstorming
hunhoffe Nov 25, 2024
77bc0fe
not working yet
hunhoffe Nov 23, 2024
b6e8d33
Few small updates
hunhoffe Nov 25, 2024
8c57cd1
Getting further...
hunhoffe Nov 25, 2024
7d7ec7d
Working as well as possible with minimal logic
hunhoffe Nov 25, 2024
4354571
Fixed up naming and paths
hunhoffe Nov 25, 2024
216f7ea
Rename directory
hunhoffe Nov 25, 2024
c99ceda
Update for API change; add additional example
hunhoffe Nov 25, 2024
a153db2
Another example
hunhoffe Nov 25, 2024
3d9dc2f
Not sure if type conversion is advisable, but I will save progress in…
hunhoffe Nov 25, 2024
896c5ff
another example
hunhoffe Nov 25, 2024
1cd0c2f
Fix small bug
hunhoffe Nov 25, 2024
4dd7386
Another example
hunhoffe Nov 25, 2024
12016f4
Other reduce examples
hunhoffe Nov 25, 2024
3825b20
default tap in runtime fill/drain ops
hunhoffe Dec 2, 2024
ae9dd55
Fix typos
hunhoffe Dec 2, 2024
03fdacf
Ported rest of basic examples
hunhoffe Dec 2, 2024
ad1eb46
Prepare for bottlneck example
hunhoffe Dec 2, 2024
b41cd7a
Merge branch 'main' into erika-iron-brainstorming
hunhoffe Dec 2, 2024
a4a100e
Cleanup experimental import
hunhoffe Dec 2, 2024
6e975a4
Some objectfifo python refactoring
hunhoffe Dec 2, 2024
d07a0ec
Add missing NotImplementedError
hunhoffe Dec 2, 2024
f4f564f
Add code snippets
hunhoffe Dec 3, 2024
66dca0a
python fmt
hunhoffe Dec 3, 2024
b844c16
Start fixing typos from objectfifo refactor
hunhoffe Dec 3, 2024
6f2146a
Fix a few more bugs from object fifo refactor
hunhoffe Dec 3, 2024
749d39e
A few more fixes after object fifo refactor
hunhoffe Dec 3, 2024
5665090
Fix one more typo
hunhoffe Dec 4, 2024
e6765b8
first ml example ported
hunhoffe Dec 4, 2024
11b6945
Ported a few more ml examples
hunhoffe Dec 4, 2024
beea650
ported color detect
hunhoffe Dec 4, 2024
cb0d31e
Fix depth in color detect new format
hunhoffe Dec 4, 2024
1496428
Handling RTPs in runtime
hunhoffe Dec 4, 2024
92422ab
Finished porting vision examples
hunhoffe Dec 5, 2024
e77c902
Merge branch 'main' into erika-iron-brainstorming
hunhoffe Dec 5, 2024
e582007
Start cleaning up branch for merge
hunhoffe Dec 5, 2024
6aa3779
Some minor cleanups
hunhoffe Dec 5, 2024
fa9d2c5
Fix a few bugs
hunhoffe Dec 5, 2024
3f66199
Port bottleneck example
hunhoffe Dec 5, 2024
d8a7b71
Fix some bugs from merge
hunhoffe Dec 5, 2024
2eba817
Finixhed porting ml examples
hunhoffe Dec 5, 2024
b629b1e
Fix lit file error
hunhoffe Dec 5, 2024
5967284
Preparing for tiliner in single core matmul
hunhoffe Dec 5, 2024
f36bf33
Add taps to alt single core matmul
hunhoffe Dec 5, 2024
7909edb
Add tiler for b
hunhoffe Dec 5, 2024
3dfd668
Tiler for C
hunhoffe Dec 5, 2024
c5406c8
Add a tielr
hunhoffe Dec 5, 2024
cb11d4a
starting on single core mat mult iron
hunhoffe Dec 5, 2024
808f8a1
task groups in runtime minimally working
hunhoffe Dec 5, 2024
33df41f
Update runtime task group handling
hunhoffe Dec 5, 2024
ca5e6be
Fix lit file error
hunhoffe Dec 5, 2024
8e7c4ed
Fix bug with placement for resnet
hunhoffe Dec 6, 2024
46e6ebd
Merge branch 'main' into erika-iron-brainstorming
hunhoffe Dec 6, 2024
acc8121
Progress towards another example
hunhoffe Dec 6, 2024
472700a
Stub out whole array iron makefiles and lit files
hunhoffe Dec 6, 2024
895f193
ObjectFifo code cleanup. This will break some examples
hunhoffe Dec 7, 2024
fd8dd0e
Some fixes from obj fifo refactor
hunhoffe Dec 7, 2024
f415903
Another quick fix
hunhoffe Dec 7, 2024
918c9db
Fix another small bug
hunhoffe Dec 7, 2024
a2b7200
Fix a few more things
hunhoffe Dec 7, 2024
9522ab1
stub out mat mul whole array
hunhoffe Dec 7, 2024
1afe9cd
ported whole array mat mul
hunhoffe Dec 7, 2024
c962f3c
fix length calc
hunhoffe Dec 7, 2024
040a747
try something new with offsets
hunhoffe Dec 7, 2024
2467847
prune runtime cons handle duplicates
hunhoffe Dec 7, 2024
937c2e2
Start cleaning up code to prepare for PR
hunhoffe Dec 7, 2024
ae8f9ac
start cleaning up imports
hunhoffe Dec 7, 2024
3f727b6
Fixup more imports
hunhoffe Dec 7, 2024
e50e8b6
Fix up a few more imports
hunhoffe Dec 7, 2024
be13659
fix a few import errors
hunhoffe Dec 7, 2024
f6ef95f
Fix errors from import refactor
hunhoffe Dec 8, 2024
6542998
Clean up passthrough plio code
hunhoffe Dec 8, 2024
84936d1
Add tile visualization hook for mat mul cascade
hunhoffe Dec 8, 2024
e38b070
Remove experimental example
hunhoffe Dec 8, 2024
c697017
Remove example file that does not work anyways
hunhoffe Dec 8, 2024
f75c2a5
Remove unnecessary changes
hunhoffe Dec 8, 2024
d0abbe9
Update programming examples readme
hunhoffe Dec 8, 2024
7f17072
Add experimental code and example
hunhoffe Dec 8, 2024
3f20623
Add readmes
hunhoffe Dec 8, 2024
b541d4f
Remove CI test code
hunhoffe Dec 8, 2024
e7b8717
Update programming_examples/experimental/example.py
hunhoffe Dec 12, 2024
b955b47
Update programming_examples/experimental/example.py
hunhoffe Dec 12, 2024
d018ca0
Update programming_examples/experimental/example.py
hunhoffe Dec 12, 2024
db58543
Update programming_examples/experimental/example.py
hunhoffe Dec 12, 2024
942d04e
Merge branch 'main' into experimental_syntax
hunhoffe Dec 12, 2024
735d13e
Merge branch 'main' into experimental_syntax
hunhoffe Dec 12, 2024
72bd7b5
Remove files that did not get deleted in merge
hunhoffe Dec 12, 2024
f715885
Remove more files not deleted in merge
hunhoffe Dec 12, 2024
4bfdd80
Remove more merge artifacts
hunhoffe Dec 12, 2024
535cea1
Remove another merge artifact
hunhoffe Dec 12, 2024
e469207
Merge branch 'main' into experimental_syntax
hunhoffe Dec 12, 2024
9 changes: 8 additions & 1 deletion programming_examples/README.md
@@ -10,7 +10,14 @@

# <ins>Programming Examples</ins>

These programming examples are provided so that application programmers can learn how to leverage the IRON design flow with mlir-aie python bindings, and the mlir-aie intermediate representation directly to build applications targeting AI Engines. They are organized into the following directories:
These programming examples are provided so that application programmers can learn how to leverage the IRON design flow with mlir-aie python bindings, and the mlir-aie intermediate representation directly to build applications targeting AI Engines.

Each IRON example has one or more implementations:
* `aie2.py` - These are written using the original IRON syntax
* `aie2_alt.py` - These are written using an alternate form of `runtime_sequence`, but the design is likely otherwise unchanged.
* `aie2_iron.py` - These are written using an alternative IRON syntax

They are organized into the following directories:

## [basic](./basic)

10 changes: 9 additions & 1 deletion programming_examples/basic/dma_transpose/Makefile
@@ -22,9 +22,17 @@ K ?= 32

aie_py_src=aie2.py
use_alt?=0
use_iron?=0

ifeq (${use_alt}, 1)
aie_py_src=aie2_alt.py
ifeq (${use_iron}, 1)
$(error Cannot specify both alternative design and IRON)
endif
endif

ifeq (${use_iron}, 1)
aie_py_src=aie2_iron.py
endif

build/aie.mlir: ${srcdir}/${aie_py_src}
@@ -51,7 +59,7 @@ endif
run: ${targetname}.exe build/final.xclbin
${powershell} ./$< -x build/final.xclbin -i build/insts.txt -k MLIR_AIE --M ${M} --K ${K}

generate_access_map: ${srcdir}/aie2.py
generate_access_map: ${srcdir}/${aie_py_src}
mkdir -p ${@D}
python3 $< --generate-access-map ${M} ${K}
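
The source-selection logic in this Makefile hunk can be restated as a short Python sketch. The function name `select_design_source` is hypothetical and not part of the repository; it only mirrors the `use_alt` / `use_iron` checks shown above, including the error raised when both flags are set.

```python
def select_design_source(use_alt: int = 0, use_iron: int = 0) -> str:
    """Hypothetical helper mirroring the Makefile's choice of aie_py_src."""
    # The Makefile aborts with $(error ...) when both flags are 1.
    if use_alt == 1 and use_iron == 1:
        raise ValueError("Cannot specify both alternative design and IRON")
    if use_alt == 1:
        return "aie2_alt.py"
    if use_iron == 1:
        return "aie2_iron.py"
    # Default: the original IRON syntax.
    return "aie2.py"


print(select_design_source())            # default design
print(select_design_source(use_iron=1))  # alternative IRON syntax
```

As in the Makefile, the default is `aie2.py`, so existing invocations that set neither flag keep their behavior.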

2 changes: 1 addition & 1 deletion programming_examples/basic/dma_transpose/aie2_alt.py
@@ -1,4 +1,4 @@
# dma_transpose/aie2.py -*- Python -*-
# dma_transpose/aie2_alt.py -*- Python -*-
#
# This file is licensed under the Apache License v2.0 with LLVM Exceptions.
# See https://llvm.org/LICENSE.txt for license information.
59 changes: 59 additions & 0 deletions programming_examples/basic/dma_transpose/aie2_iron.py
@@ -0,0 +1,59 @@
# dma_transpose/aie2_iron.py -*- Python -*-
#
# This file is licensed under the Apache License v2.0 with LLVM Exceptions.
# See https://llvm.org/LICENSE.txt for license information.
# SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
#
# (c) Copyright 2024 Advanced Micro Devices, Inc. or its affiliates
import argparse
import numpy as np
import sys

from aie.iron import ObjectFifo, Program, Runtime
from aie.iron.device import NPU1Col1, AnyComputeTile
from aie.iron.placers import SequentialPlacer
from aie.helpers.taplib import TensorTiler2D


def my_passthrough(M, K, generate_acccess_map=False):
tensor_ty = np.ndarray[(M, K), np.dtype[np.int32]]

tap_in = TensorTiler2D.simple_tiler((M, K), tile_col_major=True)[0]

if generate_acccess_map:
tap_in.visualize(file_path="iron_transpose_data.png", show_tile=False)
return

of_in = ObjectFifo(tensor_ty)
of_out = of_in.cons().forward(AnyComputeTile)

rt = Runtime()
with rt.sequence(tensor_ty, tensor_ty, tensor_ty) as (a_in, _, c_out):
rt.fill(of_in.prod(), a_in, tap_in)
rt.drain(of_out.cons(), c_out, wait=True)

my_program = Program(NPU1Col1(), rt)
module = my_program.resolve_program(SequentialPlacer())
print(module)


if __name__ == "__main__":
p = argparse.ArgumentParser()
p.add_argument("dims", help="M K", type=int, nargs="*", default=[64, 64])
p.add_argument(
"--generate-access-map",
action="store_true",
help="Produce a file showing data access order",
)
args = p.parse_args()

if len(args.dims) != 2:
print(
"ERROR: Must provide either no dimensions or both M and K", file=sys.stderr
)
exit(-1)
my_passthrough(
M=args.dims[0],
K=args.dims[1],
generate_acccess_map=args.generate_access_map,
)
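
The design above performs the transpose purely through the order in which the input is read. A plain NumPy sketch of that idea follows, under the assumption that `tile_col_major=True` on a single tile spanning the whole `(M, K)` tensor means a column-by-column walk of the row-major buffer; the diff does not spell this out, and `column_major_read` is a hypothetical helper, not a taplib function.

```python
import numpy as np


def column_major_read(a: np.ndarray) -> np.ndarray:
    """Read a row-major (M, K) array column by column into a flat buffer."""
    M, K = a.shape
    flat = a.reshape(-1)  # row-major: element (i, j) lives at index i * K + j
    # Outer loop over columns, inner loop over rows (stride K between reads).
    order = [i * K + j for j in range(K) for i in range(M)]
    return flat[order]


M, K = 4, 3
a = np.arange(M * K, dtype=np.int32).reshape(M, K)
out = column_major_read(a).reshape(K, M)

# Writing the column-major read order back linearly yields the transpose.
assert np.array_equal(out, a.T)
```

No compute kernel is involved in this sketch, which matches the structure of the design: the input ObjectFifo is simply forwarded to the output.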
11 changes: 11 additions & 0 deletions programming_examples/basic/dma_transpose/run_makefile_iron.lit
@@ -0,0 +1,11 @@
// (c) Copyright 2024 Advanced Micro Devices, Inc.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
// REQUIRES: ryzen_ai, peano
//
// RUN: mkdir -p iron_test
// RUN: cd iron_test
// RUN: make -f %S/Makefile clean
// RUN: env use_iron=1 make -f %S/Makefile
// RUN: %run_on_npu make -f %S/Makefile run | FileCheck %s
// CHECK: PASS!
101 changes: 72 additions & 29 deletions programming_examples/basic/matrix_multiplication/cascade/aie2_alt.py
@@ -7,12 +7,12 @@
import argparse
from ml_dtypes import bfloat16
import numpy as np
import sys

from aie.extras.context import mlir_mod_ctx
from aie.dialects.aie import *
from aie.dialects.aiex import *
from aie.helpers.dialects.ext.scf import _for as range_
from aie.helpers.taplib import TensorAccessPattern, TensorAccessSequence

dtype_map = {
"bf16": bfloat16,
@@ -42,9 +42,15 @@ def main():
"--dtype_out", type=str, choices=["bf16", "i16", "f32", "i32"], default="i32"
)
argparser.add_argument("--trace_size", type=int, default=0)
argparser.add_argument(
"--generate-taps",
action="store_true",
help="Generate TensorAccessPatterns, a Python object to represent each data transfer "
"of the input/output matrices. These objects can be used for visualization.",
)
args = argparser.parse_args()
with mlir_mod_ctx() as ctx:
my_matmul(
maybe_taps = my_matmul(
args.M,
args.K,
args.N,
@@ -55,16 +61,32 @@ def main():
args.dtype_in,
args.dtype_out,
args.trace_size,
args.generate_taps,
)
# print(ctx.module.operation.verify())
print(ctx.module)

if args.generate_taps:
return maybe_taps


def ceildiv(a, b):
return (a + b - 1) // b


def my_matmul(M, K, N, m, k, n, n_aie_cols, dtype_in_str, dtype_out_str, trace_size):
def my_matmul(
M,
K,
N,
m,
k,
n,
n_aie_cols,
dtype_in_str,
dtype_out_str,
trace_size,
generate_taps=False,
):

n_aie_rows = 4
n_aie_cores = n_aie_rows * n_aie_cols
@@ -128,6 +150,12 @@ def my_matmul(M, K, N, m, k, n, n_aie_cols, dtype_in_str, dtype_out_str, trace_s
elif n_aie_cols == 4:
dev = AIEDevice.npu1_4col

# These will hold TensorAccessPattern objects that represent the runtime
# npu_dma_memcpy_nd operations of this design. They are only used if generate_taps is true
A_taps = []
B_taps = []
C_taps = []

@device(dev)
def device_body():
A_l2_ty = np.ndarray[(m * k * n_A_tiles_per_shim,), np.dtype[dtype_in]]
@@ -349,60 +377,75 @@ def sequence(A, B, C):
for col in range(n_aie_cols):
C_col_offset = col * n
C_offset = C_col_offset + C_row_offset

C_sizes = [tb_n_rows, N // n // n_aie_cols, m, n]
C_strides = [m * N, n * n_aie_cols, N, 1]
C_tap = TensorAccessPattern(
(M, N), C_offset, sizes=C_sizes, strides=C_strides
)
c_task = shim_dma_single_bd_task(
C_l2l3_fifos[col],
C,
offset=C_offset,
sizes=[tb_n_rows, N // n // n_aie_cols, m, n],
strides=[m * N, n * n_aie_cols, N, 1],
tap=C_tap,
issue_token=True,
)
dma_start_task(c_task)
out_tasks.append(c_task)
C_taps.append(C_tap)

for tile_row in range(tb_n_rows):
A_block_offset = ((tb * tb_max_n_rows) + tile_row) * m * K
A_row_offset = col * n_A_tiles_per_shim * k
A_offset = A_block_offset + A_row_offset
A_sizes = [
N // n // n_aie_cols,
K // k // n_aie_rows,
m * n_A_tiles_per_shim,
k,
]
A_strides = [0, k * n_aie_rows, K, 1]
A_tap = TensorAccessPattern(
(M, K), A_offset, sizes=A_sizes, strides=A_strides
)
B_col_offset = col * n
a_task = shim_dma_single_bd_task(
A_l3l2_fifos[col],
A,
offset=A_offset,
sizes=[
N // n // n_aie_cols,
K // k // n_aie_rows,
m * n_A_tiles_per_shim,
k,
],
strides=[0, k * n_aie_rows, K, 1],
tap=A_tap,
)
dma_start_task(a_task)
in_tasks.append(a_task)

A_taps.append(A_tap)

B_sizes = [
N // n // n_aie_cols,
K // k // n_aie_rows,
k * n_aie_rows,
n,
]
B_strides = [n * n_aie_cols, k * n_aie_rows * N, N, 1]
B_tap = TensorAccessPattern(
(K, N), B_col_offset, sizes=B_sizes, strides=B_strides
)
b_task = shim_dma_single_bd_task(
B_l3l2_fifos[col],
B,
offset=B_col_offset,
sizes=[
N // n // n_aie_cols,
K // k // n_aie_rows,
k * n_aie_rows,
n,
],
strides=[n * n_aie_cols, k * n_aie_rows * N, N, 1],
B_l3l2_fifos[col], B, tap=B_tap
)
dma_start_task(b_task)
in_tasks.append(b_task)
B_taps.append(B_tap)
dma_await_task(*out_tasks)
out_tasks = []
dma_free_task(*in_tasks)
in_tasks = []

if generate_taps:
# If generate taps is true, return a representation of tensor access patterns
# representing all the npu_dma_memcpy_nd runtime sequence operations per input/output tensor.
return (
TensorAccessSequence.from_taps(A_taps),
TensorAccessSequence.from_taps(B_taps),
TensorAccessSequence.from_taps(C_taps),
)


if __name__ == "__main__":
main()
else:
print("Not meant to be imported")
sys.exit(1)
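
The `offset` / `sizes` / `strides` triples wrapped in `TensorAccessPattern` above can be read as a nested loop over a linear buffer. A minimal NumPy sketch follows, assuming the first entry of `sizes` is the outermost dimension and strides are counted in elements. `access_order` is a hypothetical helper, not a taplib function, and the small dimensions (with `C_row_offset` taken as 0) are chosen only for illustration.

```python
import itertools

import numpy as np


def access_order(offset, sizes, strides):
    """Linear element indices visited, outermost dimension first."""
    coords = itertools.product(*(range(s) for s in sizes))
    return np.array(
        [offset + sum(c * st for c, st in zip(idx, strides)) for idx in coords]
    )


# Illustrative sizes only: a 4x8 output C split into 2x2 tiles over 2 columns.
M, N, m, n, n_aie_cols = 4, 8, 2, 2, 2
tb_n_rows = M // m

# Same expressions as C_sizes / C_strides in the diff.
C_sizes = [tb_n_rows, N // n // n_aie_cols, m, n]
C_strides = [m * N, n * n_aie_cols, N, 1]

# One pattern per column, offset by col * n as in the design.
per_col = [access_order(col * n, C_sizes, C_strides) for col in range(n_aie_cols)]

# Column 0 starts with the top-left m x n tile, row by row.
assert per_col[0][:4].tolist() == [0, 1, 8, 9]
# Together, the columns touch every element of C exactly once.
assert sorted(np.concatenate(per_col).tolist()) == list(range(M * N))
```

The final assertion shows why the per-column offset is `col * n`: each column handles alternating groups of `n` output columns, and the patterns interleave to cover the whole matrix.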
@@ -21,10 +21,19 @@ k=32

kernels=mv_${m}x${k}
use_alt?=0
use_iron?=0

ifeq (${use_alt}, 1)
aie_py_src=aie2_alt.py
ifeq (${use_iron}, 1)
$(error Cannot specify both alternative design and IRON)
endif
endif

ifeq (${use_iron}, 1)
aie_py_src=aie2_iron.py
endif


SELF_DIR := $(dir $(lastword $(MAKEFILE_LIST)))
include ${SELF_DIR}../makefile-common