Add transpose WH sharded, generalize row major permute when N > 4, and do a minor refactor of ttnn::permute #34535
| Job | Run time |
|---|---|
| | 1s |
| | 53s |
| | 18s |
| | 5s |
| | 5s |
| | 8s |
| | 6m 40s |
| | 1m 4s |
| | 23s |
| Total | 9m 37s |