[wip] Writing coordinate system docs
pjanevskiTT committed Oct 29, 2024
1 parent 971d39a commit 50f18df
Showing 1 changed file with 75 additions and 50 deletions.
125 changes: 75 additions & 50 deletions docs/coordinate_systems.md
@@ -26,23 +26,84 @@ Wiki on coordinates understood by open-UMD
* [Important Notes about Harvesting](#important-notes-about-harvesting)
+ [Additional Notes about the Translation Scheme in UMD](#additional-notes-about-the-translation-scheme-in-umd)

# Summary

# API and Coordinates Usage
## Important Notes
* UMD accepts Virtual Coordinates (these match Physical Coordinates on Grayskull)
* Any application programming overlay hardware/streams must use Translated Coordinates on a Wormhole Device with translation tables enabled
* Regular Physical Coordinates can be used on a Wormhole Device with Translation Tables disabled
* Translation Tables are a prerequisite for Wormhole Harvesting
* As part of product definition, translation tables are enabled on Wormhole devices regardless of their grid size
This document is a guide to harvesting on Tenstorrent chips and to the coordinate systems used to address cores on the chip.

This document describes coordinate systems of the chip cores and harvesting in the following sequence:

1. Harvesting basics
2. Different coordinate systems used
3. How harvesting affects coordinate systems
4. Programming guide using different coordinate systems

Before reading this document, the reader should be familiar with the following concepts:
- General architecture of the current generation of Tenstorrent chips (Grayskull, Wormhole, Blackhole)
- Differences between the core types (Tensix, DRAM, PCIe, ARC, Ethernet)

## Important notes for further reading

- The notation X x Y (for example, 8x10) means that there are X cores along the x axis and Y cores along the y axis. In row/column terms, that is Y rows and X columns. An example for 8x10 is shown in the image below.
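
The same convention can be written down in code. The sketch below is only an illustration: `GridSize`, `parse_grid`, and the accessor functions are hypothetical names introduced here and are not part of UMD.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper type (not part of UMD): a grid written as "X x Y".
struct GridSize {
    std::size_t x;  // cores along the x axis, i.e. number of columns
    std::size_t y;  // cores along the y axis, i.e. number of rows
};

// Parses the compact form used in this document, e.g. "8x10".
inline GridSize parse_grid(const std::string& text) {
    const std::size_t sep = text.find('x');
    assert(sep != std::string::npos && "expected the form <X>x<Y>");
    return GridSize{static_cast<std::size_t>(std::stoul(text.substr(0, sep))),
                    static_cast<std::size_t>(std::stoul(text.substr(sep + 1)))};
}

inline std::size_t num_columns(const GridSize& g) { return g.x; }
inline std::size_t num_rows(const GridSize& g) { return g.y; }
inline std::size_t num_cores(const GridSize& g) { return g.x * g.y; }
```

For `parse_grid("8x10")` this gives 8 columns, 10 rows, and 80 cores in total.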

# Harvesting basics

Harvesting means turning off certain cores on the chip. This is done for various reasons; for example, faulty cores can be harvested. In theory any core could be harvested, but in practice only certain types of cores are harvested on Tenstorrent chips.

### Grayskull harvesting

There is no harvesting on Grayskull, so the full grid of Tensix cores (12x10) is available on every Grayskull chip.

Harvesting of non-tensix cores (DRAM, PCIe, ARC, Ethernet) is also not supported.

### Wormhole harvesting

On Wormhole, harvesting of Tensix rows is supported. The Tensix grid (8x10) therefore always has 8 columns of cores, but the number of rows can decrease. In practice, Wormhole chips have one or two rows harvested. An example with two harvested rows is shown in the image below.

(TODO: attach image of harvested rows)

Note that there is no limitation on which specific rows we can harvest.

Harvesting of non-tensix cores (DRAM, PCIe, ARC, Ethernet) is not supported on Wormhole.
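
As a rough sketch of what row harvesting does to the Tensix grid, the code below removes a set of harvested rows from the 8x10 Wormhole grid. Describing harvested rows as a set of 0-based row indices is an assumption made for this example, as are the names `TensixGrid`, `grid_after_row_harvesting`, and `active_rows`. None of this reflects how UMD or the hardware actually encodes harvesting.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

constexpr std::size_t kWormholeTensixColumns = 8;  // from the 8x10 grid
constexpr std::size_t kWormholeTensixRows = 10;

// Hypothetical type for this example only.
struct TensixGrid {
    std::size_t columns;
    std::size_t rows;
};

// Assumption: harvested rows are given as 0-based indices into the Tensix grid.
// Row harvesting leaves the column count untouched and removes whole rows.
inline TensixGrid grid_after_row_harvesting(const std::set<std::size_t>& harvested_rows) {
    for (std::size_t row : harvested_rows) {
        assert(row < kWormholeTensixRows);
    }
    return TensixGrid{kWormholeTensixColumns, kWormholeTensixRows - harvested_rows.size()};
}

// Lists the rows that remain usable, in increasing order.
inline std::vector<std::size_t> active_rows(const std::set<std::size_t>& harvested_rows) {
    std::vector<std::size_t> rows;
    for (std::size_t row = 0; row < kWormholeTensixRows; ++row) {
        if (harvested_rows.count(row) == 0) {
            rows.push_back(row);
        }
    }
    return rows;
}
```

With rows 2 and 7 harvested, the grid shrinks from 8x10 to 8x8, and the remaining rows are 0, 1, 3, 4, 5, 6, 8, and 9. Because any row may be harvested, the remaining rows are not necessarily contiguous.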

### Blackhole harvesting

On Blackhole, harvesting of Tensix columns is supported. The Tensix grid (14x10) therefore always has 10 rows of Tensix cores, but the number of columns may decrease. In practice, Blackhole chips have (TODO: how many columns) columns harvested. An example with two harvested columns is shown in the image below.

(TODO: attach image of harvested columns)

Note that there is no limitation on which specific columns we can harvest.

(TODO: dram harvesting)

Harvesting of other cores (PCIe, ARC, Ethernet) is not supported on Blackhole.

# Coordinate System Definitions
## Physical Coordinates - Any Arch
These are the NOC coordinates that the hardware understands. There are two distinct variations, one for NOC0 and one for NOC1. In hardware, each node is given an ID (different for each NOC) that identifies the node. In the SOC descriptor, physical coordinates are specified for NOC0.

### Important Notes
* The coordinates are **NOT** contiguous for Tensix cores -- in the example shown, Ethernet corresponds to physical coords `[*-6]`: this row is not in the grid of worker cores
* The coordinates are statically assigned to each "node" regardless of harvesting
* `grayskull` coordinates are listed in `[y-x]`, while `wormhole_b0` coordinates are listed in `[x-y]`
## Logical coordinates

This coordinate system is not used within UMD. It hides the details of physical coordinates and allows upper layers of the stack to access tensix endpoints through a set of traditional Cartesian Coordinates (starting at `0-0`).

### Example
Given the physical worker grid for `wormhole_b0` in the diagram above, the following set of logical coordinates (in `x-y`) can be used to access each physical core:

```yaml
functional_workers:
[ # Each node specifies logical coords specifically.
0-0, 1-0, 2-0, 3-0, 4-0, 5-0, 6-0, 7-0,
0-1, 1-1, 2-1, 3-1, 4-1, 5-1, 6-1, 7-1,
0-2, 1-2, 2-2, 3-2, 4-2, 5-2, 6-2, 7-2,
0-3, 1-3, 2-3, 3-3, 4-3, 5-3, 6-3, 7-3,
0-4, 1-4, 2-4, 3-4, 4-4, 5-4, 6-4, 7-4,
0-5, 1-5, 2-5, 3-5, 4-5, 5-5, 6-5, 7-5,
0-6, 1-6, 2-6, 3-6, 4-6, 5-6, 6-6, 7-6,
0-7, 1-7, 2-7, 3-7, 4-7, 5-7, 6-7, 7-7,
0-8, 1-8, 2-8, 3-8, 4-8, 5-8, 6-8, 7-8,
0-9, 1-9, 2-9, 3-9, 4-9, 5-9, 6-9, 7-9,
]
```
## Physical Coordinates
These are the NOC coordinates that the hardware understands. There are two distinct variations, one for NOC0 and one for NOC1. In hardware, each node is given an ID (different for each NOC) that identifies the node. In the SOC descriptor, physical coordinates are specified for NOC0.
### Example
Example for `wormhole_b0`, where coordinates are specified in `[x-y]` pairs for `NOC0`
@@ -68,43 +129,7 @@ functional_workers:
```
A similar example displaying physical coordinates on Grayskull can be found in `tests/soc_descs/grayskull_10x12.yaml` inside the UMD repo.

### Mapping relationship between NOC0/NOC1
The macros below map physical coordinates to `NOC0`/`NOC1` coordinates. Note that physical coordinates map to `NOC0` directly, and that `noc_size_*` depends on `ARCH`.
```cpp
#define NOC_X(x) (noc == NOC0 ? (x) : (noc_size_x-1-(x)))
#define NOC_Y(y) (noc == NOC0 ? (y) : (noc_size_y-1-(y)))
```
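
The macros above can also be expressed as a function, which makes the mirroring easier to check. This is a sketch, not UMD code: the `Noc` enum, the `NocCoord` struct, and `physical_to_noc` are hypothetical names, and the NOC grid size is passed in explicitly rather than looked up per architecture.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for this example; UMD's own types may differ.
enum class Noc { NOC0, NOC1 };

struct NocCoord {
    std::uint32_t x;
    std::uint32_t y;
};

// Function form of NOC_X / NOC_Y: NOC0 uses the physical coordinate as is,
// NOC1 mirrors it along both axes of the NOC grid.
inline NocCoord physical_to_noc(NocCoord physical, Noc noc,
                                std::uint32_t noc_size_x, std::uint32_t noc_size_y) {
    assert(physical.x < noc_size_x && physical.y < noc_size_y);
    if (noc == Noc::NOC0) {
        return physical;
    }
    return NocCoord{noc_size_x - 1 - physical.x, noc_size_y - 1 - physical.y};
}
```

Taking a 10x12 NOC grid purely as an illustrative size, physical `1-1` stays `1-1` on `NOC0` and becomes `8-10` on `NOC1`. Applying the `NOC1` mapping a second time returns the original coordinate, since mirroring is its own inverse.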

## Logical Coordinates - Any Arch
This coordinate system is not used within UMD. It hides the details of physical coordinates and allows upper layers of the stack to access tensix endpoints through a set of traditional Cartesian Coordinates (starting at `0-0`).

### Important Notes
* These coordinates cannot be used in APIs provided by UMD. UMD expects virtual coordinates to be passed into its APIs (see below for definition).
* This coordinate system is easier to use than physical coordinates and can be used by Buda or Metal. However, certain operations may require the use of physical coordinates. For this, the SOC descriptor class in UMD presents the following translation layers:

```
std::unordered_map<int, int> worker_log_to_routing_x; // worker logical to routing (x)
std::unordered_map<int, int> worker_log_to_routing_y; // worker logical to routing (y)
```
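
The following sketch shows how such maps might be used to go from logical to routing coordinates. It assumes that logical index `i` maps to the `i`-th physical column or row that contains a worker core. How UMD actually populates these maps is not shown in this document, so `WorkerTranslation`, `build_translation`, and `logical_to_routing` are hypothetical, and the column/row values in the usage note are illustrative only.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <utility>
#include <vector>

// Stand-in for the two maps shown above (hypothetical wrapper, not the UMD class).
struct WorkerTranslation {
    std::unordered_map<int, int> worker_log_to_routing_x;
    std::unordered_map<int, int> worker_log_to_routing_y;
};

// Assumption: logical index i maps to the i-th entry of the ordered list of
// physical columns (or rows) that hold worker cores.
inline WorkerTranslation build_translation(const std::vector<int>& physical_xs,
                                           const std::vector<int>& physical_ys) {
    WorkerTranslation t;
    for (std::size_t i = 0; i < physical_xs.size(); ++i) {
        t.worker_log_to_routing_x[static_cast<int>(i)] = physical_xs[i];
    }
    for (std::size_t i = 0; i < physical_ys.size(); ++i) {
        t.worker_log_to_routing_y[static_cast<int>(i)] = physical_ys[i];
    }
    return t;
}

// Translates one logical (x, y) pair into a routing (x, y) pair.
inline std::pair<int, int> logical_to_routing(const WorkerTranslation& t, int lx, int ly) {
    assert(t.worker_log_to_routing_x.count(lx) == 1);
    assert(t.worker_log_to_routing_y.count(ly) == 1);
    return {t.worker_log_to_routing_x.at(lx), t.worker_log_to_routing_y.at(ly)};
}
```

For example, with illustrative worker columns `{1, 2, 3, 4, 6, 7, 8, 9}` and worker rows `{1, 2, 3, 4, 5, 7, 8, 9, 10, 11}`, logical `0-0` translates to `1-1` and logical `4-5` translates to `6-7`. The logical grid stays contiguous even though the underlying coordinates are not.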
### Example
Given the physical worker grid for `wormhole_b0` in the diagram above, the following set of logical coordinates (in `x-y`) can be used to access each physical core:
```yaml
functional_workers:
[ # Each node specifies logical coords specifically.
0-0, 1-0, 2-0, 3-0, 4-0, 5-0, 6-0, 7-0,
0-1, 1-1, 2-1, 3-1, 4-1, 5-1, 6-1, 7-1,
0-2, 1-2, 2-2, 3-2, 4-2, 5-2, 6-2, 7-2,
0-3, 1-3, 2-3, 3-3, 4-3, 5-3, 6-3, 7-3,
0-4, 1-4, 2-4, 3-4, 4-4, 5-4, 6-4, 7-4,
0-5, 1-5, 2-5, 3-5, 4-5, 5-5, 6-5, 7-5,
0-6, 1-6, 2-6, 3-6, 4-6, 5-6, 6-6, 7-6,
0-7, 1-7, 2-7, 3-7, 4-7, 5-7, 6-7, 7-7,
0-8, 1-8, 2-8, 3-8, 4-8, 5-8, 6-8, 7-8,
0-9, 1-9, 2-9, 3-9, 4-9, 5-9, 6-9, 7-9,
]
```
## Virtual coordinates

## Hardware Translated Coordinates - Only for Wormhole and beyond
**Note: This is an implementation detail in UMD that upper layers of the stack are not exposed to. However, device binaries using streams should use the translated coordinate scheme presented here.**