UMD create_mock_cluster and fix PhysicalCoordinate #15411

Merged: 9 commits merged on Nov 26, 2024

@broskoTT (Contributor) commented Nov 24, 2024

This PR is similar to #15301. That change previously broke CI pipelines; this PR includes a fix for that.

Ticket

Related to #13948

Problem description

Tied to UMD change tenstorrent/tt-umd#310

What's changed

  • Rename create_for_grayskull_cluster to create_mock_cluster and clean it up.
  • For Grayskull, the CEM already works, and the create_from_yaml path is already used successfully.
  • eth_coord_t was updated in tt-umd#306 (Support multiple unconnected clusters); the way PhysicalCoordinate is created has changed to reflect that.

Checklist

broskoTT added a commit to tenstorrent/tt-umd that referenced this pull request Nov 26, 2024
### Issue
No issue

### Description
Related to tenstorrent/tt-metal#15411
While working on it, I realised that cluster_id could take an arbitrary
chip_id.
This change makes it deterministic by always choosing the smallest one.

### List of the changes
- Always choose the smallest number in the set for cluster_id
- Wrote a test which failed before this change
- Other minor changes in the test so that it does not segfault on a machine
without cards.

### Testing
Wrote a new CI test

### API Changes
There are no API changes in this PR.
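
The deterministic choice described in this commit message can be sketched as follows. This is a minimal illustration, not the actual tt-umd code: the helper name `pick_cluster_id`, the `chip_id_t` alias, and the use of `std::unordered_set` are assumptions made for the example. The point is that taking the minimum of the set gives the same answer regardless of container iteration order, whereas taking "the first element" of an unordered container may not.

```cpp
#include <algorithm>
#include <cassert>
#include <unordered_set>

// Assumed alias for illustration; the real type lives in tt-umd.
using chip_id_t = int;

// Hypothetical helper: choose a cluster_id for a group of connected chips.
// Using the smallest chip_id makes the result independent of the
// (unspecified) iteration order of the unordered container.
chip_id_t pick_cluster_id(const std::unordered_set<chip_id_t>& chips_in_cluster) {
    // Precondition: a cluster contains at least one chip.
    assert(!chips_in_cluster.empty());
    return *std::min_element(chips_in_cluster.begin(), chips_in_cluster.end());
}
```

With this approach, a cluster containing chips 3, 1 and 2 always gets cluster_id 1, no matter in which order the chips were discovered or inserted.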
broskoTT and others added 5 commits November 26, 2024 12:23
### Ticket
Related to #13948

### Problem description
Tied to UMD change tenstorrent/tt-umd#310

### What's changed
- Rename create_for_grayskull_cluster to create_mock_cluster and clean
it up
- For Grayskull, the CEM already works, and the create_from_yaml path is
already used successfully

### Checklist
- [x] Post commit CI passes :
https://github.com/tenstorrent/tt-metal/actions/runs/11972671759
- [x] Blackhole Post commit (if applicable): Not applicable
- [x] Model regression CI testing passes (if applicable): Not applicable
- [x] Device performance regression CI testing passes (if applicable):
Not applicable
- [x] New/Existing tests provide coverage for changes: Not applicable
@broskoTT broskoTT merged commit bd6187c into main Nov 26, 2024
194 of 200 checks passed
@broskoTT broskoTT deleted the brosko/umd_fix branch November 26, 2024 19:45
gfengTT pushed a commit that referenced this pull request Nov 27, 2024
This PR is similar to #15301. That change previously broke CI pipelines;
this PR includes a fix for that.

### Ticket
Related to #13948

### Problem description
Tied to UMD change tenstorrent/tt-umd#310

### What's changed
- Rename create_for_grayskull_cluster to create_mock_cluster and clean
it up
- For Grayskull, the CEM already works, and the create_from_yaml path is
already used successfully
- eth_coord_t was updated in tenstorrent/tt-umd#306; the way
PhysicalCoordinate is created has changed to reflect that.

### Checklist
- [x] All post-commit tests :
https://github.com/tenstorrent/tt-metal/actions/runs/12033347997
- [x] Blackhole post-commit tests :
https://github.com/tenstorrent/tt-metal/actions/runs/12033347995
- [ ] (Single-card) Model perf tests :
https://github.com/tenstorrent/tt-metal/actions/runs/12033439959
- [ ] (Single-card) Device perf regressions :
https://github.com/tenstorrent/tt-metal/actions/runs/12033442095
- [x] (T3K) T3000 unit tests :
https://github.com/tenstorrent/tt-metal/actions/runs/12033456773
- [x] (T3K) T3000 demo tests :
https://github.com/tenstorrent/tt-metal/actions/runs/12033517770
- [x] (TG) TG unit tests :
https://github.com/tenstorrent/tt-metal/actions/runs/12033520415
- [x] (TG) TG demo tests :
https://github.com/tenstorrent/tt-metal/actions/runs/12033522422
- [x] (TGG) TGG unit tests :
https://github.com/tenstorrent/tt-metal/actions/runs/12033524632
- [x] (TGG) TGG demo tests :
https://github.com/tenstorrent/tt-metal/actions/runs/12033526704
4 participants