IOMMU support episode II #393

Merged Dec 11, 2024 (16 commits)

joelsmithTT

Issue

#370

Description

Adds IOMMU support for Blackhole in a way that should be transparent to the application.

List of the changes

  • Allow Blackhole to have multiple hugepages / host memory channels
  • Add an API on TTDevice for iATU programming
  • Rehome Blackhole iATU programming code to blackhole_tt_device.cpp
  • Remove unnecessary logic to determine hugepage quantity (just use what the application passes to Cluster constructor)
  • Add sysmem tests for Blackhole.

Testing

Manual testing was performed for both IOMMU on and IOMMU off cases using the newly-added sysmem tests for Blackhole.

With IOMMU on:

[==========] Running 2 tests from 1 test suite.
[----------] Global test environment set-up.
[----------] 2 tests from SiliconDriverBH
[ RUN      ] SiliconDriverBH.SysmemTestWithPcie
  Detecting chips (found 1)
2024-12-10 20:40:07.019 | WARNING  | SiliconDriver   - Unknown board type for chip 0. This might happen because chip is running old firmware. Defaulting to UNKNOWN
2024-12-10 20:40:07.020 | WARNING  | SiliconDriver   - Unknown board type for chip 0. This might happen because chip is running old firmware. Defaulting to UNKNOWN
2024-12-10 20:40:07.083 | INFO     | SiliconDriver   - Detected PCI devices: [0]
2024-12-10 20:40:07.083 | INFO     | SiliconDriver   - Using local chip ids: {0} and remote chip ids {}
2024-12-10 20:40:07.083 | INFO     | SiliconDriver   - Opened PCI device 0; KMD version: 1.30.0, IOMMU: enabled
2024-12-10 20:40:07.170 | INFO     | SiliconDriver   - Allocating sysmem without hugepages (size: 0x40000000).
2024-12-10 20:40:07.417 | INFO     | SiliconDriver   - Mapped sysmem without hugepages to IOVA 0x3ffffff80000000.
2024-12-10 20:40:07.418 | INFO     | SiliconDriver   - Device: 0 Mapping iATU region 0 from 0x0 to 0x3fffffff to 0x3ffffff80000000
[       OK ] SiliconDriverBH.SysmemTestWithPcie (658 ms)
[ RUN      ] SiliconDriverBH.RandomSysmemTestWithPcie
2024-12-10 20:40:07.672 | WARNING  | SiliconDriver   - Unknown board type for chip 0. This might happen because chip is running old firmware. Defaulting to UNKNOWN
2024-12-10 20:40:07.672 | WARNING  | SiliconDriver   - Unknown board type for chip 0. This might happen because chip is running old firmware. Defaulting to UNKNOWN
2024-12-10 20:40:07.731 | INFO     | SiliconDriver   - Detected PCI devices: [0]
2024-12-10 20:40:07.731 | INFO     | SiliconDriver   - Using local chip ids: {0} and remote chip ids {}
2024-12-10 20:40:07.731 | INFO     | SiliconDriver   - Opened PCI device 0; KMD version: 1.30.0, IOMMU: enabled
2024-12-10 20:40:07.818 | INFO     | SiliconDriver   - Allocating sysmem without hugepages (size: 0x40000000).
2024-12-10 20:40:08.081 | INFO     | SiliconDriver   - Mapped sysmem without hugepages to IOVA 0x3ffffff80000000.
2024-12-10 20:40:08.327 | WARNING  | SiliconDriver   - Unknown board type for chip 0. This might happen because chip is running old firmware. Defaulting to UNKNOWN
2024-12-10 20:40:08.327 | WARNING  | SiliconDriver   - Unknown board type for chip 0. This might happen because chip is running old firmware. Defaulting to UNKNOWN
2024-12-10 20:40:08.387 | INFO     | SiliconDriver   - Detected PCI devices: [0]
2024-12-10 20:40:08.387 | INFO     | SiliconDriver   - Using local chip ids: {0} and remote chip ids {}
2024-12-10 20:40:08.387 | INFO     | SiliconDriver   - Opened PCI device 0; KMD version: 1.30.0, IOMMU: enabled
2024-12-10 20:40:08.474 | INFO     | SiliconDriver   - Allocating sysmem without hugepages (size: 0x100000000).
2024-12-10 20:40:09.453 | INFO     | SiliconDriver   - Mapped sysmem without hugepages to IOVA 0x3fffffe00000000.
2024-12-10 20:40:09.453 | INFO     | SiliconDriver   - Device: 0 Mapping iATU region 0 from 0x0 to 0x3fffffff to 0x3fffffe00000000
2024-12-10 20:40:09.454 | INFO     | SiliconDriver   - Device: 0 Mapping iATU region 1 from 0x40000000 to 0x7fffffff to 0x3fffffe40000000
2024-12-10 20:40:09.454 | INFO     | SiliconDriver   - Device: 0 Mapping iATU region 2 from 0x80000000 to 0xbfffffff to 0x3fffffe80000000
2024-12-10 20:40:09.454 | INFO     | SiliconDriver   - Device: 0 Mapping iATU region 3 from 0xc0000000 to 0xffffffff to 0x3fffffec0000000
[       OK ] SiliconDriverBH.RandomSysmemTestWithPcie (7754 ms)
[----------] 2 tests from SiliconDriverBH (8413 ms total)

[----------] Global test environment tear-down
[==========] 2 tests from 1 test suite ran. (8413 ms total)
[  PASSED  ] 2 tests.

With IOMMU in passthrough:

[==========] Running 2 tests from 1 test suite.
[----------] Global test environment set-up.
[----------] 2 tests from SiliconDriverBH
[ RUN      ] SiliconDriverBH.SysmemTestWithPcie
  Detecting chips (found 1)
2024-12-10 20:59:03.744 | WARNING  | SiliconDriver   - Unknown board type for chip 0. This might happen because chip is running old firmware. Defaulting to UNKNOWN
2024-12-10 20:59:03.745 | WARNING  | SiliconDriver   - Unknown board type for chip 0. This might happen because chip is running old firmware. Defaulting to UNKNOWN
2024-12-10 20:59:03.812 | INFO     | SiliconDriver   - Detected PCI devices: [0]
2024-12-10 20:59:03.812 | INFO     | SiliconDriver   - Using local chip ids: {0} and remote chip ids {}
2024-12-10 20:59:03.813 | INFO     | SiliconDriver   - Opened PCI device 0; KMD version: 1.30.0, IOMMU: disabled
2024-12-10 20:59:03.928 | INFO     | SiliconDriver   - Device: 0 Mapping iATU region 0 from 0x0 to 0x3fffffff to 0xe00000000
[       OK ] SiliconDriverBH.SysmemTestWithPcie (383 ms)
[ RUN      ] SiliconDriverBH.RandomSysmemTestWithPcie
2024-12-10 20:59:04.121 | WARNING  | SiliconDriver   - Unknown board type for chip 0. This might happen because chip is running old firmware. Defaulting to UNKNOWN
2024-12-10 20:59:04.121 | WARNING  | SiliconDriver   - Unknown board type for chip 0. This might happen because chip is running old firmware. Defaulting to UNKNOWN
2024-12-10 20:59:04.177 | INFO     | SiliconDriver   - Detected PCI devices: [0]
2024-12-10 20:59:04.177 | INFO     | SiliconDriver   - Using local chip ids: {0} and remote chip ids {}
2024-12-10 20:59:04.177 | INFO     | SiliconDriver   - Opened PCI device 0; KMD version: 1.30.0, IOMMU: disabled
2024-12-10 20:59:04.380 | WARNING  | SiliconDriver   - Unknown board type for chip 0. This might happen because chip is running old firmware. Defaulting to UNKNOWN
2024-12-10 20:59:04.380 | WARNING  | SiliconDriver   - Unknown board type for chip 0. This might happen because chip is running old firmware. Defaulting to UNKNOWN
2024-12-10 20:59:04.435 | INFO     | SiliconDriver   - Detected PCI devices: [0]
2024-12-10 20:59:04.435 | INFO     | SiliconDriver   - Using local chip ids: {0} and remote chip ids {}
2024-12-10 20:59:04.436 | INFO     | SiliconDriver   - Opened PCI device 0; KMD version: 1.30.0, IOMMU: disabled
2024-12-10 20:59:04.513 | INFO     | SiliconDriver   - Device: 0 Mapping iATU region 0 from 0x0 to 0x3fffffff to 0xe00000000
2024-12-10 20:59:04.513 | INFO     | SiliconDriver   - Device: 0 Mapping iATU region 1 from 0x40000000 to 0x7fffffff to 0xe40000000
2024-12-10 20:59:04.513 | INFO     | SiliconDriver   - Device: 0 Mapping iATU region 2 from 0x80000000 to 0xbfffffff to 0xe80000000
2024-12-10 20:59:04.513 | INFO     | SiliconDriver   - Device: 0 Mapping iATU region 3 from 0xc0000000 to 0xffffffff to 0xec0000000
[       OK ] SiliconDriverBH.RandomSysmemTestWithPcie (11055 ms)
[----------] 2 tests from SiliconDriverBH (11438 ms total)

[----------] Global test environment tear-down
[==========] 2 tests from 1 test suite ran. (11438 ms total)
[  PASSED  ] 2 tests.
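
The iATU lines in both logs follow the same pattern: the device-visible range starting at 0x0 is cut into 1 GiB windows, and window N targets the host base plus N GiB. The sketch below reproduces that arithmetic only. `IatuRegion` and `plan_iatu_regions` are hypothetical names for illustration and are not part of UMD. It also assumes the host side is one contiguous range, which matches the IOVA case in the log but is not guaranteed for separately allocated hugepages.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical type for illustration; not a UMD type.
struct IatuRegion {
    uint32_t index;         // iATU region number
    uint64_t device_base;   // first address as seen by the device
    uint64_t device_limit;  // last address as seen by the device (inclusive)
    uint64_t host_target;   // IOVA (IOMMU on) or physical address (IOMMU off)
};

constexpr uint64_t kRegionSize = 1ULL << 30;  // 1 GiB windows, as in the logs

// Split a contiguous host range into 1 GiB iATU windows.
// Assumption: total_size is a whole number of windows.
std::vector<IatuRegion> plan_iatu_regions(uint64_t host_base, uint64_t total_size) {
    assert(total_size % kRegionSize == 0);
    std::vector<IatuRegion> regions;
    for (uint64_t offset = 0; offset < total_size; offset += kRegionSize) {
        regions.push_back({static_cast<uint32_t>(offset / kRegionSize),
                           offset,
                           offset + kRegionSize - 1,
                           host_base + offset});
    }
    return regions;
}
```

Calling `plan_iatu_regions(0x3fffffe00000000, 0x100000000)` yields four regions whose targets match the IOMMU-enabled log; with a base of `0xe00000000` it matches the passthrough log.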

API Changes

There are no API changes in this PR.

@joelsmithTT joelsmithTT self-assigned this Dec 11, 2024

@broskoTT broskoTT left a comment

Nice! Are there going to be further PRs, or will sysmem just work now on tt_metal on a blackhole machine without hugepages?

@joelsmithTT

> Are there going to be further PRs, or will sysmem just work now on tt_metal on a blackhole machine without hugepages?

This should just work provided the IOMMU is enabled. It will fail if IOMMU is disabled or in passthrough mode and there are no hugepages available. I suppose a caveat here is this hasn't been extensively tested.
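
A minimal sketch of the decision described in this reply, under the assumption that it reduces to two inputs: whether the IOMMU is translating (the logs report passthrough as "IOMMU: disabled") and how many hugepages are available. `SysmemBacking` and `choose_sysmem_backing` are hypothetical names, not UMD API.

```cpp
#include <cstddef>

// Hypothetical enum for illustration; not a UMD type.
enum class SysmemBacking {
    IommuMapped,  // ordinary pages mapped to an IOVA; no hugepages needed
    Hugepages,    // IOMMU off or passthrough: hugepages back sysmem
    Unavailable   // IOMMU off or passthrough and no hugepages: sysmem setup fails
};

// Pick a backing for sysmem following the rule stated in the reply above.
SysmemBacking choose_sysmem_backing(bool iommu_translating, std::size_t hugepages_available) {
    if (iommu_translating) {
        return SysmemBacking::IommuMapped;
    }
    if (hugepages_available > 0) {
        return SysmemBacking::Hugepages;
    }
    return SysmemBacking::Unavailable;
}
```

The sketch prefers the IOMMU path whenever translation is on, which mirrors the "Allocating sysmem without hugepages" lines in the IOMMU-enabled log.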

@joelsmithTT joelsmithTT merged commit bf740bd into main Dec 11, 2024
24 checks passed
@joelsmithTT joelsmithTT deleted the joelsmith/iommu-support-episode-ii branch December 11, 2024 20:17
broskoTT pushed a commit that referenced this pull request Dec 13, 2024
Reverts commit 3210bd9 from
#393.

### Issue
Metal CI failure tracked in
tenstorrent/tt-metal#15675

### Description
The reverted commit removed logic that allowed applications to request
more hugepages than available to UMD. Previously, UMD would issue a
warning in such cases. However, this created a potential safety issue
since applications had no visibility into partial hugepage allocation
(e.g., requesting 4 pages but receiving only 2).

This situation could lead to:
- Host software segfaults when accessing unmapped pages
- More critically, device software could potentially corrupt host
physical address space by writing to nonexistent pages

While the original change (making excessive hugepage requests a fatal
error) improved safety, particularly in conjunction with IOMMU
enablement, it caused failures in Metal CI tests that (possibly
unintentionally?) request more hugepages than available. This revert is
a temporary measure until the Metal CI tests can be updated.
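
The two behaviours discussed above can be sketched as a single check with a policy switch. This is an illustration only: `HugepagePolicy` and `resolve_hugepage_count` are hypothetical names, and the real UMD logic may be structured differently.

```cpp
#include <cstddef>
#include <iostream>
#include <stdexcept>
#include <string>

// Hypothetical policy switch for illustration; not a UMD type.
enum class HugepagePolicy {
    WarnAndContinue,  // behaviour restored by the revert
    Fatal             // behaviour of the reverted commit
};

// Returns how many hugepages the caller actually gets.
std::size_t resolve_hugepage_count(std::size_t requested,
                                   std::size_t available,
                                   HugepagePolicy policy) {
    if (requested <= available) {
        return requested;
    }
    const std::string msg = "requested " + std::to_string(requested) +
                            " hugepages but only " + std::to_string(available) +
                            " are available";
    if (policy == HugepagePolicy::Fatal) {
        throw std::runtime_error(msg);
    }
    // Lenient path: the caller silently receives fewer pages than requested,
    // which is the visibility problem described above.
    std::cerr << "warning: " << msg << '\n';
    return available;
}
```

With `WarnAndContinue`, a request for 4 pages against 2 available returns 2, so the application must compare the return value with its request to notice the shortfall. With `Fatal`, the same request throws.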


### List of the changes
* Revert 3210bd9
* Update comment documentation

### Testing
CI

### API Changes
There are no API changes in this PR.