Blackhole IOMMU support #370
Merged
joelsmithTT added a commit that referenced this issue Dec 11, 2024
### Issue
#370

### Description
Adds IOMMU support for Blackhole in a way that should be transparent to the application.

### List of the changes
* Allow Blackhole to have multiple hugepages / host memory channels
* Add an API on TTDevice for iATU programming
* Rehome Blackhole iATU programming code to blackhole_tt_device.cpp
* Remove unnecessary logic to determine hugepage quantity (just use what the application passes to Cluster constructor)
* Add sysmem tests for Blackhole.

### Testing
Manual testing was performed for both IOMMU on and IOMMU off cases using the newly-added sysmem tests for Blackhole.

With IOMMU on:
```
[==========] Running 2 tests from 1 test suite.
[----------] Global test environment set-up.
[----------] 2 tests from SiliconDriverBH
[ RUN ] SiliconDriverBH.SysmemTestWithPcie
Detecting chips (found 1)
2024-12-10 20:40:07.019 | WARNING | SiliconDriver - Unknown board type for chip 0. This might happen because chip is running old firmware. Defaulting to UNKNOWN
2024-12-10 20:40:07.020 | WARNING | SiliconDriver - Unknown board type for chip 0. This might happen because chip is running old firmware. Defaulting to UNKNOWN
2024-12-10 20:40:07.083 | INFO | SiliconDriver - Detected PCI devices: [0]
2024-12-10 20:40:07.083 | INFO | SiliconDriver - Using local chip ids: {0} and remote chip ids {}
2024-12-10 20:40:07.083 | INFO | SiliconDriver - Opened PCI device 0; KMD version: 1.30.0, IOMMU: enabled
2024-12-10 20:40:07.170 | INFO | SiliconDriver - Allocating sysmem without hugepages (size: 0x40000000).
2024-12-10 20:40:07.417 | INFO | SiliconDriver - Mapped sysmem without hugepages to IOVA 0x3ffffff80000000.
2024-12-10 20:40:07.418 | INFO | SiliconDriver - Device: 0 Mapping iATU region 0 from 0x0 to 0x3fffffff to 0x3ffffff80000000
[ OK ] SiliconDriverBH.SysmemTestWithPcie (658 ms)
[ RUN ] SiliconDriverBH.RandomSysmemTestWithPcie
2024-12-10 20:40:07.672 | WARNING | SiliconDriver - Unknown board type for chip 0. This might happen because chip is running old firmware. Defaulting to UNKNOWN
2024-12-10 20:40:07.672 | WARNING | SiliconDriver - Unknown board type for chip 0. This might happen because chip is running old firmware. Defaulting to UNKNOWN
2024-12-10 20:40:07.731 | INFO | SiliconDriver - Detected PCI devices: [0]
2024-12-10 20:40:07.731 | INFO | SiliconDriver - Using local chip ids: {0} and remote chip ids {}
2024-12-10 20:40:07.731 | INFO | SiliconDriver - Opened PCI device 0; KMD version: 1.30.0, IOMMU: enabled
2024-12-10 20:40:07.818 | INFO | SiliconDriver - Allocating sysmem without hugepages (size: 0x40000000).
2024-12-10 20:40:08.081 | INFO | SiliconDriver - Mapped sysmem without hugepages to IOVA 0x3ffffff80000000.
2024-12-10 20:40:08.327 | WARNING | SiliconDriver - Unknown board type for chip 0. This might happen because chip is running old firmware. Defaulting to UNKNOWN
2024-12-10 20:40:08.327 | WARNING | SiliconDriver - Unknown board type for chip 0. This might happen because chip is running old firmware. Defaulting to UNKNOWN
2024-12-10 20:40:08.387 | INFO | SiliconDriver - Detected PCI devices: [0]
2024-12-10 20:40:08.387 | INFO | SiliconDriver - Using local chip ids: {0} and remote chip ids {}
2024-12-10 20:40:08.387 | INFO | SiliconDriver - Opened PCI device 0; KMD version: 1.30.0, IOMMU: enabled
2024-12-10 20:40:08.474 | INFO | SiliconDriver - Allocating sysmem without hugepages (size: 0x100000000).
2024-12-10 20:40:09.453 | INFO | SiliconDriver - Mapped sysmem without hugepages to IOVA 0x3fffffe00000000.
2024-12-10 20:40:09.453 | INFO | SiliconDriver - Device: 0 Mapping iATU region 0 from 0x0 to 0x3fffffff to 0x3fffffe00000000
2024-12-10 20:40:09.454 | INFO | SiliconDriver - Device: 0 Mapping iATU region 1 from 0x40000000 to 0x7fffffff to 0x3fffffe40000000
2024-12-10 20:40:09.454 | INFO | SiliconDriver - Device: 0 Mapping iATU region 2 from 0x80000000 to 0xbfffffff to 0x3fffffe80000000
2024-12-10 20:40:09.454 | INFO | SiliconDriver - Device: 0 Mapping iATU region 3 from 0xc0000000 to 0xffffffff to 0x3fffffec0000000
[ OK ] SiliconDriverBH.RandomSysmemTestWithPcie (7754 ms)
[----------] 2 tests from SiliconDriverBH (8413 ms total)
[----------] Global test environment tear-down
[==========] 2 tests from 1 test suite ran. (8413 ms total)
[ PASSED ] 2 tests.
```

With IOMMU in passthrough:
```
[==========] Running 2 tests from 1 test suite.
[----------] Global test environment set-up.
[----------] 2 tests from SiliconDriverBH
[ RUN ] SiliconDriverBH.SysmemTestWithPcie
Detecting chips (found 1)
2024-12-10 20:59:03.744 | WARNING | SiliconDriver - Unknown board type for chip 0. This might happen because chip is running old firmware. Defaulting to UNKNOWN
2024-12-10 20:59:03.745 | WARNING | SiliconDriver - Unknown board type for chip 0. This might happen because chip is running old firmware. Defaulting to UNKNOWN
2024-12-10 20:59:03.812 | INFO | SiliconDriver - Detected PCI devices: [0]
2024-12-10 20:59:03.812 | INFO | SiliconDriver - Using local chip ids: {0} and remote chip ids {}
2024-12-10 20:59:03.813 | INFO | SiliconDriver - Opened PCI device 0; KMD version: 1.30.0, IOMMU: disabled
2024-12-10 20:59:03.928 | INFO | SiliconDriver - Device: 0 Mapping iATU region 0 from 0x0 to 0x3fffffff to 0xe00000000
[ OK ] SiliconDriverBH.SysmemTestWithPcie (383 ms)
[ RUN ] SiliconDriverBH.RandomSysmemTestWithPcie
2024-12-10 20:59:04.121 | WARNING | SiliconDriver - Unknown board type for chip 0. This might happen because chip is running old firmware. Defaulting to UNKNOWN
2024-12-10 20:59:04.121 | WARNING | SiliconDriver - Unknown board type for chip 0. This might happen because chip is running old firmware. Defaulting to UNKNOWN
2024-12-10 20:59:04.177 | INFO | SiliconDriver - Detected PCI devices: [0]
2024-12-10 20:59:04.177 | INFO | SiliconDriver - Using local chip ids: {0} and remote chip ids {}
2024-12-10 20:59:04.177 | INFO | SiliconDriver - Opened PCI device 0; KMD version: 1.30.0, IOMMU: disabled
2024-12-10 20:59:04.380 | WARNING | SiliconDriver - Unknown board type for chip 0. This might happen because chip is running old firmware. Defaulting to UNKNOWN
2024-12-10 20:59:04.380 | WARNING | SiliconDriver - Unknown board type for chip 0. This might happen because chip is running old firmware. Defaulting to UNKNOWN
2024-12-10 20:59:04.435 | INFO | SiliconDriver - Detected PCI devices: [0]
2024-12-10 20:59:04.435 | INFO | SiliconDriver - Using local chip ids: {0} and remote chip ids {}
2024-12-10 20:59:04.436 | INFO | SiliconDriver - Opened PCI device 0; KMD version: 1.30.0, IOMMU: disabled
2024-12-10 20:59:04.513 | INFO | SiliconDriver - Device: 0 Mapping iATU region 0 from 0x0 to 0x3fffffff to 0xe00000000
2024-12-10 20:59:04.513 | INFO | SiliconDriver - Device: 0 Mapping iATU region 1 from 0x40000000 to 0x7fffffff to 0xe40000000
2024-12-10 20:59:04.513 | INFO | SiliconDriver - Device: 0 Mapping iATU region 2 from 0x80000000 to 0xbfffffff to 0xe80000000
2024-12-10 20:59:04.513 | INFO | SiliconDriver - Device: 0 Mapping iATU region 3 from 0xc0000000 to 0xffffffff to 0xec0000000
[ OK ] SiliconDriverBH.RandomSysmemTestWithPcie (11055 ms)
[----------] 2 tests from SiliconDriverBH (11438 ms total)
[----------] Global test environment tear-down
[==========] 2 tests from 1 test suite ran. (11438 ms total)
[ PASSED ] 2 tests.
```

### API Changes
There are no API changes in this PR.
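
For readers of the test logs, the iATU mappings follow a regular pattern: region `i` covers device addresses `i * 0x40000000` through `(i + 1) * 0x40000000 - 1` and targets the host base plus the same offset. The sketch below reproduces only that arithmetic. It is not the code from this PR: `IatuRegion`, `plan_iatu_regions`, and `kRegionSize` are hypothetical names, and it assumes the host side is one contiguous range starting at a single base, as with the IOVA in the IOMMU-enabled run. With hugepages, each channel may have its own physical address instead.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical description of one iATU region (not a type from the driver).
struct IatuRegion {
    uint32_t index;         // iATU region number
    uint64_t device_start;  // first device-side address of the window
    uint64_t device_end;    // last device-side address (inclusive)
    uint64_t host_target;   // IOVA (IOMMU on) or physical address (IOMMU off)
};

// 1 GiB per region, matching the 0x40000000 stride seen in the logs.
constexpr uint64_t kRegionSize = 0x40000000ULL;

// Assumption: the host memory is contiguous starting at host_base and
// sysmem_size is a whole number of 1 GiB regions.
std::vector<IatuRegion> plan_iatu_regions(uint64_t host_base, uint64_t sysmem_size) {
    assert(sysmem_size % kRegionSize == 0);

    std::vector<IatuRegion> regions;
    const uint64_t count = sysmem_size / kRegionSize;
    for (uint64_t i = 0; i < count; ++i) {
        const uint64_t offset = i * kRegionSize;
        regions.push_back(IatuRegion{
            static_cast<uint32_t>(i),
            offset,                    // device window starts at the offset
            offset + kRegionSize - 1,  // inclusive end of the 1 GiB window
            host_base + offset,        // same offset applied to the host base
        });
    }
    return regions;
}
```

Calling `plan_iatu_regions(0x3fffffe00000000, 0x100000000)` yields four regions whose targets match the IOMMU-enabled log (`0x3fffffe00000000`, `0x3fffffe40000000`, `0x3fffffe80000000`, `0x3fffffec0000000`).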
Initial IOMMU support PR: #338
TODO: