[X86] Swap ports 10 and 11 in Alder Lake Scheduling Model #117466

Merged

boomanaiden154

Based on intel/perfmon#149, the documentation is incorrect and the pfm counter names are actually correct. This patch adjusts the Alder Lake scheduling model to match the performance counter naming, which is the correct naming and will soon be reflected in the optimization manual.

This fixes part of #117360.
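
The rename described above amounts to exchanging the `10` and `11` components in the `ADLPPort*` resource names. The following Python sketch illustrates that mapping on lines taken from the diff below. It is only an illustration: the helper name `swap_ports_10_11` and the regular expression are assumptions made for this example, not part of the patch, and the patch was not necessarily produced this way.

```python
import re

# Matches Alder Lake P-core resource names such as ADLPPort10 or
# ADLPPort02_03_11 (two-digit port numbers joined by underscores).
_RESOURCE = re.compile(r"\bADLPPort(\d{2}(?:_\d{2})*)\b")

# Hypothetical helper table: port 10 and port 11 trade places.
_SWAP = {"10": "11", "11": "10"}


def swap_ports_10_11(text: str) -> str:
    """Swap the 10 and 11 components in every ADLPPort* name in text."""

    def rename(match: re.Match) -> str:
        ports = match.group(1).split("_")
        # Every other port number is passed through unchanged.
        return "ADLPPort" + "_".join(_SWAP.get(p, p) for p in ports)

    return _RESOURCE.sub(rename, text)


if __name__ == "__main__":
    before = "def : WriteRes<WriteLoad, [ADLPPort02_03_11]> {"
    print(swap_ports_10_11(before))
```

A purely mechanical swap would not reproduce the patch exactly: the `X86SchedAlderlakeP.td` hunk, for example, drops the `ADLPPort05_11` group definition rather than renaming it.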

llvmbot commented Nov 24, 2024

@llvm/pr-subscribers-backend-x86

Author: Aiden Grossman (boomanaiden154)

Patch is 749.90 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/117466.diff

40 Files Affected:

  • (modified) llvm/lib/Target/X86/X86PfmCounters.td (+1-4)
  • (modified) llvm/lib/Target/X86/X86SchedAlderlakeP.td (+240-241)
  • (modified) llvm/test/tools/llvm-mca/X86/AlderlakeP/independent-load-stores.s (+11-11)
  • (modified) llvm/test/tools/llvm-mca/X86/AlderlakeP/resources-adx.s (+5-5)
  • (modified) llvm/test/tools/llvm-mca/X86/AlderlakeP/resources-aes.s (+7-7)
  • (modified) llvm/test/tools/llvm-mca/X86/AlderlakeP/resources-avx1.s (+323-323)
  • (modified) llvm/test/tools/llvm-mca/X86/AlderlakeP/resources-avx2.s (+153-153)
  • (modified) llvm/test/tools/llvm-mca/X86/AlderlakeP/resources-avxgfni.s (+7-7)
  • (modified) llvm/test/tools/llvm-mca/X86/AlderlakeP/resources-avxvnni.s (+9-9)
  • (modified) llvm/test/tools/llvm-mca/X86/AlderlakeP/resources-bmi1.s (+14-14)
  • (modified) llvm/test/tools/llvm-mca/X86/AlderlakeP/resources-bmi2.s (+18-18)
  • (modified) llvm/test/tools/llvm-mca/X86/AlderlakeP/resources-clflushopt.s (+2-2)
  • (modified) llvm/test/tools/llvm-mca/X86/AlderlakeP/resources-clwb.s (+2-2)
  • (modified) llvm/test/tools/llvm-mca/X86/AlderlakeP/resources-cmov.s (+49-49)
  • (modified) llvm/test/tools/llvm-mca/X86/AlderlakeP/resources-cmpxchg.s (+5-5)
  • (modified) llvm/test/tools/llvm-mca/X86/AlderlakeP/resources-f16c.s (+3-3)
  • (modified) llvm/test/tools/llvm-mca/X86/AlderlakeP/resources-fma.s (+97-97)
  • (modified) llvm/test/tools/llvm-mca/X86/AlderlakeP/resources-gfni.s (+4-4)
  • (modified) llvm/test/tools/llvm-mca/X86/AlderlakeP/resources-lea.s (+46-46)
  • (modified) llvm/test/tools/llvm-mca/X86/AlderlakeP/resources-lzcnt.s (+4-4)
  • (modified) llvm/test/tools/llvm-mca/X86/AlderlakeP/resources-mmx.s (+47-47)
  • (modified) llvm/test/tools/llvm-mca/X86/AlderlakeP/resources-movbe.s (+4-4)
  • (modified) llvm/test/tools/llvm-mca/X86/AlderlakeP/resources-pclmul.s (+2-2)
  • (modified) llvm/test/tools/llvm-mca/X86/AlderlakeP/resources-popcnt.s (+4-4)
  • (modified) llvm/test/tools/llvm-mca/X86/AlderlakeP/resources-prefetchw.s (+3-3)
  • (modified) llvm/test/tools/llvm-mca/X86/AlderlakeP/resources-rdrand.s (+4-4)
  • (modified) llvm/test/tools/llvm-mca/X86/AlderlakeP/resources-rdseed.s (+4-4)
  • (modified) llvm/test/tools/llvm-mca/X86/AlderlakeP/resources-sse1.s (+59-59)
  • (modified) llvm/test/tools/llvm-mca/X86/AlderlakeP/resources-sse2.s (+119-119)
  • (modified) llvm/test/tools/llvm-mca/X86/AlderlakeP/resources-sse3.s (+11-11)
  • (modified) llvm/test/tools/llvm-mca/X86/AlderlakeP/resources-sse41.s (+45-45)
  • (modified) llvm/test/tools/llvm-mca/X86/AlderlakeP/resources-sse42.s (+11-11)
  • (modified) llvm/test/tools/llvm-mca/X86/AlderlakeP/resources-ssse3.s (+33-33)
  • (modified) llvm/test/tools/llvm-mca/X86/AlderlakeP/resources-vaes.s (+5-5)
  • (modified) llvm/test/tools/llvm-mca/X86/AlderlakeP/resources-vpclmulqdq.s (+2-2)
  • (modified) llvm/test/tools/llvm-mca/X86/AlderlakeP/resources-x86_32.s (+2-2)
  • (modified) llvm/test/tools/llvm-mca/X86/AlderlakeP/resources-x86_64.s (+694-694)
  • (modified) llvm/test/tools/llvm-mca/X86/AlderlakeP/resources-x87.s (+2-2)
  • (modified) llvm/test/tools/llvm-mca/X86/AlderlakeP/resources-xsave.s (+4-4)
  • (modified) llvm/test/tools/llvm-mca/X86/AlderlakeP/zero-idioms.s (+3-3)
diff --git a/llvm/lib/Target/X86/X86PfmCounters.td b/llvm/lib/Target/X86/X86PfmCounters.td
index 0c80f1eaadadb8..26dc03c063636f 100644
--- a/llvm/lib/Target/X86/X86PfmCounters.td
+++ b/llvm/lib/Target/X86/X86PfmCounters.td
@@ -210,10 +210,7 @@ def AlderLakePfmCounters : ProcPfmCounters {
   let IssueCounters = [
     PfmIssueCounter<"ADLPPort00", "uops_dispatched:port_0">,
     PfmIssueCounter<"ADLPPort01", "uops_dispatched:port_1">,
-    // The perfmon documentation and thus libpfm seems to incorrectly label
-    // this performance counter, as ports 2,3, and 11 are actually grouped
-    // according to most documentation. See #113941 for additional details.
-    PfmIssueCounter<"ADLPPort02_03_11", "uops_dispatched:port_2_3_10">,
+    PfmIssueCounter<"ADLPPort02_03_10", "uops_dispatched:port_2_3_10">,
     PfmIssueCounter<"ADLPPort04_09", "uops_dispatched:port_4_9">,
     PfmIssueCounter<"ADLPPort05_11", "uops_dispatched:port_5_11">,
     PfmIssueCounter<"ADLPPort06", "uops_dispatched:port_6">,
diff --git a/llvm/lib/Target/X86/X86SchedAlderlakeP.td b/llvm/lib/Target/X86/X86SchedAlderlakeP.td
index f8c6b32a853be9..564369804711a9 100644
--- a/llvm/lib/Target/X86/X86SchedAlderlakeP.td
+++ b/llvm/lib/Target/X86/X86SchedAlderlakeP.td
@@ -56,16 +56,15 @@ def ADLPPort00_05          : ProcResGroup<[ADLPPort00, ADLPPort05]>;
 def ADLPPort00_05_06       : ProcResGroup<[ADLPPort00, ADLPPort05, ADLPPort06]>;
 def ADLPPort00_06          : ProcResGroup<[ADLPPort00, ADLPPort06]>;
 def ADLPPort01_05          : ProcResGroup<[ADLPPort01, ADLPPort05]>;
-def ADLPPort01_05_10       : ProcResGroup<[ADLPPort01, ADLPPort05, ADLPPort10]>;
+def ADLPPort01_05_11       : ProcResGroup<[ADLPPort01, ADLPPort05, ADLPPort11]>;
 def ADLPPort02_03          : ProcResGroup<[ADLPPort02, ADLPPort03]>;
 def ADLPPort02_03_07       : ProcResGroup<[ADLPPort02, ADLPPort03, ADLPPort07]>;
-def ADLPPort02_03_11       : ProcResGroup<[ADLPPort02, ADLPPort03, ADLPPort11]>;
-def ADLPPort05_11          : ProcResGroup<[ADLPPort05, ADLPPort11]>;
+def ADLPPort02_03_10       : ProcResGroup<[ADLPPort02, ADLPPort03, ADLPPort10]>;
 def ADLPPort07_08          : ProcResGroup<[ADLPPort07, ADLPPort08]>;
 
 // EU has 112 reservation stations.
-def ADLPPort00_01_05_06_10 : ProcResGroup<[ADLPPort00, ADLPPort01, ADLPPort05,
-                                           ADLPPort06, ADLPPort10]> {
+def ADLPPort00_01_05_06_11 : ProcResGroup<[ADLPPort00, ADLPPort01, ADLPPort05,
+                                           ADLPPort06, ADLPPort11]> {
   let BufferSize = 112;
 }
 
@@ -75,8 +74,8 @@ def ADLPPort04_09          : ProcResGroup<[ADLPPort04, ADLPPort09]> {
 }
 
 // MEM has 72 reservation stations.
-def ADLPPort02_03_07_08_11 : ProcResGroup<[ADLPPort02, ADLPPort03, ADLPPort07,
-                                           ADLPPort08, ADLPPort11]> {
+def ADLPPort02_03_07_08_10 : ProcResGroup<[ADLPPort02, ADLPPort03, ADLPPort07,
+                                           ADLPPort08, ADLPPort10]> {
   let BufferSize = 72;
 }
 
@@ -114,7 +113,7 @@ multiclass ADLPWriteResPair<X86FoldableSchedWrite SchedRW,
 
   // Memory variant also uses a cycle on port 2/3/11 and adds LoadLat cycles to
   // the latency (default = 5).
-  def : WriteRes<SchedRW.Folded, !listconcat([ADLPPort02_03_11], ExePorts)> {
+  def : WriteRes<SchedRW.Folded, !listconcat([ADLPPort02_03_10], ExePorts)> {
     let Latency = !add(Lat, LoadLat);
     let ReleaseAtCycles = !listconcat([1], Res);
     let NumMicroOps = !add(UOps, LoadUOps);
@@ -127,49 +126,49 @@ multiclass ADLPWriteResPair<X86FoldableSchedWrite SchedRW,
 
 // Infered SchedWrite definition.
 def : WriteRes<WriteADC, [ADLPPort00_06]>;
-defm : X86WriteRes<WriteADCLd, [ADLPPort00_01_05_06_10, ADLPPort00_06], 11, [1, 1], 2>;
+defm : X86WriteRes<WriteADCLd, [ADLPPort00_01_05_06_11, ADLPPort00_06], 11, [1, 1], 2>;
 defm : ADLPWriteResPair<WriteAESDecEnc, [ADLPPort00_01], 5, [1], 1, 7>;
 defm : ADLPWriteResPair<WriteAESIMC, [ADLPPort00_01], 8, [2], 2, 7>;
 defm : X86WriteRes<WriteAESKeyGen, [ADLPPort00, ADLPPort00_01, ADLPPort00_01_05, ADLPPort00_06, ADLPPort01_05, ADLPPort05], 7, [4, 1, 1, 2, 3, 3], 14>;
-defm : X86WriteRes<WriteAESKeyGenLd, [ADLPPort00, ADLPPort00_01, ADLPPort00_06, ADLPPort01_05, ADLPPort02_03_11, ADLPPort05], 12, [4, 1, 2, 3, 1, 3], 14>;
-def : WriteRes<WriteALU, [ADLPPort00_01_05_06_10]>;
-def : WriteRes<WriteALULd, [ADLPPort00_01_05_06_10]> {
+defm : X86WriteRes<WriteAESKeyGenLd, [ADLPPort00, ADLPPort00_01, ADLPPort00_06, ADLPPort01_05, ADLPPort02_03_10, ADLPPort05], 12, [4, 1, 2, 3, 1, 3], 14>;
+def : WriteRes<WriteALU, [ADLPPort00_01_05_06_11]>;
+def : WriteRes<WriteALULd, [ADLPPort00_01_05_06_11]> {
   let Latency = 11;
 }
 defm : ADLPWriteResPair<WriteBEXTR, [ADLPPort00_06, ADLPPort01], 6, [1, 1], 2>;
-defm : ADLPWriteResPair<WriteBLS, [ADLPPort01_05_10], 2, [1]>;
+defm : ADLPWriteResPair<WriteBLS, [ADLPPort01_05_11], 2, [1]>;
 defm : ADLPWriteResPair<WriteBSF, [ADLPPort01], 3, [1]>;
 defm : ADLPWriteResPair<WriteBSR, [ADLPPort01], 3, [1]>;
 def : WriteRes<WriteBSWAP32, [ADLPPort01]>;
 defm : X86WriteRes<WriteBSWAP64, [ADLPPort00_06, ADLPPort01], 2, [1, 1], 2>;
 defm : ADLPWriteResPair<WriteBZHI, [ADLPPort01], 3, [1]>;
 def : WriteRes<WriteBitTest, [ADLPPort01]>;
-defm : X86WriteRes<WriteBitTestImmLd, [ADLPPort01, ADLPPort02_03_11], 6, [1, 1], 2>;
-defm : X86WriteRes<WriteBitTestRegLd, [ADLPPort00_01_05_06_10, ADLPPort00_06, ADLPPort01, ADLPPort01_05_10, ADLPPort02_03_11], 11, [4, 2, 1, 2, 1], 10>;
+defm : X86WriteRes<WriteBitTestImmLd, [ADLPPort01, ADLPPort02_03_10], 6, [1, 1], 2>;
+defm : X86WriteRes<WriteBitTestRegLd, [ADLPPort00_01_05_06_11, ADLPPort00_06, ADLPPort01, ADLPPort01_05_11, ADLPPort02_03_10], 11, [4, 2, 1, 2, 1], 10>;
 def : WriteRes<WriteBitTestSet, [ADLPPort01]>;
 def : WriteRes<WriteBitTestSetImmLd, [ADLPPort01]> {
   let Latency = 11;
 }
-defm : X86WriteRes<WriteBitTestSetRegLd, [ADLPPort00_01_05_06_10, ADLPPort00_06, ADLPPort01, ADLPPort01_05_10], 17, [3, 2, 1, 2], 8>;
+defm : X86WriteRes<WriteBitTestSetRegLd, [ADLPPort00_01_05_06_11, ADLPPort00_06, ADLPPort01, ADLPPort01_05_11], 17, [3, 2, 1, 2], 8>;
 defm : ADLPWriteResPair<WriteBlend, [ADLPPort01_05], 1, [1], 1, 7>;
 defm : ADLPWriteResPair<WriteBlendY, [ADLPPort00_01_05], 1, [1], 1, 8>;
 defm : ADLPWriteResPair<WriteCLMul, [ADLPPort05], 3, [1], 1, 7>;
 defm : ADLPWriteResPair<WriteCMOV, [ADLPPort00_06], 1, [1], 1, 6>;
-defm : X86WriteRes<WriteCMPXCHG, [ADLPPort00_01_05_06_10, ADLPPort00_06], 3, [3, 2], 5>;
-defm : X86WriteRes<WriteCMPXCHGRMW, [ADLPPort00_01_05_06_10, ADLPPort00_06, ADLPPort02_03_11, ADLPPort04_09, ADLPPort07_08], 12, [1, 2, 1, 1, 1], 6>;
+defm : X86WriteRes<WriteCMPXCHG, [ADLPPort00_01_05_06_11, ADLPPort00_06], 3, [3, 2], 5>;
+defm : X86WriteRes<WriteCMPXCHGRMW, [ADLPPort00_01_05_06_11, ADLPPort00_06, ADLPPort02_03_10, ADLPPort04_09, ADLPPort07_08], 12, [1, 2, 1, 1, 1], 6>;
 defm : ADLPWriteResPair<WriteCRC32, [ADLPPort01], 3, [1]>;
 defm : X86WriteRes<WriteCvtI2PD, [ADLPPort00_01, ADLPPort05], 5, [1, 1], 2>;
-defm : X86WriteRes<WriteCvtI2PDLd, [ADLPPort00_01, ADLPPort02_03_11], 11, [1, 1], 2>;
+defm : X86WriteRes<WriteCvtI2PDLd, [ADLPPort00_01, ADLPPort02_03_10], 11, [1, 1], 2>;
 defm : X86WriteRes<WriteCvtI2PDY, [ADLPPort00_01, ADLPPort05], 7, [1, 1], 2>;
-defm : X86WriteRes<WriteCvtI2PDYLd, [ADLPPort00_01, ADLPPort02_03_11], 12, [1, 1], 2>;
+defm : X86WriteRes<WriteCvtI2PDYLd, [ADLPPort00_01, ADLPPort02_03_10], 12, [1, 1], 2>;
 defm : X86WriteResPairUnsupported<WriteCvtI2PDZ>;
 defm : ADLPWriteResPair<WriteCvtI2PS, [ADLPPort00_01], 4, [1], 1, 7>;
 defm : ADLPWriteResPair<WriteCvtI2PSY, [ADLPPort00_01], 4, [1], 1, 8>;
 defm : X86WriteResPairUnsupported<WriteCvtI2PSZ>;
 defm : X86WriteRes<WriteCvtI2SD, [ADLPPort00_01, ADLPPort05], 7, [1, 1], 2>;
-defm : X86WriteRes<WriteCvtI2SDLd, [ADLPPort00_01, ADLPPort02_03_11], 11, [1, 1], 2>;
+defm : X86WriteRes<WriteCvtI2SDLd, [ADLPPort00_01, ADLPPort02_03_10], 11, [1, 1], 2>;
 defm : X86WriteRes<WriteCvtI2SS, [ADLPPort00_01, ADLPPort05], 7, [1, 1], 2>;
-defm : X86WriteRes<WriteCvtI2SSLd, [ADLPPort00_01, ADLPPort02_03_11], 11, [1, 1], 2>;
+defm : X86WriteRes<WriteCvtI2SSLd, [ADLPPort00_01, ADLPPort02_03_10], 11, [1, 1], 2>;
 defm : ADLPWriteResPair<WriteCvtPD2I, [ADLPPort00_01, ADLPPort05], 5, [1, 1], 2, 7>;
 defm : ADLPWriteResPair<WriteCvtPD2IY, [ADLPPort00_01, ADLPPort05], 7, [1, 1], 2, 8>;
 defm : X86WriteResPairUnsupported<WriteCvtPD2IZ>;
@@ -177,17 +176,17 @@ defm : ADLPWriteResPair<WriteCvtPD2PS, [ADLPPort00_01, ADLPPort05], 5, [1, 1], 2
 defm : ADLPWriteResPair<WriteCvtPD2PSY, [ADLPPort00_01, ADLPPort05], 7, [1, 1], 2, 8>;
 defm : X86WriteResPairUnsupported<WriteCvtPD2PSZ>;
 defm : X86WriteRes<WriteCvtPH2PS, [ADLPPort00_01, ADLPPort05], 6, [1, 1], 2>;
-defm : X86WriteRes<WriteCvtPH2PSLd, [ADLPPort00_01, ADLPPort02_03_11], 12, [1, 1], 2>;
+defm : X86WriteRes<WriteCvtPH2PSLd, [ADLPPort00_01, ADLPPort02_03_10], 12, [1, 1], 2>;
 defm : X86WriteRes<WriteCvtPH2PSY, [ADLPPort00_01, ADLPPort05], 8, [1, 1], 2>;
-defm : X86WriteRes<WriteCvtPH2PSYLd, [ADLPPort00_01, ADLPPort02_03_11], 12, [1, 1], 2>;
+defm : X86WriteRes<WriteCvtPH2PSYLd, [ADLPPort00_01, ADLPPort02_03_10], 12, [1, 1], 2>;
 defm : X86WriteResPairUnsupported<WriteCvtPH2PSZ>;
 defm : ADLPWriteResPair<WriteCvtPS2I, [ADLPPort00_01], 4, [1], 1, 7>;
 defm : ADLPWriteResPair<WriteCvtPS2IY, [ADLPPort00_01], 4, [1], 1, 8>;
 defm : X86WriteResPairUnsupported<WriteCvtPS2IZ>;
 defm : X86WriteRes<WriteCvtPS2PD, [ADLPPort00_01, ADLPPort05], 5, [1, 1], 2>;
-defm : X86WriteRes<WriteCvtPS2PDLd, [ADLPPort00_01, ADLPPort02_03_11], 11, [1, 1], 2>;
+defm : X86WriteRes<WriteCvtPS2PDLd, [ADLPPort00_01, ADLPPort02_03_10], 11, [1, 1], 2>;
 defm : X86WriteRes<WriteCvtPS2PDY, [ADLPPort00_01, ADLPPort05], 7, [1, 1], 2>;
-defm : X86WriteRes<WriteCvtPS2PDYLd, [ADLPPort00_01, ADLPPort02_03_11], 12, [1, 1], 2>;
+defm : X86WriteRes<WriteCvtPS2PDYLd, [ADLPPort00_01, ADLPPort02_03_10], 12, [1, 1], 2>;
 defm : X86WriteResPairUnsupported<WriteCvtPS2PDZ>;
 defm : X86WriteRes<WriteCvtPS2PH, [ADLPPort00_01, ADLPPort05], 6, [1, 1], 2>;
 defm : X86WriteRes<WriteCvtPS2PHSt, [ADLPPort00_01, ADLPPort04_09, ADLPPort07_08], 12, [1, 1, 1], 3>;
@@ -199,12 +198,12 @@ defm : ADLPWriteResPair<WriteCvtSD2I, [ADLPPort00, ADLPPort00_01], 7, [1, 1], 2>
 defm : ADLPWriteResPair<WriteCvtSD2SS, [ADLPPort00_01, ADLPPort05], 5, [1, 1], 2, 7>;
 defm : ADLPWriteResPair<WriteCvtSS2I, [ADLPPort00, ADLPPort00_01], 7, [1, 1], 2>;
 defm : X86WriteRes<WriteCvtSS2SD, [ADLPPort00_01, ADLPPort05], 5, [1, 1], 2>;
-defm : X86WriteRes<WriteCvtSS2SDLd, [ADLPPort00_01, ADLPPort02_03_11], 11, [1, 1], 2>;
+defm : X86WriteRes<WriteCvtSS2SDLd, [ADLPPort00_01, ADLPPort02_03_10], 11, [1, 1], 2>;
 defm : ADLPWriteResPair<WriteDPPD, [ADLPPort00_01, ADLPPort01_05], 9, [2, 1], 3, 7>;
 defm : ADLPWriteResPair<WriteDPPS, [ADLPPort00_01, ADLPPort00_06, ADLPPort01_05, ADLPPort05], 14, [2, 1, 2, 1], 6, 7>;
 defm : ADLPWriteResPair<WriteDPPSY, [ADLPPort00_01, ADLPPort00_06, ADLPPort01_05, ADLPPort05], 14, [2, 1, 2, 1], 6, 8>;
-defm : ADLPWriteResPair<WriteDiv16, [ADLPPort00_01_05_06_10, ADLPPort01], 16, [1, 3], 4, 4>;
-defm : ADLPWriteResPair<WriteDiv32, [ADLPPort00_01_05_06_10, ADLPPort01], 15, [1, 3], 4, 4>;
+defm : ADLPWriteResPair<WriteDiv16, [ADLPPort00_01_05_06_11, ADLPPort01], 16, [1, 3], 4, 4>;
+defm : ADLPWriteResPair<WriteDiv32, [ADLPPort00_01_05_06_11, ADLPPort01], 15, [1, 3], 4, 4>;
 defm : ADLPWriteResPair<WriteDiv64, [ADLPPort01], 18, [3], 3>;
 defm : X86WriteRes<WriteDiv8, [ADLPPort01], 17, [3], 3>;
 defm : X86WriteRes<WriteDiv8Ld, [ADLPPort01], 22, [3], 3>;
@@ -212,7 +211,7 @@ defm : X86WriteRes<WriteEMMS, [ADLPPort00, ADLPPort00_05, ADLPPort00_06], 10, [1
 def : WriteRes<WriteFAdd, [ADLPPort05]> {
   let Latency = 3;
 }
-defm : X86WriteRes<WriteFAddLd, [ADLPPort01_05, ADLPPort02_03_11], 10, [1, 1], 2>;
+defm : X86WriteRes<WriteFAddLd, [ADLPPort01_05, ADLPPort02_03_10], 10, [1, 1], 2>;
 defm : ADLPWriteResPair<WriteFAdd64, [ADLPPort01_05], 3, [1], 1, 7>;
 defm : ADLPWriteResPair<WriteFAdd64X, [ADLPPort01_05], 3, [1], 1, 7>;
 defm : ADLPWriteResPair<WriteFAdd64Y, [ADLPPort01_05], 3, [1], 1, 8>;
@@ -249,13 +248,13 @@ defm : ADLPWriteResPair<WriteFHAddY, [ADLPPort01_05, ADLPPort05], 5, [1, 2], 3,
 def : WriteRes<WriteFLD0, [ADLPPort00_05]>;
 defm : X86WriteRes<WriteFLD1, [ADLPPort00_05], 1, [2], 2>;
 defm : X86WriteRes<WriteFLDC, [ADLPPort00_05], 1, [2], 2>;
-def : WriteRes<WriteFLoad, [ADLPPort02_03_11]> {
+def : WriteRes<WriteFLoad, [ADLPPort02_03_10]> {
   let Latency = 7;
 }
-def : WriteRes<WriteFLoadX, [ADLPPort02_03_11]> {
+def : WriteRes<WriteFLoadX, [ADLPPort02_03_10]> {
   let Latency = 7;
 }
-def : WriteRes<WriteFLoadY, [ADLPPort02_03_11]> {
+def : WriteRes<WriteFLoadY, [ADLPPort02_03_10]> {
   let Latency = 8;
 }
 defm : ADLPWriteResPair<WriteFLogic, [ADLPPort00_01_05], 1, [1], 1, 7>;
@@ -268,8 +267,8 @@ defm : X86WriteResPairUnsupported<WriteFMAZ>;
 def : WriteRes<WriteFMOVMSK, [ADLPPort00]> {
   let Latency = 3;
 }
-defm : X86WriteRes<WriteFMaskedLoad, [ADLPPort00_01_05, ADLPPort02_03_11], 8, [1, 1], 2>;
-defm : X86WriteRes<WriteFMaskedLoadY, [ADLPPort00_01_05, ADLPPort02_03_11], 9, [1, 1], 2>;
+defm : X86WriteRes<WriteFMaskedLoad, [ADLPPort00_01_05, ADLPPort02_03_10], 8, [1, 1], 2>;
+defm : X86WriteRes<WriteFMaskedLoadY, [ADLPPort00_01_05, ADLPPort02_03_10], 9, [1, 1], 2>;
 defm : X86WriteRes<WriteFMaskedStore32, [ADLPPort00, ADLPPort04_09, ADLPPort07_08], 14, [1, 1, 1], 3>;
 defm : X86WriteRes<WriteFMaskedStore32Y, [ADLPPort00, ADLPPort04_09, ADLPPort07_08], 14, [1, 1, 1], 3>;
 defm : X86WriteRes<WriteFMaskedStore64, [ADLPPort00, ADLPPort04_09, ADLPPort07_08], 14, [1, 1, 1], 3>;
@@ -331,15 +330,15 @@ defm : X86WriteResPairUnsupported<WriteFVarShuffleZ>;
 def : WriteRes<WriteFence, [ADLPPort00_06]> {
   let Latency = 2;
 }
-defm : ADLPWriteResPair<WriteIDiv16, [ADLPPort00_01_05_06_10, ADLPPort01], 16, [1, 3], 4, 4>;
-defm : ADLPWriteResPair<WriteIDiv32, [ADLPPort00_01_05_06_10, ADLPPort01], 15, [1, 3], 4, 4>;
+defm : ADLPWriteResPair<WriteIDiv16, [ADLPPort00_01_05_06_11, ADLPPort01], 16, [1, 3], 4, 4>;
+defm : ADLPWriteResPair<WriteIDiv32, [ADLPPort00_01_05_06_11, ADLPPort01], 15, [1, 3], 4, 4>;
 defm : ADLPWriteResPair<WriteIDiv64, [ADLPPort01], 18, [3], 3>;
 defm : X86WriteRes<WriteIDiv8, [ADLPPort01], 17, [3], 3>;
 defm : X86WriteRes<WriteIDiv8Ld, [ADLPPort01], 22, [3], 3>;
-defm : ADLPWriteResPair<WriteIMul16, [ADLPPort00_01_05_06_10, ADLPPort00_06, ADLPPort01], 5, [2, 1, 1], 4>;
-defm : ADLPWriteResPair<WriteIMul16Imm, [ADLPPort00_01_05_06_10, ADLPPort01], 4, [1, 1], 2>;
+defm : ADLPWriteResPair<WriteIMul16, [ADLPPort00_01_05_06_11, ADLPPort00_06, ADLPPort01], 5, [2, 1, 1], 4>;
+defm : ADLPWriteResPair<WriteIMul16Imm, [ADLPPort00_01_05_06_11, ADLPPort01], 4, [1, 1], 2>;
 defm : ADLPWriteResPair<WriteIMul16Reg, [ADLPPort01], 3, [1]>;
-defm : ADLPWriteResPair<WriteIMul32, [ADLPPort00_01_05_06_10, ADLPPort00_06, ADLPPort01], 4, [1, 1, 1], 3>;
+defm : ADLPWriteResPair<WriteIMul32, [ADLPPort00_01_05_06_11, ADLPPort00_06, ADLPPort01], 4, [1, 1, 1], 3>;
 defm : ADLPWriteResPair<WriteIMul32Imm, [ADLPPort01], 3, [1]>;
 defm : ADLPWriteResPair<WriteIMul32Reg, [ADLPPort01], 3, [1]>;
 defm : ADLPWriteResPair<WriteIMul64, [ADLPPort01, ADLPPort05], 4, [1, 1], 2>;
@@ -357,10 +356,10 @@ defm : X86WriteRes<WriteJumpLd, [ADLPPort00_06, ADLPPort02_03], 6, [1, 1], 2>;
 def : WriteRes<WriteLAHFSAHF, [ADLPPort00_06]> {
   let Latency = 3;
 }
-defm : X86WriteRes<WriteLDMXCSR, [ADLPPort00, ADLPPort00_01_05, ADLPPort00_06, ADLPPort02_03_11], 7, [1, 1, 1, 1], 4>;
+defm : X86WriteRes<WriteLDMXCSR, [ADLPPort00, ADLPPort00_01_05, ADLPPort00_06, ADLPPort02_03_10], 7, [1, 1, 1, 1], 4>;
 def : WriteRes<WriteLEA, [ADLPPort01]>;
 defm : ADLPWriteResPair<WriteLZCNT, [ADLPPort01], 3, [1]>;
-def : WriteRes<WriteLoad, [ADLPPort02_03_11]> {
+def : WriteRes<WriteLoad, [ADLPPort02_03_10]> {
   let Latency = 5;
 }
 def : WriteRes<WriteMMXMOVMSK, [ADLPPort00]> {
@@ -368,17 +367,17 @@ def : WriteRes<WriteMMXMOVMSK, [ADLPPort00]> {
 }
 defm : ADLPWriteResPair<WriteMPSAD, [ADLPPort01_05, ADLPPort05], 4, [1, 1], 2, 7>;
 defm : ADLPWriteResPair<WriteMPSADY, [ADLPPort01_05, ADLPPort05], 4, [1, 1], 2, 8>;
-defm : ADLPWriteResPair<WriteMULX32, [ADLPPort00_01_05_06_10, ADLPPort00_06, ADLPPort01], 4, [1, 1, 1], 2>;
+defm : ADLPWriteResPair<WriteMULX32, [ADLPPort00_01_05_06_11, ADLPPort00_06, ADLPPort01], 4, [1, 1, 1], 2>;
 defm : ADLPWriteResPair<WriteMULX64, [ADLPPort01, ADLPPort05], 4, [1, 1]>;
 def : WriteRes<WriteMicrocoded, [ADLPPort00_01_05_06]> {
   let Latency = AlderlakePModel.MaxLatency;
 }
-def : WriteRes<WriteMove, [ADLPPort00_01_05_06_10]>;
+def : WriteRes<WriteMove, [ADLPPort00_01_05_06_11]>;
 defm : X86WriteRes<WriteNop, [], 1, [], 0>;
 defm : X86WriteRes<WritePCmpEStrI, [ADLPPort00, ADLPPort00_01_05, ADLPPort00_06, ADLPPort01, ADLPPort05], 16, [3, 2, 1, 1, 1], 8>;
-defm : X86WriteRes<WritePCmpEStrILd, [ADLPPort00, ADLPPort00_01_05, ADLPPort00_06, ADLPPort01, ADLPPort02_03_11, ADLPPort05], 31, [3, 1, 1, 1, 1, 1], 8>;
+defm : X86WriteRes<WritePCmpEStrILd, [ADLPPort00, ADLPPort00_01_05, ADLPPort00_06, ADLPPort01, ADLPPort02_03_10, ADLPPort05], 31, [3, 1, 1, 1, 1, 1], 8>;
 defm : X86WriteRes<WritePCmpEStrM, [ADLPPort00, ADLPPort00_01_05, ADLPPort00_06, ADLPPort01, ADLPPort05], 16, [3, 3, 1, 1, 1], 9>;
-defm : X86WriteRes<WritePCmpEStrMLd, [ADLPPort00, ADLPPort00_01_05, ADLPPort00_06, ADLPPort01, ADLPPort02_03_11, ADLPPort05], 17, [3, 2, 1, 1, 1, 1], 9>;
+defm : X86WriteRes<WritePCmpEStrMLd, [ADLPPort00, ADLPPort00_01_05, ADLPPort00_06, ADLPPort01, ADLPPort02_03_10, ADLPPort05], 17, [3, 2, 1, 1, 1, 1], 9>;
 defm : ADLPWriteResPair<WritePCmpIStrI, [ADLPPort00], 11, [3], 3, 20>;
 defm : ADLPWriteResPair<WritePCmpIStrM, [ADLPPort00], 11, [3], 3>;
 defm : ADLPWriteResPair<WritePHAdd, [ADLPPort00_05, ADLPPort05], 3, [1, 2], 3, 8>;
@@ -393,16 +392,16 @@ defm : ADLPWriteResPair<WritePSADBW, [ADLPPort05], 3, [1], 1, 8>;
 defm : ADLPWriteResPair<WritePSADBWX, [ADLPPort05], 3, [1], 1, 7>;
 defm : ADLPWriteResPair<WritePSADBWY, [ADLPPort05], 3, [1], 1, 8>;
 defm : X86WriteResPairUnsupported<WritePSADBWZ>;
-defm : X86WriteRes<WriteRMW, [ADLPPort02_03_11, ADLPPort04_09, ADLPPort07_08], 1, [1, 1, 1], 3>;
-defm : X86WriteRes<WriteRotate, [ADLPPort00_01_05_06_10, ADLPPort00_06], 2, [1, 2], 3>;
-defm : X86WriteRes<WriteRotateLd, [ADLPPort00_01_05_06_10, ADLPPort00_06], 12, [1, 2], 3>;
+defm : X86WriteRes<WriteRMW, [ADLPPort02_03_10, ADLPPort04_09, ADLPPort07_08], 1, [1, 1, 1], 3>;
+defm : X86WriteRes<WriteRotate, [ADLPPort00_01_05_06_11, ADLPPort00_06], 2, [1, 2], 3>;
+defm : X86WriteRes<WriteRotateLd, [ADLPPort00_01_05_06_11, ADLPPort00_06], 12, [1, 2], 3>;
 defm : X86WriteRes<WriteRotateCL, [ADLPPort00_06], 2, [2], 2>;
-defm : X86WriteRes<WriteRotateCLLd, [ADLPPort00_01_05_06_10, ADLPPort00_06, ADLPPort01], 19, [2, 3, 2], 7>;
+defm : X86WriteRes<WriteRotateCLLd, [ADLPPort00_01_05_06_11, ADLPPort00_06, ADLPPort01], 19, [2, 3, 2], 7>;
 defm : X86WriteRes<WriteSETCC, [ADLPPort00_06], 2, [2], 2>;
 defm : X86WriteRes<WriteSETCCStore, [ADLPPort00_06, ADLPPort04_09, ADLPPort07_08], 13, [2, 1, 1], 4>;
-defm : X86WriteRes<WriteSHDmrcl, [ADLPPort00_01_05_06_10, ADLPPort00_06, ADLPPort01, ADLPPort02_03_11, ADLPPort04_09, ADLPPort07_08], 12, [1, 1, 1, 1, 1, 1], 6>;
-defm : X86WriteRes<WriteSHDmri, [ADLPPort00_01_05_06_10, ADLPPort01, ADLPPort02_03_11, ADLPPort04_09, ADLPPort07_08], 12, [1, 1, 1, 1, 1], 5>;
-defm : X86WriteRes<WriteSHDrrcl, [ADLPPort00_01_05_06_10, ADLPPort00_06, ADLPPort01], 5, [1, 1, 1], 3>;
+defm : X86WriteRes<WriteSHDmrcl, [ADLPPort00_01_05_06_11, ADLPPort00_06, ADLPPort01, ADLPPort02_03_10, ADLPPort04_09, ADLPPort07_08], 12, [1, 1, 1, 1, 1, 1], 6>;
+defm : X86WriteRes<WriteSHDmri, [ADLPPort00_01_05_06_11, ADLPPort01, ADLPPort02_03_10, ADLPPort04_09, ADLPPort07_08], 12, [1, 1, 1, 1, 1], 5>;
+defm : X86WriteRes<WriteSHDrrcl, [ADLPPort00_01_05_06_11, ADLPPort00_06, ADLPPort01], 5, [1, 1, 1], 3>;
 def : WriteRes<WriteSHDrri, [ADLPPort01]> {
   let Latency = 3;
 }
@@ -447,20 +446,20 @@ defm : ADLPWriteResPair<WriteVecIMulX, [ADLPPort00_01], 5, [1], 1, 8>;
 defm : ADLPWriteResPair<WriteVecIMulY, [ADLPPort00_01], 5, [1], 1, 8>;
 defm : X86WriteResPairUnsupported<WriteVecIMulZ>;
 defm : X86WriteRes<WriteVecInsert, [ADLPPort01_05, ADLPPort05], 4, [1, 1], 2>;
-defm : X86WriteRes<WriteVecInsertLd, [ADLPPort01_05, ADLPPort02_03_11], 8, [1, 1], 2>;
-def : WriteRes<WriteVecLoad, [ADLPPort02_03_11]> {
+defm : X86WriteRes<WriteVecIn...
[truncated]
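
The effect of the `X86PfmCounters.td` hunk above is that each issue-counter resource name now lists the same ports as the libpfm event it is paired with. The Python sketch below checks that agreement for the name pairs shown in the hunk. The function names are hypothetical and the parsing assumes the `ADLPPortNN_NN` and `uops_dispatched:port_N_N` spellings seen in the diff; nothing like this check exists in the patch itself.

```python
from typing import List


def ports_in_resource(resource: str) -> List[int]:
    """Parse e.g. 'ADLPPort02_03_10' into [2, 3, 10]."""
    prefix = "ADLPPort"
    if not resource.startswith(prefix):
        raise ValueError(f"unexpected resource name: {resource!r}")
    return [int(p) for p in resource[len(prefix):].split("_")]


def ports_in_counter(counter: str) -> List[int]:
    """Parse e.g. 'uops_dispatched:port_2_3_10' into [2, 3, 10]."""
    _, sep, suffix = counter.partition(":port_")
    if not sep or not suffix:
        raise ValueError(f"unexpected counter name: {counter!r}")
    return [int(p) for p in suffix.split("_")]


def names_agree(resource: str, counter: str) -> bool:
    """True when the resource and the counter name the same ports."""
    return ports_in_resource(resource) == ports_in_counter(counter)


if __name__ == "__main__":
    counter = "uops_dispatched:port_2_3_10"
    # Pairing before the patch: resource says 11, counter says 10.
    print(names_agree("ADLPPort02_03_11", counter))
    # Pairing after the patch: both say 2, 3 and 10.
    print(names_agree("ADLPPort02_03_10", counter))
```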

@RKSimon RKSimon left a comment

LGTM, cheers. @HaohaiWen @phoebewang any comments?

@phoebewang phoebewang left a comment

LGTM.

@boomanaiden154 boomanaiden154 merged commit 512dc5c into llvm:main Nov 25, 2024
8 checks passed
@boomanaiden154 boomanaiden154 deleted the alder-lake-sched-swap-10-11 branch November 25, 2024 00:58
@HaohaiWen

LGTM
