Fix vectorization error on copy algorithm #6567

Merged
merged 1 commit into from
Dec 9, 2024

Pansysk75
Member

Fixes bad vectorization that caused copy algorithms with unseq execution to malfunction.
This should fix recent test errors on CI builders with the -fopenmp flag.

@Pansysk75 Pansysk75 requested a review from hkaiser as a code owner November 9, 2024 01:08
@Pansysk75 Pansysk75 force-pushed the fix-copy-bad-vectorization branch 2 times, most recently from 0d43825 to f3746fb Compare November 9, 2024 01:11
@StellarBot

Performance test report

HPX Performance

Comparison

BENCHMARK | FORK_JOIN_EXECUTOR | PARALLEL_EXECUTOR | SCHEDULER_EXECUTOR
For Each | ------

Info

Property | Before | After
HPX Datetime | 2024-03-18T14:00:30+00:00 | 2024-11-09T01:11:54+00:00
HPX Commit | d27ac2e | 9554da2
Datetime | 2024-03-18T09:18:04.949759-05:00 | 2024-11-08T19:20:14.941431-06:00
Envfile | |
Compiler | /opt/apps/llvm/13.0.1/bin/clang++ 13.0.1 | /opt/apps/llvm/18.1.5/bin/clang++ 18.1.5
Clustername | rostam | rostam
Hostname | medusa08.rostam.cct.lsu.edu | medusa08.rostam.cct.lsu.edu

Comparison

BENCHMARK | NO-EXECUTOR
Future Overhead - Create Thread Hierarchical - Latch | --

Info

Property | Before | After
HPX Datetime | 2024-03-18T14:00:30+00:00 | 2024-11-09T01:11:54+00:00
HPX Commit | d27ac2e | 9554da2
Datetime | 2024-03-18T09:19:53.062988-05:00 | 2024-11-08T19:22:05.043553-06:00
Envfile | |
Compiler | /opt/apps/llvm/13.0.1/bin/clang++ 13.0.1 | /opt/apps/llvm/18.1.5/bin/clang++ 18.1.5
Clustername | rostam | rostam
Hostname | medusa08.rostam.cct.lsu.edu | medusa08.rostam.cct.lsu.edu

Comparison

BENCHMARK | FORK_JOIN_EXECUTOR_DEFAULT_FORK_JOIN_POLICY_ALLOCATOR | PARALLEL_EXECUTOR_DEFAULT_PARALLEL_POLICY_ALLOCATOR | SCHEDULER_EXECUTOR_DEFAULT_SCHEDULER_EXECUTOR_ALLOCATOR
Stream Benchmark - Add | ------
Stream Benchmark - Scale | -------
Stream Benchmark - Triad | ------
Stream Benchmark - Copy | ------

Info

Property | Before | After
HPX Datetime | 2024-03-18T14:00:30+00:00 | 2024-11-09T01:11:54+00:00
HPX Commit | d27ac2e | 9554da2
Datetime | 2024-03-18T09:20:13.002391-05:00 | 2024-11-08T19:22:25.708051-06:00
Envfile | |
Compiler | /opt/apps/llvm/13.0.1/bin/clang++ 13.0.1 | /opt/apps/llvm/18.1.5/bin/clang++ 18.1.5
Clustername | rostam | rostam
Hostname | medusa08.rostam.cct.lsu.edu | medusa08.rostam.cct.lsu.edu

Explanation of Symbols

Symbol | Meaning
= | No performance change (confidence interval within ±1%)
(=) | Probably no performance change (confidence interval within ±2%)
(+)/(-) | Very small performance improvement/degradation (≤1%)
+/- | Small performance improvement/degradation (≤5%)
++/-- | Large performance improvement/degradation (≤10%)
+++/--- | Very large performance improvement/degradation (>10%)
? | Probably no change, but quite large uncertainty (confidence interval within ±5%)
?? | Unclear result, very large uncertainty (±10%)
??? | Something unexpected…

hkaiser
hkaiser previously approved these changes Nov 9, 2024
Member

@hkaiser hkaiser left a comment

LGTM, thanks!

@hkaiser
Member

hkaiser commented Nov 9, 2024

@Pansysk75 Wow, just wow!


codacy-production bot commented Nov 9, 2024

Coverage summary from Codacy

See diff coverage on Codacy

Coverage variation | Diff coverage
-0.45% | 100.00%

Coverage variation details
 | Coverable lines | Covered lines | Coverage
Common ancestor commit (342d373) | 234436 | 200546 | 85.54%
Head commit (57a7f65) | 191422 (-43014) | 162896 (-37650) | 85.10% (-0.45%)

Coverage variation is the difference between the coverage for the head and common ancestor commits of the pull request branch: <coverage of head commit> - <coverage of common ancestor commit>

Diff coverage details
 | Coverable lines | Covered lines | Diff coverage
Pull request (#6567) | 2 | 2 | 100.00%

Diff coverage is the percentage of lines that are covered by tests out of the coverable lines that the pull request added or modified: <covered lines added or modified>/<coverable lines added or modified> * 100%
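
The two formulas above can be checked with a few lines of arithmetic. The sketch below is only an illustration; the function names are made up here and are not part of Codacy. Plugging in the figures from the table (200546 of 234436 lines for the common ancestor, 162896 of 191422 for the head commit) gives about 85.54% and 85.10%, a variation of roughly -0.45 percentage points.

```cpp
// Percentage of coverable lines that are covered by tests.
inline double coverage_percent(long covered, long coverable)
{
    return 100.0 * static_cast<double>(covered) /
        static_cast<double>(coverable);
}

// Coverage variation: coverage of the head commit minus coverage of the
// common ancestor commit, in percentage points.
inline double coverage_variation(long head_covered, long head_coverable,
    long base_covered, long base_coverable)
{
    return coverage_percent(head_covered, head_coverable) -
        coverage_percent(base_covered, base_coverable);
}
```

Diff coverage uses the same ratio, applied only to the lines the pull request added or modified (2 of 2 here).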

@Pansysk75
Member Author

@hkaiser I've narrowed down the remaining simd failure case. I might find a way to convince the compiler to produce correct code, as I did above. However, it's apparent that some simd directive causes overly aggressive optimization that ignores blatant data dependencies. I'd prefer to try to find a proper solution for that.

@Pansysk75
Member Author

@hkaiser The solution we discussed (passing data as arguments to loop_n instead of lambda captures) doesn't seem to prevent the faulty vectorization.

Trying to recreate a simpler version of the failing case on Godbolt gives me:

warning: loop not vectorized: the optimizer was unable to perform the requested transformation; the transformation might be disabled or specified as part of an unsupported transformation ordering [-Wpass-failed=transform-warning]
    #pragma omp simd
     ^

To me, this hints that, even with #pragma omp simd, Clang tries to do dependency analysis and refuses vectorization that would lead to incorrect results. I couldn't find any unusual configuration on our side, so it could be a compiler bug.

@Pansysk75 Pansysk75 force-pushed the fix-copy-bad-vectorization branch 2 times, most recently from 57a7f65 to def399f Compare November 18, 2024 16:48
@StellarBot

Performance test report

HPX Performance

Comparison

BENCHMARK | FORK_JOIN_EXECUTOR | PARALLEL_EXECUTOR | SCHEDULER_EXECUTOR
For Each | ------

Info

Property | Before | After
HPX Commit | d27ac2e | a0d2388
HPX Datetime | 2024-03-18T14:00:30+00:00 | 2024-11-18T16:34:11+00:00
Datetime | 2024-03-18T09:18:04.949759-05:00 | 2024-11-18T10:49:24.504553-06:00
Hostname | medusa08.rostam.cct.lsu.edu | medusa08.rostam.cct.lsu.edu
Envfile | |
Compiler | /opt/apps/llvm/13.0.1/bin/clang++ 13.0.1 | /opt/apps/llvm/18.1.8/bin/clang++ 18.1.8
Clustername | rostam | rostam

Comparison

BENCHMARK | NO-EXECUTOR
Future Overhead - Create Thread Hierarchical - Latch | --

Info

Property | Before | After
HPX Commit | d27ac2e | a0d2388
HPX Datetime | 2024-03-18T14:00:30+00:00 | 2024-11-18T16:34:11+00:00
Datetime | 2024-03-18T09:19:53.062988-05:00 | 2024-11-18T10:51:16.270872-06:00
Hostname | medusa08.rostam.cct.lsu.edu | medusa08.rostam.cct.lsu.edu
Envfile | |
Compiler | /opt/apps/llvm/13.0.1/bin/clang++ 13.0.1 | /opt/apps/llvm/18.1.8/bin/clang++ 18.1.8
Clustername | rostam | rostam

Comparison

BENCHMARK | FORK_JOIN_EXECUTOR_DEFAULT_FORK_JOIN_POLICY_ALLOCATOR | PARALLEL_EXECUTOR_DEFAULT_PARALLEL_POLICY_ALLOCATOR | SCHEDULER_EXECUTOR_DEFAULT_SCHEDULER_EXECUTOR_ALLOCATOR
Stream Benchmark - Add | ------
Stream Benchmark - Scale | ------
Stream Benchmark - Triad | ------
Stream Benchmark - Copy | ------

Info

Property | Before | After
HPX Commit | d27ac2e | a0d2388
HPX Datetime | 2024-03-18T14:00:30+00:00 | 2024-11-18T16:34:11+00:00
Datetime | 2024-03-18T09:20:13.002391-05:00 | 2024-11-18T10:51:41.209873-06:00
Hostname | medusa08.rostam.cct.lsu.edu | medusa08.rostam.cct.lsu.edu
Envfile | |
Compiler | /opt/apps/llvm/13.0.1/bin/clang++ 13.0.1 | /opt/apps/llvm/18.1.8/bin/clang++ 18.1.8
Clustername | rostam | rostam

@StellarBot

Performance test report

HPX Performance

Comparison

BENCHMARK | FORK_JOIN_EXECUTOR | PARALLEL_EXECUTOR | SCHEDULER_EXECUTOR
For Each | ------

Info

Property | Before | After
HPX Datetime | 2024-03-18T14:00:30+00:00 | 2024-11-18T16:48:02+00:00
HPX Commit | d27ac2e | 7e3b6d5
Envfile | |
Datetime | 2024-03-18T09:18:04.949759-05:00 | 2024-11-18T10:58:54.156541-06:00
Compiler | /opt/apps/llvm/13.0.1/bin/clang++ 13.0.1 | /opt/apps/llvm/18.1.8/bin/clang++ 18.1.8
Hostname | medusa08.rostam.cct.lsu.edu | medusa08.rostam.cct.lsu.edu
Clustername | rostam | rostam

Comparison

BENCHMARK | NO-EXECUTOR
Future Overhead - Create Thread Hierarchical - Latch | --

Info

Property | Before | After
HPX Datetime | 2024-03-18T14:00:30+00:00 | 2024-11-18T16:48:02+00:00
HPX Commit | d27ac2e | 7e3b6d5
Envfile | |
Datetime | 2024-03-18T09:19:53.062988-05:00 | 2024-11-18T11:00:44.547046-06:00
Compiler | /opt/apps/llvm/13.0.1/bin/clang++ 13.0.1 | /opt/apps/llvm/18.1.8/bin/clang++ 18.1.8
Hostname | medusa08.rostam.cct.lsu.edu | medusa08.rostam.cct.lsu.edu
Clustername | rostam | rostam

Comparison

BENCHMARK | FORK_JOIN_EXECUTOR_DEFAULT_FORK_JOIN_POLICY_ALLOCATOR | PARALLEL_EXECUTOR_DEFAULT_PARALLEL_POLICY_ALLOCATOR | SCHEDULER_EXECUTOR_DEFAULT_SCHEDULER_EXECUTOR_ALLOCATOR
Stream Benchmark - Add | ------
Stream Benchmark - Scale | -----
Stream Benchmark - Triad | ------
Stream Benchmark - Copy | -------

Info

Property | Before | After
HPX Datetime | 2024-03-18T14:00:30+00:00 | 2024-11-18T16:48:02+00:00
HPX Commit | d27ac2e | 7e3b6d5
Envfile | |
Datetime | 2024-03-18T09:20:13.002391-05:00 | 2024-11-18T11:01:05.808638-06:00
Compiler | /opt/apps/llvm/13.0.1/bin/clang++ 13.0.1 | /opt/apps/llvm/18.1.8/bin/clang++ 18.1.8
Hostname | medusa08.rostam.cct.lsu.edu | medusa08.rostam.cct.lsu.edu
Clustername | rostam | rostam

@hkaiser
Member

hkaiser commented Nov 19, 2024

The checkout test is failing because the Kitware people have just removed the file we're trying to download from their CDash repo (see Kitware/CDash#2570).

Here is the file they removed:

<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
    <xsl:output method="xml" indent="yes"/>
    <xsl:template match="/Site">
        <xsl:variable name="Name"><xsl:value-of select="@Name"/></xsl:variable>
        <xsl:variable name="Hostname"><xsl:value-of select="@Hostname"/></xsl:variable>
        <xsl:variable name="TestCount"><xsl:value-of select="count(//TestList/Test)"/></xsl:variable>
        <xsl:variable name="ErrorCount"><xsl:value-of select="count(//TestList/Test[@Status='error'])"/></xsl:variable>
        <xsl:variable name="FailureCount"><xsl:value-of select="count(//Testing/Test[@Status='failed'])"/></xsl:variable>
        <testsuite name="{$Name}" hostname="{$Hostname}" errors="0" failures="{$FailureCount}" tests="{$TestCount}">
            <xsl:variable name="BuildName"><xsl:value-of select="@BuildName"/></xsl:variable>
            <xsl:variable name="BuildStamp"><xsl:value-of select="@BuildStamp"/></xsl:variable>
            <xsl:variable name="Generator"><xsl:value-of select="@Generator"/></xsl:variable>
            <xsl:variable name="CompilerName"><xsl:value-of select="@CompilerName"/></xsl:variable>
            <xsl:variable name="OSName"><xsl:value-of select="@OSName"/></xsl:variable>
            <xsl:variable name="OSRelease"><xsl:value-of select="@OSRelease"/></xsl:variable>
            <xsl:variable name="OSVersion"><xsl:value-of select="@OSVersion"/></xsl:variable>
            <xsl:variable name="OSPlatform"><xsl:value-of select="@OSPlatform"/></xsl:variable>
            <xsl:variable name="Is64Bits"><xsl:value-of select="@Is64Bits"/></xsl:variable>
            <xsl:variable name="VendorString"><xsl:value-of select="@VendorString"/></xsl:variable>
            <xsl:variable name="VendorID"><xsl:value-of select="@VendorID"/></xsl:variable>
            <xsl:variable name="FamilyID"><xsl:value-of select="@FamilyID"/></xsl:variable>
            <xsl:variable name="ModelID"><xsl:value-of select="@ModelID"/></xsl:variable>
            <xsl:variable name="ProcessorCacheSize"><xsl:value-of select="@ProcessorCacheSize"/></xsl:variable>
            <xsl:variable name="NumberOfLogicalCPU"><xsl:value-of select="@NumberOfLogicalCPU"/></xsl:variable>
            <xsl:variable name="NumberOfPhysicalCPU"><xsl:value-of select="@NumberOfPhysicalCPU"/></xsl:variable>
            <xsl:variable name="TotalVirtualMemory"><xsl:value-of select="@TotalVirtualMemory"/></xsl:variable>
            <xsl:variable name="TotalPhysicalMemory"><xsl:value-of select="@TotalPhysicalMemory"/></xsl:variable>
            <xsl:variable name="LogicalProcessorsPerPhysical"><xsl:value-of select="@LogicalProcessorsPerPhysical"/></xsl:variable>
            <xsl:variable name="ProcessorClockFrequency"><xsl:value-of select="@ProcessorClockFrequency"/></xsl:variable>
            <properties>
                <property name="BuildName" value="{$BuildName}"/>
                <property name="BuildStamp" value="{$BuildStamp}"/>
                <property name="Name" value="{$Name}"/>
                <property name="Generator" value="{$Generator}"/>
                <property name="CompilerName" value="{$CompilerName}"/>
                <property name="OSName" value="{$OSName}"/>
                <property name="Hostname" value="{$Hostname}"/>
                <property name="OSRelease" value="{$OSRelease}"/>
                <property name="OSVersion" value="{$OSVersion}"/>
                <property name="OSPlatform" value="{$OSPlatform}"/>
                <property name="Is64Bits" value="{$Is64Bits}"/>
                <property name="VendorString" value="{$VendorString}"/>
                <property name="VendorID" value="{$VendorID}"/>
                <property name="FamilyID" value="{$FamilyID}"/>
                <property name="ModelID" value="{$ModelID}"/>
                <property name="ProcessorCacheSize" value="{$ProcessorCacheSize}"/>
                <property name="NumberOfLogicalCPU" value="{$NumberOfLogicalCPU}"/>
                <property name="NumberOfPhysicalCPU" value="{$NumberOfPhysicalCPU}"/>
                <property name="TotalVirtualMemory" value="{$TotalVirtualMemory}"/>
                <property name="TotalPhysicalMemory" value="{$TotalPhysicalMemory}"/>
                <property name="LogicalProcessorsPerPhysical" value="{$LogicalProcessorsPerPhysical}"/>
                <property name="ProcessorClockFrequency" value="{$ProcessorClockFrequency}"/>
            </properties>
            <xsl:apply-templates select="Testing/Test"/>
            <system-out>
                BuildName: <xsl:value-of select="$BuildName"/>
                BuildStamp: <xsl:value-of select="$BuildStamp"/>
                Name: <xsl:value-of select="$Name"/>
                Generator: <xsl:value-of select="$Generator"/>
                CompilerName: <xsl:value-of select="$CompilerName"/>
                OSName: <xsl:value-of select="$OSName"/>
                Hostname: <xsl:value-of select="$Hostname"/>
                OSRelease: <xsl:value-of select="$OSRelease"/>
                OSVersion: <xsl:value-of select="$OSVersion"/>
                OSPlatform: <xsl:value-of select="$OSPlatform"/>
                Is64Bits: <xsl:value-of select="$Is64Bits"/>
                VendorString: <xsl:value-of select="$VendorString"/>
                VendorID: <xsl:value-of select="$VendorID"/>
                FamilyID: <xsl:value-of select="$FamilyID"/>
                ModelID: <xsl:value-of select="$ModelID"/>
                ProcessorCacheSize: <xsl:value-of select="$ProcessorCacheSize"/>
                NumberOfLogicalCPU: <xsl:value-of select="$NumberOfLogicalCPU"/>
                NumberOfPhysicalCPU: <xsl:value-of select="$NumberOfPhysicalCPU"/>
                TotalVirtualMemory: <xsl:value-of select="$TotalVirtualMemory"/>
                TotalPhysicalMemory: <xsl:value-of select="$TotalPhysicalMemory"/>
                LogicalProcessorsPerPhysical: <xsl:value-of select="$LogicalProcessorsPerPhysical"/>
                ProcessorClockFrequency: <xsl:value-of select="$ProcessorClockFrequency"/>
            </system-out>
        </testsuite>
    </xsl:template>
    <xsl:template match="Testing/Test">
        <xsl:variable name="testcasename"><xsl:value-of select="Name"/></xsl:variable>
        <xsl:variable name="testclassname"><xsl:value-of select="substring(Path,2)"/></xsl:variable>
        <xsl:variable name="exectime">
            <xsl:for-each select="Results/NamedMeasurement">
                <xsl:if test="@name = 'Execution Time'">
                    <xsl:value-of select="."/>
                </xsl:if>
            </xsl:for-each>
        </xsl:variable>
        <testcase name="{$testcasename}" classname="{$testclassname}" time="{$exectime}">
            <xsl:if test="@Status = 'passed'"></xsl:if>
            <xsl:if test="@Status = 'failed'">
                <xsl:variable name="failtype">
                    <xsl:for-each select="Results/NamedMeasurement">
                        <xsl:if test="@name = 'Exit Code'">
                            <xsl:value-of select="."/>
                        </xsl:if>
                    </xsl:for-each>
                </xsl:variable>
                <xsl:variable name="failcode">
                    <xsl:for-each select="Results/NamedMeasurement">
                        <xsl:if test="@name = 'Exit Value'">
                            <xsl:value-of select="."/>
                        </xsl:if>
                    </xsl:for-each>
                </xsl:variable>
                <failure message="{$failtype} ({$failcode})"><xsl:value-of select="Results/Measurement/Value/text()"/></failure>
            </xsl:if>
            <xsl:if test="@Status = 'notrun'">
                <skipped><xsl:value-of select="Results/Measurement/Value/text()"/></skipped>
            </xsl:if>
        </testcase>
    </xsl:template>
</xsl:stylesheet>

We should add this to our CircleCI testing environment so we don't have to rely on their version of https://raw.githubusercontent.com/Kitware/CDash/master/app/cdash/tests/circle/conv.xsl anymore.

@hkaiser
Member

hkaiser commented Nov 19, 2024

Fixed here: #6575

@Pansysk75 Pansysk75 force-pushed the fix-copy-bad-vectorization branch from def399f to f71cc57 Compare November 20, 2024 16:56
@StellarBot

Performance test report

HPX Performance

Comparison

BENCHMARK | FORK_JOIN_EXECUTOR | PARALLEL_EXECUTOR | SCHEDULER_EXECUTOR
For Each | ------

Info

Property | Before | After
HPX Commit | d27ac2e | db94ab5
HPX Datetime | 2024-03-18T14:00:30+00:00 | 2024-11-20T16:56:57+00:00
Envfile | |
Clustername | rostam | rostam
Compiler | /opt/apps/llvm/13.0.1/bin/clang++ 13.0.1 | /opt/apps/llvm/18.1.8/bin/clang++ 18.1.8
Datetime | 2024-03-18T09:18:04.949759-05:00 | 2024-11-20T12:32:04.289623-06:00
Hostname | medusa08.rostam.cct.lsu.edu | medusa08.rostam.cct.lsu.edu

Comparison

BENCHMARK | NO-EXECUTOR
Future Overhead - Create Thread Hierarchical - Latch | --

Info

Property | Before | After
HPX Commit | d27ac2e | db94ab5
HPX Datetime | 2024-03-18T14:00:30+00:00 | 2024-11-20T16:56:57+00:00
Envfile | |
Clustername | rostam | rostam
Compiler | /opt/apps/llvm/13.0.1/bin/clang++ 13.0.1 | /opt/apps/llvm/18.1.8/bin/clang++ 18.1.8
Datetime | 2024-03-18T09:19:53.062988-05:00 | 2024-11-20T12:33:54.151409-06:00
Hostname | medusa08.rostam.cct.lsu.edu | medusa08.rostam.cct.lsu.edu

Comparison

BENCHMARK | FORK_JOIN_EXECUTOR_DEFAULT_FORK_JOIN_POLICY_ALLOCATOR | PARALLEL_EXECUTOR_DEFAULT_PARALLEL_POLICY_ALLOCATOR | SCHEDULER_EXECUTOR_DEFAULT_SCHEDULER_EXECUTOR_ALLOCATOR
Stream Benchmark - Add | ------
Stream Benchmark - Scale | -----
Stream Benchmark - Triad | ------
Stream Benchmark - Copy | ------

Info

Property | Before | After
HPX Commit | d27ac2e | db94ab5
HPX Datetime | 2024-03-18T14:00:30+00:00 | 2024-11-20T16:56:57+00:00
Envfile | |
Clustername | rostam | rostam
Compiler | /opt/apps/llvm/13.0.1/bin/clang++ 13.0.1 | /opt/apps/llvm/18.1.8/bin/clang++ 18.1.8
Datetime | 2024-03-18T09:20:13.002391-05:00 | 2024-11-20T12:34:14.500185-06:00
Hostname | medusa08.rostam.cct.lsu.edu | medusa08.rostam.cct.lsu.edu

@@ -49,7 +49,7 @@

#define HPX_HAVE_VECTOR_REDUCTION

-#elif (_OPENMP >= 201307) || (defined(__clang__) && HPX_CLANG_VERSION >= 30700)
+#elif (_OPENMP >= 201307) && !defined(__clang__)
Member

Suggested change
-#elif (_OPENMP >= 201307) && !defined(__clang__)
+#elif (_OPENMP >= 201307) && !defined(HPX_CLANG_VERSION)

@Pansysk75 Pansysk75 force-pushed the fix-copy-bad-vectorization branch from f71cc57 to 3464c07 Compare November 20, 2024 20:01
@StellarBot

Performance test report

HPX Performance

Comparison

BENCHMARK | FORK_JOIN_EXECUTOR | PARALLEL_EXECUTOR | SCHEDULER_EXECUTOR
For Each | ------

Info

Property | Before | After
HPX Commit | d27ac2e | cbd966e
HPX Datetime | 2024-03-18T14:00:30+00:00 | 2024-11-20T20:01:06+00:00
Compiler | /opt/apps/llvm/13.0.1/bin/clang++ 13.0.1 | /opt/apps/llvm/18.1.8/bin/clang++ 18.1.8
Datetime | 2024-03-18T09:18:04.949759-05:00 | 2024-11-20T14:10:27.120252-06:00
Clustername | rostam | rostam
Envfile | |
Hostname | medusa08.rostam.cct.lsu.edu | medusa08.rostam.cct.lsu.edu

Comparison

BENCHMARK | NO-EXECUTOR
Future Overhead - Create Thread Hierarchical - Latch | --

Info

Property | Before | After
HPX Commit | d27ac2e | cbd966e
HPX Datetime | 2024-03-18T14:00:30+00:00 | 2024-11-20T20:01:06+00:00
Compiler | /opt/apps/llvm/13.0.1/bin/clang++ 13.0.1 | /opt/apps/llvm/18.1.8/bin/clang++ 18.1.8
Datetime | 2024-03-18T09:19:53.062988-05:00 | 2024-11-20T14:12:17.538749-06:00
Clustername | rostam | rostam
Envfile | |
Hostname | medusa08.rostam.cct.lsu.edu | medusa08.rostam.cct.lsu.edu

Comparison

BENCHMARK | FORK_JOIN_EXECUTOR_DEFAULT_FORK_JOIN_POLICY_ALLOCATOR | PARALLEL_EXECUTOR_DEFAULT_PARALLEL_POLICY_ALLOCATOR | SCHEDULER_EXECUTOR_DEFAULT_SCHEDULER_EXECUTOR_ALLOCATOR
Stream Benchmark - Add | ------
Stream Benchmark - Scale | -------
Stream Benchmark - Triad | ------
Stream Benchmark - Copy | ------

Info

Property | Before | After
HPX Commit | d27ac2e | cbd966e
HPX Datetime | 2024-03-18T14:00:30+00:00 | 2024-11-20T20:01:06+00:00
Compiler | /opt/apps/llvm/13.0.1/bin/clang++ 13.0.1 | /opt/apps/llvm/18.1.8/bin/clang++ 18.1.8
Datetime | 2024-03-18T09:20:13.002391-05:00 | 2024-11-20T14:12:37.990789-06:00
Clustername | rostam | rostam
Envfile | |
Hostname | medusa08.rostam.cct.lsu.edu | medusa08.rostam.cct.lsu.edu

codacy-production bot commented Nov 20, 2024

Coverage summary from Codacy

See diff coverage on Codacy

Coverage variation | Diff coverage
-0.42% |

Coverage variation details
 | Coverable lines | Covered lines | Coverage
Common ancestor commit (342d373) | 234436 | 200546 | 85.54%
Head commit (3464c07) | 191450 (-42986) | 162968 (-37578) | 85.12% (-0.42%)

Coverage variation is the difference between the coverage for the head and common ancestor commits of the pull request branch: <coverage of head commit> - <coverage of common ancestor commit>

Diff coverage details
 | Coverable lines | Covered lines | Diff coverage
Pull request (#6567) | 0 | 0 | ∅ (not applicable)

Diff coverage is the percentage of lines that are covered by tests out of the coverable lines that the pull request added or modified: <covered lines added or modified>/<coverable lines added or modified> * 100%

@hkaiser
Member

hkaiser commented Nov 25, 2024

retest lsu

@StellarBot

Performance test report

HPX Performance

Comparison

BENCHMARK | FORK_JOIN_EXECUTOR | PARALLEL_EXECUTOR | SCHEDULER_EXECUTOR
For Each | ------

Info

Property | Before | After
HPX Commit | d27ac2e | 2be998f
HPX Datetime | 2024-03-18T14:00:30+00:00 | 2024-11-25T14:58:24+00:00
Envfile | |
Hostname | medusa08.rostam.cct.lsu.edu | medusa08.rostam.cct.lsu.edu
Datetime | 2024-03-18T09:18:04.949759-05:00 | 2024-11-25T09:08:56.653224-06:00
Compiler | /opt/apps/llvm/13.0.1/bin/clang++ 13.0.1 | /opt/apps/llvm/18.1.8/bin/clang++ 18.1.8
Clustername | rostam | rostam

Comparison

BENCHMARK | NO-EXECUTOR
Future Overhead - Create Thread Hierarchical - Latch | --

Info

Property | Before | After
HPX Commit | d27ac2e | 2be998f
HPX Datetime | 2024-03-18T14:00:30+00:00 | 2024-11-25T14:58:24+00:00
Envfile | |
Hostname | medusa08.rostam.cct.lsu.edu | medusa08.rostam.cct.lsu.edu
Datetime | 2024-03-18T09:19:53.062988-05:00 | 2024-11-25T09:10:46.628195-06:00
Compiler | /opt/apps/llvm/13.0.1/bin/clang++ 13.0.1 | /opt/apps/llvm/18.1.8/bin/clang++ 18.1.8
Clustername | rostam | rostam

Comparison

BENCHMARK | FORK_JOIN_EXECUTOR_DEFAULT_FORK_JOIN_POLICY_ALLOCATOR | PARALLEL_EXECUTOR_DEFAULT_PARALLEL_POLICY_ALLOCATOR | SCHEDULER_EXECUTOR_DEFAULT_SCHEDULER_EXECUTOR_ALLOCATOR
Stream Benchmark - Add | ------
Stream Benchmark - Scale | -----
Stream Benchmark - Triad | ------
Stream Benchmark - Copy | ------

Info

Property | Before | After
HPX Commit | d27ac2e | 2be998f
HPX Datetime | 2024-03-18T14:00:30+00:00 | 2024-11-25T14:58:24+00:00
Envfile | |
Hostname | medusa08.rostam.cct.lsu.edu | medusa08.rostam.cct.lsu.edu
Datetime | 2024-03-18T09:20:13.002391-05:00 | 2024-11-25T09:11:07.230320-06:00
Compiler | /opt/apps/llvm/13.0.1/bin/clang++ 13.0.1 | /opt/apps/llvm/18.1.8/bin/clang++ 18.1.8
Clustername | rostam | rostam

@hkaiser
Member

hkaiser commented Dec 2, 2024

@Pansysk75 is this ready to go now?

@Pansysk75
Member Author

@Pansysk75 is this ready to go now?

Yes, I guess so.
Vectorization on Clang will now use the Clang-specific pragmas instead of the OpenMP pragmas.
The OpenMP simd pragmas did not show any issues on other compilers.
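
A rough sketch of that selection logic, for illustration only: EXAMPLE_VECTORIZE and scale are made-up names, and this is not HPX's actual macro set. It assumes that on Clang a loop hint such as `#pragma clang loop vectorize(enable)` is used, which as far as I understand still leaves the legality checks to the vectorizer, while other compilers with OpenMP 4.0 or newer (`_OPENMP >= 201307`) get `#pragma omp simd`.

```cpp
#include <cstddef>

// EXAMPLE_VECTORIZE is a hypothetical macro name used only in this sketch.
#if defined(__clang__)
// Clang: request vectorization through a Clang loop hint.
#define EXAMPLE_VECTORIZE _Pragma("clang loop vectorize(enable)")
#elif defined(_OPENMP) && (_OPENMP >= 201307)
// Other compilers with OpenMP 4.0 or newer: use the OpenMP simd construct.
#define EXAMPLE_VECTORIZE _Pragma("omp simd")
#else
// No known vectorization pragma available: expand to nothing.
#define EXAMPLE_VECTORIZE
#endif

// Hypothetical helper: multiplies every element in place; iterations are
// independent, so either pragma is a valid annotation here.
inline void scale(float* data, std::size_t n, float factor)
{
    EXAMPLE_VECTORIZE
    for (std::size_t i = 0; i < n; ++i)
        data[i] *= factor;
}
```

Checking `__clang__` first means a Clang build with -fopenmp never reaches the OpenMP branch, which mirrors the intent described above.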

@Pansysk75
Member Author

retest

@StellarBot

Performance test report

HPX Performance

Comparison

BENCHMARK | FORK_JOIN_EXECUTOR | PARALLEL_EXECUTOR | SCHEDULER_EXECUTOR
For Each | ------

Info

Property | Before | After
HPX Commit | d27ac2e | 741f87e
HPX Datetime | 2024-03-18T14:00:30+00:00 | 2024-12-02T00:07:28+00:00
Datetime | 2024-03-18T09:18:04.949759-05:00 | 2024-12-09T10:24:58.554829-06:00
Envfile | |
Hostname | medusa08.rostam.cct.lsu.edu | medusa08.rostam.cct.lsu.edu
Compiler | /opt/apps/llvm/13.0.1/bin/clang++ 13.0.1 | /opt/apps/llvm/18.1.8/bin/clang++ 18.1.8
Clustername | rostam | rostam

Comparison

BENCHMARK | NO-EXECUTOR
Future Overhead - Create Thread Hierarchical - Latch | --

Info

Property | Before | After
HPX Commit | d27ac2e | 741f87e
HPX Datetime | 2024-03-18T14:00:30+00:00 | 2024-12-02T00:07:28+00:00
Datetime | 2024-03-18T09:19:53.062988-05:00 | 2024-12-09T10:26:48.536546-06:00
Envfile | |
Hostname | medusa08.rostam.cct.lsu.edu | medusa08.rostam.cct.lsu.edu
Compiler | /opt/apps/llvm/13.0.1/bin/clang++ 13.0.1 | /opt/apps/llvm/18.1.8/bin/clang++ 18.1.8
Clustername | rostam | rostam

Comparison

BENCHMARK | FORK_JOIN_EXECUTOR_DEFAULT_FORK_JOIN_POLICY_ALLOCATOR | PARALLEL_EXECUTOR_DEFAULT_PARALLEL_POLICY_ALLOCATOR | SCHEDULER_EXECUTOR_DEFAULT_SCHEDULER_EXECUTOR_ALLOCATOR
Stream Benchmark - Add | ------
Stream Benchmark - Scale | ------
Stream Benchmark - Triad | ------
Stream Benchmark - Copy | ------

Info

Property | Before | After
HPX Commit | d27ac2e | 741f87e
HPX Datetime | 2024-03-18T14:00:30+00:00 | 2024-12-02T00:07:28+00:00
Datetime | 2024-03-18T09:20:13.002391-05:00 | 2024-12-09T10:27:08.851756-06:00
Envfile | |
Hostname | medusa08.rostam.cct.lsu.edu | medusa08.rostam.cct.lsu.edu
Compiler | /opt/apps/llvm/13.0.1/bin/clang++ 13.0.1 | /opt/apps/llvm/18.1.8/bin/clang++ 18.1.8
Clustername | rostam | rostam

@Pansysk75
Member Author

@hkaiser It should be good to merge now

@hkaiser hkaiser merged commit cd8d65b into master Dec 9, 2024
46 of 58 checks passed
@hkaiser hkaiser deleted the fix-copy-bad-vectorization branch December 9, 2024 21:00