Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simd Find #6302

Open
wants to merge 6 commits into
base: master
Choose a base branch
from
Open

Simd Find #6302

wants to merge 6 commits into from

Conversation

Johan511
Copy link
Contributor

using simd-helpers (from #6286 ) to add vectorization to stl algorithms

@@ -34,8 +34,7 @@ namespace hpx::parallel::detail {
sequential_find_t<ExPolicy>, Iterator first, Sentinel last,
T const& value, Proj proj = Proj())
{
return util::loop_pred<
std::decay_t<hpx::execution::sequenced_policy>>(
return util::loop_pred<ExPolicy>(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add static_assert verifying that ExPolicy is actually a sequential policy?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change had been made so sequential_find_t can accept sequential and unsequential execution policies. Do you want me to rather use static_assert(is_seq || is_unseq) ?

/*
Compiler and Hardware should also support vector operations for IterDiff,
else we see slower performance when compared to sequential version
*/
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does this comment mean? Should that requirement be somehow enforced?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed previously, changing data types can lead to vectorization being broken. I wanted to leave a note for the future regarding it. Use perf regression tests might not be the best way to do this as they often seem to have lot of variance in them.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, a comment doesn't help with this, does it? I think setting up performance tests would be the better option. @Pansysk75 can help with that.

Copy link
Member

@Pansysk75 Pansysk75 Aug 8, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hkaiser Do you mean automated ones that report on GH? He asked me about this but I discouraged him from doing that yet. It can certainly be done though, if that's what you have in mind
@Johan511 There are ways of getting around that variance, so that (hopefully) wont be a probem.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, I thought we could add it to our performance tests.

Copy link
Contributor Author

@Johan511 Johan511 Aug 18, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added it in a separate PR

libs/core/algorithms/include/hpx/parallel/util/loop.hpp Outdated Show resolved Hide resolved
libs/core/algorithms/include/hpx/parallel/util/loop.hpp Outdated Show resolved Hide resolved
libs/core/algorithms/include/hpx/parallel/util/loop.hpp Outdated Show resolved Hide resolved
@Johan511
Copy link
Contributor Author

Johan511 commented Aug 8, 2023

first-simd
Speedups observed for seq vs unseq

@hkaiser
Copy link
Member

hkaiser commented Aug 8, 2023

Speedups observed for seq vs unseq

Nice!

@StellarBot
Copy link

Performance test report

HPX Performance

Comparison

BENCHMARKFORK_JOIN_EXECUTORPARALLEL_EXECUTORSCHEDULER_EXECUTOR
For Each(=)??-

Info

PropertyBeforeAfter
HPX Datetime2023-05-10T12:07:53+00:002023-08-08T21:21:49+00:00
HPX Commitdcb5415f524d6e
Clusternamerostamrostam
Datetime2023-05-10T14:50:18.616050-05:002023-08-08T16:30:27.710089-05:00
Hostnamemedusa08.rostam.cct.lsu.edumedusa08.rostam.cct.lsu.edu
Compiler/opt/apps/llvm/13.0.1/bin/clang++ 13.0.1/opt/apps/llvm/13.0.1/bin/clang++ 13.0.1
Envfile

Comparison

BENCHMARKNO-EXECUTOR
Future Overhead - Create Thread Hierarchical - Latch=

Info

PropertyBeforeAfter
HPX Datetime2023-05-10T12:07:53+00:002023-08-08T21:21:49+00:00
HPX Commitdcb5415f524d6e
Clusternamerostamrostam
Datetime2023-05-10T14:52:35.047119-05:002023-08-08T16:32:40.990850-05:00
Hostnamemedusa08.rostam.cct.lsu.edumedusa08.rostam.cct.lsu.edu
Compiler/opt/apps/llvm/13.0.1/bin/clang++ 13.0.1/opt/apps/llvm/13.0.1/bin/clang++ 13.0.1
Envfile

Comparison

BENCHMARKFORK_JOIN_EXECUTOR_DEFAULT_FORK_JOIN_POLICY_ALLOCATORPARALLEL_EXECUTOR_DEFAULT_PARALLEL_POLICY_ALLOCATORSCHEDULER_EXECUTOR_DEFAULT_SCHEDULER_EXECUTOR_ALLOCATOR
Stream Benchmark - Add(=)(=)(=)
Stream Benchmark - Scale(=)(=)(=)
Stream Benchmark - Triad(=)(=)(=)
Stream Benchmark - Copy(=)(=)(=)

Info

PropertyBeforeAfter
HPX Datetime2023-05-10T12:07:53+00:002023-08-08T21:21:49+00:00
HPX Commitdcb5415f524d6e
Clusternamerostamrostam
Datetime2023-05-10T14:52:52.237641-05:002023-08-08T16:32:57.921975-05:00
Hostnamemedusa08.rostam.cct.lsu.edumedusa08.rostam.cct.lsu.edu
Compiler/opt/apps/llvm/13.0.1/bin/clang++ 13.0.1/opt/apps/llvm/13.0.1/bin/clang++ 13.0.1
Envfile

Explanation of Symbols

SymbolMEANING
=No performance change (confidence interval within ±1%)
(=)Probably no performance change (confidence interval within ±2%)
(+)/(-)Very small performance improvement/degradation (≤1%)
+/-Small performance improvement/degradation (≤5%)
++/--Large performance improvement/degradation (≤10%)
+++/---Very large performance improvement/degradation (>10%)
?Probably no change, but quite large uncertainty (confidence interval with ±5%)
??Unclear result, very large uncertainty (±10%)
???Something unexpected…

@StellarBot
Copy link

Performance test report

HPX Performance

Comparison

BENCHMARKFORK_JOIN_EXECUTORPARALLEL_EXECUTORSCHEDULER_EXECUTOR
For Each(=)??-

Info

PropertyBeforeAfter
HPX Datetime2023-05-10T12:07:53+00:002023-08-08T21:33:54+00:00
HPX Commitdcb5415f663f52
Datetime2023-05-10T14:50:18.616050-05:002023-08-08T16:40:03.129538-05:00
Clusternamerostamrostam
Hostnamemedusa08.rostam.cct.lsu.edumedusa08.rostam.cct.lsu.edu
Envfile
Compiler/opt/apps/llvm/13.0.1/bin/clang++ 13.0.1/opt/apps/llvm/13.0.1/bin/clang++ 13.0.1

Comparison

BENCHMARKNO-EXECUTOR
Future Overhead - Create Thread Hierarchical - Latch(=)

Info

PropertyBeforeAfter
HPX Datetime2023-05-10T12:07:53+00:002023-08-08T21:33:54+00:00
HPX Commitdcb5415f663f52
Datetime2023-05-10T14:52:35.047119-05:002023-08-08T16:42:16.273814-05:00
Clusternamerostamrostam
Hostnamemedusa08.rostam.cct.lsu.edumedusa08.rostam.cct.lsu.edu
Envfile
Compiler/opt/apps/llvm/13.0.1/bin/clang++ 13.0.1/opt/apps/llvm/13.0.1/bin/clang++ 13.0.1

Comparison

BENCHMARKFORK_JOIN_EXECUTOR_DEFAULT_FORK_JOIN_POLICY_ALLOCATORPARALLEL_EXECUTOR_DEFAULT_PARALLEL_POLICY_ALLOCATORSCHEDULER_EXECUTOR_DEFAULT_SCHEDULER_EXECUTOR_ALLOCATOR
Stream Benchmark - Add(=)(=)(=)
Stream Benchmark - Scale(=)(=)(=)
Stream Benchmark - Triad(=)(=)(=)
Stream Benchmark - Copy(=)(=)(=)

Info

PropertyBeforeAfter
HPX Datetime2023-05-10T12:07:53+00:002023-08-08T21:33:54+00:00
HPX Commitdcb5415f663f52
Datetime2023-05-10T14:52:52.237641-05:002023-08-08T16:42:33.204088-05:00
Clusternamerostamrostam
Hostnamemedusa08.rostam.cct.lsu.edumedusa08.rostam.cct.lsu.edu
Envfile
Compiler/opt/apps/llvm/13.0.1/bin/clang++ 13.0.1/opt/apps/llvm/13.0.1/bin/clang++ 13.0.1

Explanation of Symbols

SymbolMEANING
=No performance change (confidence interval within ±1%)
(=)Probably no performance change (confidence interval within ±2%)
(+)/(-)Very small performance improvement/degradation (≤1%)
+/-Small performance improvement/degradation (≤5%)
++/--Large performance improvement/degradation (≤10%)
+++/---Very large performance improvement/degradation (>10%)
?Probably no change, but quite large uncertainty (confidence interval with ±5%)
??Unclear result, very large uncertainty (±10%)
???Something unexpected…

@StellarBot
Copy link

Performance test report

HPX Performance

Comparison

BENCHMARKFORK_JOIN_EXECUTORPARALLEL_EXECUTORSCHEDULER_EXECUTOR
For Each(=)??-

Info

PropertyBeforeAfter
HPX Datetime2023-05-10T12:07:53+00:002023-08-09T06:33:19+00:00
HPX Commitdcb541517aba85
Clusternamerostamrostam
Compiler/opt/apps/llvm/13.0.1/bin/clang++ 13.0.1/opt/apps/llvm/13.0.1/bin/clang++ 13.0.1
Hostnamemedusa08.rostam.cct.lsu.edumedusa08.rostam.cct.lsu.edu
Datetime2023-05-10T14:50:18.616050-05:002023-08-09T01:40:37.750853-05:00
Envfile

Comparison

BENCHMARKNO-EXECUTOR
Future Overhead - Create Thread Hierarchical - Latch(=)

Info

PropertyBeforeAfter
HPX Datetime2023-05-10T12:07:53+00:002023-08-09T06:33:19+00:00
HPX Commitdcb541517aba85
Clusternamerostamrostam
Compiler/opt/apps/llvm/13.0.1/bin/clang++ 13.0.1/opt/apps/llvm/13.0.1/bin/clang++ 13.0.1
Hostnamemedusa08.rostam.cct.lsu.edumedusa08.rostam.cct.lsu.edu
Datetime2023-05-10T14:52:35.047119-05:002023-08-09T01:42:50.859516-05:00
Envfile

Comparison

BENCHMARKFORK_JOIN_EXECUTOR_DEFAULT_FORK_JOIN_POLICY_ALLOCATORPARALLEL_EXECUTOR_DEFAULT_PARALLEL_POLICY_ALLOCATORSCHEDULER_EXECUTOR_DEFAULT_SCHEDULER_EXECUTOR_ALLOCATOR
Stream Benchmark - Add(=)(=)(=)
Stream Benchmark - Scale(=)(=)(=)
Stream Benchmark - Triad=(=)(=)
Stream Benchmark - Copy(=)(=)(=)

Info

PropertyBeforeAfter
HPX Datetime2023-05-10T12:07:53+00:002023-08-09T06:33:19+00:00
HPX Commitdcb541517aba85
Clusternamerostamrostam
Compiler/opt/apps/llvm/13.0.1/bin/clang++ 13.0.1/opt/apps/llvm/13.0.1/bin/clang++ 13.0.1
Hostnamemedusa08.rostam.cct.lsu.edumedusa08.rostam.cct.lsu.edu
Datetime2023-05-10T14:52:52.237641-05:002023-08-09T01:43:07.824650-05:00
Envfile

Explanation of Symbols

SymbolMEANING
=No performance change (confidence interval within ±1%)
(=)Probably no performance change (confidence interval within ±2%)
(+)/(-)Very small performance improvement/degradation (≤1%)
+/-Small performance improvement/degradation (≤5%)
++/--Large performance improvement/degradation (≤10%)
+++/---Very large performance improvement/degradation (>10%)
?Probably no change, but quite large uncertainty (confidence interval with ±5%)
??Unclear result, very large uncertainty (±10%)
???Something unexpected…

@hkaiser
Copy link
Member

hkaiser commented Aug 9, 2023

@Johan511 could you please fix the reported clang-format issues as well?

@Johan511
Copy link
Contributor Author

Johan511 commented Aug 9, 2023

@Johan511 could you please fix the reported clang-format issues as well?

There is one loop yet to be vectorized, once that's done vectorization should be enabled for parallel case too. We can merge after that.

@StellarBot
Copy link

Performance test report

HPX Performance

Comparison

BENCHMARKFORK_JOIN_EXECUTORPARALLEL_EXECUTORSCHEDULER_EXECUTOR
For Each(=)??-

Info

PropertyBeforeAfter
HPX Datetime2023-05-10T12:07:53+00:002023-08-11T12:14:49+00:00
HPX Commitdcb54155cccf76
Datetime2023-05-10T14:50:18.616050-05:002023-08-11T07:20:22.167797-05:00
Compiler/opt/apps/llvm/13.0.1/bin/clang++ 13.0.1/opt/apps/llvm/13.0.1/bin/clang++ 13.0.1
Envfile
Clusternamerostamrostam
Hostnamemedusa08.rostam.cct.lsu.edumedusa08.rostam.cct.lsu.edu

Comparison

BENCHMARKNO-EXECUTOR
Future Overhead - Create Thread Hierarchical - Latch(=)

Info

PropertyBeforeAfter
HPX Datetime2023-05-10T12:07:53+00:002023-08-11T12:14:49+00:00
HPX Commitdcb54155cccf76
Datetime2023-05-10T14:52:35.047119-05:002023-08-11T07:22:34.867171-05:00
Compiler/opt/apps/llvm/13.0.1/bin/clang++ 13.0.1/opt/apps/llvm/13.0.1/bin/clang++ 13.0.1
Envfile
Clusternamerostamrostam
Hostnamemedusa08.rostam.cct.lsu.edumedusa08.rostam.cct.lsu.edu

Comparison

BENCHMARKFORK_JOIN_EXECUTOR_DEFAULT_FORK_JOIN_POLICY_ALLOCATORPARALLEL_EXECUTOR_DEFAULT_PARALLEL_POLICY_ALLOCATORSCHEDULER_EXECUTOR_DEFAULT_SCHEDULER_EXECUTOR_ALLOCATOR
Stream Benchmark - Add(=)(=)(=)
Stream Benchmark - Scale(=)(=)=
Stream Benchmark - Triad=(=)(=)
Stream Benchmark - Copy(=)-(=)

Info

PropertyBeforeAfter
HPX Datetime2023-05-10T12:07:53+00:002023-08-11T12:14:49+00:00
HPX Commitdcb54155cccf76
Datetime2023-05-10T14:52:52.237641-05:002023-08-11T07:22:51.879951-05:00
Compiler/opt/apps/llvm/13.0.1/bin/clang++ 13.0.1/opt/apps/llvm/13.0.1/bin/clang++ 13.0.1
Envfile
Clusternamerostamrostam
Hostnamemedusa08.rostam.cct.lsu.edumedusa08.rostam.cct.lsu.edu

Explanation of Symbols

SymbolMEANING
=No performance change (confidence interval within ±1%)
(=)Probably no performance change (confidence interval within ±2%)
(+)/(-)Very small performance improvement/degradation (≤1%)
+/-Small performance improvement/degradation (≤5%)
++/--Large performance improvement/degradation (≤10%)
+++/---Very large performance improvement/degradation (>10%)
?Probably no change, but quite large uncertainty (confidence interval with ±5%)
??Unclear result, very large uncertainty (±10%)
???Something unexpected…

@Johan511
Copy link
Contributor Author

@hkaiser the unseq_first_n function has been changed a bit, the the function now takes iterator as input instead of value, this does not break vectorization.

Hari Hara Naveen S and others added 6 commits February 8, 2024 14:11
Signed-off-by: Hari Hara Naveen S <[email protected]>
Signed-off-by: Hari Hara Naveen S <[email protected]>

changing test according to change made to helper

Signed-off-by: Hari Hara Naveen S <[email protected]>
Signed-off-by: Hari Hara Naveen S <[email protected]>

unseq

Signed-off-by: Hari Hara Naveen S <[email protected]>
…for function call deduction)

Signed-off-by: Hari Hara Naveen S <[email protected]>
Signed-off-by: Hari Hara Naveen S <[email protected]>
Signed-off-by: Hari Hara Naveen S <[email protected]>

fixed expolicy not accepting par_unseq

Signed-off-by: Hari Hara Naveen S <[email protected]>

cleanup changes

Signed-off-by: Hari Hara Naveen S <[email protected]>

fixing missing concept header

Signed-off-by: Hari Hara Naveen S <[email protected]>
Copy link

Coverage summary from Codacy

See diff coverage on Codacy

Coverage variation Diff coverage
-0.10% 80.00%
Coverage variation details
Coverable lines Covered lines Coverage
Common ancestor commit (17bdd0d) 206733 176202 85.23%
Head commit (d8096f0) 190774 (-15959) 162418 (-13784) 85.14% (-0.10%)

Coverage variation is the difference between the coverage for the head and common ancestor commits of the pull request branch: <coverage of head commit> - <coverage of common ancestor commit>

Diff coverage details
Coverable lines Covered lines Diff coverage
Pull request (#6302) 10 8 80.00%

Diff coverage is the percentage of lines that are covered by tests out of the coverable lines that the pull request added or modified: <covered lines added or modified>/<coverable lines added or modified> * 100%

See your quality gate settings    Change summary preferences

You may notice some variations in coverage metrics with the latest Coverage engine update. For more details, visit the documentation

@hkaiser hkaiser modified the milestones: 1.10.0, 1.11.0 May 3, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants