Add support for NEON (128-bit wide SIMD for ARM) for 64-bit architectures #115
Labels
acceptance: go ahead
Reviewed, implementation can start
area: performance
Performance improvements
contribute: simd
Requires SIMD knowledge
help wanted
External contributions welcome
type: feature
New feature or request
Milestone
Is your feature request related to a problem? Please describe.
SIMD acceleration is already implemented for x86, and support for 32-bit ARM is tracked in #21. We also need support for 64-bit ARM.
Describe the solution you'd like
We expect most of the design around extracting architecture-specific bits to be done in #14. After that, a similar approach can be used here and in #21.
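To make the idea concrete, here is a rough sketch of what isolating the architecture-specific part could look like. It is not taken from this project or from #14. Rust, the `arch` module, and the `add_u8x16` helper are all assumptions chosen for illustration. The NEON path uses the `vld1q_u8`, `vaddq_u8`, and `vst1q_u8` intrinsics from `std::arch::aarch64`, and a scalar fallback with the same signature is compiled on every other target.

```rust
// Hypothetical layout: one `arch` module per target, all exposing the same function.

#[cfg(all(target_arch = "aarch64", target_feature = "neon"))]
mod arch {
    use std::arch::aarch64::{vaddq_u8, vld1q_u8, vst1q_u8};

    /// Lane-wise wrapping addition of two 16-byte blocks using one 128-bit NEON register each.
    pub fn add_u8x16(a: &[u8; 16], b: &[u8; 16]) -> [u8; 16] {
        let mut out = [0u8; 16];
        // SAFETY: this module is only compiled when NEON is enabled at compile time,
        // and every pointer refers to exactly 16 readable/writable bytes.
        unsafe {
            let va = vld1q_u8(a.as_ptr());
            let vb = vld1q_u8(b.as_ptr());
            vst1q_u8(out.as_mut_ptr(), vaddq_u8(va, vb));
        }
        out
    }
}

#[cfg(not(all(target_arch = "aarch64", target_feature = "neon")))]
mod arch {
    /// Scalar fallback with the same signature and the same wrapping semantics.
    pub fn add_u8x16(a: &[u8; 16], b: &[u8; 16]) -> [u8; 16] {
        let mut out = [0u8; 16];
        for i in 0..16 {
            out[i] = a[i].wrapping_add(b[i]);
        }
        out
    }
}

// Callers use this single entry point and never mention the architecture.
pub use arch::add_u8x16;
```

With this shape, calling `add_u8x16(&[250u8; 16], &[10u8; 16])` gives `[4u8; 16]` on both paths, since the addition wraps modulo 256. The same pattern would extend to a 32-bit ARM module for #21.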
Additional context
Find NEON intrinsics documentation here.
I am not knowledgeable about NEON, and I don't know how to emulate an ARM system locally, so help here is really needed.