vector dot - inconsistent results #88958
-
|
Beta Was this translation helpful? Give feedback.
Replies: 4 comments 2 replies
-
I think it is expected because the accuracy between scalar floating instructions and SIMD instructions is different. |
Beta Was this translation helpful? Give feedback.
-
The purely mathematical definition is indeed: If you were to define this programmatically, you'd end up with something like the following: public static T DotProduct<T>(ReadOnlySpan<T> a, ReadOnlySpan<T> b)
where T : INumber<T>
{
if (a.Length != b.Length)
{
throw new ArgumentOutOfRangeException();
}
T[] product = Multiply(a, b);
return Sum(product);
}
public static T[] Multiply<T>(ReadOnlySpan<T> a, ReadOnlySpan<T> b)
where T : INumber<T>
{
if (a.Length != b.Length)
{
throw new ArgumentOutOfRangeException();
}
T[] result = new T[a.Length];
for (int i = 0; i < result.Length; i++)
{
result[i] = a[i] * b[i];
}
return result;
} For the naive case, the implementation of public static T Sum<T>(ReadOnlySpan<T> a)
where T : INumber<T>
{
T result = T.Zero;
for (int i = 0; i < a.Length; i++)
{
result += a[i];
}
return result;
} Floating-point values, however, are not infinitely precise and end up rounding after each operation. This can introduce error into the equation since floating-point is then not associative, which is to say that That is to say, the "standard" algorithm is actually more akin to the following: public static T Sum<T>(ReadOnlySpan<T> a)
where T : INumber<T>
{
T result;
if (a.Length > 2)
{
int halfLength = a.Length / 2;
result = Sum(a[..halfLength]) + Sum(a[halfLength..]);
}
else if (a.Length == 2)
{
result = a[0] + a[1];
}
else if (a.Length == 1)
{
result = a[0];
}
else
{
result = T.Zero;
}
return result;
} You can see this in the formal definition of |
Beta Was this translation helpful? Give feedback.
-
Hi tannergoodingtannergooding, Thank you for your very elaborated answer. In fact, by simply replacing double It's clear that no associativity guarantees producing results better than the others, everything depends on the operands of the summation; Since pairwise associativity is more SIMD friendly, you should probably consider changing the left-wise associativity used by ordinary scalar summation into a pairwise one simply for the sake of producing consistent results. |
Beta Was this translation helpful? Give feedback.
-
Thank you everybody. |
Beta Was this translation helpful? Give feedback.
-- Noting that I may have made a mistake in the pairwise algorithm. It's early and I didn't think too deeply 😅
The general intent is that you add neighbors to eachother recursively. So
(a[0] + a[1])
and(a[2] + a[3])
and so on. This halves the length. You then do it again on those intermediates until you only have 1 result left.So for 4 elements:
Or for 8 elements:
etc