Uneven performance of `blst_p1s_mult_pippenger` #235

chfast · 2024-10-31T12:22:38Z

When benchmarking `blst_p1s_mult_pippenger`, I noticed sudden jumps in performance at certain numbers of points: 64, 128, 256 and so on.


dot-asm · 2024-11-01T11:12:57Z

And what's the issue? :-) But on a more serious note, the key point is that the slope of the curve becomes more and more moderate, and that it depends on how you slice the scalars for a given number of inputs, which is a balancing act. The "scalar-slicing" procedure is prone to rounding errors, which is why the curve is bound to have breaks. Now, with this in mind, what's the issue? That the breaks are too big?

chfast · 2024-11-04T13:51:05Z

I just wanted to point out that the decision on how to slice scalars depending on the number of inputs could be improved. E.g. currently, for our data, it is faster to compute an MSM for 65 points than for 55.
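To make the 55-versus-65 observation concrete, here is a rough sketch of how a step-wise window choice can produce such an inversion. The return expression in `window_size` is the one quoted later in this thread; deriving `wbits` as floor(log2(npoints)) is an assumption, and `estimated_additions` is a hypothetical textbook bucket-method estimate, not blst's actual operation count.

```c
#include <stddef.h>

/* Window size in bits. The return expression is quoted in this thread;
 * computing wbits as floor(log2(npoints)) is an assumption. */
static size_t window_size(size_t npoints)
{
    size_t wbits;

    for (wbits = 0; npoints >>= 1; wbits++)
        ;
    return wbits > 12 ? wbits - 3 : (wbits > 4 ? wbits - 2 : (wbits ? 2 : 1));
}

/* Hypothetical textbook estimate, not a measurement:
 * ceil(nbits / w) windows, each costing about npoints + 2^w additions. */
static size_t estimated_additions(size_t npoints, size_t nbits)
{
    size_t w = window_size(npoints);
    size_t windows = (nbits + w - 1) / w;

    return windows * (npoints + ((size_t)1 << w));
}
```

Under these assumptions and with 255-bit scalars, the estimate is 5355 additions for 55 points (3-bit windows) and 5184 for 65 points (4-bit windows). That agrees with the direction of the observation, but it is only a model.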

dot-asm · 2024-11-05T10:02:01Z

For the record, performance for such small numbers of inputs has never been subject to such close scrutiny, let alone single-thread performance[!]. The latter is because even single-board computers are multi-core these days. But anyway, try modifying `pippenger_window_size()` in `src/multi_scalar.c` by adding `npoints += 8;` after the `size_t wbits;` declaration.
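As a rough illustration of what the suggested `npoints += 8;` does, the sketch below models the window choice with an explicit bias. The function names are hypothetical, and computing `wbits` as floor(log2(npoints)) is an assumption; only the return expression is taken from this thread.

```c
#include <stddef.h>

/* Window size in bits with an additive bias on npoints.
 * bias = 0 models the unmodified choice, bias = 8 the suggested tweak.
 * Assumption: wbits is floor(log2(npoints + bias)). */
static size_t window_size_biased(size_t npoints, size_t bias)
{
    size_t wbits;

    npoints += bias;
    for (wbits = 0; npoints >>= 1; wbits++)
        ;
    return wbits > 12 ? wbits - 3 : (wbits > 4 ? wbits - 2 : (wbits ? 2 : 1));
}

/* Smallest npoints in [1, max_npoints] whose window reaches `target` bits,
 * or 0 if there is none. */
static size_t first_npoints_with_window(size_t target, size_t bias,
                                        size_t max_npoints)
{
    size_t n;

    for (n = 1; n <= max_npoints; n++)
        if (window_size_biased(n, bias) >= target)
            return n;
    return 0;
}
```

In this model the bias only moves each break 8 points earlier (for example, the 4-bit window starts at 56 points instead of 64) and leaves the size of the steps unchanged.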

chfast · 2024-12-09T13:37:06Z

I tried the suggestion (`npoints += 8`), but its effect is limited.

I got the best results by:

- increasing the bits by 1 from what is originally computed:

      size_t r = wbits>12 ? wbits-3 : (wbits>4 ? wbits-2 : (wbits ? 2 : 1));
      return r + 1;

- decreasing the threshold of the fallback to `mult_wbits` by a factor of 4; for p1 this is 64 → 16:

  ```
  if ((npoints * sizeof(ptype##_affine) * 8 * 3) <= SCRATCH_LIMIT / 4)
  ```
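The threshold arithmetic can be sketched as follows. This assumes that a p1 affine point occupies 96 bytes (two 384-bit coordinates) and that a true condition selects the `mult_wbits` fallback. `SCRATCH_LIMIT_PLACEHOLDER` and `takes_fallback` are hypothetical: the constant is not taken from the blst sources, but is chosen so that the unmodified condition admits at most 64 points, matching the figure above.

```c
#include <stddef.h>

/* Assumption: a p1 affine point holds two 384-bit coordinates (96 bytes). */
#define P1_AFFINE_BYTES ((size_t)96)

/* Placeholder, NOT the real SCRATCH_LIMIT: picked so that the unmodified
 * condition admits at most 64 p1 points, as stated in the thread. */
#define SCRATCH_LIMIT_PLACEHOLDER (64 * P1_AFFINE_BYTES * 8 * 3)

/* Same shape as the quoted condition.
 * divisor = 1 models the original check, divisor = 4 the modified one. */
static int takes_fallback(size_t npoints, size_t divisor)
{
    return npoints * P1_AFFINE_BYTES * 8 * 3
           <= SCRATCH_LIMIT_PLACEHOLDER / divisor;
}
```

With these placeholder values, `takes_fallback(64, 1)` and `takes_fallback(16, 4)` hold, while 65 and 17 points respectively go to the Pippenger path.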
