Better batch commit and switch to Reed Solomon code. #155

Merged
merged 68 commits into from
Sep 9, 2024

Conversation

yczhangsjtu
Collaborator

@yczhangsjtu yczhangsjtu commented Aug 20, 2024

Main tasks accomplished by this PR:

  • Replace the naive batch commit (committing to each polynomial individually) with a real batch commit, i.e., committing to multiple polynomials in a single Merkle tree.
  • Add the simple_batch_prove and simple_batch_verify methods. These methods support opening:
    • One commitment that commits to multiple polynomials of the same size.
    • One opening point.
  • Switch the encoding algorithm from the one in the BaseFold paper to Reed-Solomon (RS) code. RS encoding is much faster, and RS code has a larger relative distance, which allows better parameters.
  • Estimate the appropriate parameter for RS code.

(The original batch_prove and batch_verify methods support opening multiple commitments at multiple points, with a flexible combination of polys and points, but each input commitment may contain only one polynomial.)
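
To make the difference between the two commit styles concrete, here is a minimal sketch of a batch commit that puts several equal-length codewords into one Merkle tree. It is an illustration under assumptions, not the code in this PR: SHA-256 stands in for whatever hash the repository uses, and the names `batch_merkle_root` and `naive_roots` as well as the leaf layout (leaf i hashes the i-th entry of every codeword) are hypothetical.

```python
import hashlib


def hash_bytes(data: bytes) -> bytes:
    # SHA-256 is a stand-in hash chosen for this sketch (assumption).
    return hashlib.sha256(data).digest()


def batch_merkle_root(codewords):
    """Commit to several equal-length codewords with a single Merkle tree.

    Leaf i hashes the i-th entry of every codeword together, so a single
    authentication path opens position i in all polynomials at once.
    Entries are assumed to be integers below 2**64.
    """
    n = len(codewords[0])
    assert n > 0 and n & (n - 1) == 0, "length must be a power of two"
    assert all(len(cw) == n for cw in codewords), "codewords must have equal length"

    # One leaf per codeword position, covering all polynomials.
    layer = [
        hash_bytes(b"".join(cw[i].to_bytes(8, "little") for cw in codewords))
        for i in range(n)
    ]
    # Hash adjacent pairs until a single root remains.
    while len(layer) > 1:
        layer = [hash_bytes(layer[j] + layer[j + 1]) for j in range(0, len(layer), 2)]
    return layer[0]


def naive_roots(codewords):
    # Naive batch commit: one tree, and therefore one root, per polynomial.
    return [batch_merkle_root([cw]) for cw in codewords]
```

With the batched layout a query at one position needs one Merkle path regardless of how many polynomials are committed, whereas the naive layout needs one path per polynomial.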

@yczhangsjtu yczhangsjtu changed the title Basefold optimizations Better batch commit and switch to Reed Solomon code. Sep 2, 2024
@yczhangsjtu
Collaborator Author

Just finished switching to Reed Solomon code. All tests passed.

@yczhangsjtu
Collaborator Author

yczhangsjtu commented Sep 2, 2024

Remaining to be done:

  • Estimate the new parameters for RS code. Currently, the parameters (number of queries and expansion rate) are the ones chosen for the original code from the BaseFold paper. RS code has a better distance, so it may allow smaller parameters.
  • Finish the benchmark on the current algorithms.

@yczhangsjtu
Collaborator Author

yczhangsjtu commented Sep 3, 2024

Benchmarks with the RS code.

Polynomials over base field.

| num_vars | commit 2 | open 2 | verify 2 | commit 4 | open 4 | verify 4 | commit 8 | open 8 | verify 8 | commit 16 | open 16 | verify 16 | commit 32 | open 32 | verify 32 | commit 64 | open 64 | verify 64 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 10 | 4.5243 ms | 18.988 ms | 17.162 ms | 5.1485 ms | 19.157 ms | 17.568 ms | 6.1629 ms | 20.383 ms | 18.371 ms | 7.3498 ms | 22.838 ms | 19.928 ms | 8.5209 ms | 27.893 ms | 23.061 ms | 11.508 ms | 37.876 ms | 29.209 ms |
| 11 | 7.9425 ms | 26.458 ms | 22.503 ms | 8.1803 ms | 27.220 ms | 22.962 ms | 9.9768 ms | 28.833 ms | 23.416 ms | 11.417 ms | 32.115 ms | 25.164 ms | 14.481 ms | 38.973 ms | 28.217 ms | 19.840 ms | 52.815 ms | 34.623 ms |
| 12 | 13.362 ms | 35.963 ms | 28.199 ms | 14.067 ms | 37.106 ms | 28.576 ms | 17.089 ms | 39.876 ms | 29.047 ms | 20.327 ms | 45.189 ms | 30.756 ms | 25.327 ms | 56.123 ms | 33.820 ms | 37.164 ms | 78.149 ms | 39.901 ms |
| 13 | 25.160 ms | 49.107 ms | 34.081 ms | 24.506 ms | 51.312 ms | 34.522 ms | 32.744 ms | 55.933 ms | 35.096 ms | 37.551 ms | 64.735 ms | 36.608 ms | 47.398 ms | 85.124 ms | 39.664 ms | 68.463 ms | 120.51 ms | 45.862 ms |
| 14 | 47.492 ms | 68.994 ms | 40.577 ms | 48.785 ms | 73.915 ms | 41.215 ms | 59.573 ms | 82.038 ms | 41.829 ms | 74.122 ms | 97.575 ms | 43.116 ms | 91.947 ms | 136.03 ms | 46.076 ms | 135.48 ms | 199.40 ms | 51.865 ms |
| 15 | 91.939 ms | 99.272 ms | 47.294 ms | 96.935 ms | 108.07 ms | 47.828 ms | 117.05 ms | 125.45 ms | 47.702 ms | 136.48 ms | 155.52 ms | 49.365 ms | 168.01 ms | 228.66 ms | 52.516 ms | 258.04 ms | 350.63 ms | 58.538 ms |
| 16 | 182.10 ms | 153.08 ms | 54.255 ms | 187.62 ms | 168.58 ms | 54.280 ms | 225.88 ms | 200.47 ms | 54.377 ms | 257.95 ms | 265.90 ms | 56.214 ms | 314.61 ms | 409.76 ms | 59.009 ms | 510.02 ms | 645.97 ms | 65.328 ms |
| 17 | 349.21 ms | 251.01 ms | 61.587 ms | 379.80 ms | 290.55 ms | 61.731 ms | 460.18 ms | 368.57 ms | 62.013 ms | 494.80 ms | 491.78 ms | 63.857 ms | 615.47 ms | 773.54 ms | 66.570 ms | 1.0142 s | 1.2444 s | 72.700 ms |
| 18 | 743.98 ms | 459.56 ms | 68.865 ms | 731.44 ms | 509.67 ms | 69.138 ms | 907.16 ms | 662.64 ms | 69.577 ms | 1.0224 s | 890.04 ms | 71.006 ms | 1.3216 s | 1.4409 s | 74.066 ms | 2.1844 s | 2.3578 s | 79.930 ms |
| 19 | 1.5244 s | 952.44 ms | 76.040 ms | 1.4700 s | 1.0531 s | 77.126 ms | 1.7081 s | 1.2848 s | 77.057 ms | 2.1718 s | 1.8043 s | 78.720 ms | 2.7322 s | 2.7241 s | 81.757 ms | 4.4898 s | 4.6265 s | 87.659 ms |
| 20 | 3.0092 s | 1.8017 s | 84.938 ms | 3.1013 s | 2.0809 s | 85.785 ms | 3.4352 s | 2.4971 s | 85.549 ms | 4.2900 s | 3.5549 s | 87.273 ms | 5.3019 s | 5.4156 s | 89.982 ms | 8.8881 s | 9.3985 s | 94.517 ms |

Extension field polynomials.

| num_vars | commit 2 | open 2 | verify 2 | commit 4 | open 4 | verify 4 | commit 8 | open 8 | verify 8 | commit 16 | open 16 | verify 16 | commit 32 | open 32 | verify 32 | commit 64 | open 64 | verify 64 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 10 | 5.1710 ms | 19.161 ms | 17.497 ms | 5.5321 ms | 19.955 ms | 18.305 ms | 6.9805 ms | 21.759 ms | 19.686 ms | 8.9328 ms | 25.419 ms | 22.656 ms | 13.693 ms | 33.031 ms | 28.425 ms | 20.444 ms | 48.541 ms | 39.944 ms |
| 11 | 8.0846 ms | 26.683 ms | 22.736 ms | 8.7722 ms | 27.960 ms | 23.286 ms | 11.471 ms | 30.198 ms | 24.923 ms | 14.816 ms | 35.018 ms | 27.905 ms | 23.470 ms | 45.130 ms | 33.669 ms | 38.998 ms | 64.989 ms | 44.939 ms |
| 12 | 13.597 ms | 36.545 ms | 28.453 ms | 15.692 ms | 38.159 ms | 29.033 ms | 20.758 ms | 41.582 ms | 30.526 ms | 28.319 ms | 48.414 ms | 33.358 ms | 45.702 ms | 63.874 ms | 39.284 ms | 74.750 ms | 90.984 ms | 50.567 ms |
| 13 | 25.318 ms | 50.212 ms | 34.392 ms | 26.065 ms | 52.826 ms | 34.925 ms | 39.735 ms | 58.303 ms | 36.333 ms | 53.787 ms | 69.227 ms | 39.137 ms | 85.366 ms | 95.939 ms | 45.134 ms | 137.86 ms | 138.80 ms | 56.171 ms |
| 14 | 45.369 ms | 70.691 ms | 40.542 ms | 48.369 ms | 76.043 ms | 41.212 ms | 71.622 ms | 84.837 ms | 42.524 ms | 101.33 ms | 103.54 ms | 45.557 ms | 164.94 ms | 151.43 ms | 51.172 ms | 290.80 ms | 225.09 ms | 62.291 ms |
| 15 | 91.363 ms | 103.29 ms | 47.268 ms | 101.06 ms | 111.36 ms | 47.849 ms | 141.88 ms | 130.64 ms | 49.008 ms | 196.54 ms | 164.65 ms | 51.851 ms | 315.35 ms | 258.02 ms | 57.693 ms | 574.17 ms | 392.52 ms | 68.854 ms |
| 16 | 170.71 ms | 157.63 ms | 54.054 ms | 188.66 ms | 176.75 ms | 54.660 ms | 268.26 ms | 212.33 ms | 55.858 ms | 370.74 ms | 280.92 ms | 58.855 ms | 571.54 ms | 456.43 ms | 64.414 ms | 1.1130 s | 714.94 ms | 75.532 ms |
| 17 | 354.48 ms | 261.99 ms | 61.653 ms | 367.62 ms | 308.54 ms | 61.766 ms | 526.09 ms | 386.15 ms | 63.494 ms | 755.97 ms | 522.14 ms | 65.950 ms | 1.2399 s | 890.21 ms | 71.718 ms | 2.4407 s | 1.3821 s | 82.832 ms |
| 18 | 739.07 ms | 520.60 ms | 69.276 ms | 794.45 ms | 591.77 ms | 69.392 ms | 1.1169 s | 737.85 ms | 70.507 ms | 1.5141 s | 1.0001 s | 73.637 ms | 2.6478 s | 1.5715 s | 79.124 ms | 5.1186 s | 2.6724 s | 90.325 ms |
| 19 | 1.4967 s | 946.73 ms | 77.226 ms | 1.5801 s | 1.0208 s | 77.349 ms | 2.2429 s | 1.4031 s | 78.447 ms | 3.0869 s | 1.9302 s | 80.906 ms | 5.0353 s | 3.0734 s | 87.126 ms | 10.004 s | 5.2565 s | 96.227 ms |
| 20 | 3.0468 s | 1.8904 s | 85.361 ms | 3.1666 s | 2.1642 s | 85.396 ms | 4.5045 s | 2.7066 s | 86.597 ms | 5.6582 s | 3.8164 s | 88.964 ms | 10.934 s | 6.1087 s | 93.487 ms | 19.762 s | 10.453 s | 104.39 ms |

@yczhangsjtu
Collaborator Author

yczhangsjtu commented Sep 3, 2024

For a clearer comparison (the RS figures are the num_vars = 20, batch size 64 entries from the tables above):

  • With BaseFold encoding, polynomial over base field:
    • Commit 16.007s
    • Open 9.3620s
    • Verify 93.459ms
  • With BaseFold encoding, polynomial over ext field:
    • Commit 32.986s
    • Open 10.384s
    • Verify 102.96ms
  • With RS encoding, polynomial over base field:
    • Commit 8.8881 s
    • Open 9.3985 s
    • Verify 94.517 ms
  • With RS encoding, polynomial over ext field:
    • Commit 19.762 s
    • Open 10.453 s
    • Verify 104.39 ms

When the rate bit is set to 1 (rate = 2) and the number of queries is set to 973 accordingly, the results for base field polynomials are:

  • Commit: 2.0501 s
  • Open: 3.2066 s
  • Verify 301.09 ms

@yczhangsjtu yczhangsjtu marked this pull request as ready for review September 3, 2024 02:29
@yczhangsjtu yczhangsjtu requested a review from dreamATD September 3, 2024 02:29
@yczhangsjtu
Collaborator Author

Security bits analysis for BaseFold

The following is copied from JupyterLab.

BaseFold parameter selection:

Choose $\delta,\gamma\in(0,1)$ and integer $\ell$ such that:

  • $\frac{2d}{\gamma^3|\mathbb{F}|}+(1-\delta+d\gamma)^{\ell}\leq 2^{-\lambda}$
  • $\delta<J_{\gamma}(J_{\gamma}(\Delta_{C_d}))$
  • $3\delta-d\gamma<\Delta_{C_d}$

Here

  • $\Delta_{C_d}$ is the relative code distance.
  • $J_{\gamma}(x)=1-\sqrt{1-x(1-\gamma)}$.
  • $d$ is the number of rounds.

For our case, since we are using RS code, $\Delta_{C_d}$ is $1/2$, $3/4$ or $7/8$ for expansion factors $2$, $4$ and $8$ respectively.
$|\mathbb{F}|$ is about $2^{128}$ since we are using the degree-two extension of the Goldilocks field. The number of rounds $d$ is approximately $13$. The target security level is $\lambda=100$ bits.

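For the first term, requiring $\frac{2d}{\gamma^3|\mathbb{F}|}\leq 2^{-\lambda}$ with $d=13$, $|\mathbb{F}|=2^{128}$ and $\lambda=100$ gives $\gamma^3\geq 26\cdot 2^{-28}=13\cdot 2^{-27}$, i.e. $\gamma\geq\sqrt[3]{13}/2^9$.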

import math
math.pow(13,1/3)/(2**9)
0.004592450561954604

So $\gamma$ must be at least about $0.0046$. Since we want $d\gamma$ smaller than $1$ (so that $1-\delta+d\gamma$ has a chance to be smaller than $1$), we also need $\gamma<1/d\approx 0.077$. In what follows we only consider $\gamma<0.05$.

Now let's choose $\delta$, which satisfies:

  • $\delta>d\gamma$
  • $\delta<(d\gamma+\Delta_{C_d})/3$
  • $\delta<J_{\gamma}(J_{\gamma}(\Delta_{C_d}))$

For different $\gamma$ and $\Delta_{C_d}$, those bounds are

for gamma in [0.004, 0.004592, 0.005, 0.01, 0.02, 0.03, 0.04]:
    d = 13
    for dist in [1/2, 3/4, 7/8]:
        up0 = (d * gamma + dist) / 3
        up1 = 1 - math.sqrt(1 - (1 - math.sqrt(1 - dist * (1 - gamma))) * (1 - gamma))
        print(
            f'dist={dist}, gamma={gamma}: ',
            d * gamma,
            up0,
            up1,
            min(up0, up1) - d * gamma
        )
dist=0.5, gamma=0.004:  0.052000000000000005 0.18400000000000002 0.1575716617875993 0.10557166178759929
dist=0.75, gamma=0.004:  0.052000000000000005 0.26733333333333337 0.2893811926328024 0.21533333333333338
dist=0.875, gamma=0.004:  0.052000000000000005 0.309 0.39913804354711924 0.257
dist=0.5, gamma=0.004592:  0.059696 0.18656533333333333 0.15734588481726863 0.09764988481726863
dist=0.75, gamma=0.004592:  0.059696 0.2698986666666667 0.2888653720193639 0.21020266666666668
dist=0.875, gamma=0.004592:  0.059696 0.3115653333333333 0.3982248317164482 0.2518693333333333
dist=0.5, gamma=0.005:  0.065 0.18833333333333332 0.15719042351299672 0.09219042351299672
dist=0.75, gamma=0.005:  0.065 0.27166666666666667 0.28851046263469493 0.20666666666666667
dist=0.875, gamma=0.005:  0.065 0.3133333333333333 0.397597370999087 0.2483333333333333
dist=0.5, gamma=0.01:  0.13 0.21 0.1552946164633383 0.025294616463338304
dist=0.75, gamma=0.01:  0.13 0.29333333333333333 0.2841996559989196 0.1541996559989196
dist=0.875, gamma=0.01:  0.13 0.33499999999999996 0.3900317334085216 0.20499999999999996
dist=0.5, gamma=0.02:  0.26 0.25333333333333335 0.15155437062991528 -0.10844562937008473
dist=0.75, gamma=0.02:  0.26 0.33666666666666667 0.275786028228975 0.015786028228975013
dist=0.875, gamma=0.02:  0.26 0.37833333333333335 0.3755467292677003 0.11554672926770027
dist=0.5, gamma=0.03:  0.39 0.2966666666666667 0.14788149029682773 -0.24211850970317228
dist=0.75, gamma=0.03:  0.39 0.38000000000000006 0.267637476599081 -0.12236252340091902
dist=0.875, gamma=0.03:  0.39 0.4216666666666667 0.3618452634642416 -0.028154736535758396
dist=0.5, gamma=0.04:  0.52 0.34 0.1442746673791374 -0.3757253326208626
dist=0.75, gamma=0.04:  0.52 0.42333333333333334 0.25974041598612085 -0.26025958401387916
dist=0.875, gamma=0.04:  0.52 0.465 0.3488471761560117 -0.17115282384398833

For large $\gamma$ the last column is negative, so it is impossible to choose a valid $\delta$. We therefore ignore the large values of $\gamma$.

We want $\delta-d\gamma$ to be as large as possible, so a smaller $\gamma$ is preferable. Let's choose $\gamma=0.004$. This is slightly below the lower bound on $\gamma$ derived above, but the resulting soundness error of the first term (as a base-2 logarithm) is

math.log(26/((0.004)**3)/(2**128), 2)
-99.40220742787265

That is about $2^{-99.4}$, quite close to the $2^{-100}$ target, so it's acceptable.

Then there is a tradeoff between the code distance and the soundness error of the second term.
The difference $\delta-d\gamma$ determines the number of queries $\ell$ through
$$
(1-(\delta-d\gamma))^{\ell}<2^{-\lambda}
$$
So $\ell>\lambda/(-\log(1-(\delta-d\gamma)))$.

for diff in [0.10557166178759929, 0.21533333333333338, 0.257]:
    print(100 / (-math.log(1-diff)))
896.2943272850086
412.3774602649325
336.63319791298466

So the number of queries is very large. Roughly 337 is the minimum we can achieve here, and that already requires RS code with expansion factor 8. This number is larger than the one specified in the current code.
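
One detail worth keeping in mind when reading these counts: Python's `math.log` with one argument is the natural logarithm, so the figures above correspond to driving the second term below $e^{-\lambda}$. If the target for that term is read as $2^{-\lambda}$, the base-2 logarithm applies and every count shrinks by a factor of $\ln 2\approx 0.69$. The following sketch compares the two readings for the three difference values used above; the helper name `queries_for` is hypothetical.

```python
import math

LAMBDA = 100
# The three values of delta - d*gamma used in the analysis above
# (gamma = 0.004, distances 1/2, 3/4 and 7/8).
diffs = [0.10557166178759929, 0.21533333333333338, 0.257]


def queries_for(diff, security_bits, base):
    """Smallest real l with (1 - diff)**l <= base**(-security_bits)."""
    return security_bits / (-math.log(1 - diff, base))


natural = [queries_for(x, LAMBDA, math.e) for x in diffs]  # target e^-lambda
base2 = [queries_for(x, LAMBDA, 2) for x in diffs]         # target 2^-lambda

for x, qn, q2 in zip(diffs, natural, base2):
    print(f"diff={x:.4f}: natural log -> {qn:.1f}, base 2 -> {q2:.1f}")
```

Under the natural-log reading the counts are the conservative ones; which reading is appropriate depends on how the security level is meant to be measured.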

Now let's summarize an algorithm for computing the number of required queries.

def queries(security_bit, rate_bit, num_vars, basecode_size_log, dist=None):
    d = num_vars - basecode_size_log
    if dist is None:
        dist = 1 - (1 / (2 ** rate_bit))
    gamma = math.pow(2*d/(2**(128-security_bit)), 1/3)
    up0 = (d * gamma + dist) / 3
    up1 = 1 - math.sqrt(1 - (1 - math.sqrt(1 - dist * (1 - gamma))) * (1 - gamma))
    diff = min(up0, up1) - d * gamma
    q = security_bit / (-math.log(1-diff))
    return q
print(queries(100, 3, 20, 7))
print(queries(100, 3, 20, 8))
print(queries(100, 2, 20, 8))
print(queries(100, 3, 20, 7, 0.557))
344.6227035920193
338.3267577726375
414.78596208854384
766.1956244621043

Here 0.557 is the relative distance of the original BaseFold code with expansion factor 8; passing it as the dist argument overrides the RS distance. The required number of queries for the original BaseFold code is therefore about 766. The 260 queries in the current code only provide roughly 50 to 60 bits of security.
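
The estimate for 260 queries can be reproduced by inverting the query formula. The sketch below is an illustration under assumptions: it recomputes the margin $\delta-d\gamma$ exactly as the `queries` function does, then measures the second soundness term in bits with a base-2 logarithm (an assumption; `queries` itself uses the natural logarithm). The names `query_margin` and `bits_from_queries` are hypothetical.

```python
import math


def query_margin(security_bit, rate_bit, num_vars, basecode_size_log, dist=None):
    """Largest admissible delta - d*gamma, computed as in queries() above."""
    d = num_vars - basecode_size_log
    if dist is None:
        dist = 1 - 1 / 2**rate_bit
    gamma = (2 * d / 2 ** (128 - security_bit)) ** (1 / 3)
    up0 = (d * gamma + dist) / 3
    up1 = 1 - math.sqrt(1 - (1 - math.sqrt(1 - dist * (1 - gamma))) * (1 - gamma))
    return min(up0, up1) - d * gamma


def bits_from_queries(num_queries, margin):
    """Bits of security from the term (1 - margin)**num_queries, base 2."""
    return -num_queries * math.log2(1 - margin)


# Original BaseFold code: relative distance 0.557 at expansion factor 8.
margin = query_margin(100, 3, 20, 7, dist=0.557)
bits_260 = bits_from_queries(260, margin)
print(f"margin = {margin:.4f}, 260 queries give about {bits_260:.1f} bits")
```

Under these assumptions 260 queries come out at about 49 bits for the second term, in line with the rough estimate above.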

def proof_size(security_bit, rate_bit, num_vars, basecode_size_log, dist=None):
    q = queries(security_bit, rate_bit, num_vars, basecode_size_log, dist)
    d = num_vars - basecode_size_log
    merkle_path_num_hashes = (num_vars + rate_bit + (basecode_size_log - 1) + rate_bit) * d / 2
    merkle_path_size = merkle_path_num_hashes * 16

    commitments_size = (d - 1) * 16

    final_message_size = (2 ** basecode_size_log) * 16

    return merkle_path_size * q + commitments_size + final_message_size
print(proof_size(100, 3, 20, 7))
1149144.3575542404

So the proof size for RS code is a little over 1 MB (about 1.15 MB). In comparison, the proof size for the original BaseFold code is

print(proof_size(100, 3, 20, 7, 0.557))
2552139.038209883

More than twice as large as the RS code.

@yczhangsjtu
Collaborator Author

According to the experiment results in the BaseFold paper, the proof size for 20 variables is approximately 4MB, which is even bigger than the analyzed result above. The paper does not mention what parameters were selected for BaseFold, so maybe it is something that favors the prover over the verifier and proof size.

Comment on lines +244 to +255
let res = poly
    .par_chunks_exact(message_size)
    .map(|chunk| {
        let mut target = vec![F::ZERO; message_size * rate];
        // Just Reed-Solomon code, but with the naive domain
        target
            .iter_mut()
            .enumerate()
            .for_each(|(i, target)| *target = horner(chunk, &domain[i]));
        target
    })
    .collect::<Vec<Vec<F>>>();
Collaborator

This part seems to have at least quadratic complexity, because target.iter_mut() is a loop of size O(message_size * rate) and the complexity of horner is O(message_size * rate).

Collaborator Author

That's right. But the quadratic complexity is only for the small chunks of constant size. The overall time is linear.
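
To make the cost argument concrete, here is a minimal Python sketch of the same per-chunk encoding with an explicit multiplication counter. It rests on assumptions: the body of `horner` is not shown in the diff, so it is modelled here as one multiply-add per coefficient of the chunk, and the evaluation domain is an arbitrary list of points. `naive_rs_encode` and the counter are hypothetical names for illustration.

```python
P = 2**64 - 2**32 + 1  # Goldilocks prime


def horner(coeffs, x, counter):
    """Evaluate sum(coeffs[k] * x**k) mod P, one multiply-add per coefficient."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % P
        counter[0] += 1
    return acc


def naive_rs_encode(poly, message_size, rate, domain):
    """Encode each message_size-sized chunk by evaluating it on the domain.

    Assumes len(poly) is a multiple of message_size and that domain has at
    least message_size * rate points. Returns the codewords and the number
    of multiply-adds performed.
    """
    counter = [0]
    codewords = []
    for start in range(0, len(poly), message_size):
        chunk = poly[start:start + message_size]
        codewords.append(
            [horner(chunk, domain[i], counter) for i in range(message_size * rate)]
        )
    return codewords, counter[0]


poly = list(range(1, 33))  # N = 32 coefficients
codewords, mults = naive_rs_encode(poly, message_size=8, rate=2, domain=list(range(16)))
print(len(codewords), mults)
```

Under this model each chunk costs message_size^2 * rate multiply-adds, so the total is N * message_size * rate: quadratic in the chunk size but linear in N for a fixed message_size.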

Collaborator

How much is message_size?

Collaborator

It seems like the complexity is $N / 2^7 \cdot (2^7 \cdot 2^3)^2 = 8192 \cdot N$; correct me if I'm wrong. I think it's better to add a comment here: FIXME: it's expensive.

Collaborator Author

message_size is roughly 2^7.

Collaborator Author

Added the FIXME. I think we can leave optimizing this for later because this code is currently not in use anyway.

Comment on lines +326 to +329
let mut cipher = Aes128Ctr64LE::new(
    GenericArray::from_slice(&key[..]),
    GenericArray::from_slice(&iv[..]),
);
Collaborator

What's this?

Collaborator Author

The BaseFold code scheme uses random twist factors for the folding. To allow the verifier to get the same factors as the prover efficiently, these random factors are generated using AES.

Collaborator

Is that the same as extracting from transcript?

Collaborator Author

I don't think so. These parameters should be determined at least before committing to the polynomial.
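
The idea of fixing the twist factors ahead of time can be sketched as follows: both parties expand a public seed with a deterministic stream and map the output to field elements, so the factors exist before any polynomial is committed and do not depend on the transcript. This is only an illustration under assumptions. Python's standard library has no AES, so SHAKE-256 is swapped in for the AES-CTR keystream used in the PR, and the name `derive_factors` and the 8-bytes-per-element mapping are hypothetical.

```python
import hashlib

P = 2**64 - 2**32 + 1  # Goldilocks prime


def derive_factors(seed: bytes, count: int):
    """Expand a public seed into `count` pseudorandom field elements.

    SHAKE-256 stands in for the AES-CTR keystream (a substitution made for
    this sketch). Reducing 8 bytes mod P carries a tiny bias, which is
    ignored here.
    """
    stream = hashlib.shake_256(seed).digest(8 * count)
    return [
        int.from_bytes(stream[8 * i:8 * i + 8], "little") % P
        for i in range(count)
    ]


# Prover and verifier run the same expansion on the same public seed.
prover_factors = derive_factors(b"public-seed", 4)
verifier_factors = derive_factors(b"public-seed", 4)
print(prover_factors == verifier_factors)
```

Because the stream is a deterministic function of the seed alone, the verifier can regenerate exactly the factors it needs without receiving them from the prover.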

Collaborator

OK got you.

@dreamATD
Collaborator

dreamATD commented Sep 4, 2024

I think FieldType only exists to handle the case where the sumcheck input could come from either the base field or the extension field. It seems that when calling functions in PCS, all polynomials should be of the same type. Why do we still keep the FieldType representation and use match here and there, instead of just converting polys to either Vec or Vec before reaching the PCS API?

@dreamATD
Collaborator

dreamATD commented Sep 4, 2024

Benchmarks with the RS code.


What's the difference between the two tables? Also, there's no need to put so much data.

@yczhangsjtu
Collaborator Author

yczhangsjtu commented Sep 5, 2024

I think FieldType only exists to handle the case where the sumcheck input could come from either the base field or the extension field. It seems that when calling functions in PCS, all polynomials should be of the same type. Why do we still keep the FieldType representation and use match here and there, instead of just converting polys to either Vec or Vec before reaching the PCS API?

I think we discussed this before I started aligning the mpcs code with the other part of Ceno.

It was originally designed that way, with complex generic type parameters so that the same code worked for both types of polynomials. The design was changed to the current one after adopting the new ExtensionField type defined in the ff_ext crate, because it is hard to specify the type parameters:

PCS<ChallengeField: ExtensionField, PolynomialField: ???>

This ??? should be satisfied by both ChallengeField and ChallengeField::BaseField. A more general trait, e.g., Field will do, but with a lot of where clauses.

Originally, both GoldilocksExt2 and Goldilocks implement the SmallField trait defined, so I was able to achieve this with a lot of where clauses, which I don't think is a good design now.

PCS<ChallengeField: SmallField, PolynomialField: SmallField>
where PolynomialField::BaseField = ChallengeField::BaseField,
PolynomialField: Into<ChallengeField>,
ChallengeField: TryInto<ChallengeField>

This bunch of stuff will be carried everywhere whenever you define a new function that invokes PCS functions.

Finally, I do think polynomials of mixed types may be opened together. Maybe not in GKR, but for a typical PIOP, so keeping it flexible may be of some value. For example, in PLONK, the first committed witnesses are over the base field, then the prover will commit to some extension field polynomials after receiving challenges. Finally, all these polynomials are opened together.
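
The value of a tagged representation for mixed openings can be illustrated with a small Python sketch. Everything here is an assumption made for illustration: `Base` and `Ext` are hypothetical stand-ins for the two variants of a FieldType-like union, the extension field is modelled as pairs over the Goldilocks prime with $X^2 = W$ for $W = 7$, and `combine` is a hypothetical helper that forms a random linear combination of polynomials of both kinds.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union

P = 2**64 - 2**32 + 1  # Goldilocks prime
W = 7  # assumed non-residue: extension modelled as F_p[X] / (X^2 - W)

ExtElem = Tuple[int, int]


@dataclass
class Base:
    evals: List[int]      # evaluations in the base field


@dataclass
class Ext:
    evals: List[ExtElem]  # evaluations in the degree-two extension


FieldType = Union[Base, Ext]


def ext_add(x: ExtElem, y: ExtElem) -> ExtElem:
    return ((x[0] + y[0]) % P, (x[1] + y[1]) % P)


def ext_mul(x: ExtElem, y: ExtElem) -> ExtElem:
    a, b = x
    c, d = y
    return ((a * c + W * b * d) % P, (a * d + b * c) % P)


def lift(poly: FieldType) -> List[ExtElem]:
    """Dispatch on the variant and view every evaluation as an extension element."""
    if isinstance(poly, Base):
        return [(v % P, 0) for v in poly.evals]
    return [(a % P, b % P) for a, b in poly.evals]


def combine(polys: List[FieldType], challenges: List[ExtElem]) -> List[ExtElem]:
    """Random linear combination of same-length polynomials of mixed types."""
    acc = [(0, 0)] * len(polys[0].evals)
    for poly, ch in zip(polys, challenges):
        acc = [ext_add(s, ext_mul(ch, v)) for s, v in zip(acc, lift(poly))]
    return acc


mixed = combine([Base([1, 2]), Ext([(3, 4), (5, 6)])], [(1, 0), (0, 1)])
print(mixed)
```

The caller never needs to know in advance which variant each polynomial is; the dispatch happens once in `lift`, which mirrors the role the match statements play in the Rust code.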

@yczhangsjtu
Collaborator Author

Benchmarks with the RS code.

What's the difference between the two tables? Also, there's no need to put so much data.

One is for polynomials over the base field, the other for polynomials over the extension field. I've updated the description.

I'll cut this down to fewer numbers of variables and batch sizes later.

Collaborator

@dreamATD dreamATD left a comment

LGTM

@dreamATD dreamATD merged commit a225ab7 into master Sep 9, 2024
4 checks passed
@dreamATD dreamATD deleted the basefold-improve-open branch September 9, 2024 08:22
@kunxian-xia kunxian-xia linked an issue Sep 10, 2024 that may be closed by this pull request
hero78119 pushed a commit that referenced this pull request Sep 30, 2024
Main tasks accomplished by this PR:
- [x] Replace the naive batch commit (committing to individual polys) to
real batch commit, i.e., committing to multiple polynomials in a single
Merkle tree.
- [x] Add the `simple_batch_prove` and `simple_batch_verify` methods.
These methods support opening:
- One commitment that commits to multiple polynomials of the same size.
    - One opening point.
- [x] Switch the encoding algorithm from the one in BaseFold paper to
Reed Solomon code. The encoding algorithm of RS code is much faster, and
RS code has better distance so allows a better parameter.
- [x] Estimate the appropriate parameter for RS code.

(The original `batch_prove` and `batch_verify` methods supports opening
multiple commitments, multiple points and a flexible combination between
polys and points, but only allow each input commitment to contain only
one polynomial)

---------

Co-authored-by: Wisdom Ogwu <[email protected]>
Co-authored-by: dreamATD <[email protected]>
Labels
enhancement New feature or request mpcs
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

optimize batch_commit method
4 participants