Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fake user Inclusion Proof verified in contract #9

Open
rkdud007 opened this issue Apr 20, 2024 · 4 comments
Open

Fake user Inclusion Proof verified in contract #9

rkdud007 opened this issue Apr 20, 2024 · 4 comments

Comments

@rkdud007
Copy link

rkdud007 commented Apr 20, 2024

participants: @qpzm, @rkdud007

Context

We managed to generate fake inclusion proof of a non-existing user, the user id is out of the bounds of the original CSV file. As all the balance cells’ values are zero, it might not be a critical issue in the CEX use case of proof of liability, but this becomes critical issue when Summa’s core logic is utilized in more general application use cases that inclusion itself plays an important role.

The intended behavior of the blind/unblind advice column

user_name is a blind advice column, which is the original implementation of the advice column. However, balance_i columns are defined as an unblind advice column which is implemented in halo2 in this commit.

As can be seen, both the blind column and unblind column share the same region.assign_advice method to fill in the values in the table. Unlike the blind column, unblind keeps the &advice_values[unusable_rows_start..] in the Scalar::ZERO value, which is described in this line. And as we use a fixed column from range_check, basically $2^{K}$ - N_USER rows of both user_name and balance_i columns are considered unusable rows.

Meaning, to visualize commitments computation to advice column polynomials step, it would look like below :

row index user_name balance_1 balance_2 range_check
0 alice 10 100 0
1 bob 20 0 1
2 chad 0 0 2
$2^{16} - 2$ random value 1 0 0 $2^{16} - 2$
$2^{16} - 1$ random value 2 0 0 $2^{16} - 1$
_
$2^{17} - 1$ random value i 0 0 _

This case, it will not be problematic in checking total liability as this only verifies via the sum of balances, but for a small amount of possibility that corresponds with hash collision, there might be a change that someone asks for inclusion proof.

Commit with KZG not using blind column

As checked above, both blind/unblind columns pass an additional parameter Blind which blind set as a random value, and unblind set as zero.

However, if we go into the commit_lagrange function which is actually expected to utilize this param to compute commitment, it comments out the value and does not use it anywhere.

Commit with KZG:

fn commit_lagrange(
    &self,
    poly: &Polynomial<E::Scalar, LagrangeCoeff>,
    _: Blind<E::Scalar>,
) -> E::G1 {
    let mut scalars = Vec::with_capacity(poly.len());
    scalars.extend(poly.iter());
    let bases = &self.g_lagrange;
    let size = scalars.len();
    assert!(bases.len() >= size);
    best_multiexp(&scalars, &bases[0..size])
}

It’s different behavior from the IPA commitment version.

Commit with IPA :

/// This commits to a polynomial using its evaluations over the $2^k$ size
/// evaluation domain. The commitment will be blinded by the blinding factor
/// `r`.
fn commit_lagrange(
    &self,
    poly: &Polynomial<C::Scalar, LagrangeCoeff>,
    r: Blind<C::Scalar>,
) -> C::Curve {
    let mut tmp_scalars = Vec::with_capacity(poly.len() + 1);
    let mut tmp_bases = Vec::with_capacity(poly.len() + 1);

    tmp_scalars.extend(poly.iter());
    tmp_scalars.push(r.0);

    tmp_bases.extend(self.g_lagrange.iter());
    tmp_bases.push(self.w);

    best_multiexp::<C>(&tmp_scalars, &tmp_bases)
}

To sum up, we can now think this is how the table looks like:

row index user_name balance_1 balance_2 range_check
0 alice 10 100 0
1 bob 20 0 1
2 chad 0 0 2
$2^{16} - 2$ 0 0 0 $2^{16} - 2$
$2^{16} - 1$ 0 0 0 $2^{16} - 1$
$2^{17} - 1$ 0 0 0 _

And $2^{K}$ - N_USER index can generate fake inclusion proof of user_name, balance_i all set in 0 value.

PoC

We did a PoC with this commit to generate a non-existing user’s inclusion proof and check if it passed the verification of InclusionVerifier.

Basically gen_commit_and_proof.rs binary generates fake inclusion proof call data file of user index 18. Note that csv file contains only 16 lists of users.

Run with command, contract verifies the fake proof file as valid :

❯ REPORT_GAS=true npx hardhat test    

  Summa Contract
    deployment tests
      ✓ should not deploy with invalid currencies
      ✓ should not deploy with invalid byte range
      ✓ should not deploy with invalid verification key
      ✓ should not deploy with invalid snark verifier
      ✓ should not deploy with invalid grand sum verifier
      ✓ should not deploy with invalid inclusion verifier
      ✓ should not deploy if the number of cryptocurrencies is not matching the verification key 
    verify address ownership
      ✓ should verify the address ownership and store the addresses
      ✓ should revert if the caller is not the owner
      ✓ should revert if the address ownership has already been verified
      ✓ should revert if the proof of address ownership has invalid address
      ✓ should revert if the proof of address ownership has invalid chain type
      ✓ should revert if the proof of address ownership has invalid signature
      ✓ should revert if the proof of address ownership has invalid message
      ✓ should revert if requesting proof for unverified address
    submit commitment
      ✓ should submit a valid commitment
      ✓ should revert when the grand sum proof length mismatches with the total balances
      ✓ should revert a snark proof if its length is less than the grand sum proof
      ✓ should revert due to an invalid snark proof
    verify inclusion proof
      ✓ should verify inclusion proof with `verifyInclusionProof` function
      ✓ should not verify inclusion proof with wrong snark proof
      ✓ should not verify inclusion proof with wrong challenge points
      ✓ should not verify inclusion proof with value length mismatches with config

  Verifier Contracts
    Snark Proof Verifier
      ✓ should verify snark proof
      ✓ should revert with invalid proof
    GrandSum Proof Verifier
      ✓ should verify grand sum proof
    Inclusion Proof Verifier
      ✓ should verify inclusion proof
      ✓ should revert invalid inclusion proof
@qpzm
Copy link

qpzm commented Apr 21, 2024

Another example: random username hashes added at the last rows from 131066 to 131071 become valid witnesses with 0 balances.

POC

  1. To know the random appended value, adjust the prover to always set it as 97 equivalent to "a" in ASCII.
- *cell = Scheme::Scalar::random(&mut rng);
+ *cell = Scheme::Scalar::from(97);

https://github.com/summa-dev/halo2/blob/16934009f5355f9bbc6e0ae0d51bca6601ff4700/halo2_proofs/src/plonk/prover.rs#L370

  1. Then set the user index as one of the appended blind rows, especially in our case [131066..131072].
    Create an inclusion proof with [97, 0, 0].
    // @audit user_index can be any value in [131066..131072]
    let user_index = 131071;
    let challenge = omega.pow_vartime([user_index as u64]);

    let user_values = [Fp::from(97), Fp::zero(), Fp::zero()];

    let column_range = 0..N_CURRENCIES + 1;
    let mut inclusion_proof: Vec<Vec<u8>> = Vec::new();
    for column_index in column_range {
        let f_poly = advice_polys.advice_polys.get(column_index).unwrap();

        let z = if column_index == 0 {
        Fp::from(97)
        // big_uint_to_fp(entries[user_index as usize].username_as_big_uint())
        } else {
        Fp::zero()
        // big_uint_to_fp(&entries[user_index as usize].balances()[column_index - 1])
        };

https://github.com/summa-dev/summa-solvency/blob/fec83a747ead213261aecfaf4a01b43fff9731ee/prover/bin/gen_commit_and_proofs.rs#L161-L183

  1. Generate the commitment and proofs. The verification should pass.
cd ./prover 
cargo run generate_commitment_and_proofs

Recommendation

I suggest to change the username hash column to an unblinded advice column and regard {username_hash: 0, token0_balance: 0, token1_balance: 0} as invalid.
For unblinded advice columns, the random value is replaced with 0.

-         let username = meta.advice_column();
+         let username = meta.unblind_advice_column(); 

https://github.com/summa-dev/summa-solvency/blob/fec83a747ead213261aecfaf4a01b43fff9731ee/prover/src/circuits/univariate_grand_sum.rs#L75

@0xalizk
Copy link
Member

0xalizk commented May 30, 2024

critical issue when Summa’s core logic is utilized in more general application use cases that inclusion itself plays an important role.

but for a small amount of possibility that corresponds with hash collision, there might be a change that someone asks for inclusion proof.

If these two conditions hold in-circuit:

  • (a) the blinding values are random
  • (b) the filler balance values in blinding rows are zero

Doesn't that eliminate this attack?

@qpzm
Copy link

qpzm commented Jun 1, 2024

critical issue when Summa’s core logic is utilized in more general application use cases that inclusion itself plays an important role.

but for a small amount of possibility that corresponds with hash collision, there might be a change that someone asks for inclusion proof.

If these two conditions hold in-circuit:

  • (a) the blinding values are random
  • (b) the filler balance values in blinding rows are zero

Doesn't that eliminate this attack?

Thank you for considering it. I agree that these two conditions are already satisfied by this snippet.
https://github.com/summa-dev/halo2/blob/16934009f5355f9bbc6e0ae0d51bca6601ff4700/halo2_proofs/src/plonk/prover.rs#L367-L378

I would like to point out that a blinding row (random value, 0, 0) is indistiguishable from a row (real user_name hash, 0, 0) created by a user with zero balances. This allows 6 unrelated username hashes in our gen_commit_and_proof.rs example. This risk will be removed if we change the username hash column into an unblinded advice column and regard {username_hash: 0, token0_balance: 0, token1_balance: 0} as invalid.

@sifnoc
Copy link

sifnoc commented Jul 1, 2024

I would like to point out that a blinding row (random value, 0, 0) is indistiguishable from a row (real user_name hash, 0, 0) created by a user with zero balances. This allows 6 unrelated username hashes in our gen_commit_and_proof.rs example. This risk will be removed if we change the username hash column into an unblinded advice column and regard {username_hash: 0, token0_balance: 0, token1_balance: 0} as invalid.

There are assumptions in this issue:

  • The prover would generate all inclusion proofs for all users, including non-existent users, referred to as fake users.
  • Even though inclusion proofs are intended to be sent individually to each user, proofs of fake users could be shared (published or leaked) somehow.
  • A verifier wanting to verify a fake user's proof must obtain the proof from the prover.

As per your description of this issue, I agree that the fake users' proof can be verified as valid in the contract. However, the assumptions might not actually hold because the prover has no motivation to generate fake proofs in the current implementation.

But, as you pointed out in the case of 'general application',
I will respond to your concern based on the assumption that there is someone who possesses all inclusion proofs, not the prover.

Contrary to your concern, I believe that indistinguishability helps protect sensitive information about the custodian, specifically the number of customers existing in the CEX.

For example, let’s assume there is a CEX generating a commitment with K=20 param, but their actual user base is slight more than $2^{19}$.
Now, a competitor wanting to know the number of users could collect user inclusion proofs from someone who possesses all of them. By analyzing around $2^{19}$ proofs containing zero values—accounting for the difference between the $2^{20}$ polynomials and the actual users—not only in balances but also in usernames, they could easily estimate the number of users in the CEX.

Thus, while your recommendation is a good optional feature for general applications based on Summa B that aim to more explicitly show the existence of users and non-users in the future, the indistinguishability provided by the blinding factor currently helps to protect sensitive information within the general scope of Summa B.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants