Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: end-to-end modexp, ecadd, ecmul and ecpairing #22

Open
wants to merge 20 commits into
base: main
Choose a base branch
from

Conversation

NikitaMasych
Copy link

@NikitaMasych NikitaMasych commented Sep 5, 2024

What?

Implementation of modexp, ecadd, ecmul and ecpairing precompiles as circuits.

Previous PR

Basically all changes are made to boojum crate's gadgets. Previous PR for deprecated repository which has been reviewed:

era-boojum

Homogeneity with previous PR

Code has certain changes apart from contained in previous PR such as caused by rebasing and noticing missed problems.

@NikitaMasych NikitaMasych force-pushed the dl-precompiles branch 2 times, most recently from ca1c9a0 to 7e89898 Compare October 11, 2024 14:55
@NikitaMasych NikitaMasych marked this pull request as ready for review October 17, 2024 12:52
Copy link

@Fitznik Fitznik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are many places in the code where it can be optimized. I have a question about the tests; they check whether the witnesses are correctly calculated but do not check whether the constraints are satisfied. Please add this check (check_if_satisfied). This function is an important indicator of whether or not the circuits are working

let params = f.get_params();
let mut c0 = f.c0.clone();
let mut c1 = f.c1.clone();

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you use two versions here but didn't implement two versions for the other code?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We based our torus implementation on franklin-crypto. There with flag safe are following methods:

  1. square
  2. mul
  3. pow
  4. compress

mul

We did use what in franklin-crypto is marked as unsafe due to several reasons:

  1. mathematically, as you mentioned, "unsafe" version is actually quite safe
  2. computationally cheaper
  3. easier to use and maintain from the engineering perspective

pow

This one has safe flag only because square has one. Below I'll explain why we didn't use it there, thus it is not added here.

square and compress safeness:

These two actually have different taste of what unsafe flag is switching. While in square we potentially divide by zero, in compress, we at first make an enforcement that there won't be one.

overall

In production settings, we use safe version. I am struggling to imagine where the unsafe variant may be utilized, therefore we favoured the implementations of safe methods.

The flag was added for compress because it is still quite safe compared to others, and we thought that it may be still usable in some production settings.

Therefore, if we desire to achieve some homogeneity of the codebase, we better to simply remove unsafe version from compress method, or, if this has some potential, leave it in there.

// rhs = g == 0 ? zero : gamma
let zero = Fq6::zero(cs, params);
let gamma = Fq6::gamma(cs, params);
let is_zero_g = self.encoding.is_zero(cs);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The same is mention here

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean, it isn't possible that gamma could be zero

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In case of multiplication, we ensure that g + g' cannot be zero due to algebraic impossibility of that. Here, on the other hand, it is completely possible that g will be zero, leading to division by that. That's why we need the check; I mean, in code, the gamma/g must be masked to zero, if g is zero.

This is particularly what is happening in franklin-crypto:

https://github.com/matter-labs/franklin-crypto/blob/finalization/src/plonk/circuit/bigint/algebraic_torus.rs#L254

pub fn inverse<CS>(&mut self, cs: &mut CS) -> Self
where
CS: ConstraintSystem<F>,
{
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this version will be cheaper if you compute inv like a witness and prove a*a_inv = 1

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAIK, inversion in torus is a fairly cheap operation, since it's additive negation −g in Fq6​ maps to the multiplicative inverse of the corresponding element in Fq12. So, since we make inner negation there is no optimizational trick like this one.

Self::from_le_bytes(cs, bytes)
}

/// Finds the result of multiplying `self` by `other` mod `modulo`.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do mul through the ab = qP + rem

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cannot quite understand what do you mean here, can you please clarify?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you are talking about leveraging the same approach as in U256, I would rather not do that because:

  1. We don't actually use types u1024, u2048 and u4096. These were planned for modexp, but the decision was to go with u256. Since we've spent some time on these implementations, they were left.
  2. There is enhanced complexity compared to U256, because primitive_types crate, from which we take the functionality for witnesses is limited by max U512 type, meaning that in these bigger types, there would be need to build complex compositional logics.

Summarizing, there is no particular benefit to spend time on building extra optimizations for unused things, which as well may introduce additional space for error.

@@ -351,6 +358,64 @@ impl<F: SmallField> UInt256<F> {
});
Self::from_le_bytes(cs, bytes)
}

/// Finds the result of multiplying `self` by `other` mod `modulo`.
pub fn modmul<CS: ConstraintSystem<F>>(
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function has trouble with generating witnesses.
2. You check that the module is zero, which is exceptional; you should do this through the constraint is_zero.
3. You did this wrong first; it creates many inefficiencies because you perform additional range checks on the variables that were already checked.
4. It looks like you didn't wrap properly carry
// let mut carry = (q[0] * p[0] + r[0] - a[0] * b[0]) >> 16;
// carry = (q[1] * p[0] + q[0] * p[1] + r[1] - a[1] * b[0] - a[0] * b[0] + carry) >> 16;

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed second point. I am not particularly sure what to change in 3-rd and 4-th, can you please explain wdym?

crates/boojum/src/gadgets/u256/mod.rs Outdated Show resolved Hide resolved
let (rhs, overflow) = rhs.overflowing_add(cs, &r_u512);
Boolean::enforce_equal(cs, &overflow, &bool_false);

let are_equal = UInt512::equals(cs, &lhs, &rhs);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dont do this twice; write enforce_equal for UInt512

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No enforce_equal is implemented for any uint gadgets unfortunately


/// Returns `true` if `self >= other`, and `false` otherwise.
/// Here, `self` and `other` are represented as `UInt512<F>` and `UInt256<F>` respectively.
pub fn geq_than_u256<CS>(&self, cs: &mut CS, other: &UInt256<F>) -> Boolean<F>
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think here could be optimization for this; we have a special type of gate UIntXAddGate
see u16_long_subtraction_noborrow_must_borrow

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see that it is also used in u16 and u32 implementations. However, can't use it in u256 directly because max supported width for this gate is 32. It is though used under the hood in UInt256::overflowing_sub -> UInt32::overflowing_sub_with_borrow_in, so I suppose the optimization via this gate already holds 🙂

@NikitaMasych
Copy link
Author

Hey there, @Fitznik !

I've marked as resolved all undisputable issues, which I resolved in code, there are left a few ones which I need your agreement/clarification

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants