
Thoroughly change NoiseDistribution #127

Merged: 14 commits, Oct 4, 2024

Conversation

ludopulles (Contributor)

This pull request contains a lot of modifications to the code base; in particular, the file estimator/nd.py has been changed. The main contribution is four-fold:

  1. Currently, NoiseDistribution is one bulky class containing all the code specific to the different distributions (DiscreteGaussian, Uniform, CenteredBinomial and SparseTernary), with case-by-case logic in various places to return the correct value.
    This pull request makes NoiseDistribution look like an "interface" with one instantiation per distribution, plus helper functions (DiscreteGaussianAlpha, UniformMod, SparseBinary, Binary, Ternary) that construct these instances. This improves the usability of NDs, makes it easier to extend to more distributions/functions, and gives more flexibility to attacks that target specific NDs (e.g. the MITM attack in estimator/lwe_guess.py, which has code specific to SparseTernary). A minimal sketch of this layout follows the list below.

  2. Moreover, there was a TODO in the SparseTernary code noting that access to the variables p and m was needed to compute the actual support_size. That is now very easy, so I did it. The code in estimator/lwe_guess.py has been adjusted to compute the right probability, with the logic moved into SparseTernary so that it can be reused in other places (lwe_dual.py perhaps?). This changes the attack costs, generally lowering them by 1 or 2 bits of security (the search space is a ~6√3/(πn) fraction smaller), but sometimes, oddly, increasing them (probably because MitM splits the secret into two parts, and one of the two may have a higher density, hence a bigger search space).

  3. The parameter n does not need to be given to a NoiseDistribution (especially a secret!) when creating an LWEParameters object, because __post_init__ sets n to the right value. Thus, n is removed from all example code. In particular, the docs constructed the secret with a different n than the n of LWEParameters, which shows that this redundancy should be removed to prevent errors. Note: I also moved n to be the last parameter of SparseTernary to make this possible, and to make it more consistent with the other constructors.

  4. Small fixes in the documentation.
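
To make point 1 concrete, here is a minimal sketch of the interface-style layout (my own illustration, not the estimator's actual code). Subclasses carry their own logic and self-report properties such as is_sparse, and in SparseTernary the dimension n comes last and is optional, as described in point 3.

from dataclasses import dataclass
from math import comb


class NoiseDistribution:
    # Common interface; each distribution is its own subclass.

    @property
    def is_sparse(self):
        return False


@dataclass
class SparseTernary(NoiseDistribution):
    p: int         # number of +1 coefficients
    m: int         # number of -1 coefficients
    n: int = None  # dimension: last and optional, so LWEParameters can set it

    @property
    def is_sparse(self):
        return True

    def support_size(self):
        # Exact count: choose positions for the +1s, then for the -1s.
        return comb(self.n, self.p) * comb(self.n - self.p, self.m)

Helper constructors such as Ternary or SparseBinary would then simply return instances of these classes.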

Random quirk: running the doctest on estimator/util.py gave me an error (I wonder how this passed CI):

>>> LWE.dual(params)
rop: ≈2^103.4, mem: ≈2^63.9, m: 904, β: 251, d: 1928, ↻: 1, tag: dual
>>> dual_hybrid(params)
rop: ≈2^92.1, mem: ≈2^78.2, m: 716, β: 170, d: 1464, ↻: 1989, ζ: 276, h1: 8, tag: dual_hybrid
rop: ≈2^91.6, mem: ≈2^77.2, m: 711, β: 168, d: 1456, ↻: ≈2^11.2, ζ: 279, h1: 8, tag: dual_hybrid
malb (Owner)

Why does this change?

ludopulles (Author)

This is because the search space of sparse ternary secrets becomes smaller when you take the numbers of +1's and -1's into account. Related to the discussion below, I'll also put these doc changes into a different PR.
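
To illustrate with made-up numbers (the dimension n and Hamming weight h below are example values, not taken from the estimator):

from math import comb, log2

n, h = 1024, 128     # example dimension and Hamming weight
p = m = h // 2       # balanced numbers of +1s and -1s

generic = comb(n, h) * 2**h             # choose positions, then signs freely
exact = comb(n, p) * comb(n - p, m)     # fix the counts of +1s and -1s

print(round(log2(generic) - log2(exact), 1))  # ~3.8 bits smaller here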

@@ -428,7 +424,7 @@ def __call__(

         params = params.normalize()

-        if params.Xs.is_sparse:
+        if type(params.Xs) is SparseTernary:
malb (Owner) commented Sep 20, 2024

is_sparse could cover more than merely SparseTernary. I think we should keep is_sparse and have classes self-report True or False?

ludopulles (Author)

With the current sparsity cutoff at 1/2, I think the only other sparse noise distributions are:

  • Uniform(x, x): a constant value is not a secret, so this makes LWE trivially broken, and the code will produce errors.
  • DiscreteGaussian with a very small stddev: I believe the current code does not handle such discrete Gaussians accurately. Namely, experimenting with stddev=0.5 shows me that density and support_size currently compute under- and overestimates, respectively. (I'm happy to create an issue for this if needed!)

The reason I changed it to SparseTernary is to 1) take the numbers of +1's and -1's into account and 2) remove the calculation using the bounds, as these overestimate the actual support size in the DiscreteGaussian case above.
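
As a quick check of the stddev=0.5 case (my own snippet, assuming the usual discrete Gaussian weights proportional to exp(-x^2 / (2*sigma^2))):

from math import exp

sigma, tail = 0.5, 20
weights = {x: exp(-x * x / (2 * sigma * sigma)) for x in range(-tail, tail + 1)}
total = sum(weights.values())

density = 1 - weights[0] / total  # probability that a coefficient is non-zero
print(density)                    # ~0.21, i.e. below the 1/2 sparsity cutoff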

Let me split this PR up into just the ND refactoring, and I'll put the SparseTernary changes into a separate one.

malb (Owner)

Sounds good. FWIW I was also thinking about future-proofing the code :)

ludopulles (Author)

Hmm, I created a branch for just the ND refactoring:
main...ludopulles:lattice-estimator:refactor-ND

But it seems there are some weird cases where SparseTernary is called with a half-integer p. Is it okay not to split up the PR, but instead keep the generic "is_sparse" code with a special case for SparseTernary inside it?

malb (Owner)

What do you mean by "keep the generic is_sparse code with a special case for SparseTernary inside there"? An attribute that checks isinstance(SparseTernary)?

ludopulles (Author)

Basically this commit: 31957df

I meant reverting the code back to "if params.Xs.is_sparse:" and then, inside this if-statement, adding code specific to SparseTernary to arrive at the right estimates, because the previous generic sparse-secret code did not take the weight and the +1/-1 counts of SparseTernary into account.
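
In code, the shape I have in mind is roughly this (a sketch; the helper names generic_sparse_support_size, cost_from_search_space and dense_cost_estimate are placeholders, not the estimator's API):

def sparse_cost_estimate(params):
    if params.Xs.is_sparse:
        if isinstance(params.Xs, SparseTernary):
            # exact combinatorics: the +1/-1 counts are fixed
            size = params.Xs.support_size()
        else:
            # generic fallback based only on the secret's bounds/density
            size = generic_sparse_support_size(params.Xs)
        return cost_from_search_space(size)
    return dense_cost_estimate(params)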

ludopulles added a commit to ludopulles/lattice-estimator that referenced this pull request Sep 23, 2024
Now use the general NoiseDistribution.is_sparse method to do a different cost estimation for sparse secrets, but when the secret is sparse ternary, use the splitting methods from that noise distribution to get the precise costs.

In the case of lwe_dual, there was only code that worked for SparseTernary, so crash if other objects are passed in.
In the case of lwe_guess, use an estimate: the probability of being non-zero is assumed to be the same for all values in the bounded range.

See: malb#127 (comment)
bencrts (Collaborator) commented Sep 25, 2024

Just a note that #130 will have some merge conflicts with this; I can always wait for this PR to be merged and then rebase over there based on any changes.

ludopulles (Author)

Thanks for the heads up; it should be fairly easy to make it into a class after this PR.

I think I'm done here. I'm waiting to hear whether the changes in here are OK:

Goal: put every type of NoiseDistribution into its own class, which inherits from the NoiseDistribution class.
Benefit: when doing things specific to one type of noise, you can put all the logic in that class, giving more flexibility for custom behaviour and improving readability!
And, while we are at it, make a beautiful function that splits a sparse ternary distribution into two while preserving its Hamming weight (see the sketch below).
Also: add SparseBinary, Binary, Ternary as noise distributions :).
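
For the splitting function, something along these lines (a hypothetical sketch; the PR's actual method name and rounding may differ):

def split(p, m, n, n1):
    # Split a SparseTernary secret with p ones and m minus-ones over
    # dimension n into parts of dimension n1 and n - n1, distributing the
    # non-zeros proportionally so the total Hamming weight p + m is preserved.
    p1 = round(p * n1 / n)
    m1 = round(m * n1 / n)
    return (p1, m1, n1), (p - p1, m - m1, n - n1)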

Note: this is a breaking change.
You have to change the way you construct ND.SparseTernary objects!
Benefit: you don't have to specify the dimension `n` anymore when putting it in an LWEParameters/NTRUParameters object :).
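
For illustration, at a call site the change might look like this (signatures assumed from the description above, not copied from the repo):

# before: the dimension had to be passed to the distribution itself
Xs = ND.SparseTernary(1024, p=16, m=16)

# after: n comes last and may be omitted; LWEParameters fills it in
Xs = ND.SparseTernary(p=16, m=16)
params = LWE.Parameters(n=1024, q=2**15, Xs=Xs, Xe=ND.DiscreteGaussian(3.19))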
Use the fastest method for rough estimates so mitm=false and babai=false!
malb (Owner) commented Oct 4, 2024

I'm happy to merge this, unless there's something on your todo list before?

malb (Owner) commented Oct 4, 2024

It's really nice, thank you!

ludopulles (Author)

There are no TODOs for now, so ready to merge 😃.

malb merged commit 5c25455 into malb:main on Oct 4, 2024 (2 checks passed).
ludopulles deleted the change-ND-fix-ST branch on October 4, 2024 at 12:07.