Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add discrete Laplace noise w/ SumVec and Histogram #3302

Merged
merged 7 commits into from
Jul 23, 2024

Conversation

divergentdave
Copy link
Contributor

@divergentdave divergentdave commented Jul 11, 2024

This closes #3301. Depends on divviup/libprio-rs#1072.

@divergentdave divergentdave marked this pull request as ready for review July 19, 2024 17:23
@divergentdave divergentdave requested a review from a team as a code owner July 19, 2024 17:23
Copy link
Contributor

@tgeoghegan tgeoghegan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like straightforward plumbing changes.

@@ -443,3 +458,60 @@ async fn janus_in_process_one_round_with_agg_param_time_interval() {
)
.await;
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Might as well check the plumbing of Prio3SumVec with a test, i.e. janus_in_process_sumvec_dp_noise()?

let mut un_noised_result = [0u128; HISTOGRAM_LENGTH];
un_noised_result[0] = report_count.into();
// Smoke test: Just confirm that some noise was added. Since epsilon is small, the noise will be
// large (drawn from Laplace_Z(20) + Laplace_Z(20)), and it is highly unlikely that all 100
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Naive: what is _Z? (you can point me to a paper :D).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a lo-fi version of the notation $Lap_\mathbb{Z}(t)$ from section 4 of https://arxiv.org/pdf/2004.00010. (where blackboard Z represents the set of the integers)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also naive: there is no bound on the noise that could be applied to the result, yes? I'm mildly uncomfortable that the set of all valid results for this test is "all integers except for n", but seeing as the test fails if I turn off DP, I'm fine with it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct, the noise value could be any integer, thus we could be adding any field elements to the shares. (Though the probabilities become astronomically low once you get many multiples of the scale parameter out)

I also did a negative test where I cranked epsilon way up (to u128::MAX) and confirmed that the test failed then. Even though it was still adding noise, the noise was all zeros because of the absurd parameter choice.

We could improve this with some sort of statistical test that takes each dimension of the noise vector as one sample of the noise distribution, and then performs a statistical test to see how good of a fit it is for the desired distribution. (with a low enough false positive rate to not cause problems) I filed #3325 to track this as a follow-up. It'll be a bit tricky because the distribution of the sum of two numbers drawn from base distributions is the convolution of those distributions, plus I'll need to read up on statistical tests.

@divergentdave divergentdave enabled auto-merge (squash) July 23, 2024 19:36
@divergentdave divergentdave merged commit 3eab5d4 into main Jul 23, 2024
8 checks passed
@divergentdave divergentdave deleted the david/dp-sumvec-histogram branch July 23, 2024 20:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Incorporate DP into Prio3SumVec tasks
3 participants