Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Should PQCP implementations ship randombytes() #86

Closed
mkannwischer opened this issue Jul 19, 2024 · 10 comments
Closed

Should PQCP implementations ship randombytes() #86

mkannwischer opened this issue Jul 19, 2024 · 10 comments

Comments

@mkannwischer
Copy link
Contributor

mkannwischer commented Jul 19, 2024

During the 2024-07-18 TSC meeting, we were discussing #4.

Let me try to summarize (please correct me if I got something wrong):
There appears to be consensus that the derandomized APIs should be supported by all implementations in the PQ Code Package.
The discussion then drifted towards if the randomized APIs should also be required or not. Some opinions were raised that the derandomized versions are much easier to integrate into other libraries/software as secure randomness can be provided from the outside. This lead to discussions whether or not we should take on the responsibility of providing a good source of randomness at all or if this should be expected to be provided by the user.

While these discussions were mostly about APIs, I think it makes sense to discuss this separately: Do we want to include a decent default randombytes implementation or not?
If we don't, then not supporting the randomized APIs sounds like a reasonable choice as it will ease the integration.
If we do, then the randomized APIs are easier to use and harder to misuse.
Some strong opinions on this have been voiced, and the conclusion was that we need to discuss this in a broader group. We may want a TSC vote on this in case we want this to be consistent across projects.

@mkannwischer
Copy link
Contributor Author

My view is that the developers and maintainers of pqcp are (hopefully) more qualified to pick a randombytes implementation that is less likely going to result in disasters. This at least applies to common platforms and operating systems.
This may not matter for integrations into other big crypto libraries like liboqs or mbedtls, but it does matter for smaller projects developed by people with less cryptography experience.

I'd very much prefer to provide a good default choice even if we want to document that user should replace the randomness generation with their own suitably secure randomness generator. I think we should make it easy to replace it, but very hard to misuse.

I'm worried that we will see pqcp code being used in the wild with hard-coded randomness - because who really knows what coins is supposed to be? I am not sure how we can avoid this if we really decide to only support the derandomized APIs.

To confirm my point, I asked ChatGPT to use the API we proposed in #4 (mostly for fun - I don't think this is what we should worry about most). It produced the following code:

...
uint8_t coins[32];     // Random coins

// In a real-world scenario, you would obtain these from a proper source
for (int i = 0; i < 32; i++) {
    coins[i] = (uint8_t)(255 - i);
}

// Call the pqcp_mlkem512_ref_enc_derand function
int result = pqcp_mlkem512_ref_enc_derand(ct, ss, pk, coins);

This results in convieniently useless crypto. At least ChatGPT seems to know that you should not do it like this, but who will really read that comment here?
I am also curious where it learned that this is a good example of initializing randomness.

@hanno-becker
Copy link
Contributor

hanno-becker commented Jul 19, 2024

I lean towards thinking that providing a good source of randomness is a difficult problem on its own that is outside of the scope of PQCP and should be handled by whatever higher-level stack PQCP will be used with.

Some thoughts on the risk @mkannwischer flags regarding people directly using the PQCP API and doing it badly by providing poor random coins:

  • As @mkannwischer says, we should distinguish the question of scope from the question of API. The issue of a coins parameter that's tempting to use with 0's first and then forget about it, is one of API, not one of scope. If we decide randomness is out of scope, one could instead consider an API with an unimplemented randombytes hook that users need to link against, and instead of a coins parameter the user has to pass a randomness context. Something like int mlkem512_ref_enc(uint8_t *ct, uint8_t *ss, const uint8_t *pk, void *rnd_ctxt); where rnd_ctxt is an opaque pointer passed to a user-provided stateful randombytes (this is a common pattern in Mbed TLS at least). I'm not saying that this is necessarily the best API, but it would be easier to flag in the documentation of randombytes() that this MUST be a proper source of randomness, and it takes more effort for a user to ignore -- so certainly there's options to consider other than necessarily having a plain coins parameter.
  • How likely are customers of PQCP that do just use PQCP and not also another cryptographic stack? Using just PQCP seems like a bad idea because you should be using some hybrid PK scheme, not solely rely on MLKEM. So I'm not sure if we even serve the customer well by making it easy to use PQCP standalone.

@cothan
Copy link
Contributor

cothan commented Jul 19, 2024

I lean towards providing randombytes() in PQCP code.
I like the idea of "it works." When "it doesn't work", the as-is PQCP randombytes() "shouldn't compile", then the developer must sit down and read the manual.

  • PQCP should ship the randombytes() function for the "it works!" idea. This will make the majority happy.
  • When it doesn't work, any potential error should appear at compile time, IMO. In that case, I'm sure the developers know what they are dealing with on their platforms/OS, perhaps they already have a solution somewhere in their stack.

I'm concerned that if randombytes() isn't included in PQCP or if it complicates the API, things can easily go wrong.

@dstebila
Copy link
Contributor

If there's a randomized version in the API, then PQCP is going to need an implementation of randombytes at least for testing. So then the question becomes whether you make that testing implementation a serious implementation or not. If the testing implementation isn't done properly, there's a risk that someone will copy that code, thinking it's okay, even though it's not.

@franziskuskiefer
Copy link

A question we need to answer here is how we expect folks to use the code here.

  1. Integration into an existing crypto library will most likely NOT require a randombytes function. No serious crypto library can do without a proper rng.
  2. Integration into other (potentially, not crypto) code. This would be the "it works" case as @cothan called it, and would require a randombytes function.

If we actually expect usage outside of crypto library, we have to provide randomness.
I personally think people really shouldn't use it in this way. But at the same time do I expect people to actually use it that way 😔. Which implies that we should provide something.

@planetf1
Copy link
Contributor

One of the consumers of the implementations here in general would be something like liboqs - effectively a cryptolibrary. That would likely take the responsibility to provide a rng. Or either via liboqs or directly into say OpenSSL - again with rng. All examples of 1. above.

This seems most likely, but I still agree with @franziskuskiefer 's last point that we do include a rng (with clear documented caveats etc) since some I think will use directly

@hanno-becker
Copy link
Contributor

hanno-becker commented Sep 10, 2024

I think the barrier created by not providing an RNG is a more effective way of conveying that there's non-trivial work to be done here, than providing an insecure default and aiming to grab people's attention through documentation and warning messages.

I remain opposed to shipping randombytes() -- it's out of scope IMO.

mkannwischer added a commit to pq-code-package/mlkem-native that referenced this issue Oct 28, 2024
This removes our current randombytes() implementation and replaces it with a deterministic randombytes() based on surf (https://cr.yp.to/hash.html#surf).

Resolves #260.

As per pq-code-package/tsc#86, the consensus of the PQCP TCP is that a secure implementation of randombytes() should be provided by the software consuming code from the PQCP.

For testing, we should use a deterministic randombytes() that no one is going to put into a production system. I propose to use the popular implementation by Daniel J. Bernstein based on surf (https://cr.yp.to/hash.html#surf).
This implementation is clearly not secure which hopefully decreases the risk of someone just taking it.

I called it notrandombytes.c and moved it into the test folder to decrease the risk of misuse.

Signed-off-by: Matthias J. Kannwischer <[email protected]>
mkannwischer added a commit to pq-code-package/mlkem-native that referenced this issue Oct 28, 2024
This removes our current randombytes() implementation and replaces it with a deterministic randombytes() based on surf (https://cr.yp.to/hash.html#surf).

Resolves #260.

As per pq-code-package/tsc#86, the consensus of the PQCP TCP is that a secure implementation of randombytes() should be provided by the software consuming code from the PQCP.

For testing, we should use a deterministic randombytes() that no one is going to put into a production system. I propose to use the popular implementation by Daniel J. Bernstein based on surf (https://cr.yp.to/hash.html#surf).
This implementation is clearly not secure which hopefully decreases the risk of someone just taking it.

I called it notrandombytes.c and moved it into the test folder to decrease the risk of misuse.

Signed-off-by: Matthias J. Kannwischer <[email protected]>
@mkannwischer
Copy link
Contributor Author

My impression was that in the TSC meetings the consensus was that we do not ship a randombytes(). Can we close this issue then?

@planetf1
Copy link
Contributor

planetf1 commented Nov 7, 2024

I think that's the consensus too, I think the group were keen to get your final input. Let's confirm on call today?

@mkannwischer
Copy link
Contributor Author

This was discussed during the 11/21 TSC meeting again and there are no objections. We can close this issue.

@github-project-automation github-project-automation bot moved this from TSC Discussion to Done in TSC tracking Nov 21, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Done
Development

No branches or pull requests

6 participants