
MessageEncryptor design concerns #35

Closed
btoews opened this issue Oct 4, 2023 · 14 comments
@btoews

btoews commented Oct 4, 2023

I was reading through the MessageEncryptor implementation and have some concerns. Nothing appears to be outright broken, so I feel like it's appropriate to have this conversation in public.

The thing closest to being broken is that long keys are being truncated to 32 bytes:

aes128_gcm_encrypt(plain_text, aad, binary_part(secret, 0, 32), sign_secret)

This is pretty sketchy because you're reducing the user's key size without their awareness. AES has fixed key sizes, so it would be much better to return an error if a key of the wrong size is given. If you want to keep the API easy to use, you should pass keys through a KDF. Erlang's :crypto.pbkdf2_hmac/5 function would be suitable for this.
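A minimal sketch of that suggestion, assuming HMAC-SHA256 and an illustrative salt and iteration count (the salt, label, and count here are placeholders, not what the library uses):

```elixir
# Derive exactly 32 bytes from arbitrary-length keying material instead of
# truncating it. Salt and iteration count are illustrative placeholders.
user_secret = "correct horse battery staple"
salt = "message_encryptor.aes_256_gcm"
key = :crypto.pbkdf2_hmac(:sha256, user_secret, salt, 1000, 32)

byte_size(key)
# => 32, regardless of byte_size(user_secret)
```

Every byte of the user's input influences the derived key, so nothing is silently discarded.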

To give a concrete example of why this is concerning, imagine a user storing their key in PEM format for some reason:

-----BEGIN AES-256-GCM KEY-----
lvAGFJD1EpnaH+IkgrbWzQSpAWH/yT78/dN76cDzXOE=
-----END AES-256-GCM KEY-----

Passing that into MessageEncryptor, their key is now functionally -----BEGIN AES-256-GCM KEY-----\n.
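To make the truncation concrete, here is what `binary_part/3` actually keeps of that PEM blob (assuming it is passed in verbatim):

```elixir
pem = """
-----BEGIN AES-256-GCM KEY-----
lvAGFJD1EpnaH+IkgrbWzQSpAWH/yT78/dN76cDzXOE=
-----END AES-256-GCM KEY-----
"""

# The header line plus its newline is exactly 32 bytes, so the "key"
# is the public PEM header and none of the actual key material.
binary_part(pem, 0, 32)
# => "-----BEGIN AES-256-GCM KEY-----\n"
```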

My other concerns are with the overall encryption scheme. In pseudocode, the scheme is

def encrypt(message, aad \\ "A128GCM", secret, sign_secret) do
  k1 = rand_bytes(16)
  iv1 = rand_bytes(12)
  {ct1, tag1} = aes_gcm(%{key: k1, iv: iv1, aad: "A128GCM", pt: message})

  iv2 = rand_bytes(12)
  {ct2, tag2} = aes_gcm(%{key: secret, iv: iv2, aad: sign_secret, pt: k1})

  [ct2, tag2, iv2, iv1, ct1, tag1]
  |> Enum.map(&Base.encode64/1)
  |> Enum.join(".")
end

There's a fair bit of undue complexity here. Most notably, sign_secret is serving no purpose as far as I can tell. GCM is an authenticated mode, so the ciphertext is already protected from tampering. I can't think of a scenario where users would benefit from the ability to roll these secrets separately. Requiring sign_secret does make the API more difficult to use, though. Unless there's some purpose I'm missing, I'd recommend removing it. If there is a purpose, it would be nice to have it documented.

I also question the key wrapping scheme. If you were expecting users to decompose the returned blob and store the first and second ciphertexts separately, there's conceivably some value to it. This is meant to be a black-box API though, so I don't imagine users are doing that. As with sign_secret, I'd recommend documenting any purpose the key wrapping scheme serves and removing it if there is none.

In the same pseudocode, I think you'd be better served by the following implementation:

def encrypt(message, aad \\ "", secret, sign_secret) do
  iv = rand_bytes(12)
  {ct, tag} = aes_gcm(%{key: secret, iv: iv, aad: aad, pt: message})

  [iv, ct, tag]
  |> Enum.map(&Base.encode64/1)
  |> Enum.join(".")
end
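A runnable sketch of this simpler single-layer scheme, using OTP's `:crypto` AEAD API. The module name is hypothetical and this is not the actual MessageEncryptor implementation, just an illustration of the proposed shape:

```elixir
defmodule SimpleEncryptor do
  # Hypothetical sketch of the proposed single-layer AES-256-GCM scheme.
  # Requires an exact-size 32-byte key rather than silently truncating.
  def encrypt(message, aad, secret) when byte_size(secret) == 32 do
    iv = :crypto.strong_rand_bytes(12)
    {ct, tag} = :crypto.crypto_one_time_aead(:aes_256_gcm, secret, iv, message, aad, true)
    Enum.map_join([iv, ct, tag], ".", &Base.encode64/1)
  end

  def decrypt(token, aad, secret) when byte_size(secret) == 32 do
    [iv, ct, tag] = token |> String.split(".") |> Enum.map(&Base.decode64!/1)
    # Returns the plaintext, or :error if the tag or AAD doesn't verify.
    :crypto.crypto_one_time_aead(:aes_256_gcm, secret, iv, ct, aad, tag, false)
  end
end
```

Tampering with any of the three segments, or supplying different AAD, makes `decrypt/3` return `:error`.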
@tqbf

tqbf commented Oct 4, 2023

The conventional answer for how to handle arbitrary-sized keys as an input to a function like this would be HKDF, right?

It appears to me, given the terminology this code is using, that the key wrapping is taken from JWE. JWE does this in order to implement hybrid encryption with RSA and ECC, where you can't encrypt the whole message with the "outer" key. It doesn't really make sense for GCM (or any other AEAD).
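For reference, OTP's `:crypto` has no built-in HKDF, but per RFC 5869 it is just two HMAC calls. A minimal HKDF-SHA256 sketch for a single output block (≤ 32 bytes); the salt and info strings here are illustrative placeholders:

```elixir
# Minimal HKDF-SHA256 sketch (RFC 5869), single output block.
ikm = "arbitrary-length input keying material"
salt = <<0::256>>                       # illustrative all-zero salt
info = "message_encryptor aes-256-gcm"  # illustrative context string

prk = :crypto.mac(:hmac, :sha256, salt, ikm)           # extract
okm = :crypto.mac(:hmac, :sha256, prk, info <> <<1>>)  # expand, block T(1)

byte_size(okm)
# => 32
```

A full implementation would iterate the expand step for longer outputs, but one HMAC-SHA256 block already covers an AES-256 key.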

@btoews
Author

btoews commented Oct 4, 2023

The conventional answer for how to handle arbitrary-sized keys as an input to a function like this would be HKDF, right?

Yeah. It doesn't look like the underlying crypto library has an HKDF implementation though. Given that users might give passwords instead of keys to this API, I feel like PBKDF is fine.

@josevalim
Member

Thank you! I will drop more comments and a PR tomorrow but for additional context:

  1. the front-end to this module is in Plug.Crypto and there is a KDF in there

  2. the sign secret in there is an empty string already and I believe it was kept as is mostly for backwards compatibility. We should indeed allow it to be nil and skip the key wrapping altogether :)

@josevalim
Member

/cc @potatosalad

@josevalim
Member

I couldn't sleep, here is a PR: #36

The PR no longer uses the sign secret. However, we still keep it around for decrypting old messages. The goal is to release it as v2. This means v2 messages cannot be decrypted by v1, but v2 can still decrypt v1.

Feedback is welcome!

@potatosalad
Contributor

potatosalad commented Oct 4, 2023

Key wrapping in this context does provide some benefits against key exhaustion. Specifically, AES-128-GCM unfortunately suffers from the problem where the same nonce used with the same key can result in a "catastrophic" leak where other ciphertexts may be decrypted (see https://crypto.stackexchange.com/a/68525 and https://datatracker.ietf.org/doc/html/rfc8645).
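The "catastrophic" part can be demonstrated directly: GCM is CTR mode underneath, so reusing a key/nonce pair means the same keystream is XORed into both plaintexts, and XORing the two ciphertexts cancels the keystream out entirely. A sketch (deliberately misusing the API — never reuse an IV like this):

```elixir
key = :crypto.strong_rand_bytes(16)
iv = :crypto.strong_rand_bytes(12)  # reused below -- the fatal mistake

{ct1, _tag1} = :crypto.crypto_one_time_aead(:aes_128_gcm, key, iv, "attack at dawn!", "", true)
{ct2, _tag2} = :crypto.crypto_one_time_aead(:aes_128_gcm, key, iv, "defend at dusk!", "", true)

# XOR of ciphertexts equals XOR of plaintexts: the keystream cancels.
:crypto.exor(ct1, ct2) == :crypto.exor("attack at dawn!", "defend at dusk!")
# => true
```

An attacker who knows (or can guess) one plaintext recovers the other, and nonce reuse in GCM additionally leaks enough to forge authentication tags.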

Is the primary goal to reduce the overall size of the encrypted output or is it simply to reduce complexity of the implementation?

I agree that the user-provided keying information should be derived, and HKDF is probably the right option for that use-case (PBKDF2 also works, but it's much slower and designed for passwords). OTP 26 also provides XOF functions for SHAKE128 and SHAKE256, which could potentially be used for KDF, too (see https://www.erlang.org/doc/man/crypto#hash_xof-3).

@josevalim
Member

@potatosalad our usage always derives the keys and we can make that extra clear in the docs. Do we need key wrapping even if using a KDF? Hopefully the PR can provide a reference point but I am also glad to drop it.

I agree that the user-provided keying information should be derived, and HKDF is probably the right option for that use-case

Unfortunately the KDF is not part of the signed/encrypted value, so we cannot change it between versions. We could allow users to opt-in into new implementations. In any case, we do cache the key derivation parts, so performance may not be a major concern?

@potatosalad
Contributor

Do we need key wrapping even if using KDF?

It depends: if the KDF output doesn't change very often, it eventually amounts to using the original or master key without a KDF.

I agree that the user-provided keying information should be derived, and HKDF is probably the right option for that use-case

Unfortunately the KDF is not part of the signed/encrypted value, so we cannot change it between versions. We could allow users to opt-in into new implementations. In any case, we do cache the key derivation parts, so performance may not be a major concern?

KDF in this context would be better than binary_part in the case where the keying information is larger than 128-bits (better to derive the key we need versus throwing away keying information). Caching with PBKDF2 is probably fine. HKDF is simpler to implement compared to PBKDF2; it just uses HMAC underneath. XOF functions can also theoretically be used as a KDF, but it's a little early for them to be available everywhere (OTP 26 only recently added them to the crypto library).

@potatosalad
Contributor

If we wanted to drop the key wrapping entirely, it might make sense to have the "v2" use something like xchacha20-poly1305 which doesn't suffer from the same catastrophic failure (2^96 messages can be sent with the same key versus 2^32 for others before the birthday paradox problem is reached).
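One practical wrinkle: OTP's `:crypto` exposes ChaCha20-Poly1305 (96-bit nonce) directly, but not the extended-nonce XChaCha20 variant, which would need the HChaCha20 subkey step layered on top. A sketch of what the available primitive looks like:

```elixir
# ChaCha20-Poly1305 as exposed by OTP's :crypto (96-bit nonce).
# XChaCha20-Poly1305 (192-bit nonce) is not built in and would require
# implementing HChaCha20 on top of this.
key = :crypto.strong_rand_bytes(32)
iv = :crypto.strong_rand_bytes(12)

{ct, tag} = :crypto.crypto_one_time_aead(:chacha20_poly1305, key, iv, "hello", "aad", true)
:crypto.crypto_one_time_aead(:chacha20_poly1305, key, iv, ct, "aad", tag, false)
# => "hello"
```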

@btoews
Author

btoews commented Oct 5, 2023

The current nonce is 12 bytes, so you'd get ~2^48 messages before there's a 50% chance of a collision. If MessageEncryptor is intended to be used for that many messages, then re-keying is definitely reasonable. 👍 for switching to xchacha20-poly1305 if we're talking about a v2 anyway.
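The ~2^48 figure is the usual birthday bound: with random draws from a space of N values, a collision becomes likely after roughly sqrt(N) draws. For a 96-bit nonce:

```elixir
# Birthday bound: ~sqrt(N) random draws from N values for ~50% collision odds.
nonce_space = 2 ** 96  # 12-byte (96-bit) random nonce
messages_until_likely_collision = round(:math.sqrt(nonce_space))

messages_until_likely_collision == 2 ** 48
# => true
```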

@potatosalad
Contributor

The current nonce is 12 bytes, so you'd get ~2^48 messages before there's a 50% chance of a collision.

@btoews You're correct, I mixed up the collision problem with the guidance specific to AES-GCM found in NIST SP 800-38D:

The probability that the authenticated encryption function ever will be invoked with the same IV and the same key on two (or more) distinct sets of input data shall be no greater than 2^-32.

The total number of invocations of the authenticated encryption function shall not exceed 2^32, including all IV lengths and all instances of the authenticated encryption function with the given key.

By comparison, AES-GCM-SIV is designed to allow for some IV collisions without the "catastrophic failure" found in AES-GCM and can support 2^50 messages, depending on the size of the messages, if you want to stay below a 2^-32 chance of collision.

XChaCha20-Poly1305 has a 2^96 message limit, with a 2^80 message limit below the 2^-32 chance of collision.

Reference of my current understanding:

Algorithm            2^-32 Chance of Collision    Message Size Limit
AES-GCM              2^32                         64GB
AES-GCM-SIV          2^50                         64GB (?)
ChaCha20-Poly1305    2^40 (?)                     256GB
XChaCha20-Poly1305   2^80                         1ZB (?)

@josevalim
Member

I have updated the pull request to use ChaCha20-Poly1305, can you please review and let me know? Thank you.

@josevalim
Member

josevalim commented Oct 5, 2023

The link for convenience: #36

@btoews
Author

btoews commented Oct 6, 2023

@josevalim Thanks for listening to my concerns and responding to them so quickly! ❤️
