Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New Groups support #42

Merged
merged 55 commits into from
Sep 6, 2023
Merged

New Groups support #42

merged 55 commits into from
Sep 6, 2023

Conversation

jagerman
Copy link
Member

@jagerman jagerman commented Aug 3, 2023

Adds a group info config type, which tracks distributed info for v2 groups.

This requires introducing/using a few new concepts not currently used for user config messages:

  • Multiple decryption keys. User config doesn't do this at all (rather it generates a single decryption key from the private key for each namespace).

This doesn't yet add support for sharing and distributing those keys, just for having being able to load a config with a list of multiple possible keys.

  • Config signing and verification. For user configs this isn't done, since only the owner can actually encrypt/decrypt a config message, just being able to decrypt it is authentication enough.

This required various modifications to make the config library properly prevent modifications when we can't modify, and to properly follow that through in terms of merging, updates, etc.

TODO:

  • storage server subaccount auth signature generation
  • storage server subaccount authentication helper
  • C API for user groups group_info
  • convo info volatile entry for group
  • support for wrapping/unwrapping a config in a protobuf wrapper, for existing config types (moved to PR Protobuf handling #51 instead of here)
  • add members/keys size() functions for group types

include/session/config/groups/members.hpp Outdated Show resolved Hide resolved
src/config/groups/members.cpp Outdated Show resolved Hide resolved
@frtget
Copy link

frtget commented Aug 16, 2023

should have done everything in a single review, sorry about that :)

@jagerman jagerman force-pushed the groups branch 4 times, most recently from 3a96949 to b3e195d Compare August 21, 2023 23:22
@jagerman jagerman changed the title Group metadata; signed & read-only configs support New Groups support Aug 22, 2023
Adds a group info config type, which tracks distributed info for v2
groups.

This requires introducing/using a few new concepts not currently used
for user config messages:

- Multiple decryption keys.  User config doesn't do this at all (rather
  it generates a single decryption key from the private key for each
  namespace).

This doesn't yet add support for sharing and distributing those keys,
just for having being able to load a config with a list of multiple
possible keys.

- Config signing and verification.  For user configs this isn't
  done, since only the owner can actually encrypt/decrypt a config
  message, just being able to decrypt it is authentication enough.

This required various modifications to make the config library properly
prevent modifications when we can't modify, and to properly follow
that through in terms of merging, updates, etc.
Fixes a bug where construction-provided keys were loaded in reversed
priority.
Tests to follow.
The first part of this assert shouldn't fail if we are starting from a
fresh, blank config (in which case we don't actually serialize our own
message anymore).
It doesn't really make sense to have this as we either want signatures
(for shared messages) or don't (for personally encrypted messages).  The
only place we were passing it as `true` was in a place that also didn't
pass a verifier or signer (during dumping), and so already wasn't
adding/checking signatures.
This makes Keys construction, loading, and rekeying take the Info and
Members object to update their keys, thus removing the requirement for
application code to worry about key management at all (they just feed
the keys in, and they propagate to info/members).

Also adds a state dump (similar to base config `dump()`) to Keys.
Messages to a new group can't go into 0 (since that is publicly
depositable), so reserve a namespace (11) for messages.
Adds method for encrypting/decrypting a message.

This supports both compressed+encrypted and just plain encrypted.

Abstracts the zstd compression implementation from base.cpp into
internal.cpp, and uses it inside the new group message encryption

Compression is only used if beneficial (that is, only if compression
actually reduces the message size).
This adds methods to `Keys` that generates subaccount tokens and
signatures as needed to do storage server subaccount authentication
(which currently requires testnet as the subaccount code is not yet
active on mainnet), along with test code to test it.

Also adds a tests/swarm-auth-test binary that spits out storage requests
for store/retrieve testing.
It adds a nlohmann::json dependency, and isn't really part of the normal
test suite.
- Adds C wrappers for all the swarm authentication methods
- Adds C wrapper for querying if the keys object has admin permission
- Adds C wrapper for key_supplement to generate supplemental key messages

- Changes the swarm `subaccount` value to be base64 instead of hex
  (since it isn't really a pubkey)
- Makes the vector passed into key_supplement const&
jagerman and others added 5 commits August 31, 2023 15:39
- Add keys.size() return the number of keys in the object.
- Add C API for retrieving keys & key size
- Fix copy-and-paste error in groups_members_size C API name
jagerman and others added 7 commits August 31, 2023 18:26
Also adds tests to verify that auth_data and secretkey data propagate as
expected.

Many thanks to Harris for identifying and tracking this down.
- Added config_groups_keys to config/base.h
- Looped in test call into config user groups C api unit test
Adds current hash tracking to groups::Keys; this adds a `current_hashes`
method to group::Keys that works similarly to the base config
current_hashes (though returns a set instead of vector).

(The C version is `groups_keys_current_hashes`)

Testing this also triggered a bug in that we weren't probably re-loading
the verified-signature state on a dump->load cycle, which caused an
assertion failure on merge (because the current state couldn't be
successfully serialized-then-deserialized), also fixed here.

Also renames the recently added C config_group_keys to config_get_keys
to better reflect its purpose.
This adds an `.invited` flag for all group types (legacy, new, and
communities) that can be used to track a invited-but-not-yet-joined room
by session clients.

Also adds `.name` to new groups data so that the name from an invitation
can be stored before accepting (after accepting the name will come from
the shared group config message).

Add `setKicked()` and `kicked()` methods which clear both auth_data and
secretkey and report whether both are empty, respectively.  (For the C
API these are `ugroups_group_set_kicked` and `ugroups_group_is_kicked`
and take a pointer to the `ugroups_group_info`).
The key lists weren't being set properly when loading a Keys object from
a dump; these sets the key lists properly.
@jagerman jagerman merged commit 1e5f381 into oxen-io:dev Sep 6, 2023
@jagerman jagerman mentioned this pull request Sep 6, 2023
///
/// Generates a supplemental key message for one or more session IDs. This is used to
/// distribute existing active keys to a new member so that that member can access existing
/// keys, configs, and messages. Only admins can call this.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what you meant by "access[ing] existing keys" here. Removed in #64 because it seems redundant.

/// including a proper signature by an admin of the group. The signing value must have read
/// permission, but parameters can be given to also require write or delete permissions. A
/// subaccount signing value should always be checked for validity using this before creating a
/// group that would depend on it.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"creating a group" as in adding a group to user configs ?

///
/// Inputs:
/// - `groupid` -- the group id/pubkey, in hex, beginning with "03".
/// - `session_ed25519_secretkey` -- the user's Session ID secret key.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why don't we just take the session ID instead of the secret key ? We only use it to extract the pk and then remove the sign to get the xpk and use that.

_sign_sk.data(),
32, // Just the seed part of the value, not the last half (which is just the pubkey)
reinterpret_cast<const unsigned char*>(key.data()),
std::min<size_t>(key.size(), 64));
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should assert that key is smaller than 64 instead in order to avoid using the same key in different contexts by mistake by using several keys that start with the same 64 characters. I can add that to #64

crypto_sign_ed25519_sk_to_pk(T.data(), user_ed25519_sk.data());
bool neg = T[31] & 0x80;
T[31] &= 0x7f;
if (0 != crypto_scalarmult_ed25519_noclamp(to_unsigned(token.data() + 4), k.data(), T.data()))
Copy link

@frtget frtget Oct 5, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason why we don't want to clamp here ?
Tor and Lokinet both clamp blinding factors but I'm not sure whether it really makes the DH key exchange more secure.
I don't think it protects against small subgroup attacks because a is a multiple of the cofactor so ±akB should either be the identity element or be in the main subgroup anyway.
Do you think we should clamp just in case we decide to reuse this blinded keypair for X25519 key exchange in the future for some reason ?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems the person who implemented the key blinding in tor wasn't sure either.
@jagerman I see you're the one who added the code that clamps the blinding factor in lokinet so I'm curious to know the reason why it's done if you know it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The short answer is that it "doesn't work" with clamping half the time, because when we have to use -a instead of a, clamping it changes the value and breaks the -A = -aG relationship that we're depending on. There's a fair bit to unpack behind this though, so let me elaborate.

I wish we didn't need that relationship at all: but unfortunately it's an artifact of a suboptimal choice (which unfortunately followed Signal) of having Session IDs be the X25519 pubkey conversions of Ed25519, rather than direct Ed25519 keys. Unfortunately that Ed->X conversion loses the sign bit from the Ed25519 pubkey, which is what causes all this trouble: when reversing the Ed->X conversion we get ourselves to Ed=±sqrt(...) and there's no way to know which one was the correct Ed pubkey. So what we're doing here is effectively always signing as if the real pubkey that yielded the Session ID was in fact always the positive of the two possible Ed25519 pubkeys, when in reality half the time it is negative.

In order to do that when we actually have a negative pubkey, however, we have to effectively produce a signature from -a rather than from a -- but that won't work if we clamp -a.

To illustrate:

#!/usr/bin/python3

from nacl.signing import SigningKey
import nacl.bindings as sodium

for seed31 in (b'\0', b'\4'):
    seed = SigningKey(b"\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0" + seed31)
    pubkey = seed.verify_key

    # This gives the private scalar, i.e. "a" for keypair a/A that yields aG = A.  (That's not its primary usage in
    # libsodium, and thus the function name here is understandably confusing).
    a = seed.to_curve25519_private_key().encode()

    print(f"\nPubkey:     {pubkey.encode().hex()}")
    Ac = sodium.crypto_scalarmult_ed25519_base(a)
    An = sodium.crypto_scalarmult_ed25519_base_noclamp(a)
    print(f"A noclamp:  {An.hex()}")
    print(f"A clamped:  {Ac.hex()}")

    if Ac[31] & 0x80:
        a_neg = sodium.crypto_core_ed25519_scalar_negate(a)
        Ac_neg = sodium.crypto_scalarmult_ed25519_base(a_neg)
        An_neg = sodium.crypto_scalarmult_ed25519_base_noclamp(a_neg)
        print(f"-A noclamp: {An_neg.hex()}")
        print(f"-A clamped: {Ac_neg.hex()}")

produces:

Pubkey:     3b6a27bcceb6a42d62a3a8d02a6f0d73653215771de243a63ac048a18b59da29
A noclamp:  3b6a27bcceb6a42d62a3a8d02a6f0d73653215771de243a63ac048a18b59da29
A clamped:  3b6a27bcceb6a42d62a3a8d02a6f0d73653215771de243a63ac048a18b59da29

Pubkey:     fd50b8e3b144ea244fbf7737f550bc8dd0c2650bbc1aada833ca17ff8dbf329b
A noclamp:  fd50b8e3b144ea244fbf7737f550bc8dd0c2650bbc1aada833ca17ff8dbf329b
A clamped:  fd50b8e3b144ea244fbf7737f550bc8dd0c2650bbc1aada833ca17ff8dbf329b
-A noclamp: fd50b8e3b144ea244fbf7737f550bc8dd0c2650bbc1aada833ca17ff8dbf321b
-A clamped: d1cd8406d8f93617b10a91a20c4126c08bb4f0fb23807fcfd25e939b7a9924ea

(Keep in mind these are encoded in little-endian, and so the sign bit is the most significant bit of the last byte, i.e. the second-last hex digit).

So in short, without clamping we can keep up the illusion that we were actually the positive alternative of the two pubkeys that could have given the X25519, even when we aren't.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think we should clamp just in case we decide to reuse this blinded keypair for X25519 key exchange in the future for some reason ?

I really wish we could, because you're right, we're okay for Ed25519 signatures, but definitely don't want to go back to X again with this derived key.

Copy link

@frtget frtget Nov 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are talking about different things. I was talking about clamping the blinding factor k, not clamping t.

After some more digging it turns out that the security provided by clamping scalars is not preserved by multiplication operations on scalars because they are done modulo l (the order of the main subgroup).
For a point P with a torsion component (i.e. not in the main subgroup):
(-a mod l).P ≠ -(a.P)
and (k*t mod l).P ≠ k.(t.P)
Which means that even though a was already clamped during key generation, -a mod l and ±ak mod l as private key scalars are vulnerable to small subgroup attacks and clamping k is useless.

I think the reason why it was done in tor is that the developers weren't sure about the impact of clamping on the security of their key blinding scheme at the time and it seems the reason why it was added to lokinet was to make the crypto identical to tor's.

My understanding is that if we really wanted to make sure that this derived scalar is safe against such attacks we would have to use its torsion safe representative (which is an element of GF(8l)) or do arithmetic operations modulo the order of the entire elliptic curve group 8l (which is effectively the same thing when a is a multiple of the cofactor) so that:
kt.P = ±k.(a.P)
P can be written as Q + T where Q is a point in the main subgroup and T is a point of small order.
±k.(a.(Q + T)) = ±k.(a.Q + a.T) = ±k.(a.Q + ∞) = ±k.(a.Q) because a is a multiple of the order of T.
The torsion component is absorbed and the resulting point does not leak any information about our private scalars.

But that's not supported by libsodium and, in my opinion, not necessary because if we ever decide to reuse this keypair for DH we would have to use either the "noclamp" scalar multiplication function which fails when given a point that is not in the main subgroup or the ristretto functions which, by design, can't decode input bytes to a point outside of the main subgroup.
(Everything else clamps the input scalar and breaks the kt.B = b.(k.T) relationship we need where b/B is the other party's keypair.)
So, as far as I know, it would still be safe to use this derived scalar as is without caring about any of the above. Unless we switch to another crypto library.

In conclusion, clamping k is useless and there is not much we can do for now except maybe adding some comments to make it clear that these derived keys should never be reused for DH without extra precautions because they are vulnerable to small subgroup attacks.

/// - `rekey()` is called, returning the new keys config.
/// - `info.push()` is called to get the new info config (re-encrypted with the new key)
/// - `members.push()` is called to get the new members config (using the new key)
/// - all three new configs are pushed (ideally all at once, in a single batch request).
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be a sequence request instead of a batch request ?

@venezuela01
Copy link

We need more experts to review the design and implementation. I hope @KeeJef will publish more internal design drafts before an actual software release. We have seen examples in the past where in-house developed code was vulnerable to algebraic attacks due to subtle mistakes by the core team. This situation further evidences the need for more external volunteers like @frtget to provide valuable opinions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants