
Add authn protocol #7

Merged: 12 commits into main, Aug 10, 2022

Conversation

@mkobetic (Contributor) commented Aug 9, 2022

This PR proposes a fairly significant change to our authn story. Instead of running a separate protocol, we would just add a Token type to the message/v1 protocol, carrying pretty much the same payload as authn.Request does. The Token would need to be attached to every request and validated by the nodes (the same kind of validation authn did), so validation would happen on every request instead of once per run of the authn protocol (although a significant chunk of it could be cached, since the token remains the same during its validity period).

The token can be carried as a standard HTTP/GRPC authorization header (the gateway maps these automatically: https://github.com/grpc-ecosystem/grpc-gateway#mapping-grpc-to-http). The server-side processing can be implemented with a GRPC interceptor.
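For illustration, here is a stdlib-only sketch of the header-extraction step such an interceptor would perform before validating the token. The names are hypothetical, not the actual node code:

```typescript
// Hypothetical sketch of the server-side step: pull the token out of the
// standard authorization header before running signature validation.
type RequestHeaders = Record<string, string | undefined>;

function extractBearerToken(headers: RequestHeaders): string {
  const auth = headers["authorization"];
  if (!auth || !auth.startsWith("Bearer ")) {
    throw new Error("missing or malformed authorization header");
  }
  return auth.slice("Bearer ".length);
}
```

The real interceptor would then run the signature and timestamp checks on the extracted token before letting the request through.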

Upsides:

  • no need to orchestrate the interaction of the two protocols
  • the token is portable across nodes, so the client doesn't need to "reauthenticate" if it gets load-balanced to a different node
  • implementation should be simpler

Downsides:

  • The token/signature validation happens on every request (we can cache some of that if needed)
  • We are losing the channel binding of the token that was mediated by the peer IDs, so the blast radius of a leaked token is necessarily larger (though comparable to JWT-style authorization)

proto/authn/v1/authn.proto (outdated thread, resolved)
@mkobetic commented Aug 9, 2022

@neekolas I'm not entirely sure what to do on the TS side. It doesn't seem like I should just add

export * from "./auth/v1/authn.pb"

to ts/index.ts. Wouldn't that yield a single package with everything mixed together?

@neekolas (Collaborator) commented Aug 9, 2022

Wouldn't that yield a single package with everything mixed together?

Ya, we should probably move to namespacing the exports. export * as authn from './auth/v1/authn.pb'

@neekolas commented Aug 9, 2022

Although we don't have to export anything right now. All we are ultimately going to care about from TypeScript is whatever GRPC Gateway API gets exposed. We shouldn't need any TypeScript for the actual protos.

@mkobetic commented Aug 9, 2022

Hrm, this will be a problem. When I run npm run generate I get:

authn/v1/authn.proto: is a proto3 file that contains optional fields, but code generator protoc-gen-grpc-gateway-ts hasn't been updated to support optional fields in proto3. Please ask the owner of this code generator to support proto3 optional.--grpc-gateway-ts_out:

@neekolas commented Aug 9, 2022

@mkobetic There is a hack listed in this issue, but it's not great. The suggestion is just to modify the generator and make it register itself as supporting optional fields when it doesn't. Given that we don't actually have optional fields (I'm not entirely sure why this is triggered, but I suspect it's because oneof fields with a single option are effectively the same as optional in proto3) that may be enough to make it generate valid code.

Open to better suggestions.

Review thread on proto/authn/v1/authn.proto:

message ClientAuthResponse {
  oneof union {
    bytes token = 1;      // if auth was successful
    string error_str = 2; // if auth was not successful

neekolas (Collaborator): Why can't we just use GRPC's native status codes, with custom messages, to pass error responses?

@mkobetic replied (Aug 9, 2022):

Maybe we can, I'm not quite sure how that works. I'm still just playing with this, trying to figure out what I want to do.

At this point I'm thinking the token could really be just a random number that we associate with the wallet when auth succeeds (this replaces the peer ID in the peer store). The client would have to pass the token along with every request to the API, and until the token is purged from the store (purgeLoop) the requests would be allowed. I think this would be equivalent to what we had.

But then it gets interesting: what happens when we purge the token and the connection is still active? Somehow the client would have to re-engage the auth flow to get a new token. Makes me wonder if we couldn't just bundle the AuthData as a token with every request to begin with. Then we don't have to worry about the protocol interplay, and requests would automatically be portable between nodes too. Thoughts?

@mkobetic replied (Aug 9, 2022):

To clarify: the token would be produced by the client; it would sign a timestamp and wallet address using the identity key signed by the wallet (very much like what Jazz had in the authn request). The server would just check the token on every request. If the token is too old (or from the future), or the signatures don't check out, the request is rejected (probably implemented using GRPC interceptors). That's it. So there wouldn't even be a separate Authn API.
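As a toy sketch of that token shape — illustrative only; a real token would carry a signature by the wallet-signed identity key, not an HMAC over a shared secret:

```typescript
import { createHmac } from "crypto";

interface Token {
  walletAddr: string;
  timestamp: number; // when the token was generated/signed
  signature: string; // stand-in for the identity-key signature
}

// Client side: "sign" a timestamp and wallet address. The HMAC here is a
// placeholder for the real identity-key signature.
function makeToken(walletAddr: string, identityKeySecret: string, now: number): Token {
  const signature = createHmac("sha256", identityKeySecret)
    .update(`${walletAddr}|${now}`)
    .digest("hex");
  return { walletAddr, timestamp: now, signature };
}

// Server side: recompute and compare; reject the request on mismatch.
function verifyToken(token: Token, identityKeySecret: string): boolean {
  const expected = createHmac("sha256", identityKeySecret)
    .update(`${token.walletAddr}|${token.timestamp}`)
    .digest("hex");
  return expected === token.signature;
}
```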

neekolas (Collaborator) replied:

That doesn't sound crazy to me. A little extra overhead verifying all the signatures on each publish, but there are no network calls or anything like that. I like that it's stateless and portable.

Apps would be able to generate tokens infinitely into the future, but that was possible with the previous flow as well (so long as they kept the same peerID). Probably not our biggest worry.

Wdyt @jazzz?

@mkobetic replied:

We can reject tokens with future timestamps.

@mkobetic replied:

Isn't the timestamp basically a "slowly changing network variable"? I don't see stockpiling as an issue. There's no point for a valid client to stockpile tokens; there's nothing to gain from it other than increasing the chance of someone stealing them.

@mkobetic replied (Aug 10, 2022):

effectively establishing an upper and lower bound on the validity of tokens. This would mean when the rest of the malicious client mitigation work is completed we won't need to worry about the risk of legacy tokens.

The bounds I want to set with this design are

  1. node_time < token.timestamp + expiration(1h?)
  2. token.timestamp < node_time

So a token is only valid between token.timestamp and token.timestamp + expiration as interpreted in node time.
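A minimal sketch of that check, with a hypothetical expiration value (the 1h cap is only the example floated above):

```typescript
const EXPIRATION_MS = 60 * 60 * 1000; // example: nodes accept tokens up to 1h old

// Mirrors the two bounds, evaluated in node time:
//   1. node_time < token.timestamp + expiration
//   2. token.timestamp < node_time
function tokenTimestampValid(tokenTimestamp: number, nodeTime: number): boolean {
  return tokenTimestamp < nodeTime && nodeTime < tokenTimestamp + EXPIRATION_MS;
}
```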

@jazzz replied (Aug 10, 2022):

There's no point for a valid client to stockpile tokens.

Agreed but I'm thinking in the context of a malicious client.

Isn't the timestamp basically a "slowly changing network variable"?

Time is deterministic, and since the timestamp field is client-supplied and untrustworthy, there's no guarantee of when that token was generated. By adding a random variable to be included, we build confidence that the token was generated after the challenge was made public.

jazzz replied:

There are other, more substantial issues that need to be addressed before this is the limiting factor, so I'm happy to revisit it later. However, having a "proof of life" style mechanic that establishes a lower bound on when the token was generated would reduce the risk of token abuse.

neekolas (Collaborator) replied:

Agreed but I'm thinking in the context of a malicious client.

At least right now, the malicious client would also have the IdentityKeys needed to generate new tokens on demand.

@mkobetic commented Aug 9, 2022

@mkobetic There is a hack listed in this issue, but it's not great. The suggestion is just to modify the generator and make it register itself as supporting optional fields when it doesn't. Given that we don't actually have optional fields (I'm not entirely sure why this is triggered, but I suspect it's because oneof fields with a single option are effectively the same as optional in proto3) that may be enough to make it generate valid code.

Open to better suggestions.

It's because of the optional Signature on the PublicKey. We can actually remove optional on it since we require the wallet signature anyway. That's also why I dropped the separate signature field from the message; we may as well use the field on the PublicKey struct (I understand that Jazz was trying to preserve the original definition, but I don't think we really need to in this case). So I just removed optional on that field and it's not complaining now.

@neekolas commented Aug 9, 2022

@mkobetic OK, great.

service AuthnApi {
rpc Authenticate (ClientAuthRequest) returns (ClientAuthResponse) {
option (google.api.http) = {
post: "/authn/v1/authenticate"
neekolas (Collaborator):

Feels like this and the next two lines should be indented. Does the formatter let you?

@mkobetic replied:

I just ran dev/lint on this and it mostly autofixes things. I'll check.

proto/authn/v1/authn.proto (outdated thread, resolved)
@mkobetic commented:

@neekolas, the latest is how it would look if Authn was just a Token that is part of the message/v1 protocol. WDYT?

Review thread on dev/ts/generate (outdated):
docker run --rm -i -v${PWD}:/code xmtp/protoc -I=./proto --grpc-gateway-ts_out=ts message_api/v1/message_api.proto message_api/v1/authn.proto
neekolas (Collaborator):

We are going to want to change this to use protoc-gen-ts_proto for the authn protos to get real protobuf generated code. This is going to generate types and code for a GRPC Gateway JSON API.

Think we'll need to add the --ts_proto_opt=esModuleInterop=true flag as well

@mkobetic replied:

Can you suggest the change you have in mind? This stuff is black magic to me.

@neekolas replied:

I'll try and commit it to this branch. It'll take a little fiddling for me to figure out too. Protoc arguments never work the way I expect them to.

@neekolas replied:

You can see the changes I made here

Basically, I added the generator to the Docker image and run it as a second codegen step, since the output formats are different.

This build is getting complicated enough I am starting to wonder if using buf would make this more maintainable/readable. I hesitated to include it originally, since it's giving us very little value in xmtp-js. But maybe it makes sense here just to make this configuration more declarative.

@mkobetic replied:

Thank you! I'd rather postpone switching to buf for a separate PR.

@neekolas replied:

Makes sense. I can take that one.


// Token is used by clients to prove to the nodes
// that they are serving a specific wallet.
message Token {
neekolas (Collaborator):
This looks good

@mkobetic commented:

I updated the PR description to summarize where we are atm (including Jazz's objections). Let me know if something's off.

I'd like to go ahead with this so that I can move forward on the node side. I'd appreciate some ✅ if we are far enough here.

// address of the wallet
string wallet_addr = 1;
// time when the token was generated/signed
uint64 timestamp = 2;
@mkobetic commented (Aug 10, 2022):

One thought: Should we turn this into valid_until to allow the client to control the validity expiration within node enforced limits? (e.g. let's say the nodes will only accept up to an hour or something). This would allow sensitive clients to use more restrictive expiration than the nodes allow.
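The suggestion amounts to the node clamping a client-chosen valid_until to its own cap. A sketch, with a hypothetical node-side limit:

```typescript
const NODE_MAX_VALIDITY_MS = 60 * 60 * 1000; // hypothetical node-enforced cap (1h)

// The node accepts the client's valid_until only up to its own limit, so a
// sensitive client can pick a shorter window but never a longer one.
function effectiveExpiry(tokenTimestamp: number, validUntil: number): number {
  return Math.min(validUntil, tokenTimestamp + NODE_MAX_VALIDITY_MS);
}
```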

@neekolas replied:

It's hard for me to imagine a client developer choosing anything less than the maximum amount of time, given that the tokens are there to protect XMTP more than they are to protect the user.

@mkobetic replied:

I look at it differently. A stolen token is not an issue for the network; it doesn't matter who used up the limit. It is an issue for the user, since it allows someone to DOS them. So I was thinking a more paranoid user might want to reduce the expiration period at the cost of having to create tokens more often.

Reply:

s/paranoid/security minded/g

@neekolas replied (Aug 10, 2022):

I still don't see it. If we have a problem where people are able to compromise these tokens and use them for DOS, the response is pretty surely going to be for compromised users to stop using the network at all. Being DOS'd for 20 minutes versus 60 feels about the same to the end user: unacceptable. Especially given the likelihood that whatever exploit gave the attacker the last token could be reused to get the new one if the user doesn't change their behavior in between.

@mkobetic replied:

Fair enough, I'll leave this be.

Reply:

I think having the time the token was generated would be helpful going forward.

Though I could see having an additional field for expiry, like in the JWT spec.

@neekolas replied:

If we care about the client being aware of when tokens are due to expire, we could always add a response header where the server provides that value. Not sure how useful it is, given that the client can always just regenerate when a request fails with a 401 status code, but that feels preferable to having the client guess when the token is going to expire.
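That regenerate-on-401 pattern could look something like the following sketch (hypothetical client-side helper, not part of this PR):

```typescript
// Don't guess at expiry: regenerate the token and retry once when the
// server answers 401 (token presumably expired).
type DoRequest = (token: string) => Promise<{ status: number }>;

async function requestWithFreshToken(
  makeToken: () => string,
  doRequest: DoRequest
): Promise<{ status: number }> {
  let res = await doRequest(makeToken());
  if (res.status === 401) {
    res = await doRequest(makeToken()); // regenerate and retry once
  }
  return res;
}
```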

@mkobetic mkobetic merged commit 1a7490b into main Aug 10, 2022
@mkobetic mkobetic deleted the authn branch August 10, 2022 18:27
@neekolas commented:

Great. This is going to be quite easy to integrate into my branch.

@mkobetic commented:

Sigh, as soon as I merged I realized that AuthData.timestamp should probably be AuthData.created_ns to fit the message/v1 theme. Worth fixing in a follow-up?

There's also PublicKey.timestamp, but changing that would move us away from the xmtp-js definitions. Not quite sure that actually matters, though.

@mkobetic commented:

And I think I tripped up with the semantic releaser thingy 🤦

@mkobetic mkobetic mentioned this pull request Aug 10, 2022
@github-actions commented:
🎉 This PR is included in version 1.1.1 🎉

The release is available on:

Your semantic-release bot 📦🚀
