-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add authn protocol #7
Conversation
@neekolas I'm not entirely sure what to do on the TS side. It doesn't seem like I should just add
to |
Ya, we should probably move to namespacing the exports. |
Although we don't have to export anything right now. All we are ultimately going to care about from Typescript is whatever GRPC Gateway API gets exposed. We shouldn't need any typescript for the actual protos. |
Hrm, this will be a problem, when I run
|
@mkobetic There is a hack listed in this issue, but it's not great. The suggestion is just to modify the generator and make it register itself as supporting optional fields when it doesn't. Given that we don't actually have optional fields (I'm not entirely sure why this is triggered, but I suspect it's because Open to better suggestions. |
proto/authn/v1/authn.proto
Outdated
message ClientAuthResponse { | ||
oneof union { | ||
bytes token = 1; // if auth was successful | ||
string error_str = 2; // if auth was not successful |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why can't we just use GRPC's native status codes, with custom messages, to pass error responses?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we can, I'm not quite sure how that works. I'm still just playing with this, trying to figure out what I want to do.
At this point I'm thinking the token could really be just a random number, that we associate with the wallet when auth succeeds (this is the replacement for peer Id in peer store). The client will have to pass the token along with every request to the API and until the token is purged from the store (purgeLoop) the requests will be allowed. I think this would be an equivalent of what we had.
But then it gets interesting what happens when we purge the token and the connection is still active, somehow the client would have to re-engage the auth flow to get a new token. Makes me wonder if we couldn't just use the AuthData bundled as a token with every request to begin with. Then we don't have to worry about the protocol interplay and requests will be automatically portable between nodes too. Thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To clarify the token would be produced by the client and it would sign a timestamp and wallet address using the identity key signed by the wallet (very much like what Jazz had in the authn request). The server would just check the token on every request. If the token is too old (or from the future) or signatures don't check out, the request is rejected (probably using GRPC interceptors to implement this). That's it. So there wouldn't even be a separate Authn API.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That doesn't sound crazy to me. Little bit of extra overhead verifying all the signatures on each publish, but there's no network calls or anything like that. I like that it's stateless and portable.
Apps would be able to generate tokens infinitely into the future, but that was possible with the previous flow as well (so long as they kept the same peerID). Probably not our biggest worry.
Wdyt @jazzz?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can reject tokens with future timestamps.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't the timestamp basically a "slowly changing network variable"? I don't see stockpiling as an issue. There's no point for a valid client to stockpile tokens. There's nothing to gain from it other then increasing a chance of someone stealing them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
effectively establishing an upper and lower bound on the validity of tokens. This would mean when the rest of the malicious client mitigation work is completed we won't need to worry about the risk of legacy tokens.
The bounds I want to set with this design are
- node_time < token.timestamp + expiration(1h?)
- token.timestamp < node_time
So a token is only valid between token.timestamp and token.timestamp + expiration as interpreted in node time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's no point for a valid client to stockpile tokens.
Agreed but I'm thinking in the context of a malicious client.
Isn't the timestamp basically a "slowly changing network variable"?
Time is deterministic, and since the timestamp field is client supplied and untrustworthy theres no guarantee of when that token was generated. By adding a random variable to be included, we build confidence that the token was generated after the challenge was made public.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are other more substantial issues that need to be addressed before this is the limiting factor, so I'm happy to revisit it later. However having a "proof of life" style mechanic that establishes a lowerbound on when the token was generated would reduce risk of token abuse
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed but I'm thinking in the context of a malicious client.
At least right now, the malicious client would also have the IdentityKey
s needed to generate new tokens on demand.
It's because of the optional Signature on the PublicKey. We can actually remove optional on it since we require the wallet signature anyway. That's also why I dropped the separate signature field from the message, we my as well use the field on the PublicKey struct (I understand that Jazz was trying to preserve the original definition, but I don't think we really need to in this case). So I just removed optional on that field and it's not complaining now. |
@mkobetic OK, great. |
proto/authn/v1/authn.proto
Outdated
service AuthnApi { | ||
rpc Authenticate (ClientAuthRequest) returns (ClientAuthResponse) { | ||
option (google.api.http) = { | ||
post: "/authn/v1/authenticate" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Feels like this and the next two lines should be indented. Does the formatter let you?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just run dev/lint
on this and it mostly autofixes things. I'll check.
@neekolas, the latest is how it would look if Authn was just a Token that is part of the message/v1 protocol. WDYT? |
dev/ts/generate
Outdated
docker run --rm -i -v${PWD}:/code xmtp/protoc -I=./proto --grpc-gateway-ts_out=ts message_api/v1/message_api.proto message_api/v1/authn.proto |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are going to want to change this to use protoc-gen-ts_proto
for the authn
protos to get real protobuf generated code. This is going to generate types and code for a GRPC Gateway JSON API.
Think we'll need to add the --ts_proto_opt=esModuleInterop=true
flag as well
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you suggest the change you have in mind. This stuff is black magic to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll try and commit it to this branch. It'll take a little fiddling for me to figure out too. Protoc arguments never work the way I expect them to.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can see the changes I made here
Basically, I added the generator to the Docker image and run it as a second codegen step, since the output formats are different.
This build is getting complicated enough I am starting to wonder if using buf
would make this more maintainable/readable. I hesitated to include it originally, since it's giving us very little value in xmtp-js
. But maybe it makes sense here just to make this configuration more declarative.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you! I'd rather postpone switching to buf for a separate PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense. I can take that one.
|
||
// Token is used by clients to prove to the nodes | ||
// that they are serving a specific wallet. | ||
message Token { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good
I updated the PR description to summarize where we are atm (including Jazz's objections). Let me know if something's off. I'd like to go ahead with this so that I can move forward on the node side. I'd appreciate some ✅ if we are far enough here. |
// address of the wallet | ||
string wallet_addr = 1; | ||
// time when the token was generated/signed | ||
uint64 timestamp = 2; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One thought: Should we turn this into valid_until
to allow the client to control the validity expiration within node enforced limits? (e.g. let's say the nodes will only accept up to an hour or something). This would allow sensitive clients to use more restrictive expiration than the nodes allow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's hard for me to imagine a client developer choosing anything less than the maximum amount of time, given that the tokens are there to protect XMTP more than they are to protect the user.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I look at it differently. A stolen token is not an issue for the network, it doesn't matter who used up the limit. It is an issue for the user, since it allows someone to DOS them. So I was thinking a more paranoid user may want to reduce the expiration period at the cost of having to create tokens more often.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/paranoid/security minded/g
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still don't see it. If we have a problem where people are able to compromise these tokens and use them for DOS, pretty sure the response is going to be for compromised users to not use the network at all. Being DOS'd for 20 minutes and 60 feels about the same to the end user: unacceptable. Especially given the likelihood that whatever exploit gave the attacker the last token could be re-used to get the new one if the user doesn't change their behavior in between.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair enough, I'll leave this be.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think having a time the token was generated would be helpful into the future.
Though I could see having an addition field for expiry like in the JWT spec.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we care about having the client be aware of when the tokens are due to expire we could always add a response header where the server would provide that value. Not sure how useful it is given that the client can always just regenerate when a request fails with a 401 status code, but that feels preferable to having the client guess when the token is going to expire.
Great. This is going to be quite easy to integrate into my branch. |
Sigh, as soon as I merged I realized that I also note that there's also |
And I think I tripped up with the semantic releaser thingy 🤦 |
🎉 This PR is included in version 1.1.1 🎉 The release is available on: Your semantic-release bot 📦🚀 |
This PR proposes a fairly significant change to our authn story. Instead of running a separate protocol, we would just add a
Token
type, carrying pretty much the same payload asauthn.Request
does, to themessage/v1
protocol. The Token would need to be attached to every request and validated by the nodes (again same type of validation as authn did), so the validation would happen on every request instead of on every run of the authn protocol (although significant chunk of this could be cached since the token remains the same during its validity period).The token can be carried as a standard HTTP/GRPC authorization header (The gateway maps these automatically https://github.com/grpc-ecosystem/grpc-gateway#mapping-grpc-to-http). The server side processing can be implemented with a GRPC interceptor.
Upsides:
Downsides: