-
Notifications
You must be signed in to change notification settings - Fork 285
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add server configuration options for mandatory client connection options #1022
Comments
I have transferred this issue since it is related to NATS Streaming, not NATS Server. Our client libraries usually set a default for If we were to change the protocol to include this parameter in the connect request the server could simply reject the connection if the value is not set. However, this would be a compatibility issue since the server would not be able to tell the difference between old clients that do not pass this value to the connect request, and new client that incorrectly had user set it to 0. This report has several asks, so it may become messy, but regarding the number of reconnect, what are you suggesting? For instance, with the issue of We understand and want to make sure that the NATS cluster stays healthy, but some of the proposed approaches won't work. We need to discuss this more. |
The clientID would solve the reconnection issue for us, if it is mandatory, but it is not. You can connect to a NATS server without mention a clientid. The NATS server will generate an own ID for the client and the current connection. So everytime the same client creates a connection it gets a new id and the new connection can not be blocked as duplicated. |
@wutkemtt I have transferred this issue to NATS Streaming because you referenced the issue being a client that did not set MaxPubInflight option, and this is only relevant to NATS Streaming, not core NATS. And as I mentioned, this option is a client level option, not server. As I hinted before, even if the server were to detect that a remote IP has been connecting/disconnecting very quickly in the past say few seconds, it (the server) could hold the connection after the accept but not do anything with it, but a rogue client can always close its connection or simply try to connect again: this would still use up all TCP ports. The bottom line is that if in this case the filtering has to be done upstream. I think that you had an issue that after investigation it turned out to be an application flooding the NATS Streaming server because that application was not setting MaxPubInflight. We could make sure that if you are using one of the library that does not set this as a default, we fix those libraries. But ultimately, you will need to vet your applications and make sure they behave correctly. I don't see currently an approach that would have prevented the issue you had, but I am surely opened to suggestions. |
The MaxPubInflight issue was the root cause for the flooding of our nats streaming instance. The reconnect problem is also based on a fault client implementation. So it seems, the only way to prevent harm to our nats cluster and to prevent thousands of euro costs when the system break is to "certify" the client implementations of all participants wanting to send data over the nats cluster. |
@wutkemtt I am sorry if my answer sounded like "this is your problem". I had left the issue opened because I wanted us to discuss the issue and possible approaches to fix the problem. I did not have an answer on how to prevent the issue, but does not mean that we would not find one. So I am re-opening this and let's brainstorm and try to find a solution. |
Sorry, sounded more like can't be solved in NATS server. |
Feature Requests
Use Case:
Misconfigured clients are able to destabilize or crash a whole nats cluster with missing MaxPubInFlight option or too many re-connection attempts (thundering herd).
Proposed Change:
The NATS server should be able to limit the number of connections from a single client, preventing it to create a new connection every time a reconnect happens, while the old connection is still there.
Furthermore it should be able to block client connections if specific connection features are not set, especially MaxPubInFlight. Unset this option leads to enormous caching amount on NATS-Streaming server side leading to unresponsive clusters and dropped client connections when requesting a huge number of messages.
Who Benefits From The Change(s)?
All NATS server / cluster operators and their customers benefit from a stable cluster, who can defend itself against "rogue" clients.
Alternative Approaches
The text was updated successfully, but these errors were encountered: