Proposal: Create exposed STAN channel for storing map of all active channels #1077

Open
wilstoff opened this issue Aug 2, 2020 · 5 comments

Comments

wilstoff commented Aug 2, 2020

Since STAN does not allow for wildcards, I would like to be notified whenever a new channel is created or an existing channel is destroyed. Subscribers could then listen to this channel for updates and potentially subscribe to new channels as they come online. While we could implement this outside of the STAN server itself, STAN already has all of this information and could expose it publicly. I don't think we'd need the full metadata (active subscriptions and such) for each channel, just an update when one is created or destroyed.

Ideally the topic would only ever store one data object (last known state, StanChannels below) so it would only expand with the number of channels created. Here's an example of the protobuf that could be stored in the topic:

message StanChannelMetadata {
  string subject = 1;
  uint64 createdTimeStamp = 2;
  ...
}

message ChannelUpdate {
  enum ChannelChangeType {
    UNSPECIFIED = 0; // proto3 requires the first enum value to be zero
    ADDITION = 1;
    REMOVAL = 2;
    UPDATE = 3; // in case there's something else
  }
  ChannelChangeType changeType = 1;
  StanChannelMetadata channel = 2;
  string reason = 3;
}

message StanChannels {
  map<string, StanChannelMetadata> channelsBySubject = 1;
  ChannelUpdate lastChange = 2;
}
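
A subscriber would then only need the latest message on that channel to know the full state. A minimal sketch with the Go streaming client, assuming a hypothetical notification channel name of "_STAN.channels" (the real name, and the cluster/client ids, would of course be up to the server and the application):

package main

import (
	"log"

	stan "github.com/nats-io/stan.go"
)

func main() {
	sc, err := stan.Connect("test-cluster", "channel-watcher")
	if err != nil {
		log.Fatal(err)
	}
	defer sc.Close()

	// Only the latest message matters since it carries the whole map, so
	// start from the last received message instead of replaying everything.
	_, err = sc.Subscribe("_STAN.channels", func(m *stan.Msg) {
		// m.Data would be a serialized StanChannels message; unmarshal it
		// with types generated from the protobuf above (omitted here).
		log.Printf("received channel-map snapshot (%d bytes)", len(m.Data))
	}, stan.StartWithLastReceived())
	if err != nil {
		log.Fatal(err)
	}

	select {} // keep listening for updates
}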
kozlovic commented Aug 3, 2020

@wilstoff I like the idea and actually experimented with something similar in the past. However, I am a bit torn on your approach. Having the map of all existing channels in each message stored in this "notification" channel is nice because, even with limits set on this channel, you won't lose knowledge of previously created channels. The other side of this is that for every added/removed channel, the server would have to store a possibly very large map (if there are many channels). The NATS maximum message size could also get in the way, because a notification payload carrying the whole list could exceed that limit.

I would have initially thought of storing simple messages such as name/timestamp/action, so the notification channel could hold, say: "foo/timestamp/created", "bar/timestamp/created", "foo/timestamp/deleted", "baz/timestamp/created", "foo/timestamp/created".
Since channels can be deleted and later recreated, an application wishing to reconstruct the state of current channels could consume all of this and end up with a map containing "foo", "bar" and "baz" (foo would have been inserted, removed, and inserted again). But obviously, the issue starts when limits on this notification channel are reached and the server drops old messages there. You would end up with only "foo/timestamp/deleted", "baz/timestamp/created", "foo/timestamp/created", for instance, if this channel had a limit of 3 messages (as an illustration). In that case, having "foo/deleted" first is not an issue, but losing information about the "bar" channel is.
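
For illustration, a client rebuilding the map from such events could be as simple as the following (the channel name "_STAN.events" and the JSON encoding are just placeholders for this sketch; the server would define the actual format):

package main

import (
	"encoding/json"
	"log"

	stan "github.com/nats-io/stan.go"
)

// channelEvent is the hypothetical per-change record (name/timestamp/action).
type channelEvent struct {
	Name      string `json:"name"`
	Timestamp int64  `json:"timestamp"`
	Action    string `json:"action"` // "created" or "deleted"
}

func main() {
	sc, err := stan.Connect("test-cluster", "channel-rebuilder")
	if err != nil {
		log.Fatal(err)
	}
	defer sc.Close()

	known := make(map[string]int64) // channel name -> creation timestamp

	// Replay every retained event to reconstruct the current set of channels.
	_, err = sc.Subscribe("_STAN.events", func(m *stan.Msg) {
		var ev channelEvent
		if err := json.Unmarshal(m.Data, &ev); err != nil {
			return
		}
		switch ev.Action {
		case "created":
			known[ev.Name] = ev.Timestamp
		case "deleted":
			delete(known, ev.Name)
		}
		// If channel limits already dropped the early "created" event for,
		// say, "bar", that channel never shows up in this map.
	}, stan.DeliverAllAvailable())
	if err != nil {
		log.Fatal(err)
	}

	select {}
}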
What are your thoughts?

wilstoff commented Aug 4, 2020

So my coworker also pointed out the potential for this to blow up. To me, neither option is really great.
On one hand you have one large message stored, which will eventually hit the maximum message size unless you have a reasonable channel limit. This requires less work for each subscriber to learn what channels are available, because they wouldn't have to replay all messages in a channel (which carries more data overhead than one large message). With your proposal you only have changes as messages, and only small diffs are broadcast to clients when things change, but that will eventually lose data for long-lived subscriptions unless something guarantees a message every so often. It would also still require storing at least N messages, where N is the number of active channels.

Another idea, probably more work and admittedly close to implementing wildcards, is this:

Have an API endpoint to get the active channels and subscribe to updates in one transaction:
GetAndSubscribeTopicUpdates

The endpoint would let you subscribe to ('BaseName'), which would first give you a message listing all topics starting with 'BaseName', so you would get 'BaseName.A' or 'BaseName.B' to start, and then if 'BaseName.C' came up you would get a notification that it was created. With this API you could disallow durable subscriptions and the like, since it is basically a synthetic subscription and only needs to keep track of the client id while it's alive. Or perhaps just use NATS for this, since on any network issue you could simply re-issue GetAndSubscribeTopicUpdates and it would always give you the last-known state plus updates.

This would allow subscribers to filter down to what they actually care about: I don't care about every topic, only the new ones that may come up underneath a certain "domain" like BaseName. And you may not need to actually store anything if we were to use NATS under the hood, since the state of what channels exist is already stored in the STAN file store.

The initial message retrieved could still hit whatever maximum message size is set on the server, but at least clients could tailor what they care about to avoid this. A reasonable error message could be returned if someone requested this endpoint with an unlimited channel setting but a small message size, something like "could not retrieve list of all known channels matching X, size of message is greater than max message size Y".
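
To sketch what I mean, the client-side flow over core NATS could look something like this (both subjects and the whole GetAndSubscribeTopicUpdates endpoint are purely hypothetical; nothing like this exists in the server today):

package main

import (
	"log"
	"time"

	"github.com/nats-io/nats.go"
)

func main() {
	nc, err := nats.Connect(nats.DefaultURL)
	if err != nil {
		log.Fatal(err)
	}
	defer nc.Close()

	prefix := "BaseName" // the "domain" of channels this client cares about

	// Subscribe to updates first so nothing is missed between the initial
	// snapshot and the first notification.
	if _, err := nc.Subscribe("_STAN.channels.updates."+prefix+".>", func(m *nats.Msg) {
		log.Printf("channel change: %s", string(m.Data))
	}); err != nil {
		log.Fatal(err)
	}

	// Then request the current list of channels matching the prefix.
	resp, err := nc.Request("_STAN.channels.list", []byte(prefix), 2*time.Second)
	if err != nil {
		// e.g. the list was larger than the configured max message size
		log.Fatal(err)
	}
	log.Printf("initial channels: %s", string(resp.Data))

	// On any network issue the client just repeats the request, since the
	// last-known state plus updates are always recoverable this way.
	select {}
}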

kozlovic commented Aug 6, 2020

@wilstoff I need to noodle on this a bit more. The problem I see with your proposed approach is that if the server handling a client crashes while the client's own connection stays up, the client would not know and would simply stop receiving updates. Imagine this setup:

S0       S1       S2 (Streaming servers)
 |        |       |
N0-------N1-------N2 (NATS servers)
          |
          C (Client)

If S0 was the one servicing C and S0 crashes, C won't know.

wilstoff commented Aug 7, 2020

I am not fully aware of the internals, but if we ran a combined NATS and NATS Streaming server, would that change anything? Which part is responsible for the clustering Raft protocol? If S0 goes down, how do S1 and S2 normally find out? Or are you saying these are different shards and the streaming servers are not clustered? Yeah, I am not fully set on using NATS as the transport, just on using a synthetic channel instead of something that is stored explicitly. All the info about channel creation/deletion should already be stored, since when I restart the server I sometimes see a replay of channel creation and deletion for channels with short configured limits, which I assume is the Raft protocol (or something similar) resyncing.

kozlovic commented Aug 7, 2020

I am not fully aware of the internals but if we do a combined nats and nats streaming would that change anything?

If you mean running NATS Streaming with an embedded NATS server, the answer is still no because, as you can see above, the client 'C' could still have its TCP connection to a different server in the cluster (even if Sn and Nn are now the same process).

What I discussed was in the context that we would use core NATS for getting the list of channels and following updates.

The list of channels is obviously known by the servers; the issue is how to send it to a client and notify the client of updates (addition/deletion) in a reliable way. If we use Streaming itself, say by having a channel where we put the list/updates, then we have the issue we discussed earlier.
