NPE in WebSocket client subscriptions #3093
The issue is caused by a race in the WebSocket client's subscription handling; 100ms blocks make it fail often.
@roman-khimov, @AnnaShaleva, @AliceInHunterland, there is some race-by-design currently and I am not sure what can be done here (mostly because I want the fix to be small and reliable, no big changes). The key reason is that the subscription lock is always taken after the RPC is done (e.g. `pkg/rpcclient/wsclient.go`, lines 774–779 at `f0ae14e`).
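To make that ordering concrete, here is a minimal Go sketch of the pattern, using hypothetical names (`wsClient`, `call`, etc.) rather than the real `wsclient` internals: the JSON-RPC round-trip completes first, and only afterwards is the lock around the local subscription state taken.

```go
// Simplified sketch of the "RPC first, lock after" pattern (hypothetical names,
// not the actual neo-go wsclient internals). The real client keeps two maps
// (subscriptions and receivers); one map is enough to show the ordering problem.
package subrace

import "sync"

type wsClient struct {
	subscriptionsLock sync.Mutex
	subscriptions     map[string]chan<- any // subscription ID -> receiver channel
}

// call stands in for the JSON-RPC round-trip; the server assigns and returns
// the subscription ID.
func (c *wsClient) call(method string, params ...any) (string, error) {
	// ... send the request over the WebSocket and wait for the response ...
	return "0", nil
}

func (c *wsClient) subscribe(eventName string, rcvr chan<- any) (string, error) {
	// The RPC is performed WITHOUT holding the subscription lock...
	id, err := c.call("subscribe", eventName)
	if err != nil {
		return "", err
	}
	// ...and only afterwards is the lock taken and the local state updated,
	// so another goroutine's (un)subscribe RPC may have completed in between.
	c.subscriptionsLock.Lock()
	c.subscriptions[id] = rcvr
	c.subscriptionsLock.Unlock()
	return id, nil
}

func (c *wsClient) unsubscribe(id string) error {
	// Same ordering: RPC first, local map update after.
	if _, err := c.call("unsubscribe", id); err != nil {
		return err
	}
	c.subscriptionsLock.Lock()
	delete(c.subscriptions, id)
	c.subscriptionsLock.Unlock()
	return nil
}
```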
This became a frequent issue when we started to wait for some transactions in our test runs (https://github.com/nspcc-dev/neofs-node/blob/24fc815cd4b262b625e24bac62259325aba3444c/pkg/morph/client/meta.go#L32), because that code uses a subscription for every call (see `pkg/rpcclient/waiter/waiter.go`, lines 247–260 at `f0ae14e`).
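For context, the waiting pattern referenced above effectively opens a fresh subscription per awaited transaction and drops it afterwards. A rough sketch, continuing the hypothetical `wsClient` from the previous snippet rather than the real waiter API:

```go
// Rough sketch of "one subscription per waited transaction"; awaitTx and
// txAccepted are hypothetical helpers, not the real waiter package API.
func awaitTx(c *wsClient, txHash string, stop <-chan struct{}) error {
	blocks := make(chan any, 1)

	// A new subscription is created for every awaited transaction...
	id, err := c.subscribe("block_added", blocks)
	if err != nil {
		return err
	}
	// ...and removed as soon as the wait finishes, so concurrent waits keep
	// subscribing and unsubscribing all the time.
	defer func() { _ = c.unsubscribe(id) }()

	for {
		select {
		case <-blocks:
			if txAccepted(c, txHash) {
				return nil
			}
		case <-stop:
			return nil
		}
	}
}

// txAccepted is a stand-in for checking whether the transaction landed in a block.
func txAccepted(c *wsClient, txHash string) bool { return false }
```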
Every client's (Un)Subscription call does two things: an RPC call and a subscription map lock (there are two maps currently). If we imagine that there is one routine that tries to subscribe (A) and one routine that tries to unsubscribe (B), the following sequence can happen:

0. The current number of subscriptions is X.
1. B does an RPC and makes the number of subscriptions X-1.
2. A does an RPC and makes the number of subscriptions X again.
3. A takes the subscription locks and rewrites the client's subscription state: subscription with ID X now points to a different channel, and the channel registered by B is lost, not related to any real subscription, but still included in the `receivers` map.
4. B takes the subscription locks and drops subscription X (first, this is an error and we have just lost a subscription we think was made successfully; second, we have lost a channel in the `receivers` map with no corresponding subscription pointing to it).
5. An event for subscription X is received by the WS client (in practice it is a new block; with 100ms blocks this happens often enough that the issue reproduces every hour), we range through the receivers, find no corresponding subscription, and panic.
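The interleaving can be sketched like this, again reusing the hypothetical `wsClient` from the earlier snippet (not the real neo-go types):

```go
// Hypothetical demonstration of the A/B interleaving described above, reusing
// the simplified wsClient sketch (not the real neo-go wsclient code).
func demonstrateRace(c *wsClient, existingID string) {
	var wg sync.WaitGroup
	wg.Add(2)

	// Routine B: the unsubscription RPC frees ID X on the server, but B's
	// local map update may run only after A's.
	go func() {
		defer wg.Done()
		_ = c.unsubscribe(existingID)
	}()

	// Routine A: the server may hand out the just-freed ID X again; A's local
	// map update can then be undone by B's delete, or overwrite B's channel.
	go func() {
		defer wg.Done()
		_, _ = c.subscribe("block_added", make(chan any, 1))
	}()

	wg.Wait()
	// Depending on the interleaving, the client either believes a subscription
	// exists that the server no longer has, or loses track of a live one; with
	// the real two-map state this is what the notification dispatch loop
	// eventually panics on.
}
```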
1. The easiest solution is to put the subscription RPC inside the lock.
2. The second idea is to make the subscription ID something like the request ID (the later a subscription call is made, the bigger its ID is, always). I am not sure this can work, because there is some logic about the subscription numbers (see `pkg/services/rpcsrv/server.go`, lines 2776–2799 at `f0ae14e`).
3. Also, I think it is possible to make every subscription completely unique and not a simple number. It may be some hash or a time-dependent or request-dependent thing, that is not important; it just has to be unique for every subscription and make conflicts impossible.

Thoughts?
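For illustration, the first option could look roughly like this on the hypothetical client sketched above (only a sketch, not the actual fix; as noted in the reply below, widening the lock has a cost because the event dispatch loop needs the same maps):

```go
// Sketch of proposal 1: hold the subscription lock across the RPC so the
// server's ID assignment and the local map update cannot be reordered between
// competing routines. Hypothetical code, not the actual neo-go change.
func (c *wsClient) subscribeLocked(eventName string, rcvr chan<- any) (string, error) {
	c.subscriptionsLock.Lock()
	defer c.subscriptionsLock.Unlock()

	id, err := c.call("subscribe", eventName)
	if err != nil {
		return "", err
	}
	c.subscriptions[id] = rcvr
	return id, nil
}
```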
The server can return any IDs it wants and the client has to live with that, so any server-side "fix" is not appropriate. A simple lock extension can affect event processing, since it has to access the same map as well. What we need, in fact, is to have the same order of events as we have on the server. And that's not a simple thing, since two concurrent requests can be served in any order by the server. Still, we do have some order in …
The other idea is to fetch the subscriber before …
Every client's (Un)Subscription call does two things: an RPC call and a subscription map lock (there are two maps currently). If we imagine that there is one routine that tries to subscribe (A) and one routine that tries to unsubscribe (B), the following sequence can happen:

0. The current number of subscriptions is X.
1. B does an RPC and makes the number of subscriptions X-1.
2. A does an RPC and makes the number of subscriptions X again.
3. A holds the subscription locks and rewrites the client's subscription state (subscription with ID X now points to a different channel; the channel that was registered by B is lost, is not related to any real subscription, but is still included in the `receivers` map).
4. B holds the subscription locks and drops subscription X (first, this is an error and we have just lost a subscription that we think was made successfully; second, we have lost a channel in the `receivers` map with no corresponding subscription pointing to it).
5. An event for subscription X is received by the WS client (in practice it is a new block; with 100ms blocks this happens often enough that the issue reproduces every hour), we range through the receivers, see no corresponding subscription, and panic.

Closes #3093.

Signed-off-by: Pavel Karpy <[email protected]>
Caught this suddenly while debugging nspcc-dev/neofs-node#2444, I don't have any special context. I subscribe to new blocks and notary requests.
Maybe I violated some strict requirement on my side, but I didn't find one after a quick look.